Machine Learning PhD Intern

June 12, 2025

Job Description

About the Company:

Truveta is the world’s first health provider-led data platform with a vision of Saving Lives with Data. Its mission is to enable researchers to find cures faster, empower every clinician to be an expert, and help families make the most informed decisions about their care. Truveta is headquartered in the greater Seattle area but embraces a remote culture.

Job Description:

Truveta is seeking a highly motivated and talented Machine Learning PhD Intern to join its AI research team. The intern will contribute to innovative projects in large language models (LLMs) and clinical data analysis. This internship is designed for PhD candidates who have completed their coursework, are primarily focused on research, and have less than a year remaining until graduation.

Internship Details:

• Minimum of 10 weeks, with the potential for extension.
• Involves defining research strategies within a domain and applying innovative solutions to Truveta’s products.

Responsibilities:

• Collaborate with researchers and engineers to design, develop, and refine large language models and generative models.
• Develop novel algorithms and methodologies for generative modeling tasks.
• Implement, train, and fine-tune LLMs and GPT-like models on large-scale datasets.
• Stay up to date with the latest research advancements in language modeling, generative modeling, and machine learning.
• Deliver innovation in trustworthy healthcare.

Key Qualifications:

• Currently pursuing a Ph.D. in Computer Science, Electrical Engineering, or a related field, with a focus on machine learning, natural language processing (NLP), Large Language Models (LLMs), multi-modal foundation models, and generative AI.
• Strong theoretical and practical background in NLP, including experience with state-of-the-art architectures.
• Proficiency in deep learning frameworks (e.g., PyTorch, TensorFlow) and libraries commonly used in NLP and generative AI.
• Solid programming skills in Python.
• Excellent problem-solving and troubleshooting abilities.
• Strong communication skills.

Preferred Qualifications:

• Experience with distributed parallel training and with large-scale multi-modal foundation and generative models.
• Familiarity with parameter-efficient tuning techniques, Reinforcement Learning from Human Feedback (RLHF), and prompt engineering techniques.
• Familiarity with training multi-modal foundation models.
• Familiarity with cloud-based infrastructure and experience deploying large-scale machine learning models in production environments.
• A track record of publications and contributions to the machine learning and natural language processing communities.