PhD Intern, System Performance Optimization for Generative Models

Software Development

San Jose, CA

Posted 2 months ago

$60 / hour

Internship

Job Description

Research Scientist Intern (TikTok-Privacy Innovation Lab-GPU Systems & Model Optimization) | TikTok

The Tone:
This is a PhD internship at TikTok, a leading destination for short-form mobile video. TikTok’s mission is to inspire creativity and bring joy through its global products. The role contributes directly to exploring the next frontier of privacy technology and theory, so that user privacy remains a top priority in the design and implementation of next-generation generative foundation models. Interns contribute to products, research, and emerging technologies that shape a privacy-friendly digital experience.

The TL;DR
• Role: Internship (PhD)
• Location: Flexible (Remote/In-person options available)
• Pay: $60 hourly
• Team: Privacy Innovation (PI) Lab
• Mission: Design and optimize GPU systems and models for privacy-preserving, large-scale generative foundation models.
• Tech Stack: Triton, CUDA, CUTLASS, PyTorch, XLA, Nsight, nvprof, nsys

What You’ll Actually Do
• Design and implement high-performance GPU kernels for core components such as Transformer, Attention, MoE, and Diffusion models.
• Perform end-to-end optimization for large model training workloads, focusing on efficiency and performance.
• Conduct in-depth analysis of GPU execution bottlenecks, including compute, memory, and scheduling issues.
• Use and extend Triton, CUDA, and CUTLASS, integrating optimized kernels with PyTorch, XLA, or custom runtimes.
• Collaborate closely with model research teams to translate new model architectures into efficient, production-ready implementations.

The Must-Haves
• Background: Currently pursuing a PhD in Computer Science, Computer Engineering, or a related technical discipline.
• Experience: Solid understanding of GPU architecture and execution models; strong familiarity with Transformer / Attention computation patterns and performance bottlenecks.
• Skills: Proficiency in CUDA C++ or Triton, with the ability to independently write and optimize kernels; ability to read, reproduce, and reason about systems papers or open-source implementations.
• Bonus: Hands-on experience with large-scale model training; familiarity with PyTorch internals (e.g., Autograd, dispatcher, ATen); experience with kernel profiling and performance tuning (e.g., Nsight, nvprof, nsys); publications, open-source contributions, or performance benchmark results.

Date Posted: 2 months ago
Location: San Jose, CA
Offered Salary: $60 / hour
Expiration Date: August 5, 2026
Gender: Neutral
Qualification: Doctorate Degree
Career Level: Student
