GPU/AI System Performance Engineering Intern

May 24, 2026
$45 / hour

Job Description

GPU/AI Application System Software Engineer Intern (System Technologies and Engineering) – 2026 Summer (BS/MS) | TikTok

The Tone
This is a 12-week, full-time internship at TikTok in Los Angeles, CA. TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to a global audience. The GPU/AI System Technology and Engineering Team develops highly optimized OS and system software for deep learning and high-performance computing (HPC) workloads in large-scale data centers. Interns in this role design and implement performance benchmarks and optimization tools that help ensure peak performance across the entire hardware/software stack for the next generation of AI and HPC platforms. They gain hands-on experience in GPU architecture, system software development, and GPU validation on large-scale hardware infrastructure.

The TL;DR
• Role: Internship
• Type: Full-time
• Location: In-person, Los Angeles, CA
• Pay: $45 hourly
• Team: GPU/AI System Technology and Engineering Team
• Mission: Develop highly optimized OS and system software to support deep learning and high-performance computing (HPC) workloads in large-scale data centers, focusing on core software components for next-generation AI and HPC platforms.
• Tech Stack: Python, C/C++, TensorFlow, PyTorch, Linux-based systems, MPI, NCCL, UCX, NVSHMEM, CUDA, git

What You’ll Actually Do
• Benchmarking: Design and implement performance benchmarks and testing methodologies to evaluate system performance effectively.
• Optimization: Develop benchmark tools and optimize the performance of AI workloads, with a focus on large-scale LLM training and inference and on high-performance computing (HPC).
• Automation: Develop Python scripts that automate test runs for the various benchmark tools.
• Collaboration: Work with internal teams to identify system bottlenecks and to debug and resolve performance issues across the hardware/software stack.

The Must-Haves
• Background: Currently pursuing a Bachelor’s, Master’s, or PhD degree in Computer Engineering, Electrical Engineering, Computer Science, or a related major. Must have a background in GPU/CPU benchmarking and be familiar with ML/DL techniques, algorithms, and frameworks such as TensorFlow or PyTorch.
• Experience: Exposure to test automation for a range of applications. Hands-on experience with Linux-based systems.
• Skills: Proficiency in Python and C/C++. Ability to work independently and complete projects from beginning to end in a timely manner.
• Bonus: Strong background in High Performance Computing, ML hardware acceleration (e.g., GPU/TPU/RDMA), ML for Systems, or distributed storage. Experience with AI model development, training, evaluation, and deployment in cloud, cluster, or on-premises environments. Experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM), or experience developing applications with CUDA. Linux kernel development experience (such as networking and device drivers), familiarity with git workflows, and experience with complex system-level debugging are also highly valued.