ML Systems Engineering Intern

September 29, 2025
$60 / hour

Job Description

About Company

ByteDance, founded in 2012, has a core mission to inspire creativity and enrich life. The company boasts a diverse suite of over a dozen products, including globally recognized platforms like TikTok, Lemon8, CapCut, and Pico, as well as popular platforms specific to the China market such as Toutiao, Douyin, and Xigua. These products are designed to make it easier and more enjoyable for people to connect with, consume, and create content.

ByteDance emphasizes a culture of innovation, collaboration, and continuous growth. Employees, referred to as “ByteDancers,” are encouraged to be curious, humble, and impactful within a rapidly expanding tech environment. The company fosters an “Always Day 1” mindset, aiming for meaningful breakthroughs for its employees, the company, and its users.

Diversity and inclusion are central to ByteDance’s values. The company is committed to creating an inclusive space where employees are valued for their diverse skills, experiences, and perspectives, reflecting the global communities its platforms serve. It is dedicated to celebrating diverse voices and creating an environment that mirrors the many communities it reaches. ByteDance also provides reasonable accommodations in recruitment processes for candidates on the basis of disability, pregnancy, sincerely held religious beliefs, or other protected reasons.

Job Description

This is an Internship position for 2026 within the AML-MLsys team, focusing on Machine Learning Systems engineering. The internship aims to provide students with industry exposure, hands-on experience, fundamental skill development, and career exploration. It’s a 12-week full-time program during Summer or Fall 2026, combining real-world scenario application with development workshops and social events.

The AML-MLsys team is responsible for combining system engineering and machine learning to develop and maintain massively distributed ML training and inference systems/services globally. These systems are designed to provide high-performance, highly reliable, and scalable solutions for LLM/AIGC/AGI (Large Language Models/AI-Generated Content/Artificial General Intelligence). Interns will have the opportunity to work on large-scale heterogeneous systems integrating GPU/NPU/RDMA/Storage and ensure their stability and reliability. The role involves enriching expertise in coding, performance analysis, and distributed systems, and being involved in decision-making processes within a global team spread across the United States, China, and Singapore.

Responsibilities:

• Participating in online architecture design and optimization centered around LLM inference tasks, with the goal of achieving high concurrency and throughput in large-scale online systems.
• Participating in the establishment of a comprehensive system covering stability, disaster recovery, R&D efficiency, and cost, thereby enhancing overall system stability.
• Participating in the design and implementation of end-to-end online pipeline systems featuring multiple models, plugins, and storage-computation components, enabling agile, flexible, and observable continuous delivery.
• Collaborating closely with Machine Learning Engineers (MLEs) for the optimization of algorithms and systems.
• Being proactive, optimistic, and highly responsible, with a meticulous work ethic and strong team communication and collaboration skills.

Minimum Qualifications:

• Currently pursuing an Undergraduate or Master’s degree in Computer Science or a related technical discipline.
• Excellent coding skills, a strong understanding of data structures, and fundamental knowledge of algorithms.
• Proficiency in programming languages such as C/C++, Java, Go, Python, etc.
• Rich experience in online architecture, coupled with the ability to troubleshoot independently.
• Strong sense of responsibility, good learning ability, communication skills, and self-motivation.
• Must be able to commit to a 12-week full-time work period during Summer or Fall 2026.

Preferred Qualifications:

• Understanding of GPU hardware architecture, familiarity with the GPU software stack (CUDA, cuDNN), and experience in GPU performance analysis.
• Knowledge of LLMs and experience in LLM acceleration and optimization.

Application Process & Logistics:

• Candidates can apply to a maximum of two positions globally across ByteDance and its affiliates. Applications are considered in the order they are submitted.
• Applications are reviewed on a rolling basis, so early application is encouraged.
• Applicants must clearly state their availability (Start date, End date) in their resume.
• Candidates who pass the resume screening will be invited to participate in ByteDance’s technical online assessment.
• Summer 2026 Start Dates: May 11th, May 18th, May 26th, June 8th, June 22nd.

Compensation & Benefits:

• The hourly rate range for this position in the selected city is $45 – $60.
• Benefits may vary by employment type and country of work.
• Interns receive day-one access to health insurance, life insurance, and wellbeing benefits.
• Interns also receive 10 paid holidays per year and paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half).
• Interns not working 100% remotely may also be eligible for a housing allowance.
• The company reserves the right to modify or change benefits programs at any time.

Fair Chance Ordinance (For Los Angeles County Unincorporated Candidates):

• Qualified applicants with arrest or conviction records will be considered in accordance with federal, state, and local laws, including the Los Angeles County Fair Chance Ordinance. The company believes criminal history may directly and negatively relate to specific job duties (interacting with clients/colleagues, handling confidential information, exercising sound judgment), potentially leading to a conditional offer withdrawal.