Job Description
About the Job: AML-MLsys Engineer
The AML-MLsys team is at the forefront of combining system engineering and the art of machine learning to develop and maintain massively distributed ML training and inference systems and services around the world. Our mission is to provide high-performance, highly reliable, and scalable systems for cutting-edge applications such as LLM/AIGC/AGI.
Joining our team offers a unique opportunity to build large-scale heterogeneous systems, integrating technologies like GPU, NPU, RDMA, and Storage, ensuring their stable and reliable operation. You will enrich your expertise in coding, performance analysis, and distributed systems, and play an active role in the decision-making process. This is a chance to be part of a global team with members spanning the United States, China, and Singapore, collaborating towards a unified project direction.
Application Information
Successful candidates must be able to commit to an onboarding date by the end of 2026. Please clearly state your availability and graduation date in your resume.
Candidates can apply to a maximum of two positions and will be considered for jobs in the order in which they apply. This application limit is applicable to ByteDance and its affiliates’ jobs globally. Applications will be reviewed on a rolling basis, so we encourage you to apply early.
Responsibilities
- Develop and optimize LLM training, inference, and Reinforcement Learning (RL) frameworks.
- Work closely with model researchers to scale LLM training and RL to the next level.
- Optimize GPU and CUDA performance to create an industry-leading, high-performance LLM training, inference, and RL engine.
Qualifications
Minimum Qualifications
- Bachelor’s degree or above, majoring in computer science, electronics, automation, software, or a related field.
- Proficient in algorithms and data structures, and familiar with Python.
- Understanding of the basic principles of deep learning algorithms, familiarity with the basic architecture of neural networks, and working knowledge of deep learning training frameworks such as PyTorch.
Preferred Qualifications
- Proficient in GPU high-performance computing optimization technology on CUDA, with an in-depth understanding of computer architecture.
- Familiar with parallel computing optimization, memory access optimization, low-bit computing, and other related techniques.
- Familiar with technologies and frameworks such as FSDP, DeepSpeed, JAX SPMD, Megatron-LM, verl, TensorRT-LLM, ORCA, vLLM, and SGLang.
- Knowledge of LLMs, with experience in LLM acceleration and optimization preferred.
By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here.
Job Information: Compensation & Benefits
Compensation Description (Annually)
The base salary range for this position in the selected city is $122,574 – $187,200 annually.
Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives and restricted stock units.
Benefits
Benefits may vary depending on the nature of employment and the country of the work location. Employees have day-one access to:
- Medical, dental, and vision insurance
- A 401(k) savings plan with company match
- Paid parental leave
- Short-term and long-term disability coverage
- Life insurance
- Wellbeing benefits, among others
Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure). The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
For Los Angeles County (unincorporated) Candidates
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse, and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
- Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;
- Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and
- Exercising sound judgment.
About Us
Founded in 2012, ByteDance’s mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut, and Pico, as well as platforms specific to the China market such as Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join ByteDance
Inspiring creativity is at the core of ByteDance’s mission. Our innovative products are built to help people authentically express themselves, discover, and connect – and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity, and enrich life – a mission we work towards every day.
As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make an impact in a rapidly growing tech company. By constantly iterating and fostering an “Always Day 1” mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.
Diversity & Inclusion
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
Reasonable Accommodation
ByteDance is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://tinyurl.com/RA-request.