AI Inference Stack Intern

January 8, 2025
$150,000 / year

Job Description

About the Company: Tesla is a leading electric vehicle company, and this internship sits within its AI team, which focuses on optimizing the performance of the neural networks running in Tesla vehicles and in Optimus, the company's humanoid robot. The team works closely with AI and hardware engineers on the internals of the AI inference stack and compiler. Its work directly impacts vehicle performance and the ability to deploy increasingly complex models, and it builds on a cutting-edge, co-designed MLIR compiler and runtime architecture.

Job Description:

This is a full-time, on-site internship in Palo Alto, CA, lasting a minimum of 12 weeks (May/June – Aug/Sept). The intern will be responsible for working on parts of the AI inference stack, specifically the export, compiler, or runtime components (depending on skills and interests). The work is highly collaborative, requiring close interaction with:

The AI team, to guide them in designing and developing neural networks for production.
The hardware team, to understand the current hardware architecture and propose improvements.

Specific responsibilities include:

Taking ownership of parts of the AI inference stack.
Collaborating with the AI team to guide them in designing and developing neural networks for production.
Collaborating with the hardware team to understand the current hardware architecture and propose future improvements.
Developing algorithms to improve performance and reduce compiler overhead.
Debugging functional and performance issues on massively parallel systems.
Working on architecture-specific neural network optimization algorithms for high-performance computing.

Required Skills and Qualifications:

• Pursuing a degree in Computer Science, Computer Engineering, or a relevant field, graduating between 2025 and 2026.
• Ability to relocate and work on-site in Palo Alto, CA.
• Strong C++ programming skills and familiarity with Python.
• Solid understanding of machine learning concepts and fundamentals.
• Ability to deliver results with minimal oversight.

Highly Desirable Skills:

• Experience with quantization, MLIR, CUDA, and LLMs.