PhD Research Intern – Multimedia Systems & AI

June 22, 2026
$57 / hour

Job Description

Research Intern (Video Data Compression and Application) – 2026 Start (PhD) | ByteDance

The Tone:
This is a PhD internship at ByteDance, located in Los Angeles, CA. ByteDance is a global technology company known for products including TikTok, Lemon8, CapCut, and Pico, as well as platforms such as Toutiao, Douyin, and Xigua for the China market. The role supports the Multimedia Lab’s mission to explore and lead cutting-edge multimedia technologies, developing software and hardware solutions that enhance content generation, analysis, processing, and interaction in order to reduce business costs and improve user experience and capabilities.

The TL;DR
• Role: Internship
• Location: In-person, Los Angeles, CA
• Pay: $57 hourly
• Team: Multimedia Lab
• Mission: To contribute to developing cutting-edge multimedia technologies and empower ByteDance’s business through advanced data compression, processing, and large multimodal model applications.
• Tech Stack: Python, PyTorch, C/C++, TensorFlow, YOLO, CUDA

What You’ll Actually Do
• Algorithm Design: Design, develop, and optimize innovative algorithms for data compression, processing, and large multimodal model applications, covering areas like 2D video, multiview video, point clouds, Gaussian splatting-based coding, token compression, and KV cache optimization.
• Technology Monitoring: Stay updated with state-of-the-art techniques through active participation in standardization activities or by following leading conference and journal publications.
• Prototype Development: Build functional prototypes and demonstrations for new algorithms and technologies.
• Research Contribution: Contribute to the creation of technical reports, publications for academic dissemination, and patent filings to protect intellectual property.

The Must-Haves
• Background: Current Ph.D. student in computer science, electrical engineering, mathematics, statistics, data science, or related fields.
• Experience: Strong knowledge of computer science fundamentals, including algorithms, data structures, and software design, coupled with solid problem-solving skills. Familiarity with token, image, or video coding and processing, or with large multimodal models.
• Skills: Proficiency in Python and PyTorch. Working knowledge of C/C++. Collaborative mindset supported by strong written and verbal communication abilities.
• Bonus: Good understanding of and practical experience with state-of-the-art compression algorithms and multimedia coding standards (e.g., H.26x, MPEG, JPEG, AV1, AV2, AVS). Hands-on experience with deep learning frameworks such as TensorFlow and detection models such as YOLO, as well as with large language models (LLMs) and vision-language models (VLMs), including LoRA, diffusion models, and VQA tasks. Experience with model training, fine-tuning, and evaluation pipelines, and familiarity with GPU acceleration and CUDA environment configuration.