PhD Intern – Multimedia Lab Research

June 26, 2026
$57 / hour

Job Description

Research Intern (Video Data Compression and Application) – 2026 Start (PhD) | ByteDance

The Tone
This is a PhD internship at ByteDance. The role is in person, but the job description does not name the specific US city. ByteDance is a global technology company founded in 2012, best known for products such as TikTok, Lemon8, CapCut, and Pico, along with platforms specific to the China market, including Toutiao, Douyin, and Xigua. The role contributes to the Multimedia Lab’s mission: to explore and advance multimedia technologies, provide essential software and hardware solutions, and participate in international standardization efforts, with the aim of improving business capabilities, user experience, and cost efficiency across the company’s products.

The TL;DR
• Role: Internship
• Type: Temporary (12 weeks)
• Location: In-person (specific US city not specified in job listing)
• Pay: $57 per hour
• Team: Multimedia Lab
• Mission: Design, develop, and optimize innovative algorithms for data compression, processing, and large multimodal model applications to advance multimedia technologies.
• Tech Stack: Python, PyTorch, C/C++, TensorFlow, YOLO, CUDA

What You’ll Actually Do
• Design and develop innovative algorithms for data compression, processing, and large multimodal model applications.
• Optimize algorithms for 2D video, multiview video, point clouds, Gaussian splatting–based coding, NN-based coding, token compression, and KV cache optimization.
• Stay informed about state-of-the-art techniques through active participation in standardization activities and by following leading conference and journal publications.
• Build prototypes and demonstrations to showcase new technologies and research outcomes.
• Contribute to technical reports, academic publications, and the filing of patents.

The Must-Haves
• Background: Current Ph.D. student in computer science, electrical engineering, mathematics, statistics, data science, or related disciplines.
• Experience: Familiarity with token, image, or video coding and processing, or experience with large multimodal models; strong computer science fundamentals, including algorithms, data structures, and software design.
• Skills: Proficiency in Python, PyTorch, and C/C++; solid written and verbal communication skills; collaborative mindset.
• Bonus: Good understanding of state-of-the-art compression algorithms; extensive experience with video/image coding standards (e.g., H.263/264/265/266, MPEG-2/4, JPEG, JPEG 2000, AV1, AV2, AVS1/2/3); proficiency with deep learning tools such as TensorFlow and YOLO; hands-on experience with large language models (LLMs) and vision-language models (VLMs), including LoRA, diffusion models, and VQA tasks; experience with model training, fine-tuning, and evaluation pipelines; familiarity with GPU acceleration and CUDA environment configuration.