Job Description
Intelligent Sensing Intern – Low-power Agent Sensing and Computing Systems – Global Frontier Tech Recruitment Program – 2027 Start (PHD) | ByteDance
The Tone
This is a US-based PhD internship at ByteDance, with start dates available in 2026 or 2027. ByteDance, a company focused on inspiring creativity, builds products that help people express themselves, discover, and connect globally. Interns in this role explore next-generation intelligent hardware, contributing to the company’s products, research, future plans, and emerging technologies. The internship emphasizes hands-on learning, community building, development events, and collaboration with industry experts.
The TL;DR
• Role: PhD Internship
• Location: US-based
• Pay: $55 hourly
• Team: VST/camera team
• Mission: Overcome the limitations of conventional visual perception systems for AI Agents by deeply integrating sensing and computing, enabling round-the-clock environmental awareness and efficient user intent capture.
• Tech Stack: VLM/LLM, world models, large-scale vision foundation models
What You’ll Actually Do
• Design: Design and prototype novel sensor or imaging architectures that integrate computation closer to the sensing front-end, such as near-sensor processing, event-driven capture, or learned pixel-level compression.
• Build: Construct and characterize end-to-end imaging pipelines, from optical and sensor physics through ISP, to downstream perception models, identifying inefficiencies and opportunities for intelligence injection.
• Inform: Utilize knowledge of VLM/LLM and world models to guide decisions on what information the sensing front-end must preserve, discard, or transform, bridging foundation model requirements with hardware design.
• Develop: Create or adapt machine vision models that are co-optimized with hardware constraints, including considerations for power, bandwidth, and latency.
The Must-Haves
• Background: PhD candidate in Electrical Engineering, Physics, Computer Engineering, AI, or a related discipline.
• Experience: Hands-on experience with VLM/LLM, world models, or large-scale vision foundation models, including an understanding of their data and architectural requirements. A track record of publications in high-impact venues such as Nature, Science, CVPR, ICCV, ECCV, NeurIPS, or SIGGRAPH.
• Skills: Strong foundation in at least one of the following areas: imaging systems, sensor technology, or machine learning/computer vision. Ability to bridge hardware and algorithm thinking, from device-level concepts to model training workflows. Strong communication skills and a demonstrated ability to collaborate effectively in teams.
• Bonus: Deep experience working across both hardware (e.g., sensors, imaging pipelines) and software/model development. Comfortable operating in ambiguous environments with an “explorer mindset”. Experience with AI-assisted development tools in research or engineering workflows. Multidisciplinary project experience and ability to quickly ramp up in unfamiliar domains.