Job Description
Summer Research Intern | Abaka AI
The Tone
This is a Summer Research Intern position at Abaka AI in Palo Alto, CA. Abaka AI specializes in building high-quality datasets, benchmarks, and evaluation pipelines across many areas of artificial intelligence, including LLMs, vision, video, 3D/4D, multimodal reasoning, agentic systems, and world models. Interns contribute directly to research artifacts that are actively used by prominent AI laboratories and academic groups, and gain high-ownership experience shaping critical evaluation pipelines.
The TL;DR
• Role: Internship
• Type: Seasonal
• Location: In-person, Palo Alto, CA
• Pay: $25–$60 hourly
• Team: Works closely with Abaka AI’s internal research team and external collaborators from the 2077AI Foundation.
• Mission: Help build high-quality datasets, benchmarks, and evaluation pipelines across various AI domains to advance AI research.
• Tech Stack: Python, PyTorch, LM Eval Harness, OpenCompass, Blender, COLMAP, PyTorch3D, Open3D
What You’ll Actually Do
• Dataset Design: Design and construct high-quality datasets and benchmarks for LLM reasoning and QA, vision and vision-language modeling, video understanding, or 3D/4D perception.
• Model Evaluation: Evaluate LLMs, VLMs, Video-LLMs, and multimodal models on reasoning, factuality, temporal understanding, and spatial tasks.
• Pipeline Development: Develop and maintain evaluation pipelines, metrics, and quality-control criteria for expert-level data generation.
• Performance Analysis: Analyze model outputs, conduct error taxonomy and failure analysis, and summarize insights for internal reports and research papers.
• Research Support: Support research on long-context modeling, data efficiency, compression strategies, and benchmark standardization.
The Must-Haves
• Background: Student with a strong background in computer science, artificial intelligence, robotics, data engineering, or related fields.
• Experience: Hands-on experience with machine learning or multimodal systems, including LLMs, vision models, or video models.
• Skills: Proficiency in Python and experience with PyTorch or similar frameworks. Strong analytical skills and the ability to reason about model behavior and data quality. Excellent written and verbal communication skills in English.
• Bonus: Experience with LLM or multimodal evaluation frameworks such as LM Eval Harness or OpenCompass. Background in computer vision, video understanding, or multimodal learning. Experience with 3D/4D data pipelines, graphics, or robotics tools (e.g., Blender, COLMAP, PyTorch3D, Open3D). Familiarity with NeRFs, Gaussian Splatting, SLAM, or embodied AI datasets and simulators. Experience with video QA, action recognition, or long-context transformer models. Relevant research experience or publications at top-tier conferences.