Summer Research Intern – AI Evaluation and Benchmarking

Information Technology

Palo Alto, CA

June 13, 2026

$25 - $60 / hour

Internship

Job Description

Summer Research Intern | Abaka AI

The Tone:
This is a Summer Research Intern position at Abaka AI in Palo Alto, CA. Abaka AI builds high-quality datasets, benchmarks, and evaluation pipelines across AI domains including LLMs, vision, video, 3D/4D, multimodal reasoning, agentic systems, and world models. Interns contribute directly to research artifacts used by prominent AI laboratories and academic groups, gaining high-ownership experience in shaping evaluation pipelines.

The TL;DR
• Role: Internship
• Type: Seasonal
• Location: In-person, Palo Alto, CA
• Pay: $25–$60 hourly
• Team: Works closely with Abaka AI’s internal research team and external collaborators from the 2077AI Foundation.
• Mission: Help build high-quality datasets, benchmarks, and evaluation pipelines across various AI domains to advance AI research.
• Tech Stack: Python, PyTorch, LM Eval Harness, OpenCompass, Blender, COLMAP, PyTorch3D, Open3D

What You’ll Actually Do
• Dataset Design: Design and construct high-quality datasets and benchmarks for LLM reasoning and QA, vision and vision-language modeling, video understanding, or 3D/4D perception.
• Model Evaluation: Evaluate LLMs, VLMs, Video-LLMs, and multimodal models on reasoning, factuality, temporal understanding, and spatial tasks.
• Pipeline Development: Develop and maintain evaluation pipelines, metrics, and quality-control criteria for expert-level data generation.
• Performance Analysis: Analyze model outputs, build error taxonomies, conduct failure analysis, and summarize insights for internal reports and research papers.
• Research Support: Support research on long-context modeling, data efficiency, compression strategies, and benchmark standardization.

The Must-Haves
• Background: Student with a strong background in computer science, artificial intelligence, robotics, data engineering, or related fields.
• Experience: Hands-on experience with machine learning or multimodal systems, including LLMs, vision models, or video models.
• Skills: Proficiency in Python and experience with PyTorch or a similar framework. Strong analytical skills, including the ability to reason about model behavior and data quality. Excellent written and verbal English communication skills.
• Bonus: Experience with LLM or multimodal evaluation frameworks such as LM Eval Harness or OpenCompass. Background in computer vision, video understanding, or multimodal learning. Experience with 3D/4D data pipelines, graphics, or robotics tools (e.g., Blender, COLMAP, PyTorch3D, Open3D). Familiarity with NeRFs, Gaussian Splatting, SLAM, or embodied AI datasets and simulators. Experience with video QA, action recognition, or long-context transformer models. Relevant research experience or publications at top-tier conferences.

Date Posted: June 13, 2026
Location: Palo Alto, CA
Offered Salary: $25 - $60 / hour
Gender: Neutral
Career Level: Student
