NLP Research Engineer PhD Intern

July 29, 2025

Job Description

About Company

USC Michelson Center CSI-Cancer aims to integrate patient, model system, and high-content single cell data to translate clinically observed correlations into a mechanistic understanding of the physical and biological underpinnings of cancer dynamics. The company is headquartered in Los Angeles, CA, US, has a team of 51-200 employees, and is currently in the Growth Stage.

Job Title: NLP Research Engineer PhD Intern

Job Summary:
The USC Michelson Center Convergent Science Institute in Cancer (CSI-Cancer) is seeking an NLP Research Engineer PhD Intern. The role primarily involves conducting advanced research in Natural Language Processing and developing scalable NLP systems, with a strong focus on practical applications and model optimization.

Responsibilities:
• Conduct advanced research and stay current with state-of-the-art techniques in Natural Language Processing (NLP), Generative AI, and Large Language Models (LLMs), emphasizing practical applications and scalable solutions.
• Design, develop, and optimize NLP models and algorithms for various tasks, including semantic understanding, knowledge extraction, summarization, question answering, dialogue systems, and tool-augmented reasoning.
• Collect, clean, and preprocess textual and multimodal data for training and evaluation purposes.
• Contribute to the development of scalable, production-ready NLP systems, covering model deployment, serving, monitoring, and lifecycle management, across cloud, on-premise, or edge infrastructure.
• Evaluate and refine existing NLP systems by incorporating new research, benchmarking performance, and addressing failure cases.
• Experiment with modern frameworks, open-source tools, and state-of-the-art foundation models using efficient adaptation techniques such as low-rank adaptation, quantization-aware tuning, or other parameter-efficient fine-tuning methods.
• Participate in internal research reviews, code walkthroughs, and ideation sessions.

Qualifications:

Required:
• Solid grasp of NLP fundamentals, encompassing both classical and modern techniques.
• Proficient in Python and familiar with ML frameworks such as PyTorch or JAX.
• Experience with NLP libraries such as HuggingFace Transformers, spaCy, OpenNLP, or similar.
• Familiarity with prompt engineering, retrieval-augmented generation (RAG), and fine-tuning techniques for LLMs.
• Understanding of contemporary architectures, including agentic orchestration and coordination frameworks, and tool-augmented systems.
• Strong collaboration, communication, and documentation skills.
• Experience with efficient training/fine-tuning strategies (e.g., quantization, distillation, parameter-efficient tuning) is desired.
• Familiarity with agent-based systems, multi-agent collaboration, and structured reasoning techniques such as multi-hop inference and knowledge grounding is desired.
• Exposure to multimodal AI models and pipelines integrating text with images, audio, or structured data is desired.
• Familiarity with containerization, cloud platforms (AWS, GCP) or on-premise / edge deployments is desired.
• Contributions to open-source NLP/ML projects or relevant publications in top-tier venues (e.g., ACL, EMNLP, NeurIPS, ICML, ICLR, TMLR) are desired.
• Knowledge of responsible AI practices, including fairness, explainability, and privacy in NLP systems, is desired.
• Pursuing or holding an M.S. or Ph.D. in Computer Science, Machine Learning, Artificial Intelligence, Computational Linguistics, Applied Mathematics, Electrical Engineering, or a related technical field.
• Must be enrolled in an educational program to be considered as an intern.
• U.S. citizenship or permanent resident status required.