Job Description
About the Job
Location: On-site / hybrid in the El Paso, TX – Las Cruces, NM corridor, supporting work at White Sands Missile Range (WSMR) and nearby government sites. Open to relocation candidates.
Travel: Local travel to WSMR/Fort Bliss; occasional CONUS travel for test events. May include weekends and nights.
Clearance: Active Secret required. TS/SCI upgrade supported.
Employment: Full-time
About Torch
Torch is an AI-powered interview intelligence platform used by government customers to generate structured reporting, detect leads, recommend follow-up questions, and analyze interviews at scale. Your job is to make those outputs more accurate, more defensible, and more useful in real-world conditions that include offline and edge-constrained environments.
You’ll report to the on-site Project Team leader and work closely with a small exercise support team. You’ll own problems end-to-end: designing evaluation frameworks, improving prompts and retrieval pipelines, analyzing interview data, and shipping measurable improvements into production.
Responsibilities
LLM Quality & Evaluation
- Improve prompting strategies and structured outputs across reporting formats including law enforcement, intelligence, after-action, interview summaries, and survey analysis.
- Design evaluation sets, scoring rubrics, and automated evaluation pipelines (including LLM-as-judge approaches) for relevance, coherence, completeness, and error modes.
- Reduce hallucinations and improve traceability and attribution.
RAG & Knowledge Pipelines
- Build and iterate on RAG pipelines, curated knowledge packs, and question-tree triggers.
- Create and maintain base datasets (follow-up triggers, Essential Elements of Information/Critical Information Requirements, glossaries, watchlist cues) with versioning and documentation.
- Tune retrieval and reranking to perform reliably under edge constraints (limited compute, memory, and connectivity).
Interview Analytics
- Analyze transcripts to surface evasiveness, inconsistencies, and actionable leads.
- Develop labeling strategies, analytic rubrics, and ground-truth datasets.
- Conduct quantitative and qualitative analysis of interview data to identify patterns and support operational decisions.
Measurement & Documentation
- Build lightweight dashboards and metrics for model performance and field reliability.
- Document methods and maintain audit trails so outputs remain defensible for government end users.
- Partner with engineering to validate and ship improvements into production.
Required Qualifications
- Active Secret clearance.
- 2–5 years of applied data science or NLP experience.
- Strong Python skills (pandas, NumPy, scikit-learn) with comfort standing up experiments and pipelines.
- Hands-on experience with LLMs: prompt engineering, output evaluation, safety, and quality controls.
- Experience with unstructured text data: cleaning, labeling, and building evaluation metrics.
- Proficiency in data analysis, reporting, and visualization for technical and non-technical audiences.
- Ability to work on-site in the El Paso / Las Cruces / WSMR area.
Preferred Qualifications
- RAG implementation experience (vector databases, embeddings, reranking).
- Experience with structured evaluation frameworks (RAGAS, custom LLM-as-judge, or equivalent).
- Familiarity with edge or offline deployment constraints.
- Exposure to interview analytics, structured debriefing, structured reporting, or HUMINT-adjacent workflows.
- Experience delivering in classified or regulated environments.
What We Offer
- Opportunity to work on a meaningful mission with amazing teammates.
- Competitive base salary.
- 12.5% flexible benefits allowance on top of base pay: use it for health coverage, retirement, additional time off, or other personal priorities.
- 6% employer 401(k) contribution (non-elective) with a 4-year rolling vesting schedule.
- 20 days PTO plus 11 paid federal holidays, with the option to purchase up to 10 additional days.
- TS/SCI clearance upgrade pathway.