AI Research Jobs in the United States (Remote, Full-Time)
You will run applied AI research projects for US-based customers via Rex.zone, translating open-ended research questions into measurable experiments across LLM evaluation, RLHF data design, prompt evaluation, and model performance improvement.
What You Will Do
• Own end-to-end applied research cycles: problem framing, baselines, ablations, and reporting
• Build and evaluate LLM systems using offline metrics and human-in-the-loop evaluation
• Design RLHF workflows: preference data specs, rater instructions, prompt sets, and rubric-based grading
• Create evaluation datasets and test suites: prompt evaluation, red-teaming prompts, and content safety labeling protocols
• Collaborate with data labeling teams on taxonomy, edge-case coverage, and training data quality
• Perform error analysis and model debugging to improve robustness, safety, and helpfulness
• Document methodology and results for reproducibility and auditability
Required Qualifications
• Mid- to senior-level experience delivering applied ML research or productionized ML evaluation
• Strong Python skills; experience with PyTorch (or similar)
• Hands-on LLM evaluation, prompt evaluation, or RLHF experience
• Skill in experiment design, metric selection, and statistically sound interpretation of results
• Familiarity with dataset development: data labeling, QA evaluation, and guideline compliance checks
• Strong written communication for research artifacts and cross-functional alignment
Preferred Qualifications
• Experience evaluating RAG, NER, or structured-output systems
• Exposure to computer vision or multimodal evaluation
• Familiarity with content safety labeling taxonomies and policy-aligned rubrics
• MLOps experience with evaluation pipelines, dataset versioning, and reproducible runs
Remote Work and Collaboration
This is a remote, full-time role supporting United States-based projects, working with distributed teams across research, engineering, and data operations.
Compensation
Hourly base pay range: $30–$50/hr.