$70-$100 per hour
Hourly contract · Remote · Early applicant
About the Project
We're building a large-scale evaluation benchmark for advanced AI reasoning across scientific and engineering domains. Our task designers create challenging computational problems that test whether AI systems can use real scientific software tools to solve research-grade problems, from querying simulations and interpreting outputs to designing experimental strategies and recovering hidden information from data.
This is not a typical annotation or labeling role. You'll be designing original, graduate-level computational problems grounded in real scientific workflows, calibrating them against frontier AI models, and iterating on problem design until the difficulty is right.
What You'll Do
You'll design problems that require sophisticated use of domain-specific scientific software libraries. Some problems will require computing precise outputs from fully specified setups — testing whether a solver can correctly implement complex multi-step scientific workflows. Others will require something harder: designing a sequence of queries or experiments to uncover information that isn't directly visible, demanding strategic reasoning about what to measure, how to interpret partial observations, and how to narrow down possibilities efficiently.
Each task goes through a calibration loop where it's tested against state-of-the-art AI models, and you'll refine the problem design until the difficulty hits the target range.
Domains & Tools We're Hiring For
We're especially interested in experts with deep, hands-on experience in the following area:
Particle & Nuclear Physics: Working with scikit-hep and related HEP Python tools for particle physics data analysis, cross-section computations, renormalization group calculations, and perturbative QCD. Experience with Monte Carlo event generation or collider phenomenology is a plus.
*Experience with other specialized software for the above domain will also be considered.
What Makes a Strong Candidate
You have graduate-level expertise (MS or PhD preferred) in the domain listed above, with real hands-on experience using the specific software tools, not just theoretical knowledge of the field. You've written code that calls these libraries to solve actual research problems, and you understand where they break, what their edge cases are, and what makes a problem genuinely hard versus superficially complex.
Beyond domain expertise, the strongest candidates will be able to think like a puzzle designer: constructing problems where the difficulty comes from reasoning strategy rather than brute computation, where there are multiple plausible approaches but only careful analysis reveals the right one, and where surface-level pattern matching won't get you to the answer.
Requirements
Nice to Have
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Mercor partners with leading AI labs and enterprises to train frontier models using human expertise. You will work on projects that focus on training and enhancing AI systems. You will be paid competitively, collaborate with leading researchers, and help shape the next generation of AI systems in your area of expertise.
Tagged as: Chemistry, Computer Science, Data Science, Earth Science, Engineering, Environmental Science, Life Sciences, Mathematics, Physics
To apply, please visit work.mercor.com.
Don't forget to mention that you found the position on jobRxiv!
