
Senior AI Scientist

IQVIA

Full-time · Durham, NC · Posted 29 days ago

About the Role

Role Overview

We are seeking a Senior AI Scientist to lead the design, development, and operationalization of evaluation frameworks for Generative AI systems, with a primary focus on Large Language Models (LLMs) and agentic AI solutions. This role will be responsible for defining and implementing robust methods to assess quality, safety, reliability, and business impact across LLM-powered applications and multi-agent workflows. The position operates within regulated environments such as life sciences, clinical research, and regulatory domains, ensuring that AI systems meet enterprise and compliance standards.

Key Responsibilities

LLM Evaluation & Benchmarking

- Design and implement scalable evaluation frameworks for LLMs across use cases including:
  - Question answering, summarization, information extraction, and reasoning
  - Clinical and regulatory document generation (e.g., ICFs, CSRs, protocols)
- Develop both automated and human-in-the-loop evaluation pipelines
- Define, measure, and monitor key performance metrics, including:
  - Accuracy, factuality, faithfulness, and hallucination rate
  - Robustness, consistency, latency, and cost-performance trade-offs
- Build domain-specific benchmarks using real-world clinical, regulatory, and RWD data

Agentic AI & Multi-Agent Evaluation

- Establish evaluation strategies for agent-based and multi-agent systems
- Measure and analyze:
  - Task completion success rates
  - Planning and reasoning quality
  - Tool usage accuracy
  - Inter-agent coordination and failure patterns
- Develop scenario-based and simulation-driven evaluation environments
- Evaluate orchestration frameworks (e.g., LangGraph, Semantic Kernel, Claude Agents)

End-to-End System Evaluation

- Define evaluation strategies for complete AI pipelines, including:
  - Retrieval-Augmented Generation (RAG) systems
  - Tool-augmented agents
  - Knowledge graph + LLM architectures
- Implement offline and online evaluation mechanisms, such as:
  - A/B testing and canary releases
  - Production monitoring and model drift detection
- Enable observability and traceability using tools such as LangSmith and OpenTelemetry

Responsible AI & Compliance

- Ensure all evaluation practices align with Responsible AI principles and regulatory requirements (e.g., GxP)
- Assess and mitigate risks related to:
  - Bias, fairness, safety, and explainability
  - Data leakage and privacy concerns
- Develop audit-ready evaluation frameworks suitable for regulated environments (e.g., FDA, EMA)

Tooling & Platform Development

- Build and scale evaluation tooling, including:
  - Automated evaluation pipelines
  - Prompt and version tracking systems
  - Experiment management platforms
- Integrate evaluation frameworks with enterprise AI platforms (e.g., Azure, Databricks, AWS, on-prem GPU environments)

Leadership & Collaboration

- Collaborate with cross-functional teams including:
  - AI/ML engineers, product teams, and domain scientists
  - Clinical, regulatory, and real-world evidence stakeholders
- Establish enterprise-wide evaluation standards and best practices
- Mentor junior team members and contribute to strategic AI initiatives

Required Qualifications

- Master’s or PhD in Computer Science, Artificial Intelligence, Machine Learning, or a related field
- 5+ years of experience in applied AI/ML, with a strong focus on Generative AI and LLMs
- Demonstrated experience in:
  - LLM evaluation (both automated and human-in-the-loop)
  - Prompt engineering and model behavior analysis
  - Python programming using frameworks such as PyTorch or TensorFlow
- Hands-on experience with:
  - RAG systems, embeddings, and vector databases
  - Agent frameworks (e.g., LangChain, LangGraph, Semantic Kernel)
- Strong understanding of:
  - Evaluation metrics and experimental design
  - Model limitations, failure modes, and debugging techniques

Preferred Qualifications

- Experience in life sciences, healthcare, or regulated environments
- Familiarity with:
  - Clinical trial workflows (ICF, CSR, TMF, regulatory submissions)
  - Knowledge graphs and biomedical data systems
- Experience with evaluation tools such as LangSmith, Promptfoo, DeepEval, HELM, or OpenAI Evals
- Exposure to Responsible AI frameworks and regulatory compliance standards

What Success Looks Like

- Standardized evaluation frameworks adopted across AI teams
- Measurable improvements in LLM reliability and agent performance
- Defined quality gates to support production-ready AI deployment decisions
- Strong alignment between AI model performance and business/clinical outcomes

IQVIA is a leading global provider of clinical research services, commercial insights and healthcare intelligence to the life sciences and healthcare industries. We create intelligent connections to accelerate the development and commercialization of innovative medical treatments to help improve patient outcomes and population health worldwide. Learn more at https://jobs.iqvia.com

IQVIA is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other status protected by applicable law. https://jobs.iqvia.com/eoe

IQVIA is committed to integrity in our hiring process and maintains a zero tolerance policy for candidate fraud. All information and credentials submitted in your application must be truthful and complete. Any false statements, misrepresentations, or material omissions during the recruitment process will result in immediate disqualification of your application, or termination of employment if discovered later, in accordance with applicable law. We appreciate your honesty and professionalism.

The potential base pay range for this role, when annualized, is $108,000.00 - $270,000.00. The actual base pay offered may vary based on a number of factors including job-related qualifications such as knowledge, skills, education, and experience; location; and/or schedule (full or part-time). Dependent on the position offered, incentive plans, bonuses, and/or other forms of compensation may be offered, in addition to a range of health and welfare and/or other benefits.
