Data Scientist 2
Explicitly requires vibe coding skills and mentions using Cursor, Claude Code, and Windsurf.
About the Role
Optum is hiring a Data Scientist 2 to design, develop, and deploy applied machine learning solutions for healthcare problems, driving the end-to-end model lifecycle from feature engineering to production monitoring. The role is fully remote within the U.S. and focuses on building interpretable, compliant, production-grade ML systems using Python and distributed data processing.
Job Description
Role
Optum seeks a Data Scientist 2 to build and deploy applied machine learning solutions that address complex clinical and business problems using large-scale healthcare data. The role involves end-to-end ownership of modeling work, from problem framing and feature engineering to deployment and post-production monitoring, with an emphasis on interpretability, regulatory compliance, and production-quality code.
Key Responsibilities
- Participate in design, development, and deployment of applied ML solutions for healthcare.
- Drive end-to-end model lifecycle: problem framing, feature engineering, model development, evaluation, validation, explainability, deployment, and monitoring.
- Develop and review production-grade Python code following software engineering best practices (testing, modularization, version control, CI/CD).
- Architect scalable data science workflows using Python, SQL, and distributed data processing frameworks in cloud or enterprise environments.
- Apply classical ML, deep learning, time-series modeling, and survival analysis techniques as appropriate.
- Ensure models are interpretable, explainable, and meet enterprise governance, regulatory, and ethical standards (bias, fairness, auditability).
- Partner with engineering, product, clinical, and business stakeholders to translate ambiguous problems into actionable analytical solutions.
- Review and approve modeling approaches and influence architectural and methodological decisions.
- Communicate insights, risks, and tradeoffs to technical and executive audiences.
Requirements
- 4+ years building production-quality, maintainable, and testable code.
- 4+ years with machine learning and statistical modeling fundamentals, including feature engineering, model training/tuning/evaluation, and model interpretability/explainability (e.g., SHAP).
- 4+ years of hands-on experience with deep learning architectures, applied where appropriate.
- 4+ years of experience with time-series analysis and survival analysis.
- 4+ years experience with vibe coding tools such as Cursor, Claude Code, and Windsurf.
- 4+ years healthcare data literacy: claims, EHR, lab, pharmacy data; coding systems (ICD, CPT, NDC, SNOMED, LOINC); interoperability standards (FHIR, HL7); reasoning about data quality, missingness, bias, and confounding.
- 4+ years contributing to complex applied data science initiatives from concept to production and working in cross-functional environments.
- Advanced proficiency in Python for data science and ML (Pandas, NumPy, scikit-learn, PyTorch or equivalent).
- Advanced SQL skills for complex data transformations and analytical workflows.
Preferred Qualifications
- Experience with MLOps practices (deployment, monitoring, retraining, drift detection).
- Prior experience in regulated or highly governed environments.
- Familiarity with cloud platforms and distributed computing (e.g., Spark, Databricks, AWS, GCP, Azure).
Soft Skills
- Excellent written and verbal communication; able to convey technical concepts to non-technical stakeholders.
- Ability to balance model sophistication, interpretability, scalability, and business impact.
Location & Telecommute
- Role allows telecommuting from anywhere within the United States; telecommuters must adhere to UnitedHealth Group's Telecommuter Policy.
Compensation & Benefits
- Salary range: $72,800 to $130,000 annually (full-time basis).
- Benefits include a comprehensive benefits package, incentive and recognition programs, equity stock purchase, 401(k) contribution, and career development opportunities.