Sr Manager AI/ML Engineering - Hybrid in MN or DC, remote elsewhere
Explicitly requires experience with AI-assisted development / "vibe coding" tools such as Codex, Claude Code, Cursor, and Windsurf.
About the Role
Lead and scale AI/ML engineering teams to build and operate enterprise-grade ML and GenAI platforms, model lifecycle systems, and MLOps pipelines that productionize machine learning across Optum. Drive architecture, cloud-native infrastructure, monitoring, and governance, and collaborate with data science and engineering teams to deliver reliable, compliant ML systems for healthcare.
Job Description
Role
As Senior Manager, AI/ML Engineering, you will lead teams responsible for building and operating scalable machine learning platforms and production ML systems across the enterprise. You will drive the design and implementation of ML infrastructure, model lifecycle management, and MLOps/LLMOps platforms that support experimentation, deployment, monitoring, and governance of machine learning and generative AI models.
Key Responsibilities
- Lead and scale AI/ML engineering teams building ML platforms, model pipelines, and AI infrastructure.
- Architect enterprise ML and GenAI platforms for experimentation, training, evaluation, deployment, monitoring, and lifecycle management.
- Productionize ML and generative AI models using batch and real-time inference architectures.
- Build and operate MLOps and LLMOps pipelines including CI/CT/CD workflows for model testing, validation, versioning, and promotion across environments.
- Develop scalable cloud-native ML infrastructure using Docker, Kubernetes, and cloud ML platforms (e.g., SageMaker, Azure ML, Vertex AI).
- Implement model monitoring and lifecycle management systems to track drift, latency, bias, and data quality, and to enable automated retraining.
- Ensure governance, security, lineage, auditability, reproducibility, and observability of ML systems.
- Partner with data scientists, data engineers, and software engineers to define production ML standards and scalable AI solutions.
Required Qualifications
- 8+ years in machine learning engineering, MLOps, or AI platform engineering building production ML systems and scalable model pipelines.
- 5+ years with ML lifecycle platforms such as MLflow, Kubeflow, SageMaker, Azure ML, or GCP Vertex AI.
- 5+ years building cloud-native ML platforms using Docker, Kubernetes, and distributed systems.
- 6+ years programming in Python for ML systems; familiarity with PyTorch, TensorFlow, or scikit-learn.
- 5+ years with distributed data processing and orchestration tools such as Spark, Ray, Airflow, Dagster, or Prefect.
- 1+ years using AI-assisted development / "vibe coding" tools (e.g., Codex, Claude Code, Cursor, Windsurf).
Preferred Qualifications
- Master’s degree in Computer Science, Engineering, Data Science, or related discipline.
- Experience building low-latency inference systems and online feature serving architectures.
- Experience implementing Responsible AI practices, bias detection, and model explainability.
- Experience operating multi-cloud or hybrid ML platforms.
- Contributions to open-source ML or MLOps tooling.
Location & Work Arrangement
- Remote work is available from anywhere within the U.S.; hires in the Minneapolis or Washington, D.C., areas are required to work in the office a minimum of four days per week (hybrid requirement for those locations).
Compensation & Benefits (excerpt)
- Salary range: $112,700 to $193,200 annually (based on full-time employment).
- Benefits include a comprehensive benefits package, incentive and recognition programs, an equity stock purchase plan, and 401(k) contributions (subject to eligibility).