Senior Data Integration Specialist / ML Engineer
The posting explicitly requires vibe coding skills (it mentions Cursor AI and vibe coding).
About the Role
Senior Data Integration Specialist focused on designing and building scalable, rule-based data ingestion and validation systems for large-scale cloud platforms. The role leads integration with APIs and platforms (e.g., MDS API, Snowflake, Databricks), implements validation engines and monitoring, and collaborates with data architects and scientists to ensure data quality and reliability.
Job Description
Role
Senior Data Integration Specialist / ML Engineer responsible for sourcing and validating data and for building rule-based ingestion systems and validation engines for large-scale data platforms. The role focuses on integration with internal and external systems, error monitoring and root-cause analysis, and building robust pipelines for data consumption.
Key Responsibilities
- Design, build and maintain efficient data pipelines and integration setups for large-scale platforms (AWS, Azure, Snowflake).
- Integrate with systems such as the MDS API, CRM platforms, Databricks and proprietary data sources.
- Design and implement validation engines for monitoring sourced data and ensuring data integrity.
- Implement error monitoring, perform root-cause analysis (RCA) and maintain disaster recovery procedures.
- Perform statistical analysis, process/cleanse data, and enhance data collection for analytics needs.
- Collaborate with data architects, IT teams and data scientists on project goals and data requirements.
- Recommend and implement improvements to data reliability, quality, security and governance.
- Operate and improve DevOps/DataOps practices including CI/CD, containerization and pipeline automation.
Requirements
- 5+ years of software engineering / data engineering experience; experience administering enterprise data platforms.
- Strong experience with cloud platforms and large-scale data platforms (AWS, Azure, Snowflake).
- Proficiency in Python and modern frameworks/tools (FastAPI, Temporal); experience working with LLMs is also mentioned.
- Experience with databases and search technologies (MongoDB, Elasticsearch/OpenSearch), NoSQL and vector databases.
- Experience with DevOps/DataOps tooling: CI/CD, GitHub Actions, Docker, Podman.
- Proven problem-solving, communication and collaboration skills.
- Background in designing validation engines, data cleansing, and statistical analysis for data quality.
- Bachelor’s degree in engineering, technology, applied mathematics, physics, statistics or related field preferred.
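The data-cleansing and statistical-analysis background listed above can be sketched with a simple z-score outlier filter using only the standard library; the threshold and the sample readings are illustrative assumptions:

```python
import statistics

def drop_outliers(values: list[float], z_threshold: float = 3.0) -> list[float]:
    """Remove values whose z-score exceeds the threshold (illustrative cleansing rule)."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return list(values)  # all values identical: nothing to drop
    return [v for v in values if abs(v - mean) / stdev <= z_threshold]

readings = [10.1, 9.8, 10.3, 10.0, 250.0]  # 250.0 is a faulty sensor reading
clean = drop_outliers(readings, z_threshold=1.5)
# clean → [10.1, 9.8, 10.3, 10.0]
```

Note that with a small sample a single extreme value inflates the standard deviation (the maximum possible sample z-score for n=5 is about 1.79), so the threshold here is deliberately low; robust estimators such as the median absolute deviation are a common alternative.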
Technologies & Integrations (explicitly mentioned)
MDS API, CRM integrations, Snowflake, Databricks, AWS, Azure, Python, FastAPI, MongoDB, Elasticsearch/OpenSearch, Temporal, GitHub Actions, Docker, Podman, LLMs, NoSQL and vector databases.