About the Role
Senior Data Engineer responsible for designing, building, and operating large-scale batch and real-time data pipelines, data ecosystems, and integrations across cloud platforms (AWS, Azure, GCP) to support analytics, reporting, and AI/ML initiatives. The role centers on translating business requirements into technical specifications, implementing ETL/ELT processes, and applying automation and AI-assisted development ("vibe coding") to accelerate delivery.
Key Responsibilities
- Translate business requirements into technical specifications and detailed designs.
- Participate in project planning, identify milestones and deliverables, and track execution.
- Produce design, development, and test plans, functional specifications, and UI/process flow charts.
- Develop data pipelines and APIs across cloud platforms using Python, SQL, and, where applicable, Spark.
- Build large-scale batch and real-time data pipelines using AWS services, Snowflake, and dbt.
- Migrate and transform data from on-premises systems to cloud platforms.
- Implement data management, governance, and integration of structured and unstructured data.
- Leverage automation and AI-assisted techniques (including enterprise AI tools) to manage data, predict scenarios, and prescribe actions.
- Maintain operational efficiency of data ecosystems and provide analytics-as-a-service for ongoing insights.
Requirements
- 10+ years of experience in data engineering with emphasis on analytics and reporting.
- 6+ years of experience with cloud platforms (Amazon Web Services/AWS, Azure); GCP experience is a plus.
- 10+ years of experience with SQL, data transformations, ETL/ELT, and database platforms such as Snowflake and Microsoft Fabric; extensive S3 experience.
- 10+ years designing and building data extraction, transformation, and loading processes and custom pipelines.
- 6+ years with scripting languages and tooling (Python, SQL, Shell Scripting).
- 6+ years designing and building real-time pipelines using services such as S3, Kinesis, RDS, Lambda, Glue, API Gateway, SQS, SNS, CloudWatch, CloudFormation, dbt, etc.
- 5+ years implementing REST API data integrations, webhooks, and Snowflake integrations.
- 4+ years of data modeling experience to support descriptive analytics (Power BI) and 4+ years in data cataloging and metadata practices to enable self-service analytics.
- Strong understanding of data management principles, governance, and conceptual/logical/physical data modeling.
- Experience building pipelines and APIs using Python, SQL, and cloud platform services; familiarity with Spark is a plus.
- Familiarity with LLMs, prompt engineering, and use of AI-assisted coding tools (e.g., GitHub Copilot) to accelerate development.
Technical Environment (examples mentioned)
AWS, Azure, GCP, Snowflake, Microsoft Fabric, S3, Kinesis, RDS, Lambda, Glue, API Gateway, SQS, SNS, CloudWatch, CloudFormation, dbt, Python, SQL, shell scripting, Spark, REST APIs, webhooks, Power BI, GitHub Copilot, LLMs