Service Reliability Engineer, G&A Solutions Engineering
Explicitly requires vibe coding skills.
About the Role
Join Apple's G&A Solutions Engineering team as a Service Reliability Engineer to ensure the reliability, scalability, and performance of mission-critical production services. You will apply SRE principles, automate operational tasks, lead incident response, and collaborate with engineers and operations teams to maintain service health and resiliency.
Job Description
Role
As a Service Reliability Engineer on Apple’s General and Administrative (G&A) Solutions Engineering team, you will maintain the health, stability, and efficiency of global, mission-critical production services. The role focuses on applying SRE principles to ensure scalability and performance, automating operational work, leading incident response, and partnering with development and operations teams to design for operational excellence.
Key Responsibilities
- Proactively monitor service performance, identify bottlenecks, and implement solutions to optimize efficiency and resilience
- Lead incident response efforts, drive rapid resolution, and perform thorough root cause analysis (RCA)
- Develop and implement automation strategies to reduce manual intervention and improve service resilience
- Apply SRE principles to maintain and scale service infrastructure
- Collaborate with development teams to design services with monitoring, alerting, and scalability best practices
- Create and maintain documentation, runbooks, and service level objectives (SLOs)
- Participate in on-call rotations to provide 24/7 support for critical services
- Define and monitor key service level indicators (SLIs) to measure service reliability
- Identify process improvement opportunities and drive continuous improvement initiatives
- Promote a culture of continuous learning and knowledge sharing within the team
Requirements
Minimum Qualifications
- 4+ years of experience in Site Reliability Engineering, DevOps, or a related role supporting large-scale enterprise services
- Strong proficiency in at least one programming language (e.g., Python, Java, Go) and in scripting languages (e.g., Bash, PowerShell)
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud-native technologies (e.g., Kubernetes, Docker)
- Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Datadog)
- Bachelor’s degree in Computer Science or equivalent work experience
Preferred Qualifications
- Familiarity with CI/CD pipelines and DevOps practices
- Experience with database technologies (e.g., MySQL, PostgreSQL, NoSQL databases)
- Knowledge of ITIL frameworks and incident management processes
- Experience with vibe coding
- Understanding of Linux/Unix system administration
- Experience with configuration management tools (e.g., Ansible, Chef, Puppet)
Technologies and Tools Mentioned
Prometheus, Grafana, Splunk, Datadog, Kubernetes, Docker, AWS, Azure, GCP, Python, Java, Go, Bash, PowerShell, MySQL, PostgreSQL, NoSQL, Ansible, Chef, Puppet, Linux/Unix, CI/CD