About the Role

Join Apple’s G&A Solutions Engineering team as a Service Reliability Engineer to ensure the reliability, scalability, and performance of mission-critical production services. You will apply SRE principles, automate operational work, lead incident response, and collaborate across engineering, DBA, and network teams to maintain service health and resilience.

Job Description

Role

As a Service Reliability Engineer on Apple’s G&A Solutions Engineering team, you will maintain the health, stability, and efficiency of global production services. The role focuses on applying SRE principles to improve reliability, automating repetitive tasks, and collaborating with cross-functional teams to design services for operational excellence.

Key Responsibilities

Proactively monitor service performance, identify bottlenecks, and implement optimizations
Lead incident response, drive rapid resolution, and perform thorough root cause analysis (RCA)
Develop and implement automation strategies to streamline operations and reduce manual intervention
Apply SRE principles to maintain highly reliable and scalable service infrastructure
Work with development teams to ensure new services include best practices for monitoring, alerting, and scalability
Create and maintain documentation, including runbooks and service level objectives (SLOs)
Participate in on-call rotations to provide 24/7 support for critical services
Identify process improvement opportunities and drive initiatives to enhance team effectiveness
Define and track key service level indicators (SLIs) to measure and improve reliability

Minimum Qualifications

4+ years of experience in Site Reliability Engineering, DevOps, or a related role supporting large-scale, enterprise services
Strong proficiency in at least one programming language (examples: Python, Java, Go) and scripting (examples: Bash, PowerShell)
Experience with cloud platforms (examples: AWS, Azure, GCP) and cloud-native technologies (examples: Kubernetes, Docker)
Hands-on experience with monitoring and alerting tools (examples: Prometheus, Grafana, Splunk, Datadog)
Bachelor’s degree in Computer Science or equivalent work experience
Experience supporting enterprise-level services and participating in on-call rotations

Preferred Qualifications

Familiarity with CI/CD pipelines and DevOps practices
Experience with database technologies (examples: MySQL, PostgreSQL, NoSQL databases)
Knowledge of ITIL frameworks and incident management processes
Experience with configuration management tools (examples: Ansible, Chef, Puppet)
Understanding of Linux/Unix system administration
Experience with “vibe coding”

Service Reliability Engineer, G&A Solutions Engineering
