About the Role

Apple's G&A Solutions Engineering team seeks a Service Reliability Engineer to ensure the reliability, scalability, and performance of mission-critical production services by applying SRE principles, automating operations, and leading incident response efforts. The role collaborates with engineers, DBAs, and network specialists to monitor systems, drive continuous improvement, and support 24/7 on-call duties.

Job Description

Role

Join Apple’s General & Administrative Solutions Engineering team as a Service Reliability Engineer to maintain the health, stability, and efficiency of global production services. You will apply Site Reliability Engineering principles, automate operational tasks, lead incident response and root cause analysis, and collaborate with cross-functional teams to ensure services are designed and operated for reliability and scalability.

Key Responsibilities

Proactively monitor service performance and identify bottlenecks
Lead incident response and conduct thorough root cause analysis (RCA)
Develop and implement automation strategies to reduce manual intervention
Apply SRE principles to maintain scalable, highly available infrastructure
Collaborate with development teams on monitoring, alerting, and operational design
Create and maintain runbooks and document service level objectives (SLOs)
Participate in 24/7 on-call rotations and respond to incidents promptly
Define and monitor service level indicators (SLIs) and drive process improvements
Champion continuous learning and knowledge sharing within the team

Minimum Qualifications

4+ years of experience in Site Reliability Engineering, DevOps, or related roles supporting large-scale enterprise services
Strong proficiency in at least one programming language (e.g., Python, Java, Go) and scripting (e.g., Bash, PowerShell)
Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud-native technologies (e.g., Kubernetes, Docker)
Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Datadog)
Understanding of Linux/Unix system administration
Bachelor’s degree in Computer Science or equivalent work experience

Preferred Qualifications

Familiarity with CI/CD pipelines and DevOps practices
Experience with database technologies (MySQL, PostgreSQL, NoSQL)
Knowledge of ITIL frameworks and incident management
Experience with configuration management tools (Ansible, Chef, Puppet)
Experience with vibe coding

Service Reliability Engineer, G&A Solutions Engineering
