About the Role

Join Apple’s G&A Solutions Engineering team as a Service Reliability Engineer responsible for ensuring the reliability, scalability, and performance of mission-critical services. The role focuses on applying SRE principles, automating operational tasks, leading incident response, and collaborating across engineering, database, and network teams to maintain production health.

Job Description

Role

Apple’s G&A Solutions Engineering team is seeking a Service Reliability Engineer to maintain the health, stability, and efficiency of global production services. The engineer will apply SRE principles to improve reliability, scalability, and performance while automating operational work and collaborating with cross-functional teams.

Key Responsibilities

Proactively monitor service performance, identify bottlenecks, and implement solutions to improve efficiency and resilience
Lead incident response, drive rapid resolution, and perform thorough root cause analysis (RCA)
Develop and implement automation strategies to reduce manual intervention and improve service resilience
Apply SRE principles to operate and maintain highly reliable, scalable service infrastructure
Collaborate with development teams to design services with monitoring, alerting, and scalability in mind
Create and maintain documentation, runbooks, and service level objectives (SLOs)
Participate in on-call rotations to provide 24/7 support for critical services
Define and monitor service level indicators (SLIs) and drive process improvement initiatives
Champion continuous learning and knowledge sharing within the team

Minimum Qualifications

4+ years of experience in Site Reliability Engineering, Production Support, or related roles supporting large-scale enterprise services
Strong proficiency in at least one programming language (examples: Python, Java, Go) and at least one scripting language (examples: Bash, PowerShell)
Experience with cloud platforms (examples: AWS, Azure, GCP) and cloud-native technologies (examples: Kubernetes, Docker)
Hands-on experience with monitoring and alerting tools (examples: Prometheus, Grafana, Splunk, Datadog)
Bachelor’s degree in Computer Science or equivalent work experience

Preferred Qualifications

Familiarity with CI/CD pipelines and DevOps practices
Experience with database technologies (e.g., MySQL, PostgreSQL, NoSQL databases)
Knowledge of ITIL frameworks and incident management processes
Experience with vibe coding
Understanding of Linux/Unix system administration
Experience with configuration management tools such as Ansible, Chef, or Puppet

Service Reliability Engineer, G&A Solutions Engineering
