Service Reliability Engineer, G&A Solutions Engineering
Note: this posting lists "vibe coding" as a preferred skill.
About the Role
Apple's G&A Solutions Engineering team seeks a Service Reliability Engineer to ensure the reliability, scalability, and performance of mission-critical production services by applying SRE principles, automating operations, and leading incident response. The engineer works with software engineers, DBAs, and network specialists to monitor systems, drive continuous improvement, and share 24/7 on-call duties.
Job Description
Role
Join Apple’s General & Administrative Solutions Engineering team as a Service Reliability Engineer to maintain the health, stability, and efficiency of global production services. You will apply Site Reliability Engineering principles, automate operational tasks, lead incident response and root cause analysis, and collaborate with cross-functional teams to ensure services are designed and operated for reliability and scalability.
Key Responsibilities
- Proactively monitor service performance and identify bottlenecks
- Lead incident response and conduct thorough root cause analysis (RCA)
- Develop and implement automation strategies to reduce manual intervention
- Apply SRE principles to maintain scalable, highly available infrastructure
- Collaborate with development teams on monitoring, alerting, and operational design
- Create and maintain documentation such as runbooks and service level objectives (SLOs)
- Participate in 24/7 on-call rotations and respond to incidents promptly
- Define and track service level indicators (SLIs) and drive process improvements
- Champion continuous learning and knowledge sharing within the team
Minimum Qualifications
- 4+ years of experience in Site Reliability Engineering, DevOps, or related roles supporting large-scale enterprise services
- Strong proficiency in at least one programming language (e.g., Python, Java, Go) and scripting (e.g., Bash, PowerShell)
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud-native technologies (e.g., Kubernetes, Docker)
- Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Datadog)
- Understanding of Linux/Unix system administration
- Bachelor’s degree in Computer Science or equivalent work experience
Preferred Qualifications
- Familiarity with CI/CD pipelines and DevOps practices
- Experience with database technologies (MySQL, PostgreSQL, NoSQL)
- Knowledge of ITIL frameworks and incident management
- Experience with configuration management tools (Ansible, Chef, Puppet)
- Experience with vibe coding