
Service Reliability Engineer, G&A Solutions Engineering

Apple
4.1(13994)
👥10k+
Software Engineering
Austin, TX
3 weeks ago
🤖 AI-First, 🛠️ Cursor-friendly

Lists "vibe coding" as a preferred qualification.

About the Role

Apple's G&A Solutions Engineering team seeks a Service Reliability Engineer to ensure the reliability, scalability, and performance of mission-critical production services by applying SRE principles, automating operations, and leading incident response efforts. The role collaborates with engineers, DBAs, and network specialists to monitor systems, drive continuous improvement, and support 24/7 on-call duties.

Job Description

Role

Join Apple’s General & Administrative Solutions Engineering team as a Service Reliability Engineer to maintain the health, stability, and efficiency of global production services. You will apply Site Reliability Engineering principles, automate operational tasks, lead incident response and root cause analysis, and collaborate with cross-functional teams to ensure services are designed and operated for reliability and scalability.

Key Responsibilities

  • Proactively monitor service performance and identify bottlenecks
  • Lead incident response and conduct thorough root cause analysis (RCA)
  • Develop and implement automation strategies to reduce manual intervention
  • Apply SRE principles to maintain scalable, highly available infrastructure
  • Collaborate with development teams on monitoring, alerting, and operational design
  • Create and maintain documentation such as runbooks and service level objectives (SLOs)
  • Participate in 24/7 on-call rotations and respond to incidents promptly
  • Define and monitor service level indicators (SLIs) and drive process improvements
  • Champion continuous learning and knowledge sharing within the team

Minimum Qualifications

  • 4+ years of experience in Site Reliability Engineering, DevOps, or related roles supporting large-scale enterprise services
  • Strong proficiency in at least one programming language (e.g., Python, Java, Go) and scripting (e.g., Bash, PowerShell)
  • Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud-native technologies (e.g., Kubernetes, Docker)
  • Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Datadog)
  • Understanding of Linux/Unix system administration
  • Bachelor’s degree in Computer Science or equivalent work experience

Preferred Qualifications

  • Familiarity with CI/CD pipelines and DevOps practices
  • Experience with database technologies (MySQL, PostgreSQL, NoSQL)
  • Knowledge of ITIL frameworks and incident management
  • Experience with configuration management tools (Ansible, Chef, Puppet)
  • Experience with vibe coding

Tech Stack

MySQL, PostgreSQL, NoSQL, Ansible, Chef, Puppet, Python, Java, Go, Bash, PowerShell, AWS, Azure, GCP, Kubernetes, Docker, Prometheus, Grafana, Splunk, Datadog, Linux/Unix, CI/CD, ITIL, vibe coding

Skills

Site Reliability Engineering, Automation, Incident Response, Root Cause Analysis, Monitoring and Alerting, CI/CD and DevOps Practices, Linux/Unix System Administration, Configuration Management, Collaboration, Documentation, On-call Operations, Process Improvement, Continuous Learning

Experience Level

Mid