Senior Site Reliability Engineer at Jahnel Group

Source: https://remotive.com/remote-jobs/devops/senior-site-reliability-engineer-2034178

Location Information: USA

Jahnel Group’s mission is to provide the absolute best environment for software creators to pursue their passion by connecting them with great clients doing meaningful work.

This is a full-time position with one of our closest clients.

Who We're Looking For

The Senior Site Reliability Engineer (SRE) ensures the reliability, scalability, and performance of the client's cloud-based software solutions. This role blends software engineering and systems administration to support and enhance critical infrastructure, working closely with development and operations teams to deliver secure and cost-effective cloud environments.

Primary Responsibilities

Cloud Infrastructure Architecture and Implementation: Designs, builds, and maintains robust cloud infrastructure solutions using AWS and other cloud technologies.
Mentorship and Team Development: Provides technical guidance and mentorship to junior SREs, promoting a culture of continuous learning and improvement.
Operational Efficiency and Automation: Identifies and implements process improvements through automation and optimization to enhance reliability and reduce manual effort.
Performance and Reliability Management: Develops and executes strategies to meet and exceed Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
Incident Management: Leads incident response efforts, performs root cause analysis, and implements preventive measures to minimize downtime.
Capacity Planning and System Optimization: Proactively identifies performance bottlenecks, optimizes resource utilization, and ensures system scalability.
Security and Compliance: Implements cloud security best practices, including least-privilege IAM policies, secrets management, and evidence generation for compliance frameworks (e.g., SOC 2, ISO 27001).
Other duties and projects as assigned.

Some Must-Haves

5+ years in Site Reliability Engineering or a similar role.
Extensive expertise in the AWS (Amazon Web Services) cloud platform and services.
Experience with GitOps practices and CI/CD tooling (e.g., GitHub Actions, Jenkins, ArgoCD, or similar).
Experience with Infrastructure as Code (e.g., Terraform).
Experience designing and maintaining observability stacks (e.g., Prometheus, Grafana, ELK) with a focus on actionable metrics, alerting, and SLOs.
Strong problem-solving, troubleshooting, and analytical skills.
Excellent communication and collaboration abilities.
Organizational skills with attention to detail.
Ability to manage time and prioritize tasks.
Proficiency in scripting languages (e.g., Python, PowerShell).
In-depth knowledge of Linux systems, networking, load balancing, and security principles.

Where We're Looking For It

Texas or New York (flexibility to work remotely).

Compensation

$100,000.00 to $150,000.00. Salary is established based on various factors, including, but not limited to, prior employment history, job-related knowledge, education and training, skills, and geographic location.