Senior Site Reliability Engineer (SRE) (Remote - US) at Jobgether


This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer (SRE) in the United States.

This role is perfect for an experienced engineering professional who thrives in high-growth, fast-paced environments. You will design, implement, and maintain scalable, secure, and highly available cloud infrastructure while driving automation and performance optimization across systems. Collaborating closely with cross-functional teams, you will enhance system reliability, monitor performance, and lead incident response initiatives. This position allows you to mentor junior engineers, contribute to architectural decisions, and help shape a culture of continuous improvement. Your work will directly impact platform stability, scalability, and overall user experience, ensuring critical systems operate seamlessly at scale.

Accountabilities:

- Design, implement, and maintain highly available, scalable, and secure cloud-based infrastructure.
- Develop and maintain automation tools for deployment, monitoring, and management of infrastructure and applications.
- Collaborate with development, operations, and QA teams to promote a reliability-focused engineering culture.
- Monitor system performance, identify bottlenecks, and optimize resource utilization.
- Implement and manage monitoring, alerting, and logging systems to detect and respond to incidents efficiently.
- Lead incident response during outages, perform root cause analysis, and implement preventive measures.
- Evaluate and adopt best practices, tools, and technologies to improve system reliability, scalability, and security.
- Mentor junior engineers and foster a culture of collaboration, innovation, and continuous learning.

Requirements:

- 5+ years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role in a fast-paced startup environment.
- Deep knowledge of cloud computing and experience with AWS, Azure, or GCP.
- Proficiency with infrastructure as code tools such as Terraform, CloudFormation, Ansible, or Jenkins.
- Strong programming skills in Python, Go, or Java.
- Experience with containerization (Docker) and container orchestration platforms (Kubernetes).
- Solid understanding of networking concepts, TCP/IP, and network troubleshooting.
- Excellent problem-solving skills and ability to troubleshoot complex systems under pressure.
- Strong communication and collaboration skills to work effectively in cross-functional teams.

Company Location: United States.