Site Reliability Engineer (Remote - Latam) at Jobgether

Source: https://jobs.workable.com/view/2LXuvQ6qw3W4avkFngb9t2/site-reliability-engineer-(remote---latam)-in-chile-at-jobgether

This position is posted by Jobgether on behalf of Launchpad Technologies. We are currently looking for a Site Reliability Engineer in Latam.

Join a global team at the intersection of development and operations, where you'll help design, automate, and maintain high-availability systems for leading international clients. This role offers the chance to work remotely while solving complex infrastructure challenges, improving system resilience, and ensuring consistent performance across cloud environments. You'll contribute to mission-critical platforms, leveraging automation, monitoring, and modern DevOps tools in a collaborative, people-first culture that supports continuous learning and professional growth.

Accountabilities:

- Collaborate with cross-functional teams to ensure system uptime, performance, and reliability.
- Design and implement robust monitoring and alerting systems to proactively manage infrastructure health.
- Automate infrastructure provisioning and deployments using Infrastructure as Code (IaC).
- Manage and support containerized environments (Docker/Kubernetes) in cloud-native setups.
- Perform root cause analysis, respond to incidents, and minimize service disruptions.
- Conduct capacity planning and performance optimization for scalable systems.
- Support CI/CD pipeline maintenance and infrastructure automation workflows.

Requirements:

- Proven experience in Site Reliability, DevOps, or Infrastructure Engineering roles.
- Strong scripting and automation skills in Python, Bash, or Go.
- Proficiency with Docker and Kubernetes for container orchestration.
- Hands-on experience with cloud platforms (AWS, Azure, or Google Cloud).
- Familiarity with monitoring/logging tools like Prometheus, Grafana, or the ELK stack.
- Solid understanding of performance tuning, fault tolerance, and system observability.
- Strong communication skills in English and a collaborative problem-solving mindset.

Bonus points for:

- Knowledge of networking and security best practices.
- Experience managing incident response and conducting post-mortem reviews.
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Kubernetes Administrator).
- Background in designing highly available and fault-tolerant systems.

Company Location: Chile.