Site Reliability Engineer at Dropbox

Source: https://www.workingnomads.com/jobs/site-reliability-engineer-dropbox-1677793

Role Description

As a Corporate Site Reliability Engineer (SRE) at Dropbox, you will help lead the infrastructure strategy and technical direction of one of the most innovative technology companies globally. Successful candidates will have a growth mindset and strong accountability, and be passionate about designing, building, and securing scalable infrastructure services in a dynamic environment. You will drive improvement projects in automation and observability and handle incidents that arise in a prompt but measured way. In this role, you'll serve as a technical lead of programs related to monitoring, metrics, alerting, and reliability throughout the IT Services organization, and contribute to the evolution of our world-class infrastructure while ensuring utmost security and scalability.

Our Engineering Career Framework is viewable by anyone outside the company and describes what's expected for our engineers at each of our career levels. Check out our blog post on this topic and more here.

Responsibilities

- Ensure the reliability, scalability, and performance of Dropbox's infrastructure and services.
- Collaborate with cross-functional teams to develop and maintain best practices for monitoring, logging, and incident response.
- Build, implement, and maintain automation and infrastructure-as-code tooling, specifically Terraform, Ansible, and GitHub Actions, as well as custom code platforms.
- Use container orchestration platforms, such as Kubernetes, Amazon ECS, and Red Hat OpenShift, to manage containers at scale.
- Manage and optimize monitoring and logging pipelines using tools like Datadog and Cribl LogStream.
- Drive improvement projects related to service health and visibility for our stakeholders, ranging from developers to business service owners to C-level executives.
- Develop and maintain custom tooling and automation scripts in Bash, Python, and other scripting languages.

On-call work may be necessary occasionally to help address bugs, outages, or other operational issues, with the goal of maintaining a stable and high-quality experience for our customers.

Requirements

- 5+ years of experience in site reliability engineering or a similar engineering role with hands-on coding experience.
- Strong knowledge of AWS services, including EC2, S3, RDS, Route 53, Lambda, and others.
- Strong knowledge of Linux administration, internals, filesystems, and volume management, of specific distros such as Ubuntu and RHEL, and of DNS and DHCP.
- Experience with monitoring and logging tools such as Datadog, and with logging pipeline tools such as Vector or Cribl LogStream.
- Experience driving one or more transformational programs related to metrics and observability.
- Experience with scripting in a higher-level language (Python preferred).
- Experience developing automation to solve infrastructure-related tasks with tools such as Chef, Ansible, and Terraform.
- Experience with log analysis and building metrics, alerts, and visuals from log data.
- Strong proficiency in infrastructure-as-code tools, such as Terraform.
- Strong proficiency in configuration management tools, specifically Ansible Automation Platform and Chef.
- Experience with containerization technologies, such as Docker, and container orchestration platforms like Kubernetes or Amazon ECS.
- Knowledge of LDAP, REST APIs, and current auth methods.
- Familiarity with GitHub and Git-based workflows.
- Understanding of RDS databases and network security technologies, such as WAF.
- Strong problem-solving skills and the ability to work well in a fast-paced, collaborative environment.
- Excellent written and verbal communication skills.

Preferred Qualifications

- Experience managing large-scale multi-cloud or hybrid infrastructure.
- Strong background in Infrastructure as Code (Terraform, Ansible) and GitOps workflows.
- Familiarity with Kubernetes, Docker, and serverless platforms.
- Proven track record improving observability, reliability, and incident response.
- Understanding of compliance and security frameworks (SOC 2, ISO 27001, FedRAMP).
- Experience implementing Zero Trust security and access models.

Compensation

Poland Pay Range: 272 000 zł to 368 000 zł (PLN)