Senior Platform Engineer/SRE - Tech Lead Critical Infrastructure Transformation at CloudLinux

Build the internal platform that powers our engineering teams, delivering mission-critical software to 4,000+ cloud hosting providers worldwide.

CloudLinux powers 4,000+ hosting providers managing millions of websites globally. Our infrastructure team is at a critical inflection point: moving from 8+ years of technical debt to building a modern platform. This isn't a typical SRE role; it's a chance to architect the future of infrastructure that cannot fail.

Where we are:
Legacy systems, reactive operations, bus factor = 1. OpenNebula bottlenecks blocking releases. 70% of time spent on firefighting.

Where we're going:
Self-service platform, Infrastructure as Code, proactive engineering. You'll be one of 2-3 senior engineers leading this transformation alongside a new Infrastructure Director with full B-level support.

What You'll Actually Do

Stabilize & Assess:
- Deep dive into OpenNebula issues with the existing team.
- Map critical dependencies and single points of failure.
- Implement quick wins (automated VM cleanup, closing monitoring gaps).
- Begin documenting undocumented systems.

Build Foundation:
- Lead the design and development of an internal development platform (IDP).
- Implement GitOps for critical workflows.
- Establish SLIs/SLOs for core services.
- Create runbooks for top incidents.

Transform Platform:
- Architect a self-service Internal Developer Platform.
- Drive Infrastructure as Code to 60%+ coverage.
- Eliminate single points of failure.
- Drive development and implementation of complex architectural decisions.

Technical Stack You'll Transform

Current:
- Virtualization: OpenNebula (main bottleneck), oVirt/OpenStack/CloudStack, KVM.
- Storage: Ceph (recently stabilized), Cephadm, Rook.
- Network: Juniper.
- Hosting: bare metal (3 datacenters) + AWS + Google Cloud + Azure.
- Automation: ~5% Terraform coverage, manual operations dominant.
- CI/CD: GitLab, Jenkins, Gerrit, GitHub.

Your Tools for Transformation:
- Kubernetes and KubeVirt, and/or whatever else is necessary.
- Terraform/Terragrunt + Ansible.
- GitOps (ArgoCD/Flux).
- Python/Go for custom tooling.
- Modern observability stack.

To thrive in this role, we are looking for someone who has:
- Migrated legacy systems to modern platforms at scale.
- Strong Kubernetes production experience (multi-tenant, federation).
- Infrastructure as Code expertise (Terraform/Ansible in production).
- Linux at scale (RHEL/CentOS/AlmaLinux, 1000+ servers).
- Network fundamentals, underlay and overlay (EVPN, BGP, VXLAN, DNS, network architecture & segmentation, native pod networking at scale).
- Proven ability to work independently with minimal documentation.
- Experience building self-service platforms.
- English B2+ and excellent documentation skills.

Critical Mindset:
- Comfortable with ambiguity and technical debt.
- Pragmatic: know when to fix vs. replace vs. work around.
- Can balance firefighting with strategic improvements.
- Strong opinions, loosely held.
- Teaching mentality: you'll help upskill the team.

What Makes You Successful Here:
- You'll have significant technical decision-making power and direct impact.
- New Infrastructure Director + B-level backing for transformation.
- Approved investment in people and technology.
- Full authority to simplify and modernize.
- Protected time for strategic work, not just operations.

The Opportunity

This isn't about maintaining the status quo. You'll:
- Define infrastructure strategy affecting 4,000+ companies.
- Build an internal development platform.
- Lead technical transformation with real budget and support.
- Become the principal architect of a modern platform.
- Work directly with the Infrastructure Director.
- Shape how critical infrastructure software gets delivered globally.

Company Location: Georgia.