
Platform Engineer / SRE (Kubernetes) at Portainer.io

As part of the global Platform Engineering team at Portainer, this role is critical to ensuring the reliability, scalability, and efficiency of large-scale, self-managed Kubernetes environments across customer data centers. You’ll be working directly with customer platform teams to operate and improve their Kubernetes estate, enhance observability and automation, and extend platform capabilities via Portainer and complementary tools. This is a high-impact role that blends deep infrastructure knowledge, cloud native expertise, and a DevOps/SRE mindset to support mission-critical systems across global time zones.

We’re looking for someone who’s done the hard yards - not just operating Kubernetes, but engineering it at scale. You’ve faced real-world incidents, solved complex infrastructure problems, and carry the kind of experience that only comes from owning production systems end-to-end. This role demands practical expertise earned through building, breaking, and hardening Kubernetes platforms in demanding environments.

- Operate and manage self-hosted Kubernetes clusters at scale (5,000+ nodes per region) across multiple sites.
- Serve as a subject-matter expert on Kubernetes internals, delivering proactive support, performance tuning, and architectural recommendations.
- Enable and extend platform tooling using Portainer, integrating it with identity, observability, and lifecycle management systems.
- Design and automate Day-2 operational workflows including node lifecycle, network overlays, and storage provisioning.
- Lead technical engagements such as architecture reviews, operational readiness assessments, and incident postmortems.
- Build and maintain IaC pipelines and GitOps patterns using tools like Terraform, ArgoCD, and Flux.
- Troubleshoot and resolve advanced infrastructure issues related to scheduling, networking, DNS, ingress, and runtime isolation.
- Contribute to internal reusable tooling, engineering standards, and automation frameworks.
- Collaborate with customer stakeholders and internal technical teams across time zones as part of a 24/7 high-availability model.

Skills & Qualifications:

- 6+ years of hands-on experience in Platform Engineering, DevOps, or SRE roles.
- 3+ years operating large-scale on-prem or self-managed Kubernetes clusters in production.
- Deep understanding of Kubernetes control-plane components (API server, etcd, controller-manager, scheduler).
- Experience with Portainer or other Kubernetes platform management tools (e.g., Rancher, Lens, OpenShift).
- Proficiency in service mesh technologies such as Istio and Envoy.
- Demonstrable experience in Go is a strong advantage, particularly in building custom Kubernetes operators or contributing upstream (e.g., submitting PRs to Kubernetes core or CNCF projects).
- Advanced skills in Infrastructure as Code (Terraform, Helm, Kustomize) and GitOps workflows.
- Solid knowledge of CNI plugins (e.g., Cilium, Calico), ingress controllers, and CSI drivers.
- Scripting and automation using Python, Ansible, Terraform, or Bash.
- Familiarity with observability tooling (Prometheus, Grafana, Loki, VictoriaMetrics, Mimir, etc.).
- Strong grasp of reliability engineering principles: SLOs, SLIs, chaos testing, and scaling patterns.

Company Location: India.