Software Engineer, Internal Infrastructure (Europe & UK) at Jobgether

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Software Engineer, Internal Infrastructure in Europe & UK.

As a Software Engineer in Internal Infrastructure, you will design, build, and operate large-scale, multi-cloud Kubernetes clusters that power advanced AI workloads. You will collaborate closely with research teams to deliver stable, scalable, and efficient systems, enabling high-performance model training and deployment. The role offers a blend of hands-on engineering, infrastructure optimization, and problem-solving in a fast-paced, innovation-driven environment. You will contribute to infrastructure that accelerates AI capabilities, improve observability and automation, and help other teams self-serve and troubleshoot efficiently. This is a highly collaborative role with opportunities to influence best practices, mentor teammates, and work on cutting-edge technologies in AI infrastructure.

Accountabilities:
- Build and operate Kubernetes GPU superclusters across multiple cloud environments.
- Partner with cloud providers to optimize infrastructure cost, performance, and reliability for AI workloads.
- Work closely with research teams to understand infrastructure needs and enhance stability, performance, and efficiency.
- Design resilient, scalable systems and intuitive user interfaces that empower researchers to self-serve.
- Implement software best practices and participate in code reviews, knowledge sharing, and on-call rotations.
- Contribute to open source tools and internal automation that improve platform scalability and maintainability.

Requirements:
- 6+ years of experience in software engineering, site reliability, or internal infrastructure roles.
- Deep experience running Kubernetes clusters at scale and managing cloud-native infrastructure.
- Strong programming skills in Go or Python, and familiarity with Infrastructure as Code tools.
- Experience troubleshooting Linux systems, distributed computing, and high-performance workloads.
- Ability to work independently, prioritize tasks, and drive solutions in a fast-paced environment.
- Strong communication and collaboration skills for working with cross-functional research and engineering teams.
- Bonus: experience with ML training infrastructure, GPU workloads, RDMA networking, or research collaboration.

Company Location: France.