Machine Learning Platform Engineer (Remote - Europe) at Jobgether

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Machine Learning Platform Engineer in Europe.

We are seeking a talented Machine Learning Platform Engineer to support the infrastructure and operational needs of AI-driven systems. In this role, you will collaborate closely with AI researchers, engineers, and product teams to deploy, manage, and optimize cloud-based machine learning workloads. You will design scalable, reliable infrastructure, implement CI/CD pipelines, and maintain observability for production systems. The ideal candidate combines hands-on expertise in Kubernetes, AWS, Infrastructure-as-Code, and workflow orchestration with a proactive, problem-solving mindset. This role offers the opportunity to work in a fast-growing AI environment, influence platform architecture, and directly contribute to delivering innovative AI-powered solutions at scale.

Accountabilities
Deploy, manage, and optimize machine learning models and data pipelines on Kubernetes (EKS) and cloud infrastructure.
Collaborate with AI researchers to support infrastructure requirements and facilitate research-to-production workflows.
Implement and maintain Infrastructure-as-Code (Terraform/Terragrunt) for provisioning and scaling cloud resources.
Build and manage CI/CD pipelines to automate model and application deployment.
Monitor, troubleshoot, and enhance system performance using observability tools (e.g., Datadog).
Manage workflow orchestration platforms (e.g., Temporal.io) to streamline model and data lifecycles.
Partner with finance and operations teams to integrate cost management and vendor oversight (FinOps).

Requirements
Proven experience with cloud-based hosting, preferably AWS, and container orchestration platforms like Kubernetes (EKS).
Strong knowledge of Infrastructure-as-Code frameworks such as Terraform and Terragrunt.
Experience building and managing CI/CD pipelines and deployment workflows.
Proficiency with scripting languages such as Python and operational support of production systems.
Familiarity with observability and monitoring platforms (e.g., Datadog) for large-scale infrastructure.
Ability to work collaboratively with AI researchers and engineering teams in an asynchronous, remote-first environment.
Strong problem-solving skills, adaptability in dynamic environments, and attention to detail.
Preferred: experience with high-volume SaaS products, workflow orchestration platforms (Temporal.io), and frontend or full-stack application support.

Company Location: Spain.