
Infrastructure/DevOps (Remote - Canada or US) at Jobgether

This position is posted by Jobgether on behalf of Abnormal AI. We are currently looking for an Infrastructure/DevOps Engineer in Canada or the US.

In this fully remote role, you will play a pivotal part in enabling AI software engineers to innovate quickly by designing, building, and maintaining secure, scalable, and reliable infrastructure. You’ll collaborate closely with IT, security, and AI/ML engineering teams to ensure the systems powering AI experimentation, deployment, and monitoring are efficient and future-ready. The position combines systems engineering expertise with AI platform enablement, offering the opportunity to solve complex operational challenges and boost productivity across the organization. This role is perfect for a professional who values automation, operational excellence, and delivering measurable impact in a fast-moving, collaborative environment.

Accountabilities

- Architect, manage, and optimize infrastructure supporting AI/ML pipelines, tools, and data platforms.
- Implement and maintain containerization (Docker) and orchestration (Kubernetes) environments.
- Develop CI/CD systems integrated with ML workflows to ensure reproducible experiments.
- Collaborate with security and compliance teams to meet data protection and regulatory standards.
- Automate provisioning and deployment using Infrastructure as Code tools (Terraform, Pulumi, or Ansible).
- Monitor and troubleshoot systems using observability tools such as Prometheus, Grafana, and the ELK stack.
- Work with AI and software engineers to enhance platform performance and resource efficiency.
- Produce and maintain clear, accessible documentation to support knowledge sharing.

Requirements

- 4+ years’ experience in DevOps, Site Reliability, or Infrastructure Engineering.
- Proficiency in cloud platforms (AWS preferred), Kubernetes, and Docker.
- Skilled with Infrastructure as Code tools such as Terraform, Ansible, or Pulumi.
- Strong scripting abilities in Python, Bash, or similar languages.
- Hands-on experience with CI/CD systems like GitHub Actions, Jenkins, or CircleCI.
- Solid understanding of networking, security, and identity management in cloud environments.
- Experience supporting ML workloads and GPU-based infrastructure.
- Proven ability to troubleshoot complex distributed systems and work cross-functionally.
- Bonus: familiarity with MLOps tools (MLflow, Kubeflow, SageMaker), AI platform infrastructure, data platforms (Snowflake, Databricks, Hadoop), AWS certification, or experience in high-growth tech environments.

Company Location: United States.