
Senior Data Engineer at CME

Location Information: Remote

This is a remote position.

We are seeking a self-motivated, intellectually curious Data Engineer to join our Data Science and Solutions team. This engineer will be responsible for building robust, scalable data pipelines using Databricks on AWS, integrating a wide range of data sources and structures into our AI and analytics platform. We have built our ‘minimum viable product’ and are now scaling up to support multi-tenancy in a highly secure environment.

The ideal candidate has more than 2 years’ experience in Databricks, preferably building scalable, high-quality data pipelines in a distributed, serverless cloud environment. You will be well-versed in CI/CD best practices, system monitoring, and the Databricks control surface, as you will be building infrastructure-as-code to deploy secure, isolated, and monitored environments and data pipelines for our end users and AI agents. Most of all, you will be an expert in collaboration in a distributed, remote environment, a team player, and always learning.

Data Pipeline Development
- Design, build, and maintain ETL/ELT pipelines in Databricks to ingest, clean, and transform data from diverse product sources.
- Construct gold layer tables in the Lakehouse architecture that serve both machine learning model training and real-time APIs.
- Monitor data quality, lineage, and reliability using Databricks best practices.

AI-Driven Data Access Enablement
- Collaborate with AI/ML teams to ensure data is modeled and structured to support natural language prompts and semantic retrieval using first- and third-party data sources, vector search, and Unity Catalog metadata.
- Help build data interfaces and agent tools to interact with structured data and AI agents to retrieve and analyze customer data with role-based permissions.

API & Serverless Backend Integration
- Work with backend engineers to design and implement serverless APIs (e.g., via AWS Lambda with TypeScript) that expose gold tables to frontend applications.
- Ensure APIs are performant, scalable, and designed with data security and compliance in mind.
- Utilize Databricks and other APIs to implement provisioning, deployment, security, and monitoring frameworks for scaling up data pipelines, AI endpoints, and security models for multi-tenancy.

Requirements
- 3+ years of experience as a Data Engineer or related role in an agile, distributed team environment with a quantifiable impact on business or technology outcomes.
- Proven expertise with Databricks, including job and workflow orchestration, change data capture, and medallion architecture.
- Proficiency in Spark or Scala for data wrangling and transformation on a wide variety of data sources and structures.
- Practitioner of CI/CD best practices and test-driven development, with familiarity with the MLOps/AIOps lifecycles.
- Proven ability to work in an agile environment with product managers, front-end engineers, and data scientists.

Preferred Skills
- Familiarity with AWS Lambda (Node.js/TypeScript preferred) and API Gateway or equivalent serverless platforms, knowledge of API design principles, and experience working with RESTful or GraphQL endpoints.
- Exposure to React-based frontend architecture and the implications of backend data delivery on UI/UX performance, including end-to-end telemetry to measure performance and accuracy for the end-user experience.
- Experience with A/B testing, experiment and inference logging, and analytics.