Data Engineer (DBT + Spark + Argo) (Remote - Latam) at Jobgether


This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Data Engineer (DBT + Spark + Argo) in Latin America.

We are seeking a highly skilled Data Engineer to join a remote-first, collaborative team driving the modernization of large-scale data platforms in the healthcare sector. In this role, you will transform legacy SQL pipelines into modular, scalable, and testable DBT architectures, leveraging Spark for high-performance processing and Argo for workflow orchestration. You will implement modern lakehouse solutions, optimize storage and querying strategies, and enable real-time analytics with ElasticSearch. This position offers the chance to contribute to a cutting-edge, cloud-native data environment, working closely with cross-functional teams to deliver reliable, impactful data solutions.

Accountabilities:
- Translate legacy T-SQL logic into modular, scalable DBT models powered by Spark SQL.
- Build reusable, high-performance data transformation pipelines.
- Develop testing frameworks to ensure data accuracy and integrity within DBT workflows.
- Design and orchestrate automated workflows using Argo Workflows and CI/CD pipelines with Argo CD.
- Manage reference datasets and mock data (e.g., ICD-10, CPT), maintaining version control and governance.
- Implement efficient storage and query strategies using Apache Hudi, Parquet, and Iceberg.
- Integrate ElasticSearch for analytics through APIs and pipelines supporting indexing and querying.
- Collaborate with DevOps teams to optimize cloud storage, enforce security, and ensure compliance.
- Participate in Agile squads, contributing to planning, estimation, and sprint reviews.

Requirements:
- Strong experience with DBT for data modeling, testing, and deployment.
- Hands-on proficiency in Spark SQL, including performance tuning.
- Solid programming skills in Python for automation and data manipulation.
- Familiarity with Jinja templating to build reusable DBT components.
- Practical experience with data lake formats: Apache Hudi, Parquet, and Iceberg.
- Expertise in Argo Workflows and CI/CD integration with Argo CD.
- Deep understanding of AWS S3 storage, performance tuning, and cost optimization.
- Experience with ElasticSearch for indexing and querying structured and unstructured data.
- Knowledge of healthcare data standards (e.g., ICD-10, CPT).
- Ability to work cross-functionally in Agile environments.

Nice to have:
- Experience with Docker, Kubernetes, cloud-native data tools (AWS Glue, Databricks, EMR), CI/CD automation, data compliance standards (HIPAA, SOC2), or contributions to open-source DBT/Spark projects.

Company Location: Colombia.