Data & AI Specialist - Data Scraping, Enrichment & Quality Assurance at division50

Overview

We’re looking for a data-obsessed explorer who can build and maintain pipelines that collect, clean, and enhance large volumes of data, then apply AI tools to keep it accurate, useful, and ready for analysis. This is initially a project-based role with the possibility of evolving into a full-time contract based on performance and business needs.

Key Responsibilities

Data Acquisition & Scraping
- Design, develop, and maintain scalable web-scraping systems and APIs to collect structured and unstructured data from diverse sources.
- Ensure compliance with data privacy laws (GDPR, CCPA) and site-specific terms of service.

Data Enrichment & Transformation
- Implement pipelines to clean, normalize, and enrich raw data using third-party datasets, natural language processing (NLP), and machine learning techniques.
- Build automated matching and deduplication processes to maintain a unified source of truth.

Quality Assurance & Monitoring
- Create automated QA checks to validate data accuracy, completeness, and consistency.
- Set up monitoring and alert systems to catch anomalies or pipeline failures early.

AI & Process Optimization
- Integrate AI models for entity extraction, text classification, and predictive enrichment.
- Work with the data science team to design features that feed analytics and machine learning models.

Collaboration & Documentation
- Partner with product, engineering, and analytics teams to define data requirements and priorities.
- Maintain clear technical documentation and data lineage records.

Requirements
- Strong programming skills in Python (Scrapy, BeautifulSoup, Selenium, Playwright) or equivalent languages.
- Experience with data pipelines and ETL tools (Airflow, Prefect, or similar).
- Proficiency in SQL/NoSQL databases and data warehousing (e.g., BigQuery, Snowflake).
- Familiarity with cloud platforms (AWS, GCP, or Azure) and containerization (Docker/Kubernetes).
- Knowledge of machine learning workflows and libraries (scikit-learn, spaCy, Hugging Face) is a big plus.
- Solid understanding of data privacy and ethical data collection practices.

Nice-to-Have
- Experience with LLMs (large language models) for text enrichment.
- Background in data visualization or BI tools (Tableau, Looker, Power BI).
- Familiarity with real-time streaming data (Kafka, Kinesis).

Traits for Success
- Detail-oriented with a knack for spotting hidden data issues.
- Curious problem solver who loves automation and efficiency.
- Comfortable in a fast-paced environment where requirements evolve quickly.

Company Location: Pakistan.
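
For candidates gauging the day-to-day work, the core loop behind these responsibilities (scrape, normalize, deduplicate, run automated QA) can be illustrated with a minimal Python sketch. The URL, CSS selector, and field names below are hypothetical placeholders, not division50's actual pipeline.

    # Minimal sketch: scrape -> normalize -> dedupe -> QA.
    # URL, selector, and field names are illustrative assumptions only.
    import requests
    from bs4 import BeautifulSoup

    def scrape_listings(url: str) -> list[dict]:
        """Fetch a page and extract one record per listing (assumed markup)."""
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        return [
            {"name": card.get_text(strip=True), "source_url": url}
            for card in soup.select("div.listing")  # hypothetical selector
        ]

    def normalize(record: dict) -> dict:
        """Basic cleaning: collapse whitespace and lowercase the match key."""
        record["name"] = " ".join(record["name"].split()).lower()
        return record

    def dedupe(records: list[dict]) -> list[dict]:
        """Keep the first occurrence of each normalized name (exact-match dedup)."""
        seen, unique = set(), []
        for r in records:
            if r["name"] not in seen:
                seen.add(r["name"])
                unique.append(r)
        return unique

    def qa_check(records: list[dict]) -> None:
        """Automated QA: fail fast on empty batches or missing required fields."""
        assert records, "QA: scrape returned no records"
        missing = [r for r in records if not r.get("name")]
        assert not missing, f"QA: {len(missing)} records missing 'name'"

    if __name__ == "__main__":
        raw = scrape_listings("https://example.com/listings")  # placeholder URL
        clean = dedupe([normalize(r) for r in raw])
        qa_check(clean)
        print(f"{len(clean)} unique, validated records")

In production this loop would typically run under an orchestrator such as Airflow or Prefect, with the QA assertions replaced by monitoring and alerting, as the responsibilities above describe.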