Senior Scraping Infrastructure Engineer at OnHires

Source: https://jobs.ashbyhq.com/onhires/2f8bf345-2ffb-4ae3-ae20-1fb17564ccaf


Location: Brazil (remote)

Our client is a Berlin-based, remote-first scale-up providing cutting-edge market intelligence and software solutions to the automotive industry. As the company enters a new phase of growth, they are looking for an experienced Scraping Infrastructure Engineer to strengthen their international, high-impact team.

If you thrive on architecting and maintaining highly resilient, large-scale scraping systems capable of handling sophisticated anti-bot and blocking mechanisms, this role is for you. You will be responsible for the entire lifecycle of our high-volume scraping pipelines, focusing on the infrastructure, tooling, and strategic defenses that guarantee accurate, consistent, and high-speed data collection at scale.

Responsibilities

Infrastructure Strategy & Architecture: Architect, build, and maintain the core infrastructure for a large-scale asynchronous data extraction system.
Advanced Resilience Engineering: Design, implement, and continuously optimize anti-blocking strategies, IP rotation, fingerprint management, and anti-bot bypass techniques to ensure high reliability and consistent uptime against modern web blocking.
Operational Excellence & Monitoring: Implement robust monitoring, alerting, and logging systems to proactively debug, troubleshoot, and continuously improve scraper performance, reliability, and data quality across the platform.
Core Development: Develop, test, and deploy robust, fault-tolerant web scraping components using Python tools (Scrapy, Playwright, Selenium, Requests, etc.).
Integration & Pipelines: Manage and automate high-volume data ingestion pipelines and integrations with internal and external REST APIs.
DevOps & Automation: Drive DevOps best practices, including managing infrastructure with Docker and CI/CD pipelines. Nomad knowledge is a plus.
Collaboration & Mentorship: Partner with other engineers to set standards, enhance core infrastructure tooling, and mentor junior team members.

Requirements

Core Experience: Proven, hands-on professional experience in high-volume web scraping and data extraction using Python.
Anti-Blocking Expertise: Deep, practical knowledge of anti-bot solutions, including CAPTCHA solving, browser fingerprinting, and effective proxy/IP management strategies.
Technical Depth: Solid understanding of HTML parsing, browser automation techniques, and asynchronous programming.
Frameworks: Proficiency with leading web scraping frameworks (e.g., Playwright, Scrapy, or Selenium).
Web Knowledge: Strong knowledge of REST APIs, HTTP protocols, and effective proxy management.
Database Skills: Familiarity with both SQL and NoSQL databases for efficient data storage and processing.
Infrastructure: Experience with Docker, Linux environments, and version control (Git).
Communication: Fluent in English (written and spoken).
Mindset: Self-driven, pragmatic, and capable of taking full ownership of critical, high-impact infrastructure projects.

Nice to Haves (Bonus Points)

Experience with advanced async libraries (e.g., asyncio).
Understanding of data quality validation and pipeline monitoring tools.

What they offer

Impact & Ownership: A high degree of freedom and the opportunity to have a meaningful, measurable impact on a growing scale-up business.
Flexibility: A high degree of flexibility. Our client is a remote-first company and actively supports remote work.
Growth: A competitive compensation package and dedicated support for your personal and professional development (ongoing training and coaching).
Team & Atmosphere: A great work atmosphere within a small, talented, and international team.
Office (Optional): A modern office located on the campus of Wildau Tech University, easily accessible by public transport (just outside Berlin).