Associate Production Support Engineer at Updater

Location Information: USA

About This Role

Are you passionate about delivering delight through operational excellence and solving complex technical puzzles? Do you thrive in fast-paced environments where your work directly impacts millions of Americans during one of life's most stressful experiences: moving?

Join Updater's Production Support team as we revolutionize how technology enables seamless moving experiences. As an Associate Production Support Engineer, you'll be the guardian of our revenue-generating systems while evolving into a reliability engineering partner who prevents incidents rather than just responding to them.

You'll work alongside a dedicated team that takes ownership of critical production systems, embraces a growth mindset in evolving toward Site Reliability Engineering practices, and believes in doing it right through operational excellence and systematic improvement.

The #1 Challenge You'll Solve

Your primary mission over the next 12 months: Directly contribute to the operational capabilities of our Production Support team while mastering the foundational elements that will enable your evolution into a reliability engineering role.

You'll support our internal transition from a reactive incident response culture into a proactive reliability engineering practice that scales with Updater's growth.

What You'll Do

Core Responsibilities (70% of time during first 12 months)

Operational Excellence & Revenue Protection:
Monitor critical production systems that directly generate company revenue using DataDog dashboards and synthetic tests.
Respond to incidents with speed and precision, following established escalation procedures to minimize business impact.
Manage escalations from internal and external partner call center agents.
Partner with Updater engineering, support teams, and our providers to resolve escalated issues.
Participate in 24x7 on-call rotation, ensuring someone is always watching our systems.
Triage and resolve production issues through JIRA workflows, maintaining clear communication with stakeholders.

Incident Management & Communication:
Lead incident response for production outages, coordinating across teams to restore service quickly.
Document incidents thoroughly and participate in blameless postmortem processes that focus on system improvement.
Communicate effectively with internal teams, external partners, and leadership during high-stress situations.
Build relationships across the organization, assuming positive intentions and celebrating team successes.

Growth & Evolution Responsibilities (30% of time during first 12 months)

Site Reliability Engineering Development:
Learn to implement Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services.
Develop basic automation scripts to reduce manual operational tasks (Python, Bash, AWS CLI).
Gain familiarity with Infrastructure as Code concepts using Terraform.
Collaborate with the DevOps team to understand CI/CD pipelines and deployment automation.
Contribute to developer self-service initiatives that reduce operational dependencies.

Platform Engineering Integration:
Support the Platform Engineering mission by identifying opportunities to simplify and standardize operational processes.
Help develop documentation and "golden path" procedures that enable teams to operate more independently.
Participate in cross-team collaboration to advance platform maturity and developer experience.
Learn observability best practices including custom metrics, alerting, and Real User Monitoring (RUM).

About You

Required Experience & Skills

Technical Foundation:
2+ years of experience troubleshooting production systems, networks, or applications.
Understanding of fundamental programming concepts and basic scripting abilities.
Experience with SQL and relational databases for data analysis and troubleshooting.
Familiarity with API testing tools (Postman) and web service troubleshooting.
Basic understanding of Git concepts and collaborative development workflows.
AWS Certified Cloud Practitioner level knowledge or equivalent cloud platform experience.

Operational Excellence:
Proven ability to work effectively under pressure during system outages or critical incidents.
Experience with ticketing systems and structured escalation procedures.
Strong problem-solving skills with ability to analyze complex technical issues.
Excellent time management and ability to prioritize multiple concurrent issues.

Communication & Collaboration:
Professional communication style with both technical and non-technical stakeholders.
Experience providing status updates and technical explanations during incident response.
Ability to write clear documentation and incident reports.
Comfort participating in bridge calls and leading technical discussions.

Preferred Qualifications

Emerging SRE Skills:
Basic familiarity with Infrastructure as Code concepts (Terraform).
Experience with monitoring and observability tools (DataDog, Prometheus, Grafana).
Understanding of containerization and Kubernetes fundamentals.
Knowledge of CI/CD pipeline concepts and deployment automation.
Experience with configuration management or automation tools.

Advanced Operational Experience:
Previous experience in customer-facing technical support roles.
Background in system administration or DevOps practices.
Experience with incident management frameworks (ITIL, SRE practices).
Understanding of SLA/SLO concepts and reliability engineering principles.

Education & Experience

Bachelor's degree in Computer Science, Information Technology, Engineering, or related technical field, OR equivalent work experience (additional 2+ years of hands-on technical experience in lieu of degree).
2+ years of experience in technical support, system administration, DevOps, or related operational roles.
Demonstrated ability to learn new technologies quickly and adapt to changing technical environments.

Compensation:
This posting is anticipated to remain open until August 17, 2025. The new hire base salary range is $70,000-$95,000. Factors which may affect the starting pay within this range include skills, experience, and other qualifications aligned with Updater's internal leveling guidelines.