Monitoring & Support Engineer at TO.SCALE

Source: https://jobs.workable.com/view/tP8znQyMH3usZ4v3nFDaYT/remote-monitoring-%26-support-engineer-in-portugal-at-to.scale

At To.Scale, we help forward-thinking tech teams grow with the right people, fast. We’re currently hiring on behalf of a global enterprise client for a Monitoring and Support Engineer role. If you get a kick out of keeping systems stable, alerts meaningful and uptime high, keep reading.

We’re looking for a Monitoring and Support Engineer to help keep critical systems running smoothly, round the clock.

This isn’t your typical low-level support role. You won’t just be clearing tickets and forwarding alerts, but rather proactively watching over large-scale systems, catching issues before they escalate and helping coordinate fast, effective incident response.

You’ll work closely with senior engineers and SREs to fine-tune monitoring, improve alert logic and make sure noise stays low and signal stays high. It’s a hands-on, technical role for someone who thrives in high-availability, shift-based environments.

What you’ll actually be doing

- Monitor the health of production systems using tools like Datadog, CloudWatch, and PagerDuty
- Triage alerts and escalate real incidents to engineering or L3 support
- Set up and tune dashboards, monitors, and alert thresholds to improve accuracy and reduce false positives
- Participate in incident management calls and contribute to structured root cause analysis
- Maintain and improve runbooks and incident handling documentation
- Support integration of new systems into the monitoring ecosystem
- Collaborate with global teams to ensure consistent observability and uptime

In standard job ad lingo, you’ve probably got 2–4 years of experience in IT operations, system monitoring, or support roles. But at To.Scale, we care more about what you can actually do than how long you’ve done it.

Here’s what we’re really looking for:

- You’ve worked in 24x7 production environments and are comfortable handling alert floods without panicking
- You’ve used Datadog (or something similar) to monitor systems, configure alerts, and debug incidents
- You understand how alerts, logs, and metrics work together to show what’s really happening
- You know how to escalate issues with the right context, not just forward them blindly
- You’re confident in working shifts and keeping calm during the occasional 3 a.m. fire drill
- You’ve dabbled in scripting (Shell, Bash, or Python) to automate recurring tasks, or want to learn more
- You take pride in being the first line of defense for system stability, and you genuinely enjoy keeping things running

Company Location: Portugal