Lead Site Reliability Engineer (M365) at Jobgether

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Lead Site Reliability Engineer (M365) in the United States.

This role provides the opportunity to lead and enhance the reliability, performance, and scalability of a large Microsoft 365 environment supporting a national organization. You will design and implement monitoring and observability dashboards, automate processes with PowerShell and Graph APIs, and optimize workflows with Power Apps/Automate. The position requires hands-on technical expertise combined with leadership skills to guide teams, manage incidents, and drive continuous improvement. You will work with multiple stakeholders, ensuring systems remain secure, performant, and highly available while mentoring team members and shaping best practices for cloud and on-premises M365 services.

Accountabilities

- Lead the development and creation of monitoring and observability dashboards in Splunk, Dynatrace, and other platforms.
- Drive incident management, post-incident reviews, and root cause analysis for improved system reliability.
- Develop and maintain automation scripts using PowerShell and integrations with Graph APIs.
- Optimize and maintain workflows and applications using Power Apps/Automate.
- Guide teams in deploying new services, performing system validations, and managing service performance.
- Coach and mentor team members, design key performance indicators, and implement best practices.
- Ensure compliance with organizational policies and drive continuous improvements in system reliability and SDLC processes.

Requirements

- Bachelor’s degree in a quantitative or technical field (e.g., Computer Science, Engineering, Statistics) or equivalent experience.
- 5–7 years of site reliability engineering experience, focused on Microsoft 365 environments.
- Advanced proficiency in PowerShell scripting and Graph APIs; intermediate skills in Power Apps/Automate.
- Strong experience with monitoring and observability tools such as Splunk and Dynatrace.
- Solid understanding of incident management processes and cloud/enterprise system administration.
- Demonstrated analytical skills, project management abilities, and technical aptitude.
- Excellent judgment, decision-making, and communication skills to effectively guide teams and influence upper management.

Company Location: United States.