Site Reliability Engineer at SS&C Technologies Canada Corp.

Location Information: Canada.

This description is a summary of our understanding of the job description.

Role Description

Be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. As part of the Global Investor and Distribution Solutions (GIDS) Platform Services team, you’ll play a key role in keeping our systems running smoothly and efficiently, while helping shape the future of our platform.

- Collaborate with global teams as part of a follow-the-sun support model.
- Respond to, troubleshoot, and resolve Level 2 application incidents.
- Ensure critical applications are effectively monitored using tools like Prometheus and Grafana.
- Create and maintain dashboards and alerts to enhance visibility into application health.
- Define, implement, and track key SRE metrics (SLOs, SLIs, error budgets).
- Partner with development teams to improve application reliability and resilience.
- Analyze incident trends and recommend improvements to reduce recurrence.
- Automate repetitive support tasks to improve efficiency.
- Participate in post-incident reviews and drive reliability initiatives.
- Perform infrastructure and application patching as part of regular maintenance cycles.
- Support security vulnerability remediation efforts across both infrastructure and application layers.

Qualifications

- Bachelor’s degree in Computer Science, Computer Engineering, IT, or a related field.
- 5+ years of experience for senior roles; fresh graduates welcome for junior roles.
- Proficiency in one or more programming languages, preferably Java, JavaScript, or Python.
- Proven ability to troubleshoot complex systems.
- Skilled in debugging, code optimization, and automation.
- Experience with relational databases and data analysis.
- Experience working in Site Reliability Engineer (SRE) roles or incident response environments.
- Hands-on experience with cloud infrastructure, preferably AWS.
- Familiarity with observability tools such as Grafana, ELK Stack, or similar.
- Experience deploying and managing applications on Kubernetes platforms.
- Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.
- Familiarity with PostgreSQL and its performance tuning, monitoring, and troubleshooting.

Benefits

- Flexibility: Hybrid Work Model & a Business Casual Dress Code, including jeans.
- Your Future: RRSP Matching Program, Professional Development Reimbursement.
- Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays.
- Your Wellbeing: Medical, Dental, Vision, Employee Assistance Program, Parental Leave.
- Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity.
- Training: Hands-On, Team-Customized, including SS&C Learning Institute.
- Extra Perks: Discounts on fitness clubs, travel and more!
- Wide-Ranging Perspectives: Committed to Celebrating the Variety of Backgrounds, Talents and Experiences of Our Employees.