Required Qualifications:
Bachelor’s degree or foreign equivalent required from an accredited institution. Will also consider three years of progressive experience in the specialty in lieu of every year of education.
At least 4 years of Information Technology experience.
Preferred Qualifications:
- Experience working in Python-based SRE practices and leading transformational change in SRE operations
- Hands-on experience creating Splunk or New Relic dashboards is preferred; equivalent monitoring and logging tools such as Datadog, Prometheus, Dynatrace, Kibana, or Grafana are also acceptable.
- Respond to issues promptly, perform root cause analysis, and implement preventive measures to avoid similar incidents in the future.
- Partner with application development teams to deliver operational readiness for new products and features.
- Drive improvements in productivity, software quality and reliability.
- Effectively articulate complex problems, concepts, and solutions to varied audiences.
- Coach and mentor team; proactively identify training and growth opportunities.
- Knowledge of defining and monitoring system quality measures, including SLOs and SLAs
- Hands-on experience collecting and analyzing performance data, troubleshooting, and tuning
- Excellent critical thinking, communication, presentation, documentation, troubleshooting and collaborative problem-solving skills.
- Provide advanced Incident Management and Problem Management support to teams to effectively identify, remediate, and resolve issues related to platform reliability, stability, and performance through careful analysis of telemetry data and system logs.
- Document all changes in line with controls, procedures, and documentation standards, and raise issues and concerns with recommendations for follow-up action.