Site Reliability Engineer

Company:

Guidewire Software


Job Details

The Opportunity


We are searching for a Site Reliability Engineer eager for a rare chance to transform insurance with the industry's leading cloud platform. As a member of the SRE-Application team, you'll be responsible for building and evolving our SRE practice for the applications running on our Guidewire Cloud Platform. This is an opportunity to apply your expertise in automation, software engineering, and operational discipline to ensure the reliability, performance, and scalability of our cloud-based solutions.

What You'll Do

Collaborate with development teams to troubleshoot and solve problems, reducing customer impact.
Develop automated runbooks and implement measures to handle issues proactively.
Apply sound engineering principles and mature automation to our operating environments.
Monitor, maintain, and enhance the reliability and performance of applications on our Guidewire Cloud Platform.
Leverage your automation and software engineering expertise to optimize systems and eliminate toil.
Document and analyze incidents to improve processes and prevent recurrence.
Stay up to date with the latest industry trends, tools, and best practices in site reliability engineering.
Contribute to a culture of innovation, learning, and continuous improvement.

What You'll Bring

Proven experience as an SRE or in a similar role, with a track record of improving system reliability
Strong problem-solving skills and the ability to analyze complex systems and devise effective solutions
Excellent collaboration and communication abilities to work cross-functionally and clearly document processes
Experience with automation, monitoring, and performance optimization tools and techniques
Dedication to maximizing uptime, scalability, and delivering an exceptional end-user experience
A passion for technology and a strong desire to continuously learn and grow your skills
Alignment with Guidewire's mission to leverage technology to help protect and support others

Required Skills & Experience

Proven experience leveraging application performance monitoring (APM) and telemetry tools to troubleshoot and diagnose problems
Proven experience triaging and debugging distributed systems on cloud infrastructure
Proven experience in designing and engineering CI/CD pipelines within Kubernetes (K8s) and legacy ecosystems
Proven experience in designing and engineering monitors, dashboards, and synthetic transactions in Datadog
Proven experience in building, deploying, and running scalable infrastructure within AWS and Kubernetes ecosystems using Terraform and other cloud-native approaches
Proven experience in managing infrastructure configuration at scale using multiple approaches and/or tools such as GitOps, Puppet, or Ansible
Good understanding of AWS cloud networking and security with hands-on experience remediating infrastructure vulnerabilities at scale
Good understanding of SLIs, SLOs, and Error Budgets
Comfortable with Linux system administration, with the ability to program/script using Python, Go, Java, shell, or equivalent
Ability to participate in mandatory on-call rotations to ensure service availability and reliability, including responding to incidents and alerts outside regular hours, on weekends, and on holidays. Candidates must be willing and able to fulfill this critical responsibility.

Preferred Skills

SRE certified in multiple categories
AWS certified in multiple categories
Proficiency with SQL, database administration, data pipelines, performance tuning, and schema design
Proficiency with multiple pipelining tools such as TeamCity, Bitbucket Pipelines, Jenkins, and GitHub Actions
Familiarity with distributed data processing frameworks and services such as Hadoop, Apache Spark, and Amazon Redshift


