Site Reliability Engineer

Job Details

POSITION OVERVIEW
At Guidewire, we make software that offers Property and Casualty (P&C) Insurance companies the
tools to take care of their customers when they need it the most, whether that's a time of crisis, a
natural disaster, an accident, or exposure to cyber risks. We build the core applications that
insurance companies use to sell and underwrite policies, settle claims, and bill their customers. We
also have a portfolio of innovative products serving the needs of P&C insurance companies in areas
such as data management, digital online portals, and predictive analytics. We run these products on
the Guidewire Cloud Platform, and we help hundreds of insurance providers all over the world to
handle billions of dollars of business.

We are proud to be voted a Top Cloud Employer on Glassdoor by our own employees and positioned
as a market leader by industry experts like Gartner. We have a fun work environment and a culture
that lives by our core values of integrity, rationality, and collegiality.

We're searching for people who are as passionate about working together to deliver quality products
and support as we are. Join us and enjoy a career where you can make an impact. You'll be inspired
by those around you, and you'll be trusted and empowered to go further.

As a Site Reliability Engineer, you will be part of a team that is passionately automating everything
possible to make Guidewire systems run more efficiently. The Platform team is dedicated full-time
to creating and running software that improves the reliability of systems in production, serving
hundreds of customers and supporting millions of transactions each day. You will be ensuring the
reliability of Guidewire's flagship cloud platform and Insurance Suite products and building tooling to
help ensure efficient operations and optimal availability of all SaaS multi-tenant and
customer-focused systems. Platform SREs collaborate closely with Guidewire's core product developers
to ensure that the Guidewire core cloud products address functional and non-functional requirements
such as availability, performance, observability, and maintainability.

This role requires a high degree of collaboration, teamwork, ownership and responsibility. If you like
to be challenged and have a passion for solving problems at scale with systems like AWS,
Kubernetes and Aurora, then we would love to hear from you. The ideal candidate is someone who
exemplifies the ethic of "If you have to do something more than once, automate it," and who can
rapidly self-educate on new concepts and tools. Bonus points if you have prior experience doing
production support of a SaaS platform and are comfortable working with bleeding-edge, highly
containerized cloud-native environments in AWS.
ESSENTIAL DUTIES AND RESPONSIBILITIES

Take a purist SRE approach to shared multi-tenant infrastructure for resilient SaaS microservice-based containerized systems, in addition to customer-centric application environments.
Oversee and automate the team's growing presence in AWS.
Contribute to core infrastructure systems development with features, bug fixes, reliability improvements, etc.
Platform reliability engineering of a complex single sign-on SAML/OAuth-based central authentication platform.
Creatively build and develop tooling to aid in driving 24x7x365 follow-the-sun operations of critical production systems.
Automate deployment tasks for core product and infrastructure tools and maintain automation infrastructure.
Create system documentation and training materials to empower and educate our fellow team members.
Build and maintain observability tooling, metrics, and dashboarding for a global platform product infrastructure.
Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks and issues.
Enhance platform observability by helping create a self-healing approach to platform reliability.
Collaborate with engineering teams, providing product feedback and, where necessary, contributing code to the product.
Education and Work Experience

Bachelor's Degree in Computer Science or related field
Software engineering and task automation skills with Bash, Python, and/or Go are a must.
Solid understanding of agile software development methodologies (Scrum, Kanban, etc.)
Deep background with Linux systems and engineering
Highly experienced with engineering and automating on Amazon Web Services (AWS)
Experience supporting web applications running on Java / Apache / Tomcat in a live production environment.
Prior experience with IaC tools like Terraform/Terragrunt/Terraspace
Prior experience with DevOps/GitOps tools (Git, Bitbucket, Flux CD, TeamCity) for gate promotions
Production-At-Scale support background in a heavily microservice-based world
Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)
Strong understanding of Single Sign-On, SAML, OAuth (Bonus if hands-on experience with Okta)
Seasoned expertise around X.509 certificate technology and basic concepts of encryption.
Experience working with Relational Databases such as Aurora Postgres and/or Oracle RDS
Advanced exposure to application development, web UI (design and development), JSON, and application architecture
Strong experience using observability tools (logging/APM) like Datadog, CloudWatch, and PagerDuty.
Familiarity with event store/stream-processing technologies like Kafka or AWS SQS
Understanding of Open Application Model systems such as KubeVela or Crossplane.
Personal Qualities and Soft Skills

You greatly prefer writing code to clicking through a GUI.
You enjoy teaching, being a mentor to others, and working across boundaries
Outstanding troubleshooting skills; ability to think critically and display an aptitude for problem solving
Strong analytical mind with a penchant for process development and enhancement
A highly positive can-do attitude with desire for being a team player
Great communication skills and ability to explain complex technical concepts to a varied audience
Demonstrate strong follow-through, a strong work ethic and consistently keep and meet commitments
Ability to champion a culture of reliability within the product team, promoting practices like blameless postmortems, SLO tracking, and continuous learning from incidents.
Other Requirements

Ability to read, write, and speak English
We provide 24x7 support to our customers, so we expect you to take turns with your teammates being on-call for weekend production emergencies or to provide rotating weekend operational support.
Travel – Expect occasional travel (less than 5%) to other Guidewire offices for training and team meetings.
About Guidewire
Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently.
We combine digital, core, analytics, and AI to deliver our platform as a cloud service.
More than 540 insurers in 40 countries, from new ventures to the largest and most complex in the world, run on Guidewire.
As a partner to our customers, we continually evolve to enable their success.
We are proud of our unparalleled implementation track record with 1600+ successful projects, supported by the largest R&D team and partner ecosystem in the industry.
Our Marketplace provides hundreds of applications that accelerate integration, localization, and innovation.
For more information, please visit www.guidewire.com and follow us on Twitter: @Guidewire_PandC.
Guidewire Software Inc. provides equal employment opportunities to all applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
All offers are contingent upon passing a criminal history check and other background checks where applicable to the position.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.
Please contact us to request accommodation.



Nominal Salary: To be agreed
