Job Type: Fully remote
Experience Level: More than 4 years

About the Role:
We are seeking a highly skilled Senior Site Reliability Engineer (SRE) with a strong technical background in managing Kubernetes clusters, particularly with Amazon EKS.
The ideal candidate will have a deep understanding of the Kubernetes ecosystem, proven expertise in AWS, and a successful track record in SRE roles, collaborating closely with engineering teams.
This role requires proficiency in multiple programming languages, experience with Terraform, and the ability to build and manage CI/CD pipelines and microservices.

Key Responsibilities:
Kubernetes Management: Provision and manage Kubernetes clusters, with a preference for Amazon EKS, ensuring high availability, scalability, and reliability.
Helm Charts: Develop and maintain Helm charts to streamline and automate the deployment of applications within Kubernetes.
SRE Collaboration: Engage with engineering teams to enhance the reliability, performance, and scalability of applications through SRE practices.
AWS Expertise: Utilize your extensive experience with AWS (4-5 years) to manage cloud infrastructure, optimize resources, and implement best practices for security and cost management.
Terraform: Leverage your 2-3 years of experience with Terraform to create, manage, and automate cloud infrastructure as code.
Programming Proficiency: Use your strong command of multiple programming languages to automate processes, build tooling, and enhance infrastructure reliability.
CI/CD Pipeline Development: Design, implement, and maintain CI/CD pipelines to ensure smooth and efficient deployment processes.
Microservices Deployment: Manage and optimize the deployment of microservices, ensuring they are scalable and resilient within the Kubernetes ecosystem.
Monitoring & Alerting: Implement robust monitoring and alerting systems within the Kubernetes ecosystem to proactively identify and resolve issues, ensuring the reliability of applications.

Qualifications:
Experience: 3-5 years of experience provisioning and managing Kubernetes clusters, with a preference for EKS.
Kubernetes Expertise: Solid understanding of the container and Kubernetes ecosystem, including best practices and tools.
Helm: Strong experience writing and maintaining Helm charts.
AWS: 4-5 years of AWS experience, with a deep understanding of its services and best practices.
Terraform: 2-3 years of hands-on experience with Terraform for infrastructure as code.
Programming: Proficiency in multiple programming languages, with a focus on automation and tooling development.
CI/CD: Hands-on experience in building and managing CI/CD pipelines.
Microservices: Proven experience in deploying and managing microservices in a production environment.
Monitoring & Alerting: Experience with monitoring and alerting tools within the Kubernetes ecosystem, ensuring high system reliability.

Preferred Skills:
Strong problem-solving abilities and a proactive mindset.
Excellent communication skills, with the ability to collaborate effectively with cross-functional teams.
Experience in a fast-paced, agile environment.