Monitor our systems and applications, ensuring that everything is working perfectly.
Escalate more complex incidents to our squads, ensuring that they have all the necessary information.
Be proactive in identifying patterns and suggesting improvements to prevent problems from occurring again.
Work together with the development and operations teams to improve our incident monitoring and response processes.
Analyze and optimize alerts to ensure they are relevant and actionable, avoiding false alarms.
Analyze system logs to identify and solve problems.
Maintain constant and clear communication with everyone involved during incident management, ensuring that everyone knows what is happening.
Participate in postmortems to learn from each situation and continually improve our processes.
Create and update documentation of incident monitoring and management procedures and processes.

Requirements

Competencies
Previous experience in technical support or N1 monitoring.
Knowledge of monitoring tools such as Grafana, Prometheus, AppDynamics, and Dynatrace.
Ability to diagnose and resolve basic technical problems.
Excellent communication skills to interact with different levels of the organization.
Proactivity and initiative to suggest improvements in monitoring processes and systems.
Ability to work well in a team and collaborate with different squads.
Basic knowledge of ITIL or good IT service management practices.

Skill Set
Experience in DevOps environments and adoption of agile methodologies.
Experience with automation of repetitive tasks.
Knowledge of ITIL, DevOps practices, and Site Reliability Engineering (SRE).
Knowledge of Java programming.

Languages
Portuguese