Site Reliability Engineer
Cape Town, Western Cape, South Africa
1d ago
source : jobomas

Site Reliability Engineer

Are you craving the challenge of building complex systems? Really smart systems where performance and speed are essential, without sacrificing the working environment? If this appeals to you, join a purpose-driven, fast-growing enterprise software company that is working to transform public safety.

The power to do remarkable things when it matters most is the heart of public safety. At The company, we believe that regardless of size, geography or budget, everyone in public safety should have access to the data they need when it matters most to save more lives.

That's why, since 2016, our mission has been to reduce emergency response times and improve public safety. We are the industry's only truly open and integrated emergency response platform with a portfolio of web-based cloud solutions that includes analytics, mapping, dispatch and first responder applications.

Job Description

Site Reliability Engineering is an engineering discipline devoted to helping an organization sustainably achieve the appropriate level of reliability in its systems, services, and products.

The Company SRE team plays a crucial role in our mission to reduce emergency response times and improve public safety. The Company is looking for a Site Reliability Engineer to join a team responsible for monitoring our production systems 24/7.

We are hiring for our US-based overnight shifts, with weekend flexibility. Your primary responsibility will be to provide support when there is an incident and to manage communications and escalations around incidents.

You will monitor our entire platform infrastructure and applications. You must be comfortable performing well under pressure, working to tight deadlines, and communicating with larger audiences.

Your other responsibilities will be to build monitoring and alerting tools around the availability, performance, and overall health of our services with scalability and automation in mind.

Responsibilities:
Work with DevOps and DBA teams to support Cloud infrastructure.
Work with the Analytics team to support Eclipse Analytics.
Work with Platform and other Development teams to support Nimbus / Radius front-end applications and back-end services.
Work with the IoT team to support IoT devices.
Work with the Customer Support team to provide technical support for customer-reported issues.
Work with QA and Implementation teams to provide insight on application and infrastructure performance with future releases.
Be in a scheduled rotation for on-call duties, which include receiving alerts from monitoring systems as well as internal escalations.
Build and improve monitors and alerts to increase visibility of system health.
Build tools or automation that can improve SRE role efficiencies or increase monitoring capabilities.
Troubleshoot technical issues with infrastructure and applications.
Operate as Incident Commander when incidents are created.
Escalate to other teams, be a central communication channel across teams, and make detailed timeline entries of actions taken during an incident.
Produce Root Cause Analysis reports for customers.
Write post-mortems for incidents and review them with internal teams.

Skills / Experience
Bachelor's degree in Computer Science, Management Information Systems, or an equivalent field, with 1-2 years' experience as a Site Reliability Engineer
Experience with Cloud services, preferably Azure, particularly Application Insights, Logging, and Monitoring
Experience as a reliability engineer, DevOps engineer, or software engineer
Familiarity with distributed systems and microservices
Understanding of front-end and back-end architecture
Experience with SQL databases
Experience with Datadog or other monitoring and logging tools
Programming / scripting skills in a major language such as .NET or PowerShell
Experience with deployment tools such as Terraform, Ansible, or Puppet
Experience with Kubernetes
Strong communication skills

Behavioural competencies required
Work well under pressure
Good communication skills, written and verbal
A good problem solver
Have an inquisitive nature

About The company Inc.
Fast-growing, passionate, mission-driven team: we care about saving lives through technology!
We are people-centric and ensure an environment where employees are encouraged to grow and learn every day
Offices in Austin, TX and Cape Town, South Africa

If you don't receive feedback from us within two (2) weeks of submitting your CV, please consider your application unsuccessful.
