Site Reliability Engineering Team Lead - Guadalajara, México - Finastra

    Finastra
    Description

    Responsibilities

    What will you contribute?
    As a Site Reliability Engineering Lead, your mission is to protect and advance the software & systems behind Finastra's Cloud-hosted services running on Fusion Operate.

    Finastra believes in a blameless culture where the primary objective is continuous improvement.

    You'll lead an operations team that treats operations as a software engineering problem, building reactive, self-healing systems that keep revenue-critical services up & running despite natural disasters, unexpected surges in traffic, and configuration errors.

    Your day will range from fine-grained details, such as optimizing disk performance and authoring operational code for our applications, to the big picture of reliability modelling.

    You will operate as part of a global, scaled-agile SRE team, applying your experience in Continuous Delivery.


    Responsibilities & Deliverables:
    Your deliverables as a Site Reliability Engineering Lead will include, but are not limited to, the following:
    Act as the primary SME for Cloud tooling and mentor colleagues on the SRE team
    Assume leadership and mentorship responsibilities in post-mortem reviews of incidents
    Work with containers and container orchestration systems such as Kubernetes
    Perform capacity planning to determine the resources your service requires to be scalable, efficient, and reliable
    Identify and troubleshoot any availability and performance issues at multiple layers of deployment, from hardware to operating environment, network, and application
    Collaborate with other engineers to implement operational solutions while defining and adhering to industry best practices
    Participate in a weekly on-call rotation

    Required Experience:
    3/5+ years of experience in Cloud Operations
    Prior team leadership or management experience
    Proficiency with Infrastructure as Code technologies such as Terraform, CloudFormation, or ARM
    Experience developing and deploying resources with a cloud provider (e.g., Azure, AWS, Cloudflare, GCP)
    Networking concepts (load balancing, TCP/IP, HTTP, gRPC, DNS) and troubleshooting tools (Wireshark, command line, BPF)
    Experience with version control systems (GitHub, GitLab, Bitbucket)
    Comfortable with scripting languages like Python, Bash and Go
    Familiarity with container technologies like Docker and Kubernetes
    Knowledge of Cloud-native architecture, Cloud tooling and the latest trends and practices
    Appropriate Linux, Kubernetes & Cloud Certifications a plus
