Intellias Bulgaria

Site Reliability Engineering Expert (Lead)

The job listing is published in the following categories

  • Anywhere

    Tech Stack / Requirements

    We are looking for a Site Reliability Engineering Expert to drive reliability, scalability, and performance across our systems and services. This role is ideal for someone with deep technical expertise in SRE principles, cloud-native infrastructure, and incident management. You will act as a strategic advisor and hands-on contributor, helping teams build resilient systems and improve operational excellence.

    What project we have for you

    Our customer is a multinational corporation with more than a century of history and offices in over 180 countries. Their most ambitious current goal is to introduce a range of Reduced-Risk Products (RRPs). The target audience is more than 1 billion consumers around the globe. The IT platform hosts 700+ applications.

    Intellias' mission is to help the client engineer a comprehensive software ecosystem for a game-changing IoT product at the intersection of innovative consumer experience and cutting-edge technology. Our teams are involved in engineering core platform components for best-in-class eCommerce, Digital Marketing, and IoT solutions. As a Cloud engineer, you will become part of the Core Architecture Team and be responsible for the architecture and the implementation of best practices in our Digital Engineering Enterprise Platform.

    The Platform is a set of services and internet applications that accelerate the development and delivery of software applications by taking care of common SDLC challenges. It gives engineering teams access to a set of services, technologies, and practices for developing and operating their applications, and helps ensure compliance and adherence to best practices.

    The project has been in production for 2+ years and is supported by multiple teams.

    Our technical domains are:

    • AWS cloud, partially Azure
    • SSO, Organizations, service control policies, access models
    • IaC: Terraform Enterprise, Terratest, Chalice
    • Serverless: Lambda, Step Functions, a wide range of miscellaneous automations, Fargate
    • System, application, network, and security architectures
    • Orchestration: Kubernetes (EKS)
    • SRE activities (logging, tracing, monitoring), OpsGenie, Splunk
    • HashiCorp Vault
    • Hybrid Networking

    What you will do

    • Define and promote SRE best practices across engineering teams
    • Lead reliability initiatives, including SLIs/SLOs definition and tracking
    • Design and implement scalable, fault-tolerant systems
    • Collaborate on incident response, root cause analysis, and postmortem processes
    • Improve observability and alerting strategies for proactive issue detection
    • Mentor teams on reliability engineering principles and tooling
    • Drive automation to reduce toil and improve operational efficiency
    • Contribute to architectural decisions with a focus on reliability and performance
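
For readers less familiar with the SLI/SLO tracking mentioned in the list above, here is a minimal sketch of how an availability SLI and the remaining error budget are commonly computed. The function names, the request-based SLI, and the sample numbers are illustrative assumptions and are not taken from the project described in this listing.

```python
def availability_sli(total_requests: int, failed_requests: int) -> float:
    """Fraction of requests that succeeded (a request-based availability SLI)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return (total_requests - failed_requests) / total_requests


def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget still unspent; negative means overspent.

    slo_target is the SLO expressed as a fraction, e.g. 0.999 for 99.9%.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    print(availability_sli(1_000_000, 250))
    print(error_budget_remaining(0.999, 1_000_000, 250))
```

With these hypothetical numbers, the SLO tolerates about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget unspent. Teams often use that remaining fraction to decide whether to keep shipping changes or to prioritize reliability work.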

    What you need for this

    • 5+ years of experience in Site Reliability Engineering or related roles
    • Strong knowledge of SRE principles: SLIs, SLOs, error budgets, incident response
    • Experience with cloud platforms (AWS, GCP, Azure) and Kubernetes
    • Proficiency in observability tools (New Relic, Prometheus, Grafana, ELK, etc.)
    • Solid understanding of CI/CD pipelines, automation, and infrastructure-as-code (Terraform, Helm, ArgoCD)
    • Strong programming/scripting skills (Python, Go, Bash)
    • Experience with incident management, postmortems, and reliability reviews
    • Excellent communication and mentoring skills
    • English: Upper-Intermediate

    Nice to have:

    • Certifications in SRE, cloud, or DevOps domains
    • Experience with chaos engineering and resilience testing
    • Familiarity with ITIL or other service management frameworks

    What it’s like to work at Intellias

    At Intellias, where technology takes center stage, people always come before processes. By creating a comfortable atmosphere in our team, we empower individuals to unlock their true potential and achieve extraordinary results. That’s why we offer a range of benefits that support your well-being and charge your professional growth.

    We are committed to fostering equity, diversity, and inclusion as an equal opportunity employer. All applicants will be considered for employment without discrimination based on race, color, religion, age, gender, nationality, disability, sexual orientation, gender identity or expression, veteran status, or any other characteristic protected by applicable law.

    We welcome and celebrate the uniqueness of every individual. Join Intellias for a career where your perspectives and contributions are vital to our shared success.