HR agency Elevate

Senior Site Reliability Engineer

The job listing is published in the following categories

  • Anywhere

    We are looking for experienced Senior Site Reliability Engineers (SREs) to join our team and help maintain and enhance the reliability, scalability, and performance of our cloud-based systems. Our platform processes vast amounts of data in real time and operates 24/7 with high availability, requiring expertise in automation, monitoring, and incident resolution.

    This role requires on-site presence at our office 4 days a week to support effective collaboration and teamwork.

    Responsibilities:

    • Design, implement, and maintain highly available, fault-tolerant cloud infrastructure with an Infrastructure-as-Code (IaC) approach.
    • Develop and optimize automated CI/CD pipelines following the GitOps methodology.
    • Improve service scalability and engineering productivity through automation.
    • Monitor and maintain production systems, proactively identifying and resolving performance bottlenecks.
    • Implement security and compliance best practices.
    • Develop and maintain observability solutions, ensuring comprehensive monitoring, alerting, and logging across distributed systems.
    • Participate in an on-call rotation, incident resolution, and root cause analysis to enhance system resilience.
    • Plan and execute disaster recovery and system capacity scaling strategies.
    • Collaborate closely with development and architecture teams to drive performance improvements and optimize infrastructure.

    Requirements:

    • 4+ years of experience as an SRE, Systems Engineer, or DevOps Engineer supporting large-scale, high-availability systems.
    • Strong Linux administration skills and knowledge of networking fundamentals (TCP/IP, DNS, routing).
    • Hands-on experience with public cloud providers (AWS, GCP, or Azure) and container orchestration using Kubernetes & Docker.
    • Proven expertise in Infrastructure-as-Code tools (Terraform, Ansible, ArgoCD, or Helm).
    • Proficiency in automation and scripting using Python, Go, or Bash.
    • Experience working with distributed systems and databases such as Kafka, Cassandra, ClickHouse, PostgreSQL, MySQL, MongoDB, or VictoriaMetrics.
    • Familiarity with CI/CD tools such as GitLab CI/CD or Spinnaker, and experience deploying high-availability applications.
    • Strong knowledge of monitoring and logging systems like Prometheus, Grafana, ELK Stack, Zabbix, or CloudWatch.
    • Effective communication and problem-solving skills, with the ability to work in a globally distributed team.
    • Fluent English (written & spoken).

    Nice to Have:

    • Experience with high-load distributed systems and microservices.
    • Knowledge of VoIP solutions, contact center technologies, or SaaS monitoring practices.
    • Experience with JVM tuning, Nginx administration, and high-availability configurations (HAProxy, Keepalived).
    • Familiarity with ITIL or other IT service management frameworks.

    What We Offer:

    • A well-coordinated, professional team working on cutting-edge technologies.
    • Interesting and challenging tasks in a dynamic environment with opportunities for professional growth.
    • Additional Health and Life Insurance Package.
    • Employee Assistance Program.
    • 25 vacation days.
    • 200 BGN Digital Food Vouchers.
    • Working Expenses Allowance of 120 BGN gross, paid as part of the salary.