OpsHero is a boutique technology firm with an exclusive focus on DevOps services. We thrive on our non-traditional, flat structure, which frees us from the constraints of the corporate world. This keeps us agile and adaptable, so we can swiftly meet the evolving needs of our clients.
At OpsHero, we are at the forefront of innovation, always incorporating cutting-edge technologies into our processes. We believe that by bridging the gap between development and operations, we can accelerate product delivery, enhance efficiency, and ultimately drive our clients’ success.
What you will do
Reliability ownership: Define and measure SLIs/SLOs, manage error budgets, and use them to guide release gates and risk decisions with product and engineering.
Observability, end to end: Instrument services and infra for metrics, logs, and traces; standardize dashboards/alerts; reduce noise and MTTR via runbooks and actionable paging.
Automation & toil reduction: Build tools and guardrails (scripts, controllers, GitOps jobs) to eliminate repetitive ops work and improve stability.
Incident response & postmortems: Lead on-call duties, triage issues, coordinate comms, and run blameless postmortems with clear, tracked actions.
Architecture collaboration: Contribute to resilient designs with product, platform, and engineering teams.
Scope note: SREs collaborate closely with Platform/Infra; you influence infra changes and validate outcomes, while primarily owning reliability at the Service/Kubernetes layer (releases, configs, runtime policies, and telemetry).
What you will need
Kubernetes expertise
Strong understanding of K8s architecture (API server, controllers, scheduler) and CRDs/Operators.
Practical RBAC, admission controls, quotas/limits, and multi-tenant hardening.
Networking: CNI, Ingress/Load Balancers, traffic policies; Autoscalers (HPA/VPA) and Cluster API exposure.
Storage: CSI drivers; experience with cloud block/file (e.g., OpenStack Cinder) a plus.
OpenShift experience is a plus.
Software development is a plus (Go preferred)
We use Go for ops tooling and operators. Experience with client-go, controller-runtime, and Operator SDK; comfort with REST/gRPC and auth patterns; familiarity with event-driven reconciliation.
Infrastructure & DevOps
Solid Linux admin/debugging, containers, and networking fundamentals.
Understanding of distributed systems (etcd basics, leader election) and failure modes.
Nice to have
CKA/CKAD/CKS or other CNCF certifications (KCNA, CCA).
Contributions to open source (operators, controllers, CRDs).
Experience with OpenTelemetry, Prometheus/Alertmanager, Loki, Grafana, and SLO-based alerting.
OpenStack experience.
Sounds awesome! How to Become a DevOps Hero?
Applying for a job at OpsHero is easy and fast. Once you apply and sign your contract, you’re ready to start your new job. As part of our team, you’ll have our support to grow in your career and as a person.
We welcome everyone, no matter their background. Different perspectives make us think differently and come up with new ideas. This belief guides our entire hiring and employment process.