At DIGITALL, we don't just deliver technology; we deliver the future! We are explorers, knowledge-hunters, tech geeks, problem solvers and game changers who want to inspire and be inspired. Our DIGITALL people are always one step ahead: working with top-notch technologies, creating innovations ahead of market trends, and sharing the passion for discovering better ways. As a human-centric organization, our teams are built on mutual respect and open communication, allowing everyone to be authentic, express ideas and unleash their potential. We are proud of our DIGITALL bright minds and never stop developing their skills to keep pushing boundaries together and do what we love. DIGITALL operates globally with a team of 1300 experts in 16 locations across 8 countries.
You will be part of a team that plays a fundamental role in managing and maintaining the organization’s cloud computing strategy. The team is responsible for overseeing the daily operations and maintenance of cloud applications, ensuring their availability, performance, reliability, and security. This involves monitoring system health, handling software upgrades and deployments, identifying and troubleshooting issues, and ensuring optimal resource allocation. The team also works closely with other teams to troubleshoot complex system issues, implement necessary updates, and ensure compliance with industry best practices and regulations. Furthermore, the team is crucial in disaster recovery planning and execution, as well as in creating guidelines and procedures for cloud operations.
The team helps the respective environment run better by providing deep technical coverage for Incident Management and applying SRE principles. We share a Live Site First culture and care for the business continuity of our customers running mission-critical applications on top of the Cloud Platform.
This is your job:
Automation of internal and production environments using Jenkins as a Service (JaaS), Python, Bash, and Shell scripts.
Service onboarding to new Data Centers (DCs).
Service deployment and release management through JaaS, Concourse, Terraform, and related tools.
Operations & Ongoing Maintenance, including:
Major change management
Autoscaling
Database scaling
Subscription updates
Monitoring & Alerting using Dynatrace, Grafana, and related platforms.
Service Resilience Testing via automation frameworks (Multi-AZ setups, Chaos Engineering/Chaos Days).
Regional Support for EU-access and MENA-access environments.
User Access Management, including CAM profile management.
Your qualifications:
Bachelor's degree in Computer Science, Information Technology, or a related field.
Significant prior experience in a DevOps, SRE, or System Administration role.
Hands-on experience with AWS, GCP, or Azure services and architecture.
Strong knowledge of Jenkins, Travis CI, Puppet, Chef, GitHub, and related tools.
Proficiency in Python, Go, Bash, Shell, Groovy, or Selenium.
Experience with Grafana, Dynatrace, performance tuning, monitoring, and system-level debugging.
Knowledge of automated response mechanisms for incidents, warnings, and KPIs is a plus.
Familiarity with Docker, Kubernetes, Cloud Foundry, ECS, or OpenShift.
Expertise with Terraform, CloudFormation, or Ansible.
Knowledge of Agile/Scrum processes is a plus.
Organizational information:
All applications will be treated in strict confidence.
Please note that only shortlisted candidates will be invited to an interview.