Responsibilities
- Design, build, and enhance cloud infrastructure and deployment pipelines supporting web and mobile applications
- Drive infrastructure automation, configuration management, and operational improvements using Infrastructure as Code (IaC) and scripting
- Design, implement, and optimise CI/CD pipelines to improve deployment speed, reliability, and consistency
- Set up and manage observability, monitoring, logging, and alerting systems to proactively identify and resolve issues
- Implement and maintain platform security practices including access control, secrets management, vulnerability management, and secure deployments
- Lead incident response, troubleshooting, root cause analysis, and post-incident improvements to strengthen system resilience
- Collaborate closely with software engineers, product managers, and stakeholders to support reliable and efficient software delivery
- Improve system scalability, maintainability, operational processes, and cost efficiency
- Develop and maintain operational documentation, runbooks, and technical standards
- Promote DevOps, SRE, observability, and platform engineering best practices across the team
- Mentor junior engineers and contribute to technical capability building
- Continuously evaluate and adopt relevant cloud, DevOps, and platform engineering technologies and practices
Requirements
- Bachelor's degree or higher in Computer Science, Information Systems, Engineering, or a related field
- 7+ years of experience in DevOps, Platform Engineering, Site Reliability Engineering (SRE), or Infrastructure Engineering
- 4+ years of experience leading technical initiatives, projects, or engineering teams
- Proven ability to work independently and drive initiatives end-to-end
- Strong hands-on experience with AWS services including ECS Fargate, Lambda, S3, RDS (including Aurora), IAM, CloudWatch, and networking/security components
- Strong experience with containerisation technologies such as Docker and production-scale container deployments
- Experience designing and maintaining CI/CD pipelines using GitLab and related DevOps tooling
- Strong Infrastructure as Code (IaC) experience using Terraform and/or CloudFormation
- Solid understanding of observability, monitoring, logging, alerting, and system reliability principles
- Familiarity with modern application stacks such as React, Node.js, and React Native
- Experience managing and supporting MySQL and PostgreSQL databases
- Good understanding of authentication, middleware, API integrations, application security, and distributed systems
- Strong foundation in cloud computing, secure system design, and software engineering principles
Problem Solving & Leadership
- Strong analytical and problem-solving skills with the ability to independently drive solutions
- Comfortable operating in fast-paced and ambiguous environments
- Ability to break down complex technical challenges into actionable solutions
- Experience mentoring and guiding junior engineers