Responsibilities
- Design, build, and maintain cloud infrastructure and deployment pipelines supporting web and mobile applications
- Automate infrastructure provisioning, configuration, and operations using Infrastructure as Code and scripting
- Implement and manage CI/CD pipelines to improve deployment speed, consistency, and reliability
- Monitor system health, performance, and availability, and proactively resolve issues
- Strengthen platform security through best practices in access control, secrets management, vulnerability management, and secure deployment
- Support incident response, troubleshooting, root cause analysis, and continuous improvement of system resilience
- Collaborate with engineers, product managers, and stakeholders to enable efficient development workflows and reliable releases
- Review system architecture and operations to improve scalability, maintainability, and cost efficiency
- Maintain operational documentation, runbooks, and technical standards aligned with governance guidelines
- Drive adoption of DevOps, observability, and reliability best practices
- Stay up to date with industry trends, tools, and practices in cloud, DevOps, and platform engineering
Requirements
- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field
- Minimum of 5 years of experience in DevOps, Platform Engineering, Site Reliability Engineering, or infrastructure roles
- At least 3 years of experience leading or guiding technical initiatives or small teams
- Strong hands-on experience with AWS services such as ECS Fargate, Lambda, S3, RDS (including Aurora), IAM, and CloudWatch, as well as networking and security components
- Experience with containerisation technologies such as Docker and container-based deployments
- Strong experience in CI/CD pipelines and release automation using GitLab, with Jira for issue and release tracking
- Proficiency in Infrastructure as Code tools such as Terraform or CloudFormation
- Solid understanding of observability, monitoring, alerting, and logging practices
- Familiarity with modern application stacks (e.g. React, Node.js, React Native) for effective collaboration with development teams
- Experience with database operations (MySQL, PostgreSQL)
- Strong understanding of authentication, APIs, middleware, and system integration
- Good foundation in cloud computing, software design patterns, and secure system design
Nice to Have
- Experience working in Singapore government or regulated environments
- Strong stakeholder management and cross-team collaboration skills
- Knowledge of disaster recovery, backup strategies, and business continuity planning
- Experience with platform governance, compliance, and production support processes