Associate Director, Software Engineering Specialist (AI Platforms SRE)
Guangzhou, GD, CN, 510620
Job description
Some careers have more impact than others.
If you’re looking for a career where you can make a real impression, join HSBC and discover how valued you’ll be.
We are currently seeking an experienced professional to join our team in the role of Associate Director, Software Engineering Specialist (AI Platforms SRE).
Business: AI Platforms
Job ID: 46882
Principal responsibilities
- Lead major incident troubleshooting and root cause analysis, driving fast recovery and durable fixes.
- Architect and evolve scalable, highly available, secure infrastructure using cloud and container platforms (AWS/GCP/Azure, Kubernetes, Docker).
- Embed and mature SRE disciplines: define/measure SLIs/SLOs, manage error budgets, and automate operations to reduce toil.
- Build and operate observability (monitoring, logging, alerting) with tools such as Prometheus, Grafana, ELK, Datadog, Splunk.
- Advance CI/CD, deployment automation, infrastructure as code (Terraform, Ansible, Helm), and configuration management.
- Mentor engineers and junior SREs, promoting reliability, learning, and knowledge sharing.
- Partner with engineering, QA, product, and operations to bake reliability, scalability, and security into the SDLC.
- Lead on-call practices, improve incident response, and run blameless post-mortems.
- Prioritise and deliver large-scale engineering initiatives that improve reliability and operational efficiency.
- Maintain high-quality documentation, runbooks, and knowledge bases.
Requirements
- Degree in Computer Science/Engineering (or equivalent practical experience).
- 10+ years in IT, with strong experience in SRE/DevOps/Production Support or similar roles.
- Deep hands-on expertise in Kubernetes/Docker, cloud platforms, and orchestration.
- Strong Linux, networking, and security fundamentals.
- Proven experience with monitoring, logging, and observability platforms.
- Proficiency in at least one language (e.g., Python, Bash, Go).
- Demonstrated delivery of automation and optimisation at scale.
- Experience implementing and scaling SRE principles across teams.
- Strong analytical, communication, and coaching skills.
- Track record supporting high-availability / mission-critical / 24x7 environments.
- Preferably with strong IaC expertise (Terraform, Ansible, or similar), relevant certifications (GCP Professional SRE, AWS DevOps Engineer, CKA/CKAD), experience with microservices/distributed high-throughput systems, and AIOps exposure.
- Demonstrates an AI-native mindset by applying AI-driven approaches, including coding assistants, to improve productivity, quality, and engineering best practices.
You’ll achieve more when you join HSBC.
HSBC is an equal opportunity employer committed to building a culture where all employees are valued and respected, and where opinions count. We take pride in providing a workplace that fosters continuous professional development, flexible working, and opportunities to grow within an inclusive and diverse environment. We encourage applications from all suitably qualified persons irrespective of, but not limited to, their gender or genetic information, sexual orientation, ethnicity, religion, social status, medical care leave requirements, political affiliation, disability, color, national origin, veteran status, etc. We consider all applications based on merit and suitability to the role.
Personal data held by the Bank relating to employment applications will be used in accordance with our Privacy Statement, which is available on our website.
***Issued By HSBC Software Development (GuangDong) Limited***