DevOps & SRE
Staff Site Reliability Engineer
An established hedge fund is looking for a senior SRE to join its Singapore office.
Position: Staff Site Reliability Engineer (Bare Metal Experience) - Hedge Fund Domain
Location: Singapore
Company Overview:
Our client is an established hedge fund in Singapore. With a strong track record of success and a dedication to innovation, they are at the forefront of using technology and data-driven insights to drive investment strategies and deliver exceptional returns to their clients.
Job Description:
Our client is seeking a talented Staff Site Reliability Engineer (SRE) with expertise in bare metal infrastructure to join their team. As a Staff SRE, you will play a critical role in designing, building, and maintaining their infrastructure stack, with a focus on the reliability, scalability, and performance of bare metal environments. You will work closely with cross-functional teams to keep the trading systems running smoothly and to support mission-critical operations.
Responsibilities:
- Design, implement, and maintain highly available, fault-tolerant infrastructure for the firm's trading systems, with a focus on bare metal environments.
- Collaborate with engineering teams to define reliability and performance requirements, and implement solutions to meet those requirements.
- Develop and maintain automation tools and frameworks for infrastructure provisioning, configuration management, and monitoring.
- Implement and maintain robust disaster recovery and failover mechanisms to ensure business continuity.
- Lead efforts to identify and address performance bottlenecks, scalability challenges, and other operational issues.
- Define and enforce best practices for infrastructure security, compliance, and operational excellence.
- Mentor and coach junior members of the SRE team and contribute to the overall technical growth of the organization.
Location:
Singapore, Remote
Requirements:
- Bachelor's or advanced degree in Computer Science, Engineering, or a related field.
- Extensive experience as a Site Reliability Engineer or in a similar role within the hedge fund, financial services, or high-frequency trading domain.
- Strong expertise in designing, building, and managing bare metal infrastructure at scale.
- Proficiency in scripting and automation using languages such as Python, Bash, or similar.
- Experience with configuration management tools (e.g., Ansible, Puppet, Chef) and infrastructure as code (IaC) principles.
- Deep understanding of networking concepts, protocols, and technologies.
- Strong problem-solving skills and ability to troubleshoot complex issues in distributed systems.
- Excellent communication and collaboration skills, with the ability to work effectively in a fast-paced, dynamic environment.
Good to have:
- Exposure to HFT or a similar trading background.
- Previous exposure to K8s operators.
- Proficiency in containerisation technologies.
- Understanding of security principles from both operational and implementation perspectives.