Senior Quality Engineer – Platform Resilience & Scalability | Platform Engineering | UK - Remote
RLDatix (RLD) is on a mission to help raise the standard of care…everywhere. Trusted by over 10,000 healthcare organisations around the world, our solutions help improve health and care. Our applications ensure that patients receive the best and safest care while supporting the providers who deliver it.
Joining TeamRLD means being part of a global effort of over 2,000 team members in making a difference in healthcare…every day.
We’re searching for a UK-based Quality Engineer – Platform Resilience & Scalability to join our Platform Engineering team, so that we can ensure our Internal Developer Platform remains resilient, scalable, and highly available across multiple global regions. The Quality Engineer will design and execute resilience and performance testing strategies to guarantee our platform meets a 99.95% uptime SLA and scales dynamically under demanding conditions.
How You’ll Spend Your Time
Design chaos experiments using tools like Chaos Mesh, Litmus, or AWS Fault Injection Simulator to validate failure scenarios across EKS clusters and regions.
Test auto-recovery mechanisms such as Karpenter autoscaling, pod restarts, and ALB failover to ensure platform resilience.
Analyse performance bottlenecks in Kubernetes clusters, Istio service mesh latency, and GitOps pipeline throughput to optimise system behaviour.
Validate scalability by testing rapid scale-up scenarios and multi-region failover capabilities to support 3,000+ pods per cluster.
Define and monitor SLOs/SLIs for platform services using Honeycomb, CloudWatch, and Prometheus to maintain observability and reliability.
What Kind of Things We’re Most Interested in You Having
Strong experience in Kubernetes production environments (EKS preferred).
Proven success in chaos engineering and resilience testing using major frameworks.
In-depth knowledge of distributed systems failure modes and performance tuning.
Sincere interest in building resilient, scalable platforms that power global healthcare solutions.
A knack for working collaboratively within a fast-paced, cloud-native environment.