Site Reliability Engineer

Duffel, London
30+ days ago
Job type
  • Full-time
Job description

Create the future of travel with us

Whether it’s visiting the people closest to us, starting an exciting adventure, or taking a career-defining business trip, travel is an essential part of our lives. Yet we've all experienced the aches and pains of getting to our destination. Today, more than 4 billion airline passengers rely on technology that hasn't kept up with the expectations of the modern connected traveller.

That’s why we’ve started to rebuild the infrastructure that underpins the travel industry. We’re on a mission to unravel travel, simplifying systems and building the tools that will make the future of travel effortless.

We were part of Y Combinator's S18 cohort and we are backed by Benchmark, Blossom, Index Ventures and Kima Ventures, a fantastic set of investors that has helped build some of the world's largest companies.

Our team in London is growing and we’re looking for talented people to join us on our journey.

Engineering at Duffel

We're building tools to simplify travel distribution, search and booking. What does this actually mean? It's one common and seamless API. This brings huge technical challenges as we need to design and build a beautiful API before integrating with hundreds of airlines. Along with that we need to navigate through the differing needs and systems of each airline whilst building a fantastic developer experience to go with it.

The tools used on the team include Elixir, Phoenix, Kubernetes and Google Cloud Platform.

Site Reliability Engineering at Duffel

As an SRE at Duffel, you’ll be part of a small team within engineering that is responsible for the reliability, performance, and resilience of our infrastructure and applications. You will be working closely with engineering teams to understand their needs and help meet the demands of our product as we scale globally.

What we're looking for

  • An infrastructure and systems engineering generalist who is comfortable diving deep into the weeds on different issues.
    Some recent examples include:
      • A configuration issue between Google’s Load Balancer and the HTTP server in our main Elixir application causing HTTP 5XX responses to be returned to our customers.
      • An issue in our OpenTelemetry pipelines causing us to silently drop spans.
  • An enthusiasm for both software development and systems engineering.
  • A high bar for code and configuration quality and readability.
  • A good understanding of current observability and reliability practices.
  • Experienced and comfortable in running incident response.
  • Big picture thinking: you can make trade-offs on technical work streams against business impact.
  • Fantastic communication skills. You're able to articulate what you're working on and why to the team in a clear and structured way.
  • You thrive in a collaborative environment. You believe in your own methods but keep an open mind, taking suggestions and feedback on board as well.

Technologies

Don’t worry if your experience doesn’t exactly align with this stack; we understand that skills are transferable. This is to give you an idea of what you’ll be working with if you join the team.

  • We run our infrastructure on Google Cloud Platform, so you’ll be helping to run a few of their products such as GKE, CloudSQL for PostgreSQL, BigQuery, Memorystore (Redis) and more.
  • We manage the infrastructure and security for a segregated PCI Cardholder Data Environment, entirely managed with Google Cloud Platform services and tooling.
  • We follow an Infrastructure as Code approach to managing our infrastructure, using Terraform.
  • We follow a GitOps approach to managing our Kubernetes configuration, using ArgoCD and Helm.
  • We manage a high-availability metrics collection system using Grafana, Thanos & Prometheus. We’re in the process of transitioning to OpenTelemetry and Honeycomb for our application telemetry (traces and metrics).
  • We manage a data pipeline using Pub/Sub, Airbyte, and dbt.

Our Current Focus

We’re currently driving a big shift in how we think about and monitor reliability across the engineering organisation, with a focus on early detection of customer-impacting issues. We’re extending and standardising our use of OpenTelemetry, and introducing Honeycomb as the single place for engineers to understand how our applications are operating in production. This project involves both technical work, on the application libraries and infrastructure that make up the OpenTelemetry pipeline, and an education piece, working to change perceptions and behaviours across engineering.

The Future

  • We currently run all our services from a single European region in Google Cloud. In the medium term, for performance, reliability, and data residency reasons, we’ll be starting to think about how to (re)architect our applications and infrastructure to span multiple regions, operating globally.
  • We deploy our application multiple times a day, but deploys are all or nothing, and when we encounter issues, rollbacks are slow. One way to address this would be to invest in CI/CD performance improvements, but we’d also like to explore alternative deployment strategies like Canaries, Blue/Green, and traffic mirroring, and get more comfortable testing changes in production with real customer traffic.

What you can expect from us

We're dedicated to your personal growth. Our environment is comfortable, both physically and in that our ears are always open to any ideas, concerns and questions. We believe that everyone should have pride in their work, taking full ownership of it and its impact. That's why everyone who joins Duffel owns a share of the company.

  • We are an equal opportunities employer. We believe that the key to our success is employing a diverse team; that's why recruitment decisions are based only on your experience and skills. We value your ability to problem solve and build amazing things, so we welcome applications from everyone, regardless of age, sex, disability, sexual orientation, race, religion or belief.