Talent.com
Site Reliability Engineer

Thredd, London, England, United Kingdom
30+ days ago
Job type
  • Full-time
Job description

Are you passionate about building reliable, scalable, and high-performing systems? Do you thrive on solving complex infrastructure challenges while driving automation and observability best practices? If so, we want to hear from you!

At Thredd, we’re looking for a Site Reliability Engineer to act as a North Star for this evolving discipline. As our first engineer in this role, you’ll have the unique opportunity to shape our SRE strategy, establish best practices, and set the standard for service reliability and performance.

What You’ll Do

Define strategies for Application Performance Monitoring, Unit Cost, and Chaos Engineering.

Continuously optimize production environments to enhance reliability and efficiency.

Implement and apply MTTR, SLO, and SLI principles to ensure high service standards.

Respond to incidents, analyze root causes, and drive long-term improvements.

Maintain fault-tolerant, scalable, and cost-effective infrastructures and services.

Monitor availability, latency, and system health to keep our platform running smoothly.

Lead blameless postmortems and refine our incident response processes.

Provide feedback loops to development teams on operational gaps and resiliency concerns.

Support services before they go live with system design consulting, capacity planning, and launch reviews.

Scale systems sustainably through automation and infrastructure evolution.

Deeply understand our customers’ needs and the critical role Thredd plays in their businesses.

What You’ll Be Working On

Building and maintaining the infrastructure, tooling, and technical foundation of Thredd.

Ensuring high service uptime and reliability so product teams can innovate effectively.

Playing a key role in shaping the core technology layers that drive our platform’s success.

What You Need

Proven experience implementing SRE principles at scale, including deep knowledge of SLI / SLO / SLA differences.

A product engineering background with strong coding skills in Python, C#, or similar.

Experience with incident management frameworks and evolving them for efficiency.

Expertise in cloud platforms (AWS preferred), containers, and container orchestration (Docker, Kubernetes, ECS).

Solid understanding of microservices, service mesh, and modern architectural concepts.

A collaborative mindset – you thrive on helping others and driving company-wide impact.

Nice to Have

Experience working in regulated industries (e.g., PCI compliance).

Background in capacity planning, performance, and load testing.

Sysadmin skills for troubleshooting disk, network, and infrastructure issues.

Why Join Thredd?

The chance to define and lead SRE best practices from the ground up.

A high-impact role in a rapidly growing company.

A collaborative, innovation-driven culture where your expertise will shape our platform’s future.

If you’re excited about scaling infrastructure, improving reliability, and making a real impact, apply now and help us build the future of Thredd!

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Engineering and Information Technology
