Talent.com
Observability and Automation Engineer

TP ICAP, Belfast
30+ days ago
Job type
  • Full-time
Job description

Role Overview

This is a varied role working across a multitude of observability tools to ensure the smooth operation of Liquidnet’s Production trading platforms and the infrastructure supporting them. The role is responsible for identifying issues early, correlating related issues across hardware and software, and ensuring clear runbooks are created and followed at all times.

Liquidnet employ an offshore ‘Network Operational Centre’ (NOC) in India who perform ‘eyes on glass’ monitoring of systems, raising alerts to the relevant teams as per runbook definition. This role provides a level of oversight and guidance to the offshore team, but is embedded within the wider Production Support organisation so that it can provide realtime statistics and input towards root cause analysis, particularly during major incidents. Input is critical on major incident bridges and post-mortems to ensure continuous improvement to the observability of systems.

Liquidnet currently operate on-prem; however, over the next two years a major transformation programme will move all non-latency-sensitive applications to the Cloud, so experience of observability across both environments is crucial. A consolidation project is also underway across the observability tooling stack within the TP ICAP group, and it is likely to affect Liquidnet’s tooling and working practices.

Lastly, there is a big drive to automate as much as possible at Liquidnet, and this role would contribute by automating smoke checks and other manual workflows. The ideal candidate would combine scripting and automation skills with a ruthless desire to reduce toil and repetitive, low-value tasks.

Role Responsibilities

Day to day oversight of the offshore NOC (alongside the role’s manager in New York)

Day to day responsibility for monitoring tooling, ensuring designs are robust and prescribed documentation is followed at all times

Actively engage and participate on incident bridges, helping to identify root cause as quickly as possible through realtime status updates and monitoring

Participate in post-mortems to ensure monitoring quality is continually evaluated and any gaps identified within incidents are closed as quickly as possible

Onboard new workflows and systems into the observability stack, ensuring adherence to standards. This includes building bespoke solutions where vendor products fall short

Working with DevOps and Deployment teams to ensure system changes are agreed from an observability perspective and do not introduce unnecessary risk

Contributing towards Automation and AI workstreams with a particular focus on the automation of post-deployment checks and smoke testing of application workflows to remove the requirement for manual work

Occasional weekend work will be required during major upgrades and out of hours testing

Experience / Competences

Essential

At least 5 years’ hands-on experience within a SysOps or NOC team, ideally within a financial institution (buy-side, sell-side, venue / platform provider)

Working knowledge of Cloud-based infrastructure and application monitoring, ideally with AWS certification (Cloud Practitioner, SysOps Administrator or other)

Proven hands-on automation and scripting experience (Perl, Python, PowerShell, Bash etc)

Basic application support experience within a Unix / Linux environment

Experience of supporting Windows Server environments

Experience in troubleshooting network problems, such as firewall and routing problems

Proven track record of implementing and maintaining a robust and flexible monitoring solution across a complex technical environment

Proactive, tenacious individual with ability to solve complex issues

Willingness to challenge the status quo and bring about positive change

Capable of balancing multiple conflicting priorities and managing stakeholder expectations honestly and appropriately

Desired

Knowledge of the OpenTelemetry (OTel) and StatsD protocols

Knowledge of DevOps principles and workflows, including collaboration with Development teams

Experience with automation tools (Ansible, Puppet etc)

Experience supporting message-based architecture (Solace, Tibco, MQ etc)

Experience with industry-standard monitoring tools (ITRS, Prometheus or similar)

Working knowledge of, and experience with, SNMP and iLO

AWS-certified to SysOps Administrator level

Basic knowledge of the FIX protocol and workflows

Experience within MSSQL, Oracle or Sybase database environments

Experience working within an ITIL framework, ideally with ITIL Foundation qualification

