LSEG has won the Most Innovative Data Quality Initiative Award in A-Team Group’s Innovation Awards 2025 for its Tick History – PCAP (TH-PCAP), which was expanded this year to offer more than 400 feeds, with new coverage spanning 14 markets in the Americas, eight in the Asia-Pacific region and 76 in EMEA.
These awards, now in their fifth year, celebrate innovative projects and teams across vendor and practitioner communities that make use of new and emerging technologies and techniques to deliver high-value solutions for financial institutions in capital markets.
LSEG’s TH-PCAP was selected as an award winner by A-Team Group’s independent, expert advisory board in collaboration with A-Team’s editorial team.
William Watson, Director, Data, Low Latency at LSEG, explains how the innovation came to life and why data quality is so important to financial institutions.
Data Management Insight: Congratulations on winning the Most Innovative Data Quality Initiative Award. Can you tell us a little about how TH-PCAP has evolved and how it uses data?
William Watson: What started as Patrick Flannery and Michael Lehr in a garage with a dream and a single piece of software has grown into a world-class data platform. In the early days, it was scrappy — no customers, no funding, just grit. Slowly, customer by customer, the company grew organically to about 25 people before securing its first major investment round.
By the time I joined, there was no formal data team. Yet we were already generating around 20 terabytes of data per day. Our data stack was held together with cron jobs, shell scripts, and some Python, running on a homegrown but solid C++ engine. No formal platform or process.
Today, we’ve built a full-scale data ecosystem. We run over 1,500 Airflow jobs daily. Our test coverage spans unit, functional, and integration tests, along with staging, UAT, and production validation. We’ve implemented automated data intelligence, real-time checks, processing pipelines, and documentation so vast it might justify hiring a librarian. We’ve gone from garage to enterprise-grade.
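To make the orchestration concrete, here is a minimal sketch of the kind of daily quality-check job such an Airflow deployment could run. The DAG name, feed identifiers, and check logic are hypothetical placeholders for illustration, not LSEG's actual pipeline.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def feed_file_exists(feed: str) -> bool:
    # Stub: a real check would consult object storage or a capture manifest.
    return True


def validate_capture() -> None:
    # Fail the task (and so trigger alerting) if any expected capture is missing.
    expected_feeds = ["feed_a", "feed_b"]  # hypothetical feed identifiers
    missing = [f for f in expected_feeds if not feed_file_exists(f)]
    if missing:
        raise ValueError(f"Missing capture files for: {missing}")


with DAG(
    dag_id="pcap_daily_quality_checks",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="validate_capture", python_callable=validate_capture)

The same pattern extends to many checks per feed: each task raises on failure, and the scheduler's retry and alerting hooks take it from there.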
DMI: What are its unique attributes and how is data quality ensured?
WW: Our data starts with some of the best engineers in the world — literally. Seven of our C++ developers sit on ISO C++ committees. They have built custom software to capture packet-level data with zero gaps, using enhanced industry-standard PCAP formats. Our software is so good that we actually sell Real-Time – Ultra Direct as a separate product.
We live-monitor every component — the capture system, the software, the files — to ensure every bit of data is complete and reliable. Downstream, we inspect data at every stage: arbitration, normalisation, storage. We look for gaps, anomalies, missing fields, nonsensical values, and anything else that doesn’t look right. Data quality isn’t something we add on — it’s something we build in, end-to-end.
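As a toy illustration of one such downstream check, the sketch below scans decoded feed messages for sequence-number gaps. The message layout and the "seq" field name are assumptions made for the example; real feeds differ by venue and protocol.

from typing import Iterable, List, Tuple


def find_sequence_gaps(messages: Iterable[dict]) -> List[Tuple[int, int]]:
    # Return (expected, received) pairs wherever the sequence number jumps.
    gaps = []
    prev = None
    for msg in messages:
        seq = msg["seq"]  # assumed per-message sequence field
        if prev is not None and seq != prev + 1:
            gaps.append((prev + 1, seq))
        prev = seq
    return gaps


# Example: messages 4 and 5 never arrived, so one gap is reported.
print(find_sequence_gaps([{"seq": 1}, {"seq": 2}, {"seq": 3}, {"seq": 6}]))  # [(4, 6)]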
DMI: What would be the consequences if data quality cannot be assured?
WW: For hedge funds and front-office traders: poor-quality data can lead to misinformed algorithms, and without a robust assurance process to correct it, revenue can suffer.
For governments and compliance teams: invalid or non-compliant trades could be missed — or worse, false positives could flood their systems, wasting time and causing unnecessary friction.
For global banks: all of the above risks apply — but there’s an added weight when retail investors’ money is in play. That’s real people’s savings on the line.
DMI: Why is it so important that data quality is optimised in modern data setups?
WW: Because scale changes everything. With Tick History – PCAP, you have data from more than 450 trading venues, thousands of daily jobs and tens of petabytes in storage. No human — or team of humans — can manually keep up. You have to trust your systems. But more than that, you need systems to monitor the monitors.
Perfection isn’t the goal. In fact, chasing perfection is a trap. What’s far more powerful is building flexible, intelligent systems that adapt, learn, and optimise. We don’t just want more alerts — we want better alerts. Smarter detection. Actionable insights.
If you’ve got 1,000 alerts and only 10 people, that’s not monitoring — that’s chaos. Quality at scale is about strategy, efficiency, and continuous evolution.
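One crude way to turn raw alert volume into something a small team can triage is to deduplicate and rank alerts before they reach a human. The sketch below is a hypothetical illustration of that idea, not a description of LSEG's monitoring stack.

from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    check: str   # e.g. "sequence_gap"
    feed: str    # e.g. "feed_a"
    detail: str


def summarise(alerts: List[Alert]) -> List[str]:
    # Collapse repeated alerts into one line per (check, feed) pair, ranked by
    # frequency, so a handful of people see a summary instead of raw events.
    counts = Counter((a.check, a.feed) for a in alerts)
    return [f"{check} on {feed}: {n} occurrence(s)"
            for (check, feed), n in counts.most_common()]


raw = [Alert("sequence_gap", "feed_a", "seq 4-5 missing")] * 3 + [
    Alert("late_file", "feed_b", "capture 12 min late"),
]
for line in summarise(raw):
    print(line)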
DMI: How does this fit into a modern trading architecture – have you needed to rebuild other parts of your setup?
WW: Our data and software are used throughout the entire trading lifecycle:
Algorithm training: Every night, our PCAP data powers offline learning systems to help customers optimise their trading strategies.
Trade validation: Pre-trade, post-trade, even mid-trade — our PCAP data helps validate the legality and accuracy of every execution.
Performance analytics: Want to know if you beat the market? Our data tells you.
Backtesting: Hypotheses don’t matter until they’re tested. We provide the testing ground.
Regulatory insight: Government agencies use our data to understand the markets and inform policy.
And yes, there are many other use cases that are NDA’d or classified.
We are always rebuilding some part of our platform to meet these and future, unpredictable needs — and it’s been worth every bit of effort. Everything is built with scalability and the future in mind, and when we find that something may not scale the way we would like, we consider rebuilding it.
DMI: What market trends/changes/challenges have stimulated the development of this innovation?
WW: Real-time is the new normal. When I started in this space, most customers were fine with T+1 delivery — “give me the data tomorrow, we’ll be fine”.
Now? T+1 means T+15 minutes. Some want near-instant, intra-day updates.
This isn’t just about speed — it’s about actionable quality. High-frequency trading has been around forever, but the data behind it hasn’t always kept up. We’ve been capturing real-time data since day one, and our data is trusted around the world for its accuracy and reliability, but now customers actually want to use it in real time.
Processing 70-plus terabytes per day isn’t a flex — it’s a challenge. The data is only as good as what you can do with it. We’re finally reaching a point where customers are trying to operationalise it at scale. It’s hard, but that’s where we thrive.
DMI: Do you have any further updates in the pipeline?
WW: Always — otherwise, I’d be out of a job.
This year, we’re migrating our core data processing to the cloud. That unlocks huge wins: isolated workflows, full data lineage, elastic scaling, faster pipelines, real-time observability, and full redundancy.
And beyond infrastructure, we’re exploring AI-driven arbitration and data quality diagnostics. Those systems are still in early stages, but by 2026 we expect them to be production-grade.
There’s no finish line — we’re always pushing forward.