As market data volumes continue to grow, firms across the industry – both market participants and their service providers – are having to think strategically about what kind of infrastructure they need in order to accommodate this growth, particularly under peak loads, whilst optimising for high performance.
Given the rising volume, what kind of decisions are firms now having to make around how they deploy networks, databases and compute resources? How can they leverage service providers to help them achieve the necessary performance at scale? How are service providers rising to the challenge of supporting such a demanding customer base in this environment, ensuring that their customers get the data they need, when they need it, in a clean and consistent way? And how are firms adopting cloud-based solutions for high performance data management?
What exactly do we mean by high performance?
With such a wide variety of participants in the financial markets, every firm undoubtedly has its own idea of what constitutes high performance.
“Market making firms & HFTs are latency focused and rely on their speed to set and react to prices across multiple venues simultaneously. Their success is ultimately dependent on integrating and leveraging high performance trading infrastructures,” says Scott Charity, Senior Market Intelligence Specialist & Regulatory Affairs at Berenberg, the German private bank. “Whilst latency is also important to market practitioners executing orders on behalf of institutional clients, a more holistic approach to price discovery and liquidity provision is required to mitigate market impact. This approach relies heavily on aggregating market data across numerous sources efficiently and effectively.”
This is an important point, because it highlights the fact that firms throughout the industry are impacted by growing data volumes. And they all need the ability to process this data in a timely manner, not just for fast trading, but for other purposes across the enterprise.
“We draw a distinction between real-time databases and real-time analytics,” says Umair Waheed, Head of Product Marketing at cloud data specialists Yellowbrick Data. “From an analytics standpoint, high performance is about understanding capital liquidity, understanding your aggregate risk across all different portfolios, across different parts of your organisation, and making sure that you’re not overexposed in any one area at a given point in time. The faster you can get an accurate picture of those positions, the faster you can act.”
This means giving customers full control of, and visibility into, their data, explains Waheed. “We’re exposing all of that data to all the people in the organisation. It could be quants, it could be finance, it could be operations, or compliance teams that don’t have to learn any new skills and don’t need any licences to use that data.”
Network implications of market data growth
For sell-side firms servicing institutional clients, the growth in market data volumes (particularly in the US, where the OPRA feed reported 128.5bn messages per day in December 2021) is having a significant impact on the network capacity they need to deploy in order to maintain high performance.
“There are now a number of US venues where users can’t even get the full bandwidth on a 10 GbE connection, so they are forced to move to 25, 40 or even 100 GbE, to prevent packets queuing,” says Ben Stephens, Global Head of Product Management at Instinet, the global agency broker. “I expect to see a number of exchanges outside of the US go that way too,” he adds.
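To make the bandwidth point concrete, the following is a rough back-of-the-envelope sketch in Python, not a capacity-planning tool. Only the 128.5bn messages per day figure and the 10, 25, 40 and 100 GbE link speeds come from the article; the 6.5-hour session length, the function names, and any message size or burst multiplier passed in are illustrative assumptions.

```python
# Rough sizing sketch. The session length and all inputs other than the
# OPRA daily message count are assumptions chosen for illustration.
SECONDS_PER_SESSION = 6.5 * 60 * 60  # assumed 6.5-hour US trading session


def average_message_rate(messages_per_day, session_seconds=SECONDS_PER_SESSION):
    """Average messages per second if the day's traffic were spread evenly."""
    return messages_per_day / session_seconds


def required_gbps(messages_per_second, bytes_per_message):
    """Bandwidth in gigabits per second for a given rate and message size."""
    return messages_per_second * bytes_per_message * 8 / 1e9


def smallest_sufficient_link(burst_gbps, link_speeds_gbe=(10, 25, 40, 100)):
    """Return the smallest link speed that carries the burst without queuing,
    or None if even the largest link is too small."""
    for speed in sorted(link_speeds_gbe):
        if burst_gbps <= speed:
            return speed
    return None


if __name__ == "__main__":
    avg_rate = average_message_rate(128.5e9)   # figure quoted in the article
    avg_gbps = required_gbps(avg_rate, 100)    # 100 bytes/message is hypothetical
    burst_gbps = avg_gbps * 5                  # 5x burst multiplier is hypothetical
    print(f"average rate: {avg_rate:,.0f} msgs/s")
    print(f"average load: {avg_gbps:.2f} Gbps, burst load: {burst_gbps:.2f} Gbps")
    print(f"smallest sufficient link: {smallest_sufficient_link(burst_gbps)} GbE")
```

The average understates the problem, because traffic is concentrated in short bursts. Sizing the link against the burst figure rather than the daily average is what pushes firms beyond 10 GbE.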
Trading infrastructure providers are also upgrading their networks to accommodate the growing volumes, explains Alastair Watson, EMEA Managing Director at managed solutions vendor Transaction Network Services (TNS). “A massive part of our business is distributing raw market data around the world, so we understand the importance of network capacity. Many of our clients are data normalisers, that’s the core of their offering. And if they get poor data in, they send poor data out, so we’re held very accountable by those clients, in particular for zero packet drops. We’ve upgraded to 100 Gig circuits around the US, and upgraded all the switching infrastructure, to facilitate the additional bandwidth requirements of growing market data volumes.”
Richard Balmer, Director, Network Product Management at IPC, which delivers communications solutions for financial markets, suggests that the current landscape is driving firms towards specialist service providers. “The sustained increases in market volatility have driven a huge amount of data volume, requiring more network bandwidth, more compute horsepower, more memory, and more database storage. Firms that planned upgrades ahead of the global supply chain issues we’re now experiencing are probably sitting more comfortably than others right now,” he says. “Enterprises should really be looking to have an offload strategy with managed service providers who tend to have a huge amount of capacity in their networks anyway. And from a compute and database expansion perspective, where better than looking at Infrastructure-as-a-Service plays, either with a managed service provider or with public cloud players?”
Can outsourcing help?
According to Will Winzor Saile, Partner, Execution Analytics and Architecture at research and execution broker Redburn, significant benefits can be gained from outsourcing trading and data infrastructure, not least of which is that it allows firms to focus on their core competencies.
“Redburn’s goal is to provide better trading outcomes for our clients. To do that we need to focus on where we can add value,” he says. “Spending my time touring around data centres and moving racks around is not worthwhile, particularly when many of our vendors have years of experience running large data centres and have agreements with hardware and network providers. It’s much more efficient for us to outsource the management of the infrastructure to the experts. Of course, that doesn’t mean outsourcing the oversight, we still know where the servers are, how they’re connected, and so on. We have all the control we need to manage our trading performance, but none of the maintenance overhead.”
The build, buy, or outsource question is an important one, particularly when it comes to capturing and storing the huge amounts of market data that firms might need for a wide range of downstream business functions. However, there are some key issues to consider, says Stephens.
“The question is, do you do this on your own or do you go to a provider who can capture and timestamp all of this data?” he asks. “Quant hedge funds and high frequency trading firms have exacting requirements. There are providers out there now that do a pretty good job of capturing market data before providing it to you as a service. But then there are questions around usage rights and who owns what. It’s nice to have a third party capture all the data and present it to you via GCP or AliCloud or Azure, but the licensing can be quite tricky. Still, there are providers starting to make that happen.”
One such provider is Yellowbrick, suggests Waheed. “People today need fear-free access to data,” he says. “It’s no longer acceptable to have big repositories of historical datasets that are only available to be queried occasionally. Data needs to be fluid and it needs to be accessible at all times. Users, whether finance teams, analysts or quants running analytics over multiple years’ worth of data, now have an expectation of this data being available, to effectively build and back test more accurate models, run more scenarios and to have a better sense of risk.”
Data traceability and monitoring
For firms maintaining their own high performance data infrastructure, having traceability of data throughout the trade lifecycle is essential, says Ilya BochKov, Senior Architect, Capital Markets at Luxoft, the enterprise technology services company.
“With the advent of microservice architecture, different parts of the business process can be comprised of multiple microservices, platforms, Ethernet interfaces, APIs, and a whole range of other components,” he says. “If something goes wrong, or if something slows down, it’s essential to pinpoint at what stage the request fails. Traditionally, it used to be a big monolith that just sequentially logged to some storage and then you could plough through the logs and find the problem. These days, you have to treat logging and tracing capabilities as multiple streams of events that can be collected and correlated to each other through a dashboard, so that you can track how the request goes through all the sources, retries, rollbacks, roll-forwards, and so on.”
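The idea of treating logs as correlated event streams can be sketched in a few lines of Python. This is a minimal illustration, assuming every service stamps its events with a shared correlation ID and a comparable timestamp; the `TraceEvent` fields, the status values and the function names are all hypothetical and do not describe any particular tracing product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """One log or trace record emitted by a service (hypothetical schema)."""
    correlation_id: str   # assumed to be propagated with the request
    service: str
    timestamp_us: int     # assumed to come from synchronised clocks
    status: str           # "ok", "retry" or "error" in this sketch


def correlate(*streams):
    """Merge per-service event streams into one time-ordered trace per request."""
    traces = defaultdict(list)
    for stream in streams:
        for event in stream:
            traces[event.correlation_id].append(event)
    for events in traces.values():
        events.sort(key=lambda e: e.timestamp_us)
    return dict(traces)


def first_failure(events):
    """Return the earliest event with an error status, or None."""
    for event in events:
        if event.status == "error":
            return event
    return None


def slowest_hop(events):
    """Return (from_service, to_service, gap_us) for the largest gap
    between consecutive events, or None if there are fewer than two."""
    if len(events) < 2:
        return None
    a, b = max(zip(events, events[1:]),
               key=lambda pair: pair[1].timestamp_us - pair[0].timestamp_us)
    return (a.service, b.service, b.timestamp_us - a.timestamp_us)
```

Given the merged trace for a request, `first_failure` answers "at what stage did it fail?" and `slowest_hop` answers "where did it slow down?". A real deployment would also have to cope with missing IDs and clock skew between hosts, which this sketch ignores.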
TNS’s Watson highlights the fact that in a high-performance trading environment, this is no small undertaking. “You can’t just use standard off-the-shelf monitoring and instrumentation when you’re talking about these levels of volume and throughput,” he says. “The key to infrastructure monitoring is twofold: making sure that you can measure at a granular enough time level, and ensuring that you can do it at multiple points. You’re most likely to lose data when you get rapid spikes, micro-bursts where you can get tens of thousands of messages in a handful of microseconds. If you do experience packet loss, you need to be plugged in at multiple places to identify where that happened, whether it was on a switch somewhere or whether it was on a particular circuit. Only when you’ve got that amount of visibility can you really dive into where the packet loss might have happened and then resolve it.”
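Both halves of that approach, fine-grained time measurement and measurement at multiple points, can be illustrated with a short Python sketch. It assumes nanosecond capture timestamps and per-message sequence numbers are available at each tap; the window size, threshold, tap names and function names are hypothetical, and production systems would typically do this in dedicated capture hardware rather than in Python.

```python
from collections import Counter


def find_microbursts(timestamps_ns, window_ns=10_000, threshold=1_000):
    """Bucket capture timestamps into fixed windows and return
    (window_start_ns, message_count) for windows at or above the threshold."""
    counts = Counter(ts // window_ns for ts in timestamps_ns)
    return sorted((bucket * window_ns, n)
                  for bucket, n in counts.items() if n >= threshold)


def missing_sequences(seen, first, last):
    """Sequence numbers in [first, last] that never arrived at a tap."""
    return set(range(first, last + 1)) - set(seen)


def locate_loss(capture_points, first, last):
    """Walk taps from upstream to downstream and report where each gap
    first appears. capture_points is an ordered list of
    (tap_name, sequence_numbers_seen) pairs."""
    findings = []
    already_missing = set()
    for name, seen in capture_points:
        missing = missing_sequences(seen, first, last)
        newly_missing = missing - already_missing
        if newly_missing:
            findings.append((name, sorted(newly_missing)))
        already_missing |= missing
    return findings
```

A message that is present at one tap but absent at the next was lost on the segment between them, which is why a single measurement point cannot localise the fault. Note that fixed windows can split a burst across a boundary; a sliding window would catch those cases at extra cost.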
It is also important that the monitoring and instrumentation is kept independent from the trading infrastructure, suggests IPC’s Balmer. “When you’re building a monitoring management capability in a high-performance environment, it’s best practice to decouple that, rather than put that burden on the infrastructure,” he says. “That gives you more flexibility to do upgrades and still have consistency in the way you’re managing and monitoring. Also, from a best of breed standpoint, it’s easier to achieve the right level of resolution across multiple infrastructure types if you can plug in at various different levels for an end-to-end view. Having an infrastructure-independent monitoring capability, being able to pull in the data sets, normalise that data, and then have a view across everything, is key in this multi-cloud world in which we’re now operating.”
Maintaining data consistency
Given that many firms now ingest multiple data streams from multiple sources in a variety of formats, what are some of the issues that firms face around consistency of data, and how can those issues be addressed?
“You don’t necessarily have to be an ultra-low latency trading shop to demand clean data,” says Watson. “Everyone should have access to clean data, because if you’ve got poor data, you’re going to make poor trading decisions. The range of trading firms requiring high performance varies from a buyside client just doing a handful of instruments to a large enterprise trading hundreds of markets across multiple geographic locations. Clearly the more trading you do, the more instruments you consider, the more high-performance you need your infrastructure to be to fulfil those requirements. But the baseline for anyone is good clean data.”
“There’s a lot of irrelevant data, which adds to both storage and transfer costs,” adds BochKov. “It also acts as a blockage for relevant data, which may not be that big, but it has to be precise, clean, high quality and fast. Whereas if you just dump everything into one big, centralised data platform, then your data set or data stream that’s actually interesting to the business can get lost, which in turn affects the focus of the engineers.”
This is an area that benefits from being outsourced, says Winzor Saile. “Historic market data for research and back-testing algos is always a challenge. Despite being relatively easy to source, it requires a huge amount of manual oversight to check, cleanse, normalise and maintain. For many, it’s difficult to justify the cost of maintaining the infrastructure to manage that data. Redburn have gone down the route of outsourcing. That’s something that’s only really been available in the last few years, there weren’t many firms with the infrastructure to support that level of data. But with the growth of cloud, it’s become a lot easier for specialist providers to spin up an optimised database and provide compute power over the top of that data.”
Use of the cloud
How are firms now using the cloud in high performance trading environments?
“A key benefit of the cloud is the ability to run against cloud-native datasets, with an analytics capability that runs close to where that data resides, not having to ship it back to another cloud or to an on-prem location,” says Waheed. “To keep data fluid, you don’t want different solutions in different places. What we’re creating with Yellowbrick is a data fabric that spans multiple clouds. And having the ability to run anywhere is a key advantage.”
“Cloud is the perfect place to train models,” adds Stephens. “CSPs like Amazon and Google have invested heavily in custom hardware for training AI and reinforcement learning-type models. These tasks don’t tend to be a base load compute, however. Generally speaking, if you are renting a server from a CSP where your usage is more than 40% of the time, then it’s cheaper for you to buy the server and run it yourself. But to get those loads up to 40% is really hard, so shared infrastructure is great for those more ad-hoc type tasks.”
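The rent-versus-buy rule of thumb reduces to a simple break-even calculation, sketched below in Python. The 40% threshold is Stephens’s figure; every price in the example is a hypothetical number chosen so that the arithmetic lands on that threshold, and the cost model (straight-line depreciation plus a flat annual running cost) is a deliberate simplification.

```python
HOURS_PER_YEAR = 24 * 365


def owned_cost_per_year(purchase_price, lifetime_years, running_cost_per_year):
    """Annual cost of owning a server: straight-line depreciation plus
    power, space and support (a simplified, assumed cost model)."""
    return purchase_price / lifetime_years + running_cost_per_year


def break_even_utilisation(rental_price_per_hour, owned_cost):
    """Fraction of the year a rented server must be in use before
    renting costs as much as owning."""
    return owned_cost / (rental_price_per_hour * HOURS_PER_YEAR)


def cheaper_option(utilisation, rental_price_per_hour, owned_cost):
    """Return "rent" or "buy" for a given utilisation between 0 and 1."""
    rented_cost = utilisation * HOURS_PER_YEAR * rental_price_per_hour
    return "rent" if rented_cost < owned_cost else "buy"


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate a 40% break-even.
    owned = owned_cost_per_year(purchase_price=12_000, lifetime_years=4,
                                running_cost_per_year=504)
    print(f"break-even utilisation: {break_even_utilisation(1.0, owned):.0%}")
    print(f"at 20% utilisation: {cheaper_option(0.2, 1.0, owned)}")
    print(f"at 60% utilisation: {cheaper_option(0.6, 1.0, owned)}")
```

The break-even point moves with hardware prices, rental rates and the assumed lifetime, so the 40% figure is best read as a rule of thumb rather than a constant.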
Balmer’s view is that moving to public cloud platforming is not necessarily about saving money but more about enabling new levels of transparency in cost management and agility in the business.
“Firms are now looking to refactor their application estate to move on to a cloud-native environment, primarily because it’s the best way to evolve a platform without any specific lock-in to proprietary technology. It allows you to de-risk, so you can build more highly available cloud-native infrastructure,” he says. “What’s interesting though, is that the high-performance folks are getting into public cloud because of the scalable, elastic compute capability, the data laking, and the AI/ML training techniques that you can use on your datasets. The insights that are delivered through that kind of data processing & data analytics machinery, can be used to then influence strategies, and firms are able to use that to tune in and keep a watchful eye as those strategies unfold in the marketplace.”
“As a mid-sized firm, the idea of having huge amounts of compute power to run machine learning models would likely have resulted in challenges from our IT team over whether we justify something that’s going to sit there and not do anything for much of the time,” adds Winzor Saile. “By leveraging cloud infrastructure, if I need to spin up a cluster to do some historic analysis or to process large amounts of data, I can. More importantly, I just pay for it when I need it, which is a lot easier to justify. Because a lot of our data is already in the public cloud, we can easily pull out whatever data we need, combine it with data from other providers, run some analytics, and then push that wherever it needs to go; whether that’s our algos, smart order routers, or wherever else.”
Enrico Cacciatore, Head of Market Structure & Trading Analytics at Voya Investment Management, the asset management business of Voya Financial, provides some final thoughts. “Historically, most architecture framework has been on-prem, constantly pulling data in, whether aggregated market data or direct feeds. And computationally that made things more challenging. Whereas now the migration to more of a cloud-agnostic environment, not only for the computational framework, but also for how data is accessed and distributed, has made things much easier, especially when it’s more on-demand versus having a very large container of market data. You can go to an aggregator whose specialty is high speed, low latency aggregation of market data, pull full records of whatever sort of data you’re looking for, and apply visualisation, machine learning, AI functionality, whatever you want. It allows you to compartmentalise each aspect of functionality and leverage APIs to stitch it all together.”
To learn more about how firms are deploying high-performance data systems to support trading analytics and other business functions, please register for the upcoming A-Team webinar, ‘Best Practices for Building High-Performance Data Infrastructures’.