By Dave Carson, Head of Field Engineering at DiffusionData.
The challenges I want to discuss affect every trading participant today, including exchanges, data providers, banks and hedge funds. These include:
Controlling the Firehose of Data
With the rise of electronic trading, data volumes have spiralled. As technology continues to advance, so does the speed at which automated systems submit orders, quotes and other messages. However, not every application needs to receive all market data. To keep up with these volumes, messaging platforms moved away from socket-based TCP to multicast, which removes the need for ACKs.
What this means is that platforms simply overload downstream clients with data they may not be interested in, or may never have asked for. How do you solve this problem? Structuring data in a hierarchical topic tree lets clients subscribe at the level relevant to them, so they receive only the content they need. In addition, the delivery characteristics of the data can be modified: it can be throttled for websites, for example, or delayed for non-paying users.
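To make the topic-tree idea concrete, here is a minimal sketch of hierarchical subscription matching. It is not tied to any specific messaging product; the topic paths, client names and routing function are illustrative assumptions.

```python
def matches(subscription: str, topic: str) -> bool:
    """Return True if a topic path falls under a subscription path.

    A subscription to "fx/eur" covers "fx/eur" itself and every
    descendant such as "fx/eur/usd/spot".
    """
    sub_parts = subscription.split("/")
    topic_parts = topic.split("/")
    if len(sub_parts) > len(topic_parts):
        return False
    return topic_parts[:len(sub_parts)] == sub_parts

def route(update_topic: str, subscriptions: dict) -> list:
    """Return only the clients whose subscriptions cover this update."""
    return [client
            for client, subs in subscriptions.items()
            if any(matches(s, update_topic) for s in subs)]

subs = {
    "desk-a": ["fx/eur"],            # everything under the EUR branch
    "desk-b": ["fx/eur/usd/spot"],   # a single instrument only
    "website": ["equities"],         # a different branch entirely
}
print(route("fx/eur/usd/spot", subs))   # ['desk-a', 'desk-b'] - the website gets nothing
```

Because clients subscribe at their own level of the tree, an update fans out only to the sessions that asked for that branch, rather than being broadcast to everyone.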
Another challenge is that different users have different needs, but with personalization technology you can cater for a global userbase from a single data source, all in real-time.
This is all about controlling the firehose. Depending on what a trader is looking at, personalized content can be made available, saving CPU, NIC utilization and network bandwidth. By controlling the firehose, everybody wins.
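One common way to tailor the texture of delivery, as mentioned above, is conflation with throttling: keep only the latest value per topic and release a snapshot at most once per interval, so a slow consumer sees fresh prices rather than a growing backlog. This is an illustrative sketch only; real platforms typically do this per-session.

```python
class ConflatingThrottle:
    """Conflate updates per topic and release them at most once per interval.

    Intermediate ticks are overwritten, never queued, so a throttled
    client (e.g. a website) always receives the latest value.
    """

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self.latest = {}        # topic -> most recent value only
        self.last_flush = 0.0

    def update(self, topic: str, value) -> None:
        self.latest[topic] = value   # newer value replaces the old one

    def flush(self, now: float) -> dict:
        if now - self.last_flush < self.interval_s:
            return {}                # too soon: hold the data back
        self.last_flush = now
        out, self.latest = self.latest, {}
        return out

t = ConflatingThrottle(interval_s=1.0)
for px in (1.0841, 1.0842, 1.0840):
    t.update("fx/eur/usd/spot", px)
print(t.flush(now=10.0))   # only the latest tick survives conflation
```

Three ticks arrived, but only one crosses the wire; the two intermediate values were superseded before the throttle window opened.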
Integration Challenges
As more feeds are built (including internal systems), data needs to be sent from A to B, on to C, and perhaps back to A. All of these systems need to be able to talk to each other, which is a technical challenge every time. The problem is compounded when each engineer tasked with an integration brings their own style, leading to a plethora of distinct applications being deployed to production. They behave differently and create daily challenges and issues, especially if the person responsible for a specific set of code leaves the company. The whole approach is, frankly, unsupportable and not a strategy.
The solution is to deploy a framework that integrates back-end data sources and external systems using consistent boilerplate code. With such a framework, this industry-wide problem is resolved because adapters look and behave the same. This saves time in production support, since every adapter shares 90% of its code, which is tested and production-ready. Not only does this mitigate the risk of bugs each time a new adapter is built, but users also benefit from a consistent upgrade/downgrade process, configuration, administration and monitoring. With a framework, developers can build once and deploy many times.
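The "build once, deploy many" idea can be sketched as a base class that owns the shared lifecycle, with each feed supplying only its own transformation. The class and method names here are illustrative assumptions, not a real product API.

```python
from abc import ABC, abstractmethod

class Adapter(ABC):
    """Shared boilerplate every adapter inherits: the same message path
    and the same error handling, so all adapters behave identically."""

    def __init__(self, publish):
        self.publish = publish  # injected sink, e.g. a topic publisher

    def on_message(self, raw: dict) -> None:
        try:
            topic, value = self.transform(raw)
        except (KeyError, ValueError):
            return  # shared error path: drop malformed input (could log/alert)
        self.publish(topic, value)

    @abstractmethod
    def transform(self, raw: dict):
        """The only feed-specific code: map a raw message to (topic, value)."""

class FxFeedAdapter(Adapter):
    """A new feed is ~10% of the code: just the transform."""
    def transform(self, raw):
        return f"fx/{raw['pair'].lower()}", float(raw["px"])

published = []
adapter = FxFeedAdapter(publish=lambda t, v: published.append((t, v)))
adapter.on_message({"pair": "EURUSD", "px": "1.0841"})
adapter.on_message({"px": "bad"})   # malformed: handled by the shared code
print(published)                    # [('fx/eurusd', 1.0841)]
```

Because the connect/publish/error logic lives in one tested place, a new adapter cannot diverge in behaviour, and whoever supports it in production already knows how it works.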
Disaster Recovery and Data Replication
Sending data from A to B, from site to site, or from cloud to site, creates challenges, even for something as simple as keeping Disaster Recovery (DR) up-to-date. Any institution mandated to run DR tests as part of its compliance obligations must maintain an updated DR site that is ready in the event of failure.
Keeping DR up-to-date with such high data volumes is not an easy task. Every DR test I have been involved in takes weeks of planning. You tend to run it over a weekend, get your results, and if it fails, analyse why it failed and how to make sure it doesn't happen again. That's the process, and I am aware of many DR tests failing! This will resonate with a lot of people: many DR sites are out-of-date. For example, changes made in production over a weekend may never be replicated to the DR site.
To solve this, there needs to be a focus on streaming data reliably, securely and efficiently. It is possible to keep the DR site ready to go at all times. This involves replication between sites, such that switching over is as simple as flipping a switch on the load balancer, and clients still receive a seamless experience. One month you can be running on site A, the next month on site B. You don't actually have a DR site; you have two primaries, and they're always ready rather than sitting there going stale. Many trading firms are seeking to implement this sensible flip/flop approach.
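The flip/flop approach can be sketched as follows: every update is replicated to both sites as it happens, so "failover" reduces to repointing the load balancer. The site names and data structures here are illustrative assumptions.

```python
class Site:
    """One deployment site holding the current replicated state."""
    def __init__(self, name: str):
        self.name = name
        self.state = {}   # topic -> latest value

class ActiveActive:
    """Two primaries, continuously replicated; no stale DR site."""

    def __init__(self, a: Site, b: Site):
        self.sites = [a, b]
        self.active = a            # where the load balancer points today

    def publish(self, topic: str, value) -> None:
        for site in self.sites:    # replicate to BOTH sites on every update
            site.state[topic] = value

    def flip(self) -> None:
        """The whole 'DR test': flip the switch on the load balancer."""
        self.active = self.sites[1] if self.active is self.sites[0] else self.sites[0]

a, b = Site("site-a"), Site("site-b")
cluster = ActiveActive(a, b)
cluster.publish("fx/eur/usd/spot", 1.0841)
cluster.flip()                        # next month, run from site B
print(cluster.active.name)            # site-b
print(cluster.active.state == a.state)  # True: nothing stale to catch up
```

Because replication happens on every publish rather than in a weekend batch, the standby already holds the same state as the primary at the moment you flip.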
Auto Scaling and Handling Peak Events
Operational costs are always a focus for financial institutions, and indeed all companies. Reducing them requires platform intelligence that can reduce operational complexity and cost, manage the increased scale of real-time data delivery to meet market demands, and assure data access control and security. The key to success is deploying a platform that can easily and reliably scale up and down as required, and that delivers data with optimal efficiency, using data deltas and compression algorithms to reduce bandwidth usage, thereby reducing both CapEx and OpEx.
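The bandwidth-saving techniques mentioned above, deltas plus compression, can be sketched in a few lines. The JSON wire format and field names here are illustrative, not any product's actual protocol.

```python
import json
import zlib

def delta(prev: dict, curr: dict) -> dict:
    """Only the fields whose values changed since the last snapshot."""
    return {k: v for k, v in curr.items() if prev.get(k) != v}

def encode(payload: dict) -> bytes:
    """Serialize and compress a payload for the wire."""
    return zlib.compress(json.dumps(payload).encode())

prev = {"bid": 1.0840, "ask": 1.0842, "venue": "EBS", "ccy": "EURUSD"}
curr = {"bid": 1.0841, "ask": 1.0842, "venue": "EBS", "ccy": "EURUSD"}

full = encode(curr)
d = encode(delta(prev, curr))    # only {"bid": 1.0841} goes on the wire
print(len(d) < len(full))        # True: the delta payload is smaller
```

On fast-ticking instruments, most fields are unchanged between updates, so sending only the changed fields (and compressing the result) multiplies the saving across every client connection.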
Development teams are now, more than ever, seeking a low-code approach to building their next generation event-driven applications. For the finance industry to continue its digital transformation, driven by AI, machine learning, and cloud architectures, an intelligent event data platform that can autonomously react to market activity in real-time will be pivotal to the evolution of their infrastructures. This has to be the case if you want to deliver a hyper-personalized experience.
Controlling Access to Data/Entitlements
Every exchange and data provider controls access to the data it provides to trading firms. Data consumers must be able to report and monitor who is using what data, and ensure only permissioned users receive it. Moreover, data providers may visit a trading firm to verify how the firm is enforcing access to that data, which can be quite hard to demonstrate.
This requires a platform that can talk to any permissioning system, monitor in real-time who is using what data, and record when each user subscribes and unsubscribes. In addition, permission changes need to take effect in real-time. This information can be collected and reported electronically, and then used as compliance evidence. In essence, what seems a complicated procedure becomes a file-sharing exercise.
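A subscribe-time entitlement check with an audit trail might look like the sketch below. The in-memory permission dictionary stands in for whatever permissioning system is actually in use; the names are hypothetical.

```python
import time

class EntitlementGate:
    """Check entitlements at subscribe time and record every decision."""

    def __init__(self, permissions: dict):
        self.permissions = permissions   # user -> set of permitted topic prefixes
        self.audit = []                  # (timestamp, user, topic, action, allowed)

    def _allowed(self, user: str, topic: str) -> bool:
        return any(topic.startswith(p) for p in self.permissions.get(user, ()))

    def subscribe(self, user: str, topic: str) -> bool:
        ok = self._allowed(user, topic)
        # Every attempt, allowed or denied, is logged as compliance evidence.
        self.audit.append((time.time(), user, topic, "subscribe", ok))
        return ok

    def revoke(self, user: str) -> None:
        """A permission change takes effect immediately for new subscriptions."""
        self.permissions.pop(user, None)

gate = EntitlementGate({"trader1": {"fx/"}})
print(gate.subscribe("trader1", "fx/eur/usd"))      # True: permitted
print(gate.subscribe("trader1", "equities/AAPL"))   # False: denied, but logged
gate.revoke("trader1")
print(gate.subscribe("trader1", "fx/eur/usd"))      # False after revocation
```

The audit list is exactly the evidence a visiting data provider asks for: exporting it turns the verification exercise into a file-sharing one.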
Resiliency and Monitoring
How do you make sure there is no single point of failure? Clustering technology is key here. Node failure doesn't impact data flows, because the cluster replicates data, configuration and sessions across nodes. If you haven't got a clustering strategy, you should have.
How do you monitor the sheer number of disparate systems I have discussed? A monitoring API that can expose any measurable metric is crucial, be it network, operating system, data, or client/user related. And because it's an API, it can integrate with any monitoring system. That integration is itself a challenge for financial services, which goes back to my earlier point about integration.
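The metric-API idea can be sketched as a small registry of gauge callbacks whose snapshot can be handed to any external monitoring system. The metric names and registry shape are illustrative assumptions.

```python
class Metrics:
    """Register any measurable value as a gauge; snapshot on demand."""

    def __init__(self):
        self.gauges = {}    # metric name -> zero-arg callable returning a number

    def register(self, name: str, fn) -> None:
        self.gauges[name] = fn

    def snapshot(self) -> dict:
        """One consistent read of every registered metric, in a neutral
        shape any downstream monitoring system can consume."""
        return {name: float(fn()) for name, fn in self.gauges.items()}

sessions = ["desk-a", "desk-b", "website"]
m = Metrics()
m.register("sessions.active", lambda: len(sessions))   # client/user metric
m.register("topics.count", lambda: 12)                 # data metric
print(m.snapshot())   # {'sessions.active': 3.0, 'topics.count': 12.0}
```

Because the registry only deals in names and numbers, the same snapshot can feed whichever monitoring stack a firm has standardized on, which is the point of exposing monitoring as an API rather than a fixed integration.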
So that’s the here and now: some of the challenges financial institutions are trying to solve. Looking briefly to the future, and especially in the European space, the Consolidated Tape (CT) is a huge challenge. It may change regulations, which creates new issues, but on the whole I think CT is a move in the right direction.
The other main challenge is the move to the cloud, which is well underway. Everyone is looking to define their strategies: how do organizations deploy a cloud-agnostic approach? Nobody wants to be coupled to a particular provider. Many of the issues institutions may hit in their cloud deployments are, I believe, touched upon in the trading-stack challenges I have highlighted. The move to the cloud won't be easy, but taking the best-practice approach I have outlined will certainly help.