The era of software-based market data systems running on generic server and desktop hardware is now well through its second decade. The lifeblood of any large trading floor flows through the veins of its digital data distribution system. When the flow stops, all hell breaks loose. Our firm, TS-Associates, is in the business of preventing that from happening.
The market data systems world seemed to have fallen into a deep slumber until just a few years ago, when it started waking up to an increasing frenzy of direct exchange connectivity and regulatory changes, all resulting in accelerating data rates. The first jolt came from U.S. decimalization, phased in during 2000/01. Now we face Reg NMS, MiFID and penny options trading, all of which promise to keep ramping up data rates.
OPRA is gearing up to handle 700,000 updates per second by the end of the year. Rather that on your network than mine! Now everybody – institutions and vendors alike – is jumping on the ‘low latency’ bandwagon. Direct exchange connectivity, once the preserve of the top-tier banks, is rapidly heading towards commoditization through the efforts of vendors including Wombat, Activ Financial, and InfoDyne. Even Reuters, with RDF Direct, and now Thomson Financial, with its deal with Wombat, have joined the party. But for those of you living on the bleeding edge of the market data industry, a far more disruptive tsunami is heading your way.
This wave is called hardware acceleration, and it is already breaking on some shores, thanks to the efforts of firms such as Tervela, Exegy, Solace Systems, and even our very own TS-Associates. Hardware acceleration is all about taking tasks traditionally implemented in software, and re-implementing them in specialized hardware.
The advantages of this approach include improved throughput, reduced latency, reduced power consumption and heat generation, and in some cases a smaller physical form factor. Hardware acceleration, in a world of software solutions, is nothing new. There are many precedents. Back in 1981, when IBM launched its first PC, powered by the Intel 8088 microprocessor, it included a socket for the Intel 8087 floating-point co-processor, which accelerated floating-point operations by a factor of 50.
In its absence, software emulation took over. The co-processor was optional because its high cost was only justified for computationally intensive tasks. We now live in a world filled with hardware-accelerated devices, whether for 3D graphics, gaming, compression or signal processing. Look inside a Cisco router and you’ll find an architecture where 90% of the work is coordinated in hardware, with a supervisory software element dealing with exceptional situations and non-time-critical tasks.
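The division of labour described here, a fast path for the common case with supervisory software for the exceptions, can be illustrated with a toy model. This is only a sketch: the Dispatcher class, its method names and the port-numbering scheme are hypothetical, and the "hardware" is simply a dictionary lookup standing in for a fixed-function lookup table.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    """Toy model of a hardware fast path backed by a software slow path."""
    routes: dict = field(default_factory=dict)  # destination -> output port
    fast_count: int = 0
    slow_count: int = 0

    def fast_path(self, dest):
        # Stands in for the hardware: a simple table lookup, nothing else.
        return self.routes.get(dest)

    def slow_path(self, dest):
        # Stands in for the supervisory software: it resolves the
        # exceptional case and installs the answer in the table, so later
        # traffic for this destination stays on the fast path.
        port = len(self.routes)  # hypothetical port assignment
        self.routes[dest] = port
        return port

    def forward(self, dest):
        port = self.fast_path(dest)
        if port is not None:
            self.fast_count += 1
            return port
        self.slow_count += 1
        return self.slow_path(dest)


def fast_path_share(dests):
    """Fraction of the given traffic that never touched the slow path."""
    d = Dispatcher()
    for dest in dests:
        d.forward(dest)
    return d.fast_count / len(dests)
```

In this model, ten packets alternating between two destinations take the slow path only on the first sighting of each destination, so fast_path_share(["A", "B"] * 5) comes out at 0.8. The longer a flow lasts, the closer that share gets to one.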
Expect to see this approach to the design of high-performance, low-latency market data systems become increasingly prevalent. Hardware-accelerated market data systems are heading towards us from two directions – the network layer up, and the application layer down. In the network camp we have Cisco AONS and Solace Systems, whose approach is to build the intelligence into the network layer. (Incidentally, the Cisco approach is not really one of hardware acceleration. Cisco is starting from a base of fast hardware, and integrating a software capability into routers.)
In the application camp, contenders are reusing standard network technology, with a focus on the acceleration of applications sitting within the network fabric. It is this approach that is gaining early traction in the market data space, with solutions being offered by such firms as Tervela and Exegy. And the achievable performance speed-ups and latency reductions are truly staggering. The secret ingredient is the FPGA – Field Programmable Gate Array. Think of an FPGA as allowing developers to compile a highly parallelized version of their application directly into hardware, and to execute it with close to zero latency.
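Why a highly parallelized design can do so much at a modest clock rate is easiest to see with a back-of-envelope model. The sketch below is not a measurement of any product: the function names are made up for illustration, and the figures in the usage note (cycles per message, number of pipeline stages) are assumptions. It contrasts a processor that finishes one message before starting the next with a fully pipelined design that, once full, emits one result per clock tick per lane.

```python
def sequential_rate(clock_hz, cycles_per_msg, cores=1):
    """Messages per second when each core processes one message at a time."""
    return cores * clock_hz / cycles_per_msg


def pipelined_rate(clock_hz, lanes=1):
    """Messages per second for a fully pipelined design.

    Once the pipeline is full, each lane completes one message per clock
    tick, regardless of how many stages the pipeline has.
    """
    return lanes * clock_hz


def pipeline_latency_s(clock_hz, stages):
    """Seconds for a single message to traverse every pipeline stage."""
    return stages / clock_hz
```

Under these assumptions, a 2 GHz dual-core processor spending a hypothetical 200 cycles on each message gives sequential_rate(2_000_000_000, 200, cores=2), or 20 million messages per second, while a single-lane pipeline clocked at 100 MHz gives pipelined_rate(100_000_000), or 100 million. A hypothetical 20-stage pipeline at that clock rate would add about 200 nanoseconds of latency per message. Real designs rarely sustain the ideal of one message per tick, so treat these as upper bounds.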
One systems vendor we are working with has found that for market data processing, software executing on a 2 GHz dual-core CPU can be outgunned by a functionally equivalent FPGA implementation clocked at 100 MHz.
Much has been written about the ongoing performance war between AMD and Intel. The current situation in broad terms is that if your application requires floating-point computational grunt, then go with Intel Xeon processors. If, however, you require superior memory bandwidth, then go with AMD Opterons. In the multi-core battlefield, neither Intel nor AMD is convincingly ahead of the other, although AMD was first past the dual-core post. AMD, though, has an FPGA initiative called Torrenza, aimed at encouraging solutions that tightly couple hardware-accelerated coprocessors with CPUs via AMD’s HyperTransport.
Today, you can buy an FPGA module from a firm called DRC Coprocessor Systems that will slot straight into an AMD Opteron socket on a multi-processor motherboard. Expect to see a lot more of Torrenza in particular, and FPGAs in general.