NYSE Technologies recently rolled out version 6 of its Data Fabric messaging platform, including significant new functionality for scale out of networks while maintaining very low latency. IntelligentTradingTechnology.com checked in with Brian Doherty, NYSE Technologies’ global product manager for Data Fabric, to find out the how and the what.
Q: For Data Fabric 6, you are highlighting the introduction of MultiVerb. What is it?
A: Data Fabric was one of the first commercial messaging products to utilise RDMA (Remote Direct Memory Access) when it was launched in 2008. While RDMA provides excellent latency, determinism and throughput, it is a point-to-point solution. As with any point-to-point solution, it has limitations around scalability and fairness – as more consumers are added, the source must send the message to each consumer in turn and, by definition, the first consumer will get the data earlier than the last.
MultiVerb is designed to solve these limitations by using the same InfiniBand verbs interface (getting the same latency and determinism) for publishing, but also using the inherent multicast capabilities in InfiniBand and 10GbE networks to replicate the messages to interested consumers.
So the source sends a single message and all interested parties get it at the same time, solving any concerns around scalability and fairness. As can be seen from the testing with Intel, a million messages per second can be delivered to 1,000 consumers with average latency in the five microsecond range.
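The fairness argument can be illustrated with a toy timing model. This is only a sketch: the function names and the figures used (1 microsecond per send, 4 microseconds on the wire) are hypothetical values chosen for illustration, not measurements from Data Fabric or the Intel testing.

```python
def unicast_delivery_times(consumers, per_send_us, wire_us):
    """Point-to-point: the source sends one copy per consumer, in order,
    so consumer i waits for all earlier sends to complete first."""
    return [(i + 1) * per_send_us + wire_us for i in range(consumers)]


def multicast_delivery_times(consumers, per_send_us, wire_us):
    """Multicast: the source sends once and the network replicates,
    so every consumer sees the same delivery time in this model."""
    return [per_send_us + wire_us] * consumers


def skew(times):
    """Gap between the first and the last consumer to receive the message."""
    return max(times) - min(times)


if __name__ == "__main__":
    # Hypothetical costs in microseconds, for illustration only.
    n, per_send_us, wire_us = 1000, 1, 4
    print("unicast skew (us):", skew(unicast_delivery_times(n, per_send_us, wire_us)))
    print("multicast skew (us):", skew(multicast_delivery_times(n, per_send_us, wire_us)))
```

In this simplified model the unicast skew grows linearly with the number of consumers, while the multicast skew stays at zero. Real networks add switch and NIC effects that the model ignores.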
Q: And backtracking, can you explain what RDMA is, and how it reduces latency and jitter?
A: RDMA allows applications to directly read from and write to memory regions in user space on remote hosts without any involvement of the remote host’s operating system or CPU.
This capability offers several architectural and performance advantages for messaging implementations including reduced CPU overhead, zero-copy, kernel bypass and dramatically reduced latency over IP messaging.
RDMA on both Ethernet and InfiniBand is zero-copy, which means that sending or receiving data avoids copying the data to or from kernel space, along with the context switch required by other mechanisms.
For Data Fabric, an RDMA publish/receive operation entails little more than writing/reading the message in user space memory while the underlying network cards and infrastructure move the message almost instantaneously.
By contrast, UDP and TCP publishers copy data from user space to kernel space, which requires system calls, expensive memory copy operations, and context switches. Once in kernel space, the data must be copied yet another time to the NIC in traditional networking architectures. Avoiding expensive system calls and the increased context switching associated with them enables RDMA applications to use system resources more efficiently.
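For reference, the conventional kernel-socket path described above looks roughly like the following. It is a minimal sketch using Python's standard socket module over the loopback interface; the function name `udp_roundtrip` is hypothetical, and the code shows the traditional UDP path only, not anything specific to Data Fabric.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and read it back.

    Each sendto/recvfrom below is a system call: the kernel copies the
    payload between the user-space buffer and its own socket buffers.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # fail rather than hang if nothing arrives
        sender.sendto(payload, receiver.getsockname())  # user space -> kernel
        data, _addr = receiver.recvfrom(65535)          # kernel -> user space
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_roundtrip(b"tick"))
```

Every message published this way crosses the user/kernel boundary at least once on each side, which is the overhead that kernel-bypass transports such as RDMA are designed to avoid.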
Q: You’ve tested MultiVerb with fan-out from one source to as many as 1,000 clients. So what typical applications would require this high fan-out, and very low latency?
A: MultiVerb is just one of the transports that Data Fabric supports. Additionally, it offers RDMA, LDMA and TCP. With the introduction of MultiVerb, Data Fabric can support almost any deployment strategy in capital markets. For example:
– LDMA: On a single server with multiple processes, LDMA (Local Direct Memory Access) is used to accelerate inter-process communication. Co-located servers are an example of this deployment.
– RDMA: Is ideal for moving large amounts of data with very low latency between a few servers, as is often required for complete trading systems.
– MultiVerb: Is for distribution to a large number of servers, such as an Enterprise Ticker Plant.
– TCP: Is suitable for legacy network distribution, WANs, and delivery to desktops.
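The shared-memory idea behind the LDMA option in the list above can be sketched with Python's standard `multiprocessing.shared_memory` module. This is an analogy only, not Data Fabric's API: the function name `publish_and_read` is hypothetical, and both handles live in one process here for brevity, whereas in practice the reading handle would be opened by a second process on the same server.

```python
from multiprocessing import shared_memory


def publish_and_read(message: bytes) -> bytes:
    """Write a message into a shared memory segment and read it back
    through a second handle attached by name. The message must be non-empty."""
    writer = shared_memory.SharedMemory(create=True, size=len(message))
    try:
        # A consumer process would attach using only the segment name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            # The "publish" is a plain memory write, with no socket involved.
            writer.buf[:len(message)] = message
            # The "receive" is a plain memory read from the same segment.
            return bytes(reader.buf[:len(message)])
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # release the segment once both sides are done


if __name__ == "__main__":
    print(publish_and_read(b"quote:IBM:182.50"))
```

A real transport would also need a way to signal that new data is available and to handle multiple messages in flight, which this sketch leaves out.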
Q: Apart from MultiVerb, what other new functionality and enhancements are in Data Fabric 6?
A: Data Fabric 6 includes enhancements to both the LDMA and RDMA transports that improve overall performance and efficiency.
Q: More broadly, are you seeing takeup of LDMA transports – shared memory inside multi-core/multi-socket servers? How prevalent is this?
A: Data Fabric is deployed at a number of client sites using LDMA, so we certainly see this as a very important segment. We have also partnered closely with Intel to test and evaluate our solutions on Intel’s multi-core machines so that we can leverage the ever-increasing core counts effectively.
Q: And also more broadly, what trends are you seeing re. the need for different latency requirements – low latency vs ultra low latency. Where are you seeing most demand? And what do you expect in the future?
A: Given the breadth of our coverage and the number of clients we have (in the hundreds), we see many different deployment models. Other than an overall desire to get latency as low as possible, we do not see any one model becoming the de facto standard.