Earlier this year, Cisco Systems announced new switches focused firmly on the low-latency trading space, and along with them, it announced its Trading Fabric architecture, setting out how these switches, along with other components, can be deployed to solve real-life application challenges.
IntelligentTradingTechnology.com checked in with Cisco’s director of solutions architecture Dave Malik to get some more detail on the products, the architecture and the vision.
Q: What is special about your Nexus 3064 and 5500 switches that makes them especially suitable for electronic trading?
A: These switches provide ultra-low latency at scale and protect against microbursts during market volatility. When talking about latency, it is critical to understand scaling requirements and the ability to manage short-lived congestion in the infrastructure. For example, with a 24-port switch, the latency profile changes drastically once the fabric needs to scale beyond that capacity. If packets are dropped, there can be adverse effects, such as packet retransmissions or simply the loss of valuable data needed by trading applications.
For example, the Nexus 5596 with its 96 10gE ports provides 64MB of packet buffer memory and allows extending the fabric to 768 hosts with the exact same latency profile of less than three microseconds from any host to any host by leveraging the Nexus 2000 Fabric Extenders. The Nexus 5500 platform has a crossbar and port ASIC implementation with dedicated buffering per port.
The Nexus 3000 switch provides less than one microsecond of latency along with a rich feature set. The switch architecture is based on a single-chip implementation that provides deterministic performance on a 64-port configuration with 9MB of shared buffer.
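To put the buffer figures above in context, the following is a rough back-of-the-envelope sketch, not a Cisco formula. It assumes a simplified model in which several ports send at line rate to a single egress port of the same speed and the entire shared buffer is available to that port; real switches allocate buffer differently, so the numbers are only indicative. The function name and parameters are hypothetical.

```python
def burst_absorb_time_us(buffer_bytes, senders, link_gbps=10.0):
    """Microseconds a buffer can absorb an N-to-1 burst before it overflows.

    Simplified model (an assumption, not a vendor formula): `senders` ports
    each transmit at line rate to one egress port of the same speed, and the
    whole buffer is available to that egress port.
    """
    if senders < 2:
        raise ValueError("congestion needs at least two senders per egress port")
    # The buffer fills at the arrival rate minus the egress drain rate.
    excess_bps = (senders - 1) * link_gbps * 1e9
    return buffer_bytes * 8 / excess_bps * 1e6


if __name__ == "__main__":
    # 9MB shared buffer as quoted for the Nexus 3000, taken here as 9 * 10**6 bytes.
    for n in (2, 4, 8):
        print(f"{n}:1 fan-in: {burst_absorb_time_us(9 * 10**6, n):.0f} us")
```

Under these assumptions, the time before drops shrinks in proportion to the number of extra senders, which is why buffer depth matters most during heavy fan-in.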
Q: Apart from low latency, what are some of the other performance characteristics of these switches that you have optimised for the trading fabric?
A: The Nexus 3000 and Nexus 5000 have a robust Layer 3 BGP (Border Gateway Protocol) feature set, which is typically required for exchange and client peering points. Both platforms provide extremely low jitter, in single-digit nanoseconds, which makes them the ideal choice for trading infrastructures. In addition, both offer security features such as PVLAN, route control and filtering, and Virtual Routing and Forwarding (VRF) for segmentation of traffic. Upcoming IEEE 1588 Precision Time Protocol support in hardware can be leveraged for sub-microsecond time synchronization for switches and hosts within the trading fabric.
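As background on the time synchronization mentioned above, the sketch below shows the basic offset calculation used by the IEEE 1588 delay request-response mechanism. It illustrates the arithmetic only, not how any particular switch implements it in hardware. The timestamps are made-up values and the function name is hypothetical.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset, mean_path_delay) of a slave clock relative to its master.

    t1: master sends Sync           (master clock)
    t2: slave receives Sync         (slave clock)
    t3: slave sends Delay_Req       (slave clock)
    t4: master receives Delay_Req   (master clock)
    Assumes the path delay is the same in both directions.
    """
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    offset = ((t2 - t1) - (t4 - t3)) / 2
    return offset, mean_path_delay


if __name__ == "__main__":
    # Made-up timestamps in nanoseconds: the slave clock runs 500 ns ahead
    # of the master and the one-way path delay is 2000 ns.
    offset, delay = ptp_offset_and_delay(
        t1=1_000_000, t2=1_002_500, t3=1_010_500, t4=1_012_000
    )
    print(f"offset = {offset} ns, mean path delay = {delay} ns")
```

The calculation depends on a symmetric path delay, which is one reason hardware timestamping close to the wire helps reach sub-microsecond accuracy.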
[Low-Latency.com: There’s more insight into the new 3000 switch in this video.]
Q: You put a lot of effort into testing the switches for performance. But we know, for instance, that market data rates are still increasing. So did you test to future-proof them?
A: Cisco invests a significant amount of time in testing environments at scale. The testing is conducted for the particular Nexus switches themselves and at a topology level under increasing load with various traffic conditions. It is very important to understand each platform’s limits before designing a trading fabric. The information is shared with clients so they can understand the various scaling parameters of the entire system. As an example, the scalability testing performed on the Nexus 3000 can be viewed here.
Cisco leverages industry benchmarks as well as application-specific testing with ecosystem partners to ensure high performance with current and projected message rate growth. Several benchmarking examples can be found here.
Q: So, how do you take a switch – a product – and use it to build a trading fabric, an architecture?
A: Components behave differently when used as standalone devices than when used in an interconnected fashion. The value of the Cisco architecture is that we formulate designs based on client requirements, ensuring the infrastructure will perform to a client’s needs. Without architecture, the burden is on the client to integrate and validate a complete solution. The trading fabric consists of more than just a single switching component. Compute, messaging, applications, monitoring tools and so on also have to be considered, since they rely on the fabric for applications that require various flow characteristics at precise intervals.
The first step in defining an architecture is to establish clear requirements. The next step is to develop a framework that will eventually become the architecture. Once the architecture is defined, it is appropriate to select the best components to support it and decide how they are interconnected to meet the goals of the framework. The crucial step is to validate the design and ensure it can sustain deterministic performance under current and projected peak traffic patterns as the architecture scales to accommodate growth. Monitoring and instrumentation capabilities are woven into the fabric for real-time analysis.
Q: As well as switches, another element of your trading fabric is servers – your Unified Computing System. How are you engineering those for applications such as data feed handling, messaging, etc.?
A: The Unified Computing System (UCS) unites computing, network, storage access, and virtualisation in a single system which can be managed centrally with a single pane of glass. It is based on standard x86 Intel processors and also contains a high-performance 10gE Virtual Interface Card for I/O-intensive applications. The adapter supports up to 128 PCIe standards-compliant virtual interfaces, so there are use cases where a firm can consolidate several applications on a single host and bind specific processes to particular CPU cores, which can then be associated with a specific virtual NIC defined on the physical Ethernet adapter. This streamlines application performance by minimising context switching and interrupt steering, which can cause additional latency in the application.
The recently announced VIC 1280 adapter will be able to support 256 virtual interfaces for further scalability. For certain memory-intensive workloads, such as in-memory data grids, the extended memory capability of UCS can be leveraged to increase performance.
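The idea of binding a process to a particular CPU core, as described above, can be illustrated with a minimal sketch. It assumes a Linux host, since `os.sched_setaffinity` is only available there, and it is not specific to UCS. Associating the core with a virtual NIC and steering its interrupts is done through adapter and operating system configuration and is not shown. The helper name `pin_to_core` is hypothetical.

```python
import os


def pin_to_core(core, pid=0):
    """Restrict a process (pid 0 means the caller) to a single CPU core.

    Relies on os.sched_setaffinity, which is Linux-only.
    Returns the affinity set in effect after the call.
    """
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Pick a core this process is already allowed to run on.
    core = min(os.sched_getaffinity(0))
    print("now pinned to:", pin_to_core(core))
```

Keeping a latency-sensitive process on one core stops the scheduler from migrating it between cores, which is one source of the latency the answer refers to.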
Q: It’s not a green field world out there – there’s a lot of existing networking and servers out there. For firms that want to take advantage of your fabric approach, what practical measures are you taking to help them transition?
A: Cisco’s Advanced Services consulting organisation works closely with our clients to understand their business and technical goals and help them evolve from where they are today. By leveraging vast experience working with financial services firms across the world, Cisco has developed robust best practice architecture designs to ensure clients successfully transition to new trading environments. A trading fabric can be located in a single location or span multiple locations, which is the case for several of our clients. Cisco also has large testing facilities where clients can validate future designs to ensure a seamless transition of their applications. Scale-out testing with different traffic patterns under increasing load provides greater visibility into what to expect during a production deployment. The network, compute and storage infrastructures need to work together as a system, and there are methodologies formulated for firms to leverage. Direct involvement with ecosystem partners throughout the process also ensures that the system is tuned on an ongoing basis.
Q: Dare I mention InfiniBand? There’s quite a lot of it out there, especially for so-called ultra low-latency applications. And it certainly has a passionate constituency. What’s your position – steer clear of the fight or take it to them? And if the latter, what’s your proposition?
A: In order to take advantage of InfiniBand performance, clients have to rewrite applications, which is unacceptable in most cases. With 40gE switching products becoming available in the market this year, Ethernet is delivering the highest levels of performance that firms require for demanding workloads.
There are other ultra-low-latency technologies in the RDMA and sockets-based arena which leverage Ethernet as a transport. RDMA over Converged Ethernet (RoCE) has provided a path for clients who are using InfiniBand and want to migrate to Ethernet. The industry trend reflects Ethernet as the transport of choice today. Further innovation in silicon and software will reduce the small latency gap between InfiniBand and 10gE which exists today from an application perspective.
Q: Finally, can you put any flesh on the bones of plans to push adoption of the trading fabric, and of the products that underlie it? What can we expect in the next few months? Don’t worry – everyone in the Low-Latency.com community has signed your NDA 🙂
A: The trading fabric is currently being implemented by several firms that are pursuing strategies across multiple asset classes and multiple venues. They are on a journey to capture new market opportunities. The Nexus 3000 and Nexus 5500 have been adopted in various environments depending on the firm’s strategy. Cisco is also working with ecosystem partners across the entire stack to ensure success. Cisco cannot comment on specific product roadmaps, but you can expect continued innovation in this fast-growing space.