It’s a bit of a paradox that Java is so widely used across all applications and industries, and yet is often cited as not suitable for the automated trading space. IntelligentTradingTechnology.com caught up with Gil Tene, CTO and Co-founder of Azul Systems, to find out more about Java performance, and how to improve it.
Q: To begin with, can you describe how widely used Java is in the electronic trading world? What are the reasons for adoption?
A: Java-based applications are used throughout the electronic trading world – from trading platforms to settlement systems, risk and various reporting systems. As in other enterprise environments, Java is dominant, and is associated with systemic benefits including productivity, improved time to market, and the ability to leverage a huge ecosystem of enterprise software components. Java’s recent usage trends in low-latency trading may be surprising to many, as it has well-known shortcomings in that space. It would seem that the overwhelming benefits of using the Java ecosystem are large enough for many to overlook or accept Java’s historic limitations in the latency-sensitive application space.
Q: Can you outline some of the oft-cited performance issues with Java and its use in low-latency environments?
A: It’s all about consistency. While Java was considered to be “slow” in the past, current JVMs provide speed that is highly competitive with other software-based implementations. The remaining problem with Java is that it alternates between being very fast most of the time and being excruciatingly slow (or “stuck”) some of the time. Java’s reputation for jitter and unpredictable latency is the result of the inherent nature of “Stop the World” garbage collection built into most current JVMs. The pauses caused by this sort of JVM-internal processing can often create unacceptable latencies at unpredictable times, causing variances in low-latency operation speeds that are measured in tens of thousands of percent (!). These outrageous discontinuities represent Java’s biggest issue, and its biggest opportunity for improvement.
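The jitter described here is easy to observe directly with a “hiccup meter”, the measurement idea popularised by tools such as jHiccup: one thread tries to wake up every millisecond while another churns out garbage, so any stop-the-world pause shows up as an oversized gap between wake-ups. The sketch below is illustrative only (the class name, allocation size and run length are assumptions, not Azul code):

```java
// Minimal "hiccup meter" sketch: a timing loop sleeps in 1 ms steps and
// records the worst oversleep it sees, while a background thread creates
// short-lived garbage to pressure the collector. A stop-the-world pause
// appears as a large gap between intended and actual wake-up times.
public class HiccupSketch {
    static volatile Object sink; // keeps allocations from being optimised away

    // Run for roughly runMillis and return the largest observed stall in ms
    // (any sleep that overshoots its 1 ms target counts as a stall).
    static long worstHiccupMillis(long runMillis) {
        long worst = 0;
        long end = System.nanoTime() + runMillis * 1_000_000L;
        while (System.nanoTime() < end) {
            long before = System.nanoTime();
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                break;
            }
            long sleptMs = (System.nanoTime() - before) / 1_000_000L;
            worst = Math.max(worst, sleptMs - 1); // beyond 1 ms is a stall
        }
        return worst;
    }

    public static void main(String[] args) {
        Thread churn = new Thread(() -> {
            while (true) {
                sink = new byte[64 * 1024]; // short-lived garbage
            }
        });
        churn.setDaemon(true);
        churn.start();
        System.out.println("worst observed stall: "
                + worstHiccupMillis(2000) + " ms");
    }
}
```

Run under a stop-the-world collector with a small heap, the worst stall tends to track the longest GC pause rather than the application’s own work.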
Q: How does Azul’s Zing offering overcome those issues?
A: Zing tackles Java’s pausing problem head-on. Zing’s C4 garbage collector does its job concurrently with the executing application, avoiding the many-millisecond stop-the-world events found in the other Java runtimes available for servers today. With Zing, low-latency Java applications that had previously experienced pauses on other runtimes maintain their common-case speeds while at the same time seeing their actual measured worst-case latencies drop to no more than a few milliseconds, with many customers able to reach sub-millisecond worst-case levels with relatively little additional tuning. Quite frankly, with this highly differentiated behavior in low-latency Java application execution, firms using Zing for their Java-based trading platforms will probably make more money with their systems than those that don’t.
Q: How does Zing compare to Oracle’s JVM?
A: The difference between the two lies primarily in their observed worst-case latency behavior, and in the effort developers spend to maintain low latency behavior. These differences are rooted in core architecture choices and in market focus.
While both JVMs will provide fast, optimised common-case execution, the Oracle JVM uses garbage collection mechanisms that were never designed for consistent or continuous execution when milliseconds matter. Even though the length of stop-the-world pauses within Oracle’s JVM has improved over the years, those pauses are still orders of magnitude too large for today’s aggressive trading platforms. While Oracle has invested some work in recent collectors (CMS, G1) towards reducing the multi-second pause effects common in enterprise applications where human response times are the main concern, the techniques it uses for the frequent “new generation” (or “young generation”) collection result in inherent stop-the-world effects that are nearly impossible to contain to only a few milliseconds.
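One way to see how often the young-generation collections mentioned above actually fire is to read the JVM’s standard management beans. The sketch below is a minimal illustration, not Azul or Oracle tooling; bean names vary by collector (e.g. “G1 Young Generation”), and the allocation loop is an assumed stand-in for real application work.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: observing cumulative collection counts and time through the
// standard java.lang.management beans, before and after an allocation-
// heavy loop that drives young-generation collections.
public class GcObserver {
    static volatile Object sink; // defeat allocation elimination

    // Total milliseconds the JVM reports having spent in GC so far
    // (getCollectionTime may return -1 when undefined, hence the clamp).
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcTimeMillis();
        for (int i = 0; i < 1_000_000; i++) {
            sink = new byte[1024]; // churn to force young-gen collections
        }
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
        System.out.println("GC time during loop: "
                + (totalGcTimeMillis() - before) + " ms");
    }
}
```

Note that these beans report time spent in collection, not pause time per se, but for stop-the-world collectors the two are closely related.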
In contrast, the Zing JVM was architected and optimised from the ground up to eliminate the performance and latency outliers that are so commonly seen with other server JVMs. Even more importantly, Zing’s design not only contains worst-case behaviors, but completely de-couples outliers from key application scale metrics, such as data set sizes and transaction rates. Zing is able to maintain the exact same worst-case behavior across an extremely wide range of heap sizes, and maintains continued low-latency behavior even when application throughput exhibits consistently high allocation rates. As a result, Zing eliminates many of the tradeoffs low-latency Java developers have had to deal with when trying to get other JVMs to perform acceptably.
Q: Using Zing, what kind of performance is possible, and what kinds of low-latency applications is it suitable for?
A: Zing keeps peak latencies low enough that noise introduced by the operating system and other system factors usually dominates latency outliers. On well-tuned systems, customers regularly experience measured worst-case latencies below the millisecond level, while typical (median) latencies often range in the tens of microseconds (depending on, and dominated by, application logic). Zing can maintain these metrics for heap sizes ranging from as low as 1GB to several hundred gigabytes, and does so even under application allocation rates of multiple gigabytes per second.
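Median-versus-worst-case figures like these are typically gathered by recording a latency per operation and then reading off percentiles from the sorted samples (production measurement would usually use a purpose-built histogram such as HdrHistogram). The sketch below is illustrative only; the synthetic workload and percentile helper are assumptions:

```java
import java.util.Arrays;

// Sketch of a latency summary: time each operation, then report the
// median, a high percentile, and the maximum. It is the gap between
// median and max that stop-the-world pauses inflate.
public class LatencySummary {
    // Value at the given percentile (0 < pct <= 100) of the samples.
    static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(pct / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    public static void main(String[] args) {
        int n = 100_000;
        long[] latenciesNanos = new long[n];
        long acc = 0;
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            acc += i * 31L; // stand-in for real per-operation work
            latenciesNanos[i] = System.nanoTime() - start;
        }
        System.out.println("median: " + percentile(latenciesNanos, 50.0) + " ns");
        System.out.println("99.9%:  " + percentile(latenciesNanos, 99.9) + " ns");
        System.out.println("max:    " + percentile(latenciesNanos, 100.0) + " ns");
        System.out.println("checksum: " + acc); // keep the loop live
    }
}
```

On a JVM with stop-the-world collection, runs long enough to span a GC cycle typically show a max several orders of magnitude above the median, which is exactly the discontinuity discussed earlier in the interview.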
Q: Can you outline what some of your customers are actually using Zing for in the low-latency space?
A: In a general sense we’ve seen it used predominantly in the front and middle office, where latency and responsiveness matter most. We see Zing used in trading platforms, with the biggest uses usually found in the equities and FX markets, but we see uses in derivatives, fixed income and a variety of other instruments as well. We see Zing used in exchanges and gateways, and we seem to be gaining popularity for powering FIX engines and gateways. In systems that surround actual execution, Zing is being used in risk analysis systems that provide “real time” or “wire risk” behaviors, and appears to be a good fit for order management systems. Our customers find uses for Zing wherever latency or response time consistency matters, and whenever applications need to work with large data sets without sacrificing responsiveness or interactivity.
Q: Where are you heading with future versions of Zing?
A: With garbage collection already a solved problem in Zing, Azul is making continuous improvements to other jitter and low latency specific issues – issues that others have not even started measuring. We’ve also been focusing on adding behaviors that are specifically relevant to Java applications in capital markets. Clearly the success we’ve had in financial services is one of the key drivers of our future development directions, and we’ve been adding to our capabilities to support the industry. Zing is quickly becoming the de facto choice for latency sensitive Java applications in the financial services sector, and we have been getting a steady stream of feature improvement requests that we intend to feed into our development process and our roadmap.