2011 is now well behind us and we are already in the year of the Dragon, so it is high time to look back, assess the past and look into the capacity planning crystal ball. It is impossible to know the future, of course, but we can learn from a careful examination of historical facts.
Fortunately, there is a vast array of data from the Financial Information Forum (FIF) provided mostly by the exchanges as well as detail from www.marketdatapeaks.com that is worth perusing.
From a practitioner's point of view it would be incorrect to add up all the peaks reported individually by the exchanges to compute a grand total for market data rates. This is because the peaks are normally not coincident. If you add up all the individual highs, you overstate the actual high that a system would have to deal with at any one moment. For example, some exchanges send out huge bursts of data at the beginning of the day, and some at the end.
Since the aggregate simultaneous highs are such an important data point for all recipients of market data, Exegy, the FIF and Essex Radez created the web site www.marketdatapeaks.com. It is free and available every day. Behind the scenes, the Exegy Ticker Plant processes all the data and counts the number of messages in any given second.
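To make the difference between adding up individual highs and measuring the simultaneous high concrete, here is a minimal Python sketch. The two feeds and their per-second counts are made up for illustration, and the function names are my own; this is not how the Exegy Ticker Plant is implemented.

```python
from collections import Counter

def per_second_counts(timestamps):
    """Bucket message timestamps (seconds, as floats) into whole seconds."""
    return Counter(int(t) for t in timestamps)

def sum_of_peaks(feeds):
    """Add up each feed's own busiest second, regardless of when it occurred."""
    return sum(max(counts.values()) for counts in feeds.values())

def coincident_peak(feeds):
    """Busiest single second once all feeds are combined."""
    combined = Counter()
    for counts in feeds.values():
        combined.update(counts)  # adds the counts second by second
    return max(combined.values())

# Hypothetical per-second message counts for two made-up feeds:
# one bursts at the start of the window, the other at the end.
feeds = {
    "open_heavy_feed": Counter({0: 900, 1: 200, 2: 100}),
    "close_heavy_feed": Counter({0: 100, 1: 250, 2: 800}),
}

naive_total = sum_of_peaks(feeds)      # adds 900 and 800
actual_peak = coincident_peak(feeds)   # busiest combined second
print(naive_total, actual_peak)
```

In this toy case the naive total is well above the busiest combined second, which is the reason a coincident measure is the more useful planning number.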
Just to put 2011 data rates into context, peaks were about 1.8 million messages per second (mps) in November 2009. In November 2010, they hit a new all-time high of 3.78 million mps. Then in October 2011 message rates hit the current high water mark of 6.65 million mps. Judging by these highs, data rates are doubling every twelve months or so.
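As a quick check on the doubling claim, the following Python sketch computes the growth factor between the three aggregate peaks quoted above and what a straight doubling of the latest peak would imply. The variable and function names are illustrative, and the second interval is eleven months rather than twelve.

```python
# Aggregate peaks quoted above, in messages per second.
peaks_mps = [
    ("November 2009", 1_800_000),
    ("November 2010", 3_780_000),
    ("October 2011", 6_650_000),
]

def growth_factors(peaks):
    """Ratio of each peak to the one before it."""
    return [later / earlier for (_, earlier), (_, later) in zip(peaks, peaks[1:])]

factors = growth_factors(peaks_mps)

# What a simple "double it" assumption implies for the next peak.
doubling_projection = 2 * peaks_mps[-1][1]

for (label, _), factor in zip(peaks_mps[1:], factors):
    print(f"{label}: {factor:.2f}x the previous peak")
print(f"doubling assumption: {doubling_projection:,} mps")
```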
Not surprisingly, practitioners looking ahead to next year prefer the safety in the simple assumption that data rates will double again in 2012; but planners should really examine the feeds one by one to make the most realistic assessment. Let’s look under the covers:
Exchanges in the U.S.
You might recall that in much of 2011, the exchanges saw less volume of actual share trading and some brokers were bemoaning the lack of business. This was punctuated by an enormous flurry of activity from August to October as the world reacquainted itself with doubts over European sovereign debt and the interconnectedness of the world banking system. Data rates reflected the extraordinary price volatility with records broken on MarketDataPeaks in quick succession on 5th and 25th August, 12th September and 7th October.
The immediate cause of the October 7th high point was the market lurching downward on the news that Fitch had downgraded Italy from AA- to A+ and Spain from AA+ to AA-. The combined order book feeds sent out 1,883,824 messages per second while the derivatives markets sent out 4,323,956 messages per second at exactly the same time. The top-of-book quotation systems, which show the best prices available on the competing US stock markets, changed at a rate of 423,957 mps, while trades were reported at 14,414 per second.
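The four components quoted for that second can be added up to confirm that they account for the 6.65 million mps record, and to see how much of it came from each source. This is a small Python sketch using only the figures above; the dictionary keys are my own labels.

```python
# Messages per second in the record-setting second on October 7th, 2011.
components_mps = {
    "order book feeds": 1_883_824,
    "derivatives": 4_323_956,
    "top-of-book quotes": 423_957,
    "trades": 14_414,
}

total_mps = sum(components_mps.values())

# Share of the aggregate contributed by each component.
shares = {name: rate / total_mps for name, rate in components_mps.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"total: {total_mps:,} mps")
```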
Using FIF statistics provided by the exchanges, we can examine the year in even more detail.
Order Book Feeds, UTDF and UQDF
Over the last several years, order book data has grown extremely quickly. In the US, Nasdaq has led the way with TotalView.
The most recent FIF data shows that the market peaked in July at 634,967 mps. This compares to the January 2010 high of 226,109 mps. As you can see from the chart, data rates dropped off at the end of last year, but we can assume that there is potential for the market to hit more than 1.2 million messages per second next year if the usual annual doubling continues. Indeed, Nasdaq announced just a couple of weeks ago that it was carrying out serious upgrades to its main data centre in Carteret that introduce the capability to push 40 Gbps.
A typical co-located Nasdaq customer would probably also be looking at the PSX market, formerly the Philadelphia Stock Exchange. It was launched on the new Nasdaq platform in October 2010 to appeal to traders who were looking to deal in an order book ranked by size as opposed to time.
PSX sent out a peak of 20,035 mps in its first operational month and reached a high of 79,419 mps in September 2011. From launch to highest peak, this is roughly fourfold growth, although it is fair to point out that this is a comparatively young market. It is also clear that its data rate came back to earth at the end of last year. Still, it might be wise to use the high water mark as a reference and then prepare for a potential doubling in the next twelve months, if you want to be on the safe side.
Co-located trading firms in Nasdaq's Carteret data centre would also likely take TotalView BX, the Boston Exchange. It sent out 76,490 mps when launched in January 2010. Its high point in 2011 was 61,368 mps, reached in October. Nevertheless, technologists should be aware that it hit 126,146 messages per second in August 2010, so it clearly has the potential to send out a massive blast of data.
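The rule of thumb used for PSX and BX, taking the highest peak on record rather than the most recent one and then allowing for a doubling, can be sketched in a few lines of Python. The function name and the default growth factor of 2.0 are assumptions for illustration, not a recommendation from any exchange.

```python
def provisioning_target(peaks_mps, growth_factor=2.0):
    """Scale the highest peak on record by an assumed growth factor."""
    return max(peaks_mps) * growth_factor

# Peaks quoted above, in messages per second.
psx_peaks = [20_035, 79_419]           # first month, September 2011
bx_peaks = [76_490, 126_146, 61_368]   # January 2010, August 2010, October 2011

psx_target = provisioning_target(psx_peaks)
bx_target = provisioning_target(bx_peaks)

print(f"PSX target: {psx_target:,.0f} mps")
print(f"BX target:  {bx_target:,.0f} mps")
```

For BX, using only the 2011 high would give a target of about half the figure produced by the all-time high, which is why the high water mark is the safer reference.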
Traders and investors looking at the Nasdaq market also read the official tape on the UTDF feed. Trades started the year at 26,482 mps and peaked at 51,841 mps in September. In an ostensibly quiet December, they still managed to make a high of 35,999 mps, so there is decent growth here too.
The top-of-book quotes from the Nasdaq market participants represented in the UQDF feed also grew quickly – 155,438 mps in December 2011 compared to 99,411 in 2010. Interestingly the end-of-year drop off in volatility and trading did not slow down growth in data rates.
Firms looking to make markets in or trade Nasdaq-listed shares normally take into account NYSE Arca, Bats and Direct Edge, which means that they subscribe to their respective voluminous and very fast feeds.
Arca pushed out 356,110 mps in December last year compared to 336,588 in the same month of 2010. That's a relatively light progression, and the high water mark, according to the FIF statistics, was in March 2011 at 408,651 mps. It is odd that the high water mark was in the spring. By contrast, on marketdatapeaks.com, the high occurred in August and was measured at 568,000 mps. I would therefore caution that the data rates at the exchange output point may underestimate the highs seen by end users in different locations. There are many networks and devices that may create choke points at the busiest moment and therefore, much like a garden hose, the pressure at the receiving point might conceivably be significantly different from that at the original output.
The Bats BZX exchange's December 2011 high was 259,923 mps, not that much different from the previous year. However, traders would also be subscribed to the second Bats market, BYX, which hit 208,519 mps on December 13th. BYX only came into existence in March 2011. So, arguably, market participants would have received double the message traffic from Bats that they had received in 2010. As far as I know, Bats is not planning another exchange, but we do know that it is constantly upgrading its technology and therefore the potential for much greater data rates in 2012 remains.
Direct Edge also houses two markets these days – EDGA and EDGX. In December 2011, Direct Edge A hit 89,949 and X sent out data at a high of 145,275 mps. By comparison, in December 2010, EDGA hit 51,627 and EDGX rose to 131,139 mps. So at first glance it seems that we saw moderate growth here, but bear in mind that EDGA crested for the year at 153,548 in September 2011 and EDGX topped out at 212,391 mps in November.
So to my mind, there is no room to be complacent when provisioning systems to deal with the order book feeds. They are very lively and likely to emit very high data rates all at the same time when markets are stressed. Every major stock and ETF has thousands of lines of orders on each book, all of which can update at once. It is at exactly those moments that you can't afford to be short of capacity.
SIAC Consolidated Trades & Quotes
These days, significant volumes of NYSE and Amex listed stocks are traded on Nasdaq, Bats and other platforms. So firms focused on listed stocks (or, vice versa, on Nasdaq stocks) normally take all the proprietary feeds from exchanges that have established a market share worth looking at. For the official trade file, practitioners and market data vendors subscribe to the CTS feed from SIAC. This trade reporting feed pumped out an impressive high of 67,535 messages per second in December 2011 compared to 48,369 in December 2010, and its all-time high was reached in June, when it touched 78,240 mps.
The consolidated quote feed, CQS, reflects all the best prices and sizes from the protected markets, that is, the regulated exchanges. It peaked at 518,302 mps last December compared to 308,705 mps in December 2010. That's a lot of data and probably quite a challenge for traditional market data vendors who keep a central database and distribute data to terminals or feeds. Their goal is to publish the national best bid and offer in a timely and accurate manner, but that's hard when the data comes at you so fast and furiously. The highest data rate for CQS was seen in November, when it hit 580,870 mps.
The listed equity options markets are normally the leaders in terms of market data output. This didn’t change in 2011 and it isn’t going to change in 2012. The high volume is a result of their enormous array of products, contracts and technical prowess. For example, every time the stock market goes up or down, several hundred thousand options contracts are re-priced automatically and funneled into the OPRA system. This means that investors can be sure that the latest price is accurate.
In December 2011, the OPRA output peaked at 3,068,426 mps. In the same month of 2010, it hit 2,145,754. One of the drivers was the addition of exchanges. Bats and C2 joined OPRA in 2010 and then ramped up their output and market share in 2011. In 2012 we are expecting to see the new Miami Exchange. Its data rate is likely to be modest as it starts up later this year but 2013 could get interesting. Even without the addition of new exchanges, the options markets are continually creating new products such as weekly options, contracts with strike prices a dollar apart and new maturity dates. They can experiment because the new options symbology introduced in 2010 offers much more flexibility.
From January 2012, OPRA has the capability of sending 7,866,000 mps. In my opinion it is unlikely to hit this ceiling, as OPRA always keeps plenty of headroom available. However, anything can happen when black swan events occur during market hours, so it might not be wise to under-provision. Already we have seen a peak in 2012 of about 4.1 million messages per second, and it hasn't been really busy compared to last year.
Note that OPRA has projected that its ceiling for January 2013 could be raised to accommodate 11,765,000 mps. In addition, OPRA’s total output over the course of the day could be as high as 24.4 billion messages in 2013. So there is no room for complacency for anyone downstream.
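To put the OPRA ceilings in perspective, here is a short Python sketch that compares the observed peaks with the stated capacity. The constant names are mine, and the 4.1 million figure is the approximate 2012 peak mentioned above.

```python
def utilisation(peak_mps, ceiling_mps):
    """Fraction of the stated ceiling used by an observed peak."""
    return peak_mps / ceiling_mps

def headroom_factor(peak_mps, ceiling_mps):
    """How many times the observed peak would fit under the ceiling."""
    return ceiling_mps / peak_mps

OPRA_CEILING_2012 = 7_866_000    # capability from January 2012
OPRA_CEILING_2013 = 11_765_000   # projected ceiling for January 2013
PEAK_DEC_2011 = 3_068_426
PEAK_EARLY_2012 = 4_100_000      # "about 4.1 million", approximate

print(f"Dec 2011 peak vs 2012 ceiling:   {utilisation(PEAK_DEC_2011, OPRA_CEILING_2012):.0%}")
print(f"Early 2012 peak vs 2012 ceiling: {utilisation(PEAK_EARLY_2012, OPRA_CEILING_2012):.0%}")
print(f"Headroom over early 2012 peak:   {headroom_factor(PEAK_EARLY_2012, OPRA_CEILING_2012):.2f}x")
print(f"2013 ceiling vs 2012 ceiling:    {OPRA_CEILING_2013 / OPRA_CEILING_2012:.2f}x")
```

A downstream system sized only for the observed peak would have none of this headroom, so the ceiling is arguably the more conservative number to plan against.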
Direct Options Feeds
Traders who want to get the options data faster can of course get proprietary feeds from the options exchanges directly. How many fire hoses can you drink from at once? Remarkably, the NYSE Arca proprietary options feed was 2.6 times larger in 2011 than in 2010. In December 2011 it hit 1,472,063 mps compared to 566,556 mps in December 2010. That is a healthy growth rate, to say the least. The NYSE Amex Options market runs on the same core technology as Arca, so it wasn't entirely surprising to see it grow rapidly from 784,564 mps in December 2010 to 1,426,854 mps in December 2011.
With Amex and Arca at these kinds of data rates, it is clearly essential to have robust systems in order to read the full stream, with or without data growth.
C2 and Bats are still very young for options exchanges, but they both contributed to the data stream from the markets. Just so you know, C2 pushed out 24,442 mps and Bats 13,716 mps on December 30th 2010 at 9:42:01 AM. OPRA's busiest second in December last year occurred on the 27th, and this time C2 accounted for 62,561 mps and Bats 81,066 mps. They have clearly both grown considerably over the last year and one suspects that they will continue their upward trajectory.
The CME feeds include data from several exchanges including BM&F, CBOT, Comex, Globex, KCBT, MGEX and Nymex. Futures markets do not usually have competitors with fungible products, which diminishes the reflexive quoting between exchanges that you see in the fragmented equity and US equity options markets. As a result, the overall levels of messages per second are relatively modest. Although the markets were undoubtedly busy as financial firms laid off risk and took positions to take advantage of the conditions in capital markets and commodities, the December 2010 peak of 22,107 mps was not that much different from the end of 2011, when the overall CME feed hit 24,792 mps. However, during October, while the market was particularly volatile, the feed made its high water mark of 32,556 messages per second.
European markets are generally following the same technological and regulatory path as the U.S., so it is not unreasonable to guess that data rates are growing at a similar rate to those in America. In fact, the Euro zone itself has been at the center of the capital markets storm for much of 2011. However, we have very little time series data on message rates to be sure of the magnitude of the challenge. Let’s take a look at what we have got.
Eurex's CEF Core feed pushed out a peak of about 160,000 mps in September 2011 compared to a peak of 143,994 in November 2010. For systems already used to large volumes, this isn't much to worry about per se. However, it is worth checking the CEF messages per second measured over a one minute interval. This data shows that the highest measure taken in 2010 was 67,229 mps compared to 92,266 mps in 2011. Moreover, the high point in 2011 was reached in December, an otherwise quiet period in world markets. This means that instead of seeing only very brief spikes, the CEF markets are staying busy for longer, sustained periods. This is a red flag to those worrying about processor capacity.
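The distinction between a brief spike and a sustained busy period can be illustrated with a small Python sketch that compares the busiest single second with the busiest one minute average. The two series are invented for illustration, and the rolling 60 second window is my assumption; the exchange's own one minute measure may be computed over fixed intervals.

```python
def peak_one_second(per_second_counts):
    """Busiest single second in the series."""
    return max(per_second_counts)

def peak_window_average(per_second_counts, window=60):
    """Highest average messages per second over any rolling window."""
    if len(per_second_counts) < window:
        raise ValueError("need at least one full window of data")
    running = sum(per_second_counts[:window])
    best = running
    for i in range(window, len(per_second_counts)):
        # Slide the window forward by one second.
        running += per_second_counts[i] - per_second_counts[i - window]
        best = max(best, running)
    return best / window

# Hypothetical two-minute series: one with a single spike, one steadily busy.
spiky = [1_000] * 120
spiky[60] = 100_000
sustained = [50_000] * 120

for name, series in (("spiky", spiky), ("sustained", sustained)):
    print(name, peak_one_second(series), peak_window_average(series))
```

The spiky series has the higher one second peak but a far lower one minute average, while the sustained series keeps processors busy for the whole window, which is the pattern the Eurex figures hint at.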
Kansas City in Europe
Bats Europe provides very detailed data to the FIF. Now that Chi-X is moving onto the Bats technology, understanding Bats is even more important.
It is remarkable that the data output peak, measured in messages per second, diminished from November 2009 to the same month in 2010. But in 2011, Bats Europe hit a much larger peak of 128,742 mps on 14 November. This was described as abnormal by the exchange; nevertheless the spike showed the tremendous capability of the Bats systems to send out order book data at a terrific rate of knots, orders of magnitude larger than previous traffic. In the U.S. marketplace, Bats offers two exchanges, each of which has sent out more than 250,000 messages per second. Could this be a precursor to Chi-X and Bats Europe running side by side?
MarketDataPeaks in Europe
Data is extremely hard to come by for the European theatre. Only Bats, Deutsche Borse and Eurex provide capacity data to the FIF for European feeds, but Exegy is starting a new monitoring service (with partners MarketPrizm and the FIF), showing live market data rates for Europe. The public web site has yet to be put into production, but it will be available on www.marketdatapeaks.eu.
I have monitored recent message statistics on the test system. Eurex (EBS), LSE, Bats Europe, NYSE Euronext and Chi-X Europe data rates are available. Eurex EBS is the feed used by member exchanges of Eurex and it shows a peak of 58,000 mps that occurred on February 9th. The LSE hit 23,000 mps on February 6th. Chi-X crested at 53,000 mps on February 9th and NYSE Euronext's UTP feed made 9,000 mps on February 15th. MarketDataPeaks will build a history through 2012 and we should be in a position to comment next year on historical growth and the likely scenario for 2013.
In the meantime, it is difficult not to imagine higher peaks from these competitive liquidity pools in a European economy that is still on the edge of crisis.
Microbursts – Milliseconds and More
Almost everyone receiving direct feed data is concerned about microbursts. A microburst is a sudden spike of traffic that lasts less than one second. Because the exchanges have tuned up their systems to minimise delays and because bandwidth and switches have been opened up to minimise latency, it is suspected that market data has a much greater propensity to microburst. Given that there is little historical data on this subject, it is difficult to say definitively if microbursts are more frequent today than say two years ago. Anecdotal evidence suggests that there are more microbursts but some have wondered whether we have just become more sensitive to the issue.
This year we do have a growing set of data collected by the FIF from Bats and SIAC, which means that we can cross-check data from Bats BZX, Bats BYX, Bats BZX TOP, Bats BYX TOP, Bats Europe, CQS, CTS and OPRA. The data is provided from all these sources in messages per second across 10 millisecond slices. In aggregate, the peaks increased steadily from May 2011 to December 2011, from about 6.8 million to 9 million messages per second. This increase is in contrast to the widely observed tailing off of message rates for most of these markets (in the one second slice of time) in December of last year. My guess is that this means that we are seeing more microbursts and should prepare for more in 2012.
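To show why a 10 millisecond view can look so different from a one second view, here is a minimal Python sketch. It assumes that a 10 millisecond measurement is expressed in messages per second by scaling the count in the slice by 100; the slice counts are invented and the function names are my own.

```python
SLICE_MS = 10
SLICES_PER_SECOND = 1000 // SLICE_MS  # 100 slices in one second

def slice_rate_mps(messages_in_slice):
    """Express the count in one 10 ms slice as a messages-per-second rate."""
    return messages_in_slice * SLICES_PER_SECOND

def burst_ratio(slice_counts):
    """Peak slice rate divided by the average rate over all slices."""
    peak_rate = slice_rate_mps(max(slice_counts))
    average_rate = sum(slice_counts) * SLICES_PER_SECOND / len(slice_counts)
    return peak_rate / average_rate

# Hypothetical second of traffic: 99 quiet slices and one microburst.
one_second = [100] * 99 + [10_000]

messages_in_second = sum(one_second)           # what a one second counter sees
peak_rate = slice_rate_mps(max(one_second))    # what the 10 ms view sees
ratio = burst_ratio(one_second)

print(messages_in_second, peak_rate, round(ratio, 2))
```

A one second counter reports a modest number for this made-up second, while the busiest 10 millisecond slice runs at a rate many times higher, which is exactly the load a switch buffer or feed handler has to absorb.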
Market Volatility = Higher Data Rates
The actual data rates from all these exchanges are dependent on market movement. We don't know if last year will be repeated. 2011 certainly seemed extraordinary and ground-shaking at the time, but so did 2008. Regardless, it is absolutely clear that all the exchanges and liquidity providers are continuing to vie for market share by improving their technology platforms. This always leads to greater capacity, which in turn means that everyone downstream needs to be prepared for yet more high intensity bursts of data. In other words, we should expect data rates to go up in general and microbursts to increase in particular. We can hope for the best, but should prepare for the worst.
As a rule of thumb, 100% growth doesn't appear to be such a bad prediction for 2012 in aggregate if you want to be safe and haven't got anything else to go on, but clearly every market is slightly different.
Good luck for what remains of the year of the Dragon. Don’t get burned.