It’s well known that Interactive Data Corp. can be a relatively conservative institution. This is in part due to its valuations business, which forces it to make judgments akin to those of a ratings agency, and sometimes results in a somewhat cautious corporate approach.
And so there has been a distinct lack of the kind of yelling from the rafters one might expect when a top-tier data vendor refits all of its data collection and delivery infrastructure at a cost of multiple millions of your currency of choice.
Interactive Data has almost completed such a refit; we’re just now beginning to hear about it. You’ll have read about the launch of Apex – the name for its new data transmission infrastructure – here. There was a press release and a client-only event. But other than that, little fanfare.
The word ‘almost’ above may have something to do with the soft launch. Interactive Data’s new owners – the venture capitalists who have made this refit possible – may be taking the view that they’d like to see more of Apex in production before they make too much noise about it, which is fair enough.
But you can expect to hear more about Apex from evangelists like Paul Kennedy here in London in coming weeks and months, as more functionality and coverage is rolled out, and client uptake starts to gain traction.
In the meantime, we managed to get an audience with said evangelist Kennedy, who gave us the low-down on how Interactive Data reached the Apex of reference data delivery.
It should be pointed out that we’d seen this development coming for some time. For a start, Interactive Data, which grew out of a series of coagulations and agglomerations over the years involving names like FT Information, Extel Financial, eSignal, 7ticks, Data Broadcasting Corp. and a host of other seemingly unintegratable entities (LEI practitioners would have a field day), wasn’t known for its standardised approach to internal data management. And in their due diligence ahead of their acquisition of the bulk of the company from Pearson PLC, its new owners – Silver Lake and Warburg Pincus – had name-checked the delivery infrastructure as both a liability and an opportunity to leapfrog the marketplace.
What’s more, the task of implementing the technology refit was given to Alex Goor, known from his time at Instinet a decade ago, when he was involved in the transformation of the market’s slowest ECN – Island, acquired by Instinet in 2002 – into its fastest. Indeed, the underlying Inet platform is still being touted by its later acquirer – Nasdaq OMX, which owns the source code – as the fastest matching engine on the market.
With Goor on board, and the addition of the 7ticks low latency hosting and delivery infrastructure soon after, the writing was on the wall for Interactive Data’s aging – some say creaking – technology set-up. With a cash injection from the VCs, Goor and his team had the luxury of rebuilding from scratch. The result is Apex.
So what is it?
Apex is a single data consolidation, processing and distribution mechanism that will handle all of Interactive Data’s data sets, from those requiring microsecond delivery to those whose values don’t change more than once every 30 years. All data types – covering securities, entities, rates, indexes, corporate actions and more – are covered by a single, flexible data model, and all are delivered in XML.
The single data model is key to the overriding mantra of simplification. Apex uses the data model both to collect and standardise all information sources and to distribute the results to clients. Kennedy says the model has been designed for ease of expansion, which means that clients wishing to contribute to the Apex platform can add to the model to cater to their data types, as appropriate.
Interactive Data sees this approach as a way of expanding its own business, from its core base in reference data and corporate actions, with its mostly back-office and operations footprint, into the front and middle offices, from where competitors like Thomson Reuters and Bloomberg are moving in the opposite direction.
At the heart of this incursion toward the front office lies the customer requirement for more robust risk management. To achieve this, customers need access to good quality reference data, credit ratings, counterparty data and identifiers, as well as ancillary data like schedules and prospectuses. Apex is designed to be the single port of call for this data, and to provide the platform for subsequent activities like back-testing of models, tracking of price histories and historical application of corporate actions to portfolios.
For clients of Interactive Data the advent of Apex may not be very apparent, at least initially. Kennedy says Interactive Data will continue to support all delivery formats; there will be no Big Bang of forced migrations. Legacy platforms will be switched off as and when clients opt to switch to the Apex delivery system.
For when clients decide to make the change, Apex comes with several delivery options, all making use of XML. Standard delivery methods like FTP are useful for delivering large amounts of data, says Kennedy, “but [they] add the burden of processing for clients, incurring people and IT costs. They need to break that content into something usable that can be delivered to different systems, old and new, and this can be problematic and inconsistent.”
Interactive Data’s approach to the challenge of how to programme content for consumption is to “stage the data to make it more consumable,” says Kennedy. “We present the data in a manner that’s both human- and computer-readable, so the cost of programming is dramatically reduced.”
This approach, Kennedy reckons, opens up new possibilities for clients, particularly in the area of risk reporting. “If I have to report to the FSA, I need to know: issue, detail of instrument, seller/counterparty, nature of transaction, position/exposure, rating, mark to market, and so on. All of these variables need to be pieced together, and so the challenge for the risk community is coherence. The marketplace is starting to understand that they need to do this.”
As a result, Interactive Data has adopted XML for the complete data processing workflow: creation of the Logical Data Model; a single data collection model; and expression of data in XML. The Apex output is all XML-based and comes in three types, each catering to different client demographics and requirements: FTP; messaging; and API.
“For FTP consumers, we change the payload by making it XML,” says Kennedy. “The barrier to accessing XML is low, as existing open-source platforms can handle it. XML parsers on the market require minimal effort to deploy. You just need business analysts rather than IT staff to handle it.”
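To give a sense of how low that barrier can be, here is a minimal sketch of reading an XML file delivery with nothing but Python’s standard library. The element and attribute names are invented for illustration; they are assumptions, not Interactive Data’s actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element and attribute names are invented for
# illustration and are not taken from Interactive Data's actual schema.
SAMPLE_PAYLOAD = """\
<securities>
  <security id="EXAMPLE001">
    <name>Example Corp 5% 2030</name>
    <assetClass>FixedIncome</assetClass>
    <currency>USD</currency>
  </security>
  <security id="EXAMPLE002">
    <name>Example Holdings Ord</name>
    <assetClass>Equity</assetClass>
    <currency>GBP</currency>
  </security>
</securities>
"""


def parse_securities(xml_text):
    """Return one dict per <security> element in the payload."""
    root = ET.fromstring(xml_text)
    records = []
    for sec in root.findall("security"):
        record = {"id": sec.get("id")}
        # Each child element becomes a field keyed by its tag name.
        for child in sec:
            record[child.tag] = child.text
        records.append(record)
    return records


records = parse_securities(SAMPLE_PAYLOAD)
for record in records:
    print(record)
```

Because every field arrives tagged, the same few lines would cope with an extra element in the payload without any change to the parsing logic.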
Where clients have a requirement for streaming or data on demand, Interactive Data’s messaging provides event-driven delivery capabilities, which can be used to handle the full spectrum of delivery requirements, from dividend announcements to real-time pricing data streams. The use of standard messaging technologies – in this case IBM’s WebSphere – yields the standard features you’d expect: self-recovery, reconstitutions, re-queueing and so on.
With all data tagged with Interactive Data’s metadata based on the Logical Data Model, the company is able to offer a single, standardised API. This, says Kennedy, allows the client to pull the full remit of data into its own applications: reference data, corporate actions, real-time data and historical pricing. Using the API, he says, a client “could get real-time data into a trading engine, and then call from the same source to set up the instrument in the security master.”
Kennedy reckons the API could be a game changer both for Interactive Data and its clients, by allowing the company to offer a hosted, managed database solution. “We can decant all data from our internal data collection systems, pour it into a relational database, underpinned by the Logical Data Model, and commingle with prices or tick histories. We take away the cost of the data centre for the client, maintain the data in an Interactive Data environment in a big relational database, and allow the client to do what it wants with it. It’s a massive query box.”
At the core of the system is a brand new publish-and-subscribe technology platform. Based around a set of local area networks at Interactive Data’s data centres, the Apex core data consolidation system is an information bus running IBM’s WebSphere messaging system.
It’s a modern version of a pretty standard trading room data architecture: multiple copies of data sets are held in memory. Values get published to the bus. Applications run on boxes hanging off the bus. Applications subscribe to price streams from the bus, perform calculations on the data and then republish the results to the bus, where they’re made available to other applications or consumers.
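The subscribe, calculate and republish pattern can be sketched in a few lines of Python. This is a toy, in-process stand-in and not WebSphere or anything resembling the Apex code; the Bus class, the topic names and the mid-price calculation are all assumptions made for illustration.

```python
from collections import defaultdict


class Bus:
    """Toy in-process publish-and-subscribe bus (a stand-in, not WebSphere)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.last_value = {}  # latest value per topic, held in memory

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, value):
        self.last_value[topic] = value
        for callback in list(self._subscribers[topic]):
            callback(topic, value)


def attach_mid_price_calculator(bus, bid_topic, ask_topic, mid_topic):
    """Subscribe to bid and ask; republish the mid price once both are known."""

    def on_quote(topic, value):
        bid = bus.last_value.get(bid_topic)
        ask = bus.last_value.get(ask_topic)
        if bid is not None and ask is not None:
            bus.publish(mid_topic, (bid + ask) / 2)

    bus.subscribe(bid_topic, on_quote)
    bus.subscribe(ask_topic, on_quote)


bus = Bus()
received = []

# A downstream consumer that only cares about the derived value.
bus.subscribe("XYZ.mid", lambda topic, value: received.append(value))
attach_mid_price_calculator(bus, "XYZ.bid", "XYZ.ask", "XYZ.mid")

bus.publish("XYZ.bid", 99.0)
bus.publish("XYZ.ask", 101.0)
print(received)
```

The calculator never needs to know who consumes the mid price, and the consumer never needs to know where it came from; both talk only to the bus.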
All data, structured or unstructured – including prospectuses and the like – adhere to the single so-called Logical Data Model. The entire platform has been optimised for real time pricing, but is used for all data sets. It features built-in sequencing and robustness, and a GUI front-end for manual inputs.
According to Marc Alvarez, Senior Director – Reference Data Infrastructure, at Interactive Data, the Apex data architecture has been built using standard technologies.
Data deliveries are standardised to a single XML data schema. The schemas (one each for Global Security Master, Global Corporate Actions, and Global Price Master) all comply with World Wide Web Consortium (W3C) standards.
According to Alvarez, “the big advantage of W3C compliant XML is that it gives users the ability to make use of a very wide and growing range of software applications from the Internet community – most of it open-source and free. The goal here is to shorten the time and effort it takes to make data content available to consuming applications. We are prototyping some of these tools at present to examine whether we can eliminate the need to ever write a feed handler again.”
A planned web services component will be compliant with SOAP, an accepted W3C standard. All file deliveries use standard File Transfer Protocol (FTP) and will soon support standard compression (gzip), Alvarez says.
Message Queue delivery provides standard connectivity options to a hosted IBM WebSphere Message Queue server. WebSphere hosts a wide range of application connectivity options and gives clients control over when to take delivery of messages. All messages posted to the message queue server are formatted in the same W3C-compliant XML.
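The idea of the client choosing when to take delivery can be illustrated with a rough sketch. Here Python’s standard queue.Queue stands in for the hosted queue, and the message layout is invented; a real client would connect through an MQ client library, which this example deliberately does not attempt to show.

```python
import queue
import xml.etree.ElementTree as ET

# Stand-in for the hosted queue: a local queue.Queue pre-loaded with
# hypothetical XML messages. The message layout is an assumption.
inbox = queue.Queue()
inbox.put(
    '<event type="dividend"><security id="EXAMPLE001"/>'
    '<amount currency="USD">0.25</amount></event>'
)
inbox.put(
    '<event type="price"><security id="EXAMPLE001"/>'
    '<amount currency="USD">101.5</amount></event>'
)


def drain(q, max_messages):
    """Take up to max_messages from the queue; the rest stay queued."""
    events = []
    for _ in range(max_messages):
        try:
            raw = q.get_nowait()
        except queue.Empty:
            break
        root = ET.fromstring(raw)
        events.append({
            "type": root.get("type"),
            "security": root.find("security").get("id"),
            "amount": float(root.find("amount").text),
        })
    return events


# The client decides how much to consume and when.
first_batch = drain(inbox, 1)
second_batch = drain(inbox, 10)
print(first_batch)
print(second_batch)
```

Since a dividend announcement and a price update share one XML format in this sketch, the same handler reads both.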
The managed database service is based on supporting off-the-shelf network connectivity to Interactive Data’s data centres. Access to the managed database service makes use of standard connectivity supported by most desktops – primarily ODBC and JDBC. For example, it is trivial to set up an ODBC connection from an Excel worksheet using ODBC options (call drivers) that come with Microsoft Office. Interactive Data provides no client software of its own; rather, it is a turnkey, completely standard form of connecting to Interactive Data as a data source.
Query and interrogation of the Interactive Data data universe uses standard Transact-SQL, the SQL dialect provided by Microsoft and Sybase/SAP. The number of Transact-SQL users is an order of magnitude greater than all other SQL dialects combined, says Alvarez.
He adds: “The third-party developer community around Transact-SQL is very significant in capital markets. There are thousands of Sybase and Microsoft VARs out there that can immediately consume this interface and make our content available to their applications. Apex can support other SQL dialects (PL/SQL from Oracle, PostgreSQL, and ANSI standard SQL), but at a functional level these are all subsets of the T-SQL functionality used by the capital markets. We could support these, but the fact is that our market research showed that the biggest target application footprint we want to sell to in the capital markets space (and the middle and front office in particular) is T-SQL and Office, so it makes sense for us to start with these, taking the path of least resistance to customer adoption.”
Alvarez sums up the intent thus: “The whole point of applying these standards to the new infrastructure is to shorten the last mile to actual data consumption by the business person as much as possible.”
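As a rough illustration of what querying such a database might look like, the sketch below joins reference data to prices in a single statement. It uses Python’s built-in sqlite3 module and an in-memory database purely as a stand-in; the table names, column names and sample rows are all invented, and SQLite’s dialect is not Transact-SQL, so the query sticks to plain SELECT and JOIN syntax.

```python
import sqlite3

# In-memory stand-in for the hosted database. Table names, column names
# and sample rows are hypothetical, not Interactive Data's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE security_master (
    security_id TEXT PRIMARY KEY, name TEXT, currency TEXT);
CREATE TABLE price_history (
    security_id TEXT, price_date TEXT, close_price REAL);
INSERT INTO security_master VALUES ('EXAMPLE001', 'Example Corp 5% 2030', 'USD');
INSERT INTO security_master VALUES ('EXAMPLE002', 'Example Holdings Ord', 'GBP');
INSERT INTO price_history VALUES ('EXAMPLE001', '2012-11-01', 101.25);
INSERT INTO price_history VALUES ('EXAMPLE001', '2012-11-02', 101.50);
INSERT INTO price_history VALUES ('EXAMPLE002', '2012-11-02', 3.42);
""")

# One query pulls reference data and pricing from the same source.
QUERY = """
SELECT m.security_id, m.name, m.currency, p.price_date, p.close_price
FROM security_master AS m
JOIN price_history AS p ON p.security_id = m.security_id
WHERE p.price_date = ?
ORDER BY m.security_id
"""

rows = conn.execute(QUERY, ("2012-11-02",)).fetchall()
for row in rows:
    print(row)
```

In the arrangement the article describes, the connection would instead go over ODBC or JDBC to the hosted service, with the query written in Transact-SQL; the join itself would look much the same.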
As far as the roadmap goes, Kennedy says, FTP and messaging are now live with a subset of the Interactive Data universe. The entire fixed income and equities database should be live by year-end. The hosted version should be available during the first quarter of 2013, with the API in beta in Q1 for a full rollout in the second quarter.