By Amir Halfon, Senior Director for Technology, Capital Markets, Oracle Financial Services
Systemic risk was at the heart of the financial crisis of 2008, and is again on everyone’s mind as the current sovereign debt crisis unfolds. Regulatory and industry efforts are, therefore, focusing on getting a more accurate view of risk exposures across asset classes, lines of business and firms in order to better predict and manage systemic interplays.
Managing large amounts of data (including positions, reference, market data, etc.) is a key aspect of these efforts, and is one of the reasons data management has recently ascended to top-of-mind status, after being relegated to the back burner for many years.
We are finally at a point as an industry where data is considered a key input into business processes, and top executives are fully aware of it. Efficient allocation of capital is now seen as a major competitive advantage, and risk-adjusted performance carries more weight than ever before. And so, from the front office all the way to the board room, everyone is keen on getting holistic views of exposures and positions, which require fast, on-demand, aggregated access to disparate data.
Most Big Data discussions have been focusing on Internet companies such as Google and Facebook and the data they generate. There’s been a lot of attention given to harnessing that data for commercial goals, and certainly the banking industry is examining these usage scenarios as it considers its future direction.
I would argue, however, that a more urgent task associated with Big Data in our industry is the one mentioned earlier – namely harnessing the large amounts of data that have been gathering within our firms to address critical business concerns such as risk management. The rest of this post will attempt to build a case for this argument, with subsequent posts focusing on specific technology implications.
Being a recent darling of IT analysts, Big Data has had many definitions, but the key aspects seem to fall along ‘four Vs’: volume, velocity, value and variety.
Let’s examine each in more detail:
The web is not the only place seeing exponential growth in data volumes – our industry has witnessed exponential growth in trade data, beginning with the early days of electronic markets, and skyrocketing with market fragmentation and the widespread use of algorithmic, program, and high-frequency trading. These generate orders of magnitude more orders and cancels than the ‘quaint days of open outcry’ ever did. Additionally, complex strategies, including cross-asset trading and structured products, generate far more data per trade than simple ones.
Higher trade volumes mean higher market data volumes of course, but also much larger amounts of historical tick and positions data that need to be kept around. New regulations require ever more extensive data retention, and sophisticated strategy development requires ever-growing amounts of historical data for backtesting.
Many systems are struggling to keep up with these vast amounts of data while still performing their tasks – whether it’s risk management, regulatory reporting, trade processing or analytics.
Higher volumes are not the only issue firms are facing; data is also coming at them at much higher speeds, while at the same time, information needs to be culled from source systems at an ever-growing pace to address pre-trade and on-demand risk analytics requirements.
It is the latter aspect that’s been getting a lot of attention lately, as new regulations become much more stringent about timely delivery of data, essentially mandating on-demand risk exposure and positions reporting.
Most current systems are ill-prepared to meet these requirements, making the notion of on-demand exposure reporting seem all but impossible. Many use long ETL data integration and batch calculation cycles to generate reports overnight, and are completely incapable of supporting an ad-hoc analytics model.
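To make the contrast with overnight batch cycles a bit more concrete, here is a minimal sketch of the alternative: folding each position update into running totals as it arrives, so that an exposure query is answered from what is already aggregated. The `PositionUpdate` record, the `ExposureView` class and the sample figures are all hypothetical, chosen only for illustration; a real system would also have to deal with persistence, concurrency and far richer data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionUpdate:
    # Hypothetical record shape; real position feeds carry many more fields.
    counterparty: str
    asset_class: str
    notional_change: float  # signed change in notional exposure


class ExposureView:
    """Keeps running exposure totals so a query does not wait for a batch run."""

    def __init__(self):
        self._totals = defaultdict(float)

    def apply(self, update):
        # Fold each update into the running total as soon as it arrives.
        key = (update.counterparty, update.asset_class)
        self._totals[key] += update.notional_change

    def exposure(self, counterparty, asset_class=None):
        # Answer on demand from the totals accumulated so far.
        if asset_class is not None:
            return self._totals.get((counterparty, asset_class), 0.0)
        return sum(
            total
            for (cp, _), total in self._totals.items()
            if cp == counterparty
        )


# Illustrative updates only; the names and amounts are made up.
view = ExposureView()
for update in [
    PositionUpdate("Bank A", "rates", 5_000_000.0),
    PositionUpdate("Bank A", "credit", 2_000_000.0),
    PositionUpdate("Bank A", "rates", -1_500_000.0),
    PositionUpdate("Bank B", "fx", 750_000.0),
]:
    view.apply(update)

print(view.exposure("Bank A", "rates"))  # exposure in one asset class
print(view.exposure("Bank A"))           # exposure across asset classes
```

The point of the sketch is only the shape of the approach: the aggregation cost is paid per update rather than per report, which is what makes ad-hoc queries feasible.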
Low value of the overall data set, or low information density, is another key aspect of Big Data. Just as Twitter feeds contain a lot of ‘noise’ when analyzing market sentiment for instance, financial data can have a very low ‘signal-to-noise ratio’ when looking to analyze a specific market exposure, find correlations between unrelated variables, and so on.
Low information density puts even more onus on current analytics systems, as more and more data needs to be sifted through to get at the relevant information. In many cases, this can make traditional approaches to analytics fall apart.
Lastly, information variety has to do with loosely structured data. And while this is quite clear when it comes to images and videos on the Web, within the financial services industry we’ve had a data variety challenge for quite some time, which has actually been getting a lot of attention lately… I’m referring of course to OTC derivatives – essentially contracts that have little in the way of structured data, and which were at the centre of the financial crisis of 2008.
A lot of regulatory effort has been focusing on these instruments, attempting to make them more structured by establishing formulas for their trading, clearing, and settlement (e.g. central counterparties). While this certainly goes a long way toward reducing systemic risk, it will not fundamentally change the fact that certain instruments will always remain nothing more than bilateral contracts.
As long as OTCs exist, we need to find a mechanism to extract structured data out of these contracts in order to properly evaluate them and manage their risk exposures.
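As a rough illustration of what extracting structured data from a contract could look like, the sketch below pulls a handful of economic terms out of free text with regular expressions. The confirmation wording, the field labels and the `extract_terms` function are invented for this example; real OTC documentation is far less regular, and pattern matching of this kind is only one of several possible approaches.

```python
import re
from datetime import date
from decimal import Decimal

# Hypothetical confirmation text, invented for this example.
CONFIRMATION = """
Trade confirmation between Bank A and Fund B.
Notional Amount: USD 25,000,000
Effective Date: 2012-03-15
Termination Date: 2017-03-15
Fixed Rate: 2.35%
"""


def extract_terms(text):
    """Return whichever of the known terms can be found in the text."""
    terms = {}

    match = re.search(r"Notional Amount:\s*([A-Z]{3})\s*([\d,]+)", text)
    if match:
        terms["currency"] = match.group(1)
        terms["notional"] = int(match.group(2).replace(",", ""))

    date_fields = (
        ("effective_date", "Effective Date"),
        ("termination_date", "Termination Date"),
    )
    for field, label in date_fields:
        match = re.search(label + r":\s*(\d{4})-(\d{2})-(\d{2})", text)
        if match:
            year, month, day = (int(part) for part in match.groups())
            terms[field] = date(year, month, day)

    match = re.search(r"Fixed Rate:\s*(\d+(?:\.\d+)?)\s*%", text)
    if match:
        # Decimal avoids binary floating-point rounding on quoted rates.
        terms["fixed_rate_pct"] = Decimal(match.group(1))

    return terms


terms = extract_terms(CONFIRMATION)
for name, value in terms.items():
    print(name, value)
```

Terms that cannot be found are simply left out of the result, so downstream risk systems can tell the difference between a missing value and a zero.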
I think you’d agree that all these factors make a case for Big Data management being a real challenge that we, as an industry, need to address right away. In the next few posts I’ll cover some technologies that can help us get a grip on Big Data, focusing on the aspects above in greater detail.