By Mike O’Hara, A-Team Special Correspondent.
Data is often cited as the most essential resource in today’s global economy, with many of the world’s most valuable companies, including Amazon, Facebook, Microsoft and Google parent Alphabet, profiting massively from how they collect, use and sell data.
In financial markets, investment and trading firms have traditionally sourced their data sets either direct from exchanges and trading venues, from their counterparties, or from well-established data vendors such as Bloomberg and Refinitiv. Over the last few years, however, an increasing number of ‘data marketplaces’ (DMs) have sprung up in the financial markets sector, to enable data-rich companies – including those that have typically not been involved in commercial data sales previously, many providing so-called ‘alternative’ data sets – to reach a wider target audience of data consumers.
The fundamental purpose of these DMs is to match data providers with data consumers through a central, typically cloud-based platform.
But what kinds of benefits can these DMs offer to data consumers within capital markets? How can they help data-rich firms monetise their data sets? Who uses these DMs, and for what types of data? How are data usage and licensing rights handled? And how can data quality be assured through such third-party delivery mechanisms?
A new way of discovering and consuming data?
Sourcing the necessary data to drive investment and trading strategies, particularly when looking for new ways to uncover alpha, is an ongoing challenge for financial institutions. Discovering and combining data sets can give firms a competitive edge, but it can be a costly and cumbersome process to manage relationships with multiple data vendors, and to maintain the technology infrastructure necessary to access that data.
This is the challenge that data marketplaces aim to address, by bringing together previously untapped data sets and offering them to firms via the cloud, on a self-serve basis. Anurag Mathur, Financial Services ISV Partnerships Lead at Google Cloud, gives a succinct overview of the value proposition. “DMs allow for data discoverability, enhanced delivery options, faster time to value, reduced operational complexity and costs, and spend optimisation,” he says.
By providing a data ‘ecosystem’ that includes a wide range of cloud-based services and tools for end users, can DMs make identifying, curating and preparing data for usage less burdensome for firms?
Paul Watmough, CEO of IOWArocks, a global marketplace designed for data providers and data consumers, thinks so. “The new genre of DMs is empowering many financial markets firms to break free from the onerous cost and legacy constraints imposed by traditional data vendors,” he says. “With a cloud-based DM, the consumer no longer needs to maintain an ageing infrastructure to consume data, nor be subject to the limitations of proprietary, expensive terminals to visualise that data. With the growing acceptance of cloud delivery, DMs are providing a more cost-effective, agile solution, delivering a modern, consumer-like ‘on-demand’ approach to data consumption, aligned to the needs of the new, younger generation of users who expect this type of service as a standard.”
Matthew Cheung, CEO of ipushpull, a company specialising in live data sharing, workflow and automation, believes that the financial markets’ growing acceptance of cloud has contributed to the rise of DMs. “The reason why these DMs have started to take off in the last couple of years is really around cloud adoption. Because before it just wasn’t possible. And financial institutions weren’t trusting the cloud. That view has changed now,” he says.
Unlike the legacy approach to data consumption, which might typically involve a data producer making its data set available via FTP, or via its own cloud service requiring APIs to connect, today the likes of Snowflake, Databricks and Google Cloud BigQuery offer solutions where data can be accessed over the public cloud and used without it ever actually leaving the host platform. In effect, this is ‘data rental’, where consumers access a shared version of the content without having to transfer it to their own environment.
“The real value is about discovery, accessibility and integration,” says Michael Rude, COO of Crux Informatics, a managed data hub and data engineering services company. “If you can click a tile online, have it instantly appear in your BigQuery or Snowflake account (for example), and start testing it immediately and doing all the other things you want to do, that’s massively more efficient and beneficial than the historical way, which is for me to go to your FTP server, give you my credentials, download a file, prep it, load it into my database, and so on. So there is huge promise there.”
Nicolas Doyen, Head of Product at TickSmith, a company that provides tools for data producers to connect, package, unify and monetise their data, agrees that this approach offers clear benefits. “With some of these more data rental type concepts, you don’t have to have that ETL (extract, transform & load) anymore, you just connect directly onto the database or the data system with your analytics engine or your algorithm. And you can run your analytics on top of that without having to do any of the very difficult data engineering work to ingest the data.”
Flexibility around data consumption is another key benefit of DMs, says Francis Wenzel, TickSmith’s CEO. “You can slice and dice. It also makes it easier to integrate and work with the data. You might want to open it up in a spreadsheet, or you might even just want to see it on screen. The more mechanisms there are available to the users to integrate within their systems, and the more seamless it is, the more data eventually is going to be consumed. And the happier the customers are, because they’re only getting what they need, and paying a fair price for it.”
Dale Richards, CEO of Island 20 Ventures, a business strategy and advisory firm in data, analytics, technology and data monetisation, and Board Chair at JamLabs, a full-stack development and advisory company specialising in data science, believes that consumers benefit by being able to go to one place that has already done a lot of the work for them. “If you look across the whole data consumption workflow, a DM can knock off three or four main elements of the work that is required,” he says. “For the end user, it’s all about ease of administration and managing those relationships, and ease of integration in getting all the data from one place, one central source, from multiple vendors.”
How can data producers benefit from DMs?
For companies that produce unique data sets, DMs can offer potential monetisation opportunities by providing a mechanism to reach users who find value in that data. This can be particularly attractive to companies that might not be considered typical data vendors.
“We’re now seeing non-traditional firms wanting to commercialise and make the most of their data models by using the DM ecosystem & infrastructure to make it available to a new audience. That can be done effectively and at a low cost point,” says Darren Bishop, Director of Business Development, EMEA & Asia at MarketDesk, a market data solutions company that brings together content sellers with content buyers.
“There’s no DM that can guarantee a sale of data, but you stand a much better chance of selling the data if you can put it in front of your target audience and let them choose if they see value in it,” continues Bishop. “If you’ve got quality data internally and you think it has value, then you can get in front of potential customers very quickly and very cheaply. And then you can see if it’s a viable business opportunity. Five to ten years ago, you would have had to massively invest to do that. With the internet and cloud technology, it has never been as adaptable and as cheap as it is now.”
Being able to leverage the cloud not only provides a lower cost-to-market for data providers, but, according to Matthew Cheung, offers additional benefits. “DMs can make use of the clever technologies of the big cloud hosting providers, which firms probably don’t have in-house. So you can just piggyback on what they’ve already built, to access the data, query it, model it, and do all of the very sophisticated things that customers want to be able to do. And they present it in a way that a business analyst could build stuff as opposed to a developer. That’s unlocked a lot of value,” he says.
It’s not just financial data providers that can benefit from DMs, but other data-rich companies too, particularly those producing alternative data sets that can be used in the investment process, says Dale Richards.
“You’ve got what I would call the ‘data exhaust’ providers, companies that are doing something else like selling appliances, or selling farm machinery or selling cars, and they’ve created all kinds of data, through cell phones, or transactions, or IoT, for example. For all of those, data marketplaces may offer targeted distribution for their data, through a channel other than the company’s own direct website. Data-rich companies want to have access to the niche spaces that these marketplaces can provide.”
Richards points out that DMs offer another advantage, the possibility to ‘mix and match’ data. “My data might be worth something by itself, your data might also be worth something by itself, but my data plus your data might be worth something more, by offering new insights. Vendors can indirectly get the benefit of having their data stitched together with other data in a broader ecosystem, and that’s quite powerful,” he says.
Data producers do however need to be able to demonstrate the value of their data, says Marc LoPresti, Co-Founder of BattleFin, a technology company that provides a marketplace for alternative data.
“It’s like a store window on Fifth Avenue: you can have the most wonderful display, but if what’s in the store is not immediately attractive because a buyer cannot readily understand the use case for what’s on offer, you won’t have people buying,” he says. “That’s why the curation, the refinement of the displaying of the data, and helping the data providers to make their use cases relevant to the buyers’ KPIs is essential. It’s all about understanding how it will help investment firms gain alpha.”
Non-traditional data providers also face a challenge in determining the monetary value of their data, so they can put an appropriate price on it. There are various methods that can be used to formulate a suitable valuation, says Samantha Campbell, CEO at Alqami, a consultancy firm specialising in data monetisation.
“We do a lot of analysis with our clients,” she says. “We have a scorecard, which lists various attributes, and we will score our clients’ data sets using those different aspects. That could be the amount of history, it could be whether the data set is ‘tickerised’ and maps to stock identification codes, the breadth of coverage, how many regions are covered and in what sectors, and the frequency, because it’s important that the data set can be delivered on a regular basis, typically daily. If a data set can only be distributed on a quarterly basis, for example, that’s not necessarily as valuable to the data buyers. We also evaluate our clients’ internal processes, in order to make sure that they are ready and have the ability to deliver the data on a regular, consistent basis, but also, that they have the internal quality controls in place, so that the data model doesn’t change over time, for example.”
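As a rough illustration of the scorecard idea Campbell describes, the Python sketch below scores a data set on four of the attributes she mentions: history, tickerisation, regional coverage and delivery frequency. The attribute names, weights and thresholds are hypothetical and chosen purely for illustration; they do not represent Alqami's actual methodology.

```python
from dataclasses import dataclass

# Hypothetical weights for each scorecard attribute (they sum to 1.0).
WEIGHTS = {"history": 0.3, "tickerised": 0.2, "coverage": 0.2, "frequency": 0.3}

# Hypothetical scores per delivery frequency: daily delivery scores highest.
FREQUENCY_SCORES = {"daily": 1.0, "weekly": 0.7, "monthly": 0.4, "quarterly": 0.1}


@dataclass
class DataSetProfile:
    years_of_history: float
    tickerised: bool          # does the data map to stock identification codes?
    regions_covered: int
    delivery_frequency: str   # "daily", "weekly", "monthly" or "quarterly"


def score_data_set(profile, max_history_years=10.0, max_regions=5):
    """Return a weighted score between 0 and 1 for a data set profile."""
    parts = {
        "history": min(profile.years_of_history / max_history_years, 1.0),
        "tickerised": 1.0 if profile.tickerised else 0.0,
        "coverage": min(profile.regions_covered / max_regions, 1.0),
        "frequency": FREQUENCY_SCORES.get(profile.delivery_frequency, 0.0),
    }
    return round(sum(WEIGHTS[name] * value for name, value in parts.items()), 3)


daily = DataSetProfile(10, True, 5, "daily")
quarterly = DataSetProfile(10, True, 5, "quarterly")
print(score_data_set(daily), score_data_set(quarterly))
```

Under these assumed weights, two otherwise identical data sets differ only in the frequency component, so the quarterly one scores lower, mirroring the point that infrequent delivery reduces value to buyers.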
User profiles and data types
Within the financial markets sector, what types of firms are actually using these DMs?
“If we split the market between buy side and sell side, I haven’t seen that much evidence that the sell side banks are using these types of ecosystems to source data,” says MarketDesk’s Bishop. “And I’m not surprised by that. Because traditionally, they’ve used the vendor networks, they’ve heavily invested in huge amounts of infrastructure. So it’s just not really the way that they manage their businesses. Where you see the innovators out there, like the systematic hedge funds, those types of firms are thinking outside of the box. They’re purely focused on the content, and getting that content in as cheaply and as fast as possible. We’re seeing those types of firms being more readily adaptable to looking at these types of data sets, in these types of data delivery models.”
For alternative data providers, DMs can reach a wider target audience, says JamLabs’ Richards. “Five years ago, using this data was more or less the exclusive domain of data science quant hedge funds. But now it’s definitely made its way to the mainstream. Discretionary funds, private equity, corp. dev., etc., all definitely mine this information. Supply chain data has always been required for research in traditional portfolio management. And there’s been a few big success stories around things like ESG data, which has definitely made its way into the mainstream.
“We always tend to think about people who want to consume this data as traders, quants or portfolio managers,” continues Richards. “But there’s also a lot of consumption within corporate development, M&A, private equity, and even the investor relations side of the business.”
BattleFin’s LoPresti agrees that it’s not just systematic and quant hedge funds that can benefit from DMs. “A market still to be tapped is the smaller AUM players that need more help,” he says. “Most do not have internal data science teams. So for them, data discovery needs to be more efficient, more intuitive, with strong data visualisation, and correlation of data sets to data buyer KPIs.”
Correlating and aggregating alternative data sets in a meaningful way presents its own challenges, says Alqami’s Campbell. “The fundamental difference between the traditional market data platforms versus what you see in the alternative data space, is that the Bloombergs of the world ultimately provide a level of consolidation and aggregation of data for their users, as opposed to the alternative data space, which is currently mostly individual private data sets or data providers, which creates fragmentation, making it difficult for users to identify and source such data sets through these types of DM. So the consolidation and the ability to extract value definitely still sits with the buyer and consumer.”
Currently, the majority of data consumed via DMs is historic, rather than real-time or streaming data. TickSmith’s Wenzel points out there are good reasons for that.
“Streaming data usually implies that the data is going to arrive in some specific protocol,” he says. “For example, if I’m getting real-time data, am I getting it as a synchronous feed? How do I need to connect to it? Do I need a piece of technology from the data vendor at the receiving end to be able to manage the connection? Because of all of that, the selling of streaming data and the delivery through DMs is not happening at the same pace as other types of packages.”
In specific use cases, however, DMs could provide an appropriate mechanism for accessing streaming data, says ipushpull’s Cheung. “There are various different requirements around streaming data, there’s the low latency trading stuff, and then there’s everything else,” he says. “For everything else, the cloud is absolutely fine. Anything that’s exchange traded, there needs to be low latency, because otherwise you’re behind the pack when you’re trying to trade. But for anything that’s not in that world, milliseconds (as opposed to microseconds) are absolutely fine. Risk calculations, maybe some internal sharing of data, things like axes, or some of the OTC market data, which doesn’t actually refresh that quick anyway, so it doesn’t really matter if you’re adding in another half second delay, because it doesn’t make much difference.”
Data entitlements, usage rights and licensing
Content owners generally need to understand where their data is going, who is using it, and (to an extent) what for. Traditional market data vendors such as Bloomberg and Refinitiv have well-established data entitlements and licensing systems to take care of all of that. But what about DMs? How can they ensure that data usage rights are observed, particularly as users of such e-commerce platforms expect to be able to go through the whole process of data purchase seamlessly?
“The DM needs to be able to integrate with the vendor’s licensing systems,” says TickSmith’s Doyen. “If they want to control the licensing system themselves, it limits the types of data that they’re going to be able to sell, and therefore limits the firms that they’re going to be able to work with. From the data consumer perspective, they want to have it streamlined in a single session, integrated into the marketplace so that they can finish their whole purchasing process all in one self-service type purchasing flow. That’s just the reality of where the world is moving to, so the data sales space needs to match that, by providing an opportunity for people to go and find what it is that they’re looking for, purchase it in a single session, and only calling or talking to a specialist if they need to get some really explicit information.”
How can this work in practice?
“The implementations vary,” says Crux Informatics’ Rude. “In all cases, I would argue that the supplier has control over who gets access to the data, there are entitlements in all of them. There’s an entitlements workflow, so if a client or a prospect goes and clicks on a tile, the message request goes to whoever owns that tile. Once the data supplier receives that request, they can do all the things that they need to do and then entitle the client to access the data, if that’s the outcome. That gives them control. But the workflows and the line of sight vary depending on platform.”
This whole process can become quite complex, so the DM needs to be able to manage that complexity, says IOWArocks’ Gissing. “DMs need to provide flexible and robust permissioning systems that enable granular control over data access by subscribers,” he says. “IOWArocks allows data to be permissioned right down to the individual instrument level if required. It also facilitates control over which delivery channels a subscriber can use (e.g. direct to Excel, via an API, etc.), including how many connection instances are supported per subscriber. The intention is that data providers can manage permissions themselves if they wish, giving them direct control over their subscribers’ data access. This needs to be backed up with comprehensive auditing of each subscriber’s data usage to give providers confidence that their usage is appropriate, and metrics around the specific data items that are most and least used, providing better insight into the value of their data. We also believe it is essential that, in most cases, data consumers license the data directly from the providers using their agreements, giving the providers complete control over the licence terms.”
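To make the permissioning model described above more concrete, here is a minimal Python sketch of instrument-level entitlements, per-channel access, a cap on connection instances and an audit trail. Every class, method and identifier in it (`Entitlement`, `PermissionStore`, `fund_a`, the ticker symbols) is hypothetical; it is not IOWArocks' API or any vendor's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Entitlement:
    instruments: set          # instruments the subscriber may access
    channels: set             # permitted delivery channels, e.g. {"api", "excel"}
    max_connections: int = 1  # connection instances allowed per subscriber


@dataclass
class PermissionStore:
    entitlements: dict = field(default_factory=dict)  # subscriber -> Entitlement
    active: dict = field(default_factory=dict)        # subscriber -> open connections
    audit_log: list = field(default_factory=list)     # every decision is recorded

    def grant(self, subscriber, entitlement):
        """The data provider entitles a subscriber."""
        self.entitlements[subscriber] = entitlement

    def connect(self, subscriber, channel):
        """Open a connection if the channel is permitted and the cap is not reached."""
        ent = self.entitlements.get(subscriber)
        allowed = (
            ent is not None
            and channel in ent.channels
            and self.active.get(subscriber, 0) < ent.max_connections
        )
        if allowed:
            self.active[subscriber] = self.active.get(subscriber, 0) + 1
        self.audit_log.append(("connect", subscriber, channel, allowed))
        return allowed

    def request(self, subscriber, channel, instrument):
        """Check a single data request at the individual instrument level."""
        ent = self.entitlements.get(subscriber)
        allowed = (
            ent is not None
            and channel in ent.channels
            and instrument in ent.instruments
        )
        self.audit_log.append(("request", subscriber, instrument, allowed))
        return allowed


store = PermissionStore()
store.grant("fund_a", Entitlement({"VOD.L"}, {"api"}, max_connections=1))
print(store.connect("fund_a", "api"), store.request("fund_a", "api", "BP.L"))
```

Recording denied attempts alongside granted ones is a design choice in this sketch: it gives the provider both the usage audit and the most-used and least-used metrics mentioned in the quote from a single log.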
Data preparation and data quality
If a data-rich company wants to make its data available through a DM, what are the prerequisites? What does it need to do in order to prepare its data for distribution, for example?
“Clearly, the owner of the content is going to need to have some level of technical proficiency around where the data is in their own environment,” says Bishop. “At MarketDesk, we’ve always taken the view that we would provide the tools to enable people to get their data out of the organisation, and into the DM ecosystem. So we do the heavy lifting. And I don’t believe that we’re alone in realising that there are challenges for some firms to get their data to a DM. Offering technical tools to be able to facilitate that, without charging for it, does therefore make a huge amount of sense.”
Should content owners expect DMs to provide this kind of service free of charge?
“The larger and more successful DMs may be in a position, further down the line, to look at a particular asset class, and say to vendors that if they want to be on the platform, then they’re going to need to pay some level of subscription or data onboarding fee,” says Bishop. “And I expect that will happen quite a lot. The flip side though is that if there’s a particular asset class or data category that is very hot, very topical, the DM might swallow the cost to get the data onto the platform because they believe in it and are happy to take a little bit of risk on it.”
Rude warns against underestimating what’s involved in working with a DM. “There’s this entire chasm in the middle that happens between discovery and usage of analytics-ready data. That includes the validation, the transformation, and all the data preparation work that needs to happen, particularly as the use cases across firms are unique. That’s a very challenging problem, which firms are trying to solve for themselves.
“There’s a massive data gap in the cloud,” he continues. “If you’re a customer who wants to be in the cloud, where’s the data? That’s one of the things that the DMs are trying to solve for, but are the data suppliers actually going to go to the trouble of loading the data into the cloud provider/marketplace, build the tile, administer the tile, and manage those workflows? That is overhead in terms of work, resources and dollars. There’s a real supply and demand gap here. And it’s one of the things that Crux is in a position to solve, because it’s what we do for a living. We onboard data once, and we build the hooks so we can get the supplier into every one of those DMs, with literally no work on their part. And we also provide the 24×7 operations, the quality controls, and everything else that hopefully makes the experience that much better.”
Regarding the quality of data served through a DM, the general consensus is that it is the responsibility of the source data provider to ensure the quality of its data. Very few DMs would be willing to guarantee the accuracy or integrity of data delivered over their platforms. However, there are steps that DMs can take to provide greater assurance around quality of data, says JamLabs’ Richards.
“There are players in the data quality space that are now being courted, and included in DM conversations,” he says. “Increasingly, the DMs want to have a stamp of approval, and get that monkey off their back a little bit by having a data quality partner. There are service providers that are definitely stepping up in that regard, and they’re partnering with the DMs to help with the quality question. Because, like every intermediary, the DMs themselves are not going to take liability for or assure the quality. So the way they can get around that problem is to partner with third parties that can help generate some credibility, or quality statement.”
It’s clear that DMs are generating a lot of interest within the financial markets. But will they become an integral part of the data landscape?
“DMs definitely have a place in the market,” says Alqami’s Campbell. “So you’ll see a lot more of them popping up, because they’re trying to solve the problem of the fragmented data that is out there everywhere. And so they’re trying to create an aggregated list of data sets, which definitely solves a part of that problem. But I do think that the market will evolve as buyers and consumers of that data become more sophisticated. They expect data providers to do a certain amount of work upfront, they want to make sure the DMs will display or list these data sets in a consistent and structured manner, and take them through a certain amount of compliance.”
So what does the future hold for DMs?
“Data visualisation is the next big thing in data marketplaces,” says LoPresti. “As you start to expand the scope of potential users beyond the hedge funds that have teams of internal data scientists that can consume data in raw form, you need to do more to make the data consumable, and visualise it. As always, it starts with the discovery and identification of the data sets for a particular use case.”
To summarise, it seems that data marketplaces have the potential to offer significant benefits to both data consumers and data producers, but there are ongoing challenges that need to be addressed around clear definition and presentation of use cases, managing data entitlements and usage rights, and ensuring that the quality of the data is not compromised by going through a third-party delivery mechanism. If all of these challenges are met, then data marketplaces could potentially be a disruptive force in the financial markets of the future.