Intelligent machine readable news is a powerful tool in the arsenal of trading firms seeking competitive advantage. It offers opportunities to turn unstructured data into actionable insight that can be used to uncover market trends, identify correlations and evaluate sentiment, but also raises challenges such as information sourcing, timing and contextualisation.
A recent A-Team Group webinar sponsored by Moody’s Analytics, Integrating Intelligent Machine Readable News, discussed use cases of machine readable news, practical approaches to development and integration, how to overcome challenges such as data quality and huge data volumes, and how to ensure successful projects.
The webinar was moderated by A-Team Group editor Sarah Underwood, and joined by Andrea Nardon, Chief Quant Officer at Black Alpha Capital; Gurraj Singh Sangha, Chief Quantitative Investment Officer at Token Metrics; Sergio Gago, Managing Director of Media Solutions at Moody’s Analytics; and Saeed Amen, founder of Cuemacro.
Setting the scene, an early audience poll question on the take-up of intelligent machine readable news showed 33% of respondents using the technology to a great extent, a similar percentage using it to some extent, 22% planning to use it, and 11% using machine readable news to the greatest extent possible.
Providing a definition of machine readable news, Nardon said: “I think it’s very straightforward. It’s a technology that enables you to process unstructured data like news in the form of voice or text, run analytics and add the results to an investment process.”
Agreeing with this high-level definition and drilling down, Gago added: “We have to go deep into each of the elements of machine readable news. These are things that are happening around us in the world that are relevant, but relevant to what, and for whom? When it comes to sentiment, there is no single definition, as traders will take different things from the same event or press release.”
The challenge, he added, is getting a signal from the noise. The answer: machine-based solutions that not only extract signals, but also consider their time relevance, perhaps the relevance of something that happened 100 milliseconds ago, and how events are related to each other. He explained: “What we are trying to do is move forward from press releases talking about a specific company. The alpha about the company could be generated three events away, or three elements away. So how do we cluster these things together? This is about finding signals that are relevant in a customised way for each user.”
From a data management perspective, knowledge graph technology can be used to tie data together regardless of when events occur. “Some events happen over a long period of time, things that are building up towards a default, for example. If you are able to manage events, timestamps, links to different companies, elements from the macro level, and things that are related to more geopolitical elements in a knowledge graph, required information can be provided to you or an algorithm,” he said.
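The idea of linking timestamped events to companies so that a slow build-up (such as the path towards a default) can be retrieved as one timeline can be sketched in a few lines. This is a minimal illustrative sketch using an in-memory structure; the entity names, fields, and example events are hypothetical and do not reflect any vendor's schema.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Event:
    ts: datetime                                  # accurate event timestamp
    text: str                                     # headline or summary
    companies: set = field(default_factory=set)   # linked entities

class NewsGraph:
    """Tiny event-to-company graph: nodes are events and companies,
    edges are the 'mentions' links stored in by_company."""

    def __init__(self):
        self.events = []
        self.by_company = {}  # company -> list of linked events

    def add(self, event):
        self.events.append(event)
        for c in event.companies:
            self.by_company.setdefault(c, []).append(event)

    def timeline(self, company):
        """All events linked to a company, in time order."""
        return sorted(self.by_company.get(company, []), key=lambda e: e.ts)

# Hypothetical events building up over months
g = NewsGraph()
g.add(Event(datetime(2022, 3, 1), "Supplier misses payment", {"ACME"}))
g.add(Event(datetime(2022, 5, 2), "ACME downgraded", {"ACME"}))
g.add(Event(datetime(2022, 4, 1), "Sector outlook cut", {"ACME", "BETA"}))

print([e.text for e in g.timeline("ACME")])
```

A production system would of course use a real graph store and entity resolution, but the query pattern is the same: follow links from a company node to every timestamped event that touches it.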
Considering use cases of machine readable news, a second audience poll showed 43% of respondents using the technology to identify market trends, and around 30% using it to evaluate market sentiment, discover correlations, or build predictive models.
The webinar speakers agreed that key uses of the technology are at a micro level, making timestamps critical to success. Amen said: “Accurate timestamps are important for all sorts of use cases, especially in low latency cases such as high frequency trading.” He went on to describe an event such as a press conference held by a senior US politician. “This would have a low latency reaction,” he said. “For example, market makers suddenly widening spreads because there’s something the politician has said, and they need to be quickest to respond before they are steamrolled by people taking liquidity. This is not necessarily about P&L, it’s also about risk management.”
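The role of accurate timestamps in a low-latency consumer can be shown with a simple staleness check: compare the publisher's event timestamp against local receipt time and act only within a latency budget. This is an illustrative sketch; the 5 ms budget and function names are hypothetical, not figures from the webinar.

```python
from datetime import datetime, timezone

STALE_AFTER_MS = 5.0  # hypothetical latency budget for a fast consumer

def is_actionable(event_ts: datetime, received_ts: datetime) -> bool:
    """True if the news item arrived within the latency budget.

    Relies on the feed providing an accurate, timezone-aware event
    timestamp - without it, staleness cannot be measured at all.
    """
    age_ms = (received_ts - event_ts).total_seconds() * 1000.0
    return 0.0 <= age_ms <= STALE_AFTER_MS

t0 = datetime(2022, 1, 1, 12, 0, 0, 0, tzinfo=timezone.utc)
fresh = t0.replace(microsecond=3000)    # arrived 3 ms later
stale = t0.replace(second=1)            # arrived a full second later

print(is_actionable(t0, fresh))  # within budget
print(is_actionable(t0, stale))  # too late to act on
```

The same comparison also flags clock problems: a negative age means the publisher's and consumer's clocks disagree, which is itself a data quality signal.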
A final audience poll questioned the challenges of integrating intelligent machine readable news. Some 63% of respondents cited huge volumes of data, 50% data accuracy, 38% integration, 38% lack of budget/resource, and 13% data timeliness.
Gago noted accuracy and timeliness as key challenges faced by Moody’s Analytics’ clients. On accuracy, he said: “You need to have an algorithm that knows a news story is talking about particular companies and has a high level of accuracy. If you get a false positive and use the signal, you may lose a lot of money. The first thing our clients ask for is SLAs on the accuracy of our models.”
On timing, he added: “If you go past the millisecond barrier you’re late for market makers and similar types of signals. The challenge here is that typical data infrastructure in the cloud that you would build as a pipeline for many other cases doesn’t cut it here. You need to be in co-located infrastructure where there is no network latency between the generation of a story, the structuring of the story, data extraction, and the delivery of content.”
The challenge of huge volumes of data can be addressed on a practical basis by using a third-party service provider that transforms unstructured news content into machine readable feeds, and selecting required feeds. Amen said: “As a starting point, look at the types of daily news traders tend to follow, as well as social media, then automate that process through machine readable news. Ultimately, you want to combine several different sources to get a good insight into what’s happening.”
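Combining several sources into one insight, as described above, often comes down to a weighted blend: each feed scores a story for sentiment and relevance, and a relevance-weighted average produces the combined signal. The following is a minimal sketch under that assumption; the feed mix, score ranges, and weights are illustrative only.

```python
def blend_sentiment(feed_scores):
    """Relevance-weighted average of per-feed sentiment.

    feed_scores: list of (sentiment, relevance) pairs, where sentiment
    is in [-1, 1] and relevance in [0, 1]. Returns 0.0 if no feed
    considers the story relevant at all.
    """
    numerator = sum(s * r for s, r in feed_scores)
    denominator = sum(r for _, r in feed_scores)
    return numerator / denominator if denominator else 0.0

# Hypothetical scores for one ticker from three feed types
scores = [
    (0.6, 0.9),   # newswire story, highly relevant
    (-0.2, 0.3),  # social media chatter, weakly relevant
    (0.4, 0.5),   # press release feed
]

print(round(blend_sentiment(scores), 3))
```

Weighting by relevance keeps a loosely related social media post from drowning out a directly relevant newswire story, which reflects the advice to start from the sources traders already follow and combine them deliberately.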
Nardon agreed on a measured approach that starts with clear goals for a machine readable news project. “There is a tendency for people to throw a lot of data into their models and make things complicated,” he said. “You need to be able to trust the models that you build for trading, and to trust them, you need to understand them. The simpler they are, the better.”
Providing some final advice for the webinar audience, Amen said: “Use your domain expertise to filter the news in the right way, and then do the number crunching – but have a nice idea to begin with to reduce the scale of the problem.” Gago added: “Test, test and keep testing, and make processes as simple as possible, perhaps using smaller data sets, and then test again.”