As many of you know, we like to spend some time with the marketplace to try to understand some of the nuances of the topics we cover, particularly as we prepare for our big conferences, like the forthcoming Intelligent Trading Summit in New York on May 13.
Attendees of our London equivalent last month will recall a keynote presentation from Steve Wilcockson, industry manager, financial services, at Mathworks, on the skills gap within the trading technology community, especially with respect to emerging talent.
While Steve’s presentation raised important issues with respect to human resource development in this space, he also helped us understand a key element of what we perceive to be intelligent trading, namely: machine learning. As we develop our agenda for the May 13 Intelligent Trading Summit in New York, we’ve found Steve’s insight invaluable, particularly with respect to some of the panels we’re pulling together on risk analytics, data-driven research and the like.
A couple of months back, Steve and his colleague – Tanya Morton, Applications Engineering Manager at Mathworks – were kind enough to brief me on some of the use-cases for the company’s MATLAB analytical framework. We all know MATLAB as a key component of many financial institutions’ proprietary analytical application development, but I thought it would be good to hear from the horse’s mouth just how the framework is being used.
Specifically, we talked about the role of machine learning in application development, and reviewed some of the results enjoyed by Mathworks customers. As we journey down the road toward Intelligent Trading, I found the session extremely useful. And I learned that machine learning has a major contribution to make as firms deal with the challenges of developing Big Data, algorithmic trading, sentiment analysis, fraud detection and risk applications.
Before we get into details of the use-cases, it’s worth taking a moment to ask: what is machine learning?
As I see it (and I hope Steve and Tanya will forgive me for the over-simplification), machine learning is all about developing applications/processes that learn from the data they consume in order to optimize performance.
Getting more technical, Mathworks says: “Machine learning algorithms use computational methods to ‘learn’ information directly from data without assuming a predetermined equation as a model. They can adaptively improve their performance as you increase the number of samples available for learning. Machine learning algorithms that develop decision-making rules by learning from labeled training data are known as ‘supervised learning’ algorithms. ‘Unsupervised learning’ algorithms can uncover useful patterns and structures from unlabeled data.”
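To make that supervised/unsupervised distinction concrete, here is a deliberately tiny sketch of my own – plain Python rather than MATLAB, and nothing to do with MathWorks’ implementation – showing a nearest-neighbour classifier learning a decision rule from labelled data, next to a two-centroid clustering routine uncovering structure in unlabelled data. The trading labels and price levels are invented for illustration.

```python
# Supervised learning: a decision rule learned from labelled training data.
def nearest_neighbour_classify(labelled, point):
    """Return the label of the training example closest to `point`."""
    return min(labelled, key=lambda xy: abs(xy[0] - point))[1]

# Unsupervised learning: uncover structure (two clusters) in unlabelled data.
def two_means(points, iterations=10):
    """A bare-bones 2-means clustering; returns the two centroids."""
    a, b = min(points), max(points)  # initial centroid guesses
    for _ in range(iterations):
        cluster_a = [p for p in points if abs(p - a) <= abs(p - b)]
        cluster_b = [p for p in points if abs(p - a) > abs(p - b)]
        a = sum(cluster_a) / len(cluster_a)
        b = sum(cluster_b) / len(cluster_b)
    return a, b

labelled = [(0.9, "sell"), (1.1, "sell"), (4.8, "buy"), (5.2, "buy")]
print(nearest_neighbour_classify(labelled, 4.5))  # -> buy
print(two_means([0.9, 1.1, 4.8, 5.2]))            # centroids near 1.0 and 5.0
```

The classifier needs the labels to learn anything; the clustering routine finds the two groups without ever seeing a label – that is the whole distinction in miniature.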
The use of the term ‘adaptive’ in this description certainly resonates. For one thing, the most exciting aspect of the Algorithmic Trading Directory we published for several years – now being considered for a revival – was the emergence of so-called adaptive algorithms from many of the major sell-side firms and prime brokerage players. These, at the time, were seen as a key factor in differentiating algorithmic service offerings from ‘standard’ models like VWAP and the rest.
For another, we have been hearing about adaptive trading from market practitioners for some time, and kicked around the term as we considered launching Intelligent Trading Technology. In the event, we decided the former was a subset of the latter – the rest is history, as they say.
But back to the matter in hand.
A typical machine learning workflow, according to Steve and Tanya, involves importing the data set for analysis, cleansing it and identifying variables of interest. The data is then applied to the model, and the model ‘trained’, with measurements taken to assess how accurately the model reflects the real-world environment.
An iterative approach allows improvement of the model as it consumes more data and effectively ‘learns’ from it. Different models can be applied to the same data, and the best-performing model selected for the task in hand.
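The workflow Steve and Tanya described – import, cleanse, train, measure out-of-sample, pick the best model – can be sketched in a few lines. This is my own hypothetical miniature in Python (the data points, model names and helper functions are all invented), comparing a constant-mean predictor against a simple least-squares line and keeping whichever scores better on held-out data.

```python
# Candidate model 1: predict the mean of the training targets.
def fit_mean(train):
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

# Candidate model 2: ordinary least-squares line through the training data.
def fit_line(train):
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: my + slope * (x - mx)

# Mean squared error: how well a model fits a data set.
def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

raw = [(1, 2.1), (2, 3.9), (3, 6.2), (None, 9.9), (4, 8.1), (5, 9.8)]
clean = [(x, y) for x, y in raw if x is not None]  # cleanse: drop bad rows
train, test = clean[:3], clean[3:]                 # hold out data for scoring
models = {"mean": fit_mean(train), "linear": fit_line(train)}
best = min(models, key=lambda name: mse(models[name], test))
print(best)  # -> linear
```

The point is the shape of the loop, not the models: any number of candidates can be trained on the same data and ranked by their out-of-sample error before one is selected for the task in hand.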
Enough of the theory. What’s the real-world application?
Steve and Tanya were kind enough to supply me with a set of quotes from real customers that do a better job of explaining how machine learning is helping them than a thousand of my words ever could. Here are a few:
“I use Bayesian estimation, Markov Chain Monte Carlo, dynamic Bayesian networks, Hidden Markov Modelling and various classification algorithms: SVMs [support vector machines] and decision trees.” – US investment manager
“I developed and traded my own intra-day, trend-following G10 FX strategies, which used a unique combination of traditional machine learning algorithms (Neural and Bayesian Networks) with a Genetic Algorithm optimization wrapper.” – UK investment banker
“I risk-managed a guy who was terrible for over-fitting. His models were optimised to within an inch of his life and did not work out of sample. They were too oriented to the noise…” – UK prop trading firm
“No matter what cool algorithms we threw at the testbench and then live, simple linear modelling worked surprisingly well; we could understand the model, apply judgment over risk factors and model parameters. Far more satisfying.” – UK systematic fund manager
“I use a range of machine learning classification algorithms to aggregate useful index, stock and economic information from which I build my portfolio strategies.” – Portfolio advisor, Tier 1 UK bank
“We are going to use machine learning tools to analyze predictability in publicly available daily stock returns.” – US prop trading firm
“I would like to hear your experience on the use of state space models in stat arb. I do believe they offer a superior way to model the equilibrium dynamically allowing it to evolve through time. The tricky part is how to deal with the risk of over-fitting.” – US hedge fund
“I started to use state space models to get a framework for testing parameter stability to avoid over-fitting.” – Danish fund manager
The term ‘over-fitting’ cited by several of these practitioners refers to the practice of building an overly complex model to explain idiosyncrasies in the data under study. Often, the data contains some degree of error or random noise, and forcing the model to conform too closely to it can introduce new errors and impair the model’s predictive performance.
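The prop trader’s complaint about models “optimised to within an inch of his life” is easy to reproduce. In this hypothetical Python sketch of mine (the noisy data is invented; the underlying relationship is roughly y = 2x), a degree-4 polynomial passes through every training point exactly – zero in-sample error – yet extrapolates far worse than a humble least-squares line, because it has fitted the noise rather than the signal.

```python
# Over-fit model: the Lagrange interpolating polynomial hits every
# training point exactly, noise and all.
def lagrange(train):
    def poly(x):
        total = 0.0
        for i, (xi, yi) in enumerate(train):
            term = yi
            for j, (xj, _) in enumerate(train):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return poly

# Simple model: a least-squares line tolerates the noise instead of chasing it.
def linear_fit(train):
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: my + slope * (x - mx)

# y is roughly 2x plus noise; the point (5, 10.2) is held out of training.
train = [(0, 0.1), (1, 2.3), (2, 3.8), (3, 6.4), (4, 7.9)]
x_new, y_new = 5, 10.2
overfit_error = abs(lagrange(train)(x_new) - y_new)
honest_error = abs(linear_fit(train)(x_new) - y_new)
print(overfit_error > honest_error)  # -> True: perfect in sample, poor out of sample
```

This is exactly the “did not work out of sample” failure mode: in-sample fit is a poor guide to predictive performance once the model is complex enough to memorise the noise.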
I hope this gives a little flavour of what can be done with machine learning. The majority of the above examples are targeted on quantitative analysis of market opportunities, but machine learning also has applications within the risk and fraud detection spaces.
For the New York Intelligent Trading Summit, we’ll be looking at how machine learning – and Big Data and other emerging technologies – can be deployed within the pre-trade risk and data-driven research areas, among others.
In particular, our 2pm panel – Technology Advances for High Performance Trading and Analytics – will consider recent technology innovation at both the hardware platform and system software level in order to reduce latency for compute-based trading and support big data-driven analytics.
After that, at 4.15pm, we’ll have a panel looking at Data-Driven Analytics and Research, moderated by Mike Mayhew of Integrity Research Associates, to explore how the buy side is co-opting big data technologies to drive new forms of data-driven research.
We hope you can come join us, and learn more about this hot topic.