When discussing big data, the point is often made that deriving value from the data, through analytics, is where the real business win is. Panopticon Software has been in the analysis business – specifically data visualisation – since before big data was a common term. So we found out more about the company, and how it is involved in the capital markets, from its managing director, Willem De Geer.
Q: How did Panopticon Software get its start, and what is its focus?
A: Development of Panopticon visual analysis tools began in 1999 at an emerging market brokerage, Brunswick Direct. The CTO, Sam Giertz, was facing a difficult challenge: how to efficiently deliver quality information on thousands of financial instruments in over 30 markets in a very short period of time. Giertz wanted his clients to be able to make sense of huge volumes of information quickly and then be able to make informed investment decisions based on that information.
As a co-founder at Brunswick Direct, I worked with the development team to engineer the first version of what is now the Panopticon visual data monitoring and analysis system. Panopticon Software AB was founded in Sweden as an independent company in 2002.
Q: What is data visualisation, who uses it, and for what?
A: Data visualisation encompasses everything from simple pie charts to quite sophisticated interactive visual data analysis tools. In many instances, relatively simple executive dashboard tools and infographics are adequate and can be quite useful. However, Panopticon goes well beyond the presentation of data. Our software is designed to help people isolate outliers, expose hidden trends and clusters, and really dig deep into very large, fast-changing data sets. We specialise in interactive data visualisation, which includes the ability to filter the data on the fly – even so-called big data sources that contain billions of rows of data that is changing in real time.
Q: What kind of visualisations do you provide, and for what kinds of applications are they appropriate?
A: Our software supports over 25 visualisations that are optimised for interactive analysis. We are quite well known for our Treemap and Heatmap tools, for example, since they offer multi-dimensional analysis capabilities that most other vendors simply can’t match.
Many of our visualisations are designed specifically to handle time series data; this is data that is time-stamped. Examples of time series data include trading data and market feeds. In most cases, time series data is stored in columnar databases like SAP Sybase IQ, Kx kdb+, OneTick, or Thomson Reuters Velocity Analytics – or even in specialised in-memory systems like SAP HANA. Processing, filtering and visualising time series data is quite demanding, and we’ve made substantial R&D investments to solve those problems. As a result, the system not only accesses and processes the data effectively, it does so fast enough that users can filter, change hierarchies, and interact with the data on the fly without waiting for recalculations or new cubes to be built.
We are probably most famous for our ability to visualise real-time streaming sources, including Complex Event Processing engines and message queues. There’s really no limit to the speed of updates our software can handle; it’s fair to say that the software can handle data velocities that far exceed (by orders of magnitude) the capabilities of human comprehension. What I mean by that is, we’re putting data up on the screen in these visualisations and we actually have to incorporate specialised (and configurable) throttling controls to allow the designer to create dashboards that are easy to understand.
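As a rough illustration of what a throttling control of this kind might do, the sketch below conflates a fast feed: it keeps only the latest value per instrument and releases a snapshot to the display at most once per interval. It is not Panopticon's implementation; the class name `ConflatingThrottle`, its parameters, and the sample ticks are all assumptions made for the example.

```python
class ConflatingThrottle:
    """Keeps the latest value per key and releases a snapshot at most once per interval.

    Hypothetical helper for illustration only.
    """

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self._pending = {}        # key -> most recent value not yet displayed
        self._last_flush = None   # timestamp of the last released snapshot

    def on_update(self, key, value, now):
        """Record an update; return a snapshot dict if a refresh is due, else None."""
        self._pending[key] = value  # newer values overwrite older ones (conflation)
        due = self._last_flush is None or now - self._last_flush >= self.min_interval_s
        if not due:
            return None
        snapshot = dict(self._pending)
        self._pending.clear()
        self._last_flush = now
        return snapshot


# Five made-up ticks arriving within 0.7 seconds, with a 0.5 second refresh limit.
throttle = ConflatingThrottle(min_interval_s=0.5)
ticks = [
    (0.00, "ABC", 10.0),
    (0.10, "ABC", 10.1),
    (0.20, "XYZ", 5.0),
    (0.60, "ABC", 10.2),
    (0.70, "XYZ", 5.1),
]
frames = []
for timestamp, symbol, price in ticks:
    frame = throttle.on_update(symbol, price, now=timestamp)
    if frame is not None:
        frames.append((timestamp, frame))

for timestamp, frame in frames:
    print(timestamp, frame)
```

Here five ticks produce only two screen refreshes, and the intermediate price for ABC is never drawn. A designer would tune the interval to whatever the viewer can reasonably absorb.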
Q: What kind of datasets and sources can be ingested into Panopticon? Does that data have to be very structured? What volume and velocity of data can be accommodated?
A: Here’s an important point: Panopticon does not really “ingest” data. We connect to your data sources and we do not have an underlying data repository as part of our product. Most of our competitors’ systems do use an internal data repository, which we feel adds unnecessary complexity to the architecture. In addition, this requirement means that those other systems have a lot of latency, since data must first be loaded into that internal repository before anything can be done with it. If you’re talking about big data volumes, which we certainly are in capital markets applications, this is far from trivial.
Basically, we let you leverage your existing data infrastructure and we connect to your data sources directly. We have native connectors for quite a wide variety of sources, including:
- Real-time streaming message buses, including Apache ActiveMQ and Qpid
- CEP engines, including SAP’s Sybase Event Stream Processor, Oracle CEP, OneTick CEP, Kx kdb+tick and StreamBase
- Column-oriented tick databases, including Kx kdb+, SAP’s Sybase IQ, OneTick, and Thomson Reuters Velocity Analytics
- OData sources like the SAP Netweaver Gateway
- SAP HANA high performance in-memory analytic appliance
- Relational databases like IBM DB2, Oracle, Teradata, SQL Server, and MySQL
- Flat files, including Excel, SVG, CSV, and XML
Our in-memory data model allows you to federate data from multiple sources, create new calculated columns, and incorporate data from any number of sources into a single visualisation or dashboard.
You can also add new connectors for new data sources with minimal coding. For example, you could add a streaming connector for a proprietary message bus or a polling connector for a non-standard database or web service. Users can then access the data stored in these systems and federate it with information from other sources in their analytical dashboards.
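To make the distinction between polling and streaming connectors concrete, here is a minimal sketch of how the two styles could be modelled and then federated into one set of dashboard rows with a calculated column. The interfaces `PollingConnector` and `StreamingConnector`, the stand-in sources, and the `federate` function are hypothetical names invented for this example; they are not Panopticon's connector API.

```python
from abc import ABC, abstractmethod


class PollingConnector(ABC):
    """Hypothetical interface: the host asks for the current rows on a timer."""

    @abstractmethod
    def poll(self):
        """Return a list of row dicts."""


class StreamingConnector(ABC):
    """Hypothetical interface: the source pushes rows to subscribers as they arrive."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def _publish(self, row):
        for callback in self._subscribers:
            callback(row)


class InMemoryPositions(PollingConnector):
    """Stand-in for a non-standard database or web service."""

    def poll(self):
        return [
            {"symbol": "ABC", "quantity": 100},
            {"symbol": "XYZ", "quantity": -50},
        ]


class FakePriceBus(StreamingConnector):
    """Stand-in for a proprietary message bus."""

    def inject(self, symbol, price):
        self._publish({"symbol": symbol, "price": price})


def federate(positions, latest_prices):
    """Join polled positions with streamed prices and add a calculated column."""
    rows = []
    for position in positions:
        price = latest_prices.get(position["symbol"])
        if price is None:
            continue  # no price seen yet for this symbol
        market_value = position["quantity"] * price
        rows.append({**position, "price": price, "market_value": market_value})
    return rows


latest_prices = {}
bus = FakePriceBus()
bus.subscribe(lambda row: latest_prices.update({row["symbol"]: row["price"]}))
bus.inject("ABC", 10.0)
bus.inject("XYZ", 4.0)
bus.inject("ABC", 10.5)  # a later tick replaces the earlier ABC price

dashboard_rows = federate(InMemoryPositions().poll(), latest_prices)
for row in dashboard_rows:
    print(row)
```

The point of the sketch is only that the two sources stay where they are: positions are fetched on demand, prices are pushed, and the join plus the `market_value` column exist only in memory.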
Q: What technologies and techniques are used by Panopticon to cope with large and real time datasets and to provide timely analysis?
A: We’re in the sixth generation of the Panopticon product, so it’s quite sophisticated, of course. However, at the heart of everything is something we began developing in 1999: our in-memory OLAP data model, called StreamCube. It provides the rich OLAP functionality that enables the fast analysis, filtering, slicing and dicing, and so on that you see in the data visualisations. The data model can handle traditional multidimensional analysis of static data (such as the kind specified in the MDX language) as well as dynamic data streams, like real-time feeds from CEP engines or message buses.
The Panopticon StreamCube is specifically designed to make extremely fast calculations using continuously updated data. This involves being able to bind aspects of the data to multiple visuals and associated analysis controls. Most data models either cannot handle streaming data, or their performance degrades significantly when presented with large streaming feeds of real-time data. Essentially, the StreamCube is the component in our system that really differentiates us in terms of analytical capabilities.
Our data model supports multidimensional data, static time series analysis (time cubes), streaming time series analysis, aggregations and calculations, externally sourced aggregations, and custom aggregation calculations. The architecture also allows for multiple, independent, in-memory cubes to exist within a single instance of Panopticon.
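One general way an in-memory cube can stay fast under continuous updates is to adjust its aggregates incrementally rather than recomputing them from scratch. The sketch below shows that idea for a summed measure across two dimensions. It is an illustration of the general technique only, not a description of how StreamCube works internally; the class `IncrementalCube` and the sample trades are assumptions.

```python
from collections import defaultdict


class IncrementalCube:
    """Toy in-memory cube that maintains per-dimension sums as rows are updated.

    Hypothetical illustration only.
    """

    def __init__(self, dimensions, measure):
        self.dimensions = dimensions
        self.measure = measure
        self._rows = {}  # row key -> latest version of the row
        self._totals = {dim: defaultdict(float) for dim in dimensions}

    def upsert(self, key, row):
        """Insert a new row or replace an existing one, adjusting totals in place."""
        previous = self._rows.get(key)
        if previous is not None:
            self._apply(previous, -1)  # back out the old contribution
        self._apply(row, +1)
        self._rows[key] = row

    def _apply(self, row, sign):
        for dim in self.dimensions:
            self._totals[dim][row[dim]] += sign * row[self.measure]

    def total_by(self, dim):
        return dict(self._totals[dim])


cube = IncrementalCube(dimensions=["desk", "asset_class"], measure="exposure")
cube.upsert("t1", {"desk": "Rates", "asset_class": "Swap", "exposure": 100.0})
cube.upsert("t2", {"desk": "Rates", "asset_class": "Bond", "exposure": 50.0})
cube.upsert("t3", {"desk": "FX", "asset_class": "Swap", "exposure": 30.0})
# A streaming update to an existing trade touches only the affected buckets.
cube.upsert("t1", {"desk": "Rates", "asset_class": "Swap", "exposure": 120.0})

by_desk = cube.total_by("desk")
by_asset_class = cube.total_by("asset_class")
print(by_desk)
print(by_asset_class)
```

Each update costs work proportional to the number of dimensions rather than the number of rows, which is why this style of bookkeeping suits additive measures on fast feeds. It does not work for non-additive measures, which need a different treatment.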
Q: As trading firms increasingly explore big data sources and applications, where do you see Panopticon playing a role?
A: There are a lot of applications for this technology within the major banks and asset managers, but the most pronounced trend in use cases we’re seeing right now is related to risk monitoring and analysis. Our clients use Panopticon to monitor risk limit usage and to analyse risk data generated by external risk engines. Specific applications include monitoring and analysis of market risk, credit and counterparty risk, and liquidity risk.
As most of your readers probably know, the choice of VaR methodology is critical and better VaR models can lower a bank’s capital requirements and improve profitability. The demands placed on VaR and similar techniques have grown tremendously over the past few years. This growth is being driven by new products like correlation trading, multi-asset options, power-reverse dual currency swaps, swaps whose notional value amortises unpredictably, and dozens of other such innovations. Today, the number of risk factors required to price the trading book at a global institution has grown to several thousand, and sometimes as many as 10,000. Valuation models have become increasingly complex and most banks are now in the process of integrating new stress-testing analytics that can anticipate a broad spectrum of macroeconomic changes.
A huge change in the market is also that senior people now want to see – and understand – the risk profile at any level of the organisation on a real-time basis. That is, they want to know that the information they’re looking at is up-to-the-second accurate. On top of that, they need tools that allow them to really get a handle on where they stand. Traditional reporting tools and simple infographics just don’t cut it given the complexity and volumes of data involved.
This is where Panopticon adds a lot of value to a bank’s risk management infrastructure. With Panopticon, managers can display and interact with hierarchies like book structure, product structure, counterparty structure, geographic, industrial, issuer rating, asset class, and so on. They can drill and jump between data dimensions (for example, jump from book to positions to instrument to counterparty). Our system can also handle non-additive data sets like Value at Risk (VaR) where the aggregates must be retrieved from an external risk calculation engine.
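A small worked example shows why VaR cannot simply be summed up a hierarchy. The sketch uses a deliberately simple historical-simulation rule (the smallest loss that at least the given percentage of scenarios do not exceed) and made-up P&L scenarios for two books. The function `historical_var` and the dictionary standing in for an external risk engine are assumptions for illustration only.

```python
def historical_var(pnl_scenarios, confidence_pct):
    """Smallest loss L such that at least confidence_pct percent of scenarios lose <= L."""
    losses = sorted(-pnl for pnl in pnl_scenarios)  # ascending losses
    n = len(losses)
    rank = -(-confidence_pct * n // 100)  # ceiling division, integers only
    return losses[rank - 1]


# Made-up P&L per scenario for two books on the same desk (same scenario order).
book_a = [-10, 5, 3, -6, 4]
book_b = [8, -9, 2, 5, -7]
desk = [a + b for a, b in zip(book_a, book_b)]

var_a = historical_var(book_a, confidence_pct=80)
var_b = historical_var(book_b, confidence_pct=80)
naive_sum = var_a + var_b                            # what additive roll-up would show
desk_var = historical_var(desk, confidence_pct=80)   # what the risk engine would report

# Stand-in for aggregates retrieved from an external risk engine, keyed by node.
engine_aggregates = {"Book A": var_a, "Book B": var_b, "Desk": desk_var}

print("Sum of book VaRs:", naive_sum)
print("Desk VaR from engine:", engine_aggregates["Desk"])
```

In this example the two books partly offset each other, so the desk-level figure is far below the sum of the book-level figures. A dashboard that summed the children would overstate the risk, which is why the parent value has to come from the engine that computed it.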
Other major application areas for Panopticon are transaction cost analysis, trader activity surveillance, performance and attribution, and profitability analysis. All of these areas involve huge amounts of time series data, and the combination of our in-memory OLAP data model with our specialised time series data visualisations makes our software a great fit with the banks’ requirements.