About a-team Marketing Services
The knowledge platform for the financial technology industry

A-Team Insight Blogs

Q&A: ParStream’s Mike Hummel on Bringing Low Latency to Big Data

Bringing low latency to the world of big data is what ParStream – which recently raised $5.6 million in series A funding – has been working on for several years, with some impressive results.  We talked to the company’s CEO Mike Hummel to find out more about the company and its technology.

Q: First, can you describe how ParStream got started, and what business problem are you looking to solve?

A: We founded ParStream in 2008 after we had identified a lack of database technology enabling real-time big data applications for our customers.  Four years ago, our company got a contract to build a search engine for a travel package offering.  The application had to search through about seven billion data records against 20 parameters in less than 100 milliseconds.  We tried a lot of different database technologies, including NoSQL, but nothing worked.  This motivated us to develop our own technology.

Today ParStream enables customers across several industries to gain new insights from big data in real time. It is currently used in marketing analytics, customer analytics, operations analytics, research and other scenarios.

Q: What are the benefits of choosing ParStream compared to a Hadoop approach?

A: ParStream is built for real-time analytics on big data while the data is continuously imported. That enables the user to act on big data with low latency.  ParStream gives the flexibility of a full drill-down into datasets with billions of records; there is no need to use cubes, materialised views, projections or any other form of pre-aggregation.

Newer technologies such as Google’s MapReduce, and its open-source Hadoop derivative, are able to decompose the query into many independent pieces, just like the ParStream software.  But the MapReduce technology is more suited for batch-mode processing, rather than real-time analytics. ParStream’s customers had tried the MapReduce scheme and encountered those limitations.  In fact, Google itself abandoned MapReduce for query-type searching.

Another benefit of ParStream is its SQL interface.  Developers know SQL very well and SQL is the perfect interface to integrate analytic tools.  Therefore, ParStream can be easily adopted and integrated in the given infrastructure.
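As a rough illustration of what an ad hoc SQL drill-down without pre-aggregation looks like, the Python sketch below aggregates raw rows at query time. It uses the standard library's sqlite3 module purely as a stand-in: ParStream's own SQL dialect and client interface are not described in this interview, and the `clicks` table, its columns and the `drill_down` helper are hypothetical.

```python
import sqlite3

# Stand-in database: sqlite3 is used only to show the shape of the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (country TEXT, device TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [
        ("DE", "mobile", 1.5),
        ("DE", "desktop", 2.0),
        ("US", "mobile", 3.0),
        ("DE", "mobile", 0.5),
    ],
)

def drill_down(conn, country):
    """Aggregate the raw rows for one country by device at query time.

    No cube or materialised view is consulted; the GROUP BY runs over
    the stored records directly.
    """
    cursor = conn.execute(
        "SELECT device, COUNT(*), SUM(revenue) FROM clicks "
        "WHERE country = ? GROUP BY device ORDER BY device",
        (country,),
    )
    return cursor.fetchall()

rows = drill_down(conn, "DE")
print(rows)
```

Because the statement is ordinary SQL, an analytics tool that can issue SQL could in principle send the same query unchanged, which is the integration point the answer above describes.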

Q: What are the principal technology approaches that you’re leveraging to perform analytics with low latency?  And how do they complement one another?

A: As suggested by its name, the ParStream software performs parallel streaming of data structures. The technology is ideal for very large amounts of structured and semi-structured data – the database can have thousands of columns and billions of rows.  The secret is to parallelise each query such that it can be processed simultaneously on many cores spread across multiple nodes.  In a cluster environment, the data is sharded, i.e. stored on individual servers in a “shared nothing” environment.  As ParStream processes data locally on nodes, there is very little data traffic between the servers.  This is the reason ParStream’s performance can scale linearly with the cluster size; doubling processors or nodes doubles throughput.
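The shared-nothing pattern described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not ParStream's implementation: the shards are in-memory lists, threads stand in for separate nodes, and the names `SHARDS`, `query_shard` and `parallel_query` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each shard is a list of (country, price) rows held by one node.
SHARDS = [
    [("DE", 120), ("US", 300), ("DE", 80)],
    [("US", 150), ("FR", 200)],
    [("DE", 50), ("FR", 75), ("US", 90)],
]

def query_shard(rows, max_price):
    """Filter and count locally on one shard; only the small result leaves the node."""
    counts = Counter()
    for country, price in rows:
        if price <= max_price:
            counts[country] += 1
    return counts

def parallel_query(shards, max_price):
    """Fan the same query out to every shard, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, max_price), shards)
        for partial in partials:
            total.update(partial)
    return dict(total)

result = parallel_query(SHARDS, max_price=150)
print(result)
```

Each shard returns only a handful of counters rather than its rows, which is why traffic between servers stays small in this kind of design. In CPython the threads here do not run the filtering truly in parallel; they only mimic independent nodes.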

But it’s not just about query parallelisation.  ParStream’s real secret sauce is our index structure.  We invented a unique indexing technology, the High Performance Compressed Index (HPCI), which allows ultra-fast searches directly on compressed bitmap indices.  This approach has been gaining recognition for ParStream, including recently being named a “Cool Vendor” by Gartner Research.
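For readers unfamiliar with bitmap indices, the sketch below shows the general idea in Python. It is a plain, uncompressed bitmap index and makes no attempt to reproduce HPCI, whose details are not given here; the column data and the helper names `build_bitmap_index` and `matching_rows` are hypothetical.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)

def matching_rows(bitmap):
    """Decode a bitmap back into an ascending list of row ids."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows

# Hypothetical travel-offer columns, one entry per row.
destination = ["ES", "IT", "ES", "GR", "ES", "IT"]
board = ["half", "full", "full", "half", "full", "full"]

dest_index = build_bitmap_index(destination)
board_index = build_bitmap_index(board)

# WHERE destination = 'ES' AND board = 'full' becomes a single bitwise AND.
hits = matching_rows(dest_index["ES"] & board_index["full"])
print(hits)
```

Combining predicates turns into bitwise operations over whole bitmaps instead of row-by-row comparisons, which suits a search against many parameters at once. A production index would also compress the bitmaps, which this sketch leaves out.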

Q: What part do GPUs play in your architecture?

A: From the start we have optimised ParStream for a massively parallel architecture and we use bitmap indices.  Both innovations bring a performance boost on modern CPUs, but GPUs benefit from this technology even more.  The ParStream architecture and technology give our customers the advantage of using CPUs today with the opportunity to switch to GPUs later if they like.  Today our customers prefer running ParStream on CPUs, in other words commodity hardware, because it is easier and cheaper to integrate in their data center.

Q: What kind of performance is achievable using ParStream, and how does the deployment model – customer infrastructure, cloud, appliance – impact that?

A: ParStream delivers sub-second response times on billions of data records whilst continuously importing new data at very high speed.

We have built solutions with ParStream that range from real-time bidding with average response times of seven milliseconds on the AWS cloud up to live segmentation of web-click data, which needs multi-stage processing that executes in about two seconds on 10 billion records.  We all know that query response time depends on query type, data structure, data volume, etc., but these response times are typical.

ParStream is a software-only product that is available for many Linux distributions.  ParStream can be operated on a single server, a dedicated server-cluster or on virtualised infrastructures, like private or public clouds.  Of course, dedicated infrastructure provides some performance gain over virtualised environments, but ParStream runs very well on virtualised environments like AWS as well.

Q: How has ParStream been adopted for financial markets applications?

A: The ability to act fast on the latest data is essential in financial markets.  ParStream is made for analytics in real time with low latency.  In other words: ParStream is developed for the requirements of financial market applications with no need for extra adaptation.

Q: What’s coming next from ParStream in terms of company growth, technology directions, products?

A: We are hiring at our locations in Palo Alto and in Cologne, Germany!  So please give us a ring if you want to be part of the team, exploring the limits of technology regarding real-time analytics.

Several partnerships are going to be announced within the next month.  This will show that ParStream is an integral part of the big data movement.  ParStream will continue to focus on its database technology, while its partner network will make it possible to offer integrated real-time solutions.

Regarding our technology, we want to make it easier to use and better integrated with the Hadoop and OLTP ecosystems.  Furthermore, we are planning to build in some more secret sauce … psst.
