The term "in-memory" is increasingly in use, particularly in the context of big data. We asked Adi Paz, vice president of marketing and business development at GigaSpaces Technologies, to explain the integration of in-memory and big data technologies, and what benefits can be expected.
Q: Let’s begin with some GigaSpaces and XAP basics. Can you briefly describe what the company does, and what XAP is?
A: GigaSpaces is the pioneer of a new generation of in-memory computing platforms and a leading provider of big data solutions and cloud enabling technologies for mission-critical applications. The GigaSpaces solutions are designed from the ground up to run on any cloud environment, and offer enterprises in such industries as financial services, e-commerce, transportation/logistics, and telecoms a pain-free, evolutionary path to meet tomorrow’s IT challenges.
XAP is a distributed, in-memory driven application platform that combines data access, messaging and business logic deployment, all in a single platform. It serves as a foundation for highly demanding applications, with extreme throughput, low latency and scalability requirements. XAP allows application developers to easily build and manage applications that handle large amounts of data, in near real time, while still maintaining high levels of data integrity and consistency.
Q: Given that XAP is an in-memory architecture, and RAM is limited, how does it address big data problems?
A: XAP is designed to handle the velocity aspect of big data. While it can maintain a few terabytes of data in-memory, the main focus is on processing massive amounts of data very quickly without compromising on consistency or integrity of the data. Some XAP deployments can process big data streams of hundreds of thousands of events per second.
While it’s not designed to maintain this data for the longer term, it’s geared to process this data and extract valuable insights from it in real time, which allows the business to respond to and leverage this data much faster.
Q: Specifically in the latest release, XAP 9.5, there is enhanced integration with NoSQL databases like Cassandra. What’s the reason to do this?
A: As mentioned above, while XAP is very good at the velocity aspect of big data, and can process a lot of data very quickly, it cannot hold more than a few terabytes of data in-memory. In typical big data scenarios, this is only good for a few days, or weeks at best. Therefore storing data for the longer term (months and years), and then analysing and processing this data requires tools that are geared towards storing and retrieving large volumes of data, namely NoSQL databases.
Another interesting aspect here is that XAP can also serve as a transactional, low latency front end to a Cassandra cluster. So data access can be done using XAP, leveraging features such as full consistency, ACID transactions, and client side local caching, and then reliably delegated to Cassandra as needed.
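The front-end pattern described here can be sketched roughly as follows. This is a minimal illustration, not XAP's API: the class `WriteBehindFrontEnd` and its methods are hypothetical, a plain dictionary stands in for the Cassandra cluster, and the use of a write-behind queue is an assumption, since the interview says only that data is "reliably delegated to Cassandra as needed". Transactions, replication and failure handling are not modelled.

```python
from collections import deque


class WriteBehindFrontEnd:
    """Hypothetical in-memory front end that defers writes to a slower store."""

    def __init__(self, backing_store):
        self.memory = {}                    # stands in for the in-memory data tier
        self.pending = deque()              # writes not yet delegated to the store
        self.backing_store = backing_store  # stands in for a Cassandra cluster

    def write(self, key, value):
        # The client sees the write as soon as it lands in memory.
        self.memory[key] = value
        self.pending.append((key, value))

    def read(self, key):
        # Reads are served from memory, so they reflect the latest write.
        return self.memory[key]

    def flush(self):
        # Delegate queued writes to the backing store, oldest first.
        flushed = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.backing_store[key] = value
            flushed += 1
        return flushed


if __name__ == "__main__":
    store = {}
    front_end = WriteBehindFrontEnd(store)
    front_end.write("order:1", "NEW")
    front_end.write("order:1", "FILLED")
    print(front_end.read("order:1"))  # latest value, served from memory
    print(front_end.flush(), store)   # queued writes now applied to the store
```

The point of the sketch is the ordering: the client reads and writes against memory, and the long-term store catches up afterwards.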
We chose Cassandra due to its excellent write performance, elegant and powerful clustering model, its proven track record in handling serious big data loads, and perhaps more than anything else customer demand (some of our customers actually contributed part of their work on this front for the benefit of other users).
Q: How has XAP boosted the performance of Cassandra in your performance tests?
A: XAP has mainly boosted Cassandra in scenarios involving repeated reads of the same pieces of data. When such a scenario exists, the client application can use XAP as a mediating layer between itself and Cassandra. This mediating layer can fetch frequently-used data from the XAP in-memory data grid, and fall back to Cassandra if the data cannot be found in XAP. Furthermore, the XAP client-side libraries can cache frequently-used data on the client side, and then read it without ever leaving the boundaries of the local machine. This can result in massive performance gains, up to 2,000 times faster in some use cases.
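The read path described in this answer can be sketched as a three-tier lookup: client-side cache first, then the in-memory grid, then the database. The sketch below is illustrative only. `TieredReader` is a hypothetical name, plain dictionaries stand in for the XAP data grid and for Cassandra, and cache invalidation and eviction are left out.

```python
class TieredReader:
    """Hypothetical read path: local cache, then grid, then backing store."""

    def __init__(self, grid, backing_store):
        self.local_cache = {}               # client-side cache on the local machine
        self.grid = grid                    # stands in for the in-memory data grid
        self.backing_store = backing_store  # stands in for Cassandra
        self.hits = {"local": 0, "grid": 0, "store": 0}

    def read(self, key):
        # Tier 1: served without leaving the local process.
        if key in self.local_cache:
            self.hits["local"] += 1
            return self.local_cache[key]
        # Tier 2: the shared in-memory grid.
        if key in self.grid:
            self.hits["grid"] += 1
            value = self.grid[key]
        else:
            # Tier 3: fall back to the store (raises KeyError if absent),
            # then populate the grid for other clients.
            self.hits["store"] += 1
            value = self.backing_store[key]
            self.grid[key] = value
        self.local_cache[key] = value
        return value


if __name__ == "__main__":
    reader = TieredReader(grid={}, backing_store={"quote:ABC": 101.5})
    reader.read("quote:ABC")  # first read falls through to the store
    reader.read("quote:ABC")  # repeat read is served from the local cache
    print(reader.hits)
```

Only the first read of a key reaches the slowest tier; repeated reads stay local, which is the case where the speed-up described above would apply.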
Q: What kinds of financial applications do you see benefitting from this integration?
A: Typically, financial applications are pretty conservative with their data consistency and integrity requirements. They need to be 100% foolproof, and in many cases also require ACID transactions. That makes NoSQL in general less likely to be used in such environments. By integrating with XAP, Cassandra can be fronted with a fully consistent and transactional in-memory data tier, thus making it much more appealing to financial applications. We’ve seen XAP + NoSQL being used in several kinds of financial applications, such as payment processing, fraud detection, and even online trading.
Q: Does XAP integrate with other big data technologies, such as Hadoop?
A: Yes, XAP integrates with MongoDB and Hadoop, and more data stores are expected to be implemented in the near future.
Q: What do you expect next for XAP, big data and financial markets applications?
A: We have high hopes for XAP and big data as far as financial markets go. We believe that with our recent integrations with NoSQL tools in general and Cassandra in particular, we make these tools accessible to a new set of applications that could not use them before due to consistency and transactionality considerations.
XAP is well established in financial applications and employed by many of the biggest banks in the world, but that’s not news. What’s exciting here is that it can now serve as a bridgehead for these applications to utilise and leverage NoSQL while not having to compromise on their stringent data consistency requirements.