Pivotal – the EMC/VMware analytics spinout – has released Pivotal Data Dispatch (PDD), a platform for discovering and analysing data stored in disparate repositories. The product was developed at NYSE Euronext for its internal surveillance and regulatory purposes and has been in use there since 2007.
According to product documentation, PDD provides “on-demand access to big data while supporting existing visualisation and analytical tools. Installed as a middleware service on a commodity Linux cluster, Pivotal Data Dispatch enables IT administrators to set up data sources, analytical platforms, business metadata and access policies.” Data extracted from sources can then be fed into Hadoop or an MPP database for analytic processing.
* Provisioning at Terabytes Per Hour – PDD gathers user-requested data from multiple sources, translates it to the destination format, applies user-defined and policy-driven transformations, and loads the data onto an analytical platform.
* Analytics Workflows – PDD includes a web-based workflow tool, so data workers can create multistep analytics workflows that include data movement, transformation and analytics.
* Heterogeneous Data Platform Support – PDD includes native adapters and resource monitoring for multiple platforms, including Pivotal HD and HAWQ, Pivotal Greenplum Database (GPDB), Apache Hadoop, IBM Netezza, Oracle and SQL Server. It can also connect with any database through JDBC, and with most distributed file systems, such as NFS.
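As a rough illustration of the gather, translate, transform and load sequence described in the provisioning bullet, here is a minimal Python sketch. It does not use PDD's actual interfaces, which are not described here; every function name, the `trades` table, the trader-masking policy and the in-memory SQLite databases standing in for sources and the analytical platform are assumptions made purely for illustration.

```python
import sqlite3


def gather(source, query):
    """Pull the user-requested rows from one source as dictionaries."""
    cursor = source.execute(query)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def translate(row):
    """Convert to the destination format (here: price text -> float)."""
    return {**row, "price": float(row["price"])}


def mask_trader(row):
    """Hypothetical policy-driven transformation: hide the trader identity."""
    return {**row, "trader": "***"}


def load(destination, rows):
    """Write the prepared rows onto the stand-in analytical platform."""
    destination.executemany(
        "INSERT INTO trades (symbol, price, trader) "
        "VALUES (:symbol, :price, :trader)",
        rows,
    )
    destination.commit()


def provision(sources, query, destination, policies):
    """Gather from every source, translate, apply policies, then load."""
    rows = [translate(r) for s in sources for r in gather(s, query)]
    for policy in policies:
        rows = [policy(r) for r in rows]
    load(destination, rows)
    return len(rows)


def make_source(records):
    """Build an in-memory stand-in for a source repository (prices as text)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE trades (symbol TEXT, price TEXT, trader TEXT)")
    db.executemany("INSERT INTO trades VALUES (?, ?, ?)", records)
    return db


def make_destination():
    """Build an in-memory stand-in for the analytical platform."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE trades (symbol TEXT, price REAL, trader TEXT)")
    return db


if __name__ == "__main__":
    sources = [
        make_source([("ABC", "10.5", "alice")]),
        make_source([("XYZ", "20", "bob")]),
    ]
    destination = make_destination()
    query = "SELECT symbol, price, trader FROM trades"
    loaded = provision(sources, query, destination, [mask_trader])
    print(f"rows loaded: {loaded}")
```

Keeping each stage as a separate function mirrors the order in which the bullet lists the steps; a real deployment at terabytes per hour would presumably stream and parallelise the work rather than hold rows in memory as this sketch does.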
The NYSE/Pivotal collaboration came about when the exchange chose the Greenplum MPP database for its regulatory and surveillance needs, and built the middleware to populate the Greenplum analytics database from various internal data repositories. Greenplum has since been acquired by EMC, and is now offered by Pivotal. EMC also provides storage for the Community Platform cloud offering from NYSE Technologies, the exchange group’s commercial technology unit.
Says NYSE Euronext Chief Data Officer Steven Hirsch: “Pivotal Data Dispatch solves complex market surveillance and regulatory data needs, and is heavily utilised across NYSE Euronext production systems today. Together, we understand the pressures of managing big data, and the processing power behind Pivotal Data Dispatch technology allows us and their customers to store data reliably at significant scale and easily adapt to fast-paced changes in the big data universe.”