Pentaho Future-Proofs the Enterprise in Expanding Big Data Universe

Latest release adds new integration for Amazon EMR, SAP HANA and Apache Spark; announces study confirming high performance in large-scale Hadoop deployments

June 9, 2015, Hadoop Summit 2015, SAN JOSE, Calif. — As the big data universe evolves and expands to the cloud, Pentaho, a Hitachi Data Systems company, today announced support for integration with two popular big data technologies, Amazon Elastic MapReduce (EMR) and SAP HANA. Available today, Pentaho 5.4 offers new capabilities that build on a pragmatic and future-proof platform for big data orchestration and analytics at scale, further empowering organizations to drive value with Pentaho’s Big Data Blueprint use case designs.

With Pentaho 5.4, customers can now use Amazon EMR to natively transform and orchestrate data, as well as design and run Hadoop MapReduce jobs in-cluster on EMR. IT organizations now have new and powerful ways to operationalize a cloud-based data refinery architecture for on-demand, governed delivery of data sets.

The Lucky Group, an Amazon Web Services and Pentaho customer, consolidates, refines, and analyzes several terabytes of data to drive a deeper understanding of retail customer segmentation, acquisition, and retention. According to Jay Khavani, Sr. Manager Business Intelligence, The Lucky Group, "Pentaho was indispensable in transitioning to a cloud-based data architecture, with an adaptive big data layer that ensures business continuity when incorporating new technologies."

Enterprises can now leverage SAP HANA’s high-performance capabilities on a wider variety of data. Pentaho 5.4’s integration with SAP HANA enables governed data delivery across multiple structured and unstructured sources.

Enterprises running Hadoop find that data variety and volumes increase over time, making reliable performance and scalability mission-critical priorities. Pentaho recently executed a controlled study that demonstrates sustained processing performance of Pentaho MapReduce running at scale on a 129-node Hadoop cluster. The results build on the value of the Pentaho platform, delivering high-performance processing at enterprise scale in big data deployments.

Additional capabilities and enhancements in Pentaho 5.4 include:

  • Integration of Pentaho Data Integration (PDI) with Apache Spark, enabling orchestration of Spark jobs
  • New APIs to simplify embedding analytics into business applications and processes
  • A refreshed, modern look for Pentaho Data Integration
  • Ability to localize Pentaho in French, German and Japanese

"We continue to deliver on our vision to help organizations get value out of any data in any environment with Pentaho 5.4," said Christopher Dziekan, Chief Product Officer, Pentaho. "Our open and adaptable approach means customers choose the best technology for their businesses today without the worry of being locked out in the future."


Pentaho, a Hitachi Group company, is a leading data integration and business analytics company with an enterprise-class, open source-based platform for diverse big data deployments. Pentaho’s unified data integration and analytics platform is comprehensive, completely embeddable and delivers governed data to power any analytics in any environment. Pentaho’s mission is to help organizations across multiple industries harness the value from all their data, including big data and IoT, enabling them to find new revenue streams, operate more efficiently, deliver outstanding service and minimize risk. Pentaho has over 15,000 product deployments and 1,500 commercial customers today including ABN-AMRO Clearing, BT, EMC, NASDAQ and Sears Holdings Corporation. For more information visit
