How Data Lakes Can Reveal Data’s True Value

Tim Langley-Hawthorne
Chief Information Officer, Hitachi Vantara

July 8, 2021

For all of the buzz surrounding both artificial intelligence and data-driven management, many companies have seen mixed results in their quest to harness the value of enterprise data. To avoid that outcome, we combined best-of-breed and proprietary solutions to develop our enterprise data platform (EDP), focusing much of our attention on smart changes in technology, culture and process for data lakes.

The EDP, as we call it internally, includes a full suite of organizational data, enterprise data tools and services capable of ingesting and processing 30TB+ of data sources and 50TB+ of data. The platform includes automation, security, virtualization, machine learning and artificial intelligence. Each year, we invest around $5 million in this platform, and in less than three years, we’ve returned multiples of that amount in both cost savings and additional new revenue for the company.

Now we want to explain how we got here and what it may mean for others on their quest to build a similar platform. We presented our key findings at the recent Hitachi Social Innovation Forum 2021, Americas, in a session titled “Hitachi on Hitachi: Finding Data’s Value in Our Enterprise Data Lake.”

To put this effort in perspective, it helps to understand the business opportunity and the obstacles you may encounter along the way. According to consistent reports over the years, workers spend nearly half of their time searching for and preparing data instead of gaining insights that can help drive better business outcomes. In an enterprise, efforts to harness data are often held back by a lack of collaboration, knowledge gaps, or even resistance to change. And many industry analyst firms indicate that enterprises continue to struggle to realize business value from their data and analytics investments.

Challenges and Mitigation Strategies

In developing the EDP, we identified four common data lake challenges and effective mitigation strategies.

User aversion. For most large enterprises, the toughest obstacles are psychological and cultural. Figuring out how to pry people away from data silos is, of course, a psychological challenge: holding on to data can give hoarders a sense of control and job security. You need overall company leadership, not just technology leadership, to emphasize the real power and value of bringing together multiple disparate enterprise datasets. That means pairing top-down directives with work alongside teams, starting at the grassroots level.
Defining the right problem to solve. Odd as it may seem, in many data initiatives, the enterprise realizes later in the journey that it may not be tackling the right problem. All participants, including those in technical business functions, need to have a common understanding of the problem definition. This begins with comprehending the business vision, identifying data sources, recognizing the pain points and potential risks, and determining whether the desired benefits are obtainable. Cross-collaboration between various stakeholders is key to defining the problem. When the definition is confirmed, you can establish the foundation for building the data platform.
Data empathy. This approach is built on a willingness and dedication to deeply understand how your business stakeholders intend to use and interpret the data. Ask how the business teams prefer to consume the data and how often they need updates. We had the pain and the problem, but on the benefit side we also had trailblazers and early adopters. We were fortunate to connect with several business partners who shared a burning desire to give and receive value from their data. Some had specific use cases in mind; others wanted to collaborate to pinpoint their optimal use cases.
Data quality and governance. Schema enforcement is critical to prevent mismatches between data sources and targets. We believe it’s really important to build a framework and automated alerting system for schema enforcement and data quality before you begin the work of data ingestion. Without one, your data may be unusable by business partners. With the right framework, you can help ensure high data quality and quickly address problems if they arise.
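
To make the schema enforcement and alerting idea above concrete, here is a minimal Python sketch. It is illustrative only and does not describe how the EDP is actually built: the schema, the column names (order_id, customer, amount), and the validate_rows and ingest helpers are all hypothetical, and the alert callback stands in for whatever notification channel a team might use.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical expected schema: column name -> required Python type.
EXPECTED_SCHEMA: dict[str, type] = {
    "order_id": int,
    "customer": str,
    "amount": float,
}


@dataclass
class SchemaViolation:
    row_index: int
    column: str
    problem: str


def validate_rows(
    rows: list[dict[str, Any]], schema: dict[str, type]
) -> list[SchemaViolation]:
    """Compare each row against the schema and collect every mismatch."""
    violations: list[SchemaViolation] = []
    for i, row in enumerate(rows):
        for column, expected_type in schema.items():
            if column not in row:
                violations.append(SchemaViolation(i, column, "missing column"))
            elif not isinstance(row[column], expected_type):
                found = type(row[column]).__name__
                violations.append(
                    SchemaViolation(
                        i, column, f"expected {expected_type.__name__}, got {found}"
                    )
                )
        # Columns the target schema does not know about are also flagged.
        for column in sorted(row.keys() - schema.keys()):
            violations.append(SchemaViolation(i, column, "unexpected column"))
    return violations


def ingest(
    rows: list[dict[str, Any]],
    schema: dict[str, type],
    alert: Callable[[list[SchemaViolation]], None],
) -> list[dict[str, Any]]:
    """Accept a batch only if it is clean; otherwise alert and reject it."""
    violations = validate_rows(rows, schema)
    if violations:
        alert(violations)  # e.g. page the data owner or post to a channel
        return []
    return list(rows)
```

In this sketch a batch with any violation is rejected whole before it reaches the lake, which mirrors the advice to put enforcement in place ahead of ingestion. A real platform might instead quarantine only the bad rows; that is a design choice the article does not specify.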

Data Democratization and Project Wins

There are myriad ways that enterprises can achieve a healthy return on investment (ROI) from their EDP-style data lakes. In several of our significant wins thus far, we:

Increased our sales and marketing team’s ability to upsell and cross-sell, which resulted in $27 million in incremental sales opportunities.
Introduced a new, automated, document-translation function for customer proposals that avoids using a costly, external translation service. Our ROI is $3 million in annual savings, and we’re looking to roll it out to legal and other functions with translation needs.
Used the EDP to connect and analyze pricing information against competitive pricing and sales histories. This approach was initiated by the Hitachi Vantara pricing team, in its search for new ways to improve margins. The analysis led to the creation of a pricing optimization tool which has delivered an estimated $32 million in incremental revenue.

We now know that, with a corporate-wide commitment to change and carefully managed investments in modern enterprise data platforms, companies can finally deliver actionable business intelligence and tangible returns that wouldn’t be possible with existing data silos. Hitachi Vantara’s EDP project has achieved true data democratization. And in migrating so much data to the cloud, we are running more efficiently and at a lower cost than ever before.

Tim Langley-Hawthorne

Tim is responsible for global IT supporting Hitachi Vantara employees and the company’s portfolio of edge-to-core-to-cloud digital infrastructure and solutions. His background includes deep cross-functional experience in customer service, strategic sourcing and financial management.