As enterprises seek to accelerate the process of getting insights from their data, they face numerous sources of friction. Data sprawl across silos, diverse formats, exploding data volumes, and data spread across many data centers and clouds and processed by many disparate tools all act to slow progress.
Long gone are the days when a single well-formed data warehouse could collect and distribute all the data needed to support a business. Today’s challenge of data management is like building dozens of warehouses and lakes, each of which must comply with regulations but also allow disparate teams to create, maintain and evolve them. As more data arrives, more people are required to understand it and make it useful. The demands for self-service become even more urgent. Oh, and by the way, these new data pipelines and repositories will live both on premises and in multiple public clouds.
As executives attempt to increase the pace at which insights are discovered, they encounter all sorts of friction:
- A large banking customer wants to integrate the data they’ve collected in multiple data stores. But their lines of business cannot leverage data fast enough to make decisions. Reliance on custom-coded data pipelines makes the process too expensive and clunky. As a result, they cannot build and manage pipelines fast enough to bring all the data together.
- Another enterprise customer is using multiple data catalogs, including Collibra. Although they can define the business terms in Collibra Data Governance, they cannot scale their current catalog to find the data behind it across the entire environment. Instead, this curation is manual, error-prone and very expensive. They have no way to provide timely access to data to their business users.
- Many of our industrial customers are collecting data from thousands of sensors across multiple sites. The voluminous data flowing in from the edge is in different formats. Unfortunately, the challenge to clean, aggregate and blend this operational technology (OT) data with IT data and then analyze it is prohibitive. In its raw form, it’s a liability, not an asset.
An affordable path forward that reduces friction and helps manage the complexity of the modern world must combine clever use of technology with the emerging practices of DataOps. DataOps addresses the larger process of making data useful as a whole. Teams that were split apart start working together, using new forms of automation, artificial intelligence (AI) and machine learning (ML) to manage all the infrastructure and data and to make data visible and findable. This approach lays the foundation for creating high-quality data and delivering it in a way that is secure and governed. It lets people make use of the data on their own, with as much self-service as possible.
At Hitachi Vantara, our answer to bringing DataOps to life is the Lumada DataOps Suite, which speeds time to insight to enable improvement in a wide range of business outcomes. Supporting better customer service, improved operations, product development, compliance and anti-fraud measures, the DataOps Suite gets results from the massive flood of data from the internet of things.
The Principles of DataOps Success
Hitachi is a DataOps company with more than 50 years of experience in integrating IT and OT across a diverse range of businesses. Here is why we are a global leader in DataOps:
- We offer both infrastructure and software for the integration, discovery and governance of data.
- We provide cloud-agnostic software offerings for hybrid and multicloud DataOps.
- We deliver professional services and training to ensure your teams develop data management capabilities in their journey to data management maturity.
The Lumada DataOps Suite brings Hitachi’s learning and experience to life in a product. In our view, the current landscape is screaming out for a new type of platform, built on the following principles:
- Intelligent automation.
- Composition that replaces coding.
- Open ecosystem.
- Repeatable success through methods.
Based on these principles, the Lumada DataOps Suite transforms a world of disparate data silos into one in which everyone can use a highly automated, governed data fabric for innovation.
The Anatomy of the Lumada DataOps Suite
Lumada DataOps Suite delivers multiple components, all working together as a single platform to accelerate the path to insight, with less mess and fewer handoffs. Without DataOps Suite, organizations are dependent on data services consulting. To achieve the same capability, they need to use a hodgepodge of different tools, which creates management nightmares over time. Here is how the four principles mentioned above come to life in Lumada DataOps Suite.
Intelligent automation and collaboration conquer data complexity and unlock governed self-service. Enterprises need to quickly derive value from all their different data silos and data formats. Using an intelligent engine replaces manual processes and injects DataOps into a company in a scalable way. This approach accelerates the journey from raw data to insight. Automation, intelligence and collaboration combine to address the far larger landscape of data management and data pipeline tools, repositories and analytics. A smarter system can build in rules about compliance and governance, so users can be guided during construction instead of fixing problems after the fact. Collaboration helps support self-service and cooperation across silos of IT, data engineering, data science and business units.
Composition replaces coding. Higher degrees of business agility and resilience are achieved by building flexibility into the infrastructure. Composable microservices enable organizations to be more nimble, adapting and responding to change quickly by applying a consistent approach to address challenges and use cases. This enables Lumada DataOps Suite functionality like data integration and data catalog to run as microservices configured with metadata and automatic policies. As a result, data pipelines can be intelligently built and operated at massive scale. Even the biggest data engineering teams would quickly be overwhelmed by an approach based on manually coding pipelines for each and every data request.
Open ecosystem instead of “rip and replace.” To support integration and automation, modern systems must be configured and controlled by APIs and policies described by metadata. Such systems create an open ecosystem that can stand alongside existing data management applications and platforms and help bring them into the world of DataOps. At the same time, these systems allow data pipelines to support collection and transformation of data from the edge to the cloud. Using the Lumada DataOps Suite, there’s no custom coding: It works right out of the box, with flexibility architected into the platform. Our composable microservices can be applied anywhere along the DataOps journey. Just plug them into the existing environment and solution cores.
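To make the idea of “policies described by metadata” concrete, here is a purely illustrative sketch of what a metadata-driven pipeline policy could look like. This is a hypothetical example, not actual Lumada DataOps Suite syntax; all field names and values are invented for illustration:

```yaml
# Hypothetical metadata policy: declares what a pipeline should do
# and which governance rules apply, instead of hand-coding the steps.
pipeline:
  name: sensor-ingest-example        # invented name
  source:
    type: edge-gateway               # e.g., OT sensor data at the edge
    format: csv
  target:
    type: cloud-object-store         # e.g., a data lake bucket
    format: parquet
governance:
  classification: confidential       # drives access controls downstream
  pii-scan: enabled                  # guide users during construction,
  retention-days: 365                # not after the fact
```

The point of such a declaration is that a platform service, rather than a developer, reads the metadata and assembles, runs and governs the pipeline accordingly.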
Repeatable, Proven, Scalable
Repeatable success through proven solutions. The tech must be supported by implementation methodologies and reference architectures. It is through repeated creation and reuse of such assets that a data culture based on DataOps principles comes to life, resulting in the agile innovation, lower risks and reduced costs we are all striving to achieve.
Updates to the Lumada DataOps Suite in 2021
The latest launch of the Lumada DataOps Suite, in August 2021, includes the availability of our new platform and improvements to the underlying products it is made of, supporting massive scale, lower risk, decreased costs and faster innovation.
In the latest release, the Lumada DataOps Suite moves to general availability, combining both Lumada Data Integration, built with Pentaho technology, and the Lumada Data Catalog into a governed data fabric. Key enhancements include:
- A cloud-native architecture that uses microservices to promote both massive scalability and composability.
- A common foundation and control plane for deployment and identity and access management that allows solutions to be constructed and managed the same way no matter where they run.
- Deep integration between Lumada Data Catalog and Lumada Data Integration, which allows the catalog users to initiate and orchestrate complex data flows in a simplified manner that supports self-service and results in large productivity gains.
The update to Pentaho 9.2 expands the data integration across all major public clouds.
- Full Microsoft Azure Cloud support is added, including connectivity to data in Azure SQL Database, Azure Data Lake Storage and Blob storage, and Azure HDInsight.
- Support for new data stores such as Cloudera Data Platform and MapR (HPE Ezmeral Data Fabric).
- Increased operational productivity through simplified upgrades and advanced logging for Pentaho Business Analytics.
The update to Lumada Data Catalog 6.1 improves data quality, adds support for advanced data governance technology, and delivers better search and export.
- Data rationalization tames duplicate data, lowering operating cost and risk.
- A bidirectional connector to Collibra Data Governance enables business terms defined in Collibra to be used by the Lumada Data Catalog to find data relevant to those terms across the enterprise.
- Improved search based on regular expressions and advanced methods supports data discovery and allows datasets to be precisely filtered. As a result, metadata can be easily extracted and ported to other systems.
The quest to reduce friction and accelerate time to insight requires a systematic response both on the level of technology and methodology. Lumada DataOps Suite along with DataOps practices show a way forward that can speed the process of finding new ways to improve business outcomes.
Radhika Krishnan is Chief Product Officer at Hitachi Vantara.