Challenge: Enormous data volumes and a cumbersome, expensive and labor-intensive process to extract and manipulate data sets for customers.
Solution: Single data integration platform and lean data management process to streamline data extraction and standardize data sets provided to customers.
At the heart of the U.K.’s energy industry is ElectraLink and its proprietary Data Transfer Service, a messaging and communications network to which more than 250 gas and electricity companies are connected. Every time a customer changes energy suppliers, or a new smart meter is installed, data is transferred over this network. It handles more than 130GB of data a month.
It’s about to handle much more. Nearly 300,000 smart meters are being installed each month as part of a government-mandated program. By 2020, the U.K. will have more than 28 million smart meters. This massive program is leading to a 25% increase, year over year, in data traveling across ElectraLink’s Data Transfer Service network.
ElectraLink has amassed a treasure trove of energy market information. But it must comply with data access and usage restrictions. For example, energy suppliers have full access to their own data, but they can only access another supplier’s data as an aggregated and anonymized view.
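The access rule can be illustrated with a small sketch. The record fields, supplier names and the `view_for` function below are hypothetical and do not reflect ElectraLink's schema or implementation; the sketch only shows one way a supplier could see its own rows in full while other suppliers' rows are collapsed into an anonymized aggregate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    # Hypothetical field names, chosen for illustration only.
    supplier: str
    meter_id: str
    kwh: float


def view_for(requester: str, readings: list[Reading]) -> dict:
    """Return full rows for the requester's own data and an
    anonymized aggregate for every other supplier's data."""
    own = [r for r in readings if r.supplier == requester]
    others = [r for r in readings if r.supplier != requester]
    return {
        "own_rows": own,
        # No supplier names or meter IDs appear in this summary.
        "rest_of_market": {
            "meter_count": len(others),
            "total_kwh": sum(r.kwh for r in others),
        },
    }


readings = [
    Reading("SupplierA", "M1", 10.0),
    Reading("SupplierB", "M2", 20.0),
    Reading("SupplierC", "M3", 30.0),
]
view = view_for("SupplierA", readings)
```

A production rule set would likely need more than this, for example suppressing aggregates that cover too few meters to stay anonymous.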
ElectraLink provides these data services as a commercial offering. While energy data is structured and well defined, it’s complex and the volumes are enormous.
The company was relying on a cumbersome, expensive and labor-intensive process to run ad-hoc queries that tended to be modified for each application. It needed a lean data management process to streamline data extraction and standardize the data sets provided to customers.
ElectraLink turned to Pentaho Data Integration and Pentaho Business Analytics from Hitachi Vantara to integrate data from multiple sources in a single platform and make data available for analytics in near real time.
The company’s IT architecture includes Amazon Web Services (AWS) as its secure cloud services platform, a Vertica data warehouse, Egnyte for its FTP portal, and Pentaho Data Integration for process automation and orchestration to create the data sets. ElectraLink implemented Pentaho Data Integration in late 2016, and the company plans to exploit more of the platform’s capabilities and dashboards as part of its product delivery model.
ElectraLink uploads data sets to a secure FTP portal, where customers can access the data. The entire Data Transfer Service network is refreshed nightly. “We spin up our environments, run the queries, package up the data, and place it back out on the FTP servers for the customers to get access to it,” explains Dan Hopkinson, ElectraLink’s head of Network and EMI services. “We keep our environments up for the minimum amount of time to reduce costs. Cost control, automation, streamlining and lean processes are incredibly important to us.”
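The nightly cycle Hopkinson describes can be sketched in a few lines. This is not ElectraLink's Pentaho job: `temporary_environment`, `nightly_refresh` and the sample queries are hypothetical stand-ins, and the sketch only illustrates the idea of keeping an environment up for exactly as long as the steps need it.

```python
from contextlib import contextmanager

log = []  # records the order of steps so the flow is visible


@contextmanager
def temporary_environment():
    """Hypothetical stand-in for starting and stopping cloud resources."""
    log.append("spin up")
    try:
        yield
    finally:
        # Runs even if a step fails, so the environment never lingers.
        log.append("tear down")


def nightly_refresh(queries):
    """Run every query, package the results, publish, then shut down."""
    with temporary_environment():
        results = {name: run() for name, run in queries.items()}
        log.append("run queries")
        package = sorted(results.items())  # stand-in for packaging files
        log.append("package")
        log.append("publish to FTP portal")  # stand-in for the upload
    return package


package = nightly_refresh({"switches": lambda: 3, "installs": lambda: 5})
```

Putting the teardown in a `finally` clause is the design choice that matters here: a failed query still releases the environment, which is what keeps running costs bounded.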
ElectraLink also relies on Pentaho Data Integration to create a business glossary of processes and data flows for both electricity and gas data, the first dual-fuel catalog to support the energy industry. Pentaho’s automation capabilities are instrumental in keeping the business glossary up to date. “What I’m keen on is to link the catalog to Pentaho so we don’t have to change code every time there is a process change,” says Hopkinson.
Pentaho Data Integration is used to manage all the alerting and automation, which gives Hopkinson’s small four-person team time to concentrate on developing new data services and enhancing existing data sets for sale to customers on a subscription or volumetric basis. “We have a range of products, from very straightforward data feeds where we’re effectively just providing data on request, to much more complex analytics products,” explains Hopkinson.
For instance, one of the products ElectraLink has developed is a data set that exposes all the energy generation on the U.K.’s distribution network. That covers all the small wind farms, solar farms and small generators connected directly to the distribution grid, as opposed to the transmission grid. By applying pattern matching to the consumption data that flows across the network, ElectraLink has been able to identify these embedded generators. “ElectraLink is the only entity that has this comprehensive view of energy generation across the whole of Great Britain. It’s a complex and incredibly valuable data set,” says Hopkinson.
The next stage of ElectraLink’s journey is to leverage Pentaho’s automation capabilities to provide API access to the data. This approach will allow third parties to access data, such as price comparisons, in real time. It will streamline the change-of-supplier process and enable more customers to switch energy suppliers more easily.