Today’s backups are mission-critical systems that have been hardened over decades of evolution. The challenge facing financial services organizations is that their expectations for using stored data have changed dramatically, but their backup systems remain products of a previous age.
The job of backup used to be straightforward: Take a snapshot of the current data, store it in an archival format, and then move it off-site for regulatory and business continuity reasons. Restoring that data was an arduous task, but one that occurred only in exceptional circumstances, such as a failure of the enterprise infrastructure or a regulatory inquiry. As long as there was a plausible plan for meeting recovery point and recovery time objectives (RPO/RTO), everything was okay.
The need for reliable data protection remains unchanged. Today, however, that same data is also raw material and a major source of revenue for the thriving modern financial services business. Considerable effort and expense go into creating massive data stores for “just in case” scenarios. And because that same data is also needed to do work, it may be replicated dozens of times across the enterprise, with each copy serving a separate purpose: backup, disaster recovery (DR), legal workflows, e-discovery, business development and more. All that duplication is wasteful, inefficient and a potential vector for risk and error.
The demand for data to be available to meet operational objectives, the explosion in the amount of data being created, and all that duplication present serious challenges in terms of scale, storage costs, management and manpower. Given how much work and investment already goes into doing the job of data protection correctly, it only makes sense to get more out of those systems and maximize the return on that investment.
And it is entirely possible, provided the organization shifts its perspective a bit. That starts with reframing the essential question. The old premise was: How can we make sure our data is protected?
The better and more relevant question to ask today is: How can we create an operational recovery and archiving process that maximizes the ability of our data to perform useful work throughout our dynamic and changing environment?
That word, “change,” is important. The cloud has opened up new ways to pull insights out of data assets. Object storage has further transformed the economics and malleability of stored information. It is therefore imperative that organizations change the systems they use for protecting data to take advantage of all the new opportunities now possible.
This shift requires that data no longer be treated as a static, lifeless thing kept in a storage locker; instead, it should always be available and ready for reuse. Powered by the cloud, organizations have an opportunity today to reconsider their approach to data protection from first principles and to support a whole new landscape of use cases. Instead of thinking of backup and archive solely as data storage solutions separate from the business, they can be reimagined as part of an expansive, integrated system capable of simultaneously meeting a wide range of ever-evolving needs across the full breadth of business operations.
There are three main categories for this data reuse:
- Technical use cases, such as disaster recovery testing or ransomware sandboxing.
- Business use cases, such as analytics, collaboration and reporting.
- Data compliance determination use cases, such as the EU’s General Data Protection Regulation (GDPR), which is creating significant pressure points across businesses and IT.
A Modern Take on the “Archive”
Every application in the enterprise fundamentally produces structured data, which is well-defined and readily searchable, and/or unstructured data, which does not have predefined data models and must be organized to be usable. In the old approach, all that data would be captured essentially “as is”: a freeze frame that left the work of making it usable to the future.
Today, by leveraging the power of automation along with AI and ML technologies, it is possible to create a comprehensive archive that automatically tags and catalogs everything and determines how best to “life cycle” this information. This tagging and cataloging process is the key to making data reusable for more than just data protection.
Every object to be stored is tagged with business-level information and defined by a set of attributes: when it was created, how big it is, who it belongs to, how long it is to be retained, and much more. With such information at hand, along with the expanded automation potential and backup-friendly capabilities of the cloud, it becomes far easier to manage data retention, data preservation, data security and data integrity. It is also easier to gain the invaluable flexibility of reusability while mitigating risk and maximizing cost benefits.
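As a minimal sketch of what such a catalog record might hold (the field names and values here are illustrative assumptions, not a specific product schema), each stored object can carry exactly the attributes described above:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class CatalogEntry:
    """Business-level metadata attached to a stored object, making it
    searchable and reusable rather than an opaque backup blob."""
    object_key: str                           # location in the object store
    created: datetime                         # when it was created
    size_bytes: int                           # how big it is
    owner: str                                # who it belongs to
    retention: timedelta                      # how long it is to be retained
    tags: dict = field(default_factory=dict)  # extra business-level tags

    def expires(self) -> datetime:
        """Retention deadline, after which disposal is permitted."""
        return self.created + self.retention

# Hypothetical entry: a ledger backup under a ~7-year regulatory hold
entry = CatalogEntry(
    object_key="backups/2024/ledger.parquet",
    created=datetime(2024, 1, 15),
    size_bytes=5_000_000,
    owner="finance-ops",
    retention=timedelta(days=2555),
    tags={"classification": "confidential", "region": "EU"},
)
print(entry.expires())
```

With retention and ownership carried on the object itself, policies for preservation, security and disposal can be enforced programmatically rather than by manual review.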
This new approach will inevitably result in a huge and fast-changing repository, but thanks to the cloud, it is readily managed. If an organization is committed to taking advantage of the benefits of digital transformation, the ability to uniformly store, recover, search, access, preserve and dispose of data will be critical to continued success. The cloud makes implementing such a new-age data lake, built on a strong data catalog with an underlying object storage platform, both possible and affordable. It is realistic to have it all: a massive repository that provides data intelligence and an effective operational recovery system for the business.
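To make the idea of uniform search concrete, here is a minimal, self-contained sketch (the catalog records and field names are assumptions for illustration) of filtering one shared repository by business-level attributes, so the same store can serve DR, legal and analytics workflows:

```python
# Each catalog record is a plain dict of business-level attributes.
catalog = [
    {"key": "dr/core-banking.img", "owner": "infra", "purpose": "disaster-recovery"},
    {"key": "legal/case-1142.pst", "owner": "legal", "purpose": "e-discovery"},
    {"key": "analytics/trades.parquet", "owner": "quant", "purpose": "analytics"},
]

def search(records, **criteria):
    """Return records matching every given attribute, letting one
    repository answer backup, legal and analytics queries alike."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

print(search(catalog, purpose="e-discovery"))
```

In practice the catalog would be backed by a database or search index over the object store, but the principle is the same: one set of attributes, queried uniformly by every consumer.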
Inderjeet Rana is CTO of Financial Services at Hitachi Vantara.