What You’ll Hear


Is duplicate data weighing you down? For many organizations, duplicate data accounts for a significant portion of data lake operating costs. It arises organically: analysts and scientists improve a data set they’re working on, save it, continue, and save it again. Their peers do the same. Before you know it, you have 10, 25, or 100 copies of a data set with minimal true differences. Updating one copy without changing the others leads to bogus results, leaves sensitive data unsecured, and adds unnecessary cost pressure.

In the past, the cost of maintaining duplicate data was absorbed by IT. But with the move to public cloud, funded from your op-ex budget, those duplicates carry a material, highly visible cost. Duplicate data is also a risk for data breaches and other regulatory violations: it does no good to defend the “production” version if the other copies are less protected.

In this session we’ll discuss Lumada Data Catalog and how its Data Rationalization feature can help you discover and tame your duplicate data.
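To make the idea of discovering near-identical copies concrete, here is a minimal sketch of one common approach: fingerprinting each data set’s content and grouping files whose fingerprints collide. This is illustrative only; it is not how Lumada Data Catalog’s Data Rationalization feature is implemented, and the file layout (`data_lake/` of CSVs) and helper names (`dataset_fingerprint`, `find_duplicate_candidates`) are hypothetical.

```python
import csv
import hashlib
from collections import defaultdict
from pathlib import Path


def dataset_fingerprint(path: str, sample_rows: int = 1000) -> str:
    """Fingerprint a CSV by hashing its schema plus a sample of rows.

    Hypothetical helper. Row hashes are XOR-combined so row order does
    not matter: two files with the same columns and the same sampled
    rows produce the same fingerprint.
    """
    digest = hashlib.sha256()
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        digest.update(",".join(sorted(header)).encode())
        combined = 0
        for i, row in enumerate(reader):
            if i >= sample_rows:
                break
            row_hash = int.from_bytes(
                hashlib.sha256(",".join(row).encode()).digest()[:8], "big"
            )
            combined ^= row_hash
        digest.update(combined.to_bytes(8, "big"))
    return digest.hexdigest()


def find_duplicate_candidates(paths):
    """Group files whose fingerprints match -- likely copies of one data set."""
    groups = defaultdict(list)
    for p in paths:
        groups[dataset_fingerprint(p)].append(p)
    return {fp: ps for fp, ps in groups.items() if len(ps) > 1}


if __name__ == "__main__":
    csv_files = (str(p) for p in Path("data_lake").glob("**/*.csv"))
    for fingerprint, copies in find_duplicate_candidates(csv_files).items():
        print(f"{len(copies)} likely copies of one data set: {copies}")
```

Exact-match fingerprints only catch byte-identical content; real rationalization tools also score near duplicates (for example with similarity sketches over column values), which is what lets them surface the “minimal true differences” cases described above.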