Bytesize

Vol 4, February 2025

The push and pull of AI innovation and ethics

The AI landscape holds infinite potential, propelling us into a new digital era where processes that took days, weeks, or months now take just hours or even minutes. It’s unlocking deeper insights and value from data and automating workflows. But, more than that, AI has progressed to enable real-time decisioning and the rearchitecting of entire industries, driving huge value.

Our survey, How AI is Shifting Data's Foundation, demonstrates this. Three-quarters (76%) of respondents have progressed beyond limited adoption of AI use cases, and for over a third (37%), AI is already critical to their business.

But, as with any new venture, there are unknowns and risks. The complex, continuously evolving algorithms that make real-time decisioning possible are also where ethical issues can creep in. Organizations must think ahead to mitigate the pitfalls of AI, which means clear AI lifecycle planning, with each phase (ideation, development, deployment) underpinned by robust governance to ensure transparency, fairness, and control. With the eyes of the world on corporate use of AI, and customer trust hanging on how ethical that use is judged to be, these steps are crucial.

The regulatory risk

Public speculation and expert warnings about the risks of unchecked AI have led to the introduction of AI regulations across the globe. Many will eventually carry financial penalties for non-compliance – the EU AI Act being one example. And where regulations are not yet enforced, government-level conversations may soon change that, particularly in the United States, where various bills are in play – among them the California AI Transparency Act, with penalties of $5,000 per day per violation.

The data challenge

With so much at stake and a clear need for care and attention where AI is concerned, our survey highlighted a worrying contradiction: only 38% of respondents enhance training data quality to explain their AI outputs, and almost a quarter (24%) don't review their AI training datasets for quality at all.

AI outputs are only as good (and as fair) as the data feeding them. Training AI models on poor-quality or incomplete data taints the model, no matter how sophisticated it is (imagine a Michelin-starred chef expected to cook with rotten ingredients).

Accuracy must improve quickly if AI models are to live up to long-term expectations, but impatience and a fear of falling behind are driving IT leaders to implement AI before their data infrastructure is ready. Only 5% say they are using sandboxes to test AI experiments, while 70% implement AI and then test and improve as they go. This approach risks poisoning AI models, and user trust with them, while opening the door to new security vulnerabilities.

Prepare for success and protect trust

Ahead of any big AI deployment, IT leaders should look to build quality datasets, addressing unstructured data and ensuring there is full control over dark data. This goes alongside keeping a tight hold on data that contains personal or sensitive information, as AI can surface data that IT leaders might not have full control over. By ensuring tight control and visibility of all data, and the ability to rewind model training to remove compromised data (without needing to start from scratch) if a problem is found, IT leaders can ensure compliance, protect trust, and drive success.
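
One way to make that "rewind" capability concrete is to pair each model checkpoint with a record of which data batches it has seen, so training can roll back to the last checkpoint untouched by a compromised batch. The sketch below is a minimal illustration of that idea; the DummyModel, its update method, and the batch IDs are assumptions for the example, not a description of any particular product.

```python
# Minimal sketch: checkpoint-based rollback for model training.
# DummyModel and its update() method are illustrative assumptions.
import copy

class DummyModel:
    def __init__(self):
        self.weights = 0.0

    def update(self, batch):
        # Stand-in for a real training step
        self.weights += sum(batch) * 0.01

class CheckpointedTrainer:
    def __init__(self, model):
        self.model = model
        self.batches_seen = []
        self.checkpoints = []  # (model snapshot, batch IDs seen so far)

    def train_on_batch(self, batch_id, batch):
        self.model.update(batch)
        self.batches_seen.append(batch_id)

    def checkpoint(self):
        self.checkpoints.append(
            (copy.deepcopy(self.model), list(self.batches_seen))
        )

    def rewind_before(self, bad_batch_id):
        # Restore the most recent checkpoint that never saw the bad batch
        for snapshot, seen in reversed(self.checkpoints):
            if bad_batch_id not in seen:
                self.model = copy.deepcopy(snapshot)
                self.batches_seen = list(seen)
                return
        raise ValueError("No clean checkpoint; a full retrain is needed")

trainer = CheckpointedTrainer(DummyModel())
trainer.checkpoint()                            # clean starting point
trainer.train_on_batch("batch-001", [1, 2, 3])
trainer.checkpoint()
trainer.train_on_batch("batch-002", [9, 9, 9])  # later found to be compromised
trainer.rewind_before("batch-002")              # rolls back, not starting over
```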

Ten dimensions for data quality

To avoid getting caught out, ask these questions when you're addressing data quality, so no stone is left unturned (a minimal automated sketch follows the list):

  1. Accurate: does the data reflect reality?
  2. Complete: is any crucial data missing?
  3. Consistent: is the data uniform in formats, units, and naming conventions?
  4. Unique: is there duplicate data?
  5. Reliable: is the data based on typical scenarios?
  6. Accessible: is the data accessible to models as needed?
  7. Timely: is the data current and relevant?
  8. Traceable: can data origins and processing be tracked?
  9. Tagged: is metadata available to add context and meaning?
  10. Clean: does the data contain personal, sensitive, or proprietary information?
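
Several of these dimensions lend themselves to automated checks. The sketch below, which assumes a pandas DataFrame with an illustrative updated_at timestamp column, probes completeness, uniqueness, timeliness, and consistency; it's a starting point, not a full data-quality framework.

```python
# Minimal sketch of automated checks for a few of the ten dimensions.
# The column names and the toy dataset are illustrative assumptions.
import pandas as pd

def data_quality_report(df: pd.DataFrame, timestamp_col: str = "updated_at") -> dict:
    report = {}
    # Complete: fraction of missing values per column
    report["missing_fraction"] = df.isna().mean().to_dict()
    # Unique: count of fully duplicated rows
    report["duplicate_rows"] = int(df.duplicated().sum())
    # Timely: days since the most recent record, if a timestamp column exists
    if timestamp_col in df.columns:
        latest = pd.to_datetime(df[timestamp_col]).max()
        report["days_since_update"] = (pd.Timestamp.now() - latest).days
    # Consistent: object columns mixing Python types often signal format drift
    report["mixed_type_columns"] = [
        col for col in df.select_dtypes(include="object").columns
        if df[col].dropna().map(type).nunique() > 1
    ]
    return report

# Toy example: one duplicate row, one missing email, one stale record
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "email": ["a@example.com", "b@example.com", "b@example.com", None],
    "updated_at": ["2025-01-10", "2025-01-12", "2025-01-12", "2024-06-01"],
})
print(data_quality_report(df))
```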

Simon Ninan, SVP of Business Strategy at Hitachi Vantara, says: “The adoption of AI depends very heavily on trust of users in the system and in the output. Adoption is basically like your early experiences. If they’re tainted, it taints your future adoption. So, data quality matters from the outset, or at least achieving a base level.”

Learn more about how your AI models can meet the “Trust Threshold” with our AI and analytics solutions.

SOURCE: All statistics referenced can be found in our 2024 report How AI is Shifting Data's Foundation, a survey of 1,200 IT decision-makers at large organizations across 15 countries.

Something take your fancy?

Want to discuss something you've read? Let's make it a date.