Laying the Foundations: Data Quality in the Age of AI

October 2023

Like any data initiative, AI is set to fail without strong data quality. The key to unlocking the power of data quality lies in robust data governance. As we usher in the era of AI, ensuring the quality of the data that informs large language models is more crucial than ever before. In this session, Susan Walsh, Founder at The Classification Guru, and Scott Taylor, the Data Whisperer, walk us through how data leaders can make meaningful gains on their data quality initiatives, and the nuances of scaling a data quality initiative with AI in mind.

Summary

Data quality and management are essential for effective data-driven decision-making and storytelling. Susan Walsh, the Classification Guru, and Scott Taylor, the Data Whisperer, discuss the subtleties of maintaining high data quality and the consequences of neglecting it. They highlight the need for consistent, organized, accurate, and trustworthy data to support business strategies. With engaging stories, including the infamous 7-Eleven data error and a $400 million fine faced by Citibank due to poor data governance, the speakers demonstrate the real-world consequences of data mismanagement. They address the difficulties of persuading management to invest in data quality and share strategies for implementing effective data management frameworks. Generative AI's role in data management is also discussed, emphasizing the importance of clean input data to avoid flawed AI outputs. The session concludes with practical advice for maintaining data quality, addressing organizational data challenges, and using external resources to mitigate bias.

Key Takeaways:

  • Data quality is vital for accurate data-driven decision-making and effective storytelling.
  • Inconsistent data formats can lead to significant business errors and financial consequences.
  • Convincing management to invest in data quality requires presenting it in terms of business impact.
  • Generative AI needs clean data inputs to provide reliable outputs.
  • Continuous vigilance and organizational buy-in are necessary to maintain high data quality.

Deep Dives

Significance of Data Quality

Data quality is essential for leveraging data for business insights. Susan Walsh, known as the Classification Guru, and Scott Taylor, the Data Whisperer, emphasize that poor data quality can derail even the best data-driven strategies. Scott introduces his philosophy "truth before meaning," highlighting the necessity of accurate data as a precursor to meaningful analysis. They illustrate their points with examples, such as Citibank's $400 million fine due to inadequate data governance, to demonstrate the tangible consequences of neglecting data quality. Susan shares her practical framework, COAT, which stands for Consistent, Organized, Accurate, and Trustworthy data, as a guide for organizations aiming to improve their data quality. "Make sure your data has its COAT on," advises Susan, highlighting the need for continuous attention to data management.
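
To make the "Consistent" part of COAT concrete, here is a minimal Python sketch of one possible consistency check: mapping differently typed supplier names onto a single canonical label. It is not taken from the session. The names `CANONICAL`, `normalize_key`, and `canonical_name`, and the "Acme" supplier, are hypothetical and used only for illustration.

```python
import re
from typing import Optional

# Hypothetical lookup of cleaned-up spellings to one canonical label.
# In practice this mapping would come from your own reference data.
CANONICAL = {
    "acme": "Acme Ltd",
    "acme ltd": "Acme Ltd",
    "acme limited": "Acme Ltd",
}

def normalize_key(raw: str) -> str:
    """Lowercase, turn runs of punctuation/whitespace into one space, trim."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()

def canonical_name(raw: str) -> Optional[str]:
    """Return the canonical label, or None when the name is not recognized."""
    return CANONICAL.get(normalize_key(raw))

for name in ["ACME Ltd.", "Acme, Limited", "  acme", "Globex"]:
    print(repr(name), "->", canonical_name(name))
```

Returning `None` for unknown names, rather than guessing, leaves unrecognized entries visible for a person to review.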

Challenges in Data Quality and Solutions

Achieving high data quality is not without its difficulties. Susan and Scott discuss the challenge of securing management support for data quality initiatives. They stress the significance of articulating data quality's impact on business outcomes, such as profitability and efficiency. Scott advocates for a narrative approach to gain support, suggesting that data management professionals tell compelling stories that connect data quality improvements to strategic business goals. Susan adds that data problems are often people problems, urging organizations to educate employees on the impact of their data handling on colleagues and the broader business. The speakers agree that prevention is more cost-effective than fixing the problem later, encouraging proactive investment in data management systems before issues arise.

The Role of Generative AI in Data Management

The rise of generative AI presents new opportunities and challenges for data management. Scott warns that AI's effectiveness depends on the quality of the input data, coining the phrase "artificial stupidity" to describe the consequences of feeding AI systems with poor data. He advises caution when relying on AI for data cleaning tasks, as AI can exacerbate existing data issues if not carefully managed. Susan echoes this sentiment, noting that while AI can assist with routine data tasks, it cannot replace the need for human oversight to ensure data integrity. They conclude that while AI can augment data management efforts, a fundamental understanding of data quality principles remains essential.

Practical Tips for Maintaining Data Quality

Maintaining high data quality requires continuous effort and organizational commitment. Susan and Scott share practical tips for embedding data quality practices into business processes. They recommend establishing clear data standards and engaging employees across departments to uphold these standards. Scott suggests using external data providers to supplement internal data efforts and mitigate bias. Susan stresses the importance of consistent data management practices, advising organizations to keep their "data COAT" on year-round to prevent regression into poor data habits. Both speakers advocate for a culture of continuous improvement, encouraging data professionals to regularly review and refine their data management processes.
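
One way to read "clear data standards" is as a small set of written rules that every record can be checked against. The Python sketch below illustrates that idea under stated assumptions; it is not a method described by the speakers. The field names (`country`, `order_date`, `amount`), the rules themselves, and the helper `find_violations` are hypothetical, and all values are assumed to arrive as strings.

```python
import re
from datetime import datetime

def is_iso_date(value: str) -> bool:
    """True when the value parses as a year-month-day date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

# Hypothetical standards: one check per field.
RULES = {
    "country": lambda v: bool(re.fullmatch(r"[A-Z]{2}", v)),  # two uppercase letters
    "order_date": is_iso_date,
    "amount": lambda v: v.replace(".", "", 1).isdigit(),      # non-negative number
}

def find_violations(records):
    """Return (row index, field, offending value) for every failed check."""
    violations = []
    for i, rec in enumerate(records):
        for field, check in RULES.items():
            value = rec.get(field, "")  # a missing field fails its check
            if not check(value):
                violations.append((i, field, value))
    return violations

records = [
    {"country": "GB", "order_date": "2023-10-05", "amount": "12.50"},
    {"country": "uk", "order_date": "05/10/2023", "amount": "twelve"},
]
for row, field, value in find_violations(records):
    print(f"row {row}: {field} = {value!r} breaks the standard")
```

Running a check like this on a schedule, rather than once, matches the session's point that data quality needs continuous attention.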

