
Post-Deployment Data Science

Hakim Elakhrass talks about post-deployment data science, the real-world use cases for tools like NannyML, the potentially catastrophic effects of unmonitored models in production, the most important skills for modern data scientists to cultivate, and more.

Aug 2022

Key Quotes

Discrimination and bias in a model are unethical, and the impact can be catastrophic to a business. Unfortunately, the cause can be as simple as this: when you built your model, you didn't see bias against certain demographics because you didn't have enough of them in your data. Over time, more and more of a certain demographic enters your data, and the model can't make good decisions for them. That is extremely detrimental from a financial and business perspective, because if your model is discriminating against a certain segment, then you're obviously not doing the best for the company. Worst of all, it’s not fair to the people you're making predictions about.

Actually putting models into production is what will set you apart as a data scientist. It's an important skill that, unfortunately, not many data scientists have. They should also really have a grasp of the business impact of the model. A model is more than just its performance or technical metrics. Why are you building this model? What value does it add and how is that value changing over time? How is it impacting other departments? Obviously data scientists need technical skills, but they must also have a deep intuition about why they are doing what they are doing.

Key Takeaways

1. Whether or not you know what actually happens in the real world after the prediction, understanding model performance is still challenging from both an engineering perspective and a data science perspective.

2. Data scientists need to cultivate a thorough understanding of a model’s potential business impacts as well as the technical metrics of the model.

3. Making machine learning tools open source builds trust with users and enables a community-based approach for getting feedback.

About Hakim Elakhrass


Guest
Hakim Elakhrass

Hakim Elakhrass is the Co-Founder and CEO of NannyML, an open-source Python library that allows data scientists to estimate post-deployment model performance, detect data drift, and link data drift alerts back to model performance changes. Originally, Hakim started a machine learning consultancy with his NannyML co-founders, and the need for monitoring quickly arose, leading to the development of NannyML.


Host
Adel Nehme

Adel is a data science educator, speaker, and evangelist at DataCamp, where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations, and about the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
