
Python NaN: 4 Ways to Check for Missing Values in Python

Explore 4 ways to detect NaN values in Python, using NumPy and Pandas. Learn key differences between NaN and None to clean and analyze data efficiently.
Feb 2024  · 5 min read

In data science and analytics, missing data is more the rule than the exception. Missing values can skew analysis, lead to incorrect conclusions, and disrupt data processing. Addressing these gaps is crucial for maintaining the integrity of your analysis. This article shows several ways to identify NaN (Not a Number) values in Python.

The Short Answer: Use either NumPy’s `isnan()` function or Pandas `.isna()` method

When dealing with missing values in Python, the approach largely depends on the data structure you're working with.

For Single Values or Arrays: Use NumPy

NumPy's `isnan()` function is ideal for identifying NaNs in numeric arrays or single values, offering a straightforward and efficient solution. Here it is in action!

```python
import numpy as np

# Single value check
my_missing_value = np.nan
print(np.isnan(my_missing_value))  # Output: True

# Array check
my_missing_array = np.array([1, np.nan, 3])
nan_array = np.isnan(my_missing_array)
print(nan_array)  # Output: [False  True False]
```
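The boolean array returned by `np.isnan()` can also be used as a mask. The sketch below, which uses a hypothetical `readings` array not taken from the examples above, shows one way to count the NaNs and to keep only the valid entries.

```python
import numpy as np

# Hypothetical sample data for illustration
readings = np.array([1.0, np.nan, 3.0, np.nan])

nan_mask = np.isnan(readings)         # True where the value is NaN
nan_count = int(nan_mask.sum())       # each True counts as 1
clean_readings = readings[~nan_mask]  # invert the mask to keep non-NaN values

print(nan_count)       # 2
print(clean_readings)  # [1. 3.]
```

Inverting the mask with `~` is a common way to filter, though whether dropping values is appropriate depends on your analysis.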

For DataFrames: Use Pandas

Pandas provides comprehensive methods like `.isna()` and `.isnull()` to detect missing values across DataFrame or Series objects, seamlessly integrating with data analysis workflows.

```python
import pandas as pd
import numpy as np

my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

print(my_dataframe.isna())
```

When you run this code, the output is a DataFrame of booleans with the same shape as the original, where `True` marks each missing value:

```
   Column1  Column2
0    False    False
1    False     True
2     True    False
```
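Because the result of `.isna()` is itself a DataFrame of booleans, it can be summarized with ordinary pandas operations. As a minimal sketch using the same example data, the following counts missing values per column and selects the rows that contain at least one missing value. The variable names `missing_per_column` and `rows_with_missing` are illustrative only.

```python
import numpy as np
import pandas as pd

my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

# Sum the boolean mask column by column (True counts as 1)
missing_per_column = my_dataframe.isna().sum()

# Keep rows where any column is missing
rows_with_missing = my_dataframe[my_dataframe.isna().any(axis=1)]

print(missing_per_column.to_dict())      # {'Column1': 1, 'Column2': 1}
print(rows_with_missing.index.tolist())  # [1, 2]
```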

The Difference Between `NaN` and `None`

Understanding the distinction between `NaN` and `None` is crucial in Python. `NaN` is a special floating-point value meaning "Not a Number," used primarily in numerical computations; it is a `float`, and it never compares equal to anything, including itself. `None`, on the other hand, is Python's object representing the absence of a value, akin to null in other languages. While `NaN` is used in mathematical or scientific computations, `None` is more general-purpose, indicating the lack of data.
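A short sketch in plain Python can make the difference concrete. The names `nan_value` and `none_value` are chosen here for illustration.

```python
import math

nan_value = float("nan")
none_value = None

# NaN is a float; None has its own type
print(type(nan_value).__name__)   # float
print(type(none_value).__name__)  # NoneType

# NaN never compares equal to anything, including itself
print(nan_value == nan_value)     # False

# None is a singleton, so an identity check is the idiomatic test
print(none_value is None)         # True

# NaN propagates through arithmetic; None raises a TypeError instead
print(math.isnan(nan_value + 1))  # True
try:
    none_value + 1
except TypeError:
    print("None does not support arithmetic")
```

This is why `==` is not a reliable way to test for `NaN`, and why the dedicated functions covered in this article exist.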

4 Ways to Check for NaN in Python

Identifying missing values is a critical step in data preprocessing. Let's explore four practical methods to check for `NaN` values in Python, building on the examples we've already used.

1. Checking for `NaN` using `np.isnan()`

As we saw earlier, NumPy provides a straightforward approach to identifying `NaN` values in both single values and arrays, which is essential for numerical data analysis.

```python
import numpy as np

# Checking a single value
print(np.isnan(np.nan))  # Output: True

# Checking an array
my_array = np.array([1, 5, np.nan])
print(np.isnan(my_array))  # Output: [False False  True]
```
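One caveat worth illustrating: `np.isnan()` is defined for numeric input, so it raises a `TypeError` on object arrays that hold strings or `None`. The sketch below uses a hypothetical `mixed` array to show this behavior alongside an integer array, which works fine.

```python
import numpy as np

# Hypothetical object array mixing strings, None, and NaN
mixed = np.array(["Python", None, np.nan], dtype=object)

try:
    np.isnan(mixed)
    isnan_supported = True
except TypeError:
    # np.isnan is not implemented for object dtype
    isnan_supported = False

print(isnan_supported)  # False

# Integer arrays are accepted and never contain NaN
integer_mask = np.isnan(np.array([1, 2, 3]))
print(integer_mask)  # [False False False]
```

For mixed or text data like this, the Pandas functions described next are usually a better fit.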

2. Checking for `NaN` using `pd.isna()`

Pandas simplifies detecting NaN values in data structures, from scalars to complex DataFrames, making it invaluable for data manipulation tasks.

```python
import numpy as np
import pandas as pd

# Checking a single value
print(pd.isna(np.nan))  # Output: True

# Checking a pandas Series
my_series = pd.Series(["Python", np.nan, "The Best"])
print(my_series.isna())
# Output:
# 0    False
# 1     True
# 2    False
# dtype: bool

# Checking a pandas DataFrame
my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

print(pd.isna(my_dataframe))  # Outputs a DataFrame with True for missing values
```
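Unlike `np.isnan()`, `pd.isna()` accepts arbitrary objects and treats `None` and `pd.NaT` as missing in addition to `NaN`. As a small sketch with a hypothetical `candidates` list, note that strings such as `""` or `"NaN"` are ordinary values and are not flagged.

```python
import numpy as np
import pandas as pd

# Hypothetical scalar values to test
candidates = [np.nan, None, pd.NaT, "", "NaN", 0]

results = [bool(pd.isna(value)) for value in candidates]
print(results)  # [True, True, True, False, False, False]
```

If your data encodes missing values as text placeholders, those need to be converted before `pd.isna()` will detect them.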

3. Checking for `NaN` in DataFrames using Pandas `.isna()` or `.isnull()` methods

Pandas DataFrames and Series also offer `.isna()` and `.isnull()` as methods, giving a clear overview of data completeness. `.isnull()` is an alias of `.isna()`, so the two always return the same result.

```python
import numpy as np
import pandas as pd

# Create a dataframe with missing values
my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

print(my_dataframe.isna())
# Output:
#    Column1  Column2
# 0    False    False
# 1    False     True
# 2     True    False

print(my_dataframe.isnull())
# Output:
#    Column1  Column2
# 0    False    False
# 1    False     True
# 2     True    False
```
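To confirm that the two methods agree, and to show the complementary `.notna()` method, here is a minimal sketch on the same example data. The variable names are illustrative, and `.dropna()` is included only as one possible next step after detection.

```python
import numpy as np
import pandas as pd

my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

# .isna() and .isnull() produce identical boolean DataFrames
same_result = my_dataframe.isna().equals(my_dataframe.isnull())
print(same_result)  # True

# .notna() is the element-wise inverse of .isna()
is_inverse = my_dataframe.notna().equals(~my_dataframe.isna())
print(is_inverse)  # True

# One possible follow-up: keep only rows with no missing values
complete_rows = my_dataframe.dropna()
print(complete_rows.index.tolist())  # [0]
```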

4. Checking for `NaN` in single values using `math.isnan()`

For individual number checks, the `math.isnan()` function offers a simple yet effective solution, especially when dealing with pure Python data types.

```python
import math

# Assuming my_number is a float or can be converted to one
my_number = float('nan')
print(math.isnan(my_number))  # Output: True
```
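Note that `math.isnan()` raises a `TypeError` when given a non-numeric argument such as a string or `None`. If the input type is not guaranteed, a small wrapper can guard the call. The `is_nan` helper below is a hypothetical sketch, not part of the standard library.

```python
import math

def is_nan(value):
    """Hypothetical helper: True only for a float NaN, False for anything else."""
    return isinstance(value, float) and math.isnan(value)

print(is_nan(float("nan")))  # True
print(is_nan(1.5))           # False
print(is_nan("nan"))         # False (a string, not a float)
print(is_nan(None))          # False

# Calling math.isnan directly on a string fails
try:
    math.isnan("nan")
except TypeError:
    print("math.isnan requires a real number")
```

The `isinstance` check short-circuits before `math.isnan()` is called, which is what keeps the helper from raising on non-float input.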

Identifying and managing NaN values is a fundamental step in cleaning and preparing your data for analysis. Whether you're working with arrays, Series, or DataFrames, understanding the tools and methods available in Python to deal with missing data is essential.

Author

Adel is a Data Science educator, speaker, and Evangelist at DataCamp where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
