course

Fraud Detection in Python

Intermediate

Updated 12/2024

Learn how to detect fraud using Python.

Python | Machine Learning | 4 hours | 16 videos | 57 exercises | 4,800 XP | 18,902 | Statement of Accomplishment

Course Description

A typical organization loses an estimated 5% of its yearly revenue to fraud. In this course, you will learn how to fight fraud using data. You'll apply supervised learning algorithms to detect fraudulent behavior that resembles past cases, and unsupervised learning methods to discover new types of fraud. Fraud analytics often involves highly imbalanced datasets, because fraud cases are far rarer than non-fraud cases, and you will pick up techniques for dealing with that imbalance. The course mixes technical and theoretical insights with hands-on practice implementing fraud detection models. You will also get tips and advice drawn from real-life experience to help you avoid common mistakes in fraud analytics.
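
To make the imbalance problem concrete, here is a minimal sketch that measures the fraud share of a dataset and then balances the classes by randomly duplicating fraud rows. It is an illustration, not the course's own code: the helper names (`fraud_ratio`, `random_oversample`) and the toy data are assumptions, and it uses plain random oversampling rather than SMOTE, which the course outline names.

```python
import random
from collections import Counter


def fraud_ratio(labels):
    """Return the share of observations labeled 1 (fraud)."""
    return sum(labels) / len(labels)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes match in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    majority_count = max(counts.values())
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    n_extra = majority_count - counts[minority]
    extra = [rng.choice(minority_rows) for _ in range(n_extra)]
    return rows + extra, labels + [minority] * n_extra


# Hypothetical toy data: 20 transactions with made-up amounts, 2 of them fraud.
amounts = [[float(a)] for a in range(10, 210, 10)]
labels = [0] * 18 + [1] * 2

print(fraud_ratio(labels))  # 0.1
balanced_rows, balanced_labels = random_oversample(amounts, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test data keeps its original class ratio.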

Prerequisites

Unsupervised Learning in Python

Supervised Learning with scikit-learn

1

Introduction and preparing your data

Introduction to fraud detection

Checking the fraud to non-fraud ratio

Plotting your data

Increasing successful detections using data resampling

Resampling methods for imbalanced data

Applying SMOTE

Compare SMOTE to original data

Fraud detection algorithms in action

Exploring the traditional way to catch fraud

Using ML classification to catch fraud

Logistic regression combined with SMOTE

Using a pipeline

2

Fraud detection using labeled data

Review of classification methods

Natural hit rate

Random Forest Classifier - part 1

Random Forest Classifier - part 2

Performance evaluation

Performance metrics for the RF model

Plotting the Precision Recall Curve

Adjusting your algorithm weights

Model adjustments

Adjusting your Random Forest to fraud detection

GridSearchCV to find optimal parameters

Model results using GridSearchCV

Ensemble methods

Logistic Regression

Voting Classifier

Adjust weights within the Voting Classifier
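
The performance evaluation lessons in this chapter center on precision and recall. As a rough sketch of what those metrics measure, the snippet below computes them from true and predicted labels and compares model accuracy with a baseline that always predicts non-fraud. The function names and the toy labels are hypothetical and are not taken from the course.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class, labeled 1 (fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Return the share of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up labels: 4 fraud cases among 10 observations.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0, 1, 1, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
baseline = accuracy(y_true, [0] * len(y_true))  # always predict non-fraud

print(precision, recall)
print(accuracy(y_true, y_pred), baseline)
```

On imbalanced data the always-non-fraud baseline already scores well on accuracy, which is why precision and recall are the more informative numbers.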

3

Fraud detection using unlabeled data

Normal versus abnormal behavior

Exploring your data

Customer segmentation

Using statistics to define normal behavior

Clustering methods to detect fraud

Scaling the data

K-means clustering

Elbow method

Assigning fraud versus non-fraud

Detecting outliers

Checking model results

Other clustering fraud detection methods

Assessing smallest clusters

Checking results
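
When no labels are available, one simple way to define "normal" behavior is statistical: flag observations that sit far from the mean. The sketch below uses a z-score rule for that purpose. It stands in for, and is much simpler than, the clustering methods listed in this chapter. The function name, the threshold of 2.5, and the amounts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=2.5):
    """Return one boolean per value: True when its absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing is abnormal.
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


# Made-up transaction amounts with one unusually large payment at the end.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]

flags = flag_outliers(amounts)
print(flags)
```

In small samples a single extreme value inflates the standard deviation, so a stricter threshold such as 3.0 would miss the large payment in this toy example.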

4

Fraud detection using text

Using text data

Word search with dataframes

Using list of terms

Creating a flag

Text mining to detect fraud

Removing stopwords

Cleaning text data

Topic modeling on fraud

Create dictionary and corpus

Flagging fraud based on topics

Interpreting the topic model

Finding fraudsters based on topic
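
The first lessons of this chapter cover searching text for a list of terms and creating a flag. A minimal sketch of that idea, assuming a plain list of message strings instead of a dataframe, is shown below. The function name, the search terms, and the messages are hypothetical.

```python
import re


def flag_messages(messages, terms):
    """Return 1 for each message containing any search term (case-insensitive), else 0."""
    # re.escape keeps punctuation in a term from being read as regex syntax.
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [1 if pattern.search(m) else 0 for m in messages]


# Made-up messages and search terms for illustration only.
messages = [
    "Please SELL the stock before the announcement",
    "Lunch at noon?",
    "Move the money to the offshore account",
]
terms = ["sell the stock", "offshore"]

flags = flag_messages(messages, terms)
print(flags)  # [1, 0, 1]
```

A term list only catches wording someone has already thought of, which is one reason the chapter goes on to text cleaning and topic modeling.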
