Fraud Detection in R

Learn to detect fraud with analytics in R.

4 hours, 16 videos, 49 exercises, 6,978 learners, Statement of Accomplishment


Course description

The Association of Certified Fraud Examiners estimates that fraud costs organizations worldwide $3.7 trillion a year and that a typical company loses five percent of its annual revenue to fraud. Fraud attempts are expected to increase further, making fraud detection necessary in most industries. This course shows how fraud patterns learned from historical data can be used to fight fraud. Techniques from robust statistics and digit analysis are presented for detecting unusual observations that are likely associated with fraud. Two main challenges in building a supervised fraud detection tool are the imbalance, or skewness, of the data and the differing costs of the various types of misclassification. The course presents techniques to address these issues, using artificial and real datasets from a wide variety of fraud applications.


1
Introduction & Motivation
Free
This chapter first gives a formal definition of fraud. You will then learn how to flag suspicious transactions by detecting anomalies in the payment methods used or in the times at which payments are made.
Introduction & Motivation
50 xp
Imbalanced class distribution
100 xp
Cost of not detecting fraud
100 xp
Time features
50 xp
Circular histogram
100 xp
Suspicious timestamps
100 xp
Frequency features
50 xp
Frequency feature for one account
100 xp
Frequency feature for multiple accounts
100 xp
Recency features
50 xp
Recency feature
100 xp
Comparing frequency & recency
100 xp
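
The time, frequency and recency features named in this chapter can be illustrated with a small sketch. It is written in Python purely for illustration (the course itself works in R), and the function names, the exponential decay form of the recency feature and the value of `gamma` are assumptions, not the course's own code. The first function averages times of day on a 24-hour circle, which is the idea behind a circular histogram; the second counts earlier transactions and scores how recent the previous one was.

```python
import math
from datetime import datetime


def circular_mean_hour(hours):
    """Mean time of day on a 24-hour clock, returned in hours."""
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24


def frequency_and_recency(timestamps, gamma=0.05):
    """For each transaction of one account (sorted by time), return a pair:
    (number of earlier transactions, exp(-gamma * days since the previous one)).
    gamma is a hypothetical decay rate chosen for illustration."""
    features = []
    for i, t in enumerate(timestamps):
        if i == 0:
            recency = 0.0  # no earlier transaction: treated as "not recent"
        else:
            days = (t - timestamps[i - 1]).total_seconds() / 86400
            recency = math.exp(-gamma * days)
        features.append((i, recency))
    return features


# Transactions at 23:00 and 01:00 average to around midnight on the clock,
# whereas their arithmetic mean would be 12 (noon).
print(circular_mean_hour([23, 1]))

stamps = [datetime(2024, 1, 1), datetime(2024, 1, 2), datetime(2024, 1, 12)]
for freq, rec in frequency_and_recency(stamps):
    print(freq, round(rec, 3))
```

A low frequency combined with a low recency marks a transaction that is unusual for the account, which is why the two features are often inspected together.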
2
Social network analytics
In the second chapter, you will learn how to use networks to fight fraud. You will visualize networks and use a sociology concept called homophily to detect fraudulent transactions and catch fraudsters.
Social network analytics
50 xp
Analyzing a network
100 xp
Overlapping edges
100 xp
Fraud and social network analysis
50 xp
Looking for homophily in a network
100 xp
Visualizing node attributes
100 xp
Social network based inference
50 xp
Relational vs non-relational models
50 xp
Relational neighbor classifier
100 xp
Social network metrics
50 xp
Degree, closeness & betweenness
100 xp
Adding network features
100 xp
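
As a rough illustration of homophily-based inference, the sketch below scores an unlabelled node by the share of its labelled neighbours that are known fraudsters, which is the basic idea of a relational neighbor classifier. It is a minimal Python sketch, not the course's R code: the toy graph, the labels and the function names are all hypothetical, and edges are treated as unweighted.

```python
def degree(graph, node):
    """Number of direct neighbours of a node in an undirected graph."""
    return len(graph[node])


def relational_neighbor_score(graph, labels, node):
    """Share of a node's labelled neighbours that are fraudulent.
    Returns None when no neighbour has a known label."""
    known = [labels[n] for n in graph[node] if n in labels]
    if not known:
        return None
    return sum(known) / len(known)


# Hypothetical undirected network stored as an adjacency dictionary.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C"},
}
labels = {"B": 1, "C": 1, "D": 0}  # 1 = fraud, 0 = legitimate; "A" is unknown

print(degree(graph, "A"))
print(relational_neighbor_score(graph, labels, "A"))
```

Scores like this one, along with degree, closeness and betweenness, can be added as extra columns to a transaction dataset before a model is trained.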
3
Imbalanced class distributions
Fortunately, fraud occurrences are rare. However, this means that you are working with imbalanced data, which, if left as is, will bias your detection models. In this chapter, you will tackle imbalance using over-sampling and under-sampling methods.
Dealing with imbalanced datasets
50 xp
How to deal with class imbalance?
50 xp
Visualizing patterns in the data
100 xp
Random over-sampling
100 xp
Random under-sampling
50 xp
Shrinking the majority group
100 xp
Combining ROS & RUS
100 xp
Synthetic Over-sampling
50 xp
Have you met SMOTE?
50 xp
SMOTE
100 xp
From dataset to detection model
50 xp
Build your own detection model
100 xp
True cost of fraud detection
100 xp
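
The three resampling ideas in this chapter can be sketched in a few lines. The code below is a simplified Python illustration under stated assumptions, not the course's R implementation: the function names are hypothetical, points are plain numeric tuples, and the SMOTE-style step only shows the core interpolation between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def random_over_sample(minority, target_size, rng):
    """Duplicate minority rows at random until target_size rows exist."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def random_under_sample(majority, target_size, rng):
    """Keep a random subset of the majority rows, without replacement."""
    return rng.sample(majority, target_size)


def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating between a
    random minority point and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(xi + u * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic


rng = random.Random(0)  # fixed seed so the sketch is reproducible
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]      # hypothetical fraud cases
majority = [(float(i), 5.0) for i in range(20)]       # hypothetical legitimate cases

print(len(random_over_sample(minority, 10, rng)))
print(len(random_under_sample(majority, 5, rng)))
print(len(smote_like(minority, 4, 2, rng)))
```

Resampling should be applied to the training data only, so that the test set keeps the original class distribution when the model is evaluated.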
4
Digit analysis and robust statistics
In this final chapter, you will learn about a surprising mathematical law used to detect suspicious occurrences. You will then use robust statistics to make your models even more bulletproof.
Digit analysis using Benford's law
50 xp
Benford's Law for first digit
100 xp
Conformity of census data
100 xp
Benford's Law for fraud detection
50 xp
Conformity to Benford's Law
50 xp
Fire insurance claims
100 xp
Payments data set
100 xp
Detecting univariate outliers
50 xp
Computing robust z-scores
100 xp
Boxplot
100 xp
Detecting multivariate outliers
50 xp
Multivariate outlier detection
100 xp
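
Two of the ideas in this chapter lend themselves to a short worked example. Benford's law gives the expected share of first digit d as log10(1 + 1/d), and a robust z-score replaces the mean and standard deviation with the median and the median absolute deviation (MAD). The Python sketch below is an illustration only: the function names, the sample amounts and the cutoff of 3 are assumptions, and 1.4826 is the usual constant that makes the MAD comparable to a standard deviation for normally distributed data.

```python
import math
import statistics


def benford_first_digit():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    return int(str(abs(x)).lstrip("0.")[0])


def robust_z_scores(values):
    """(value - median) / (1.4826 * MAD).
    Fails with a division by zero if the MAD is zero."""
    med = statistics.median(values)
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    return [(v - med) / mad for v in values]


probs = benford_first_digit()
print(round(probs[1], 3), round(probs[9], 3))

amounts = [102.0, 98.0, 101.0, 99.0, 100.0, 950.0]  # hypothetical payments
print([first_digit(a) for a in amounts])

# Flag amounts whose robust z-score exceeds a cutoff of 3 in absolute value.
flagged = [a for a, z in zip(amounts, robust_z_scores(amounts)) if abs(z) > 3]
print(flagged)
```

Because the median and MAD barely move when one extreme value is present, the outlier cannot hide itself by inflating the scale estimate, which is what happens with the classical z-score.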


Datasets

Chapter 1 datasets, Chapter 2 datasets, Chapter 3 datasets, Chapter 4 datasets

Collaborators

Hadrien Lacroix

Sara Billen

Chester Ismay

Prerequisites

Unsupervised Learning in R
Supervised Learning in R: Classification

Bart Baesens

Professor in Analytics and Data Science at KU Leuven

Sebastiaan Höppner

PhD researcher in Data Science at KU Leuven

Tim Verdonck

Professor at KU Leuven
