Feature Engineering for NLP in Python

Advanced

4+

Updated 12/2024

Learn techniques to extract useful information from text and process it into a format suitable for machine learning.

Python | Machine Learning | 4 hours | 15 videos | 52 exercises | 4,200 XP | 25,228 | Statement of Accomplishment

Course Description

In this course, you will learn techniques for extracting useful information from text and processing it into a format suitable for machine learning models. More specifically, you will learn about part-of-speech (POS) tagging, named entity recognition, readability scores, and the n-gram and tf-idf models, and how to implement them using scikit-learn and spaCy. You will also learn to compute how similar two documents are to each other. Along the way, you will predict the sentiment of movie reviews and build movie and TED Talk recommenders. After the course, you will be able to engineer critical features out of any text and solve some of the most challenging problems in data science!
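To give a rough sense of what "computing how similar two documents are" can look like, here is a minimal sketch that builds tf-idf weighted vectors and compares them with cosine similarity. It uses only the Python standard library rather than scikit-learn or spaCy, so it is not the course's own code. The helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the smoothed idf formula, and the three sample plot lines are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (term count * smoothed idf)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Assumed smoothing: idf = ln((1 + N) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Hypothetical plot lines, invented for this example.
docs = [
    "A young wizard attends a school of magic.",
    "A wizard student learns magic at a hidden school.",
    "Two detectives chase a thief across the city.",
]
vectors = tfidf_vectors(docs)

print(cosine_similarity(vectors[0], vectors[1]))  # two wizard plots
print(cosine_similarity(vectors[0], vectors[2]))  # wizard plot vs. detective plot
```

The two wizard plots share several terms and so score higher than the wizard and detective pair. A recommender of the kind the description mentions could, under the same assumptions, rank all other documents by this score and return the top few.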

Prerequisites

Introduction to Natural Language Processing in Python
Supervised Learning with scikit-learn

1

Basic features and readability scores

Introduction to NLP feature engineering

Data format for ML algorithms

One-hot encoding

Basic feature extraction

Character count of Russian tweets

Word count of TED talks

Hashtags and mentions in Russian tweets

Readability tests

Readability of 'The Myth of Sisyphus'

Readability of various publications

2

Text preprocessing, POS tagging and NER

Tokenization and Lemmatization

Identifying lemmas

Tokenizing the Gettysburg Address

Lemmatizing the Gettysburg Address

Text cleaning

Cleaning a blog post

Cleaning TED talks in a dataframe

Part-of-speech tagging

POS tagging in Lord of the Flies

Counting nouns in a piece of text

Noun usage in fake news

Named entity recognition

Named entities in a sentence

Identifying people mentioned in a news article

3

N-Gram models

Building a bag of words model

Word vectors with a given vocabulary

BoW model for movie taglines

Analyzing dimensionality and preprocessing

Mapping feature indices with feature names

Building a BoW Naive Bayes classifier

BoW vectors for movie reviews

Predicting the sentiment of a movie review

Building n-gram models

n-gram models for movie tag lines

Higher order n-grams for sentiment analysis

Comparing performance of n-gram models

4

TF-IDF and similarity scores

Building tf-idf document vectors

tf-idf weight of commonly occurring words

tf-idf vectors for TED talks

Cosine similarity

Range of cosine scores

Computing dot product

Cosine similarity matrix of a corpus

Building a plot line based recommender

Comparing linear_kernel and cosine_similarity

Plot recommendation engine

The recommender function

TED talk recommender

Beyond n-grams: word embeddings

Generating word vectors

Computing similarity of Pink Floyd songs

Congratulations!

Don’t just take our word for it

Rating: 4 from 13 reviews

Gildas T.

7 days

Great course, learned a lot. I can already apply my knowledge to get insights from textual data, and not only that, but also understand and predict sentiments. Fantastic. However, whilst the teaching content are great, some code blocks or line needs to be updated. Thank you for the course!

Hamed H.

8 months

The course is really Straight forward and it contains many tasks to do which is really nice

Ankush B.

9 months

Excellent course for learning concepts related to feature engineering in NLP and their application in Python.

Cherlynn A.

about 1 year

Thank you. It was incredible.

Pierre-Etienne T.

over 1 year

Synthetic, exciting and relevant