Case Study: School Budgeting with Machine Learning in Python

Learn how to build a model to automatically classify items in a school budget.

4 hours | 15 videos | 51 exercises | 59,276 learners | Statement of Accomplishment

Course Description

Data science isn't just for predicting ad clicks; it's also useful for social impact! This course is a case study from a machine learning competition on DrivenData. You'll explore a problem related to school district budgeting. A model that automatically classifies items in a school's budget makes it easier and faster for schools to compare their spending with that of other schools. In this course, you'll begin by building a simple, first-pass baseline model. In particular, you'll do some natural language processing to prepare the budgets for modeling. Next, you'll have the opportunity to try your own techniques and see how they compare to those of the competition's participants. Finally, you'll see how the winner combined a number of expert techniques to build the most accurate model.
  1. Exploring the raw data

    Free

    In this chapter, you'll be introduced to the problem you'll be solving in this course: how do you accurately classify line items in a school budget based on what the money is being used for? You will explore the raw text and numeric values in the dataset, both quantitatively and visually. You'll also learn how to measure success when trying to predict class labels for each row of the dataset.

    Introducing the challenge
    50 xp
    What category of problem is this?
    50 xp
    What is the goal of the algorithm?
    50 xp
    Exploring the data
    50 xp
    Loading the data
    50 xp
    Summarizing the data
    100 xp
    Looking at the datatypes
    50 xp
    Exploring datatypes in pandas
    50 xp
    Encode the labels as categorical variables
    100 xp
    Counting unique labels
    100 xp
    How do we measure success?
    50 xp
    Penalizing highly confident wrong answers
    50 xp
    Computing log loss with NumPy
    100 xp
  2. Creating a simple first model

    In this chapter, you'll build a first-pass model that is trained on numeric data only. Spoiler alert: throwing out all of the text data is bad for performance! But you'll learn how to format your predictions. Then, you'll be introduced to natural language processing (NLP) in order to start working with the large amounts of text in the data.

  3. Improving your model

    Here, you'll improve on your benchmark model using pipelines. Because the budget consists of both text and numeric data, you'll learn how to build pipelines that process multiple types of data. You'll also explore how the flexibility of the pipeline workflow makes testing different approaches efficient, even in complicated problems like this one!
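To give a feel for what a pipeline over mixed text and numeric data can look like, here is a minimal sketch using scikit-learn. It is not the course's or the competition winner's code. The toy DataFrame, the column names (`description`, `amount`, `label`), and the choice of `ColumnTransformer` with `LogisticRegression` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy budget data: one free-text column, one numeric column
# with a missing value, and one label column.
df = pd.DataFrame({
    "description": [
        "teacher salary", "bus fuel", "substitute teacher pay",
        "bus repair", "teacher bonus", "fuel for buses",
    ],
    "amount": [50000.0, 1200.0, np.nan, 800.0, 3000.0, 950.0],
    "label": [
        "instruction", "transport", "instruction",
        "transport", "instruction", "transport",
    ],
})

# Numeric branch: fill missing values, then scale.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Route each column to the preprocessing that suits its type.
# CountVectorizer expects a 1-D column, so it is given a single column name.
preprocess = ColumnTransformer([
    ("text", CountVectorizer(), "description"),
    ("numeric", numeric_steps, ["amount"]),
])

pipeline = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

features = df[["description", "amount"]]
pipeline.fit(features, df["label"])

# One row of class probabilities per budget line.
probabilities = pipeline.predict_proba(features)
```

Because every step lives inside one `Pipeline` object, swapping the vectorizer or the classifier is a one-line change, which is what makes comparing approaches quick.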

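Chapter 1 lists exercises on penalizing highly confident wrong answers and on computing log loss with NumPy. The sketch below shows the binary form of log loss under assumptions of our own: the function name `compute_log_loss`, the clipping constant `eps`, and the example arrays are hypothetical, and the metric used in the competition itself may be defined differently.

```python
import numpy as np

def compute_log_loss(predicted, actual, eps=1e-14):
    """Binary log loss between predicted probabilities and 0/1 labels."""
    # Clip so that log(0) never occurs for predictions of exactly 0 or 1.
    predicted = np.clip(np.asarray(predicted, dtype=float), eps, 1 - eps)
    actual = np.asarray(actual, dtype=float)
    return -np.mean(
        actual * np.log(predicted) + (1 - actual) * np.log(1 - predicted)
    )

# Hypothetical labels and three sets of predicted probabilities.
actual = np.array([1, 1, 0, 0])
confident_right = np.array([0.95, 0.95, 0.05, 0.05])
hedged = np.array([0.5, 0.5, 0.5, 0.5])
confident_wrong = np.array([0.05, 0.05, 0.95, 0.95])

loss_right = compute_log_loss(confident_right, actual)
loss_hedged = compute_log_loss(hedged, actual)
loss_wrong = compute_log_loss(confident_wrong, actual)
```

Lower is better. The confident and correct predictions score lowest, the 50/50 predictions sit in the middle, and the confident but wrong predictions are penalized most heavily.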

Contributors

Hugo Bowne-Anderson
Yashas Roy
Casey Fitzpatrick

Prerequisites

Supervised Learning with scikit-learn

Peter Bull

Co-founder of DrivenData
