Skip to main content

How to Practice SQL Using any Dataset with Workspace

Learn how DataCamp Workspace optimizes the experience of working with Jupyter notebooks and SQL. Discover how to effortlessly write SQL queries, connect to databases, analyze CSV files, and leverage the power of AI assistance
May 2023  · 9 min read

DataCamp Workspace builds on Jupyter notebooks, optimizing the experience to make you more productive.

One common pain point data analysts and data scientists have is that it can be tricky to get a Jupyter notebook hooked up to a database to start working. The problem can be even more acute for beginners: how can you learn to write SQL if you can't even connect to a database?

Workspace makes it effortless to write SQL - whether you are practicing your skills or working. This tutorial shows you how to get writing SQL in Workspace in seconds.

This tutorial assumes that you've used SQL before. If you haven't, read our SQL Tutorial for Beginners first, then try the examples from that tutorial in Workspace using the techniques shown here!

Getting Started

To use Workspace, you'll need to sign up to DataCamp. (A free account is fine to get started.)

Once you have your account, use the top navigation bar to go to Workspace.

DataCamp Workspace menu

This tutorial will use the built-in NBA Shooting Data dataset. If you prefer to use your own dataset, skip to the section below.

Use the left-hand navigation to go to the Datasets page.

DataCamp Workspace menu

Type "nba" into the search box and click on the dataset.

NBA dataset search

You'll see a preview of the dataset. Read through it, then click "Use Dataset". 

NBA Datase overview

Uploading your own dataset

You can use SQL to analyze any table-like data in Workspace. Currently, CSV files are best supported, so for learning purposes, if you supply your own dataset, it is recommended to use a CSV file.

From the Overview page, click "Empty" to create an empty workspace.

Create a new workspace

Give your workspace a name, then click "Create Workspace."

Naming your workspace

In the left-hand toolbar, open the Files tool.

Files tool

In the top-left of the tool, drag a data file into the file pane to upload it. (You can also click "browse files" or click the ⊕ button then Upload).

Using SQL With Data In CSV Files

The NBA shooting data is in the nba_players_shooting.csv file. In the left-hand toolbar, open the Files tool to see the file.

Files tool NBA dataset

The notebook in the workspace contains some Python code to import the dataset. However, in this case, we want to use SQL.

At the end of the notebook, click 'Add SQL.'


In the "Select source" dropdown, select "DataFrames and CSVs."

Select data source

To write a SQL query against a CSV file, only one change to standard SQL is required. In the FROM clause, instead of naming the table, write the path to the CSV file in single quotes.

Let's calculate the proportion of shots made for each shooter, ordered from best to worst. In the SQL cell, type the following code.

	SUM((SCORE = 'MADE')::DOUBLE) / COUNT(*) AS prop_score_made
	FROM 'nba_players_shooting.csv'
	GROUP BY shooter
	ORDER BY prop_score_made DESC

Shots made query output

Note that the dialect of SQL used when you are working with a CSV file is DuckDB. Read more about this in DuckDB makes SQL a first-class citizen on DataCamp Workspace.

Continue working in Python, R, or no-code charts

In the last example, we kept the default settings for the query return format. Look closely at the top of the SQL cell, and you'll see that it shows the return value as a DataFrame named df. That means you can continue to work on the results in Python (or in R, for R workspaces) or draw no-code plots.

Add chart

Click 'Add Chart' to add a no-code chart.

Add chart

Set the following options.

  • Type: Bar
  • X-axis: prop_score_made
  • Y-axis: SHOOTER

Chart customization

Using SQL With Data In Data Frames

The SQL query that was run against the CSV file created a data frame named df. You can also write SQL queries against these pandas or R data frames. The only difference from standard SQL is that in the FROM clause, you provide the data frame name instead of the table name.

Here, we'll reuse the previous results in df and split the shooter column into separate first and last names. Add a SQL cell, select "DataFrames and CSVs" as the source, and use the following code.

str_split(shooter, ' ')[1] AS first_name,
str_split(shooter, ' ')[2] AS last_name,

Using Sample Databases

Workspace also provides several sample databases that you can use to practice your SQL skills.

Add a SQL cell, and in the source dropdown, scroll down to the "Sample Integrations" section. Here, we'll use "Bicycle Sales."

Sample integrations

A default query will appear. For the tutorial, let's replace it with a simpler query.

MIN(list_price) AS cheapest_list_price
FROM products
GROUP BY product_name

When you use a connection to a database (rather than writing SQL against a CSV file or dataframe), you get the option to return the results as a query. In the dropdown, select "Query." You can also name the resulting value. Here, change "query" to "best_price."

Rename query

Running the query gives the following output.

Query output

You can now use this result to write further SQL against. Essentially, you've written a common table expression (covered in the SQL Commands for Data Scientists tutorial) without having to use the common table expression syntax.

Add another SQL cell, and write this further query.

	FROM best_price
	WHERE cheapest_list_price > 10000

Running this query returns the following results.

Query output 2

Connecting to Other Databases

Workspace also allows you to connect to many types of database, including PostgreSQL, MySQL, Redshift, bigQuery, Athena, SQL Server, MariaDB, and Oracle Database. Full details of how to connect are provided in the Workspace Docs.

In the left-hand toolbar, click on "Integrations," then click the ⊕ button to create a new integration.

Choose the type of database that you are connecting to.

Connecting to external database

You'll be asked to provide connection details (these vary by the type of database).

Database connection console

Using the AI Assistant

While writing SQL can be fun and useful, getting an AI to write it for you can help you learn and make you more productive. You can read about the AI Assistant in From Data to Insights: Get There Faster with the DataCamp Workspace AI Assistant.

Add a SQL cell. In the right-hand context menu, click "Generate."

Generate function

A textbox appears at the top of the SQL cell. In this, write a task for the AI. Choose your own task, or try the following.

Return the product names and list prices for all mountain bikes made by Trek that are in stock.

After typing your query, press Enter to get the AI to generate SQL code.

AI suggested query

If you think that the code is reasonable, click "Accept & run" (or click "Accept," then "Run" later on) to see the results of the query.

Query output

Get the AI to Explain Your Errors

You can also use the AI built-in to Workspace to explain why errors occur in your SQL. Suppose you got confused while writing the query for the expensive bikes and used a Fortran-style .GT. instead of `>`.

    FROM best_price
    WHERE cheapest_list_price .GT. 10000

This results in an error. In the bottom-left of the SQL cell, click "Fix Error."

Error in query

The AI provides the correct code, followed by an explanation of what you did wrong.

Suggested query


You've seen how to avoid setting up a database by writing SQL queries against CSV files and data frames. You used sample databases to practice writing code against. And you saw how to use AI to generate SQL with natural language and fix your errors.

These features make it much easier to get started writing SQL and to be more productive while learning and working.

Take it to the Next Level

Get started with SQL in Workspace by logging in or signing up for a DataCamp account. You can also master some of the SQL fundamentals in our skill track, which will give you all the skills you need to interact with and query your data.

Photo of Richie Cotton
Richie Cotton

Richie helps organizations get from a vague sense of "hey we ought to get better at using data" to having realistic plans to become successful data-driven organizations. He's been a data scientist since before it was called data science, and has written several books and created many DataCamp courses on the subject.

Expand Your Knowledge

Joining Data in SQL

BeginnerSkill Level
4 hr
Level up your SQL knowledge and learn to join tables together, apply relational set theory, and work with subqueries.
See DetailsRight Arrow
Start Course
See MoreRight Arrow

Enhanced Security at Your Fingertips: DataCamp Workspace Now ISO/IEC 27001:2017 Compliant

We are excited to announce that DataCamp Workspace is now ISO/IEC 27001:2017 compliant, the rigorous international standards for managing and protecting information assets
Filip Schouwenaars's photo

Filip Schouwenaars

3 min

Run Data Hackathons with DataCamp Workspace

With DataCamp Workspace, running data hackathons becomes easy and fun. Explore how Workspace solves common pitfalls and the steps to organize your own hackathon.
Filip Schouwenaars's photo

Filip Schouwenaars

9 min

How To Use Workspace AI-Powered Notebooks for Every Data Skill Level

Find out how DataCamp Workspace and its AI Assistant can boost your data science workflow - regardless of your skill level.
Alena Guzharina's photo

Alena Guzharina

6 min

Schedule Data Notebooks to Automate Business Metric Reporting

Leverage the scheduling capabilities of DataCamp Workspace to automatically report on business metrics with insightful visualizations
Filip Schouwenaars's photo

Filip Schouwenaars

5 min

10 Portfolio-Ready SQL Projects for All Levels

Select your first—or next—SQL project to practice your current SQL skills, develop new ones, and create an outstanding professional portfolio.
Elena Kosourova's photo

Elena Kosourova

11 min

QUALIFY: The SQL Filtering Statement You Never Knew You Needed

Learn about the SQL QUALIFY clause, an essential yet lesser-known filtering method in SQL. Understand its syntax, uses, and how it differs from other SQL filtering methods.
Kurtis Pykes 's photo

Kurtis Pykes

8 min

See MoreSee More