
Data Demystified: What Exactly is Data?

Welcome to Data Demystified, a blog series breaking down key concepts everyone should know about in data. In the first entry of the series, we’ll answer the most basic question of them all: what exactly is data?
Sep 2022  · 4 min read

Welcome to the first part of our month-long Data Demystified series. As part of Data Literacy Month, this series will clarify key concepts from the world of data, answer the questions you may be too afraid to ask, and have fun along the way.

In this entry, we’ll define what data is and how you can get started with it.

What is Data?

Data are facts or pieces of information. They are often measurements ("Maria is 165 cm tall"), observations ("the die rolled a six"), or opinions ("I rate this video game 4 out of 5 stars"). Usually, data are collected and analyzed to find insights, draw a conclusion, or make a decision.

Although we often think of data as numbers, there are other data types too. 

  • Data that can be true or false are called logical data ("Mohammed passed the exam").
  • Data that can take several named values are called categorical data ("Anna's hair is blonde; Wei's hair is black").
  • Dates and times are a type of data. 
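As a rough illustration, here is how these data types might look in Python. This is only a sketch: the variable names, the exam date, and the set of allowed hair colors are assumptions made up for the example.

```python
from datetime import date

# One hypothetical value for each data type described above.
height_cm = 165                # numeric data: a measurement
passed_exam = True             # logical data: either True or False
hair_color = "blonde"          # categorical data: one of several named values
exam_date = date(2022, 9, 1)   # date data (this date is invented for illustration)

# The allowed categories for hair color (an assumed, incomplete set).
HAIR_COLORS = {"black", "blonde", "red"}

# Print each value together with the Python type that stores it.
for label, value in [
    ("height_cm", height_cm),
    ("passed_exam", passed_exam),
    ("hair_color", hair_color),
    ("exam_date", exam_date),
]:
    print(f"{label}: {value!r} ({type(value).__name__})")
```

Python has no built-in categorical type, so the sketch stores the category as text and keeps the allowed values in a separate set.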

Text can also be considered data, for example, reviews of products. Since many data analyses require numbers rather than words, text data usually need to be processed to turn them into numbers as part of the analysis.
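One simple way to turn text into numbers is to count how often each word appears. The sketch below does this for two made-up reviews; the review text and the helper name `word_counts` are hypothetical, and real analyses often use more sophisticated processing.

```python
from collections import Counter

# Two invented product reviews (assumptions for illustration only).
reviews = [
    "great game great graphics",
    "boring game",
]

def word_counts(text):
    """Count how many times each lowercase word appears in the text."""
    return Counter(text.lower().split())

counts = [word_counts(review) for review in reviews]

# The vocabulary is every distinct word across all reviews, in sorted order.
vocabulary = sorted({word for count in counts for word in count})

# Each review becomes a list of numbers: one count per vocabulary word.
vectors = [[count[word] for word in vocabulary] for count in counts]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each review is now a row of numbers with the same length, which is the kind of input many analyses expect.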

Similarly, images can be considered data, though they typically need to be converted into numbers before they can be analyzed.
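To make that concrete, a grayscale image can be thought of as a grid of brightness values. The tiny 3x3 "image" below is invented for illustration, and the 0 to 255 scale is a common convention rather than anything specific to this article.

```python
# A hypothetical 3x3 grayscale image: 0 is black, 255 is white.
image = [
    [0, 128, 255],
    [0, 128, 255],
    [0, 128, 255],
]

# Flatten the grid into a single list of pixel values.
pixels = [value for row in image for value in row]

# Once the image is numbers, ordinary calculations apply to it.
mean_brightness = sum(pixels) / len(pixels)
print(f"{len(pixels)} pixels, mean brightness {mean_brightness:.1f}")
```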

What is a Dataset?

If you’ve ever encountered an Excel spreadsheet, you’ve encountered a dataset. In a nutshell, a dataset is a collection of data: more specifically, several pieces of the same type of data. The earlier example had one height measurement. Several measurements of people's heights would be considered a dataset.

Consider this table, with the names of four people in the first column, their heights in the second column, and their hair color in the third column. This is a dataset.

| Name     | Height (cm) | Hair color |
|----------|-------------|------------|
| Maria    | 165         | red        |
| Mohammed | 182         | black      |
| Anna     | 173         | blonde     |
| Wei      | 160         | black      |

This rectangular format is the most common dataset structure. There are two common names for a dataset with this shape. Database users call these datasets "tables." By contrast, data scientists call these datasets "data frames." In fact, that's where the name of DataCamp's DataFramed podcast comes from.
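The table above could be written in Python as a list of records, one per row. This is a minimal sketch using only the standard library; the key names such as `height_cm` are assumptions, and data scientists would typically reach for a dedicated data frame library instead.

```python
from collections import Counter
from statistics import mean

# The dataset from the table: one dictionary per row.
people = [
    {"name": "Maria", "height_cm": 165, "hair_color": "red"},
    {"name": "Mohammed", "height_cm": 182, "hair_color": "black"},
    {"name": "Anna", "height_cm": 173, "hair_color": "blonde"},
    {"name": "Wei", "height_cm": 160, "hair_color": "black"},
]

# With several heights collected together, we can summarize them.
average_height = mean(row["height_cm"] for row in people)

# Categorical data can be summarized by counting each category.
hair_counts = Counter(row["hair_color"] for row in people)

print(average_height)
print(hair_counts)
```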

Whatever you want to call them, these rectangular datasets have some features worth knowing about.

  • Each row represents one item in the dataset; in this table, each row holds the details of one person. Rows are sometimes called "observations" or "records."
  • Each column represents a property of the observations. Columns are sometimes called "variables" or "features."
  • Every value in a column should have the same type of data. For example, all the heights are numbers.
  • Different columns can have different types of data. For example, the name and hair color columns contain text data, but the heights are numeric data.
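These features can be checked in a few lines of Python. The sketch below stores the table as rows, regroups the values into columns, and records which data types appear in each column. The names `column_names`, `columns`, and `column_types` are hypothetical choices for this example.

```python
# Each tuple is one row (an observation) from the table.
rows = [
    ("Maria", 165, "red"),
    ("Mohammed", 182, "black"),
    ("Anna", 173, "blonde"),
    ("Wei", 160, "black"),
]
column_names = ("name", "height_cm", "hair_color")

# zip(*rows) regroups the values by column (variable) instead of by row.
columns = dict(zip(column_names, zip(*rows)))

# Collect the set of Python types found in each column.
column_types = {
    name: {type(value).__name__ for value in values}
    for name, values in columns.items()
}

for name, types in column_types.items():
    print(name, sorted(types))
```

A column that follows the rule above contains exactly one type, while different columns are free to have different types.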

Data is Everywhere

The beauty of data is that it is everywhere! Businesses will have datasets on who their users or customers are, what they are buying, how much they are spending, how much money the company has, who the employees are, what assets the company has, and many other things.

In daily life, you may use a weather forecasting app, get directions from a mapping app, or send messages to friends and family through a chat app. All these apps are powered by data and will make use of several datasets.

Any product you use has been manufactured somewhere that collects data on how that product was made. The shop you bought it from will have collected data on who is buying that product and when.

In fact, for almost any question you can think of, there is likely to be a dataset that can help you answer it.

Get Started with Data Today

We hope you enjoyed this short introduction to data. In the next entry of Data Demystified, we’ll be looking at the different types of data, from quantitative and qualitative data to image and text data, and more. If you want to start your data learning journey today, check out the following resources.

Data Literacy Courses

  • Understanding Data Science (course, 2 hours): An introduction to data science with no coding involved.