Skip to main content
HomeTutorialsPython

What is GitHub Copilot: A Complete Beginner's Guide to Pair Programming

Explore how GitHub Copilot works with Visual Studio Code. Learn about its features, pricing, and practical applications for students and developers.
May 2024  · 8 min read

Whether you're tackling a complex project or a mundane, time-intensive task, GitHub Copilot can streamline your coding efforts. In this tutorial, you'll discover how GitHub Copilot works, explore its key features, and learn how it can significantly enhance your productivity and coding efficiency. Let's dive in. 

What is GitHub Copilot? 

GitHub Copilot is a groundbreaking AI-based programming assistant launched by GitHub in 2021. It uses OpenAI's Codex model, which is a descendant of the GPT models that focused its training on a diverse range of programming languages and coding contexts. For this reason, GitHub Copilot is thought to be more capable than ChatGPT for code-writing tasks. 

The most striking feature of GitHub Copilot is its seamless integration into popular development environments, such as Visual Studio Code. By embedding directly into the code editor, GitHub Copilot acts as a pair programmer that offers real-time suggestions, code completions, and recommendations, all of which we will explore in-depth.

Exploring GitHub Copilot Features

GitHub Copilot has many great time-saving features. Here is a list of the most important ones that can speed up the lifecycle of your data science project. 

Chat interface in your editor

GitHub Copilot integrates a ChatGPT-like chat interface directly into your IDE, eliminating the need to switch between your editor and external websites to correct code.

Let's look at the following example: Here we ask GitHub Copilot to generate the code required to translate a text from English to Italian using the OpenAI model.

GitHub Copilot sidebar in VS Code IDE

GitHub Copilot for the command line interface

For those who wok in the terminal, GitHub Copilot in the CLI provides a chat-like interface within the command line. Here, we have asked Github Copilot to explain the command sudo apt-get

GitHub Copilot inside the command line interface

GitHub Copilot for docs

GitHub Copilot can provide AI-generated answers by sourcing information directly from documentation, which can save a lot of time and effort. 

GitHub Copilot for Docs

AI-powered pull requests

It's no surprise that GitHub Copilot integrates with GitHub. GitHub Copilot provides a feature to describe the changes in a GitHub repository and review them for a pull request. 

GIF based on video taken from GitHub Next

How to Get Started Using GitHub Copilot

Now that we have explored GitHub Copilot's impressive features, let's learn how to set up and use it in Visual Studio Code. To do this, we first need to take care of two administrative tasks: We need to install Visual Studio Code and sign up for and install GitHub Copilot.

Installing Visual Studio Code

We install VS Code by visiting the Visual Studio Code website and following the instructions. The website includes how-to videos if you have trouble. 

Downloading Visual Studio CodeDownloading Visual Studio Code

Signing Up for GitHub and Installing GitHub Copilot

To install GitHub Copilot, we first need to create a GitHub account. If you're looking to test GitHub Copilot without a long-term commitment, consider opting for the free 30-day trial.

Signing Up for GitHubSigning Up for GitHub

Setting up GitHub Copilot with Visual Studio Code

We then enter Visual Studio Code and install two extensions from the marketplace: GitHub Copilot and GitHub Copilot Chat. You just need to press the “Install” button and sign in to GitHub.

Setting up Gitub CopilotSetting up GitHub Copilot

Using GitHub Copilot Inside Visual Studio Code

To test GitHub Copilot, we will use the Seoul Bike Sharing dataset, one of many curated datasets through DataLab. Our objective is to predict the number of public bikes rented per hour in Seoul’s bike-sharing system based on weather information, such as temperature, humidity, wind speed, and other variables.

Using GitHub Copilot to import data into VS Code

Let's start by importing a CSV file and viewing the first five rows. GitHub Copilot gets to work right away by autofilling the suggested CSV. We press "Tab" to accept its suggestions.

Importing data into Github CopilotImporting data into Github Copilot

Using GitHub Copilot to display a plot

As a next step, we choose to create a visualization. A correlation matrix with a heatmap is as good a choice as any to illustrate GitHub Copilot's intelligence. We see that GitHub Copilot not only writes the code for our correlation matrix but also finishes our sentence when we make the request.

Writing code for a plot in Github CopilotWriting code for a plot in Github Copilot


Oops - we have obtained an error because we didn’t remove the categorical variables from the correlation matrix, which is a common mistake. We can fix this error by adding a comment and a new piece of code. GitHub Copilot finds the correct columns to remove, correcting the error.

Displaying a plot in Github CopilotDisplaying a plot in Github Copilot

Using GitHub Copilot to prepare data for training

After exploring the data, it’s time to preprocess it before training our model. For this exercise, we choose an ordinary least squares linear regression. To do this, we need to encode the categorical variables using one-hot encoding.

As we type our request, GitHub Copilot makes predictions and code suggestions. It even started to work with us as we had second thoughts about including one of our variables.

Preparing data in Github CopilotPreparing data in Github Copilot

GitHub Copilot helps us with all of the necessary steps in our workflow, including choosing our independent variables, splitting our data into training and test sets, and cleaning our data to be ready for our model.

Creating a train / test split in Github CopilotCreating a train / test split in Github Copilot

Using GitHub Copilot to evaluate our model

The last phase of our mini-project is to train and evaluate our linear regression model. GitHub Copilot helps us find model statistics on our training data and then evaluates the model's performance on the testing set.

Training our model in Github CopilotTraining our model in Github Copilot

Evaluating our model in Github CopilotEvaluating our model in Github Copilot

The mean squared error is higher than we expected, which makes us consider another model. We switch to a random forest model, to test the output, and we see the error is much lower than before. If you want to explore these models in much greater detail, check out our Machine Learning Fundaments in Python course.

Viewing model statistics in Github CopilotViewing model statistics in Github Copilot

Exploring Github Copilot Plans and Pricing

There are three different GitHub Copilot’s plans available depending on your needs.

  • Copilot Individual is the least expensive plan. It allows you to use GitHub Copilot in an IDE or on the command line. It’s free for students and teachers. All the features covered in our tutorial are included in this plan.
  • Copilot Business is a subscription appropriate for business purposes. It allows access to GitHub Copilot’s services as a member of the organization.
  • Copilot Enterprise is the most complete plan for larger enterprise accounts that need additional customization. 

Github Copilot pricing structureGithub Copilot pricing structure

GitHub Copilot Alternatives

Let's now explore some compelling alternatives to GitHub Copilot. The following three companies are all at the forefront of the generative AI revolution and provide generative AI solutions to assist with code creation. 

The DataLab IDE

  • DataLab: DataCamp's very own DataLab is an AI-enabled notebook. Simply attach the data source, ask the AI what you need, and get insights. The required notebooks are already installed. DataLab is perfect for novices looking to learn as well as business professionals needing to leverage AI to create compelling presentations for decision-makers. 
  • TabNine: TabNine is an alternative that also provides AI code completions and AI chat agents and it works with many popular IDEs.
  • SonarQube: SonarQube is geared towards software development. With SonarQube, developers would upload data and use SonarQube to receive AI-assisted and quality-assured code.

Conclusion

We've just finished a complete data science project in a matter of minutes using GitHub Copilot. It proves to be a useful asset for speeding up all aspects of the data science workflow, everything from displaying plots to model building with a train/test workflow.

If you found this tutorial helpful and want to get started with GitHub Copilot, we recommend DataCamp's video Pair Programming with GitHub Copilot. Another course, GitHub concepts, will be a great companion course, especially if you feel unsure about GitHub.

Thanks for reading!


Photo of Eugenia Anello
Author
Eugenia Anello
Topics

Learn More with DataCamp

Course

Introduction to Python

4 hr
5.6M
Master the basics of data analysis with Python in just four hours. This online course will introduce the Python interface and explore popular packages.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

blog

Understanding GitHub: What is GitHub and How to Use It

Discover the uses of GitHub, a tool for version control and collaboration in data science. Learn to manage repositories, branches, and collaborate effectively.