Introduction to Bioconductor in R

Learn to use essential Bioconductor packages for bioinformatics using datasets from viruses, fungi, humans, and plants!

4 hours | 14 videos | 54 exercises | 14,902 learners | Statement of Accomplishment

Course Description

Much biological research, from medicine to biotech, is moving toward sequence analysis. Targeted and whole-genome sequencing now generate large datasets that must be analyzed to answer biological questions. To help you get started, this course introduces the Bioconductor project. Bioconductor builds and maintains the infrastructure for sharing software tools (packages), workflows, and datasets for the analysis and comprehension of genomic data. It is a community-developed, open-source software resource that is freely accessible. By the end of this course, you will be able to use essential Bioconductor packages and will have a grasp of its infrastructure and some of its built-in datasets. You will work with BSgenome, Biostrings, IRanges, GenomicRanges, TxDb, ShortRead, and Rqc, using real datasets from different species.

In the following Tracks

Analyzing Genomic Data in R

  1. What is Bioconductor?

    Free

    In this chapter, you will get hands-on with Bioconductor, the specialized repository for bioinformatics software, developed and maintained by the R community. You will learn how to install and use Bioconductor packages. You'll be introduced to S4 objects and functions, because most Bioconductor packages are built on the S4 object system. Additionally, you will use a real genomic dataset of a fungus to explore the BSgenome package.

    Introduction to the Bioconductor Project (50 xp)
    Bioconductor version (100 xp)
    BiocManager to install packages (100 xp)
    The role of S4 in Bioconductor (50 xp)
    S4 class definition (50 xp)
    Interaction with classes (100 xp)
    Introducing biology of genomic datasets (50 xp)
    Discovering the yeast genome (100 xp)
    Partitioning the yeast genome (100 xp)
    Available genomes (50 xp)
  2. Biostrings and When to Use Them?

    Biostrings provides memory-efficient string containers, along with matching algorithms and other utilities, for fast manipulation of large biological sequences or sets of sequences. How efficient can you become by using the right containers for your sequences? You will learn about alphabets and sequence manipulation using the tiny genome of a virus.

  3. IRanges and GenomicRanges

    The IRanges and GenomicRanges packages provide containers for storing and manipulating genomic intervals and variables defined along a genome. Many other Bioconductor packages build on the infrastructure these packages provide. You will learn how to use these containers, and their associated metadata, to manipulate your sequences. The dataset you will be looking at covers a gene of interest in the human genome.

  4. Introducing ShortRead

    ShortRead is a package for the input, manipulation, and assessment of FASTA and FASTQ files. You can subset, trim, and filter the sequences of interest, and even generate a quality report. As a bonus, the last exercises introduce Rqc, a package for parallel quality assessment. For this chapter, you will use plant genome sequences!
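
Chapter 1 introduces S4 objects and functions. As a rough illustration of what that means, here is a minimal sketch in base R only (the `methods` package), with no Bioconductor packages involved. The class `ToyGenome`, the generic `totalLength`, and the sequence lengths are all hypothetical, made up for this example; they are not taken from the course or from any Bioconductor package.

```r
library(methods)  # part of base R; loaded explicitly so the script also runs under Rscript

# Hypothetical S4 class for illustration only (not a Bioconductor class).
# Slots are typed, and the validity function is checked when an object is created.
setClass(
  "ToyGenome",
  representation(organism = "character", seq_lengths = "numeric"),
  validity = function(object) {
    if (any(object@seq_lengths < 0)) "seq_lengths must be non-negative" else TRUE
  }
)

# A generic and a method: S4 picks the method based on the class of `x`.
setGeneric("totalLength", function(x) standardGeneric("totalLength"))
setMethod("totalLength", "ToyGenome", function(x) sum(x@seq_lengths))

# Toy values, not real chromosome lengths.
toy <- new("ToyGenome",
           organism = "toy organism",
           seq_lengths = c(chrA = 100, chrB = 250))

isVirtualClass("ToyGenome")  # FALSE: the class can be instantiated
slotNames("ToyGenome")       # names of the slots defined above
totalLength(toy)             # dispatches to the ToyGenome method
```

Bioconductor containers are defined in the same style, so inspection helpers such as `isVirtualClass()`, `slotNames()`, and `showMethods()` should be useful when exploring an unfamiliar package class.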


Datasets

Zika Genomic DNA dataset
A. thaliana Short Reads with Quality dataset
Human Gene & Transcript ID dataset
Yeast Genome dataset

Collaborators

David Campos
Shon Inouye
Richie Cotton

James Chapman

Curriculum Manager, DataCamp

James is a Curriculum Manager at DataCamp, where he collaborates with experts from industry and academia to create courses on AI, data science, and analytics. He has led nine DataCamp courses on diverse topics in Python, R, AI developer tooling, and Google Sheets. He has a Master's degree in Physics and Astronomy from Durham University, where he specialized in high-redshift quasar detection. In his spare time, he enjoys restoring retro toys and electronics.

Paula Martinez

Data Scientist and Bioinformatician

Paula Andrea Martinez is currently working at The Life Sciences infrastructure ELIXIR Europe. She empowers life scientists by training them in software skills, data analysis, visualization and data stewardship best practices. She also advocates for open and reproducible science as evidenced by her volunteer roles with The Carpentries. Paula gained her PhD in applied Bioinformatics from The University of Queensland, using computational methods to study genomic diversity. She is particularly interested in R, databases, community building, open science, and diversity in STEM.