
Data Science for Spreadsheet Users

Webinar

Every day, millions of knowledge workers head into the office and open up a spreadsheet, a workbook, or a series of workbooks to accomplish routine tasks. These tasks take time. They can be repetitive and sometimes even downright boring. So in the 21st century, when most routine tasks can be automated, is there a better way? Chris Cardillo, Data Scientist at DataCamp, challenges anyone spending their days in and out of spreadsheets to pick up a coding language. He covers the tangible benefits of incorporating code into an everyday workflow, as well as examples of where code enhances or replaces work done in spreadsheets. Finally, he covers some R language equivalents for spreadsheet techniques and how to get started when you’re ready to take the first step.

You can find the slides here.

Summary

Data science programming skills offer significant benefits not only to data scientists but also to spreadsheet power users. The session explores how these skills can enhance daily operations in fields like digital advertising through increased accessibility, efficiency, and collaboration. By bridging the gap between traditional spreadsheet functions and programming languages such as R and Python, users can automate repetitive tasks, minimize human error, and simplify data manipulation. The discussion draws parallels between spreadsheet functions and R, demonstrating how coding can be an efficient tool for data analysis and reporting. The session also covers how to move from spreadsheets to scripting workflows, emphasizing the potential for better data handling and faster reporting.

Key Takeaways

  • Data science skills are beneficial beyond traditional data science roles, enhancing efficiency in various fields.
  • Programming languages like R and Python can automate repetitive tasks and reduce manual errors.
  • R and Python provide powerful libraries for data extraction and manipulation, surpassing Excel’s limitations.
  • Understanding the parallels between spreadsheet functions and coding can ease the transition to data science.
  • Community resources and forums play an important role in learning and problem-solving in data science.

Deep Dives

Accessibility, Efficiency, and Collaboration

Data science programming enhances accessibility, efficiency, and collaboration in data handling. With tools like R and Python, users can automate data extraction from APIs and SQL databases, overcoming Excel’s row limitations. This transition allows users to handle large datasets efficiently, reducing sluggish performance and manual errors. Chris Cardillo explains, “If you've ever tried to pivot a 750,000 row raw dataset and analyze it, you'll notice that your performance gets a little sluggish.” By coding repetitive tasks, users can save time and focus on analysis rather than data aggregation, thereby minimizing human error and improving data integrity.
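
As a rough illustration of letting a SQL database do the aggregation instead of pivoting raw rows in a spreadsheet, here is a minimal Python sketch using the standard-library sqlite3 module. The table name ad_events, its columns, and the sample rows are hypothetical and not taken from the webinar, and an in-memory database stands in for a real database server.

```python
import sqlite3

# Hypothetical raw data: one row per (campaign, day) with a click count.
# An in-memory SQLite database stands in for a real SQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ad_events (campaign TEXT, day TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO ad_events VALUES (?, ?, ?)",
    [
        ("spring_sale", "2024-03-01", 120),
        ("spring_sale", "2024-03-02", 95),
        ("newsletter", "2024-03-01", 40),
        ("newsletter", "2024-03-02", 55),
    ],
)


def clicks_by_campaign(connection):
    """Return total clicks per campaign, similar to a pivot table summed by campaign."""
    query = """
        SELECT campaign, SUM(clicks) AS total_clicks
        FROM ad_events
        GROUP BY campaign
        ORDER BY campaign
    """
    return connection.execute(query).fetchall()


totals = clicks_by_campaign(conn)
for campaign, total in totals:
    print(f"{campaign}: {total}")
```

Because the database returns only the summarized rows, the script never has to hold the full raw dataset in a worksheet, which is where the sluggishness described above tends to appear.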

Application in Scripting Workflows

Moving from spreadsheets to scripting workflows starts with understanding a script as a set of instructions that produces a desired outcome. Chris illustrates this transition through real-world examples from digital advertising, where scripting reduced manual workload significantly. He emphasizes the power of libraries such as httr and DBI in R, and SQLAlchemy in Python, for automating data retrieval and processing. This automation not only saves time but also ensures consistent and accurate data handling, making the data more shareable and understandable across teams. As Chris states, “Your work becomes more shareable when it's encased in code.”
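
To make the idea of a script as a set of instructions concrete, the following is a small Python sketch that loads data, summarizes it, and renders a report in three explicit steps. The CSV contents, column names, and function names are hypothetical and chosen only for illustration; in practice the input might come from a file or an API response rather than an inline string.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; a real workflow might read a file or call an API instead.
RAW_CSV = """campaign,spend,clicks
spring_sale,100.00,50
spring_sale,60.00,30
newsletter,20.00,10
"""


def load_rows(text):
    """Step 1: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize(rows):
    """Step 2: total spend and clicks per campaign."""
    totals = defaultdict(lambda: {"spend": 0.0, "clicks": 0})
    for row in rows:
        totals[row["campaign"]]["spend"] += float(row["spend"])
        totals[row["campaign"]]["clicks"] += int(row["clicks"])
    return dict(totals)


def render_report(totals):
    """Step 3: format one line per campaign, including cost per click."""
    lines = []
    for campaign in sorted(totals):
        t = totals[campaign]
        # Guard against campaigns with zero clicks.
        cpc = t["spend"] / t["clicks"] if t["clicks"] else 0.0
        lines.append(
            f"{campaign}: spend={t['spend']:.2f} clicks={t['clicks']} cpc={cpc:.2f}"
        )
    return "\n".join(lines)


report = render_report(summarize(load_rows(RAW_CSV)))
print(report)
```

Each step is a named function, so a colleague can read the script top to bottom and see exactly how the report was produced, and rerunning it on next week's export repeats the same steps without manual copying.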

Spreadsheet Equivalents in R

Knowing the R equivalents of familiar spreadsheet functions helps users transitioning to data science. Chris demonstrates how VLOOKUP in a spreadsheet compares to a left join in R, using dplyr's left_join(), for merging datasets. Similarly, creating new columns and summarizing data in R can be done with mutate() and summarize(), respectively. This understanding bridges the gap between traditional data handling and data science, allowing users to perform complex data manipulations with just a few lines of code.
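
The same three operations can be sketched in plain Python to show what each one does under the hood. This is an illustrative analogue, not the R code from the webinar: the tables, column names, and the left_join helper below are hypothetical, and the sketch assumes product_id is unique in the lookup table.

```python
from collections import defaultdict

# Hypothetical tables, each a list of dicts (one dict per spreadsheet row).
orders = [
    {"order_id": 1, "product_id": "A", "quantity": 2},
    {"order_id": 2, "product_id": "B", "quantity": 1},
    {"order_id": 3, "product_id": "A", "quantity": 5},
    {"order_id": 4, "product_id": "Z", "quantity": 1},  # no match in products
]
products = [
    {"product_id": "A", "category": "toys", "unit_price": 3.0},
    {"product_id": "B", "category": "books", "unit_price": 10.0},
]


def left_join(left, right, key):
    """Like VLOOKUP: keep every left row, attach matching right columns or None."""
    lookup = {row[key]: row for row in right}
    right_cols = [col for col in right[0] if col != key]
    joined_rows = []
    for row in left:
        match = lookup.get(row[key])
        extra = {col: (match[col] if match else None) for col in right_cols}
        joined_rows.append({**row, **extra})
    return joined_rows


joined = left_join(orders, products, "product_id")

# New column (the "mutate" step): revenue = quantity * unit_price.
for row in joined:
    price = row["unit_price"]
    row["revenue"] = row["quantity"] * price if price is not None else None

# Grouped total (the "summarize" step): revenue per category, skipping unmatched rows.
revenue_by_category = defaultdict(float)
for row in joined:
    if row["revenue"] is not None:
        revenue_by_category[row["category"]] += row["revenue"]

print(dict(revenue_by_category))
```

Order 4 has no matching product, so it keeps its row with empty lookup columns, much as VLOOKUP would return an error for a missing key while the row itself remains in the sheet.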

Learning Resources and Community Support

Chris highlights several resources and communities that support learning data science. Platforms like DataCamp offer interactive learning experiences, enabling users to practice coding in real time. Books such as "R for Data Science" and "Python Data Science Handbook" provide in-depth knowledge for beginners. Additionally, the active data science community, including meetups, forums, and conferences, offers ample opportunities for networking and problem-solving. As Chris suggests, “The combination of Google and Stack Overflow is a pretty great combination” for troubleshooting and learning.

Chris Cardillo

Data Scientist at DataCamp
