Using the tidyverse of R for data handling and visualization

R is a powerful language for statistical computing used across many research disciplines, although it has a steep learning curve. The software is open source and freely available, and it has a thriving community.

ZPID invited Roland Krause and Aurélien Ginolhac from the University of Luxembourg to give a workshop on the tidyverse of R. The tidyverse is a collection of R packages designed to make code easier to read and understand.

It greatly simplifies:

  • data importing
  • cleaning
  • processing
  • visualization
  • reproducible workflows using pipelines (%>%)
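
As a rough illustration of what such a pipeline can look like, the sketch below chains several dplyr verbs with %>%. It is not taken from the course materials; it assumes the dplyr package is installed and uses the mtcars data set that ships with R. The name summary_by_cyl and the threshold of 100 hp are made up for this example.

```r
library(dplyr)

# mtcars ships with R; it is used here purely for illustration.
summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%                          # keep cars above an arbitrary horsepower threshold
  group_by(cyl) %>%                             # one group per cylinder count
  summarise(mean_mpg = mean(mpg), n = n()) %>%  # mean fuel economy and group size
  arrange(desc(mean_mpg))                       # most economical group first

print(summary_by_cyl)
```

Each step receives the result of the previous one, so the code reads top to bottom in the order the operations are applied.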

The three-day course provides a complete introduction to data handling in the tidyverse. It will not go deep into statistics; the focus is on getting data ready, exploratory analysis, visualization, and handling models. Each day will mix lectures and practicals, with the opportunity to post questions and to join timed Q&As.

  • Day 1 will review the basics of R, loading data via the vroom package and basic cleaning of text using regular expressions.
  • Day 2 will introduce tidying and organizing data via the tidyr and dplyr packages as well as ggplot2 for visualization.
  • Day 3 will look at functional programming tools using the purrr package, which greatly simplifies repeating operations. R Markdown documents enable reproducible and automated reporting. Many statistical packages return complicated and idiosyncratic data structures; the broom package helps to convert them to consistent ones.
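
To give a flavour of the Day 3 topics, here is a minimal sketch that combines tidyr, purrr, and broom: one linear model is fitted per group, and the model objects are converted into a single table of coefficients. It is an illustration under assumptions, not course material; it assumes the four packages are installed, uses the built-in mtcars data, and the model mpg ~ wt and the name models are chosen arbitrarily.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

models <- mtcars %>%
  nest(data = -cyl) %>%                          # one row per cylinder count, data in a list column
  mutate(
    fit   = map(data, ~ lm(mpg ~ wt, data = .x)), # repeat the same model fit for each group
    coefs = map(fit, tidy)                        # broom turns each lm object into a tibble
  ) %>%
  select(cyl, coefs) %>%
  unnest(coefs)                                  # one row per group and model term

print(models)
```

The result is an ordinary data frame with columns such as term and estimate, so it can be filtered, joined, or plotted like any other tidy data.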

Preparing data takes up to 80% of the time spent in analysis; speeding this up is the goal of this course.

The workshop takes place from Wednesday, September 8, to Friday, September 10, 9 am to 5 pm CET (Brussels/Berlin/Rome), and will be held in an online format. Participation is free of charge.

Contact person

Dr. Stefanie Mueller
Head of Study Planning, Data Collection, and Data Analysis Services

Requirements

Registration

The registration is closed.

Software

You will need R, RStudio, and the tidyverse to complete the exercises. Follow these instructions to install and test the software. Communication during the workshop will be via our Gitter channel. To ask questions, you need to join Gitter using a GitHub, Twitter, or GitLab account.

Prior knowledge

Basic experience with R or another programming language is beneficial. If you have no programming experience at all, or if you want to brush up on your R knowledge, we recommend that you either take part in our introductory crash course (see below) or complete a simple free online course.

Introductory crash course

Prior to the workshop, on September 3, from 10 am to 4:30 pm CET, there will be an additional free online crash course on base R covering the topics 1) introduction to R, 2) reading, saving, and viewing data, 3) selecting and changing objects in R, and 4) descriptive statistics.
The course will be taught by Lisa Spitzer, a PhD student at ZPID. Follow this link to view the course schedule and the crash course materials.