1 Required skills ✎ Polishing

This book assumes some existing knowledge of data wrangling and visualization in R/tidyverse. Specifically, it assumes familiarity with RStudio, reproducible reports written as {quarto} (.qmd) documents, the ‘pipe’ (|> and %>%), the packages {dplyr}, {tidyr}, {forcats}, and {ggplot2}, and the concept of Tidy Data (Wickham, 2014).

If you need help with these skills, please see Ian’s other book “Reproducible Data Processing and Visualization in R and tidyverse” at wrangling.tidyver.se.

If you’re enrolled in our class but haven’t already taken Ian’s “Reproducible data processing and visualization” class based on that book, or a comparable class, we encourage you to work through the book quickly. In previous years, students with little R experience but some familiarity with other programming languages, such as Python or MATLAB, have taken this simulation course. Students also sometimes sign up for this seminar with low confidence in their R abilities. It is entirely possible to succeed in this course without strong existing R skills, but it unavoidably means more self-guided learning and practice for you.

1.1 Check your skills

Later content in this book relies on you having an understanding of ‘Tidy Data’; the workflow we define and use is built around this concept. Specifically, because most data analysis functions don’t return data in a ‘Tidy’ format, we need to be able to extract their results in Tidy format. Importantly, when learners struggle or make errors when trying to build simulations, it is very often because their workflow is not Tidy.
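
To illustrate the point about analysis functions, here is a minimal sketch that extracts the results of a t-test into a one-row tibble. The data, the object names (`control`, `intervention`, `fit`, `results`), and the choice of `t.test()` are assumptions made for illustration, not the workflow defined later in this book.

```r
library(tibble)

set.seed(42)  # make the simulated data reproducible

# Hypothetical example data: 50 scores per group
control      <- rnorm(n = 50, mean = 0.0, sd = 1)
intervention <- rnorm(n = 50, mean = 0.5, sd = 1)

# t.test() returns a list of class "htest", not a data frame
fit <- t.test(intervention, control)

# Pull the values of interest into a one-row tibble:
# each column is a variable, and the row is one analysis result
results <- tibble(
  t        = unname(fit$statistic),
  df       = unname(fit$parameter),
  p        = fit$p.value,
  ci_lower = fit$conf.int[1],
  ci_upper = fit$conf.int[2]
)

results
```

Because `results` is an ordinary data frame, it can be passed directly to {dplyr} and {ggplot2} functions. The {broom} package's `tidy()` function automates this kind of extraction for many model objects and may be a convenient alternative.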

Test your knowledge: What does Tidy Data refer to?

Tidy Data is a set of technical ideas about how data should be structured, defined by Hadley Wickham, the main developer of {tidyverse} (Wickham, 2014).

Test your knowledge: What are the three rules that make a dataset Tidy according to Wickham?

1. Each variable is a column; each column is a variable.
2. Each observation is a row; each row is an observation.
3. Each value is a cell; each cell is a single value.
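
As a rough illustration of these rules, the sketch below reshapes a small data set in which one variable (the time point) is stored in the column names. The data and the names `data_wide` and `data_tidy` are hypothetical and chosen only for this example.

```r
library(tibble)
library(tidyr)

# Hypothetical wide data: the time point is stored in the column names,
# so the 'timepoint' variable has no column of its own
data_wide <- tibble(
  id         = c(1, 2, 3),
  score_pre  = c(10, 12, 9),
  score_post = c(14, 15, 11)
)

# Reshape so that each variable (id, timepoint, score) is a column
# and each row is one observation of one participant at one time point
data_tidy <- data_wide |>
  pivot_longer(
    cols         = c(score_pre, score_post),
    names_to     = "timepoint",
    names_prefix = "score_",
    values_to    = "score"
  )

data_tidy
```

After reshaping, `data_tidy` has six rows (three participants at two time points) and three columns, so every cell holds a single value.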

Test your skills

Ready to test your data wrangling skills? Download and complete the exercises for this chapter.
