# dependencies
library(dplyr)
library(tidyr)

25 Exercises for ‘Creating simulation experiments’ chapter ✎ Very rough draft
25.1 TO-DO:
25.1.1 Make explicit not to write functions
25.1.2 How to make an R chunk; add the chunks into the exercises
25.1.3 Make exercise 3 wording clearer
25.1.4 Make variables named
25.2 Exercises
25.2.1 Exercise 1: Basic parameter grid
Use expand_grid() to create a parameter grid for a simulation where:
- n_per_condition is 20, 50, or 100
- mean_intervention is 0.2 or 0.8
- sd is 1
- correlation_between_conditions is 0.3 or 0.5
How many rows does the resulting grid have? Verify using nrow() and distinct().
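One possible approach is sketched below. The object name parameter_grid is an assumption made for illustration; the parameter names and values are taken from the exercise text.

```r
library(dplyr)
library(tidyr)

# Fully crossed grid: every combination of the listed values.
# `parameter_grid` is a hypothetical name chosen for this sketch.
parameter_grid <- expand_grid(
  n_per_condition = c(20, 50, 100),
  mean_intervention = c(0.2, 0.8),
  sd = 1,
  correlation_between_conditions = c(0.3, 0.5)
)

# The row count should equal the product of the number of levels per parameter
nrow(parameter_grid)

# distinct() drops duplicate rows, so a matching count means no combination repeats
nrow(distinct(parameter_grid))
```

Both counts should equal 3 × 2 × 1 × 2 = 12. If nrow() and nrow(distinct()) ever disagree, a value was probably listed twice in one of the input vectors.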
25.2.2 Exercise 2: Filtering implausible combinations
Starting from the grid below, use filter() to remove any rows where both n_per_condition is less than 50 and mean_intervention is less than 0.3. How many rows remain?
expand_grid(
  n_per_condition = c(20, 50, 100, 200),
  mean_intervention = c(0.1, 0.3, 0.5),
  sd = c(0.5, 1)
)

| n_per_condition | mean_intervention | sd |
|---|---|---|
| 20 | 0.1 | 0.5 |
| 20 | 0.1 | 1.0 |
| 20 | 0.3 | 0.5 |
| 20 | 0.3 | 1.0 |
| 20 | 0.5 | 0.5 |
| 20 | 0.5 | 1.0 |
| 50 | 0.1 | 0.5 |
| 50 | 0.1 | 1.0 |
| 50 | 0.3 | 0.5 |
| 50 | 0.3 | 1.0 |
| 50 | 0.5 | 0.5 |
| 50 | 0.5 | 1.0 |
| 100 | 0.1 | 0.5 |
| 100 | 0.1 | 1.0 |
| 100 | 0.3 | 0.5 |
| 100 | 0.3 | 1.0 |
| 100 | 0.5 | 0.5 |
| 100 | 0.5 | 1.0 |
| 200 | 0.1 | 0.5 |
| 200 | 0.1 | 1.0 |
| 200 | 0.3 | 0.5 |
| 200 | 0.3 | 1.0 |
| 200 | 0.5 | 0.5 |
| 200 | 0.5 | 1.0 |
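A minimal sketch of one way to do the filtering follows. It assumes that "and" in the exercise means a row is removed only when both conditions hold at once; the names grid and grid_filtered are hypothetical. Because filter() keeps the rows that match its condition, the removal rule has to be negated.

```r
library(dplyr)
library(tidyr)

# Grid from the exercise; `grid` is a hypothetical name for this sketch
grid <- expand_grid(
  n_per_condition = c(20, 50, 100, 200),
  mean_intervention = c(0.1, 0.3, 0.5),
  sd = c(0.5, 1)
)

# filter() KEEPS matching rows, so negate the combined removal condition.
# Assumption: a row is dropped only when BOTH conditions are true.
grid_filtered <- grid %>%
  filter(!(n_per_condition < 50 & mean_intervention < 0.3))

nrow(grid)           # rows before filtering
nrow(grid_filtered)  # rows after filtering
```

Comparing the two row counts shows how many combinations the rule removed. A common slip is to write the removal condition directly inside filter(), which keeps only the implausible rows instead of dropping them.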
25.2.3 Exercise 3: Dependent parameters with mutate()
You are simulating a reading comprehension study. Your parameters are:
- n_passages: the number of passages participants read (5, 10, or 20)
- passage_difficulty: “easy” or “hard”
The total time allowed (time_limit_minutes) depends on the other two parameters: participants get 2 minutes per easy passage and 4 minutes per hard passage. Use expand_grid() and mutate() to create the parameter grid with time_limit_minutes derived from the other columns.
example_data <- expand_grid(
  n_passages = c(5, 10, 20),
  passage_difficulty = c("easy", "hard")
) %>%
  mutate(
    time_limit_minutes = case_when(
      passage_difficulty == "easy" ~ n_passages * 2,
      passage_difficulty == "hard" ~ n_passages * 4
    )
  )

25.2.4 Exercise 4: Choosing the right design
For each scenario below, determine whether you would use (a) a fully-crossed design, (b) a non-fully-crossed design with filter(), or (c) a design with dependencies using mutate().
1. You vary sample size (50, 100, 200) and effect size (0.2, 0.5, 0.8), and all combinations are of interest and kept.
2. You vary the number of predictors (2, 5, 10) and the number of observations (20, 50, 100, 500), but you want to exclude cases where the number of observations is smaller than 10 times the number of predictors.
3. You vary the number of items on a test (10, 20, 40) and want the total test time to always equal 2 minutes per item.
4. You are simulating a clinical trial and vary the dropout rate (5%, 15%, 30%) and treatment effect size (0.3, 0.5, 0.7). All combinations are realistic because patients drop out for many reasons unrelated to efficacy.
5. You are simulating ecological data on bird species counts across habitats. You vary habitat type (forest, wetland, urban) and survey area size (1 km², 5 km², 25 km²), but want to exclude large survey areas for urban habitats because contiguous urban green spaces larger than 5 km² are unrealistic in your study region.
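To make the difference between a filter and a dependency concrete, here is a rough sketch based on the predictors/observations scenario and the test-items scenario above. All object and column names (grid_regression, grid_test, n_predictors, n_observations, n_items, total_test_minutes) are assumptions made for illustration, not names from the chapter.

```r
library(dplyr)
library(tidyr)

# Filter: start fully crossed, then drop combinations that violate a rule
# that relates two columns (at least 10 observations per predictor).
grid_regression <- expand_grid(
  n_predictors = c(2, 5, 10),
  n_observations = c(20, 50, 100, 500)
) %>%
  filter(n_observations >= 10 * n_predictors)

# Dependency: the derived column is computed from another column,
# so it never appears as its own argument to expand_grid().
grid_test <- expand_grid(n_items = c(10, 20, 40)) %>%
  mutate(total_test_minutes = n_items * 2)

nrow(grid_regression)
grid_test
```

In the filtered grid, rows are removed; in the dependent grid, no rows are removed and a column is added. That distinction is usually the quickest way to decide between filter() and mutate() for a given scenario.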
25.2.5 Exercise 5: Build your own simulation design
Think of a research question from an area of research you find interesting. Define at least three parameters you would want to vary, create a parameter grid using expand_grid(), and include at least one dependency or filter. Use nrow() and distinct() to sanity-check your grid. Write a brief comment (2–3 sentences) explaining why you chose each parameter value and why you included the dependency or filter.