18  The impact of one form of careless responding on power and the false positive rate ✎ Very rough draft

18.1 Assignment

Background/Rationale of the exercise:

Many surveys that use a Likert scale or a slider have a default response. For example, when the page loads, every item may already be set to “0” on a -3 to +3 scale. Careless or lazy responding is common: some participants simply leave the default answers in place and click “next”. Many researchers don’t use attention checks in their surveys, or don’t use good ones, so these careless or lazy responses are not excluded. This simulation seeks to quantify the impact of this type of responding on the results. Please note that there are many other forms of careless responding - this is just one example and doesn’t provide a full answer to this question.

Exercise:

Write R code from scratch, but using our established workflow, that does the following:

  • Data generation function
    • Simulate two independent groups, control and intervention, drawn from a normal distribution. The mean and SD of both conditions should be variables.
  • Contaminate data function
    • You can use the contaminate_data() function provided in section 18.4. It replaces each row’s ‘score’ with a default value (in this case zero) with a given probability, so on average that proportion of the whole dataset is replaced. You should use the usual mutate() and pmap() workflow to create a new column, corrupted_data, from an existing column named generated_data.
  • Analyze data function
    • Fit a Student’s t-test and extract the p value in a tidy tibble.
  • An expand grid call using:
    • n per condition = 100
    • mean = 0 for the control group
    • mean = 0 or 0.50 for the intervention group (two scenarios, population effect exists or does not)
    • SD = 1
    • proportion of straight line responders = 0 or 0.1
    • 1000 iterations
    • using set.seed(42)
  • Summarize across iterations
    • Summarise the proportion of significant p values in all simulated conditions in a table or plot
    • Provide a description and interpretation of the results: How does this form of straight line responding affect the false positive rate? How does it affect power? (briefly, in two or three sentences)

18.2 Dependencies

library(tidyverse)
library(scales)
library(sn)
library(janitor)
library(effsize)
library(faux)

18.3 Generate data function

18.4 Contaminate data function

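contaminate_data <- function(data, proportion_straightline_responder, value = 0) {
  data %>% 
    # flag each row independently with probability `proportion_straightline_responder` (Bernoulli draw)
    mutate(is_straightline_responder = runif(n()) < proportion_straightline_responder,
           # flagged rows get the default response; all other scores are left unchanged
           score = if_else(is_straightline_responder, value, score)) %>% 
    ungroup()
}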
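
To see what the function does before using it in a simulation, it can be applied to a small toy dataset. The sketch below is only an illustration: toy_data, toy_contaminated and the chosen mean of 0.5 are hypothetical, and the function is copied in so the block runs on its own. Because each row is flagged by an independent Bernoulli draw, the realized share of replaced rows should be close to, but not exactly equal to, the requested proportion.

```r
library(tidyverse)

# Copied from the function in this section so that the block is self-contained
contaminate_data <- function(data, proportion_straightline_responder, value = 0) {
  data %>%
    mutate(is_straightline_responder = runif(n()) < proportion_straightline_responder,
           score = if_else(is_straightline_responder, value, score)) %>%
    ungroup()
}

set.seed(42)

# Hypothetical toy data: 10,000 normally distributed scores
toy_data <- tibble(score = rnorm(n = 10000, mean = 0.5, sd = 1))

toy_contaminated <- contaminate_data(toy_data, proportion_straightline_responder = 0.1)

# Realized share of rows flagged as straight line responders
mean(toy_contaminated$is_straightline_responder)

# Every flagged row now holds the default value
all(toy_contaminated$score[toy_contaminated$is_straightline_responder] == 0)

# The mean of the contaminated scores is pulled toward the default value
mean(toy_data$score)
mean(toy_contaminated$score)
```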

18.5 Analyze data function

18.6 Simulation parameters

18.7 Run simulation

set.seed(42)
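
One possible end-to-end sketch of the workflow described in the assignment is given here. It is a sketch under stated assumptions, not a reference solution: the names generate_data, analyze_data, experiment_parameters_grid, simulation and simulation_summary are hypothetical, the two conditions get separate SD arguments (both set to 1), and "significant" is assumed to mean p < .05, which the assignment does not state. contaminate_data() is copied from section 18.4 so the block runs on its own.

```r
library(tidyverse)

# Assumed helper: two independent groups drawn from normal distributions
generate_data <- function(n_per_condition, mean_control, mean_intervention,
                          sd_control, sd_intervention) {
  bind_rows(
    tibble(condition = "control",
           score = rnorm(n = n_per_condition, mean = mean_control, sd = sd_control)),
    tibble(condition = "intervention",
           score = rnorm(n = n_per_condition, mean = mean_intervention, sd = sd_intervention))
  )
}

# Copied from section 18.4
contaminate_data <- function(data, proportion_straightline_responder, value = 0) {
  data %>%
    mutate(is_straightline_responder = runif(n()) < proportion_straightline_responder,
           score = if_else(is_straightline_responder, value, score)) %>%
    ungroup()
}

# Assumed helper: Student's t-test (equal variances), p value returned as a tibble
analyze_data <- function(data) {
  fit <- t.test(score ~ condition, data = data, var.equal = TRUE)
  tibble(p = fit$p.value)
}

# Parameter grid: every combination of conditions, repeated for each iteration
experiment_parameters_grid <- expand_grid(
  n_per_condition = 100,
  mean_control = 0,
  mean_intervention = c(0, 0.5),
  sd_control = 1,
  sd_intervention = 1,
  proportion_straightline_responder = c(0, 0.1),
  iteration = 1:1000
)

set.seed(42)

# mutate() + pmap() workflow: generate, contaminate, then analyze each row's dataset
simulation <- experiment_parameters_grid %>%
  mutate(generated_data = pmap(list(n_per_condition, mean_control, mean_intervention,
                                    sd_control, sd_intervention),
                               generate_data)) %>%
  mutate(corrupted_data = pmap(list(generated_data, proportion_straightline_responder),
                               contaminate_data)) %>%
  mutate(results = pmap(list(corrupted_data), analyze_data))

# Proportion of significant results per condition (alpha = .05 is an assumption)
simulation_summary <- simulation %>%
  unnest(results) %>%
  group_by(mean_intervention, proportion_straightline_responder) %>%
  summarize(proportion_significant = mean(p < .05), .groups = "drop")

simulation_summary
```

In the rows where mean_intervention is 0, proportion_significant estimates the false positive rate; in the rows where it is 0.5, it estimates power.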

18.8 Summarize results across iterations

[written description and interpretation of results here]
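
As a rough analytic cross-check on the simulated results, the effect of this kind of contamination on the population effect size can be worked out by hand. This is a sketch under assumptions: each score $X$ (mean $\mu$, standard deviation $\sigma$) is replaced by the default value 0 independently with probability $p$, as contaminate_data() does, and $X'$ denotes the contaminated score.

```latex
\begin{aligned}
\mathbb{E}[X'] &= (1-p)\,\mu \\
\operatorname{Var}[X'] &= (1-p)(\sigma^2 + \mu^2) - (1-p)^2\mu^2
                        = (1-p)\,\sigma^2 + p\,(1-p)\,\mu^2 \\[6pt]
\text{Intervention } (\mu = 0.5,\ \sigma = 1,\ p = 0.1):\quad
  &\mathbb{E}[X'] = 0.45, \qquad \operatorname{Var}[X'] = 0.9 + 0.0225 = 0.9225 \\
\text{Control } (\mu = 0,\ \sigma = 1,\ p = 0.1):\quad
  &\mathbb{E}[X'] = 0, \qquad \operatorname{Var}[X'] = 0.9 \\[6pt]
d' &= \frac{0.45 - 0}{\sqrt{(0.9225 + 0.9)/2}} \approx 0.47
\end{aligned}
```

Under these assumptions the standardized population difference shrinks from 0.50 to roughly 0.47. When both population means are 0, both groups are contaminated in the same way, so the population difference stays exactly zero.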

Remember that this is just one narrow simulation of the impact of one type of lazy/careless responding on one type of analysis - other forms and other analyses can be affected very differently.