E3: Statistical analysis using parametric tests


Axel Antoine, Sylvain Malacria, and Géry Casiez. 2017. ForceEdge: Controlling Autoscroll on Both Desktop and Mobile Computers Using the Force. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, NY, USA, 3281-3292. DOI: https://doi.org/10.1145/3025453.3025605

The goal is to replicate the results presented at the bottom right of page 3285, in the section Results and trial time.

Run a first ANOVA using all blocks

Goal: run a first ANOVA to see that there is a learning effect.

  1. Get the data from the first experiment. The CSV file provides columns with the participant number, task (Select, Move), technique (ForceEdge, Baseline), block number, dragging distance, repetition number, trial number (the participant had to successfully complete a trial before moving to the next one), trial completion time in seconds, the overshoot distance in lines, and, last, the success or error for the trial.
  2. Load the data into R.
  3. Use the filter command from the dplyr package to keep only the trials where success is True (2880 trials should remain).
  4. Aggregate the data to compute the mean time for each combination of participant, task, technique, block and distance. Use the group_by and summarise functions from dplyr.
  5. Convert the data to long format using the melt function (see examples provided in the course)
  6. Define factors task, technique, block and distance (see examples provided in the course)
  7. Run the ANOVA using ezANOVA (see examples provided in the course)
  8. Compare the results you obtain with the ones reported in the paper. You should obtain the same p and F values. Effect sizes will differ because R provides “generalized eta-squared”, while the results in the paper were obtained using SPSS, which gives “eta-squared” values.
  9. When relevant, run pairwise comparisons to understand where a significant effect or interaction comes from, using pairwise.t.test
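
The steps above can be sketched as follows. This is only an illustration under stated assumptions: the column names (participant, task, technique, block, distance, repetition, time, success) are guesses, not the real CSV headers, and a small simulated table stands in for the file so the sketch runs on its own. The Bonferroni adjustment in the pairwise comparison is also an assumption; use whatever the course or the paper specifies.

```r
library(dplyr)
library(ez)

# Simulated stand-in for the experiment CSV (NOT the real data).
# With the real file you would start from something like read.csv("your_file.csv").
# Every column name here is an assumption.
set.seed(1)
trials <- expand.grid(
  participant = 1:8,
  task        = c("Select", "Move"),
  technique   = c("ForceEdge", "Baseline"),
  block       = 1:3,
  distance    = c(100, 200, 300),
  repetition  = 1:4,
  stringsAsFactors = FALSE
)
trials$time <- rnorm(nrow(trials), mean = 5, sd = 1) +
  0.005 * trials$distance - 0.3 * trials$block
# Mark a fixed subset of trials as failed so that the filter has something to do.
trials$success <- !(trials$repetition == 4 & trials$block == 1)

# Step 3: keep successful trials only.
# If success was read as text rather than logical, use success == "True" instead.
ok <- filter(trials, success)

# Step 4: one mean time per participant x task x technique x block x distance.
cell_means <- ok %>%
  group_by(participant, task, technique, block, distance) %>%
  summarise(time = mean(time), .groups = "drop")

# Step 5: this table already has one row per participant and condition,
# which is the long layout ezANOVA expects; melt() is only needed for a wide table.

# Step 6: declare the identifier and the independent variables as factors.
cell_means <- cell_means %>%
  mutate(across(c(participant, task, technique, block, distance), factor))

# Step 7: repeated-measures ANOVA with all four within-subject factors.
res <- ezANOVA(
  data   = as.data.frame(cell_means),
  dv     = time,
  wid    = participant,
  within = .(task, technique, block, distance)
)
print(res)

# Step 9: paired pairwise comparisons between blocks.
# Average over the other factors first so each participant has one value per block,
# and sort so that the values line up participant by participant.
by_block <- cell_means %>%
  group_by(participant, block) %>%
  summarise(time = mean(time), .groups = "drop") %>%
  arrange(block, participant)
pairwise.t.test(by_block$time, by_block$block,
                paired = TRUE, p.adjust.method = "bonferroni")
```

On the real data, the row count after the filter is the first thing to check against the expected 2880 before going any further.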

Remove block 1 and re-run the ANOVA

Remove block 1, re-run the ANOVA, and check that you obtain the same results as those reported in the paper.
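
A minimal sketch of this step, again on simulated cell means with assumed column names (participant, technique, block, distance, time) rather than the real data. The point it illustrates is that after filtering out block 1 the unused factor level should be dropped, otherwise block still carries a level "1" with no observations.

```r
library(dplyr)
library(ez)

# Simulated cell means (NOT the real data); column names are assumptions.
set.seed(2)
cell_means <- expand.grid(
  participant = factor(1:8),
  technique   = factor(c("ForceEdge", "Baseline")),
  block       = factor(1:3),
  distance    = factor(c(100, 200, 300))
)
cell_means$time <- rnorm(nrow(cell_means), mean = 5, sd = 1)

# Remove block 1, then drop the now-empty level "1" from the block factor.
no_block1 <- cell_means %>%
  filter(block != "1") %>%
  droplevels()
levels(no_block1$block)

# Re-run the same ANOVA on the reduced data.
res_no_block1 <- ezANOVA(
  data   = no_block1,
  dv     = time,
  wid    = participant,
  within = .(technique, block, distance)
)
print(res_no_block1$ANOVA)
```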

Note on violations of sphericity

ezANOVA reports Mauchly’s Test for Sphericity, which lets you check whether the sphericity assumption of the repeated-measures ANOVA is met. The test outputs a table with W and p values. If the p-value is below 0.05, the assumption is violated for the corresponding independent variable or interaction between independent variables. In that case a Greenhouse-Geisser or Huynh-Feldt correction has to be applied to the degrees of freedom and the p-value. Generally the epsilon of Greenhouse-Geisser (GGe in the table) is used, but if it is greater than 0.75, you should use the epsilon of Huynh-Feldt instead.

You apply the corrections by multiplying the degrees of freedom by the epsilon of Greenhouse-Geisser or Huynh-Feldt.

For example, you will notice that sphericity is violated for the independent variable distance (p<0.05) and that the epsilon of Greenhouse-Geisser is 0.5766639. As a result, the original degrees of freedom F(2,30) need to be corrected into F(1.19, 17.96), and the Greenhouse-Geisser p-value should be reported (p<0.0001).
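
The decision rule and the multiplication described above can be written as a small helper. This is a sketch, not part of ez or apa: the function name correct_df and its arguments are made up for illustration, and the Mauchly p-value and Huynh-Feldt epsilon passed in the example call are placeholders, since only the Greenhouse-Geisser epsilon is quoted in the text.

```r
# Hypothetical helper: pick the sphericity correction and scale both degrees of freedom.
correct_df <- function(df_num, df_den, mauchly_p, gge, hfe,
                       alpha = 0.05, gg_limit = 0.75) {
  # Sphericity not violated: report the uncorrected degrees of freedom.
  if (mauchly_p >= alpha) {
    return(list(correction = "none", epsilon = 1,
                df_num = df_num, df_den = df_den))
  }
  # Use Greenhouse-Geisser unless its epsilon exceeds the limit, then Huynh-Feldt.
  use_hf <- gge > gg_limit
  # The Huynh-Feldt epsilon can exceed 1, so it is capped at 1.
  eps <- if (use_hf) min(hfe, 1) else gge
  list(correction = if (use_hf) "Huynh-Feldt" else "Greenhouse-Geisser",
       epsilon    = eps,
       df_num     = df_num * eps,
       df_den     = df_den * eps)
}

# Example with F(2, 30) and the Greenhouse-Geisser epsilon quoted in the text.
# mauchly_p and hfe are placeholder values, not taken from any real output.
out <- correct_df(2, 30, mauchly_p = 0.001, gge = 0.5766639, hfe = 0.6)
out$correction
round(c(out$df_num, out$df_den), 2)
```

With an epsilon of 0.5766639 the products come out at about 1.15 and 17.30. The corrected degrees of freedom follow whichever epsilon appears in your own output, so they will change if the epsilon differs, for instance between the ANOVA on all blocks and the one without block 1.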

The anova_apa function from the apa package automatically applies the corrections and presents the results of the ANOVA in APA style.


Follow the submission instructions on the course information page. Provide the Rmd file and its html output (using the Knit button on the interface) showing all the different steps to run the ANOVAs. Make good use of markdown syntax to present your results in a way that is easy to read and understand. In your solution notes (integrated in the Rmd file), describe any problems you ran into, and the main resource(s) you used (blog posts, online tutorials, Stack Overflow posts, papers, textbooks, etc.). Give each resource a brief description of what it is and how it helped you.