1 Preparing data

1.1 Loading data

data = read.table("https://cs.uwaterloo.ca/~gcasiez/cs889f18/exercises/fittsdata.csv", header=TRUE, sep=",")
#data = read.table("fittsdata.csv", header=TRUE, sep=",")
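
The code in the rest of this document calls functions from several packages that are never loaded explicitly. The following setup is a sketch inferred from the function names used; in particular, melt is assumed to come from reshape2 (the reshape package provides a function of the same name), and anova_apa is assumed to come from the apa package.

```r
# Packages assumed from the functions called in this document
library(dplyr)     # %>%, rename, mutate, group_by, summarise, filter
library(reshape2)  # melt (assumption: could also be the reshape package)
library(ez)        # ezANOVA, ezStats, ezPlot
library(apa)       # anova_apa
library(ARTool)    # art
library(knitr)     # kable
library(pander)    # pander
library(ggplot2)   # ggplot, geom_jitter, geom_smooth
```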

1.2 Preparing data

The err column records the number of errors made before the target was selected, not whether an error occurred. Computing error rates directly from err gives wrong values (they can exceed 100% in some cases!), so we add a new column, error, that is 0 when the trial had no error and 1 otherwise. We also rename some columns and recode their values with more descriptive names.

data_trans =  data %>% 
              dplyr::rename(device = ismobile) %>% # rename ismobile with more appropriate name
              mutate(device = ifelse(device == "True", "mobile", "desktop")) %>% # set proper names
              mutate(w = ifelse(w == 0.035, "Ws", "Wl")) %>% # set proper names
              mutate(d = ifelse(d == 0.35, "Ds", "Dl")) %>% # set proper names
              mutate(error = ifelse(err == 0, 0, 1)) # add a new column error to indicate if there is an error
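
To see why the raw err counts cannot be averaged directly, here is a small base-R sketch with made-up trials (the vector toy_err is hypothetical and not taken from fittsdata.csv). A trial with several misses counts several times in mean(err), whereas the 0/1 indicator counts it once.

```r
# Hypothetical err values for 4 trials: number of misses before the target was hit
toy_err <- c(0, 0, 3, 2)

# Naive rate: averages the miss counts, so it can exceed 100%
naive_rate <- mean(toy_err) * 100

# Indicator-based rate: proportion of trials with at least one miss
toy_error <- ifelse(toy_err == 0, 0, 1)
error_rate <- mean(toy_error) * 100

print(c(naive = naive_rate, indicator = error_rate))
```

Half of these toy trials contain an error, which is what the indicator-based rate reports; the naive rate is inflated by the trials with multiple misses.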

2 Evaluating H1

H1 is “Pointing using indirect interaction (e.g. using a mouse/touchpad) results in lower error rates, especially for small targets”. To answer this question we need to investigate the effect of device and target size on error rate. More interestingly, this hypothesis suggests that there is an interaction between device and target size.

First compute error rate for each participant x device x d x w condition. There is no hypothesis on orientation so we can aggregate the orientations. Also we need some level of aggregation to compute error rates.

data_aggr =   data_trans %>% 
              group_by(participant, device, d, w) %>% 
              summarise(errRate = mean(error)*100) 

2.1 Run ANOVA on error rate

Convert data using the long format

data.long = melt(data_aggr, id = c("participant","device","d", "w","errRate"))

Indicate which variables are the independent variables (factors)

data.long$device = factor(data.long$device)
data.long$d = factor(data.long$d)
data.long$w = factor(data.long$w)

We have a mixed design with d and w administered within subjects and device administered between subjects.

results = ezANOVA(data.long, dv=.(errRate), wid=.(participant), within=.(d,w), between=.(device), detailed = TRUE)
       Effect DFn DFd          SSn      SSd          F         p p<.05       ges
1 (Intercept)   1  11 13452.002543 3868.064 38.2548059 0.0000686     * 0.5597425
2      device   1  11  9498.115521 3868.064 27.0107416 0.0002958     * 0.4730468
3           d   1  11   228.862047 4249.803  0.5923763 0.4577113       0.0211726
5           w   1  11  5721.551176 1132.920 55.5529577 0.0000127     * 0.3509717
4    device:d   1  11   190.756214 4249.803  0.4937448 0.4968628       0.0177098
6    device:w   1  11  4095.941937 1132.920 39.7692308 0.0000579     * 0.2790831
7         d:w   1  11    25.429116 1329.693  0.2103646 0.6554112       0.0023976
8  device:d:w   1  11     8.514213 1329.693  0.0704346 0.7956130       0.0008041
kable(results$`Mauchly's Test for Sphericity`)
kable(results$`Sphericity Corrections`)
kable(anova_apa(results, sph_corr ="gg", print=FALSE))
effect text
(Intercept) F(1, 11) = 38.25, p < .001, petasq = .78
device F(1, 11) = 27.01, p < .001, petasq = .71
d F(1, 11) = 0.59, p = .458, petasq = .05
w F(1, 11) = 55.55, p < .001, petasq = .83
device:d F(1, 11) = 0.49, p = .497, petasq = .04
device:w F(1, 11) = 39.77, p < .001, petasq = .78
d:w F(1, 11) = 0.21, p = .655, petasq = .02
device:d:w F(1, 11) = 0.07, p = .796, petasq < .01
kable(ezStats(data.long, dv=.(errRate), wid=.(participant), between=.(device)))
device N Mean SD FLSD
desktop 7 3.571429 6.672321 11.44711
mobile 6 30.681818 11.831286 11.44711
kable(ezStats(data.long, dv=.(errRate), wid=.(participant), within=.(w)))
w N Mean SD FLSD
Wl 13 5.594406 8.121377 12.61424
Ws 13 26.573427 26.623986 12.61424
kable(ezStats(data.long, dv=.(errRate), wid=.(participant), within=.(w), between=.(device)))
device w N Mean SD FLSD
desktop Wl 7 1.298701 3.436041 8.761203
desktop Ws 7 5.844156 10.066681 8.761203
mobile Wl 6 10.606061 9.389050 8.761203
mobile Ws 6 50.757576 16.618387 8.761203
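
The cell means in the device x w table above already hint at the interaction. As a rough illustration (not a replacement for the ANOVA), the following base-R sketch computes the effect of width within each device and the difference between those two effects; the object names are introduced here for illustration only.

```r
# Cell means (error rate, %) copied from the device x w table above
cell_means <- matrix(
  c(1.298701, 5.844156,
    10.606061, 50.757576),
  nrow = 2, byrow = TRUE,
  dimnames = list(device = c("desktop", "mobile"), w = c("Wl", "Ws"))
)

# Simple effect of width within each device (Ws minus Wl)
width_effect <- cell_means[, "Ws"] - cell_means[, "Wl"]

# Interaction contrast: how much larger the width effect is on mobile
interaction_contrast <- width_effect[["mobile"]] - width_effect[["desktop"]]

print(round(width_effect, 2))
print(round(interaction_contrast, 2))
```

Moving from the large to the small target costs about 4.5 percentage points on desktop but about 40 on mobile, which is the pattern the device:w term tests.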

2.2 Interpretation of the ANOVA

The ANOVA shows a significant main effect of device (\(F_{1, 11} = 27.01, p < .001, \eta_p^2 = .71\)) and width (\(F_{1, 11} = 55.55, p < .001, \eta_p^2 = .83\)), and a significant device x width interaction (\(F_{1, 11} = 39.77, p < .001, \eta_p^2 = .78\)) on error rate. Overall the error rate is 30.7% in the mobile condition and 3.6% in the desktop condition, and it increases from 5.6% for the large target to 26.6% for the small one. Let’s analyze where the interaction between device and width comes from.
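
The effect sizes quoted from anova_apa are partial eta squared (the petasq values), whereas the ges column of the ezANOVA table is generalized eta squared. As a sanity check, the sketch below recomputes partial eta squared and the F ratios for the three significant effects from the SSn and SSd columns; the data frame ss and its column names are introduced here for illustration.

```r
# SSn and SSd copied from the ezANOVA table for the three significant effects
ss <- data.frame(
  effect = c("device", "w", "device:w"),
  SSn = c(9498.115521, 5721.551176, 4095.941937),
  SSd = c(3868.064, 1132.920, 1132.920),
  DFn = 1,
  DFd = 11
)

# Partial eta squared: effect sum of squares over effect plus its error term
ss$petasq <- ss$SSn / (ss$SSn + ss$SSd)

# F ratio: mean square of the effect over mean square of its error term
ss$Fvalue <- (ss$SSn / ss$DFn) / (ss$SSd / ss$DFd)

print(cbind(ss["effect"], round(ss[, c("petasq", "Fvalue")], 2)))
```

The recomputed values agree with the anova_apa output to two decimals, which confirms that the reported effect sizes are partial rather than generalized eta squared.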

Note that the ANOVA does not provide results for Mauchly’s Test for Sphericity. This is expected: the documentation says “Only reported for effects >2 levels because sphericity necessarily holds for effects with only 2 levels.”

Error rates do not necessarily follow a normal distribution, and the following Shapiro-Wilk test confirms that normality is violated in three of the four device x width cells.

check_normal = data_aggr %>%
                group_by(device, w) %>%
                summarize(p_value = shapiro.test(errRate)$p.value)
device w p_value
desktop Wl 0.0000017
desktop Ws 0.0000545
mobile Wl 0.0055549
mobile Ws 0.4626207

2.3 Investigating the significant device x width interaction

First investigate using t-tests, even though we know the distributions are not normal.

pw <- with(data.long, pairwise.t.test(errRate, interaction(device, w), p.adjust.method = "bonferroni"))
           desktop.Wl mobile.Wl desktop.Ws
mobile.Wl   0.7513503        NA         NA
desktop.Ws  1.0000000     1e+00         NA
mobile.Ws   0.0000000     3e-07          0
 ezPlot(data.long, dv=.(errRate), wid=.(participant), within=.(w), between=.(device), x=.(w), do_lines = FALSE)

Also investigate using Mann-Whitney pairwise comparisons, which are robust to violations of normality. Results show the same significant differences.

pander(with(data.long, pairwise.wilcox.test(errRate, interaction(device, w), p.adjust.method = "bonferroni")))
  • method: Wilcoxon rank sum test
  • data.name: errRate and interaction(device, w)
  • p.value:

      desktop.Wl mobile.Wl desktop.Ws
    mobile.Wl 0.2076 NA NA
    desktop.Ws 1 1 NA
    mobile.Ws 4.494e-05 0.001942 0.0002633
  • p.adjust.method: bonferroni

kable(aggregate( errRate~device+w, data.long, mean ))
device w errRate
desktop Wl 1.298701
mobile Wl 10.606061
desktop Ws 5.844156
mobile Ws 50.757576
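
Both pairwise tests above use a Bonferroni adjustment. As a rough sketch of what that adjustment does, the example below applies it to made-up p-values (raw_p is hypothetical and does not come from the data). With four device x width cells there are six pairwise comparisons, so each raw p-value is multiplied by six and capped at 1.

```r
# Hypothetical unadjusted p-values for three of the pairwise comparisons
raw_p <- c(0.001, 0.02, 0.3)

# Four device x width cells give choose(4, 2) = 6 pairwise comparisons
n_comparisons <- choose(4, 2)

# Bonferroni: multiply each p-value by the number of comparisons, cap at 1
adjusted_p <- p.adjust(raw_p, method = "bonferroni", n = n_comparisons)

print(adjusted_p)
```

This is why several entries in the adjusted tables are exactly 1: any raw p-value above 1/6 is capped.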

2.4 Run the analysis using Aligned Rank Transform (ART)

ART transforms non-normal data so that a factorial ANOVA can be run on it. This was not covered in the previous exercises, but it is the most appropriate way to analyze the error rate here.

m <- art(errRate ~ device*d*w + (1|participant), data=data.long)
anova(m)
Term F Df Df.res Pr(>F)
device device 27.0101715 1 11 0.0002958
d d 3.6782768 1 33 0.0638140
w w 40.3932928 1 33 0.0000003
device:d device:d 0.9121743 1 33 0.3464826
device:w device:w 29.2477257 1 33 0.0000055
d:w d:w 1.2459526 1 33 0.2723890
device:d:w device:d:w 0.0762207 1 33 0.7842075

Results show the same significant effects of device and width and the same significant interaction between device and width.

2.5 Conclusion

Subsequent pairwise comparisons on the significant device x width interaction do not reveal a significant difference between mobile and desktop for Wl, but they do reveal a significant difference between the two devices for Ws. As a result we can say that H1 is supported. However, these results need to be considered with care because the physical size of the targets in millimeters was not the same on the two devices. The targets were smaller on the mobile screens, as their size is defined as a percentage of screen width and mobile screens were smaller. Further experiments showing targets with the same physical size on both devices would be required to fully answer this question. However, web browsers hardly allow a page to determine a screen’s pixel density.

3 Evaluating H2

H2 is “Pointing times increase with longer distances and smaller target widths”. To answer this question we need to investigate the effect of target distance and size on movement time.

3.1 Visualizing data

We first plot the raw data from the participants

data_noerr =   data_trans %>% 
              filter(err == 0)
ggplot() +
  geom_jitter(aes(interaction(device,d,w), time), data = data_noerr, colour = I("red"), position = position_jitter(width = 0.05)) +
  scale_y_continuous(limits = c(0, 2000)) +
  theme(panel.background = element_blank(),
        panel.grid.major.y = element_line( size = .1, color = "grey"))

We can observe some outliers in the distribution. Maybe some participants got distracted when selecting some targets. We remove the outliers by discarding trials that are more than 3 standard deviations away from the mean of their device x d x w condition.

data_filt =   data_trans %>% 
              filter(err == 0) %>%
              group_by(device,d,w) %>%
              filter(!(abs(time - mean(time)) > 3*sd(time)))

ggplot() +
  geom_jitter(aes(interaction(device,d,w), time), data = data_filt, colour = I("red"), position = position_jitter(width = 0.05)) +
  scale_y_continuous(limits = c(0, 2000)) +
  theme(panel.background = element_blank(),
        panel.grid.major.y = element_line( size = .1, color = "grey"))
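
The outlier filter used above can be tried on a toy vector to see what it keeps. This is a minimal base-R sketch with made-up times (toy_time is hypothetical), standing in for the trials of a single device x d x w condition.

```r
# Hypothetical movement times (ms) for one condition, with one distracted trial
toy_time <- c(seq(700, 880, by = 10), 5000)

# Keep trials within 3 standard deviations of the condition mean
keep <- abs(toy_time - mean(toy_time)) <= 3 * sd(toy_time)
toy_filtered <- toy_time[keep]

print(length(toy_time))      # number of trials before filtering
print(length(toy_filtered))  # number of trials after filtering
```

Note that the mean and standard deviation are computed with the outlier included, so the threshold itself is pulled upward by extreme trials; with very few trials per condition a single outlier may not exceed it.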

We consider only trials with no error and aggregate by participant, device, d and w.

data_aggr =   data_filt %>% 
              filter(err == 0) %>% 
              group_by(participant, device, d, w) %>%   
              summarise(time = mean(time))

3.2 Run ANOVA on movement time

Convert data using the long format

data.long = melt(data_aggr, id = c("participant","device","d", "w","time"))

Indicate which variables are the independent variables (factors)

data.long$device = factor(data.long$device)
data.long$d = factor(data.long$d)
data.long$w = factor(data.long$w)
results = ezANOVA(data.long, dv=.(time), wid=.(participant), within=.(d,w), between=.(device), detailed = TRUE)
       Effect DFn DFd          SSn       SSd           F         p p<.05       ges
1 (Intercept)   1  11 37225570.511 485826.67 842.8546737 0.0000000     * 0.9799869
2      device   1  11   415283.404 485826.67   9.4027721 0.0107288     * 0.3532828
3           d   1  11    63380.160 111200.75   6.2695777 0.0293019     * 0.0769555
5           w   1  11    96306.091  61204.33  17.3086943 0.0015884     * 0.1124386
4    device:d   1  11     2146.772 111200.75   0.2123591 0.6539018       0.0028159
6    device:w   1  11     3535.460  61204.33   0.6354135 0.4422430       0.0046291
7         d:w   1  11    29074.086 101983.47   3.1359487 0.1042528       0.0368358
8  device:d:w   1  11     3210.703 101983.47   0.3463084 0.5680968       0.0042057
kable(results$`Mauchly's Test for Sphericity`)
kable(results$`Sphericity Corrections`)
kable(anova_apa(results, sph_corr ="gg", print=FALSE))
effect text
(Intercept) F(1, 11) = 842.85, p < .001, petasq = .99
device F(1, 11) = 9.40, p = .011, petasq = .46
d F(1, 11) = 6.27, p = .029, petasq = .36
w F(1, 11) = 17.31, p = .002, petasq = .61
device:d F(1, 11) = 0.21, p = .654, petasq = .02
device:w F(1, 11) = 0.64, p = .442, petasq = .05
d:w F(1, 11) = 3.14, p = .104, petasq = .22
device:d:w F(1, 11) = 0.35, p = .568, petasq = .03
kable(aggregate( time~device, data.long, mean ))
device time
desktop 928.8313
mobile 749.5688
kable(aggregate( time~w, data.long, mean ))
w time
Wl 803.0594
Ws 889.1301
kable(aggregate( time~d, data.long, mean ))
d time
Dl 881.0068
Ds 811.1827

3.3 Conclusion

The ANOVA reveals a significant main effect of device (\(F_{1, 11} = 9.40, p = .011, \eta_p^2 = .46\)), d (\(F_{1, 11} = 6.27, p = .029, \eta_p^2 = .36\)) and w (\(F_{1, 11} = 17.31, p = .002, \eta_p^2 = .61\)) on movement time. There is no need to run pairwise comparisons as there are only 2 levels for each independent variable. Mobile (0.75 s) is significantly faster than desktop (0.93 s). Time increases with smaller target widths (from 0.80 s for the largest target to 0.89 s for the smallest one) and with longer distances (from 0.81 s for the smallest distance to 0.88 s for the largest one). We can conclude that H2 is supported.

4 Evaluating H3

H3 is “Pointing times follow Fitts’ law for both direct and indirect devices”. This requires computing, for each device, the linear relationship between movement time and the index of difficulty, computed as \(\log_2\left(\frac{d}{w}+1\right)\) using the MacKenzie formulation.
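
Written out, the relationship being tested is the usual linear form of Fitts’ law. The symbols \(MT\) (movement time), \(ID\) (index of difficulty), \(a\) (intercept) and \(b\) (slope) are notation introduced here for illustration; the code below uses time and id for the first two.

\[ MT = a + b \cdot ID, \qquad ID = \log_2\left(\frac{d}{w} + 1\right) \]

H3 then amounts to asking whether a straight line in \(ID\) describes the mean times well for each device.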

We first keep trials with no error, remove outliers and compute the index of difficulty. The common practice to compute linear regression for Fitts’ law is to aggregate all data for each (d,w).

data_aggr2 =   data %>% 
              filter(err == 0) %>% 
              mutate(id = log2(d/w+1)) %>%
              dplyr::rename(device = ismobile) %>% 
              mutate(device = ifelse(device == "True", "mobile", "desktop")) %>% 
              group_by(device, d, w, id) %>%   
              filter(!(abs(time - mean(time)) > 3*sd(time))) %>%
              summarise(time = mean(time)) 
device d w id time
desktop 0.35 0.035 3.459432 915.0685
desktop 0.35 0.070 2.584963 895.9474
desktop 0.70 0.035 4.392317 1024.2361
desktop 0.70 0.070 3.459432 887.2267
mobile 0.35 0.035 3.459432 752.7353
mobile 0.35 0.070 2.584963 671.0667
mobile 0.70 0.035 4.392317 805.7241
mobile 0.70 0.070 3.459432 722.6429
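
The id column in the table above can be checked by hand. The following base-R sketch recomputes the index of difficulty for the four d and w combinations listed in the table; the data frame name conditions is introduced here for illustration.

```r
# The two widths and two distances that appear in the table above
conditions <- expand.grid(w = c(0.035, 0.070), d = c(0.35, 0.70))

# Index of difficulty, same formula as in the pipeline
conditions$id <- log2(conditions$d / conditions$w + 1)

print(round(conditions, 6))
```

Two of the four conditions share the same ratio d/w = 10 and therefore the same index of difficulty, so each regression is fitted on four points that cover only three distinct ID values.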

4.1 Plot

ggplot(data_aggr2, aes(x=id, y=time, group = device, color = device)) + 
  geom_point() + 
  geom_smooth(method=lm) +
  expand_limits(x = c(0,5), y = c(0,1500)) +
  scale_color_manual(values=c("red", "blue"), labels=c('Desktop','Mobile'), name="Device") +
  labs(title="Fitts regressions", x="ID (bit)", y="Mean time (ms)") +
  theme(panel.background = element_blank(),
        panel.grid.major.y = element_line( size = .1, color = "grey"))

4.2 Compute linear regression for mobile device

data_aggr3 = data_aggr2 %>% filter(device == 'mobile')
summary(lm(time ~ id, data_aggr3))
## Call:
## lm(formula = time ~ id, data = data_aggr3)
## Residuals:
##        1        2        3        4 
##  15.7808  -0.7583  -0.7108 -14.3117 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   479.30      41.68  11.501  0.00748 **
## id             74.48      11.80   6.313  0.02419 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 15.08 on 2 degrees of freedom
## Multiple R-squared:  0.9522, Adjusted R-squared:  0.9283 
## F-statistic: 39.85 on 1 and 2 DF,  p-value: 0.02419

The adjusted \(R^2\) value is 0.93 and the p value of the model is below 0.05, suggesting that pointing on the mobile device follows Fitts’ law very well.

4.3 Compute linear regression for desktop device

data_aggr4 = data_aggr2 %>% filter(device == 'desktop')
summary(lm(time ~ id, data_aggr4))
## Call:
## lm(formula = time ~ id, data = data_aggr4)
## Residuals:
##      1      2      3      4 
## -14.50  29.34  27.50 -42.34 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   680.50     117.56   5.788   0.0286 *
## id             72.00      33.28   2.163   0.1630  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 42.55 on 2 degrees of freedom
## Multiple R-squared:  0.7006, Adjusted R-squared:  0.5509 
## F-statistic:  4.68 on 1 and 2 DF,  p-value: 0.163

The adjusted \(R^2\) value is 0.55 and the p value of the model is above 0.05, suggesting that pointing on the desktop does not follow Fitts’ law very well. Visual inspection of the plot suggests that participants took longer than expected to select the target with the lowest index of difficulty. Further investigation would be required to understand why.
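
Both fits can be reproduced from the aggregated table alone, without the raw data. The sketch below copies the eight mean times from the data_aggr2 output, fits one line per device, and uses the mobile fit to predict the time for a hypothetical 3-bit target; the names fitts, fits, coefs and pred_mobile are introduced here for illustration.

```r
# Aggregated means copied from the data_aggr2 table (time in ms)
fitts <- data.frame(
  device = rep(c("desktop", "mobile"), each = 4),
  id = rep(c(log2(11), log2(6), log2(21), log2(11)), times = 2),
  time = c(915.0685, 895.9474, 1024.2361, 887.2267,
           752.7353, 671.0667, 805.7241, 722.6429)
)

# One regression per device: time = a + b * id
fits <- lapply(split(fitts, fitts$device), function(df) lm(time ~ id, data = df))

# Intercept a (ms) and slope b (ms/bit) for each device
coefs <- t(sapply(fits, coef))
print(round(coefs, 2))

# Predicted time for a hypothetical 3-bit target on mobile
pred_mobile <- predict(fits$mobile, newdata = data.frame(id = 3))
print(round(unname(pred_mobile), 1))
```

The slopes of the two devices are close (roughly 72 to 74 ms per bit); the devices differ mainly in their intercepts and in how tightly the four points follow the line.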

4.4 Conclusion

We can only partially validate H3 from the results of the experiment.