Loading packages

library(simr)

library(effectsize)

Defining covariates

part <- factor(1:40)
place.residence <- factor(c("A_cat", "B_abroad"))
time <- factor(c("A_prepand", "B_postpand"))
place.residence_full <- rep(rep(place.residence, each=20),2)
time_full <- rep(time, each=40)
part_full <- rep(part, each=2)

covars <- data.frame( participant =part_full, place.residence=place.residence_full, time=time_full)

head(covars)

##   participant place.residence      time
## 1           1           A_cat A_prepand
## 2           1           A_cat A_prepand
## 3           2           A_cat A_prepand
## 4           2           A_cat A_prepand
## 5           3           A_cat A_prepand
## 6           3           A_cat A_prepand

Creating the Generalized Mixed Effects Model

Determining the estimates

We have to define intercept and slops for the fixed effects.

The model has 1 intercept.

Every fixed effect has 2 levels = 1 slope per fixed effect

What is the likelihood of Catalans celebrating caga tió when they are abroad compared to when they are in Cat?
What is the likelihood of Catalans celebrating caga tió after the pandemic when compared to before the pandemic?

The interaction (2x2) has 1 slope = 1 slope per interaction

Does time (before pandemic-after pandemic) moderate the likelihood of Catalans celebrating caga tió with respect to their place of residence (abroad or in Cat)?

Calculating the log odds ratio

We have to calculate the log odds ratio to include them in our model.

We will first determine the effect size (based on previous studies!)

We will work with a medium effect size

The statistic for effect sizes is Cohen’s d (small = 0.2, medium = 0.5, large = 0.8)

d_to_oddsratio(0.5, log = FALSE)

## [1] 2.476632

or <- c(2.48, 2.48, 2.48, 2.48)
fixed <- log(or)

Adding random effects

Random intercept for participants

rand <- 0.4

http://eshinjolly.com/pymer4/rfx_cheatsheet.html

MakeGlmer function from simr package

gmodel <- makeGlmer(x ~  place.residence * time + (1|participant) , family="binomial",fixef=fixed, VarCorr=rand,  data=covars)

summary(gmodel)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: x ~ place.residence * time + (1 | participant)
##    Data: covars
## 
##      AIC      BIC   logLik deviance df.resid 
##    117.7    129.6    -53.9    107.7       75 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0000  0.8463  0.8463  0.8463  1.0000 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  participant (Intercept) 0.4      0.6325  
## Number of obs: 80, groups:  participant, 40
## 
## Fixed effects:
##                                        Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                              0.9083     0.4939   1.839   0.0659 .
## place.residenceB_abroad                  0.9083     0.6988   1.300   0.1937  
## timeB_postpand                           0.9083     0.6980   1.301   0.1932  
## place.residenceB_abroad:timeB_postpand   0.9083     0.9886   0.919   0.3582  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) plc.B_ tmB_ps
## plc.rsdncB_ -0.707              
## timB_pstpnd -0.708  0.500       
## plc.rsB_:B_  0.500 -0.707 -0.706

Running Power

Power for place of residence

power.fixed <- powerSim(gmodel, nsim=100, test = compare(~time +(1|participant)), progress = FALSE)

power.fixed

## Power for model comparison, (95% confidence interval):
##       26.00% (17.74, 35.73)
## 
## Test: Likelihood ratio
##       Comparison to ~time + (1 | participant)
## 
## Based on 100 simulations, (38 warnings, 1 error)
## alpha = 0.05, nrow = 80
## 
## Time elapsed: 0 h 0 m 37 s

Power for the interaction

power.interaction <- powerSim(gmodel, nsim=100, test = compare(~time + place.residence  +(1|participant)), progress = FALSE)

power.interaction

## Power for model comparison, (95% confidence interval):
##        6.00% ( 2.23, 12.60)
## 
## Test: Likelihood ratio
##       Comparison to ~time + place.residence + (1 | participant)
## 
## Based on 100 simulations, (32 warnings, 0 errors)
## alpha = 0.05, nrow = 80
## 
## Time elapsed: 0 h 0 m 40 s

Extending the model

model_ext <- extend(gmodel, along = 'participant', n=150)
nrow(getData(model_ext))

## [1] 300

summary(model_ext)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: x ~ place.residence * time + (1 | participant)
##    Data: covars
## 
##      AIC      BIC   logLik deviance df.resid 
##    117.7    129.6    -53.9    107.7       75 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0000  0.8463  0.8463  0.8463  1.0000 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  participant (Intercept) 0.4      0.6325  
## Number of obs: 80, groups:  participant, 40
## 
## Fixed effects:
##                                        Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                              0.9083     0.4939   1.839   0.0659 .
## place.residenceB_abroad                  0.9083     0.6988   1.300   0.1937  
## timeB_postpand                           0.9083     0.6980   1.301   0.1932  
## place.residenceB_abroad:timeB_postpand   0.9083     0.9886   0.919   0.3582  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) plc.B_ tmB_ps
## plc.rsdncB_ -0.707              
## timB_pstpnd -0.708  0.500       
## plc.rsB_:B_  0.500 -0.707 -0.706

Power for place of residence in extended model

power.fixed <- powerSim(model_ext, nsim=10, test = compare(~time +(1|participant)), progress = FALSE)

power.fixed

## Power for model comparison, (95% confidence interval):
##       90.00% (55.50, 99.75)
## 
## Test: Likelihood ratio
##       Comparison to ~time + (1 | participant)
## 
## Based on 10 simulations, (1 warning, 0 errors)
## alpha = 0.05, nrow = 300
## 
## Time elapsed: 0 h 0 m 5 s

Power for the interaction in extended model

power.interaction <- powerSim(model_ext, nsim=10, test = compare(~time + place.residence  +(1|participant)), progress = FALSE)

power.interaction

## Power for model comparison, (95% confidence interval):
##       20.00% ( 2.52, 55.61)
## 
## Test: Likelihood ratio
##       Comparison to ~time + place.residence + (1 | participant)
## 
## Based on 10 simulations, (1 warning, 0 errors)
## alpha = 0.05, nrow = 300
## 
## Time elapsed: 0 h 0 m 5 s

You need 16 times the sample size when power is 80% for main effect to estimate an interaction with power 80%!

Source(https://statmodeling.stat.columbia.edu/2018/03/15/need-16-times-sample-size-estimate-interaction-estimate-main-effect/)