Lightnews — Scholar-powered news

Giacomo Bignardi @bignardi.bsky.social · 8h

Neat! I have brms code for replicating (and going beyond) alpha too, in case it's useful: www.bignardi.co.uk/8_bayes_reli...

Estimating Mean Score Reliability with RMU

www.bignardi.co.uk


Giacomo Bignardi @bignardi.bsky.social · 9h

Hmm, interesting. I've had this on my reading list but have not read it yet; the intro is so clear and great! I disagree that reliability implies equal precision across scores, but it seems like an interesting way of going beyond reliability...


Giacomo Bignardi @bignardi.bsky.social · 9h

Good point, would be interesting to try (and also force myself to learn generalizability theory in more detail)!


Giacomo Bignardi @bignardi.bsky.social · 10h

Helping a PhD student with SDT models was actually the original inspiration for this side project! I have example code applying the method to an SDT model below (actually borrowing code from @matti.vuorre.com's helpful blog on this) ->

Tutorial: Calculating RMU reliability for a go/no-go task

www.bignardi.co.uk


Giacomo Bignardi @bignardi.bsky.social · 14h

Ah, I misunderstood; no introduction rewrite needed yet then 😅. Good to know it gives similar results to the other approaches!


Giacomo Bignardi @bignardi.bsky.social · 14h

Exciting! Just to check - can you get confidence intervals from PSI & EAP too?


Reposted by Giacomo Bignardi

Magnus Johansson @pgmj.bsky.social · 16h

This is really neat. I have borrowed the reliability() function for my `easyRasch` package, and I use plausible values instead of fully Bayesian estimation to produce similar estimates/CIs; see the code example below. RMU point estimates are similar to EAP reliability.

pgmj.github.io/easyRasch/re...

Giacomo Bignardi @bignardi.bsky.social · 9d

New preprint with @rogierk.bsky.social @paulbuerkner.com - we introduce "relative measurement uncertainty" - a reliability estimation method that's applicable across a broad class of Bayesian measurement models (e.g., generative-, computational- and item response theory-models): osf.io/h54k8


Giacomo Bignardi @bignardi.bsky.social · 2d

I've added a new example to our paper's repo, demonstrating how our reliability method replicates Cronbach's alpha for a simple model, but also how our method can account for: (i) binary data, (ii) varying numbers of items per pps, & (iii) improvement over trials www.bignardi.co.uk/8_bayes_reli...


Giacomo Bignardi @bignardi.bsky.social · 9d

Thanks Edwin! I only learned from the best 😜


Giacomo Bignardi @bignardi.bsky.social · 9d

Great! Sorry just saw this before my replies :)

Giacomo Bignardi @bignardi.bsky.social · 9d

I think both methods should converge to the same answer as the number of draws -> ∞. I think your approach is 1-V_w/V whereas empirical reliability is defined as v_a/(v_a+v_w) - but it should be the same as draws -> ∞. Pic from onlinelibrary.wiley.com/doi/10.1002/...
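A short check of the algebra behind this reply, written with a single set of symbols. It assumes (this is not stated in the post) that V_w and v_w denote the same within-subject posterior variance, that V_a is the between-subject variance of posterior means, and that the total variance splits exactly as V = V_a + V_w. With a finite number of draws the estimated pieces may not add up exactly, which would explain small differences between the two estimates.

```latex
\[
V = V_a + V_w
\quad\Longrightarrow\quad
1 - \frac{V_w}{V} \;=\; \frac{V - V_w}{V} \;=\; \frac{V_a}{V_a + V_w}.
\]
```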

Giacomo Bignardi @bignardi.bsky.social · 9d

Hi Ruben - I had the same thought - and it wasn't easy finding the reference in an old IRT manual! In your post, do you divide the variance of subjects' posterior means by the total variance in MCMC draws across subjects? The best description of empirical reliability (ER) is in this answer by Phil Chalmers: tinyurl.com/4f3vv5e8

Difference between empirical and marginal reliability of an IRT model

I am using the mirt library in R to fit an instrument (binary responses) comprising two dimensions. In the mirt documentation are mentioned two types of reliability. I would like to ask what is the

tinyurl.com



Giacomo Bignardi @bignardi.bsky.social · 9d

Any (ideally helpful) comments/feedback/critique welcome! Email at: [email protected]


Giacomo Bignardi @bignardi.bsky.social · 9d

Alongside the preprint osf.io/h54k8 I've uploaded an example of how to calculate RMU with go/no-go task data, comparing RMU with split-half and test-retest reliability estimates www.bignardi.co.uk/8_bayes_reli... - more software support to come if there's demand.

Tutorial: Calculating RMU reliability for a go/no-go task

www.bignardi.co.uk


Giacomo Bignardi @bignardi.bsky.social · 9d

In the paper, we also demonstrate that our method yields a similar point estimate to the variance of subjects' posterior means (PMV) divided by the sum of PMV and the average posterior variance. However, our approach also provides credible intervals, which are essential when sample sizes are small.
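The point estimate described in this post can be sketched in a few lines of Python. This is only an illustration, not the paper's code: the function name `empirical_reliability`, the list-of-lists input layout (one list of posterior draws per subject), and the use of sample variances (n - 1 denominator) are all assumptions.

```python
import statistics


def empirical_reliability(posterior_draws):
    """PMV / (PMV + average posterior variance).

    posterior_draws: one list of posterior draws per subject
    (hypothetical layout chosen for this sketch).
    """
    # Posterior mean and posterior variance for each subject.
    means = [statistics.fmean(draws) for draws in posterior_draws]
    variances = [statistics.variance(draws) for draws in posterior_draws]

    # PMV: variance of the posterior means across subjects.
    pmv = statistics.variance(means)
    # Average within-subject posterior variance.
    avg_posterior_var = statistics.fmean(variances)

    return pmv / (pmv + avg_posterior_var)
```

Note that this returns a single number, with no interval around it, which is the limitation the post points to.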


Giacomo Bignardi @bignardi.bsky.social · 9d

We also compared RMU to coefficient alpha and H (Study 1) and split-half reliability (Study 2). RMU generally had lower bias and error, as well as better coverage. However, the main benefit of RMU is that it can be applied across a wide range of Bayesian measurement models - unlike alpha.


Giacomo Bignardi @bignardi.bsky.social · 9d

We ran 3 simulation studies to evaluate RMU's bias, accuracy and coverage (% of times the 95% credible intervals included the right answer) across linear factor, signal detection theory and reinforcement learning models. Results were good-to-great across studies and simulation conditions.
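For readers unfamiliar with these metrics, here is a minimal Python sketch of how coverage and bias could be computed over simulation replications. The function names and the input layout (a list of `(lower, upper)` interval tuples and a list of point estimates) are hypothetical and are not taken from the paper's simulation code.

```python
def coverage(intervals, true_value):
    """Share of intervals (lower, upper) that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


def bias(estimates, true_value):
    """Mean estimate minus the true value."""
    return sum(estimates) / len(estimates) - true_value
```

For well-calibrated 95% intervals, `coverage` should land close to 0.95 across many simulated datasets.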


Giacomo Bignardi @bignardi.bsky.social · 9d

Our method involves taking two random draws (without replacement) from each subject's posterior and computing the correlation between the draws across subjects. We repeat this process many times and use the distribution of correlations to calculate credible intervals or posterior probabilities (e.g., reliability > .70).
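The procedure in this post can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' implementation: the names `rmu_correlations` and `summarize`, the list-of-lists input (one list of posterior draws per subject), and drawing independently for each subject (rather than, say, reusing the same MCMC iterations across subjects) are all choices made here that the post does not specify.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rmu_correlations(posterior_draws, n_reps=1000, seed=0):
    """Repeatedly draw two values per subject and correlate them."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_reps):
        # Two draws without replacement from each subject's posterior
        # (assumption: sampled independently per subject).
        pairs = [rng.sample(draws, 2) for draws in posterior_draws]
        first = [pair[0] for pair in pairs]
        second = [pair[1] for pair in pairs]
        correlations.append(pearson(first, second))
    return correlations


def summarize(correlations, threshold=0.70):
    """Median, a simple 95% interval, and P(reliability > threshold)."""
    ordered = sorted(correlations)
    n = len(ordered)
    return {
        "median": statistics.median(ordered),
        "lower": ordered[int(0.025 * (n - 1))],
        "upper": ordered[int(0.975 * (n - 1))],
        "prob_above": sum(c > threshold for c in ordered) / n,
    }
```

Wider posteriors make the two draws for a subject disagree more, which pulls the correlations (and so the reliability estimate) down.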


Giacomo Bignardi @bignardi.bsky.social · 9d

The intuition behind our method is that the amount of overlap between posteriors is important: in the example above, if the posteriors became significantly broader, we would be much more uncertain in each estimate and struggle to distinguish between participants, reducing reliability.


Giacomo Bignardi @bignardi.bsky.social · 9d

Our reliability method can be applied to most fitted Bayesian measurement models. In the example below, I've used a Bayesian signal detection model to estimate posterior distributions for each subject's performance on a go/no-go test. Each curve shows our certainty in each subject's estimate.


Giacomo Bignardi @bignardi.bsky.social · 9d

While estimating reliability with questionnaires is straightforward, few methods exist for estimating reliability when individual differences are derived from computational models. Thus, reliability is often poor for widely used computational measures when evaluated with retest data (e.g., pubmed.ncbi.nlm.nih.gov/36940888/ )

Individual differences in computational psychiatry: A review of current challenges - PubMed

Bringing precision to the understanding and treatment of mental disorders requires instruments for studying clinically relevant individual differences. One promising approach is the development of com...

pubmed.ncbi.nlm.nih.gov




Giacomo Bignardi @bignardi.bsky.social · 16d

Inspecting the regression coefficients, we found that going to bed hungry was the strongest and most consistent predictor out of all the food insecurity questions.❗We should note that the survey took place during school term, and food insecurity is likely even worse during the school holidays.
