Giacomo Bignardi
@bignardi.bsky.social
500 followers 1.1K following 88 posts
Research Associate at the Social, Genetic & Developmental Psychiatry Centre, King's College London. Interests: developmental psychology, data science, coffee. www.bignardi.co.uk
bignardi.bsky.social
Hmm, interesting, I've had this on my reading list but have not read it yet, the intro is so clear and great! I disagree that reliability implies equal precision across scores, but it seems like an interesting way of going beyond reliability...
bignardi.bsky.social
Good point, would be interesting to try (and also force myself to learn generalizability theory in more detail)!
bignardi.bsky.social
Helping a PhD student with SDT models was actually the original inspiration for this side project! I have example code applying the method to an SDT model below (actually borrowing code from @matti.vuorre.com's helpful blog on this) ->
Tutorial: Calculating RMU reliability for a go/no-go task
www.bignardi.co.uk
bignardi.bsky.social
Ah, I misunderstood, no introduction rewrite needed yet then 😅. Good to know it gives similar results to the other approaches!
bignardi.bsky.social
Exciting! Just to check - can you get confidence intervals from PSI & EAP too?
Reposted by Giacomo Bignardi
pgmj.bsky.social
This is really neat. I have borrowed the reliability() function for my `easyRasch` package, and use plausible values instead of fully Bayesian estimation to produce similar estimates/CIs, see code example below. RMU point estimates are similar to EAP reliability.

pgmj.github.io/easyRasch/re...
bignardi.bsky.social
New preprint with @rogierk.bsky.social @paulbuerkner.com - we introduce "relative measurement uncertainty" (RMU) - a reliability estimation method that's applicable across a broad class of Bayesian measurement models (e.g., generative, computational and item response theory models): osf.io/h54k8
bignardi.bsky.social
I've added a new example to our paper's repo, demonstrating how our reliability method replicates Cronbach's alpha for a simple model, but also how our method can account for: (i) binary data, (ii) varying numbers of items per participant, & (iii) improvement over trials www.bignardi.co.uk/8_bayes_reli...
bignardi.bsky.social
Thanks Edwin! I only learned from the best 😜
bignardi.bsky.social
Great! Sorry just saw this before my replies :)
bignardi.bsky.social
I think both methods should converge to the same answer as the number of draws -> ∞. I think your approach is 1-V_w/V whereas empirical reliability is defined as v_a/(v_a+v_w) - but it should be the same as draws -> ∞. Pic from onlinelibrary.wiley.com/doi/10.1002/...
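A quick way to check how the two expressions relate, as a minimal sketch. It assumes V means the total variance of all posterior draws pooled across subjects, v_a the variance of subjects' posterior means, and v_w the average within-subject posterior variance; that reading of the symbols is an assumption, and the simulated "posterior" is a toy, not output from a fitted model. With population variances (ddof=0) and the same number of draws per subject, the law of total variance gives V = v_a + v_w, so the two forms agree exactly; with sample variances (ddof=1) they can differ slightly.

```python
import numpy as np

def variance_components(posterior):
    """posterior: array of shape (n_draws, n_subjects), one column per subject."""
    v_a = posterior.mean(axis=0).var()   # variance of subjects' posterior means
    v_w = posterior.var(axis=0).mean()   # average within-subject posterior variance
    v_total = posterior.var()            # variance of all draws pooled together
    return v_a, v_w, v_total

# Toy data (assumption): 30 subjects, 500 draws each, centred on made-up scores.
rng = np.random.default_rng(0)
scores = rng.normal(0.0, 1.0, size=30)
posterior = scores + rng.normal(0.0, 0.6, size=(500, 30))

v_a, v_w, v_total = variance_components(posterior)
ratio_form = v_a / (v_a + v_w)           # v_a / (v_a + v_w)
complement_form = 1.0 - v_w / v_total    # 1 - V_w / V

print(ratio_form, complement_form)
```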
bignardi.bsky.social
Hi Ruben - I had the same thought - and it wasn't easy finding the reference in an old IRT manual! In your post, do you divide the variance of subjects' posterior means by the total variance in MCMC draws across subjects? The best description of ER is in here by Phil Chalmers tinyurl.com/4f3vv5e8
Difference between empirical and marginal reliability of an IRT model
I am using the mirt library in R to fit an instrument (binary responses) comprising two dimensions. In the mirt documentation are mentioned two types of reliability. I would like to ask what is the
tinyurl.com
bignardi.bsky.social
Any (ideally helpful) comments/feedback/critique welcome! Email at: [email protected]
bignardi.bsky.social
Alongside the preprint osf.io/h54k8 I've uploaded an example of how to calculate RMU with go/no-go task data, comparing RMU with split-half and test-retest reliability estimates www.bignardi.co.uk/8_bayes_reli... - more software support to come if there's demand.
bignardi.bsky.social
In the paper, we also demonstrate that our method yields a similar point estimate to the ratio of the variance of subjects' posterior means (PMV) to PMV + average posterior variance. However, our approach also provides credible intervals, which are essential when sample sizes are small.
bignardi.bsky.social
We also compared RMU to coefficient alpha and H (Study 1) and split-half reliability (Study 2). RMU generally had lower bias and error, as well as better coverage. However, the main benefit of RMU is that it can be applied across a wide range of Bayesian measurement models - unlike alpha.
bignardi.bsky.social
We ran 3 simulation studies to evaluate RMU's bias, accuracy and coverage (% of times the 95% credible intervals included the right answer) across linear factor, signal detection theory and reinforcement learning models. Results were good-to-great across studies and simulation conditions.
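To make the evaluation metrics concrete, here is a minimal sketch of how bias and coverage could be computed over repeated simulations. The function name `evaluate_intervals` is hypothetical, RMSE is used as one common way to quantify error (an assumption, not necessarily the paper's metric), and the numbers are made up purely for illustration.

```python
import numpy as np

def evaluate_intervals(true_value, estimates, lower, upper):
    """Summarise simulated reliability estimates against a known true value."""
    estimates = np.asarray(estimates, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    bias = float(np.mean(estimates - true_value))                  # average signed error
    rmse = float(np.sqrt(np.mean((estimates - true_value) ** 2)))  # typical error size
    # Coverage: share of intervals that contain the true value.
    coverage = float(np.mean((lower <= true_value) & (true_value <= upper)))
    return {"bias": bias, "rmse": rmse, "coverage": coverage}

# Made-up numbers for four simulated datasets (illustration only).
result = evaluate_intervals(
    true_value=0.80,
    estimates=[0.78, 0.83, 0.80, 0.70],
    lower=[0.70, 0.76, 0.72, 0.60],
    upper=[0.86, 0.90, 0.88, 0.79],
)
print(result)
```

For well-calibrated 95% intervals, coverage across many simulated datasets should be close to 0.95.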
bignardi.bsky.social
Our method takes two random draws (without replacement) from each subject's posterior and computes the correlation between the two sets of draws across subjects. We repeat this process many times and use the distribution of correlations to calculate credible intervals or posterior probabilities (e.g., reliability > .70).
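The procedure described above can be sketched roughly as follows. This is an illustrative reading of the post, not the paper's reference implementation: the function name `rmu_draws`, the choice to pick draw indices independently per subject, and the toy "posterior" are all assumptions.

```python
import numpy as np

def rmu_draws(posterior, n_pairs=1000, seed=0):
    """posterior: array of shape (n_draws, n_subjects), with n_draws >= 2.

    Returns n_pairs correlations, each computed between two vectors that hold
    one posterior draw per subject.
    """
    rng = np.random.default_rng(seed)
    n_draws, n_subjects = posterior.shape
    cols = np.arange(n_subjects)
    cors = np.empty(n_pairs)
    for k in range(n_pairs):
        # Pick two distinct draw indices per subject (sampling without replacement).
        i = rng.integers(0, n_draws, size=n_subjects)
        offset = rng.integers(1, n_draws, size=n_subjects)
        j = (i + offset) % n_draws
        cors[k] = np.corrcoef(posterior[i, cols], posterior[j, cols])[0, 1]
    return cors

# Toy data (assumption): 50 subjects, 2000 draws each, centred on made-up scores.
rng = np.random.default_rng(1)
scores = rng.normal(0.0, 1.0, size=50)
posterior = scores + rng.normal(0.0, 0.5, size=(2000, 50))

cors = rmu_draws(posterior)
point = np.median(cors)                        # point estimate
lo, hi = np.quantile(cors, [0.025, 0.975])     # 95% credible interval
p_above = np.mean(cors > 0.70)                 # P(reliability > .70)
print(point, (lo, hi), p_above)
```

Making the toy posteriors wider (a larger second standard deviation) lowers the correlations, which matches the intuition that more overlap between subjects' posteriors means lower reliability.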
bignardi.bsky.social
The intuition behind our method is that the amount of overlap between posteriors is important: in the example above, if the posteriors became significantly broader, we would be much more uncertain in each estimate and struggle to distinguish between participants, reducing reliability.
bignardi.bsky.social
Our reliability method can be applied to most fitted Bayesian measurement models. In the example below, I've used a Bayesian signal detection model to estimate posterior distributions for each subject's performance on a go/no-go test. Each curve shows our certainty in each subject's estimate.
bignardi.bsky.social
While estimating reliability with questionnaires is straightforward, there are few methods for estimating it when individual differences are derived from computational models. Thus, reliability is often poor for widely used computational measures when evaluated with retest data (e.g., pubmed.ncbi.nlm.nih.gov/36940888/ )
Individual differences in computational psychiatry: A review of current challenges - PubMed
Bringing precision to the understanding and treatment of mental disorders requires instruments for studying clinically relevant individual differences. One promising approach is the development of com...
pubmed.ncbi.nlm.nih.gov
bignardi.bsky.social
Inspecting the regression coefficients, we found that going to bed hungry was the strongest and most consistent predictor out of all the food insecurity questions.❗We should note that the survey took place during school term, and food insecurity is likely even worse during the school holidays.