Josh McDermott
@joshhmcdermott.bsky.social
Working to understand how humans and machines hear. Prof at MIT; director of Lab for Computational Audition. https://mcdermottlab.mit.edu/
82. A model of continuous speech recognition reveals the role of context in human speech perception - Gasser Elbanna

96. Machine learning models of hearing demonstrate the limits of attentional selection of speech heard through cochlear implants - Annesya Banerjee
November 14, 2025 at 3:27 PM
58. Optimized models of uncertainty explain human confidence in auditory perception - Lakshmi Govindarajan

64. Source-location binding errors in auditory scene perception - Sagarika Alavilli
Lakshmi happens to be on the job market right now, so I will end by recommending that everyone try to hire him. He is a wonderful colleague, and I expect he will do many other important things. (end)
November 9, 2025 at 9:34 PM
Added bonus: expressing stimulus-dependent uncertainty enables models that perform regression (i.e., yielding continuous-valued outputs), which previously has not worked very well. (15/n)
Lots more in the paper. The approach is applicable to any perceptual estimation problem. We hope it will enable the study of confidence to be extended to more realistic conditions, via models that can operate on images or sounds. (14/n)
He measured human confidence for pitch, and found that confidence was higher for conditions with lower discrimination thresholds. The model reproduced this general trend. (13/n)
Lakshmi used the same framework to build models of pitch perception that represent uncertainty. The models generate a distribution over fundamental frequency. (12/n)
By contrast, simulating bets using the softmax distribution of a standard classification-based neural network does not yield human-like confidence, presumably because the distribution is not incentivized to represent uncertainty correctly. (11/n)
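The baseline here amounts to reading "confidence" off a classifier's softmax, e.g. the probability of the top class. A minimal sketch (my illustration, not the paper's code) of why that number is untethered from true uncertainty: uniformly scaling the logits inflates the top-class probability without changing the decision at all, and cross-entropy training on one-hot targets keeps pushing the logits in exactly this direction.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([4.0, 1.0, 0.5])
conf = softmax(logits).max()           # "confidence" = top-class probability
sharper = softmax(2.0 * logits).max()  # same argmax, inflated confidence

# The decision is unchanged, but the reported confidence grows.
assert np.argmax(logits) == np.argmax(2.0 * logits)
assert sharper > conf
```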
The model can also be used to select natural sounds whose localization is certain or uncertain. When these sounds are presented to listeners, they place higher bets on the sounds with low model uncertainty, and vice versa. (10/n)
The model replicates patterns of localization accuracy (like previous models) but also replicates the dependence of confidence on conditions. Here confidence is lower for sounds with narrower spectra, and at peripheral locations: (9/n)
To simulate betting behavior from the model, he mapped a measure of the model posterior spread to a bet (in cents). (8/n)
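For azimuth, one way such a mapping could look (the spread measure and the specific decreasing function here are placeholders of mine, not the paper's): summarize the posterior by a circular spread, then convert it to a bet with a monotonically decreasing map, so tight posteriors earn large bets.

```python
import numpy as np

def spread_to_bet(azimuths_rad, max_bet_cents=100.0):
    """Map posterior spread over azimuth to a bet in cents (hypothetical map).

    Spread is the circular standard deviation sqrt(-2 ln R), where R is the
    mean resultant length of posterior samples; a tighter posterior
    (R near 1) yields a larger bet.
    """
    R = np.abs(np.mean(np.exp(1j * np.asarray(azimuths_rad))))
    spread = np.sqrt(max(-2.0 * np.log(R), 0.0))
    return max_bet_cents * np.exp(-spread)

rng = np.random.default_rng(0)
tight = rng.normal(0.0, 0.05, 1000)  # confident posterior samples (radians)
broad = rng.normal(0.0, 1.0, 1000)   # uncertain posterior samples
assert spread_to_bet(tight) > spread_to_bet(broad)
```

Any monotone map would do for the qualitative prediction; the comparison to human bets only requires that model uncertainty orders the bets correctly.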
Lakshmi then tested whether the model’s uncertainty was predictive of human confidence judgments. He ran experiments in which people localized sounds and then placed bets on their localization judgment: (7/n)
The model was trained on spatial renderings of lots of natural sounds in lots of different rooms. Once trained, it produces narrow posteriors for some sounds, and broad posteriors for others: (6/n)
He first applied this idea to sound localization. The model takes binaural audio as input and estimates parameters of a mixture distribution over a sphere. Distributions can be narrow, broad, or multi-modal, depending on the stimulus. (5/n)
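One concrete way to parameterize a mixture over the sphere (my choice for illustration; the paper's exact parameterization may differ) is a mixture of von Mises-Fisher components, each with a mean direction and a concentration controlling how narrow it is. A bimodal fit naturally captures, e.g., front/back-confusable stimuli.

```python
import numpy as np

def vmf_pdf(x, mu, kappa):
    """von Mises-Fisher density on the unit sphere in R^3.

    mu is the mean direction (unit vector); kappa >= 0 is the
    concentration: large kappa gives a narrow peak around mu.
    """
    return kappa / (4 * np.pi * np.sinh(kappa)) * np.exp(kappa * np.dot(mu, x))

def mixture_pdf(x, weights, mus, kappas):
    """Mixture density: narrow, broad, or multi-modal depending on parameters."""
    return sum(w * vmf_pdf(x, m, k) for w, m, k in zip(weights, mus, kappas))

front = np.array([1.0, 0.0, 0.0])
left = np.array([0.0, 1.0, 0.0])
# A bimodal posterior with equal mass in front of and behind the listener:
w, mus, ks = [0.5, 0.5], [front, -front], [5.0, 5.0]
assert mixture_pdf(front, w, mus, ks) > mixture_pdf(left, w, mus, ks)
```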
Lakshmi realized that models could be trained to output parameters of distributions, and that by optimizing models with a log-likelihood loss function, the model is incentivized to correctly represent uncertainty. (4/n)
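The core move can be sketched in a few lines (a minimal illustration, not the paper's code; the Gaussian case is the simplest instance, while the actual models use richer distribution families): the network's head outputs distribution parameters, and the training loss is the negative log-likelihood of the target under those parameters, so claiming too little or too much uncertainty is penalized.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of target y under N(mu, exp(log_var)).

    A model trained with this loss is rewarded for variances that match
    its actual errors: overstating or understating uncertainty raises
    the loss, which is what incentivizes calibrated confidence.
    """
    var = np.exp(log_var)
    return 0.5 * (np.log(2 * np.pi) + log_var + (y - mu) ** 2 / var)

# Same prediction error, different claimed uncertainty:
y, mu = 1.0, 0.0
overconfident = gaussian_nll(y, mu, log_var=np.log(0.01))  # tiny claimed variance
calibrated = gaussian_nll(y, mu, log_var=np.log(1.0))      # variance matches error
assert overconfident > calibrated
```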
Standard neural networks are not suitable for several reasons; among them, cross-entropy loss is known to induce over-confidence. (3/n)
Uncertainty is inevitable in perception, and it seems like it would be useful for people to represent it explicitly. But it has been hard to study under realistic conditions, as we haven't had stimulus-computable models. (2/n)