Auditory-Visual Speech Association (AVISA)
@avsp.bsky.social
The official(ish) account of the Auditory-Visual Speech Association (AVISA). AV 👄 👓 speech references, but mostly what interests me. avisa.loria.fr
An embodied multi-articulatory multimodal language framework: A commentary on Karadöller et al.
journals.sagepub.com/doi/10.1177/...
"we believe it shows that our understanding of the role of gesture in language is incomplete and lacks crucial insight when co-sign gesture is not accounted for"
An embodied multi-articulatory multimodal language framework: A commentary on Karadöller, Sümer and Özyürek - Rachel Miles, Shai Lynne Nielson, Deniz İlkbaşaran, Rachel I Mayberry, 2025
While many researchers working in spoken languages have used modality to distinguish language and gesture, this is not possible for sign language researchers. W...
journals.sagepub.com
January 26, 2026 at 9:22 PM
The involvement of endogenous brain rhythms in speech processing www.sciencedirect.com/science/arti... Reviews oscillation-based theories (dynamic attending, active sensing, asymmetric sampling in time, segmentation theories) & evidence > Naturalistic paradigms and resting-state data key to progress
The involvement of endogenous brain rhythms in speech processing
Endogenous brain rhythms are at the core of oscillation-based neurobiological theories of speech. These brain rhythms have been proposed to play a cru…
www.sciencedirect.com
January 23, 2026 at 9:03 PM
Children Sustain Their Attention on Spatial Scenes When Planning to Describe Spatial Relations Multimodally in Speech & Gesture onlinelibrary.wiley.com/doi/10.1111/... "How do children allocate visual attention to scenes as they prepare to describe them multimodally in speech and co-speech gesture?"
onlinelibrary.wiley.com
January 20, 2026 at 10:55 PM
Effects of Visual Input in Virtual Reality on Voice Production: Comparing Trained Singers & Untrained Speakers www.jvoice.org/article/S089... Study examined if visual spatial cues in immersive virtual reality (room size, speaker-to-listener distance) are associated with changes in vocal production 🗣️
The Effects of Visual Input in Virtual Reality on Voice Production: Comparing Trained Singers and Untrained Speakers
This study examined whether visual spatial cues presented in immersive virtual reality (IVR)—room size and speaker-to-listener distance—are associated with changes in vocal production, and whether the...
www.jvoice.org
January 19, 2026 at 12:28 PM
Distinct Temporal Dynamics of Speech & Gesture Processing: Insights From ERP Across L1 and L2 psycnet.apa.org/fulltext/202... "results point to potentially distinct neural and temporal dynamics in processing speech versus gestures" -> speech processing earlier as gestures recruit later stages (?)
APA PsycNet
psycnet.apa.org
January 17, 2026 at 9:28 AM
An automated pipeline for efficiently generating standardized, child-friendly AV language stimuli www.sciencedirect.com/science/arti... Creating engaging AV stims difficult & time-consuming; automated tools simplify & accelerate stim generation; here, an automated pipeline for AV stims from text 😀
An automated pipeline for efficiently generating standardized, child-friendly audiovisual language stimuli
Creating engaging language stimuli suitable for children can be difficult and time-consuming. To simplify and accelerate the process, we developed an …
www.sciencedirect.com
January 17, 2026 at 5:21 AM
Learning realistic lip motions for humanoid face robots www.science.org/doi/10.1126/... "anthropomorphic robots often fail to achieve lip-audio synchronization, resulting in clumsy and lifeless lip behaviors" Here: soft silicone lips actuated by a 10-degree-of-freedom mechanism. Not uncanny at all 😉
Learning realistic lip motions for humanoid face robots
We propose a self-supervised facial action transformer that enables multilingual lip synchronization in humanoid robots.
www.science.org
January 16, 2026 at 8:46 PM
Revisiting the Benefits of Hand Gestures in L2 Pronunciation: journals.sagepub.com/doi/10.1177/... Between-subjects study (pretest/post-test): 39 Japanese learners of Mandarin practiced Mandarin aspirated stops, /u/ & T3 sandhi for 4 sessions with hand gestures (phonetic/articulatory) vs no gestures.
journals.sagepub.com
January 14, 2026 at 10:13 PM
The temporal effects of auditory and visual immersion on speech level in virtual environments pubs.aip.org/asa/jasa/art... A room's appearance can suggest its acoustics, but do people adjust their speech based on this visual information? Study looked at whether AV information affected speech level...
The temporal effects of auditory and visual immersion on speech level in virtual environments
Speech takes place in physical environments with visual and acoustic properties, yet how these elements and their interaction influence speech production is not
pubs.aip.org
January 13, 2026 at 9:50 PM
Age-related increases in speech rhythm in typically developing children pubs.aip.org/asa/jasa/art... Used envelope-based measures of speech rhythm in typically developing children to look at what changed over preschool & school-aged years - syllabic-level rhythms...⬆️
Age-related increases in speech rhythm in typically developing children
The purpose of the current study was to examine speech rhythm in typically developing children throughout the preschool and school-aged years. A better understa
pubs.aip.org
January 13, 2026 at 9:48 PM
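Aside for the curious: "envelope-based measures of speech rhythm" above covers a family of methods; below is a minimal generic Python sketch of one such measure, not the paper's pipeline (the Hilbert envelope, 10 Hz low-pass, and 2-8 Hz syllabic band are illustrative assumptions, not the authors' parameters).

```python
# Generic illustration of an "envelope-based" speech rhythm measure:
# extract the amplitude envelope of a recording, then quantify how much
# of its modulation energy sits in the syllabic-rate band (~2-8 Hz).
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(x, fs, lp_cutoff=10.0):
    """Amplitude envelope, low-pass filtered to keep slow modulations."""
    env = np.abs(hilbert(x))                      # analytic-signal magnitude
    b, a = butter(4, lp_cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def syllabic_modulation_energy(x, fs, band=(2.0, 8.0)):
    """Fraction of envelope modulation energy in the syllabic-rate band."""
    env = amplitude_envelope(x, fs)
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), 1 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return spec[in_band].sum() / spec[1:].sum()   # skip the DC bin
```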
End-to-end audio-visual learning for cochlear implant sound coding simulations in noisy environments pubs.aip.org/asa/jel/arti... Study tested an AV speech enhancement (AVSE) module integrated with the ElectrodeNet-CS (ECS) model to form an end-to-end CI system; AVSE-ECS has potential for sound coding
End-to-end audio-visual learning for cochlear implant sound coding simulations in noisy environments
The cochlear implant (CI) is a successful biomedical device that enables individuals with severe-to-profound hearing loss to perceive sound through electrical s
pubs.aip.org
January 10, 2026 at 12:32 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Finally out: www.eneuro.org/content/earl...

fMRI during naturalistic story listening in noise, looking at event-segmentation and ISC signatures. Listeners stay engaged and comprehend the gist even in moderate noise.

with @ayshamota.bsky.social @ryanaperry.bsky.social @ingridjohnsrude.bsky.social
Neural signatures of engagement and event segmentation during story listening in background noise
Speech in everyday life is often masked by background noise, making comprehension effortful. Characterizing brain activity patterns when individuals listen to masked speech can help clarify the mechan...
www.eneuro.org
January 9, 2026 at 7:43 PM
The role of visual cues on word recognition in noise in monolingual and bilingual toddlers journals.sagepub.com/doi/abs/10.1... 36 monolingual & 36 bilingual kids (24-35 months), online looking task ("look at the X"). Bilinguals worse than monolinguals in noise with no visual speech cues.
journals.sagepub.com
January 8, 2026 at 9:07 PM
Making faces www.science.org/doi/10.1126/... Reports on Ianni et al. (2026), who found that cortical control of facial muscle production during social signalling in macaques is similar to that of voluntary action, suggesting that they share coordinated systems of intentional control
Making faces
Facial expressions are produced through a coordinated system of voluntary and emotional pathways
www.science.org
January 8, 2026 at 8:43 PM
No Difference in Face Scanning Patterns Between Monolingual & Bilingual Infants at 5 Months of Age onlinelibrary.wiley.com/doi/10.1111/... Do bilinguals use visual speech cues more than monolinguals? Study checked gaze of 474 monolingual & 101 bilingual 5-month-old infants to mouths of moving faces
No Difference in Face Scanning Patterns Between Monolingual and Bilingual Infants at 5 Months of Age
It has been suggested that bilinguals take greater advantage of visual speech cues than monolinguals. Therefore, in a sample of 474 (47.3% females) monolingual and 101 (48.5% females) bilingual infa...
onlinelibrary.wiley.com
December 30, 2025 at 10:22 PM
Parallel encoding of speech in human frontal and temporal lobes www.nature.com/articles/s41... "Together, these results support the existence of robust long-range parallel inputs from low-level auditory areas to apical areas in the frontal lobe of the human speech network."
Parallel encoding of speech in human frontal and temporal lobes - Nature Communications
Whether high-order frontal lobe areas receive raw speech input in parallel with early speech areas in the temporal lobe is unclear. Here, the authors show that frontal lobe areas get fast low-level sp...
www.nature.com
December 30, 2025 at 7:17 AM
Vocalisations are coupled with movement of all limbs throughout infancy www.nature.com/articles/s41... "infants at 4, 6, 9 and 12 months of age. Limb movements were tightly coupled with vocalisation onsets at all time points across infancy"
Vocalisations are coupled with movement of all limbs throughout infancy - Scientific Reports
Scientific Reports - Vocalisations are coupled with movement of all limbs throughout infancy
www.nature.com
December 30, 2025 at 7:17 AM
Audiovisual Speech Perception in Aging Cochlear Implant Users and Age-Matched Nonimplanted Adults journals.lww.com/ear-hearing/... Looked at age-related differences in cross-sensory & multisensory benefits in AV speech identification in aging CI users & age-matched non-CI users
journals.lww.com
December 28, 2025 at 12:14 AM
Musical Training & Perceptual History Shape Alpha Dynamics in AV Speech Integration www.mdpi.com/2076-3425/15... McGurk stims & EEG "musicians appear to employ enhanced top-down control involving frontal regions, whereas nonmusicians rely more on sensory-driven processing" osf.io/b3c5j/overview
www.mdpi.com
December 24, 2025 at 11:31 PM
"artificial general cleverness" 😆https://mathstodon.xyz/@tao/115722360006034040
Terence Tao (@[email protected])
933 Posts, 113 Following, 20.7K Followers · Professor of #Mathematics at the University of California, Los Angeles #UCLA (he/him).
mathstodon.xyz
December 21, 2025 at 11:17 AM
Delta-band cortical speech tracking predicts audiovisual speech-in-noise benefit from natural and simplified visual cues www.sciencedirect.com/science/arti... Degraded face vids -> some gains in AV speech comprehension; animations of speech envelope not integrated with auditory speech; Δ band = gain
Delta-band cortical speech tracking predicts audiovisual speech-in-noise benefit from natural and simplified visual cues
Humans comprehend speech in noisy environments more effectively when they can see the talker’s facial movements. While the benefits of audiovisual (AV…
www.sciencedirect.com
December 19, 2025 at 12:31 PM
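Aside: a toy Python sketch of what "delta-band cortical speech tracking" can mean in practice. Published work (presumably including this paper) typically uses temporal response functions or coherence rather than a plain lagged correlation, and the 0.5-4 Hz delta definition and 300 ms lag window here are assumptions, not the authors' choices.

```python
# Toy proxy for "delta-band cortical speech tracking": band-pass an EEG
# channel into an assumed delta range, then take the peak correlation
# with the speech envelope over plausible neural lags.
import numpy as np
from scipy.signal import butter, filtfilt

def delta_band(x, fs, lo=0.5, hi=4.0):
    """Band-pass filter into an assumed delta range (0.5-4 Hz)."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    return filtfilt(b, a, x)

def tracking_score(eeg, envelope, fs, max_lag_s=0.3):
    """Peak EEG-envelope correlation with the EEG lagging the stimulus."""
    eeg_d = delta_band(eeg, fs)
    env_d = delta_band(envelope, fs)
    rs = []
    for lag in range(int(max_lag_s * fs) + 1):
        a = eeg_d[lag:]                           # brain response at t + lag
        b_ = env_d[: len(env_d) - lag] if lag else env_d
        n = min(len(a), len(b_))
        rs.append(np.corrcoef(a[:n], b_[:n])[0, 1])
    return max(rs)
```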
The Scope and Limits of Iconic Prosody: Head Angle Predicts f0 Changes While Object Size Effects Are Absent direct.mit.edu/opmi/article... Looked at the relationship between head angle & f0 in iconic prosody, & effects of object size on lip opening + formant frequencies - but is f0/head angle 'direct' coupling?
The Scope and Limits of Iconic Prosody: Head Angle Predicts f0 Changes While Object Size Effects Are Absent
Abstract. The relation between the fundamental frequency of the voice (f0) and vertical space has been shown in previous studies; however, the underlying mechanisms are less clear. This study investig...
direct.mit.edu
December 19, 2025 at 4:12 AM
The Role of Speech Reading During Visual Word Processing in Hearing Children pubs.asha.org/doi/10.1044/... Is the STS process active in speech reading related to word reading? STS active in visual word-rhyming but only weakly linked to reading - seems STS phonological processing is a better candidate
The Role of Speech Reading During Visual Word Processing in Hearing Children: A Functional Magnetic Resonance Imaging Study
Purpose: Speech reading, or the ability to identify speech components from visual cues of the face, contributes to the development of phonologica...
pubs.asha.org
December 18, 2025 at 1:13 AM