Cui Ding (@cuiding.bsky.social)
Actually, we are also working on a solution, but mainly for Chinese.
December 1, 2025 at 11:03 PM
Take-home Message
🔹 We formalize input quality in reading as mutual information.
🔹 We link it to measurable human behavior.
🔹 We show multimodal LLMs can model this effect quantitatively.
Bottom-up information matters — and now we can measure how much it matters.
November 2, 2025 at 11:06 AM
Key Result 2: Information from Models
Using fine-tuned Qwen2.5-VL and TransOCR, we estimated the MI between images and word identity.
MI systematically drops: Full > Upper > Lower — perfectly mirroring human reading patterns! 🤯
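For intuition, here is a minimal runnable Python sketch of how MI could be estimated from a model's predictive probabilities (a standard variational lower bound). The function name, the toy numbers, and the frequency prior are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def mi_lower_bound(log_p_model, log_p_prior):
    """Variational lower bound on I(image; word) = H(W) - H(W | image).

    log_p_model : log p(true word | image) from the fine-tuned model,
                  one entry per (image, word) pair in the evaluation set.
    log_p_prior : log p(true word) under a text-only prior (e.g. corpus
                  frequencies), aligned with log_p_model.

    Both argument names are illustrative, not the paper's variables.
    """
    log_p_model = np.asarray(log_p_model)
    log_p_prior = np.asarray(log_p_prior)
    # Average pointwise information gain: E[log p(w|x) - log p(w)]
    return float(np.mean(log_p_model - log_p_prior))

# Toy usage: upper halves preserve more information than lower halves.
prior = np.log([0.01, 0.02, 0.01])
full  = mi_lower_bound(np.log([0.9, 0.8, 0.95]), prior)
upper = mi_lower_bound(np.log([0.6, 0.5, 0.7]), prior)
lower = mi_lower_bound(np.log([0.2, 0.1, 0.3]), prior)
print(full > upper > lower)  # True: MI drops as visual input degrades
```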
November 2, 2025 at 11:06 AM
Key Result 1: Human Reading
Reading times show a clear pattern:
Full visible < Upper visible < Lower visible, in both English & Chinese.
👉 Upper halves are more informative (and easier to read).
November 2, 2025 at 11:06 AM
We model reading time as proportional to the number of visual “samples” needed to reduce uncertainty below a threshold ϕ.
Higher mutual information → fewer samples → faster reading.
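In symbols (our notation, a sketch rather than the paper's exact model), reading time scales with the number of samples needed to push uncertainty below ϕ:

```latex
\mathrm{RT}(w) \;\propto\; n^*(w),
\qquad
n^*(w) \;=\; \min\{\, n : H(W \mid y_1,\dots,y_n) \le \phi \,\}
```

If each visual sample cuts entropy by roughly I(W; Y), then E[n*] ≈ (H(W) − ϕ) / I(W; Y), so higher MI means fewer samples and faster reading, as stated above.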
November 2, 2025 at 11:06 AM
📊 We quantify this using mutual information (MI) between visual input and word identity.
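For reference, the standard definition we have in mind (V for the visual input, W for word identity; our notation):

```latex
I(V; W) \;=\; H(W) - H(W \mid V)
\;=\; \sum_{v,\, w} p(v, w)\,\log \frac{p(v, w)}{p(v)\, p(w)}
```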
To test the theory, we created a reading experiment using the MoTR (Mouse-Tracking-for-Reading) paradigm 🖱️📖
We ran the study in both English and Chinese.
November 2, 2025 at 11:06 AM
We propose a formal model where reading is a Bayesian update integrating top-down expectations and bottom-up evidence.
When bottom-up input is noisy (e.g., words are partially occluded), comprehension becomes harder and slower.
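As a sketch in symbols (our notation: c for the preceding context, y for the visual evidence, w for the word):

```latex
p(w \mid c, y) \;\propto\;
\underbrace{p(w \mid c)}_{\text{top-down expectation}}
\;\times\;
\underbrace{p(y \mid w)}_{\text{bottom-up evidence}}
```

With occluded input, the likelihood p(y | w) is flatter across candidate words, so the posterior sharpens more slowly, which is the slower, harder comprehension described above.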
November 2, 2025 at 11:06 AM