Trinity Chung
@trinityjchung.com
18 followers 21 following 9 posts
MS @ Carnegie Mellon University Robotics Institute BS @ UC Berkeley Computer Science
Reposted by Trinity Chung
anayebi.bsky.social
🚀 New Open-Source Release! PyTorchTNN 🚀
A PyTorch library for biologically-inspired temporal neural nets: unrolling computation through time. Integrates with our recent Encoder-Attender-Decoder, which flexibly combines models (Transformer, SSM, RNN) since no single one fits all sequence tasks.
🧵👇
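A minimal sketch of the core idea behind a temporal neural network library like this, i.e. unrolling a convolutional recurrent layer through time. This is illustrative only and not PyTorchTNN's actual API; the module names and tensor shapes are assumptions.

```python
import torch
import torch.nn as nn

class ConvRNNCell(nn.Module):
    """Toy convolutional recurrent cell: the hidden state is a feature map
    updated once per time step (illustrative, not PyTorchTNN's API)."""
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.input_conv = nn.Conv2d(in_ch, hid_ch, 3, padding=1)
        self.hidden_conv = nn.Conv2d(hid_ch, hid_ch, 3, padding=1)

    def forward(self, x_t, h_prev):
        return torch.relu(self.input_conv(x_t) + self.hidden_conv(h_prev))

def unroll(cell, x_seq, hid_ch):
    """Unroll the cell over a (batch, time, channels, H, W) input."""
    b, t, _, h, w = x_seq.shape
    h_t = x_seq.new_zeros(b, hid_ch, h, w)
    outputs = []
    for step in range(t):
        h_t = cell(x_seq[:, step], h_t)
        outputs.append(h_t)
    return torch.stack(outputs, dim=1)  # (batch, time, hid_ch, H, W)

# Example: 2 samples, 5 time steps, 6 force/torque channels on an 8x8 grid.
x = torch.randn(2, 5, 6, 8, 8)
cell = ConvRNNCell(in_ch=6, hid_ch=16)
print(unroll(cell, x, hid_ch=16).shape)  # torch.Size([2, 5, 16, 8, 8])
```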
8/ Our conclusions:
a) ConvRNNs outperform feedforward/SSMs on realistic tactile recognition
b) ConvRNNs best match neural responses in the mouse brain
c) Contrastive SSL matches supervised neural alignment, suggesting a general-purpose representation in the somatosensory cortex
7/ This work is just the beginning of understanding how touch works in both animals and machines. To continue to improve embodied AI, we'll need richer neural datasets and smarter ways to fuse touch with other senses like vision and proprioception.
6/ We found that ConvRNNs (esp. IntersectionRNNs of @jascha.sohldickstein.com @sussillodavid.bsky.social ) beat ResNets, Transformers, and SSMs when it comes to aligning with real mouse somatosensory cortex data. Plus, they pass the NeuroAI Turing Test! (bsky.app/profile/anay...)
Comparison of neural fit (noise-corrected RSA Pearson’s r) across models. The mean animal-to-animal score is 0.18 and the maximum across all pairs of animals is 1.16. The leftmost “a2a” bar represents the mean animal-to-animal neural consistency score. For each model, the lighter-colored left bar shows its randomly initialized version.
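For readers unfamiliar with the metric, here is a generic sketch of how a noise-corrected RSA score of this kind can be computed: compare the model's and an animal's representational dissimilarity matrices, then normalize by how well one animal predicts another. This is a simplified illustration under assumed conventions, not necessarily the exact procedure used in the paper.

```python
import numpy as np
from scipy.stats import pearsonr
from scipy.spatial.distance import pdist

def rdm(responses):
    """Representational dissimilarity: pairwise correlation distance
    between stimulus response vectors (stimuli x units)."""
    return pdist(responses, metric="correlation")

def rsa_score(model_feats, neural_resps):
    """Pearson's r between the model RDM and the neural RDM."""
    return pearsonr(rdm(model_feats), rdm(neural_resps))[0]

def noise_corrected_rsa(model_feats, animal_a, animal_b):
    """Divide the model-to-animal score by the animal-to-animal
    consistency, so a value near 1 means the model fits animal A
    about as well as another animal does (generic sketch)."""
    raw = rsa_score(model_feats, animal_a)
    ceiling = rsa_score(animal_b, animal_a)
    return raw / ceiling

# Toy example: 50 stimuli, 100 model units, 30 neurons per animal.
rng = np.random.default_rng(0)
model = rng.normal(size=(50, 100))
a, b = rng.normal(size=(50, 30)), rng.normal(size=(50, 30))
print(noise_corrected_rsa(model, a, b))
```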
5/ Crucially, we found that when doing tactile self-supervised learning, we couldn’t apply all of the usual image augmentations out-of-the-box (in fact, the models couldn’t train at all with them!), so we designed transformations explicitly for tactile force & torque inputs:
Types of data augmentations applied to SSL models. Given a temporal tactile input over time T, our tactile augmentations flip the features vertically, horizontally, and temporally, and rotate them, while traditional image augmentations introduce Gaussian noise, color jitter, and grayscale. The tactile augmentations improved both the neural fit and task performance; the models could not be trained with image augmentations.
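A rough sketch of what such tactile augmentations could look like on a force/torque clip, with flips along the spatial and temporal axes and a 90-degree spatial rotation. The tensor layout and probabilities here are assumptions for illustration, not the paper's implementation.

```python
import torch

def tactile_augment(x, p=0.5):
    """Randomly flip a tactile clip along spatial and temporal axes and
    apply a random 90-degree spatial rotation.
    x: (time, channels, height, width) force/torque features.
    Illustrative guess at such transforms, not the paper's code."""
    if torch.rand(()) < p:
        x = torch.flip(x, dims=[2])      # vertical flip
    if torch.rand(()) < p:
        x = torch.flip(x, dims=[3])      # horizontal flip
    if torch.rand(()) < p:
        x = torch.flip(x, dims=[0])      # temporal flip (reverse the sweep)
    k = int(torch.randint(0, 4, ()))
    x = torch.rot90(x, k, dims=[2, 3])   # spatial rotation by k*90 degrees
    return x

# Example: 5 time steps, 6 force/torque channels on an 8x8 whisker grid.
clip = torch.randn(5, 6, 8, 8)
print(tactile_augment(clip).shape)  # torch.Size([5, 6, 8, 8])
```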
4/ We trained our models on data from simulated mouse whiskers (from @MitraHartmann's group) brushing various objects. Whiskers realistically output detailed force-torque signals, unlike current robot sensors, which remain difficult to scale and have reduced sensitivity in sim.
(I) We use average mouse whisker array measurements to recreate a whisker model in sim. (II) Objects are whisked in simulation using WHISKiT (from Hartmann's lab), resulting in (III) force and torque data from sweeping 9,981 ShapeNet objects across 117 categories with various sweep augmentations. The augmentations vary the (1) speed, (2) height, (3) rotation, and (4) distance of the objects relative to the whisker array. We constructed two datasets: a large, low-fidelity set with more sweep augmentations, and a small, high-fidelity set with fewer augmentations.
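To make the data format concrete, here is a minimal sketch of a container for such whisker sweeps: each sample is a force/torque time series plus an object-category label. The field names and shapes are assumptions for illustration, not the released dataset's actual layout.

```python
import torch
from torch.utils.data import Dataset

class WhiskerSweepDataset(Dataset):
    """Illustrative container for simulated whisker sweeps
    (format and names are assumptions, not the actual dataset)."""
    def __init__(self, sweeps, labels):
        # sweeps: list of (time, num_whiskers, 6) tensors (3 force + 3 torque)
        self.sweeps = sweeps
        self.labels = labels

    def __len__(self):
        return len(self.sweeps)

    def __getitem__(self, idx):
        return self.sweeps[idx], self.labels[idx]

# Toy example: 4 sweeps over a 30-whisker array, 100 time steps each.
sweeps = [torch.randn(100, 30, 6) for _ in range(4)]
labels = torch.randint(0, 117, (4,))  # 117 ShapeNet categories in the paper
ds = WhiskerSweepDataset(sweeps, labels)
x, y = ds[0]
print(x.shape, y.item())
```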
3/ We systematically explored temporal architectures (unifying ConvRNNs, including those from @Chengxu & @dyamins.bsky.social et al.'s 2017 prior tactile work, as well as SSMs and Transformers) via our Encoder-Attender-Decoder (EAD) framework, built with a custom "PyTorchTNN" library we developed.

The Encoder-Attender-Decoder (EAD) architecture, with task objectives of supervised categorization or self-supervised learning (SimCLR, SimSiam, autoencoding). The ConvRNN encoder includes self-recurrence at each layer, where we vary the RNN cell type.
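A minimal skeleton of the encoder → attender → decoder composition, to make the framework concrete. The layer choices, sizes, and pooling here are placeholders (e.g., a linear encoder and GRU attender), not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class EAD(nn.Module):
    """Minimal Encoder-Attender-Decoder skeleton (placeholder modules,
    not the paper's exact configuration)."""
    def __init__(self, feat_dim=64, num_classes=117):
        super().__init__()
        # Encoder: per-time-step feature extractor; the paper's encoder is
        # a ConvRNN with self-recurrence at each layer, simplified here.
        self.encoder = nn.Linear(6, feat_dim)
        # Attender: any sequence model over encoder features
        # (Transformer, SSM, or RNN); a GRU stands in here.
        self.attender = nn.GRU(feat_dim, feat_dim, batch_first=True)
        # Decoder: maps the pooled sequence to the task output
        # (category logits here; an SSL projection head in the SSL setting).
        self.decoder = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        # x: (batch, time, 6) force/torque input per time step
        feats = torch.relu(self.encoder(x))
        seq, _ = self.attender(feats)
        return self.decoder(seq.mean(dim=1))  # pool over time

model = EAD()
logits = model(torch.randn(2, 100, 6))
print(logits.shape)  # torch.Size([2, 117])
```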
2/ Tactile perception is still considerably under-explored in both Neuroscience and Embodied AI. Our goals are to (1) provide a model of tactile processing that we can quantitatively compare against the brain and (2) inspire better models for tactile processing in robots.
1/ What if we make robots that process touch the way our brains do?
We found that Convolutional Recurrent Neural Networks (ConvRNNs) pass the NeuroAI Turing Test in currently available mouse somatosensory cortex data.
New paper by @Yuchen @Nathan @anayebi.bsky.social and me!
Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain