Georg Bökman
@bokmangeorg.bsky.social
920 followers 410 following 220 posts
Geometric deep learning + Computer vision
bokmangeorg.bsky.social
Using Fourier theory of finite groups, we can block-diagonalize these group-circulant matrices. Hence, incorporating symmetries (group equivariance) in neural networks can make the networks faster. We used this to obtain 𝑞𝑢𝑖𝑐𝑘𝑒𝑟 𝑉𝑖𝑇𝑠. arxiv.org/abs/2505.15441
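A minimal NumPy sketch of the block-diagonalization claim, using the cyclic group C_n as the simplest case (there the blocks are 1×1 and the group Fourier transform is the ordinary DFT); this is my illustration of the idea, not code from the paper.

```python
import numpy as np

# Illustrative case: for the cyclic group C_n, a group-circulant matrix
# C[i, j] = c[(i - j) % n] is fully diagonalized by the ordinary DFT
# (1x1 "blocks"), so applying it costs O(n log n) via the FFT, not O(n^2).
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

F = np.fft.fft(np.eye(n))              # DFT matrix (symmetric)
D = F @ C @ np.linalg.inv(F)           # conjugate by the Fourier transform...
assert np.allclose(D, np.diag(np.fft.fft(c)))   # ...and C becomes diagonal

# Fast multiply: pointwise in the Fourier domain.
x = rng.standard_normal(n)
y_fast = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real
assert np.allclose(C @ x, y_fast)
```
For non-abelian groups like the dihedral group below, the Fourier transform gives small blocks (one per irreducible representation) rather than a full diagonal, but the speed argument is the same.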
bokmangeorg.bsky.social
Mapping such 8-tuples to new 8-tuples that permute in the same way under transformations of the input is done by convolutions over the transformation group, or (equivalently) multiplication with group-circulant matrices.
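A toy sketch of this equivalence for the dihedral group of order 8, written as pairs (r, f) with a composition law I'm assuming here (my construction, not from the thread): a group-circulant matrix C[g, h] = c(g·h⁻¹) commutes with translating the input over the group, which is exactly "new 8-tuples that permute in the same way".

```python
import numpy as np
from itertools import product

# Toy model of the dihedral group of order 8 as pairs (r, f): r = rotation,
# f = flip, with the (assumed) semidirect-product composition law.
G = list(product(range(4), range(2)))          # the 8 group elements
idx = {g: i for i, g in enumerate(G)}

def mul(a, b):
    return ((a[0] + (-1) ** a[1] * b[0]) % 4, (a[1] + b[1]) % 2)

def inv(a):
    return (a[0], 1) if a[1] else ((-a[0]) % 4, 0)

# Group-circulant matrix: C[g, h] = c(g * h^{-1}) for a filter c on G.
rng = np.random.default_rng(0)
c = rng.standard_normal(len(G))
C = np.array([[c[idx[mul(g, inv(h))]] for h in G] for g in G])

# Translation by a permutes the 8-tuple: (P_a x)[g] = x[g * a].
# Equivariance = C commutes with every such permutation matrix.
for a in G:
    P = np.array([[1.0 if h == mul(g, a) else 0.0 for h in G] for g in G])
    assert np.allclose(C @ P, P @ C)
```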
bokmangeorg.bsky.social
Images (or image patches) are secretly multi-channel signals over groups. Below, the dihedral group of order 8: reflecting/rotating the image permutes the values in the magenta vector. So we can reshape the image into 8-tuples that all permute according to the dihedral group (pixels on the diagonals are an edge case, with smaller orbits).
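A quick numerical check of the claim (my construction, assuming a square image and a generic off-diagonal pixel): the pixel's orbit under the dihedral group has 8 elements, and rotating or reflecting the image only permutes those 8 values among themselves.

```python
import numpy as np

# Generic off-diagonal pixel of an n x n image: its dihedral orbit has
# 8 elements, and transforming the image permutes the 8 values.
rng = np.random.default_rng(0)
n = 6
img = rng.standard_normal((n, n))

def orbit_values(img, i, j):
    """Sorted values at the 8 dihedral images of pixel (i, j)."""
    n = img.shape[0]
    coords = {(i, j), (j, n - 1 - i), (n - 1 - i, n - 1 - j), (n - 1 - j, i),
              (i, n - 1 - j), (j, i), (n - 1 - i, j), (n - 1 - j, n - 1 - i)}
    return sorted(img[r, c] for r, c in coords)

i, j = 0, 2                     # generic pixel: not on either diagonal
orig = orbit_values(img, i, j)
assert len(orig) == 8                                # full orbit
assert orig == orbit_values(np.rot90(img), i, j)     # rotation permutes it
assert orig == orbit_values(np.fliplr(img), i, j)    # so does reflection
```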
bokmangeorg.bsky.social
Had a skim of Kostelec-Rockmore. There are some interesting pointers at the end suggesting that fast implementations of asymptotically fast FFTs are non-trivial. 🙃 Also, there seems to be a version that uses three 1D FFTs, but it is not asymptotically as fast as possible.
bokmangeorg.bsky.social
At least some FFTs for SO(3) work by separation of variables and a sequence of 1D FFTs, right? So is the butterfly decomposition "straightforward" for them? Regarding small finite groups, the entire FFT might be unnecessary and can simply be a dense Fourier transform matrix.
bokmangeorg.bsky.social
Do you have good examples from other areas of taking the hardware as the prior?
bokmangeorg.bsky.social
Also quite generous to cite the paper as a generic reference for the term "FLOPs" 😅
bokmangeorg.bsky.social
Nice LLM generated citation found by @davnords.bsky.social. I wonder who M. Lindberg and A. Andersson are...
bokmangeorg.bsky.social
Got to honor the traditions. "In Sweden, the west coast city of Gothenburg is known for its puns."
Reposted by Georg Bökman
aaroth.bsky.social
The opportunities and risks of the entry of LLMs into mathematical research in one screenshot. I think it is clear that LLMs will make trained researchers more effective. But they will also lead to a flood of bad/wrong papers, and I'm not sure we have the tools to deal with this.
bokmangeorg.bsky.social
Nice perspective, you look like a giant! And congrats!
bokmangeorg.bsky.social
If you were working at meta you could have called the paper "Mental rotation capabilities emerge at scale with DINOv3" :)
bokmangeorg.bsky.social
I see, yeah plots of proportions over the layers would be cool!
bokmangeorg.bsky.social
Also, I think it is possible to argue for equivariance at scale from a purely computational perspective. bsky.app/profile/bokm...
bokmangeorg.bsky.social
A simple argument for equivariance at scale: 1) At scale, token-wise linear layers dominate compute. 2) Token-wise linear equivariant layers implemented in the Fourier domain are block-diagonal and hence fast.
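A back-of-the-envelope version of point 2), with made-up numbers (d and k are hypothetical, not from the post): k equal Fourier blocks cut the per-token multiply count by a factor of k.

```python
import numpy as np

# Hypothetical sizes: a dense token-wise linear layer on d channels costs
# ~d^2 multiplies per token; block-diagonal with k equal blocks of size
# d/k costs k * (d/k)^2 = d^2 / k, a k-fold saving.
d, k = 64, 8
assert k * (d // k) ** 2 == d * d // k

# The block-diagonal layer applied as a batched matmul (one small matrix
# per Fourier block).
rng = np.random.default_rng(0)
blocks = rng.standard_normal((k, d // k, d // k))
x = rng.standard_normal(d)
y = np.einsum("kij,kj->ki", blocks, x.reshape(k, d // k)).reshape(d)

# Sanity check against the explicit block-diagonal matrix.
W = np.zeros((d, d))
for b in range(k):
    s = b * (d // k)
    W[s:s + d // k, s:s + d // k] = blocks[b]
assert np.allclose(W @ x, y)
```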
bokmangeorg.bsky.social
I like the point made in this paragraph. It might follow that it's a good idea to build equivariant architectures that are as similar to proven non-equivariant architectures as possible.
bokmangeorg.bsky.social
I think the difficult part here is to tell whether the object has been rotated or both rotated and mirrored. I.e. the model needs to be sensitive to mirroring. Mirroring is often part of data aug, but the model can (should i.m.o.) still be internally sensitive to mirroring.
bokmangeorg.bsky.social
Interesting how basically one single layer in one single model out of all these can solve the "Shepard-Metzler Free" case.
bokmangeorg.bsky.social
Congrats, very interesting work! When training with only 8 filters, how did you choose how many of each to use in each layer? Did you just use equally many of each?
bokmangeorg.bsky.social
Thanks! Also I'm currently doing a postdoc in Amsterdam and would love to visit Delft ;)
bokmangeorg.bsky.social
Perhaps true, depending on the problem. But in general the framing could be "can I get away with equivariance for my problem?" rather than "do I need equivariance for my problem?".
bokmangeorg.bsky.social
However, these potential issues are "skill issues" in my opinion...
bokmangeorg.bsky.social
We've been exploring this for image data recently. Exhibit A arxiv.org/abs/2502.05169, exhibit B arxiv.org/abs/2505.15441 .