Gautam
@gautammalik.bsky.social
Research Assistant at University of Cambridge | Exploring deep learning in biology with big dreams of using AI to make drug discovery a little less complicated!🧬🖥️
[9/9]
Appreciate any advice, pointers to relevant papers, or even “don’t do this” cautionary tales.
Thanks in advance!
#transformers #sparsity #maskedmodeling #deeplearning #symbolicAI #mlresearch #attentionmodels #structureddata
June 14, 2025 at 5:22 AM
[8/9]
C) Local patching: training on smaller, denser subregions of the matrix (rough sketch after this list)
D) Contrastive or denoising autoencoder approaches instead of MLM
E) Treating the task as a kind of link prediction or structured matrix completion
F) Something entirely different?
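For option C, here's roughly what I have in mind. Purely illustrative, not something I've run; it assumes the matrix is at least patch×patch and that "." has id 0:

```python
import torch

NO_REL_ID = 0  # id of the neutral "." token

def sample_dense_patch(matrix_ids, patch=16, min_rel_frac=0.1, tries=20):
    """Rejection-sample a patch whose relation density clears a threshold,
    so training windows aren't almost entirely '.'."""
    H, W = matrix_ids.shape
    window = matrix_ids[:patch, :patch]
    for _ in range(tries):
        r = torch.randint(0, H - patch + 1, (1,)).item()
        c = torch.randint(0, W - patch + 1, (1,)).item()
        window = matrix_ids[r:r + patch, c:c + patch]
        if (window != NO_REL_ID).float().mean() >= min_rel_frac:
            break
    return window  # falls back to the last sample if no patch qualifies
```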
June 14, 2025 at 5:22 AM
[7/9]
B) Loss weighting:
Also tried tweaking the loss weights to prioritize correct prediction of rare relation types.
But it didn’t seem to help either, possibly because the model just ends up ignoring the background (.) altogether.
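For reference, the loss weighting I tried looks roughly like this (sketch; the weights and vocab are illustrative, in practice I'd derive them from inverse token frequencies):

```python
import torch
import torch.nn.functional as F

# Down-weight the overwhelming "." class, keep rare relation classes at full
# weight. The [MASK] token never appears as a label, so its weight is 0.
class_weights = torch.tensor([0.05, 1.0, 1.0, 1.0, 0.0])  # ".", A, B, C, [MASK]

def weighted_mlm_loss(logits, labels):
    """Cross-entropy over masked positions with per-class weights."""
    return F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1),
                           weight=class_weights, ignore_index=-100)
```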
June 14, 2025 at 5:22 AM
[6/9]
A) Biased masking:
I've tried biased masking (favoring the rare relation tokens), but it’s not helping much.
Can biased masking still work when the background class is that overwhelming? Or does it just drown out the signal anyway?
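My biased masking is essentially this (sketch only; the probabilities are illustrative and "." is assumed to have id 0):

```python
import torch

NO_REL_ID = 0  # id of the neutral "." token

def biased_mask(tokens, bg_prob=0.05, rare_prob=0.5):
    """Mask rare relation tokens far more often than the background '.'."""
    probs = torch.full(tokens.shape, rare_prob)
    probs[tokens == NO_REL_ID] = bg_prob
    return torch.bernoulli(probs).bool()  # True = position gets masked
```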
June 14, 2025 at 5:22 AM
[5/9]
I'm wondering whether standard masked modeling is the right fit for this format, or whether it needs adjustment. Some options I'm exploring or considering:
June 14, 2025 at 5:22 AM
[4/9]
The challenge:
The matrix is highly sparse: most entries are the neutral "no relation" token. When a meaningful relation is masked, it is often surrounded by these neutral tokens. I'm concerned that the model may struggle to learn meaningful context, since most of what it sees is neutral.
June 14, 2025 at 5:22 AM
[3/9]
I’m using a masked modeling objective, where random entries are masked, and the model learns to recover the original token based on the rest of the matrix. The goal is for the model to learn latent structure in how these symbolic relationships are distributed.
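Concretely, the corruption step is standard BERT-style masking, roughly like this (sketch; MASK_ID matches the illustrative vocab from post 1):

```python
import torch
import torch.nn.functional as F

MASK_ID = 4  # id of the [MASK] token

def mask_tokens(tokens, mask_prob=0.15):
    """BERT-style corruption: mask random entries, keep originals as labels."""
    labels = tokens.clone()
    masked = torch.rand(tokens.shape) < mask_prob
    labels[~masked] = -100           # positions ignored by the loss
    corrupted = tokens.clone()
    corrupted[masked] = MASK_ID
    return corrupted, labels

# Training step: logits = model(corrupted), shape (seq_len, vocab_size), then
# loss = F.cross_entropy(logits, labels, ignore_index=-100)
```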
June 14, 2025 at 5:22 AM
[2/9]
The relation types are drawn from a small, discrete vocabulary, and the "no relation" case is marked with a neutral symbol (e.g., .). Importantly, this "no relation" doesn’t necessarily mean irrelevance. I’m not sure whether it should be treated as informative context or just noise.
June 14, 2025 at 5:22 AM
[1/9]
I’m working on a transformer-based model over a 2D symbolic matrix where each row and column represents elements from two discrete sets. Each cell contains a token representing a relationship type between the corresponding pair, or a default token when no known relation exists.
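For concreteness, the encoding looks roughly like this (illustrative sketch, token names made up):

```python
import torch

# Illustrative vocab: a few relation types, the neutral "." token,
# and a [MASK] token for the masked-modeling objective.
VOCAB = {".": 0, "A": 1, "B": 2, "C": 3, "[MASK]": 4}

def encode_matrix(matrix):
    """Flatten the 2D symbol matrix row-major into a 1D tensor of token ids."""
    ids = [VOCAB[sym] for row in matrix for sym in row]
    return torch.tensor(ids, dtype=torch.long)

matrix = [[".", "A", "."],
          [".", ".", "B"],
          ["C", ".", "."]]
tokens = encode_matrix(matrix)  # shape (9,); row/col indices would be fed in as positions
```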
June 14, 2025 at 5:22 AM
I like this perspective—it should be about evolution through iterations, rather than expecting the best-evolved algorithm right away. Everyone seems inspired by AlphaFold’s story, looking for a similar breakthrough in their domain, but maybe the focus should be on steady progress.
December 9, 2024 at 10:46 AM
It’s definitely a challenging but fascinating area and I’d love to talk more about this!
December 9, 2024 at 5:30 AM
But it isn’t as trivial as it sounds, right? Is it just about using some vector embeddings from domain knowledge and adding them to the model? I’m wondering, is it the lack of collaboration between scientists and AI/ML experts that’s hindering this kind of development?
December 9, 2024 at 5:14 AM
This debate might be intense, but it’s moments like this that make me curious about where we’re all headed. Science evolves through friction, right?
December 9, 2024 at 4:57 AM
What excites me, though, is the idea I keep hearing: can we combine the best of both worlds? Is that even possible? Are we talking about something like machine-learned potentials in MD simulations, or is it deeper than that? Please help me gain some more perspective!
December 9, 2024 at 4:57 AM
As a young researcher, I can’t help but notice how scientists using physics-based methods can sometimes show a bias. It’s clear they have their roots, but there's no denying that AI/ML methods come with their own set of caveats, many of which are tough to even recognize.
December 9, 2024 at 4:57 AM
Reposted by Gautam
It’s not real-world ready but a good foundation to explore. And yes, science does need a protein emoji!
github.com/gautammalik-...
GitHub - gautammalik-git/BindAxTransformer: BindAxTransformer is a transformer-based model trained on protein-ligand interactions using self-supervised learning. This repository provides a detailed im...
BindAxTransformer is a transformer-based model trained on protein-ligand interactions using self-supervised learning. This repository provides a detailed implementation and educational resource, sh...
github.com
November 22, 2024 at 7:46 PM
Reposted by Gautam
To wrap up, I’m curious about your thoughts on the future of docking models. Will the next breakthrough be GNN-based, transformer-based, or something generative (e.g., diffusion models)? I’d love to hear what direction you think the field is heading!
November 22, 2024 at 7:46 PM