...a big chunk of that paper was about fine-tuning our hgT5 gLMs (it was actually the whole motivation for GUANinE -- tl;dr we saw strong gains in functional & conservation tasks)
using all the params in an LM is hard. In genomics I would expect usage to converge on extracting features for augmentation (e.g. an LM feature in CADD), just like in protein LMs
www.nature.com/articles/s41...
however, our follow-up preprint correlates it with model "quality" as Basenji2 < Enformer < Borzoi
(but a huge part of the funding & dev pipeline is for biopharma and variant interpretation, not basic science)
the original use case for ELMo and other NLP LMs was pretraining ultra-high-parameter models in the absence of large-scale supervised data. genomics only has this absence for novel organisms in GenBank, not humans
www.ncbi.nlm.nih.gov/genbank/stat...
(e.g. Borzoi's 32 bp-resolution RNA-seq coverage vs Xpresso's historical approach of one-gene-is-one-example)
although I've seen pretty strong evidence to suggest they work well on certain tasks like conservation or cCRE recognition, e.g. ~ proceedings.mlr.press/v240/robson2...
(obviously depends on the model, the task... and how predictions are made :) )
Borzoi and Enformer capture deeper features than the ones we test out, even surprisingly cryptic chromosomal features from sequence alone
be it through single-payer systems (e.g. medicare for all) or publicly developed and distributed medicines