(Also, we have been trying to build such a framework for a while at github.com/nf-core/deep... ; the pipeline is currently changing a lot because we are in the process of porting our code to nf-core.)
An important hint is that certain evaluations can be applied to predictions regardless of the method used to generate them.
This is an excellent attempt (blog & paper) at bringing more statistical rigor to the evaluation of ML models (it is specifically focused on LLM evals).
I feel like we need similarly clear standards for many types of predictive models in biology. 1/
I agree that developing downstream tests is useful, and extra convenient (no need to retrain, you can treat the model as a black box, it gives interesting bio insights, etc.). For example (a minimal sketch of one such test follows the list):
- Are pathogenic variants less expected by *insert LLM method* than non-pathogenic variants?
- Is perplexity lower for highly conserved regions?
- Can this be used to find conserved regions in new genomes?
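To make the first question concrete, here is a minimal sketch (Python, assuming numpy, scipy, and scikit-learn) of such a black-box downstream test: take per-variant scores from any model (e.g. log-likelihood of the alternate allele) for a labelled set of pathogenic and non-pathogenic variants, then check whether pathogenic variants really score lower. The score arrays are placeholders for real model output; nothing here is tied to a specific model or to the nf-core pipeline.

```python
# Black-box downstream test: do pathogenic variants get lower model scores
# than non-pathogenic ones? Works with any model that emits one score per variant.
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Placeholder scores -- replace with real model outputs for labelled variants
pathogenic_scores = rng.normal(loc=-2.0, scale=1.0, size=500)
benign_scores = rng.normal(loc=-1.0, scale=1.0, size=500)

# One-sided Mann-Whitney U: are pathogenic variants systematically less expected?
stat, p_value = mannwhitneyu(pathogenic_scores, benign_scores, alternative="less")

# AUROC of "higher score => benign" gives an effect size on a 0.5-1.0 scale
labels = np.concatenate([np.zeros_like(pathogenic_scores), np.ones_like(benign_scores)])
scores = np.concatenate([pathogenic_scores, benign_scores])
auroc = roc_auc_score(labels, scores)
print(f"Mann-Whitney p = {p_value:.2e}, AUROC = {auroc:.3f}")

# Bootstrap a confidence interval on the AUROC instead of reporting a point estimate
boot = []
n = len(scores)
for _ in range(1000):
    idx = rng.integers(0, n, n)
    if labels[idx].min() == labels[idx].max():
        continue  # skip resamples that contain only one class
    boot.append(roc_auc_score(labels[idx], scores[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"AUROC 95% bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

The same recipe (two labelled groups, a rank test, AUROC with a bootstrap CI) also covers the perplexity-vs-conservation question; only the grouping changes.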