David Martínez Millán
@dmartmillan.bsky.social
I am an engineer in a biological world.
Reposted by David Martínez Millán
raquelbmi.bsky.social
Very happy to share that our paper on the influence of biological sex and smoking in the clonal landscape of the normal human bladder is out today in Nature. Kudos to all the authors and especially to @ferriol.bsky.social for coming along with me during this project's journey!

🧵Summary following ⬇️
Reposted by David Martínez Millán
irbbarcelona.org
🧪 Out in @nature.com: Smoking and biological sex shape healthy bladder tissue evolution, offering clues to #cancer risk.

✍️ #IRBBarcelona & University of Washington

➡️ bit.ly/42wcIN5

📌 DOI: 10.1038/s41586-025-09521-x

#IRBScience #CancerResearch #BladderCancer @bbglab.bsky.social

🧵👇
dmartmillan.bsky.social
Thanks, Brendan! More than welcome to try it out!
Reposted by David Martínez Millán
dmartmillan.bsky.social
Thank you so much!!! ☺️
dmartmillan.bsky.social
I would like to thank everyone involved in the development of the tool: Federica Brando, Miguel L. Grau, @guixe-m.bsky.social, Carlos López-Elorduy, Iker Reyes-Salazar, Jordi Deu-Pons, @nlbigas.bsky.social and Abel González-Pérez.
dmartmillan.bsky.social
Overall, OpenVariant addresses a significant problem in the field by aggregating cohort-level data from multiple sources into a single harmonized result set. It replaces many of the tedious steps involved in curating data with a more robust and easier-to-document process.
dmartmillan.bsky.social
OpenVariant is open-source software released under the BSD 3-Clause license and freely available for public use. It is designed to be easily extendable, to encourage collaboration in its development, and is available on GitHub: github.com/bbglab/openv...
GitHub - bbglab/openvariant: Read, parse and operate different multiple input file formats with OpenVariant
dmartmillan.bsky.social
We integrated OpenVariant as the first step in the IntOGen pipeline (www.intogen.org), where it processes 257,898,749 somatic mutations across 33,218 tumor samples from 271 cohorts that were sequenced by different sources and stored in different data formats.
IntOGen pipeline on Nextflow.
dmartmillan.bsky.social
No existing tool matches OpenVariant's set of functionalities, which sets it apart from other tools in the field. We evaluated its execution time against similar Python-based tools using @brent-p.bsky.social's benchmark (github.com/brentp/vcf-b...), which ranked OpenVariant among the best performers.
GitHub - brentp/vcf-bench: evaluating vcf parsing libraries
dmartmillan.bsky.social
OpenVariant is built around an annotation structure, a core component that describes how input files are parsed and how the output is represented. In addition, a plugin system lets users apply their own data transformations.
Annotation structure of OpenVariant
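To make the idea of an annotation-driven parser concrete, here is a minimal, generic Python sketch. It does not use OpenVariant's actual API or annotation syntax (see the documentation for those); the names `ANNOTATION`, `TRANSFORMS`, `strip_chr_prefix` and `harmonize` are hypothetical and exist only for this illustration. The sketch assumes tab-separated input with a header row and shows how one mapping can harmonize a VCF-like and a MAF-like file into the same output fields, with plugin-style functions transforming individual values.

```python
import csv
import io

# Hypothetical annotation: each output field lists the source column
# names it may appear under in different input formats.
ANNOTATION = {
    "CHROMOSOME": ["#CHROM", "Chromosome"],
    "POSITION": ["POS", "Start_Position"],
    "REF": ["REF", "Reference_Allele"],
    "ALT": ["ALT", "Tumor_Seq_Allele2"],
}


def strip_chr_prefix(value):
    """Plugin-style transform: normalize 'chr1' to '1'."""
    return value[3:] if value.startswith("chr") else value


# Hypothetical plugin-style transforms, applied per output field.
TRANSFORMS = {
    "CHROMOSOME": strip_chr_prefix,
    "POSITION": int,
}


def harmonize(text, delimiter="\t"):
    """Yield one harmonized record per data row of a delimited file."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        record = {}
        for field, candidates in ANNOTATION.items():
            # Pick the first candidate column present in this file.
            source = next((c for c in candidates if c in row), None)
            value = row[source] if source is not None else None
            if value is not None and field in TRANSFORMS:
                value = TRANSFORMS[field](value)
            record[field] = value
        yield record


# Two toy inputs with different column conventions (made-up rows).
vcf_like = "#CHROM\tPOS\tREF\tALT\nchr1\t12345\tA\tT\n"
maf_like = (
    "Chromosome\tStart_Position\tReference_Allele\tTumor_Seq_Allele2\n"
    "17\t7579472\tG\tC\n"
)

records = list(harmonize(vcf_like)) + list(harmonize(maf_like))
for record in records:
    print(record)
```

Keeping the column mapping as data rather than code means a new input format only needs a new entry in the mapping, which is the general appeal of an annotation-based design.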
dmartmillan.bsky.social
We present OpenVariant, a Python package that offers a wide range of functionalities to operate on multiple variant file formats at once and to manage the annotation of metadata related to mutational datasets. You can consult the documentation at: openvariant.readthedocs.io
dmartmillan.bsky.social
Despite efforts to homogenize data produced by variant callers and available processing tools, differences in the variants persist across projects. This variability hinders the integration of somatic mutations from different sources, which is key for large cancer genomics analyses.