Nezar Abdennur
@nvictus.bsky.social
100 followers 150 following 24 posts
computational biologist / biological computer / asst prof @UMassChan / phd @MIT / http://abdenlab.org
Pinned
nvictus.bsky.social
I'm proud to announce the latest release of 🧬 #Oxbow 🏹, with new features to make NGS data analysis more powerful, efficient, and "composable".

Learn more at: oxbow.readthedocs.io
Reposted by Nezar Abdennur
jmschreiber91.bsky.social
In the genomics community, we have focused pretty heavily on achieving state-of-the-art predictive performance.

While undoubtedly important, how we *use* these models after training is potentially even more important.

tangermeme v1.0.0 is out now. Hope you find it useful!
nvictus.bsky.social
My talk on #Composability in genomic software at #SciPy2025 is up on YouTube where I showcase both #anywidget and #oxbow.

Thank you to the organizers for the opportunity to present this to both computational biologists and the wider scientific computing community!

www.youtube.com/watch?v=G22_...
Nezar Abdennur - Accelerating Genomic Data Science and AI/ML with Composability | SciPy 2025
YouTube video by SciPy
nvictus.bsky.social
Our #anywidget tutorial from last year's #SciPy conf was uploaded to YouTube! Check it out for a hands-on walkthrough of creating your own web-based widgets.
nvictus.bsky.social
We anticipate that joint dimensionality reduction and projection will become a foundational norm for comparative and integrative analysis of long-range interaction profiles in Hi-C/3C+ data. For example, existing methods for working with classic A/B vectors can be extended to joint higher-order embeddings.
nvictus.bsky.social
We used jointly-hic to create an atlas of 89 human Hi-C samples, uncovering distinct patterns of nuclear architecture associated with heterochromatin composition and demonstrating how higher-order principal components capture missing information about gene expression and regulatory element activity.
nvictus.bsky.social
jointly-hic accomplishes this using mini-batch incremental PCA, allowing for joint decomposition of arbitrarily many contact matrices at any resolution with constant memory.
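To make the constant-memory idea concrete, here is a minimal NumPy-only sketch of a mini-batch incremental PCA update. It is not jointly-hic's code: the class name `MiniBatchPCA`, the toy data, and the batch size are all hypothetical, and whether jointly-hic uses this exact update is an assumption. Only the current batch plus a small `k x n_features` summary is ever held in memory.

```python
import numpy as np


class MiniBatchPCA:
    """Illustrative incremental PCA, updated one mini-batch at a time."""

    def __init__(self, n_components):
        self.k = n_components
        self.n_seen = 0
        self.mean = None
        self.components = None       # shape (k, n_features)
        self.singular_values = None  # shape (k,)

    def partial_fit(self, batch):
        batch = np.asarray(batch, dtype=float)
        n_batch = batch.shape[0]
        batch_mean = batch.mean(axis=0)
        n_total = self.n_seen + n_batch
        if self.n_seen == 0:
            stacked = batch - batch_mean
            new_mean = batch_mean
        else:
            new_mean = (self.n_seen * self.mean + n_batch * batch_mean) / n_total
            # Extra row accounting for the shift between the old and batch means.
            correction = np.sqrt(self.n_seen * n_batch / n_total) * (
                self.mean - batch_mean
            )
            # Previous summary (scaled components) + centered batch + correction.
            stacked = np.vstack(
                [
                    self.singular_values[:, None] * self.components,
                    batch - batch_mean,
                    correction,
                ]
            )
        _, s, vt = np.linalg.svd(stacked, full_matrices=False)
        self.components = vt[: self.k]
        self.singular_values = s[: self.k]
        self.mean = new_mean
        self.n_seen = n_total
        return self

    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.mean) @ self.components.T


# Toy data (hypothetical): 300 rows driven by 2 latent factors, plus an offset.
rng = np.random.default_rng(0)
latent = rng.standard_normal((300, 2)) * np.array([5.0, 2.0])
data = latent @ rng.standard_normal((2, 8)) + 3.0

pca = MiniBatchPCA(n_components=2)
for start in range(0, data.shape[0], 50):
    pca.partial_fit(data[start : start + 50])  # one mini-batch at a time

# Reference: full SVD of the centered data, for comparison only.
s_full = np.linalg.svd(data - data.mean(axis=0), compute_uv=False)
print(pca.singular_values, s_full[:2])
```

On exactly low-rank toy data like this, the incremental result matches the full decomposition; in general, truncating to `k` components at each step makes it an approximation.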
nvictus.bsky.social
Joint decomposition allows for robust and directly comparable low-dimensional representations of arbitrarily many contact maps, providing insights into genome organization across diverse biological contexts, from different tissues to developmental stages.
nvictus.bsky.social
The classic A/B compartment track comes from matrix factorization of a contact matrix into eigenvectors or PCs. Done separately, each map is projected onto a different coordinate system. Comparing such vectors directly is problematic, especially if seeking info from **higher-order** components.
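A toy illustration of the coordinate-system problem, using NumPy only. The matrices, the `signal` pattern, and the helper names are hypothetical stand-ins for real contact maps, and the joint step here is a plain stacked SVD rather than any particular tool's method. Eigenvectors computed separately are only defined up to sign (and, for higher-order components, up to rotation within near-degenerate subspaces), whereas projecting both maps onto one shared axis gives scores in the same coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 60
# Hypothetical +1/-1 "compartment" pattern shared by both toy samples.
signal = np.where(np.sin(np.linspace(0.1, 6 * np.pi, n_bins)) >= 0, 1.0, -1.0)


def toy_map(noise=0.1):
    """Symmetric toy matrix: shared rank-1 pattern plus sample-specific noise."""
    m = np.outer(signal, signal) + noise * rng.standard_normal((n_bins, n_bins))
    return (m + m.T) / 2


def leading_eigenvector(m):
    eigenvalues, eigenvectors = np.linalg.eigh(m)
    return eigenvectors[:, np.argmax(eigenvalues)]


map_a, map_b = toy_map(), toy_map()

# Separate decompositions: each vector's sign is arbitrary, so the
# correlation between samples can come out near +1 or near -1.
vec_a = leading_eigenvector(map_a)
vec_b = leading_eigenvector(map_b)
separate_corr = np.corrcoef(vec_a, vec_b)[0, 1]

# Joint decomposition: stack the rows of both maps, fit one basis,
# then project each map onto the same shared axis.
stacked = np.vstack([map_a, map_b])
col_mean = stacked.mean(axis=0)
_, _, vt = np.linalg.svd(stacked - col_mean, full_matrices=False)
shared_axis = vt[0]
score_a = (map_a - col_mean) @ shared_axis
score_b = (map_b - col_mean) @ shared_axis
joint_corr = np.corrcoef(score_a, score_b)[0, 1]

print(f"separate: {separate_corr:+.2f}  joint: {joint_corr:+.2f}")
```

The separate correlation is only meaningful in absolute value, while the joint scores agree in sign by construction because both samples share one axis.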
Reposted by Nezar Abdennur
jmschreiber91.bsky.social
A huge challenge I face when doing ML + genomics analysis is *friction*: the stupid error messages (wrong device!) and dumb implementation issues that snap you out of the zone. I wrote a vignette on how tangermeme has helped me reduce this friction:

tangermeme.readthedocs.io/en/latest/ho...
How To: Reduce Friction and Save Time with Tangermeme — tangermeme v0.1.0 documentation
Reposted by Nezar Abdennur
jmschreiber91.bsky.social
(4) bpnet-lite: Load official Chrom/BPNet models into PyTorch for downstream tangermeme integration. Improved command-line tools + docs. Still concerns about perf of models trained from scratch -- will be resolved next version!

github.com/jmschrei/bpn...

bsky.app/profile/jmsc...
nvictus.bsky.social
We’re excited and eager for feedback, so please give oxbow a try!

`pip install oxbow`
nvictus.bsky.social
It also supports:

* Column projection and pushdown (parsing only the fields you need)
* Complex and nested field types (e.g. alignment tags, variant genotype call data, etc.)
* Genomic range-based queries via an index
* User-defined transports and file systems
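As a rough illustration of what column projection means, here is a pure-Python sketch that converts only the requested fields of a BED-like, tab-delimited record. This is not oxbow's API: the `SCHEMA` mapping, the `scan` function, and the sample text are all hypothetical, and a real implementation would push the projection down into the parser itself.

```python
import io

# Hypothetical schema for a BED-like record (illustration only, not oxbow's API).
SCHEMA = {"chrom": str, "start": int, "end": int, "name": str, "score": float}


def scan(handle, columns):
    """Yield one dict per record, converting only the requested columns."""
    names = list(SCHEMA)
    wanted = [(names.index(col), col, SCHEMA[col]) for col in columns]
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        # Fields that were not requested are never type-converted.
        yield {name: cast(fields[i]) for i, name, cast in wanted}


text = (
    "chr1\t100\t200\tpeakA\t7.5\n"
    "chr1\t300\t450\tpeakB\t3.0\n"
    "chr2\t50\t90\tpeakC\t9.1\n"
)
records = list(scan(io.StringIO(text), columns=["chrom", "end"]))
print(records)
```

Because `scan` is a generator, records are produced one at a time, which loosely mirrors the streaming style described in the thread.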
nvictus.bsky.social
This update (v0.4.x) provides complete #ApacheArrow data models for 11 file formats and counting, including the GA4GH/htslib formats and UCSC’s BigWig/BigBed.
nvictus.bsky.social
We revamped the #rustlang backend and implemented a new "DataSource" API in #Python, which allows for streaming conventional #genomic files – in-memory, on-disk, or in the cloud – into the modern data tools you use regularly, including #Pandas, #Polars, #DuckDB, and #Dask.