Engelhardt Research Group
@thebeehive.bsky.social
Engelhardt Research Group at Stanford University and Gladstone Institutes. Statistical genomics, live-cell imaging, wearable data, cancer immunology, reproductive health.
thebeehive.bsky.social
We are really proud of this work. Please try out NNMF on all of your gene count spatial transcriptomics data, whether you need hard clusters or scalable, interpretable, spatially aware dimension reduction! Feedback welcome!! github.com/ragnhildlaur...
GitHub - ragnhildlaursen/NNMF: Neighborhood Non-Negative Matrix Factorization
thebeehive.bsky.social
On these CRC data, we studied the factors based on their top ten genes. We found immune-dominated factors & factors capturing intra- and peri-tumoral stroma, among others. Importantly, some factors were shared across patients and some were patient specific, characterizing tumor-specific immune responses.
Genes characterizing the 30 NNMF factors in the CRC data.
thebeehive.bsky.social
Then, we applied NNMF to MERFISH data publicly released by Vizgen (vizgen.com/data-release...) that includes 500 genes in ∼1.9 million cells from two human colon cancer samples. NNMF showed enormous complexity, where each factor included many cell types and identified detailed biological structure.
Top row: cell types in two patient samples; Bottom row: NNMF signatures in the same two samples, showing substantial complexity.
thebeehive.bsky.social
On the same MERFISH mouse brain data, we aligned the eight parallel slices and ran NNMF on the 3D aligned data. NNMF easily labeled the important regions in 3D, and smoothed the factors across all three dimensions.
NNMF factor 1 and factor 7 across the 3D aligned brain slices.
thebeehive.bsky.social
Next, we ran NNMF on MERFISH data from a single mouse hypothalamus with eight parallel slices, treating each slice individually (2D). NNMF + K-means produces hard clusters that match the manual clustering well. But the real story is how much detail and biological complexity the soft clusterings add. Vasculature!
Manually annotated brain sample. Hard clustering for NNMF + K-means, BASS, and MENDER on two parallel samples. All ten NNMF factors in a single sample, colored by weights.
thebeehive.bsky.social
On human brain 10X Visium data and mouse brain MERFISH data, we compared MENDER and BASS to NNMF in terms of run time, and found that MENDER is fastest and NNMF is a close second. However, MENDER uses cell type labels for the hard clustering, not gene counts, and produces poor clusterings.
Run time comparison for human brain and mouse MERFISH data across BASS, MENDER, and NNMF. Hard clusters from a manual annotation, NNMF's top signature, NNMF + K-means, BASS, and MENDER on the human brain data.
thebeehive.bsky.social
We used the very cool hard clustering benchmark system pubmed.ncbi.nlm.nih.gov/38491270/ to compare NNMF to fourteen state-of-the-art spatially-aware hard clustering methods, showing good performance of NNMF even in the hard clustering scenario.
Benchmark comparison across five datasets and three metrics, for 15 different methods.
thebeehive.bsky.social
NNMF works by using standard NMF updates, with Gaussian smoothing of the factor weights on each spot at each iteration to encourage similar weights for spots nearby in space. No matrix inversion needed!
Graphical description of Neighborhood NMF, including its input and how we determined the number of factors.
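The idea in this post can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names (`gaussian_kernel`, `smoothed_nmf`), the `bandwidth` parameter, and the choice of the Lee-Seung multiplicative updates for squared error as the "standard NMF updates". It is written in plain Python for readability and is not the NNMF R package or its actual update rules.

```python
import math
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def gaussian_kernel(coords, bandwidth):
    """Row-normalized Gaussian weights between every pair of spots."""
    K = []
    for p in coords:
        row = [math.exp(-sum((a - b) ** 2 for a, b in zip(p, q))
                        / (2.0 * bandwidth ** 2)) for q in coords]
        total = sum(row)
        K.append([w / total for w in row])
    return K

def smoothed_nmf(V, coords, k, bandwidth=1.0, n_iter=200, seed=0, eps=1e-9):
    """Factor V (spots x genes) as W (spots x k) times H (k x genes).

    Hypothetical sketch: multiplicative updates for squared error,
    followed by Gaussian smoothing of the spot weights W each iteration.
    """
    rng = random.Random(seed)
    n, g = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(g)] for _ in range(k)]
    S = gaussian_kernel(coords, bandwidth)
    for _ in range(n_iter):
        # Update gene programs: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(g)]
             for i in range(k)]
        # Update spot weights: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
        # Spatial step: each spot's weights become a Gaussian-weighted
        # average of nearby spots' weights. No matrix inversion is involved.
        W = matmul(S, W)
    return W, H
```

Because the smoothing matrix has non-negative entries, the weights stay non-negative after every step. The dense all-pairs kernel is only practical for toy inputs; a scalable version would restrict each spot to a local neighborhood.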
thebeehive.bsky.social
NNMF is available in R, performs nonnegative matrix factorization on the gene counts that yields soft clusterings of every spot in spatial transcriptomics, and scales to many samples, arbitrary dimensions, & millions of spots. We run K-means on the soft cluster weights to get an NNMF hard clustering.
Example of factor weights on mouse brain sample, and the gene programs that define each factor. Then we use K-means to build a hard clustering.
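As a rough sketch of the last step, here is K-means (Lloyd's algorithm) applied to a small table of per-spot factor weights. The function name `kmeans`, the farthest-point initialization, and the toy weights are assumptions made for this illustration; the post does not say which K-means implementation or settings were used.

```python
import math

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each spot joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy soft-cluster weights: rows are spots, columns are factors.
weights = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
           [0.10, 0.90], [0.20, 0.80], [0.15, 0.85]]
labels, centers = kmeans(weights, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Clustering the factor weights rather than the raw counts means the hard labels inherit whatever spatial smoothing the factorization applied.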
thebeehive.bsky.social
Exciting update!! @bioimagearchive.bsky.social is now hosting the first publicly available Incucyte data! If you have live-cell imaging data, please consider uploading to this amazing repository!! Thanks to Julia Carnevale and Alex Marson for experimental data:

www.ebi.ac.uk/biostudies/b...
thebeehive.bsky.social
Feedback welcome! And please play with these data! There is a lot more signal there.

Thank you to @bioimagearchive.bsky.social for hosting these Incucyte image data -- this is a new thing for them, and they have been so kind in working through the details of submission (link coming soon!)! 🎉
thebeehive.bsky.social
With five new collaborations in the works, and a paper characterizing the differences using explainable AI already accepted as an oral presentation at #PSB2025 (led by high school senior Marcus Blennemann), look for future work in this space!
www.biorxiv.org/content/10.1...
GitHub - bee-hive/occident: Github repo for Occident Website hosting Live Cell Image Data
thebeehive.bsky.social
In summary, we found that, compared to the SH KO control condition, TCR T cells with the RASA2 KO have a longer dwell time and cripple cancer cells more effectively this way, whereas TCR T cells with the CUL5 KO proliferated more frequently upon activation, adding more T cells to the fight.
thebeehive.bsky.social
With a Markov model, we deconvolved the cases where, in frame t-1, there is one cancer cell and one T cell in a window, and in frame t there is one cancer cell and two T cells. We were able to quantify how often this doubling of T cells attacking a cancer cell was due to proliferation versus recruitment.
thebeehive.bsky.social
Most thrilling is that we can identify active T cells based on relative cell size and morphology, and watch T cells activate (differentially based on condition) after interacting with cancer cells.
thebeehive.bsky.social
Even more exciting, the speed of cancer cells decreased after interactions with T cells, as did their overall size (indicating stress).
thebeehive.bsky.social
While the # of T cell--cancer cell interactions increased similarly, these interactions & their effects were modulated by the CRISPR KOs. E.g., the time a T cell remained attached to a cancer cell (as estimated by a negative binomial and Markov model separately) was highest in RASA2 KO T cells.
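To make the Markov-model view of attachment time concrete, here is a minimal sketch under a strong simplifying assumption: each T cell is reduced to a per-frame sequence of 1 (attached) and 0 (detached), and attachment follows a two-state Markov chain. The function name `mean_dwell_frames` and the input encoding are hypothetical, and this is not the model fitted in the work described here. Under that assumption, dwell times are geometric and the maximum-likelihood mean dwell is the number of attached frames that have a successor divided by the number of detachment events.

```python
def mean_dwell_frames(tracks):
    """Estimate mean attachment time (in frames) from binary tracks.

    tracks: list of sequences of 0/1 flags, one flag per frame.
    Returns None when no detachment event is observed.
    """
    from_attached = 0  # transitions that start in the attached state
    exits = 0          # attached -> detached transitions
    for track in tracks:
        for now, nxt in zip(track, track[1:]):
            if now == 1:
                from_attached += 1
                if nxt == 0:
                    exits += 1
    if exits == 0:
        return None
    # Equivalent to 1 / (1 - p_stay), with p_stay estimated from the counts.
    return from_attached / exits

# Toy track: five transitions start attached, two of them detach.
print(mean_dwell_frames([[1, 1, 1, 0, 0, 1, 1, 0]]))  # 2.5
```

Computing this separately per condition gives one dwell estimate per knockout, which is the kind of quantity being compared across conditions.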
thebeehive.bsky.social
Cancer cell and T cell morphology changes dramatically depending on state. These changes are visible in the brightfield imaging – active interacting T cells are larger and change to less circular shapes. Cancer cells begin to aggregate together when interacting with T cells.
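One common way to put a number on "less circular" is the circularity score 4πA/P², which equals 1 for a perfect circle and drops toward 0 for elongated or irregular outlines. The sketch below computes it from a cell outline given as polygon vertices. The posts do not say which shape metric was used, so the metric and the function names here are assumptions for illustration.

```python
import math

def polygon_area_perimeter(outline):
    """Area (shoelace formula) and perimeter of a closed polygon."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(outline)
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter

def circularity(outline):
    """4 * pi * area / perimeter^2: 1 for a circle, smaller otherwise."""
    area, perimeter = polygon_area_perimeter(outline)
    return 4.0 * math.pi * area / perimeter ** 2

# A near-circular outline versus an elongated one.
round_cell = [(math.cos(2 * math.pi * i / 360), math.sin(2 * math.pi * i / 360))
              for i in range(360)]
stretched_cell = [(0, 0), (4, 0), (4, 1), (0, 1)]
print(circularity(round_cell) > circularity(stretched_cell))  # True
```

Outlines traced from pixel masks have jagged boundaries that inflate the perimeter, so scores from real segmentations tend to sit below these idealized values.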
thebeehive.bsky.social
We found that the number of T cells attached to cancer cells reduces the likelihood that the cancer cell will proliferate, with the beneficial KO T cells having greater effects on proliferation reduction.
thebeehive.bsky.social
We can study differences in cancer cell division events (lower in beneficial KO T cells) and average T cell speed (faster in beneficial KO T cells).
thebeehive.bsky.social
We found that T cell proliferation increased in the two beneficial KO T cells, particularly in the CUL5 KO T cells.
thebeehive.bsky.social
With the masked, tracked cells, we went to work to develop Occident. We were curious how well the RFP markers captured cancer cell number; we found that RFP lags as a proxy for cancer cell numbers.