Ming Tommy Tang
@tommytang.bsky.social
Director of bioinformatics at AstraZeneca. Subscribe to my YouTube channel @chatomics. On my way to helping 1 million people learn bioinformatics. Educator, Biotech, single cell. Also talks about leadership. tommytang.bio.link
tommytang.bsky.social
The genomic landscape of relapsed infant and childhood KMT2A-rearranged acute leukemia www.nature.com/articles/s4...
tommytang.bsky.social
14/14
Bioinformatics isn’t just code.
It’s systems engineering in disguise.
Every error conquered = deeper understanding.
You’ve got this. 💪
tommytang.bsky.social
13/14
Key Takeaways:
Red text = clues, not failure

Most errors come down to missing system libs

Isolate issues with clean environments

Document every fix—you’ll forget
tommytang.bsky.social
12/14
Windows users:
Install Rtools for C/C++ dependencies:
cran.r-project.org/bin/windows...

Restart RStudio after installing!
tommytang.bsky.social
11/14
Permission denied?
Never use sudo R. Instead:

install.packages("pkg", lib="~/R/libs") # User directory

Add export R_LIBS_USER=~/R/libs to .bashrc.
tommytang.bsky.social
10/14
Nuclear option: Start fresh.

R --vanilla # No startup configs
install.packages("problematic_pkg")

Or use renv/conda for isolated environments.
tommytang.bsky.social
9/14
ChatGPT Prompt Template:
"I’m on macOS Ventura, R 4.3.0. Trying to install DESeq2 with BiocManager::install(). Got this error: [paste]. What’s wrong?"
Context matters.
tommytang.bsky.social
8/14
Still stuck?
Google the exact error in quotes:
"error: X11 headers not found" site:stackoverflow.com (the site: operator limits results to that site. Cool Google trick.)
Add R or Bioconductor to narrow results.
tommytang.bsky.social
7/14
Bioconductor packages: Always use:
BiocManager::install("limma") # Not install.packages()!

Bioconductor has its own repositories and versioned releases, matched to your R version.
tommytang.bsky.social
6/14
Installing Seurat? Common error:
libgfortran.so.3: cannot open...
Linux:
sudo apt install libgfortran3 # Older Debian/Ubuntu releases; newer ones may no longer ship this version

macOS:
brew install gcc # Installs Fortran libs
tommytang.bsky.social
5/14
Linux example:
Error: X11 headers not found
Fix:
sudo apt-get install libx11-dev # Debian/Ubuntu

On macOS:
Install XQuartz (X11 server). This has happened to me several times!
tommytang.bsky.social
4/14
Don’t scroll past the error.
Read line by line.
Look for:
cannot find -lxml2 → Install libxml2-dev (Debian/Ubuntu)

zlib.h not found → Install zlib1g-dev (Debian/Ubuntu) / zlib (macOS)
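A minimal sketch of that "read line by line" habit, in Python. The lookup table only holds fragments and Debian/Ubuntu package names quoted in this thread. The names KNOWN_FIXES and suggest_packages are made up for illustration, and the sample log is not real installer output, just those fragments pasted together.

```python
# Hypothetical lookup table: error fragments quoted in this thread,
# mapped to the Debian/Ubuntu packages the thread suggests.
KNOWN_FIXES = {
    "cannot find -lxml2": "libxml2-dev",
    "zlib.h": "zlib1g-dev",
    "X11 headers not found": "libx11-dev",
}


def suggest_packages(log_text):
    """Scan a log line by line and collect a package for each known fragment.

    Packages are returned in the order their fragment first appears,
    without duplicates.
    """
    suggestions = []
    for line in log_text.splitlines():
        for fragment, package in KNOWN_FIXES.items():
            if fragment in line and package not in suggestions:
                suggestions.append(package)
    return suggestions


# Made-up log built only from the fragments above (not real installer output).
sample_log = "cannot find -lxml2\nzlib.h not found"
print(suggest_packages(sample_log))  # ['libxml2-dev', 'zlib1g-dev']
```

A plain substring match is enough here because the point is to find the one informative line in a long red wall. A real log may word the message differently, so treat a miss as "keep reading", not "nothing is wrong".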
tommytang.bsky.social
3/14
Most install errors boil down to:
✅ R-level issues: Missing dependencies (e.g., limma)
✅ System-level issues: Missing libraries (e.g., zlib)
✅ Permissions: Can’t write to /usr/lib/
tommytang.bsky.social
2/14
You run:
install.packages("BiocManager") # For Bioconductor packages

Boom. Red wall. Panic sets in.
But breathe. The clues are there—if you know how to read them.
tommytang.bsky.social
1/14
Installation fails. Red text. Cryptic errors.
Every bioinformatician has been there—even experts.
It feels personal. It’s not.
tommytang.bsky.social
You just want to install an R package.
But the screen goes red.
Your stomach drops.
Even pros dread this moment. Here's how to survive it: 🧵
tommytang.bsky.social
12/
Bioinformatics isn’t about blindly analyzing big matrices.
It’s about asking:
“What’s the signal?
What’s the noise?
And what can I ignore to see the truth?”
That’s the real art of dimension reduction.
tommytang.bsky.social
11/
Takeaways:
High dimensions can lie

Use PCA/UMAP wisely

Interpret results with biology in mind

Don’t trust t-SNE for distances between clusters

Feature selection is your first defense
tommytang.bsky.social
10/
Bottom line:
Dimensionality is a curse only if you treat all data as equal.
The solution is knowing when to cut, compress, or interpret.
tommytang.bsky.social
9/
Key challenges in scRNA-seq:
Sparsity

Batch effects

Scalability
You need tools that know how to swim in noisy waters—Seurat, Scanpy, scVI.
tommytang.bsky.social
8/
scRNA-seq dimension reduction steps:
Pick highly variable genes

Use PCA to drop noise

UMAP to visualize clusters

Harmony/scVI to align across batches. (Harmony corrects the PC coordinates)

Autoencoders to denoise
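The first step in that list, picking highly variable genes, can be sketched in a few lines of plain Python. This is a deliberate simplification: it ranks genes by raw sample variance, whereas Seurat and Scanpy adjust the variance for each gene's mean expression. The gene names, the toy counts, and the function name top_variable_genes are all hypothetical.

```python
from statistics import variance


def top_variable_genes(counts, n_top):
    """Rank genes by sample variance across cells and keep the top n_top.

    counts: dict mapping gene name -> list of expression values, one per cell.
    Simplification: no mean-variance adjustment, unlike real HVG methods.
    """
    variances = {gene: variance(values) for gene, values in counts.items()}
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_top]


# Toy data: three hypothetical genes measured in four cells.
toy_counts = {
    "geneA": [0, 0, 0, 0],  # never expressed: no variance, no signal
    "geneB": [0, 5, 0, 9],  # varies a lot between cells
    "geneC": [1, 2, 1, 2],  # varies a little
}
print(top_variable_genes(toy_counts, n_top=2))  # ['geneB', 'geneC']
```

Dropping geneA before PCA is the "cut" part of cut, compress, interpret: a gene that never changes cannot separate cells, so it only adds dimensions.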
tommytang.bsky.social
7/
This works in single-cell RNA-seq too—except now you’ve got 20,000 genes AND 50,000 cells.
Welcome to p ≈ n.
But there’s a twist: the matrix is full of zeros. Many of them are dropout: transcripts that were in the cell but never got captured.
tommytang.bsky.social
6/
What does PCA really do?
Take 20,000 genes and build 10 orthogonal “super-genes” (PCs).
They explain variance, not necessarily biology.
Still, it helps you see patterns you’d never spot in raw data.
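To make the "super-gene" idea concrete, here is a rough sketch that finds the first PC of a tiny dataset with power iteration, using only the Python standard library. It assumes two hypothetical genes and four cells. The function name first_pc is made up, and real tools rely on optimized SVD routines rather than building a full gene-by-gene covariance matrix, which would be impractical at 20,000 genes.

```python
import math


def first_pc(rows, n_iter=100):
    """Return the first principal axis (unit vector) of rows (cells x genes).

    Steps: center each gene, build the sample covariance matrix, then run
    power iteration to approximate its leading eigenvector.
    Assumes at least two rows and non-zero overall variance.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] * p  # arbitrary starting direction
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Four cells, two hypothetical genes; gene 2 is always twice gene 1.
cells = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_pc(cells)
print([round(x, 4) for x in pc1])  # [0.4472, 0.8944]
```

The resulting weights are in a 1:2 ratio, so this one "super-gene" captures all the variation in both genes. Note that it points along the direction of largest variance, whatever the cause. Whether that direction is biology or batch is still yours to check.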