Gregg Thomas
@gwct.bio
370 followers 320 following 9 posts
Bioinformatics Scientist at Harvard FAS Informatics. Evolution, Genomics, Phylogenetics. Dog walker. He/him. gwct.bio // informatics.fas.harvard.edu
Reposted by Gregg Thomas
fabiology.bsky.social
📣 The Mendes Lab is recruiting PhD students in statistical phylogenetics! Interested, or know someone who might be? Details here 👉 tinyurl.com/542wyfb9 — please share!
gwct.bio
Relatedly, but for a narrower audience, I also adapted the Snakemake SLURM executor plugin to perform automatic partition selection on the Harvard Cannon cluster. I was surprised by the amount of flexibility the plugins allowed, which may have finally won me over to them.

github.com/harvardinfor...
GitHub - harvardinformatics/snakemake-executor-plugin-cannon: A Snakemake executor plugin for submitting jobs to the Harvard Cannon cluster
gwct.bio
Each task has an associated tutorial on our group's website:

informatics.fas.harvard.edu/resources/#t...

Feel free to reach out if you want to use any of these workflows and need help getting started or run into any issues!
Resources - Harvard FAS Informatics Group
informatics.fas.harvard.edu
gwct.bio
Recently I developed several Snakemake workflows for tasks related to Cactus and HAL files, including whole genome alignment and pangenome inference. The goal was to perform these tasks efficiently on SLURM-based (or possibly other) clusters. I hope they are useful!

github.com/harvardinfor...
GitHub - harvardinformatics/cactus-snakemake: Snakemake workflows for performing whole genome alignment with Cactus efficiently on SLURM clusters
Reposted by Gregg Thomas
3rdreviewer.bsky.social
New paper led by @glom.bsky.social!

"Unprecedented female mutation bias in the aye-aye, a highly unusual lemur from Madagascar"

1/
journals.plos.org/plosbiology/...
Photo of an aye-aye
Reposted by Gregg Thomas
jamesbpease.bsky.social
The Pease Lab in the Dept of Evolution, Ecology, and Organismal Biology at The Ohio State University is looking for a Postdoc interested in genotype-phenotype-environment evolution in plant and animal genomes. Details at osu.wd1.myworkdayjobs.com/en-US/OSUCar... and more info at www.peaselab.org
Aronoff Lab at The Ohio State University
Reposted by Gregg Thomas
ecmoore.bsky.social
If you know anyone who might be interested in working as a technician before a PhD, I'm looking for someone to work with me to generate some amazing data to understand the genetics of behavior, sex differences, and reproduction in an evolutionary context. Bonus? Amazing and supportive department!
Reposted by Gregg Thomas
ekopania.bsky.social
Our paper on the evolution of male reproduction in murine rodents was selected as the editor's choice article in the January issue of Evolution!
journal-evo.bsky.social
This month's #EditorsChoice article: "Sperm competition intensity shapes divergence in both sperm morphology and reproductive genes across murine rodents" by Kopania et al. @ekopania.bsky.social https://buff.ly/3PoYFSE
Mus musculus in the snow. By Dion Art (CC BY-SA 4.0).
gwct.bio
Hmm, yea that could work. I wonder if there is any mouse data that could work for this? @jeffreygood.bsky.social
Although, I'm not sure how to map shortbreads. Maybe some batches with BWA-kery? :D
gwct.bio
Yea that would definitely help pinpoint which SNPs are being miscalled. I was also hoping to dig into why they are being miscalled - mis-mapped reads? unmapped reads? something else? Which I can't think of how to do without a truth set for the mappings themselves.
gwct.bio
A thorough look at what I think is an often overlooked problem in popgen and comparative genomics in general!
jazlynmooney.bsky.social
First major paper from the lab! The work was led by super talented postdoc Maria Akopyan. She explored how reference bias skews estimates of diversity, demography, divergence & recombination rate with help from @elliecat.bsky.social
& awesome undergrad Matthew. Full thread coming soon 🤓
biorxiv-evobio.bsky.social
Divergent reference genomes compromise the reconstruction of demographic histories, selection scans, and population genetic summary statistics https://www.biorxiv.org/content/10.1101/2024.11.26.625554v1
gwct.bio
We've (@jeffreygood.bsky.social) implemented this in pseudo-it (github.com/goodest-good...), though hard to quantify the effects without good simulations, which I've yet to find the right read simulation program for. Ideally want simulated bam and vcf to compare mapping and SNP calls. Ideas welcome!
GitHub - goodest-goodlab/pseudo-it: Beta version of the new pseudo-it software for iterative reference guided assemblies.
gwct.bio
Great to work with @ekopania.bsky.social, @jeffreygood.bsky.social, and everyone else on this project! I was happy to be able to present Emily's cool figure showing dN/dS of genes in different tissues and different time-points at Evolution a couple years ago:
A color figure (3) from the paper linked in the quoted post. The top panel A shows a drawing of the male rodent reproductive tract with different tissues shaded different colors, from dark blue to red, indicating median dN/dS values for genes expressed in those tissues. Below are colored cells using the same color scale from five different stages of spermatogenesis. Panels B and C below are graphs with points on the x-axis corresponding to the four cell stages shown above, with the y-axis for panel B being the proportion of genes under positive selection for each time point and for panel C being the proportion of genes expressed during that stage that are testis-specific. The points in panel B hover around the genome-wide average shown with a dotted horizontal line, while the points in panel C increase steadily above the genome-wide average, peaking in stage 4 (spermatids).
Reposted by Gregg Thomas
josephwb.bsky.social
For phun I made a "starter pack" for people involved in developing phylogenetic methods. If you feel you should not be involved, or feel I missed you (it is difficult!), please let me know. go.bsky.app/D7LsGUM 🧪
Reposted by Gregg Thomas
rejectresubmit.bsky.social
I'm recruiting students / postdocs to join my new lab at the University of Rochester for Fall 2025 onwards! If you're interested in phylogenetic comparative methods, genome evolution, and/or computational biology, please get in touch! More info:

mhibbins.github.io
Hibbins Lab
Reposted by Gregg Thomas
roblanfear.bsky.social
For phylo nerds interested in concordance and discordance...

Here's a tutorial that helps you calculate gene, site, and quartet concordance vectors for any branch on your tree.

iqtree.org/doc/recipes/...

(work with @3rdreviewer.bsky.social)

#phylogenetics 🧪 #evolution
A table with the title "Concordance factors for branch ID 545". It shows a table with 4 columns (concordance and discordance factors) and 3 rows (gene, site, and quartet). Each cell contains a percentage representing the proportion of the genes, sites, or quartets which match a particular tree. Cells with higher numbers are coloured darker red.
A phylogeny showing six major clades of birds. Next to each is a table of concordance vectors.
Reposted by Gregg Thomas
natforsdick.bsky.social
Adam Freedman gives a great overview of the pros and cons of current genome annotation pipelines. If you have RNA-seq data, use BRAKER or StringTie; if you have a high-quality annotation from a closely related species, TOGA performs very well. #Evol2024
Reposted by Gregg Thomas
3rdreviewer.bsky.social
New preprint with Rob Lanfear!

"The meaning and measure of concordance factors in phylogenomics"

ecoevorxiv.org/repository/v...