Lightnews — Scholar-powered news

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

There's a typo on line 225, it is actually 50% identity, not 90%. But yeah, SRA is highly redundant :)

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

@martinsteinegger.bsky.social, @caleblareau.bsky.social, @pierrepeterlongo.bsky.social, @rnalab.bsky.social

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

@tlemane.bsky.social, @mmontonerin.bsky.social, @apcamargo.bsky.social, @mattlabguy.bsky.social, @sinamajidian.bsky.social, @rfaure.bsky.social, @jmouradesousa.bsky.social, @epcrocha.bsky.social, @david-koslicki.bsky.social, @pashadag.bsky.social

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

Earth’s genetic diversity is a heritage of humanity. It has been an honour to explore this data with a team of dedicated scientists who shared our vision of making this data free and accessible to all 🌍🧬❤️ Thank you!

Updated preprint: doi.org/10.1101/2024...

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

This is a new frontier for biological discovery and AI training data. Logan expands the universe of known proteins, plasmids, AMR, P4 satellites, and the newly discovered Obelisk RNA elements.

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

All Logan data is freely available (CC0) right now. We show how Logan-Search (www.logan-search.org) can be used to uncover viral reactivation (HHV-6) in cell therapy products (TIL and CAR-T).

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

Logan rapidly accesses the tapestry of Life’s genetic diversity and can help solve global issues.

To tackle the microplastic crisis, we searched Logan for new versions of the 213 known plastic-degrading enzymes. We identified 200+ million homologs 🤯, including new high-efficiency enzymes 🥤🔥

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

Logan enables minute-scale k-mer search and hour-scale deep-homology protein alignment search across 100+ billion proteins.

www.logan-search.org

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

One year after our initial preprint, we're excited to post a major update to Logan.

At its heart, Logan is the assembly of 27 million samples (50 Pbp), produced by a 6-day cloud computation that peaked at 2.2M vCPUs. This compresses the SRA 140x compared to raw FASTQs.

github.com/IndexThePlan...

Rayan Chikhi @rayanchikhi.bsky.social · Sep 3

🌎👩‍🔬 For 15+ years, biology has accumulated petabytes (millions of gigabytes) of 🧬 DNA sequencing data 🧬 from the far reaches of our planet. 🦠🍄🌵

Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open.

doi.org/10.1101/2024...

Reposted by Rayan Chikhi

Institut Pasteur | 130 years of biomedical research @pasteur.fr · Jul 24

Congratulations to Rayan Chikhi (Institut Pasteur), head of the “Sequence Bioinformatics” unit, for securing the ERC Proof of Concept 2025 for his project ENZYMINER! 👏

@rayan.chiki.bsky.social

#Bioinformatics

Rayan Chikhi @rayanchikhi.bsky.social · Jul 25

thanks Niema!

Rayan Chikhi @rayanchikhi.bsky.social · Jul 25

thanks Ben!!

Rayan Chikhi @rayanchikhi.bsky.social · Jul 25

thanks Sophie!

Rayan Chikhi @rayanchikhi.bsky.social · Jun 3

yes😅

Rayan Chikhi @rayanchikhi.bsky.social · Jun 3

Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...

Reposted by Rayan Chikhi

Camila Duitama @camiladuitama.bsky.social · Feb 4

🧬 Excited to share our latest work, MUSET 🌭, a new tool for creating abundance unitig matrices from sequencing data. It was published yesterday in Oxford Bioinformatics if you want to have a look 👀:

academic.oup.com/bioinformati...

Let's break it down:

MUSET: Set of utilities for constructing abundance unitig matrices from sequencing data

Abstract. Summary. MUSET is a novel set of utilities designed to efficiently construct abundance unitig matrices from sequencing data. Unitig matrices extend

academic.oup.com

Rayan Chikhi @rayanchikhi.bsky.social · Feb 3

For more context: Logan is a collection of all public sequencing data (until the end of 2023) assembled into contigs. It is freely hosted on the cloud and contains hundreds of terabytes of valuable genomic data: github.com/IndexThePlan...

GitHub - IndexThePlanet/Logan: Logan Unitigs and Contigs

github.com

Rayan Chikhi @rayanchikhi.bsky.social · Feb 3

We have updated all Logan contigs (now at version 1.1)! Contiguity has been much improved (2x) and a duplicated k-mers bug has been fixed. More information and changelog here: github.com/IndexThePlan...

github.com

Reposted by Rayan Chikhi

recombseq.bsky.social @recombseq.bsky.social · Jan 24

🚨 Keynotes at RECOMB-seq 2025! 🚨

🌟 Alicia Oshlack – computational transcriptomics
@aliciao.bsky.social

🌟 Rayan Chikhi – sequencing data structures
@rayanchikhi.bsky.social

🗓️ Dates: April 24–25, 2025
📍 Seoul, South Korea

recomb-seq.github.io/speakers/

Reposted by Rayan Chikhi

Paul Medvedev @pashadag.bsky.social · Dec 20

Do you want to learn systematic ways in which you can revise your research papers? I've posted a short collection of 4 lectures youtube.com/playlist?lis... 1/n

Writing in Computer Science - YouTube

This is a small collection of videos for learning about how to write research papers in computer science. For now, it contains four basic lectures, with more...

youtube.com

Rayan Chikhi @rayanchikhi.bsky.social · Nov 24

Ty Rob!

Reposted by Rayan Chikhi

Martin Steinegger 🇺🇦 @martinsteinegger.bsky.social · Nov 23

Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
🌐 bfvd.foldseek.com
💾 bfvd.steineggerlab.workers.dev
📄 academic.oup.com/nar/advance-...
