Alexis Stamatakis
@stamatak.bsky.social
800 followers 420 following 35 posts
ERA Chair at the Institute of Computer Science, FORTH. Research Group Leader at the Heidelberg Institute for Theoretical Studies. Full Professor at Karlsruhe Institute of Technology. Crete lab: https://www.biocomp.gr/ Heidelberg lab: http://www.exelixis-lab.org/
stamatak.bsky.social
Want to spend time in the French Alps and talk about Machine Learning for Evolutionary Genomics Data? Join us for the 2nd LEGEND conference. The abstract submission deadline is September 22: legend2025.sciencesconf.org
legend2025 : Machine Learning for Evolutionary Genomics Data - Sciencesconf.org
stamatak.bsky.social
Do you fancy spending a few days in the French Alps in December, talking about Machine Learning for Evolutionary Genomics Data? Join us for the 2nd LEGEND conference: legend2025.sciencesconf.org
stamatak.bsky.social
Check out our new preprint on reproducible parallel phylogenetic inference under varying core counts - it also includes a generic method for reproducible parallel associative reduction operations www.biorxiv.org/content/10.1...
Bit-Reproducible Phylogenetic Tree Inference under Varying Core-Counts via Reproducible Parallel Reduction Operators
Motivation: Phylogenetic trees describe the evolutionary history among biological species based on their genomic data. Maximum Likelihood (ML) based phylogenetic inference tools search for the tree and evolutionary model that best explain the observed genomic data. Given the independence of likelihood score calculations between different genomic sites, parallel computation is commonly deployed. This is followed by a parallel summation over the per-site scores to obtain the overall likelihood score of the tree. However, basic arithmetic operations on IEEE 754 floating-point numbers, such as addition and multiplication, inherently introduce rounding errors. Consequently, the order by which floating-point operations are executed affects the exact resulting likelihood value since these operations are not associative. Moreover, parallel reduction algorithms in numerical codes re-associate operations as a function of the core count and cluster network topology, inducing different round-off errors. These low-level deviations can cause heuristic searches to diverge and induce high-level result discrepancies (e.g., yield topologically distinct phylogenies). This effect has also been observed in multiple scientific fields, beyond phylogenetics.

Results: We observe that varying the degree of parallelism results in diverging phylogenetic tree searches (high-level results) for over 31 % out of 10 130 empirical datasets. More importantly, 8 % of these diverging datasets yield trees that are statistically significantly worse than the best known ML tree for the dataset (AU-test, p < 0.05). To alleviate this, we develop a variant of the widely used phylogenetic inference tool RAxML-NG, which does yield bit-reproducible results under varying core-counts, with a slowdown of only 0 to 12.7 % (median 0.8 %) on up to 768 cores.

We further introduce the ReproRed reduction algorithm, which yields bit-identical results under varying core-counts, by maintaining a fixed operation order that is independent of the communication pattern. ReproRed is thus applicable to all associative reduction operations, in contrast to competitors, which are confined to summation. Our ReproRed reduction algorithm only exchanges the theoretical minimum number of messages, overlaps communication with computation, and utilizes fast base-cases for local reductions. ReproRed is able to all-reduce (via a subsequent broadcast) 4.1 · 10^6 operands across 48 to 768 cores in 19.7 to 48.61 µs, thereby exhibiting a slowdown of 13 to 93 % over a non-reproducible all-reduce algorithm. ReproRed outperforms the state-of-the-art reproducible all-reduction algorithm ReproBLAS (offers summation only) beyond 10 000 elements per core.

In summary, we re-assess non-reproducibility in parallel phylogenetic inference, present the first bit-reproducible parallel phylogenetic inference tool, as well as introduce a general algorithm and open-source code for conducting reproducible associative parallel reduction operations.

Competing Interest Statement: The authors have declared no competing interest.

European Research Council, https://ror.org/0472cxd90, 882500
European Union, https://ror.org/019w4f821, 101087081
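The abstract's central observation, that floating-point addition is not associative and that regrouping operands across a different number of cores changes the rounded result, can be illustrated with a small single-process sketch. This is not the ReproRed algorithm and uses no MPI. It only simulates "workers" as contiguous chunks of a list, and every function name below is hypothetical. The second routine shows one simple way to obtain a fixed operation order: it always follows the same binary tree over global element indices, so the worker count only decides who computes which subtree, never the order of additions.

```python
from bisect import bisect_right


def chunk_bounds(n, p):
    """Boundaries of p contiguous chunks over n elements (assumed data layout)."""
    return [i * n // p for i in range(p + 1)]


def naive_parallel_sum(values, p):
    """Each simulated worker sums its chunk left to right, then the partial
    sums are added up. The grouping of additions depends on p."""
    bounds = chunk_bounds(len(values), p)
    total = 0.0
    for w in range(p):
        partial = 0.0
        for x in values[bounds[w]:bounds[w + 1]]:
            partial += x
        total += partial
    return total


def fixed_tree_sum(values, lo, hi):
    """Sum values[lo:hi] along a binary tree defined only by global indices."""
    if hi - lo == 1:
        return values[lo]
    mid = (lo + hi) // 2
    return fixed_tree_sum(values, lo, mid) + fixed_tree_sum(values, mid, hi)


def reproducible_parallel_sum(values, p):
    """Same tree as fixed_tree_sum; p only decides which simulated worker
    evaluates which subtree. Requires a non-empty list."""
    n = len(values)
    bounds = chunk_bounds(n, p)

    def owner(i):
        # Index of the simulated worker whose chunk contains element i.
        return bisect_right(bounds, i) - 1

    def node(lo, hi):
        if owner(lo) == owner(hi - 1):
            # Subtree lies within one worker's chunk: reduced locally.
            return fixed_tree_sum(values, lo, hi)
        mid = (lo + hi) // 2
        # In a real distributed setting this addition would need a message.
        return node(lo, mid) + node(mid, hi)

    return node(0, n)


if __name__ == "__main__":
    data = [0.1] * 10
    for p in (1, 2, 3, 7):
        print(p, naive_parallel_sum(data, p).hex(),
              reproducible_parallel_sum(data, p).hex())
```

Comparing the `.hex()` representations checks bit-level identity rather than approximate equality. The sketch ignores everything that makes the problem hard in practice (message counts, overlapping communication with computation, fast local base cases), which the abstract names as the actual contributions.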
stamatak.bsky.social
Are you looking for a good excuse to visit Crete? Join us for the EMBO satellite workshop on Biodiversity Genomics. Register for free via forms.gle/GRvPxCp2TnYd... Limited spots are available, first come, first served.
stamatak.bsky.social
The next edition of our LEGEND conference on Machine Learning for Evolutionary Genomics Data will take place in Aussois (French Alps) from Dec 8-12 2025.

All practical information and a list of our keynote speakers are available at: legend2025.sciencesconf.org

We hope to meet you there again!
stamatak.bsky.social
Through dynamic CPU clock frequency scaling, our EcoFreq tool reduces your energy consumption and CO2 footprint by 15-18% at the cost of only a 10% decrease in throughput. The tool is free of charge for academic use.

A short video explaining EcoFreq: youtu.be/cpw--Tsbib4
The EcoFreq Tool - compute with cleaner & cheaper energy
YouTube video by Alexandros Stamatakis
stamatak.bsky.social
Another new term I invented together with a colleague is specimen-drain: the export of valuable biodiversity, ancient DNA, or other samples to countries of the Global North that have the money to process them, coupled with losing the lead on the resulting research papers.
stamatak.bsky.social
"To yield Greece more competitive, substantial increases of R&D expenditure and a long term strategic development plan are required such that the country becomes more than a tourist destination in the European periphery." - see our policy paper for more: www.frontiersin.org/journals/pol...
Frontiers | Necessary reforms in the Greek academic system
stamatak.bsky.social
Before somebody else comes up with it, here is a new term I invented: brain-redrain

Definition: brains that return to their underdeveloped home country in the hope of helping to improve things, but then leave again out of frustration because nothing will ever change there.
stamatak.bsky.social
Here is the prediction for the UEFA EURO 2024 elimination round conducted at ICS-FORTH. Our PhD student Lucia computed 1 million predictions, considering team performance during the group phase only and taking qualifiers into account, based on this paper: link.springer.com/article/10.1...