hpc.social admins
admin.mast.hpc.social.ap.brid.gy
Mastodon administration for mast.hpc.social. See more information at https://hpc.social and help sponsor this instance and related projects at […]

🌉 bridged from https://mast.hpc.social/@admin on the fediverse by https://fed.brid.gy/
Are you at #fosdem2026 and looking for good ways to get started or to learn and contribute more to #HPC open source software? First of all, don't miss the #hpc, #bigdata, and #datascience devroom Sunday Feb. 1 at https://fosdem.org/2026/schedule/track/hpc-big-data-data-science/ (look for @boegel […]
Original post on mast.hpc.social
January 31, 2026 at 6:49 PM
Reposted by hpc.social admins
My list of package manager related academic papers is now up to 209!

https://nesbitt.io/2025/11/13/package-management-papers.html
Package Management Papers
There’s been all kinds of interesting academic research on package management systems, dependency resolution algorithms, software supply chain security, and package ecosystem analysis over the years. Below is a curated list of papers I’ve found interesting; it’s not exhaustive but covers a good chunk of the literature. **An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries** (2024) _Tom Mens, Alexandre Decan_ arXiv preprint Comprehensive literature review and survey of package dependency management research. Catalogues dependency-related challenges including dependency hell, technical lag, security vulnerabilities, and supply chain attacks. Covers SCA tools, SBOMs, and SLSA security levels. Good starting point for researchers and practitioners new to the field. The papers are organized by topic and include brief descriptions along with author names and publication years. This is a living document; if you know of papers that should be included, please reach out on Mastodon or open a pull request on GitHub. ## Package Management Security Research on security vulnerabilities, attack vectors, and defense mechanisms in package management systems. **A Look in the Mirror: Attacks on Package Managers** (2008) _Justin Cappos, Justin Samuel, Scott Baker, John H. Hartman_ ACM Conference on Computer and Communications Security (CCS) Seminal paper analyzing ten popular package managers (including APT, YUM, YaST, and Portage) and discovering vulnerabilities in all of them exploitable by man-in-the-middle attackers or malicious mirrors. Demonstrated that attackers controlling mirrors could compromise hundreds to thousands of clients weekly. Identified replay attacks, freeze attacks, extraneous dependencies attacks, and endless data attacks while proposing a layered security approach. A broader, more “textbook” analysis of these attacks is also available in a technical report by the authors.
This further fleshes out a host of related attacks that rely on manipulation of dependency information by mirrors to cause package resolution to behave in ways that harm security or stability. **Package Managers Still Vulnerable** (2009) _Justin Samuel, Justin Cappos_ ;login: The USENIX Magazine Follow-up analysis examining how package managers responded to disclosed vulnerabilities, finding that while some (YaST, APT) made improvements, many remained vulnerable to replay, freeze, and endless data attacks. **Secure Software Updates: Disappointments and New Challenges** (2006) _Anthony Bellissimo, John Burgess, Kevin Fu_ USENIX Workshop on Hot Topics in Security (HotSec) Early analysis of popular software update mechanisms demonstrating that despite research progress, deployed systems relied on trusted networks and were susceptible to man-in-the-middle attacks. Examining McAfee VirusScan, Mozilla Firefox, and Windows Update, the study found none properly authenticated connections. While technically not package manager research, this work demonstrated that security was lacking in the general space of software update systems. **Mercury: Bandwidth-Effective Prevention of Rollback Attacks Against Community Repositories** (2017) _Trishank Kuppusamy, Vladimir Diaz, Justin Cappos_ USENIX Annual Technical Conference (USENIX ATC) Presented bandwidth-efficient techniques for preventing rollback attacks on package repositories in a way that scales to very large software repositories, such as PyPI. The techniques described here reduce metadata overhead by 95% compared to standard TUF while maintaining security properties. Using delta compression, Mercury achieves about 3.5% of average package size per month for PyPI users. 
**Artemis: Defanging Software Supply Chain Attacks in Multi-repository Update Systems** (2023) _Marina Moore, Trishank Kuppusamy, Justin Cappos_ Annual Computer Security Applications Conference (ACSAC) Discusses ways to securely use multiple repositories with a package manager. This includes 1) a mechanism for blocking or pinning a repository name to a specific repository, 2) a means for multiple parties to have different package namespaces on the same repository, and 3) a means to require a threshold of approvers for all of these operations. This paper presents lessons learned both from deployments of Uptane (the variant of TUF widely used in the automotive industry) and from other TUF deployments across millions of devices. **Small World with High Risks: A Study of Security Threats in the npm Ecosystem** (2019) _Markus Zimmermann, Cristian-Alexandru Staicu, Cam Tenny, Michael Pradel_ USENIX Security Symposium Systematically analyzed dependencies, maintainers, and security issues in npm, finding that 20 maintainers can reach more than half the ecosystem and two-thirds of advisories remain unpatched. Demonstrated that small-world network properties create concentrated security risks. **The impact of security vulnerabilities in the npm package dependency network** (2018) _Alexandre Decan, Tom Mens, Eleni Constantinou_ International Conference on Mining Software Repositories (MSR) Analyzed propagation of security vulnerabilities through the npm dependency network, studying how vulnerabilities affect downstream packages and the time required for ecosystem-wide fixes. **Demystifying the vulnerability propagation and its evolution via dependency trees in the npm ecosystem** (2022) _Chengwei Liu, Sen Chen, Lingling Fan, Bihuan Chen, Yang Liu, Xin Peng_ IEEE/ACM International Conference on Software Engineering (ICSE) Analyzes vulnerability propagation within dependency trees by applying npm-specific dependency resolution rules, recommending lockfiles for managing dependencies.
**Empirical Analysis of Security Vulnerabilities in Python Packages** (2021) _Various authors_ IEEE conference proceedings Analysis of 550 vulnerability reports affecting 252 Python packages in PyPI ecosystem, providing empirical evidence about vulnerability patterns in Python packages. **Surviving Software Dependencies** (2019) _Russ Cox_ ACM Queue Influential essay on managing software dependencies at scale. Discusses version selection, minimum version selection (used in Go), and the tradeoffs between different dependency management approaches. Required reading for anyone working on package managers. **The Impact of Regular Expression Denial of Service (ReDoS) in Practice** (2018) _James Davis, Christy Coghlan, Francisco Servant, Dongyoon Lee_ ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) - Distinguished Paper Award Ecosystem-scale study of ReDoS vulnerabilities in npm and PyPI. Found thousands of super-linear regexes affecting over 10,000 modules. 93% of vulnerable regexes are polynomial rather than exponential, missed by common detection tools. **Thou Shalt Not Depend on Me: Analysing the Use of Outdated JavaScript Libraries on the Web** (2017) _Tobias Lauinger, Abdelberi Chaabane, Sajjad Arshad, William Robertson, Christo Wilson, Engin Kirda_ Network and Distributed System Security Symposium (NDSS) First comprehensive study of client-side JavaScript library usage across 133K websites. Found 37% include at least one library with a known vulnerability. Median site uses library versions released 1,177 days before newest available release. ## Lockfiles Research on lockfile design, usage, and their role in dependency management. **The Design Space of Lockfiles Across Package Managers** (2025) _Yogya Gamage, Deepika Tiwari, Martin Monperrus, Benoit Baudry_ arXiv preprint First study of lockfiles across seven package managers (npm, pnpm, Cargo, Poetry, Pipenv, Gradle, Go). 
Analyzes lockfile content and lifecycle differences, finding Go has near 100% lockfile commit rate while Gradle is close to zero. Interviews with 15 developers reveal benefits (build determinism, integrity verification, transparency) and challenges (readability, delayed updates, library locking). Recommends generating lockfiles by default and committing them for all projects. **Reproducible builds: Increasing the integrity of software supply chains** (2022) _Chris Lamb, Stefano Zacchiroli_ IEEE Software Overview of the reproducible builds movement and its importance for software supply chain security. Discusses how bit-for-bit reproducibility enables independent verification of build artifacts. **It’s Like Flossing Your Teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security** (2023) _Marcel Fourné, Dominik Wermke, William Enck, Sascha Fahl, Yasemin Acar_ IEEE Symposium on Security and Privacy (S&P) 24 semi-structured interviews with Reproducible-Builds.org participants. Found self-effective work by highly motivated developers and collaborative communication with upstream projects are key to achieving reproducible builds. Identifies path for R-Bs to become commonplace. **Investigating the reproducibility of npm packages** (2020) _Pronnoy Goswami, Saksham Gupta, Zhiyuan Li, Na Meng, Daphne Yao_ IEEE International Conference on Software Maintenance and Evolution (ICSME) Empirical study of npm package reproducibility, analyzing factors that affect whether packages can be rebuilt identically from source. **Pinning is futile: You need more than local dependency versioning to defend against supply chain attacks** (2025) _Hao He, Bogdan Vasilescu, Christian Kästner_ arXiv preprint Study finding that local pinning leads to more security vulnerabilities due to bloated and outdated dependencies. Suggests risk of malicious package updates can be reduced when core dependencies pin their versions and keep them updated regularly. 
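The integrity-verification benefit attributed to lockfiles above can be illustrated with a minimal sketch. It assumes a hypothetical lockfile shape (a mapping from package name to a pinned version and a SHA-256 digest) and a hypothetical `verify_against_lockfile` helper; real lockfile formats differ per package manager, and none of the names below come from the papers listed.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_against_lockfile(lockfile: dict, artifacts: dict) -> list:
    """Return the names of locked packages whose artifact is missing
    or whose digest differs from the one recorded in the lockfile."""
    problems = []
    for name, entry in sorted(lockfile.items()):
        data = artifacts.get(name)
        if data is None or sha256_hex(data) != entry["sha256"]:
            problems.append(name)
    return problems


# Hypothetical lockfile entry and artifact bytes, for illustration only.
good = b"contents of left-pad 1.3.0"
lockfile = {"left-pad": {"version": "1.3.0", "sha256": sha256_hex(good)}}

print(verify_against_lockfile(lockfile, {"left-pad": good}))         # []
print(verify_against_lockfile(lockfile, {"left-pad": good + b"!"}))  # ['left-pad']
```

The check only detects that an artifact differs from what was locked; it says nothing about whether the locked version was trustworthy in the first place, which is the gap several of the supply chain papers address.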
**Maven-Lockfile: High Integrity Rebuild of Past Java Releases** (2025) _Larissa Schmid, et al._ arXiv preprint Addresses Maven’s lack of native lockfile support. Presents Maven-Lockfile to generate and update lockfiles capturing all direct and transitive dependencies with checksums. Enables high integrity builds and can detect tampered artifacts. **Does Functional Package Management Enable Reproducible Builds at Scale? Yes.** (2025) _Julien Malka, Stefano Zacchiroli, Théo Zimmermann_ International Conference on Mining Software Repositories (MSR) - Distinguished Paper Award First large-scale study of bitwise reproducibility in Nix, rebuilding 709,816 packages from historical snapshots of nixpkgs sampled between 2017 and 2023. Achieved reproducibility rates between 69% and 91% with an upward trend, and rebuildability rates over 99%. Found about 15% of unreproducibility failures are due to embedded build dates. Released a dataset with build statuses, logs, and recursive diffs showing where unreproducible artifacts differ. **Improving Reproducibility of Scientific Software Using Nix/NixOS: A Case Study on the preCICE Ecosystem** (2025) _Max Hausch, Simon Hauser, Benjamin Uekermann_ Electronic Communications of the EASST Case study applying Nix to scientific software reproducibility in the preCICE coupling library ecosystem. Demonstrates how functional package management provides guarantees that packages and their dependencies can be built reproducibly, addressing challenges in computational science where results must be independently verifiable. ## Dependency Resolution Algorithms and Challenges Research establishing the theoretical complexity of dependency resolution and practical solutions. **EDOS deliverable WP2-D2.1: Report on Formal Management of Software Dependencies** (2005) _Roberto Di Cosmo_ INRIA Technical Report First document to show that the package installation problem is NP-complete. First to show a 3SAT encoding for Debian and RPM solves. 
Compares package constraint languages and proposes improvements for metadata. **OPIUM: Optimal Package Install/Uninstall Manager** (2007) _Chris Tucker, David Shuffelton, Ranjit Jhala, Sorin Lerner_ International Conference on Software Engineering (ICSE) Introduced complete dependency solver using SAT, pseudo-boolean optimization, and Integer Linear Programming. OPIUM guarantees completeness and optimizes user-defined objectives. Demonstrated 23.3% of Debian users encounter apt-get’s incompleteness failures. **Automated dependency resolution for open source software** (2010) _Joel Ossher, Sushil Bajracharya, Cristina Lopes_ IEEE Working Conference on Mining Software Repositories (MSR) Proposed techniques for automatically resolving dependencies in open source projects by mining and analyzing source code repositories, addressing challenges when dependency metadata is incomplete or unavailable. **Handling software upgradeability problems with MILP solvers** (2010) _Claude Michel, Michel Rueher_ International Workshop on Logics for Component Configuration (LoCoCo) Demonstrated how Mixed Integer Linear Programming solvers can handle package upgradeability problems, offering an alternative to SAT-based approaches with different performance characteristics. **Solving Linux Upgradeability Problems Using Boolean Optimization** (2010) _Josep Argelich, Daniel Le Berre, Inês Lynce, João P. Marques Silva, Pascal Rapicault_ International Workshop on Logics for Component Configuration (LoCoCo) Applied pseudo-boolean optimization techniques to Linux package upgradeability, showing how boolean optimization can find optimal solutions while respecting user preferences. **Dependency solving: A separate concern in component evolution management** (2012) _Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli_ Journal of Systems and Software (JSS) Argued for modular package manager architecture where dependency solving separates from other concerns. 
Reviewed state-of-the-art package managers and proposed generic external solvers (SAT, PBO, MILP) rather than ad-hoc heuristics. **Modelling and Resolving Software Dependencies** (2005) _Daniel Burrows_ Technical Report Presented abstract model of dependency relationships and restartable best-first-search technique for dependency resolution. Documents theoretical approach behind aptitude’s problem resolver. **Dependency Solving Is Still Hard, but We Are Getting Better at It** (2020) _Pietro Abate, Roberto Di Cosmo, Georgios Gousios, Stefano Zacchiroli_ IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Retrospective analysis conducting census of dependency solving capabilities in state-of-the-art package managers, showing SAT-based approaches are gaining adoption. Demonstrated that despite NP-completeness, practical solvers perform well on real-world instances. **aspcud: A Linux Package Configuration Tool Based on Answer Set Programming** (2011) _Martin Gebser, Roland Kaminski, Torsten Schaub_ Electronic Proceedings in Theoretical Computer Science Introduced aspcud, a dependency solver using Answer Set Programming rather than SAT or MILP. Demonstrates ASP as a viable alternative for package configuration, with declarative specification of optimization criteria and competitive performance on Debian package problems. **On software component co-installability** (2011) _Roberto Di Cosmo, Jérôme Vouillon_ SIGSOFT Symposium on the Foundations of Software Engineering (FSE) Addressed fundamental challenge of determining which software components can be installed together, developing formal framework with graph-theoretic transformations to simplify dependency repositories while preserving co-installability properties. 
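Several entries in this section treat package installation as a Boolean satisfiability problem. The sketch below shows the general idea on a toy repository: each package is a true/false choice, a dependency is a clause requiring at least one of its alternatives, and a conflict forbids two packages together. It is not the encoding from any of the papers; the package names and the `find_install_set` helper are hypothetical, and exhaustive search stands in for a real SAT, PBO, or MILP solver.

```python
from itertools import combinations


def satisfies(chosen, depends, conflicts):
    """Check every dependency clause and conflict pair against `chosen`."""
    for pkg in chosen:
        for alternatives in depends.get(pkg, []):
            # Each clause needs at least one alternative installed.
            if not any(alt in chosen for alt in alternatives):
                return False
    return not any(a in chosen and b in chosen for a, b in conflicts)


def find_install_set(target, packages, depends, conflicts):
    """Smallest set containing `target` that satisfies all constraints,
    found by trying subsets in order of increasing size."""
    others = [p for p in packages if p != target]
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            chosen = {target, *extra}
            if satisfies(chosen, depends, conflicts):
                return chosen
    return None


# Hypothetical toy repository.
packages = ["app", "curl", "libssl1", "libssl3", "legacy"]
depends = {
    "app": [["curl"], ["libssl1", "libssl3"]],  # curl AND (libssl1 OR libssl3)
    "curl": [["libssl3"]],
    "legacy": [["libssl1"], ["curl"]],
}
conflicts = [("libssl1", "libssl3")]

print(find_install_set("app", packages, depends, conflicts))
print(find_install_set("legacy", packages, depends, conflicts))  # None
```

The search is exponential in the number of packages, which is the practical face of the NP-completeness result; the solvers surveyed above exist precisely to avoid enumerating subsets like this.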
**Strong dependencies between software components** (2009) _Pietro Abate, Roberto Di Cosmo, Jaap Boender, Stefano Zacchiroli_ ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) Studied strong dependency relationships where packages are tightly coupled, analyzing patterns of mandatory co-installation and implications for system evolution. **Watchman: Monitoring dependency conflicts for python library ecosystem** (2020) _Ying Wang, Ming Wen, Yepang Liu, Yibo Wang, Zhenming Li, Chao Wang, Hai Yu, Shing-Chi Cheung, Chang Xu, Zhiliang Zhu_ IEEE/ACM International Conference on Software Engineering (ICSE) Identifies factors leading to dependency conflicts in Python ecosystem and proposes monitoring approach for detecting conflicts. **smartpip: A smart approach to resolving python dependency conflict issues** (2023) _Chenyang Wang, Rongxin Wu, Haoran Song, Junjie Shu, Guozhu Li_ IEEE/ACM International Conference on Automated Software Engineering (ASE) Highlights issues related to inefficiency and excessive resource usage by dependency resolution strategies in Python, proposing improved resolution approach. **ConflictJS: Finding and understanding conflicts between javascript libraries** (2018) _Jibesh Patra, Pooja N. Dixit, Michael Pradel_ IEEE/ACM International Conference on Software Engineering (ICSE) Analyzes dependency conflicts in JavaScript arising from namespace collisions, proposing detection and understanding mechanisms. **Could I Have a Stack Trace to Examine the Dependency Conflict Issue?** (2019) _Ying Wang, Ming Wen, Rongxin Wu, Zhenwei Liu, Shin Hwei Tan, Zhiliang Zhu, Hai Yu, Shing-Chi Cheung_ IEEE/ACM International Conference on Software Engineering (ICSE) Proposes approach to help developers diagnose dependency conflicts in Java/Maven by generating stack traces that reveal how conflicts manifest at runtime, making abstract version incompatibilities concrete and actionable. 
**Hero: On the chaos when path meets modules** (2021) _Ying Wang, Liang Qiao, Chang Xu, Yepang Liu, Shing-Chi Cheung, Na Meng, Hai Yu, Zhiliang Zhu_ IEEE/ACM International Conference on Software Engineering (ICSE) Studies conflicts in Go ecosystem caused by coexistence of two library referencing modes: GOPATH and Go modules. **Stork: Secure Package Management For VM Environments** (2008) _Justin Cappos_ Dissertation (University of Arizona) – Chapter 3.8 Describes backtracking dependency resolution. In contrast to more mathematically advanced techniques, this tries the best match greedily for each package and then rewinds state if there is a conflict. Through practical use in Stork, this was found to work well for adopters, despite its simplicity. **Solving Package Management via Hypergraph Dependency Resolution** (2025) _Ryan Gibb, Patrick Ferris, David Allsopp, Michael Winston Dales, Mark Elvers, Thomas Gazagnaire, Sadiq Jaffer, Thomas Leonard, Jon Ludlam, Anil Madhavapeddy_ arXiv preprint Introduces HyperRes, a formal framework modeling dependencies as hypergraphs to address fragmentation across package managers. Demonstrates translation of metadata between different package managers and solving dependency constraints across ecosystems without forcing users to abandon their preferred tools. **Using Answer Set Programming for HPC Dependency Solving** (2022) _Todd Gamblin, Massimiliano Culpo, Gregory Becker, Sergei Shudler_ Supercomputing Describes the ASP encoding used for Spack’s dependency solver: how to model versions, variants, and dependencies. Also describes how to structure optimization criteria to mix source and binary builds by reusing existing installations/build caches (if they’re compatible).
**Bridging the Gap Between Binary and Source Based Package Management in Spack** (2025) _John Gouwar, Greg Becker, Tamara Dahlgren, Nathan Hanford, Arjun Guha, and Todd Gamblin_ Supercomputing Discusses some differences between source and binary package solving. Describes how to avoid the rigid ABI requirements of Spack’s (and Nix’s and Guix’s) hashing model and not rebuild the world when an ABI-stable package like zlib changes, while preserving reproducibility for mixed (or “impure” in nix-speak) installations. ## Software Supply Chain Security Research on supply chain attacks, detection methods, and prevention frameworks. **in-toto: Providing Farm-to-Table Guarantees for Bits and Bytes** (2019) _Santiago Torres-Arias, Hammad Afzali, Trishank Kuppusamy, Reza Curtmola, Justin Cappos_ USENIX Security Symposium Presented framework for securing the entire software supply chain from development to deployment using cryptographic metadata. Analyzed 30 major supply chain attacks and demonstrated in-toto would have prevented 23 (77%) outright. Deployed at Datadog, Debian, and Kubernetes. **Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks** (2020) _Marc Ohm, Henrik Plate, Arnold Sykosch, Michael Meier_ International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA) Presented dataset and analysis of 174 malicious packages from npm, PyPI, and RubyGems used in real-world attacks between November 2015 and November 2019. Introduced attack trees categorizing injection techniques and execution triggers.
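The greedy backtracking strategy summarized in the Stork entry above (try the best match for each package, rewind on conflict) can be sketched in a few lines. This is an illustrative toy and not Stork's implementation: the repository layout, the integer version numbers, and the `resolve` function are all assumptions made for the example.

```python
def resolve(requirements, repo, chosen=None):
    """Greedy backtracking resolver (illustrative sketch).

    requirements: list of (name, allowed_versions) pairs still to satisfy.
    repo: name -> {version: {dependency_name: allowed_versions}}.
    Returns a dict name -> version, or None if no consistent choice exists.
    """
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already picked earlier: accept only if that pick is compatible.
        return resolve(rest, repo, chosen) if chosen[name] in allowed else None
    # Greedy step: try the newest acceptable version first.
    for version in sorted(repo.get(name, {}), reverse=True):
        if version not in allowed:
            continue
        trial = {**chosen, name: version}
        deps = list(repo[name][version].items())
        result = resolve(deps + rest, repo, trial)
        if result is not None:
            return result
        # A conflict surfaced further down: drop `trial` (the rewind)
        # and fall through to the next-best version.
    return None


# Hypothetical repository: both versions of tool require lib 1.
repo = {
    "app": {1: {"lib": {1, 2}, "tool": {1, 2}}},
    "lib": {1: {}, 2: {}},
    "tool": {1: {"lib": {1}}, 2: {"lib": {1}}},
}

print(resolve([("app", {1})], repo))  # {'app': 1, 'lib': 1, 'tool': 2}
print(resolve([("lib", {3})], repo))  # None
```

In the first call the resolver greedily picks lib 2, finds that no version of tool accepts it, rewinds, and settles on lib 1. Unlike the SAT and ASP approaches listed earlier, this makes no attempt to optimize across the whole solution.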
**Towards Measuring Supply Chain Attacks on Package Managers for Interpreted Languages** (2020) _Ruian Duan, Omar Alrawi, Ranjita Pai Kasturi, Ryan Elder, Brendan Saltaformaggio, Wenke Lee_ arXiv preprint Proposed comparative framework for assessing security features of package managers for interpreted languages. Developed MalOSS pipeline for automated malware detection, finding and reporting 339 new malicious packages, with 278 (82%) confirmed by maintainers. **SoK: Taxonomy of Attacks on Open-Source Software Supply Chains** (2023) _Piergiorgio Ladisa, Henrik Plate, Matias Martinez, Olivier Barais_ IEEE Symposium on Security and Privacy (S&P) Systematized knowledge about attacks on open-source software supply chains, proposing taxonomy independent of specific languages or ecosystems. Identified 12 distinct attack categories and analyzed their prevalence. **SoK: Analysis of Software Supply Chain Security by Establishing Secure Design Properties** (2022) _Chinenye Okafor, James Davis, et al._ ACM Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses (SCORED) Systematized knowledge about secure software supply chain patterns, identifying four stages of supply chain attacks and proposing three security properties: transparency, validity, and separation. **Research directions in software supply chain security** (2025) _Laurie Williams, Grace Benedetti, Samuel Hamer, Ranindya Paramitha, Imranur Rahman, Mahzabin Tamanna, Gabriel Tystahl, Nusrat Zahan, Patrick Morrison, Yasemin Acar, Michel Cukier, Christian Kästner, Alexandros Kapravelos, Dominik Wermke, William Enck_ ACM Transactions on Software Engineering and Methodology (TOSEM) Survey identifying key research directions in software supply chain security including dependency management, vulnerability detection, and trust models across package ecosystems. 
**Modeling Interconnected Social and Technical Risks in Open Source Software Ecosystems** (2022) _William Schueller, Johannes Wachs_ arXiv preprint Examines how social and technical factors interact to create systemic risks in open source ecosystems. Developers often maintain multiple interdependent libraries, meaning individual departures can cascade failures across projects. Develops a framework measuring risk based on both dependency networks and developer involvement, applied to the Rust ecosystem. **Out of Sight, Out of Mind? How Vulnerable Dependencies Affect Open-Source Projects** (2021) _Gede Artha Azriadi Prana, Abhishek Sharma, Lwin Khin Shar, Darius Foo, Andrew Santosa, Asankhaya Sharma, David Lo_ Empirical Software Engineering Analyzed vulnerabilities in 450 Java, Python, and Ruby projects using industrial SCA tool. Found vulnerabilities persist 3-5 months after fixes become available. Highlights importance of managing dependency count and performing timely updates. **Software Supply Chain: Review of Attacks, Risk Assessment Strategies and Security Controls** (2023) _Betul Gokkaya, et al._ arXiv preprint Systematic literature review analyzing common software supply chain attacks and providing latest trends. Identified security risks for open-source and third-party software supply chains. **Challenges of Producing Software Bill Of Materials for Java** (2023) _Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, César Soto-Valero, Martin Wittlinger_ arXiv preprint Evaluated six SBOM generation tools on complex open-source Java projects, identifying hard challenges for accurate SBOM production and usage in software supply chain security contexts. 
**On the way to sboms: Investigating design issues and solutions in practice** (2024) _Tingting Bi, Boming Xia, Zhenchang Xing, Qinghua Lu, Liming Zhu_ ACM Transactions on Software Engineering and Methodology (TOSEM) Investigates SBOM design issues and solutions, noting lockfiles as related to SBOM generation. **On the correctness of metadata-based sbom generation: A differential analysis approach** (2024) _Songqiang Yu, Wei Song, Xiaolong Hu, Heng Yin_ IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) Differential analysis evaluating correctness of SBOM generation from metadata, using lockfiles as source of truth for comparison. **SBOM.EXE: Countering Dynamic Code Injection based on Software Bill of Materials in Java** (2024) _Aman Sharma, Martin Wittlinger, Benoit Baudry, Martin Monperrus_ arXiv preprint Proposes a runtime defense mechanism for Java applications that constructs an allowlist of legitimate classes using complete software supply chain information, then enforces this list during execution to block unauthorized classes. Tested against critical vulnerabilities including Log4Shell-style threats with minimal performance impact. **Dirty-Waters: Detecting Software Supply Chain Smells** (2024) _Raphina Liu, Sofia Bobadilla, Benoit Baudry, Martin Monperrus_ arXiv preprint Introduces “software supply chain smell” as a novel concept for identifying problematic dependency patterns. Presents Dirty-Waters tool for detecting these smells in JavaScript projects, finding many patterns that reveal potential supply chain risks. **LastPyMile: Identifying the Discrepancy Between Sources and Packages** (2021) _Duc-Ly Vu, Fabio Massacci, Ivan Pashchenko, Henrik Plate, Antonino Sabetta_ ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) Proposed methodology for identifying discrepancies between source code repositories (GitHub) and distributed packages (PyPI). 
Analyzed 2,438 popular PyPI packages, finding on average 5.8% of artifacts and 2.6% of files have changes. **Towards Using Source Code Repositories to Identify Software Supply Chain Attacks** (2020) _Duc-Ly Vu, Ivan Pashchenko, Fabio Massacci, Henrik Plate, Antonino Sabetta_ ACM Conference on Computer and Communications Security (CCS) Earlier work exploring use of source code repository analysis for detecting supply chain attacks, establishing foundation for LastPyMile approach by identifying that attackers inject minimal code changes. **Software Composition Analysis and Supply Chain Security in Apache Projects: An Empirical Study** (2025) _Sabato Nocera, Sira Vegas, Giuseppe Scanniello, Natalia Juristo_ International Conference on Mining Software Repositories (MSR) Investigated effects of adopting OWASP Dependency-Check (SCA tool) in Apache Software Foundation Java Maven projects. Found adoption causes significant reduction in vulnerabilities including high-severity CVEs. ## Package Repository Analysis and Ecosystems Large-scale empirical studies of package ecosystems and their structural properties. **A Look at the Dynamics of the JavaScript Package Ecosystem** (2016) _Erik Wittern, Philippe Suter, Shriram Rajagopalan_ International Conference on Mining Software Repositories (MSR) First analysis of npm ecosystem examining package descriptions, dependencies, download metrics, and historical evolution. Analyzed 230,000+ packages over 6 years. **npm-follower: A Complete Dataset Tracking the NPM Ecosystem** (2023) _Donald Pinckney, Federico Cassano, Arjun Guha, Jonathan Bell_ ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE) Introduced dataset architecture that archives metadata and code of all npm packages as published, including deleted versions (330,000+ versions deleted between July 2022-May 2023). 
**npm-miner: An Infrastructure for Measuring the Quality of the npm Registry** (2018) _Kyriakos Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos, Andreas Symeonidis_ International Conference on Mining Software Repositories (MSR) Infrastructure that crawls npm and analyzes packages using static analysis to extract quality metrics including maintainability and security. Identified ecosystem issues like packages with broken GitHub URLs and copied-pasted projects with only package names changed. **On the accuracy of github’s dependency graph** (2024) _Daniele Bifolco, Sara Nocera, Simone Romano, Massimiliano Di Penta, Rita Francese, Giuseppe Scanniello_ International Conference on Evaluation and Assessment in Software Engineering (EASE) Assesses accuracy of GitHub dependency graph in Java and Python projects, using lockfiles as source of truth for comparison. **Understanding and Detecting Peer Dependency Resolving Loop in npm Ecosystem** (2025) _Xiaohui Wang, Mingyu Wang, Weijian Shen, Rui Chang_ IEEE/ACM International Conference on Software Engineering (ICSE) In-depth study of conflicts between peer dependencies in npm, examining how circular peer dependencies create resolution loops. **An Empirical Analysis of Technical Lag in npm Package Dependencies** (2018) _Ahmed Zerouali, Eleni Constantinou, Tom Mens, Gregorio Robles, Jesús M. González-Barahona_ International Conference on Software Reuse (ICSR) Introduced technical lag metric to assess how outdated packages are compared to latest releases, finding strong presence caused by dependency constraints indicating reluctance to update. **An Empirical Analysis of the Python Package Index (PyPI)** (2019) _Ethan Bommarito, Michael J. Bommarito II_ arXiv preprint Empirical summary covering 178,592 packages, 1,745,744 releases, 76,997 contributors, and 156.8M+ import statements. Found 47% CAGR for active packages, 39% for new authors. 
**Analyzing the Accessibility of GitHub Repositories for PyPI and NPM Libraries** (2024) _Alexandros Tsakpinis, Alexander Pretschner_ arXiv preprint Analyzed accessibility of GitHub repositories for libraries using page rank algorithm, finding up to 80.1% of PyPI and 81.1% of npm libraries have repository URLs within dependency chains. **A Study of Bloated Dependencies in the Maven Ecosystem** (2021) _César Soto-Valero, Nicolas Harrand, Martin Monperrus, Benoit Baudry_ Empirical Software Engineering Analyzed 9,639 Java artifacts with 723,444 dependency relationships using DepClean tool, finding that bloated dependencies significantly increase binary size and maintenance effort. **Goblin: A Framework for Enriching and Querying the Maven Central Dependency Graph** (2024) _Damien Jaime, Joyce El Haddad, Pascal Poizat_ International Conference on Mining Software Repositories (MSR) Introduced customizable framework comprising dependency graph metamodel with temporal information, miner for Maven Central, and tool for metric weaving. **The Ripple Effect of Vulnerabilities in Maven Central** (2025) _Multiple authors_ arXiv preprint Most recent large-scale Maven vulnerability study analyzing 4 million releases. Found only 1% of releases have direct vulnerabilities, but 46.8% are affected by transitive vulnerabilities. Patch time often spans several years even for critical vulnerabilities. Demonstrates more central artifacts are not necessarily less vulnerable. **Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven** (2025) _Piotr Przymus, Mikołaj Fejzer, Jakub Narębski, Krzysztof Rykaczewski, Krzysztof Stencel_ IEEE/ACM International Conference on Mining Software Repositories (MSR) Uses survival analysis to measure how long projects remain exposed after CVE introduction. Shows vulnerabilities at deeper dependency levels persist longer due to compounded resolution delays. Mean time to fix rises from 215 days at level 0 to 2,075 days at level 10. 
**How Deep Does Your Dependency Tree Go? An Empirical Study of Dependency Amplification Across 10 Package Ecosystems** (2025) _Jahidul Arafat_ arXiv preprint Studies dependency amplification (ratio of transitive to direct dependencies) across 500 projects in 10 ecosystems. Maven exhibits highest mean amplification at 24.7x compared to 4.3x for npm. Challenges prevailing assumptions that npm’s preference for small packages leads to highest amplification.

**Understanding Software Vulnerabilities in the Maven Ecosystem** (2025) _Multiple authors_ MSR 2025 Mining Challenge Vulnerability analysis of 77,393 vulnerable releases with 226 unique CWEs. Found 25 CWEs account for nearly 70% of all vulnerabilities. Vulnerabilities take approximately 5 years to document and 4.4 years to resolve on average. Input validation and access control issues dominate.

**Tracing Vulnerabilities in Maven: A Study of CVE lifecycles** (2025) _Corey Yang-Smith et al._ arXiv preprint Brand new lifecycle and response time analysis of 3,362 CVEs in Maven. Documents “Publish-Before-Patch” scenarios. Response time reduced 48.3% for critical vs low severity vulnerabilities (78 vs 151 days). Contributor absence and issue activity correlate with CVE occurrences.

**A Large-Scale Security-Oriented Static Analysis of Python Packages in PyPI** (2021) _Multiple authors_ arXiv preprint Largest static analysis of PyPI at time of publication, analyzing 197,000+ packages with 749,000+ security issues. Found 46% of Python packages have at least one security issue. Exception handling and code injections most common. Subprocess module identified as particularly problematic.

**An Empirical Analysis of the R Package Ecosystem** (2021) _Ethan Bommarito, Michael J. Bommarito II_ arXiv preprint Analysis of 25,000+ packages, 150,000 releases across CRAN, Bioconductor, and GitHub over two decades. Found top 5 packages imported by 25% of all packages, top 10 maintainers support packages imported by 50%+ of ecosystem.
**A Complex Network Analysis of the Comprehensive R Archive Network (CRAN) Package Ecosystem** (2020) _Multiple authors_ Journal of Systems and Software Applied complex network analysis to CRAN dependency graph from macroscopic, microscopic, and modular perspectives. Demonstrated how network theory helps profile ecosystem strengths, practices, and risks.

**Evolution and Prospects of the Comprehensive R Archive Network (CRAN) Package Ecosystem** (2020) _Marcelino Mora-Cantallops, Salvador Sánchez-Alonso, Elena García-Barriocanal_ Journal of Software: Evolution and Process 20-year empirical analysis of CRAN evolution considering laws of software evolution and CRAN policies. Found progress consistent with continuous growth/change laws but relevant increase in complexity in recent years.

**An Empirical Comparison of Dependency Network Evolution in Seven Software Packaging Ecosystems** (2019) _Alexandre Decan, Tom Mens, Philippe Grosjean_ Empirical Software Engineering Quantitative analysis of seven packaging ecosystems (Cargo, CPAN, CRAN, npm, NuGet, Packagist, RubyGems) using libraries.io dataset. Demonstrated important structural differences that complicate cross-ecosystem generalization.

**The Multibillion Dollar Software Supply Chain of Ethereum** (2022) _César Soto-Valero, Martin Monperrus, Benoit Baudry_ arXiv preprint Examines how Java Ethereum nodes depend on third-party software maintained by various organizations, analyzing the supply chain supporting blockchain infrastructure and highlighting reliability and security challenges from diverse external dependencies.

**A Closer Look at the Security Risks in the Rust Ecosystem** (2024) _Multiple authors_ ACM Transactions on Software Engineering and Methodology (TOSEM) First security investigation of Rust ecosystem. Analyzed dataset of 433 vulnerabilities across 300 vulnerable code repositories. Found vulnerable code is localized at file level and contains significantly more unsafe functions/blocks.
More popular packages have more vulnerabilities, while less popular packages remain vulnerable for more versions.

**An empirical study of yanked releases in the rust package registry** (2023) _Hao Li, Filipe Cogo, Cor-Paul Bezemer_ IEEE Transactions on Software Engineering Reveals that 46% of Rust packages adopted yanked releases and the proportion keeps increasing. In Cargo, yanked releases can only be resolved if a lockfile is present.

**Evolving collaboration, dependencies, and use in the rust open source software ecosystem** (2022) _William Schueller, Johannes Wachs, Vito D.P. Servedio, Stefan Thurner, Vittorio Loreto_ Scientific Data Dataset curating Rust ecosystem data over eight years, capturing developer activity, library dependencies, and usage trends.

**Why do software packages conflict?** (2012) _Cyrille Artho, Roberto Di Cosmo, Kuniyasu Suzaki, Stefano Zacchiroli_ IEEE Working Conference on Mining Software Repositories (MSR) Empirical investigation of root causes of package conflicts in Debian ecosystem, categorizing conflict types and their frequencies.

**Are There Too Many R Packages?** (2012) _Multiple authors_ Austrian Journal of Statistics Analysis questioning the growth and sustainability of the R package ecosystem.

**The Evolution of the R Software Ecosystem** (2013) _Multiple authors including Ahmed E. Hassan_ Academic publication Historical analysis of R ecosystem evolution and growth patterns.

**On the Maintainability of CRAN Packages** (2014) _Tom Mens et al._ Academic publication Study examining maintainability challenges in the CRAN ecosystem.

**On the Development and Distribution of R Packages: An Empirical Analysis of the R Ecosystem** (2015) _Multiple authors_ Academic publication Empirical analysis of R package development and distribution patterns.
**When GitHub meets CRAN: An analysis of inter-repository package dependency problems** (2016) _Multiple authors_ IEEE conference proceedings Analysis of dependency problems arising from packages split between GitHub and CRAN.

## Version Constraints and Semantic Versioning

Research on versioning practices, semantic versioning adoption, and breaking changes.

**Dependency Versioning in the Wild** (2019) _Jens Dietrich, David Pearce, Jacob Stringer, Amjed Tahir, Kelly Blincoe_ International Conference on Mining Software Repositories (MSR) Large-scale empirical study of versioning practices across 17 package managers, analyzing over 70 million dependencies, complemented by survey of 170 developers. Found many package managers support flexible versioning but developers struggle to balance predictability and agility.

**What do package dependencies tell us about semantic versioning?** (2021) _Alexandre Decan, Tom Mens_ IEEE Transactions on Software Engineering Analyzed relationship between dependency declarations and semantic versioning across multiple package ecosystems, revealing disconnect between versioning theory and developer practices.

**Technical Lag in Software Compilations: Measuring How Outdated a Software Deployment Is** (2017) _Jesús M. González-Barahona, Paul Sherwood, Gregorio Robles, Daniel Izquierdo_ IFIP International Conference on Open Source Systems (OSS) Introduces the concept of technical lag for measuring how outdated a deployed system is. Proposes theoretical model to assist decisions about upgrading in production, balancing being up-to-date against keeping working versions.

**A Formal Framework for Measuring Technical Lag in Component Repositories** (2019) _Ahmed Zerouali, Tom Mens, Jesús González-Barahona, Alexandre Decan, Eleni Constantinou, Gregorio Robles_ Journal of Software: Evolution and Process Formalizes a generic model of technical lag quantifying how outdated a deployed collection of components is.
Operationalizes the model for npm and analyzes 500K+ packages over seven years, considering direct and transitive dependencies.

**On the Evolution of Technical Lag in the npm Package Dependency Network** (2018) _Alexandre Decan, Tom Mens, Eleni Constantinou_ IEEE International Conference on Software Maintenance and Evolution (ICSME) Studied technical lag (outdatedness of dependencies) in npm ecosystem, examining tension between stability and freshness in dependency management.

**Understanding Breaking Changes in the Wild** (2023) _Dhanushka Jayasuriya, Valerio Terragni, Jens Dietrich, Samuel Ou, Kelly Blincoe_ ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) Empirical study finding 11.58% of dependency updates contain breaking changes that impact clients. Almost half of detected breaking changes violate semantic versioning by appearing in non-major releases.

**Breaking-Good: Explaining Breaking Dependency Updates with Build Analysis** (2024) _Frank Reyes, Benoit Baudry, Martin Monperrus_ arXiv preprint Automated tool that generates explanations for compilation errors caused by incompatible dependency version changes. Analyzes logs and dependency trees to identify root causes across direct/indirect dependencies, Java version conflicts, and configuration issues. Successfully identified causes for 70% of 243 real breaking updates.

**Bump: A benchmark of reproducible breaking dependency updates** (2024) _Frank Reyes, Yogya Gamage, Gabriel Skoglund, Benoit Baudry, Martin Monperrus_ IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Benchmark dataset of reproducible breaking dependency updates for evaluating tools that detect or explain breaking changes in dependency updates.
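Several of the studies above count a breaking change as a semantic versioning violation when it ships in a non-major release. A minimal sketch of that rule follows, assuming plain `MAJOR.MINOR.PATCH` strings; the function names are hypothetical, pre-release tags are not handled, and the special case for `0.y.z` versions (where semver allows anything to change) is deliberately ignored.

```python
def parse(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release support)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def bump_kind(old, new):
    """Classify an upgrade as 'major', 'minor', or 'patch'."""
    o, n = parse(old), parse(new)
    if n <= o:
        raise ValueError(f"{new} is not an upgrade from {old}")
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"

def violates_semver(old, new, has_breaking_change):
    """Under semver, a breaking change is only permitted in a major bump.

    Assumption: the 0.y.z exemption is not modeled here.
    """
    return has_breaking_change and bump_kind(old, new) != "major"

print(bump_kind("1.2.3", "1.3.0"))              # minor
print(violates_semver("1.2.3", "1.3.0", True))  # True
print(violates_semver("1.2.3", "2.0.0", True))  # False
```

Deciding whether a release actually contains a breaking change is the hard part that the listed papers address with API diffing, build analysis, or client test suites; the sketch takes that answer as an input.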
**I depended on you and you broke me: An empirical study of manifesting breaking changes in client packages** (2023) _Daniel Venturini, Filipe Cogo, Igor Polato, Marco Gerosa, Igor Wiese_ ACM Transactions on Software Engineering and Methodology (TOSEM) Quantitative evaluation of the impact of breaking updates on dependent packages in npm, examining how breaking changes manifest and propagate through the ecosystem.

**Semantic Versioning versus Breaking Changes: A Study of the Maven Repository** (2014, 2017) _Steven Raemaekers, Arie van Deursen, Joost Visser_ IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM) / Journal of Systems and Software Analyzed 100,000+ JAR files from Maven Central over 7 years covering 22,000+ libraries. Found approximately one-third of all releases introduce breaking changes, often violating semantic versioning conventions.

**Breaking Bad? Semantic Versioning and Impact of Breaking Changes in Maven Central** (2021) _Lina Ochoa, Thomas Degueule, Jean-Rémy Falleri, Jurgen Vinju_ Empirical Software Engineering External replication of Raemaekers et al. with different findings: 83.4% of upgrades comply with semver regarding backwards compatibility. Found most breaking changes affect code not used by any client, and only 7.9% of clients are affected by breaking changes.

**How Java APIs Break – An Empirical Study** (2015) _Kamil Jezek, Jens Dietrich, Premek Brada_ Information and Software Technology Study of 109 Java open-source programs and 564 versions showing APIs are commonly unstable. Analyzes patterns of API breaking changes and their impact on dependent systems.

**Why and How Java Developers Break APIs** (2018) _Aline Brito, Laerte Xavier, André Hora, Marco Tulio Valente_ IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Four-month field study with developers of 400 popular Java libraries.
Found breaking changes are mostly motivated by implementing new features, simplifying APIs, and improving maintainability. Developers rarely deprecate elements before changes due to maintenance overhead.

**Has My Release Disobeyed Semantic Versioning? Static Detection Based on Semantic Differencing** (2022) _Lyuye Zhang, Chengwei Liu, Zhengzi Xu, Sen Chen, Lingling Fan, Bihuan Chen, Yang Liu_ IEEE/ACM International Conference on Automated Software Engineering (ASE) - Distinguished Paper Award Addresses semantic breaking where APIs have identical signatures but inconsistent semantics. Proposes Sembid tool achieving 90.26% recall. Empirical study on 1.6M APIs found 2-4x more semantic breaking than signature-based issues.

**When and How to Make Breaking Changes: Policies and Practices in 18 Open Source Software Ecosystems** (2021) _Chris Bogart, Christian Kästner, James Herbsleb, Ferdian Thung_ ACM Transactions on Software Engineering and Methodology (TOSEM) Comparative study of breaking change policies across 18 ecosystems combining repository mining, document analysis, and large-scale survey. Found practices and values are cohesive within ecosystems but diverse across them. Eclipse’s “prime directive” never permits breaking changes; other ecosystems balance differently.

**Possible directions for improving dependency versioning in R** (2013) _Multiple authors_ arXiv preprint Proposal for improving version handling in the R ecosystem.

## Package Manager Design and Architecture

Research on package manager design principles, architectures, and implementation.

**LUDE: A Distributed Software Library** (1993) _Multiple authors_ USENIX LISA Early distributed software library system.

**The Comprehensive TeX Archive Network** (1993) _Multiple authors_ TUGboat Description of CTAN, one of the earliest package repositories.
**Nix: A Safe and Policy-Free System for Software Deployment** (2004) _Eelco Dolstra, Merijn de Jonge, Eelco Visser_ USENIX LISA Introduced Nix, a purely functional package manager with unique approach to dependency management. Packages are stored in isolation from each other using cryptographic hashes, preventing dependency conflicts and enabling atomic upgrades and rollbacks.

**The Purely Functional Software Deployment Model** (2006) _Eelco Dolstra_ PhD Thesis, Utrecht University The comprehensive treatment of functional package management that the LISA paper summarizes. Develops the theoretical foundations for treating software deployment as a pure function from inputs to outputs, where the cryptographic hash of all build inputs determines the output path. Covers the Nix expression language, the store model, and techniques for achieving reproducible builds.

**An adaptive package management system for Scheme** (2007) _Erick Gallesio et al._ Academic publication Adaptive package management approach for Scheme programming language.

**NixOS: a purely functional Linux distribution** (2008) _Eelco Dolstra, Andres Löh_ ACM SIGPLAN International Conference on Functional Programming (ICFP) Description of NixOS, a Linux distribution built on Nix package manager. Extends functional package management to system configuration, treating the entire operating system as a function from a declarative specification to a running system.

**Functional Package Management with Guix** (2013) _Ludovic Courtès_ European Lisp Symposium Introduces GNU Guix, a purely functional package manager building on Nix’s deployment model but using Scheme as its implementation and extension language. Demonstrates how an embedded domain-specific language for package definitions allows users to benefit from a general-purpose programming language while maintaining the reproducibility guarantees of functional package management.
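The core idea in the Nix and Guix entries above, that a hash of all build inputs determines where a package is stored, can be shown in a few lines. This is an illustrative sketch only and not the scheme Nix or Guix actually implement (both use their own derivation formats and hash encodings); `STORE_ROOT`, `store_path`, and the input fields are hypothetical names chosen for the example.

```python
import hashlib
import json

STORE_ROOT = "/store"  # hypothetical; not the real Nix or Guix store layout

def store_path(name, version, inputs):
    """Derive an output path from a digest of every declared build input.

    Identical inputs always map to the same path; changing any input
    (source hash, a dependency's path, a build flag) yields a new path.
    """
    canonical = json.dumps(
        {"name": name, "version": version, "inputs": inputs},
        sort_keys=True,  # stable serialization so the digest is deterministic
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
    return f"{STORE_ROOT}/{digest}-{name}-{version}"

# Two builds of the same package against different versions of a dependency.
zlib_a = store_path("zlib", "1.3", {"source": "aaa111", "flags": []})
zlib_b = store_path("zlib", "1.3.1", {"source": "bbb222", "flags": []})

curl_a = store_path("curl", "8.0", {"source": "ccc333", "deps": [zlib_a]})
curl_b = store_path("curl", "8.0", {"source": "ccc333", "deps": [zlib_b]})

print(curl_a != curl_b)  # True: same curl source, different dependency, different path
```

Because a dependency's path is itself one of the inputs, a change anywhere in the dependency closure propagates into every dependent path, which is what lets multiple variants coexist without conflict.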
**Reproducible and User-Controlled Software Environments in HPC with Guix** (2015) _Ludovic Courtès, Ricardo Wurmus_ International Conference on High Performance Computing (ISC) Addresses how HPC support teams struggle to balance conservative system administration with user demands for up-to-date tools. Presents GNU Guix as a solution allowing unprivileged users to install and manage their own software environments while maintaining reproducibility, without requiring root access or containers.

**The Comprehensive R Archive Network** (2012) _Multiple authors_ Wiley Interdisciplinary Reviews Detailed description of CRAN architecture and design.

**EasyBuild: Building Software With Ease** (2012) _Multiple authors_ PyHPC Workshop Framework for building and installing scientific software.

**maintaineR: A web-based dashboard for maintainers of CRAN packages** (2014) _Multiple authors_ ICSME Tool Demo Tool for CRAN package maintainers.

**The Spack Package Manager: Bringing Order to HPC Software Chaos** (2015) _Todd Gamblin, Matthew LeGendre, Michael R. Collette, Gregory L. Lee, Adam Moody, Bronis R. de Supinski, Scott Futral_ Supercomputing Package manager designed for HPC environments.

**SPAM: a Secure Package Manager** (2017) _Multiple authors_ Academic publication Design for a security-focused package manager.

**Managing the Complexity of Large Free and Open Source Package-Based Software Distributions** (2006) _Multiple authors_ ASE Analysis of complexity challenges in large package distributions.

**Toward Decentralized Package Management** (2011) _Multiple authors_ Academic publication Proposal for decentralized package management approaches.

**MPM: a modular package manager** (2011) _Multiple authors_ ACM publication Design of a modular package manager architecture.

**A modular package manager architecture** (2013) _Roberto Di Cosmo et al._ Technical report Detailed architecture for modular package managers.
**Towards efficient optimization in package management systems** (2014) _Alexey Ignatiev et al._ Academic publication Approaches for optimizing package management operations.

**Flexible and optimal dependency management via max-smt** (2023) _Donald Pinckney, Federico Cassano, Arjun Guha, Jonathan Bell, Massimiliano Culpo, Todd Gamblin_ IEEE/ACM International Conference on Software Engineering (ICSE) Introduced unified framework built on Max-SMT solvers to resolve dependencies more systematically, moving beyond ad-hoc algorithms. Demonstrates practical solvers can handle real-world dependency resolution with formal guarantees.

**Automatic Software Dependency Management using Blockchain** (2018) _Gavin D’Mello_ Technical report Exploration of blockchain for dependency management.

**PubGrub: Next-Generation Version Solving** (2018) _Natalie Weizenbaum_ Medium article Description of PubGrub algorithm used in Dart’s pub package manager.

**Contour: A Practical System for Binary Transparency** (2018) _Multiple authors_ Academic publication System for binary transparency in software distribution.

## Software Distribution Systems

Research on secure software update systems and distribution frameworks.

**Survivable Key Compromise in Software Update Systems** (2010) _Justin Samuel, Nick Mathewson, Justin Cappos, Roger Dingledine_ ACM Conference on Computer and Communications Security (CCS) Introduced The Update Framework (TUF), a secure software update system that remains secure even when repository keys are compromised. TUF uses role separation, threshold signatures, and offline keys. Led to adoption by Docker, Python, and automotive update systems.

**Diplomat: Using Delegations to Protect Community Repositories** (2016) _Trishank Karthik Kuppusamy, Santiago Torres-Arias, Vladimir Diaz, Justin Cappos_ USENIX Symposium on Networked Systems Design and Implementation (NSDI) Extended TUF to work efficiently with large community repositories like PyPI and RubyGems.
Introduced delegation mechanisms allowing package repositories to scale to hundreds of thousands of packages while maintaining security guarantees.

**CHAINIAC: Proactive Software-Update Transparency via Collectively Signed Skipchains and Verified Builds** (2017) _Kirill Nikitin, Eleftherios Kokoris-Kogias, Philipp Jovanovic, Nicolas Gailly, Linus Gasser, Ismail Khoffi, Justin Cappos, Bryan Ford_ USENIX Security Symposium Proposed decentralized software-update framework eliminating single points of failure through independent witness servers. Evaluation shows clients achieve security comparable to verifying every update while consuming only one-fifth of the bandwidth.

**Uptane: Securing Software Updates for Automobiles** (2016, 2018) _Trishank Karthik Kuppusamy, Akshay Dua, Russ Bielawski, Cameron Mott, Sam Lauzon, Andre Weimerskirch, Akan Brown, Sebastien Awwad, Damon McCoy, Justin Cappos_ escar Europe / IEEE Vehicular Technology Magazine First software update framework for automobiles capable of resisting nation-state level attacks. Based on TUF but adapted for automotive constraints. Became IEEE/ISTO standard in 2019.

**Your Firmware Has Arrived: A Study of Firmware Update Vulnerabilities** (2024) _Yuhao Wu, Jinwen Wang, Yujie Wang, Shixuan Zhai, Zihan Li, Yi He, Kun Sun, Qi Li, Ning Zhang_ USENIX Security Symposium Proposed ChkUp tool to detect firmware update vulnerabilities by resolving program execution paths. Analyzing 12,000 firmware images, identifies vulnerabilities stemming from incomplete or incorrect verification steps.

**Formal Security Analysis of Electronic Software Distribution Systems** (2008) _M. Maidl, D. von Oheimb, P. Hartmann, R. Robinson_ International Conference on Computer Safety, Reliability, and Security (SAFECOMP) Introduced software distribution system architecture with generic core component for secure software transport. Used formal methods to validate system security for critical embedded systems.
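The threshold-signature idea attributed to TUF earlier in this section, where metadata is trusted only if enough distinct keys for a role have signed it, can be sketched as follows. This is a simplified illustration and not TUF's metadata format or API: it substitutes HMAC-SHA256 with shared secrets for real public-key signatures so that it runs on the standard library alone, and every name in it (`sign`, `meets_threshold`, the key IDs) is hypothetical.

```python
import hashlib
import hmac

def sign(key: bytes, payload: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 (real systems use public-key signatures)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def meets_threshold(payload, signatures, trusted_keys, threshold):
    """Accept payload only if at least `threshold` distinct trusted keys signed it.

    signatures:   {key_id: hex signature} attached to the metadata
    trusted_keys: {key_id: key bytes} the client already trusts for this role
    """
    valid = set()
    for key_id, sig in signatures.items():
        key = trusted_keys.get(key_id)
        if key is not None and hmac.compare_digest(sign(key, payload), sig):
            valid.add(key_id)  # a set, so each key counts at most once
    return len(valid) >= threshold

# Hypothetical role with three keys, any two of which must sign.
KEYS = {"k1": b"alpha", "k2": b"bravo", "k3": b"charlie"}
metadata = b'{"targets": {"pkg-1.0.tar.gz": "sha256:..."}}'

good = {"k1": sign(KEYS["k1"], metadata), "k3": sign(KEYS["k3"], metadata)}
forged = {"k1": sign(KEYS["k1"], metadata), "k2": "00" * 32}

print(meets_threshold(metadata, good, KEYS, threshold=2))    # True
print(meets_threshold(metadata, forged, KEYS, threshold=2))  # False
```

With a threshold above one, an attacker who steals a single key cannot produce metadata that clients accept, which is the property behind the "survivable key compromise" framing.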
**Reflections on Trusting Trust** (1984) _Ken Thompson_ Communications of the ACM Classic paper on trust in software compilation and distribution.

## Malicious Packages and Typosquatting

Research on detection and analysis of malicious packages in ecosystems.

**An Empirical Study of Malicious Code in PyPI Ecosystem** (2023) _Wenbo Guo, et al._ IEEE/ACM International Conference on Automated Software Engineering (ASE) Large-scale empirical study with dataset of 4,669 malicious code samples from PyPI. Found 74.81% of malicious packages enter user systems via source code installation.

**Killing Two Birds with One Stone: Malicious Package Detection in NPM and PyPI using a Single Model of Malicious Behavior Sequence** (2024) _Junan Zhang, Kaifeng Huang, et al._ ACM Transactions on Software Engineering and Methodology Proposed Cerebro, unified detection model for malicious packages across npm and PyPI using behavior sequence analysis with BERT. Detected 683 malicious PyPI packages and 799 npm packages.

**Malicious Package Detection using Metadata Information** (2024) _S. Halder, et al._ ACM Web Conference (WWW) Introduced MeMPtec, metadata-based malicious package detection model. Demonstrates resistance to adversarial attacks with 85.2% precision and 91.8% recall.

**On the Feasibility of Detecting Injections in Malicious npm Packages** (Year TBD) _Various authors_ ACM conference proceedings Analyzed 361 malicious npm artifacts covering typosquatting, combosquatting, and package hijacking, providing insights into attack patterns.

**SpellBound: Defending Against Package Typosquatting** (2020) _Matthew Taylor, Ruturaj K. Vaidya, Drew Davidson, Lorenzo De Carli, Vaibhav Rastogi_ arXiv preprint Proposed TypoGard, a detection technique based on analysis of npm and PyPI leveraging lexical similarity between names and package popularity.
Evaluation showed TypoGard flags up to 99.4% of known typosquatting cases while generating limited warnings (0.5% of package installs) and low overhead (2.5% of package install time).

**Typosquatting and Combosquatting Attacks on the Python Ecosystem** (2020) _Duc-Ly Vu, Ivan Pashchenko, Fabio Massacci, Henrik Plate, Antonino Sabetta_ IEEE European Symposium on Security and Privacy Workshops (Euro S&P) Studies typosquatting and combosquatting attacks on PyPI. Combosquatting exploits mistakes in the order of package names consisting of multiple nouns (e.g., “python-nmap” typed as “nmap-python”). Proposes automated approach to identify combosquatting and typosquatting package names.

**Practical Automated Detection of Malicious npm Packages** (2022) _Adriana Sejfia, Max Schäfer_ IEEE/ACM International Conference on Software Engineering (ICSE) Presents Amalfi, combining ML classifiers, a reproducer for identifying packages rebuildable from source, and a clone detector for known malicious packages. Identified 95 previously unknown malicious packages over seven days. Found malicious packages more likely to contain minified code or binaries.

**TypoSmart: A Low False-Positive System for Detecting Malicious and Stealthy Typosquatting Threats in Package Registries** (2025) _Multiple authors_ arXiv preprint First scalable deployment of a typosquatting detection system that addresses key limitations by leveraging package metadata. Improved neighbor search speeds by 73-91% and reduced false positives by 70.4% compared to prior work. Being used in production, contributing to removal of 3,658 typosquatting threats in one month.

**Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies** (2021) _Alex Birsan_ Medium blog post / Security research Revealed dependency confusion attack that exploits package managers pulling higher-versioned packages from public repositories when private packages exist with the same name.
Successfully compromised over 35 major companies including Microsoft, Apple, PayPal, Shopify, Netflix, and Tesla. Awarded over $130,000 in bug bounties.

## Package Metadata and Trust Models

Research on metadata systems, signing, and trust frameworks.

**Why Software Signing (Still) Matters: Trust Boundaries in the Software Supply Chain** (2024) _Multiple authors_ arXiv preprint Analyzed when registry hardening renders signing redundant versus when signing is necessary, examining trust boundaries in software distribution.

**An Industry Interview Study of Software Signing for Supply Chain Security** (2025) _Kelechi G. Kalu, James C. Davis_ USENIX Security Symposium Qualitative study interviewing 18 experienced security practitioners across 13 organizations to understand software signing practices and challenges. Shows that experts disagree on signing importance.

**Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors** (2024) _Taylor R. Schorlemmer, Kelechi G. Kalu, Luke Chigges, Kyung Myung Ko, Elizabeth A. Ishgair, Saurabh Bagchi, Santiago Torres-Arias, James C. Davis_ IEEE Symposium on Security and Privacy (S&P) Study of software signing adoption in Maven, PyPI, DockerHub and Huggingface, finding strict signature rules increase the quantity of signatures and registry policies impact developer decisions.

**A systematic literature review on trust in the software ecosystem** (2022) _Multiple authors_ Empirical Software Engineering Systematic literature review examining trust in software ecosystems, including relationships between end-users and software products, package managers, software producing organizations, and software engineers. Addresses how trust is frequently violated by bad actors and vulnerabilities in the software supply chain.
**Sigstore: Software Signing for Everybody** (2022) _Zachary Newman, John Speed Meyers, Santiago Torres-Arias_ ACM Conference on Computer and Communications Security (CCS) Academic analysis of Sigstore’s keyless signing infrastructure. Describes formal attacker model and possible attack avenues. Sigstore uses identity-based signing (OAuth/OIDC) rather than traditional key management, now adopted by npm, PyPI, and major Linux distributions.

## Dependency Management Bots

Research on automated dependency management tools like Dependabot and Renovate.

**On the use of dependabot security pull requests** (2021) _Mahmoud Alfadel, Diego Elias Costa, Emad Shihab, Moiz Mkhallalati_ IEEE/ACM International Conference on Mining Software Repositories (MSR) Evaluates how developers respond to security updates suggested by Dependabot, finding varying acceptance rates across ecosystems.

**Investigating the resolution of vulnerable dependencies with dependabot security updates** (2023) _Hadi Mohayeji, Ani Agaronian, Eleni Constantinou, Nicola Zannone, Alexander Serebrenik_ IEEE/ACM International Conference on Mining Software Repositories (MSR) Investigates how Dependabot helps mitigate vulnerabilities, noting it uses lockfiles to create dependency graphs.

**Securing dependencies: A comprehensive study of dependabot’s impact on vulnerability mitigation** (2025) _Hadi Mohayeji, Ani Agaronian, Eleni Constantinou, Nicola Zannone, Alexander Serebrenik_ Empirical Software Engineering Follow-up study on Dependabot’s effectiveness for vulnerability mitigation across projects.

**There’s no such thing as a free lunch: Lessons learned from exploring the overhead introduced by the greenkeeper dependency bot in npm** (2023) _Benjamin Rombaut, Filipe Cogo, Bram Adams, Ahmed E. Hassan_ ACM Transactions on Software Engineering and Methodology (TOSEM) Studies whether Greenkeeper reduces developer effort or introduces unnecessary workload.
Mentions lockfiles as a way to overcome in-range breaking changes.

## Software Composition Analysis

Research on library usage, updates, and composition analysis tools.

**Software ecosystem call graph for dependency management** (2018) _Joseph Hejderup, Arie van Deursen, Georgios Gousios_ IEEE/ACM International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER) Proposes moving beyond package-level dependency analysis to call-graph level, enabling finer-grained understanding of which functions are actually used from dependencies.

**Präzi: From Package-based to Call-based Dependency Networks** (2021) _Joseph Hejderup, Moritz Beller, Konstantinos Triantafyllou, Georgios Gousios_ arXiv preprint Extends call-graph dependency analysis with Präzi, constructing fine-grained dependency networks at the function level rather than package level. Enables more precise vulnerability impact analysis and identifies unused transitive dependencies.

**Towards Understanding Third-Party Library Dependency in C/C++ Ecosystem** (2022) _Wei Tang, Zhengzi Xu, Chengwei Liu, Jiahui Wu, Shouguo Yang, Yi Li, Ping Luo, Yang Liu_ IEEE/ACM International Conference on Automated Software Engineering (ASE) First large-scale C/C++ dependency study addressing lack of unified package manager. Analyzed 24K repositories revealing 71.5% dependencies handled in Install phase.

**A Machine Learning Approach for Vulnerability Curation** (2020) _Chen Yang, Andrew Santosa, Ang Ming Yi, Abhishek Sharma, Asankhaya Sharma, David Lo_ International Conference on Mining Software Repositories (MSR) Designed ML system to automatically predict vulnerability-relatedness of data items for software composition analysis databases.

**An Exploratory Study on Library Aging by Monitoring Client Usage in a Software Ecosystem** (2017) _Multiple authors_ SANER Study of library aging patterns through client usage monitoring.
**Do Developers Update Their Library Dependencies?** (2018) _Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, Katsuro Inoue_ Empirical Software Engineering Empirical study on library migration covering 4,600+ GitHub projects and 2,700 library dependencies. Found 81.5% of systems keep outdated dependencies, and developers rarely respond to security advisories. Introduced the Library Migration Plot (LMP) visualization.

**A Large-Scale Empirical Study on Java Library Migrations: Prevalence, Trends, and Rationales** (2021) _Hao He, Runzhi He, Haiqiao Gu, Minghui Zhou_ ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) Commit-level analysis of 19,652 Java projects extracting 1,194 migration rules and 3,163 migration commits. Found migrations dominated by logging, JSON, testing, and web service domains. Identified 14 migration reasons, 7 not discussed in prior work.

**Modeling Library Dependencies and Updates in Large Software Repository Universes** (2017) _Raula Gaikovina Kula, Coen De Roover, Daniel M. German, Takashi Ishio, Katsuro Inoue_ arXiv preprint Proposes the Software Universe Graph (SUG) to model library dependency and update information mined from Maven. Leverages “wisdom of the crowd” to recommend library updates based on what other projects have adopted.

**The emergence of software diversity in maven central** (2019) _Multiple authors_ arXiv preprint Analysis of software diversity patterns in Maven Central.

**On the use of package managers by the C++ open-source community** (2018) _Multiple authors_ SAC Study of package manager adoption in C++ open source.
**Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development** (2025) _Mahmoud Jahanshahi, David Reid, Audris Mockus_ ACM Transactions on Software Engineering and Methodology Studies how code copying (not just dependency declaration) affects software reuse patterns, examining the relationship between formal dependencies and actual code reuse in open source. **A Longitudinal Analysis of Bloated Java Dependencies** (2021) _Multiple authors_ ResearchGate / Academic publication Longitudinal study examining bloated dependencies in Java projects over time. Analyzes how dependency bloat evolves and accumulates in Maven-based projects. **Longitudinal Analysis of Software Dependencies in Large-Scale Systems** (2025) _Multiple authors_ Empirical Software Engineering Recent longitudinal analysis of software dependencies examining long-term patterns and evolution of dependency management practices in large-scale systems. ## Ecosystem Evolution and Developer Behavior Research on how ecosystems and developer practices evolve over time. **An Empirical Study of API Stability and Adoption in the Android Ecosystem** (2013) _Tyler McDonnell, Baishakhi Ray, Miryung Kim_ IEEE International Conference on Software Maintenance (ICSM) - Most Influential Paper Award 2023 Found Android API evolves at 115 updates per month on average, but client adoption doesn’t keep pace. Established API stability and adoption as a vital research area, inspiring subsequent work on automating API migration and change impact analysis. **Understanding the Response to Open-Source Dependency Abandonment in the npm Ecosystem** (2025) _Courtney Miller, Mahmoud Jahanshahi, Audris Mockus, Bogdan Vasilescu, Christian Kästner_ IEEE/ACM International Conference on Software Engineering (ICSE) Studies how developers respond when their dependencies are abandoned, analyzing response patterns and mitigation strategies in the npm ecosystem. 
**Underproduction: An Approach for Measuring Risk in Open Source Software** (2021) _Kaylea Champion, Benjamin Mako Hill_ IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) Introduces framework for identifying underproduction where software engineering labor supply is misaligned with demand. Applied to 21,902 Debian packages and 461,656 bugs, finding at least 4,327 packages are underproduced. Desktop environments particularly at risk. **Deprecation of Packages and Releases in Software Ecosystems: A Case Study on npm** (2022) _Filipe Cogo, Gustavo Oliva, Ahmed E. Hassan_ IEEE Transactions on Software Engineering Examines npm’s deprecation mechanism. Found 3.7% of packages have at least one deprecated release, and 66% of those have deprecated all releases, preventing migration to replacements. Transitive adoption of deprecated releases is challenging to track. **The Evolution of Project Inter-dependencies in a Software Ecosystem: The Case of Apache** (2013) _Gabriele Bavota, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto_ IEEE International Conference on Software Maintenance (ICSM) Exploratory study of 147 Apache Java projects over 14 years (1,964 releases), examining how dependency relationships evolve and when projects decide to upgrade dependencies. **How the Apache Community Upgrades Dependencies: An Evolutionary Study** (2015) _Gabriele Bavota, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella_ Empirical Software Engineering Follow-up study examining when and why Apache projects upgrade their dependencies, identifying patterns in upgrade decisions. 
**A Graph-Based Approach to API Usage Adaptation** (2010) _Hoan Anh Nguyen, Tung Thanh Nguyen, Gary Wilson Jr., Anh Tuan Nguyen, Miryung Kim, Tien Nguyen_ ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) Introduces LIBSYNC, which learns complex API usage adaptation patterns from other clients that already migrated to a new library version, guiding developers through API migrations. **Influences on developer participation in the Debian software ecosystem** (2011) _Multiple authors_ Academic publication Analysis of factors affecting developer participation in Debian. **A historical analysis of Debian package incompatibilities** (2015) _Multiple authors_ Academic publication Historical perspective on package incompatibilities in Debian. **Why Do Developers Use Trivial Packages? An Empirical Case Study on npm** (2017) _Suhaib Mujahid et al._ Academic publication Investigation of why developers depend on trivial packages. **On the Impact of Using Trivial Packages: An Empirical Case Study on npm and PyPI** (2020) _Rabe Abdalkareem, Vinicius Oda, Suhaib Mujahid, Emad Shihab_ Empirical Software Engineering Follow-up study finding 16% of npm and 10.5% of PyPI packages are trivial. Survey of 125 developers found they believe trivial packages are well-tested, but only 28% of npm and 49% of PyPI trivial packages actually have tests. 18.4% of npm trivial packages have more than 20 dependencies. **Towards Smoother Library Migrations: A Look at Vulnerable Dependency Migrations at Function Level for npm JavaScript Packages** (2018) _Multiple authors_ IEEE publication Study of library migration patterns for vulnerability fixes. **On the diversity of software package popularity metrics: An empirical study of npm** (2019) _Multiple authors_ arXiv preprint Analysis of different popularity metrics in npm ecosystem. 
**Are Software Dependency Supply Chain Metrics Useful in Predicting Change of Popularity of npm Packages?** (2018) _Tapajit Dey, Audris Mockus_ International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE) Investigates whether supply chain metrics (dependency relationships, update patterns) can predict changes in npm package popularity. **Ecosystem-Level Determinants of Sustained Activity in Open-Source Projects: A Case Study of the PyPI Ecosystem** (2018) _Multiple authors_ FSE Study of factors contributing to sustained activity in PyPI. **When It Breaks, It Breaks: How Ecosystem Developers Reason about the Stability of Dependencies** (2015) _Multiple authors_ CMU Technical Report Investigation of how developers reason about dependency stability. **How to Break an API: Cost Negotiation and Community Values in Three Software Ecosystems** (2016) _Multiple authors_ ACM publication Study of API breaking changes across three ecosystems. **An ecosystemic and socio-technical view on software maintenance and evolution** (2016) _Multiple authors_ Academic publication Socio-technical perspective on ecosystem maintenance. **On the topology of package dependency networks: a comparison of three programming language ecosystems** (2016) _Multiple authors_ Academic publication Topological analysis of dependency networks. **Structure and Evolution of Package Dependency Networks** (2017) _Multiple authors_ TU Delft publication Analysis of dependency network structure and evolution. **An Empirical Comparison of Developer Retention in the RubyGems and npm Software Ecosystems** (2017) _Multiple authors_ arXiv preprint Comparison of developer retention across ecosystems. **Culture and Breaking Change: A Survey of Values and Practices in 18 Open Source Software Ecosystems** (2017) _Multiple authors_ Figshare Survey of cultural values around breaking changes. 
**A generalized model for visualizing library popularity, adoption, and diffusion within a software ecosystem** (2018) _Raula et al._ NAIST publication Model for visualizing library adoption patterns. **Release synchronization in software ecosystems** (2019) _Multiple authors_ ACM publication Study of release coordination across ecosystem projects. **Steering insight: An exploration of the ruby software ecosystem** (2011) _Multiple authors_ Academic publication Exploration of Ruby ecosystem characteristics. **Socio-technical evolution of the Ruby ecosystem in GitHub** (2017) _Constantinou, Mens_ SANER Analysis of Ruby ecosystem evolution on GitHub. **How do developers react to API deprecation? The case of a Smalltalk ecosystem** (2012) _Multiple authors_ FSE Study of developer responses to API deprecation. **A study of ripple effects in software ecosystems** (2011) _Multiple authors_ ICSE Analysis of how changes ripple through ecosystems. **How do developers react to API evolution? The Pharo ecosystem case** (2015) _Multiple authors_ HAL archives Case study of API evolution in Pharo. **How do developers react to API evolution? A large-scale empirical study** (2018) _Multiple authors_ Academic publication Large-scale study of API evolution responses. **Software engineering with reusable components** (1997) _Multiple authors_ Academic publication Early work on component reuse in software engineering. **A method to generate traverse paths for eliciting missing requirements** (2019) _Multiple authors_ ACM publication Method for identifying missing requirements through path analysis. **Mining component repositories for installability issues** (2015) _Roberto Di Cosmo et al._ MSR Mining approach for finding installability problems. 
**Measuring the Health of Open Source Software Ecosystems: Beyond the Scope of Project Health** (2014) _Slinger Jansen_ Information and Software Technology First model for measuring open source ecosystem health, evaluating productivity, robustness, and niche creation. Distinguishes ecosystem health from project health, recognizing that ecosystem health involves multiple interrelated projects, contributors, and end-users. **Software Ecosystems Governance - A Systematic Literature Review and Research Agenda** (2017) _Carina Alves, Joyce Oliveira, Slinger Jansen_ ICEIS Systematic literature review examining how software ecosystems should be managed and controlled. Analyzed 63 studies and classified governance mechanisms into value creation, coordination of players, and organizational openness and control. **Giving Back: Contributions Congruent to Library Dependency Changes in a Software Ecosystem** (2022) _Supatsara Wattanakriengkrai, Dong Wang, Raula Gaikovina Kula, Christoph Treude, Patanamon Thongtanunam, Takashi Ishio, Kenichi Matsumoto_ arXiv preprint Empirical study of how developers contribute to open-source libraries in relation to dependency changes within npm. Analyzed over 5.3 million commits across 107,242 packages to measure dependency-contribution congruence. Found a statistically significant relationship between such contributions and whether packages become dormant. **An empirical study of software ecosystem related tweets by npm maintainers** (2023) _Syful Islam, Yusuf Sulistyo Nugroho, et al._ PeerJ Computer Science Analyzed approximately 1,176 tweets from npm package maintainers to categorize discussion topics, communication styles, and emotional tone. Found package management issues dominate discussions and maintainers express predominantly neutral sentiment about technical matters. 
## LLMs and Package Hallucinations (Slopsquatting) Recent research on how large language models hallucinate non-existent packages, creating new supply chain attack vectors. **We Have a Package for You! An Analysis of Package Hallucinations by Code Generating LLMs** (2024) _Multiple authors from University of Texas at San Antonio, Virginia Tech, University of Oklahoma_ arXiv preprint | GitHub Analysis using 16 popular LLMs and 2 prompt datasets for Python and JavaScript code generation, producing 576,000 code samples. Found 440,445 (19.7%) were hallucinations, including 205,474 unique non-existent packages. Average hallucination rate of 5.2% for commercial models and 21.7% for open-source models. Demonstrates how attackers can exploit LLM hallucinations by registering fake packages. **Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities** (2025) _Arjun Krishna, Erick Galinkin, Leon Derczynski, Jeffrey Martin_ arXiv preprint Examines package hallucinations across multiple programming languages (Python, JavaScript, Rust) for different tasks across different LLMs. Found package hallucination rate depends on model choice, programming language, model size, and task specificity. Discovered inverse correlation between package hallucination rate and HumanEval coding benchmark. Shows coding models are not being optimized for secure code generation. **HFuzzer: Testing Large Language Models for Package Hallucinations via Phrase-based Fuzzing** (2025) _Multiple authors_ arXiv preprint First framework to introduce fuzzing into testing LLMs for package hallucinations. Adopts phrase-based fuzzing to guide models to generate diverse coding tasks. Triggers package hallucinations across all tested models. Identifies 2.60× more unique hallucinated packages compared to mutational fuzzing frameworks. Found 46 unique hallucinated packages when testing GPT-4o. 
**AI-Induced Supply-Chain Compromise: A Systematic Review of Package Hallucinations and Slopsquatting Attacks** (2025) _Multiple authors_ Research Square preprint Systematic review of package hallucinations and slopsquatting attacks where malicious actors exploit LLMs’ tendency to generate non-existent package names. The term “slopsquatting” was coined by security researcher Seth Larson. Analyzes how adversaries can register hallucinated package names in public registries with malware payloads. **An Empirical Study of Vulnerable Package Dependencies in LLM Repositories** (2025) _Multiple authors_ arXiv preprint Empirical analysis of 52 open-source LLMs examining third-party dependencies and vulnerabilities. Found half of vulnerabilities in LLM ecosystem remain undisclosed for more than 56.2 months, and 75.8% of LLMs include vulnerable dependencies. GitHub’s Top 100 AI projects reference on average 208 direct and transitive dependencies, with 15% containing 10+ known vulnerabilities. **The Hidden Risks of LLM-Generated Web Application Code: A Security-Centric Evaluation** (2025) _Multiple authors_ arXiv preprint Security evaluation of LLM-generated web application code finding over 40% of AI-generated solutions contain security flaws. Common issues include missing input sanitization, authentication mechanism vulnerabilities, and dependency overuse that expands attack surface. **Large Language Models and Code Security: A Systematic Literature Review** (2024) _Multiple authors_ arXiv preprint Systematic literature review examining LLMs in code security, covering topics from vulnerabilities to remediation approaches. Analyzes common security vulnerabilities in AI-generated code across multiple languages and models. If you’re aware of research that should be included in this collection, please reach out on Mastodon or submit a pull request on GitHub.
nesbitt.io
January 26, 2026 at 9:20 PM
Only a week to go till the #fosdem2026 #hpc, #bigdata, and #datascience devroom in Brussels! Check out the agenda and options for information on how to connect at https://fosdem.org/2026/schedule/track/hpc-big-data-data-science/
FOSDEM 2026 - HPC, Big Data & Data Science
fosdem.org
January 25, 2026 at 6:53 PM
Reposted by hpc.social admins
Great news for NSF-funded researchers: They can now deposit author accepted manuscripts into PAR. NSF has also made headway on implementing persistent digital identifiers, although they took a different direction than using DOIs, which is an odd choice […]
Original post on sciences.social
sciences.social
January 23, 2026 at 12:52 AM
Welcome and thanks to our new organizational sponsor HMX Labs! https://hpc.social/sponsors
January 20, 2026 at 4:21 AM
Just a bit over two weeks to go until the #fosdem2026 #hpc, #bigdata, and #datascience Devroom Sunday, 1 February 2026 in Brussels! See the links here for the schedule, room info, and ways to connect and chat to participate! https://fosdem.org/2026/schedule/track/hpc-big-data-data-science/
FOSDEM 2026 - HPC, Big Data & Data Science
fosdem.org
January 17, 2026 at 10:22 PM
Reposted by hpc.social admins
The CFP for CUG 2026, to be held in Nice, France, is now open! Wish I could go. CUG was one of my favorite annual #hpc conferences.

https://cug.org/cug-2026/
CUG 2026 – CUG
cug.org
January 12, 2026 at 2:58 PM
RE: https://mast.hpc.social/@admin/115657373438746388

With the deadline for the HPSF Conference 2026 in Chicago just passed, the next opportunity will be for the HPSF Community Summit 2026 European Workshop, 25 – 27 Feb 2026 at the Technical University of Braunschweig. Submission deadline: 31 […]
January 12, 2026 at 5:40 PM
From https://bsky.app/profile/techjournalism.bsky.social:

The AI Boom has Super-sized Data Center Campuses. New data provides insights into the growth of the largest AI clusters.

https://datacenterrichness.substack.com/p/data-centers-2026-the-largest-ai
Data Centers 2026: The Largest AI Campuses
New Reports Offer a Closer Look at the Super-Sizing of Hyperscale Clusters
datacenterrichness.substack.com
January 12, 2026 at 1:12 PM
Reposted by hpc.social admins
I went through the last of what I could find and think I understand what ICMS is and is not. https://www.glennklockwood.com/garden/ICMS has been updated to reflect that—the physical implementation and several software implementations that can ride atop are now outlined.

#ai #storage
January 10, 2026 at 6:19 AM
Reposted by hpc.social admins
Double header blog posts today where I attempt to categorize package manager clients and registries in various ways.

https://nesbitt.io/2025/12/29/categorizing-package-registries.html

https://nesbitt.io/2025/12/29/categorizing-package-manager-clients.html
Categorizing Package Registries
Package registries differ in dozens of ways, but most of those differences cluster into a few structural categories. Looking at them through the lens of design tradeoffs helps explain why they ended up where they did. The ecosyste.ms documentation repositories contain detailed data on over 70 registries; here I’m trying to draw out the shapes.

The categories below are roughly orthogonal dimensions. No registry is “just” one thing; each is a particular combination of choices. npmjs.com is database-backed, unreviewed, has flat-plus-scoped names, ships mostly source, and is run by a for-profit company. Debian’s repositories are filesystem-based, curated by maintainers, use distro-managed names, ship binaries, and are run by a foundation. Those combinations matter more than any single axis.

A companion post covers package manager clients: resolution algorithms, lockfiles, build hooks, and manifest formats. The data is also available as CSV. There are gaps; contributions welcome.

**Contents:** Architecture · Review model · Namespacing · Distribution model · Governance · Ecosystem scope · Version retention · Size · Mirroring

## Registry architecture

How does the registry store and serve package metadata?

**Database-backed web services** are uploaded via API, with metadata in Postgres or similar. This model scales well and supports rich features like download counts and vulnerability reporting.

* npmjs.com
* pypi.org
* rubygems.org
* nuget.org
* crates.io¹
* packagist.org
* hex.pm
* pub.dev
* clojars.org
* forge.puppet.com
* anaconda.org
* luarocks.org
* community.chocolatey.org
* open-vsx.org
* galaxy.ansible.com
* jsr.io

**Git repositories as indexes** use version control as the storage layer on the critical path. If you removed git, you’d need to replace it with a database. The git model provides history, is trivially mirrorable, and works offline once cloned. But it doesn’t scale indefinitely; Cargo had to add a sparse index to avoid downloading the entire registry on first use.

* homebrew-core
* cocoapods.org
* vcpkg
* conan.io
* swiftpackageindex.com
* Julia General registry
* juliahub.com²
* winget-pkgs
* spack

**Filesystem-based repositories** serve generated index files statically from HTTP mirrors. The server does work only when the repository is updated, not when clients fetch. This is the pattern that the compact index brought to RubyGems.

* apt/dpkg
* yum/dnf
* pacman
* apk
* zypper
* Portage
* cran.r-project.org
* bioconductor.org
* metacpan.org
* hackage.haskell.org
* pkgs.racket-lang.org³
* FreeBSD ports
* pkgsrc
* Helm
* postmarketOS
* Adélie Linux

**Source host as registry** means no central registry. Packages are fetched directly from git hosts using URLs as identifiers.

* Go modules
* Deno
* Carthage

**Content-addressed stores** identify packages by hash of inputs. Binary caches provide pre-built artifacts.

* Nix
* Guix

## Reviewed / Unreviewed

Does someone look at packages before they’re available?

**Unreviewed** means anyone can publish immediately. You create an account, run a publish command, and your package is live within seconds. This enables growth but creates attack surface.

* npmjs.com
* pypi.org
* crates.io
* rubygems.org
* packagist.org
* nuget.org
* hex.pm
* pub.dev
* clojars.org
* juliahub.com
* hackage.haskell.org
* metacpan.org
* forge.puppet.com
* anaconda.org
* luarocks.org
* open-vsx.org
* galaxy.ansible.com
* jsr.io

**Reviewed** registries have maintainers review packages before they appear. These registries grow more slowly but catch problems earlier. In practice, “review” ranges from packaging QA and policy checks to security vetting; very few projects do systematic source code review.

* Debian
* Fedora
* Ubuntu
* homebrew-core
* Alpine
* Arch⁴
* nixpkgs
* F-Droid
* cran.r-project.org
* bioconductor.org
* conda-forge
* postmarketOS
* Adélie Linux
* spack
* FreeBSD ports
* pkgsrc
* winget-pkgs
* central.sonatype.com⁵

**Moderated upload** accepts uploads but has moderation layers or automated semantic checks.

* package.elm-lang.org⁶
* community.chocolatey.org⁷

## Namespacing

How are packages named?

**Flat** namespaces give each package a single global name.

* rubygems.org
* pypi.org
* crates.io
* hex.pm
* hackage.haskell.org
* cran.r-project.org
* juliahub.com
* package.elm-lang.org
* luarocks.org
* community.chocolatey.org

**Scoped** namespaces add organizational prefixes like `@babel/core` or `symfony/console`.

* npmjs.com
* packagist.org
* forge.puppet.com
* open-vsx.org
* galaxy.ansible.com
* winget-pkgs
* artifacthub.io
* anaconda.org
* jsr.io

**Hierarchical** namespaces use structured naming like `org.apache.commons:commons-lang3` or `DateTime::Format::Strptime`.

* central.sonatype.com
* metacpan.org
* clojars.org

**URL-based** identifiers like `github.com/user/repo` use domain ownership as the claim. No registration step.

* proxy.golang.org
* deno.land
* Swift Package Manager
* Carthage

**Distro-managed** names are controlled by distribution maintainers, often differing from upstream project names.

* Debian
* Fedora
* Arch
* Alpine
* homebrew-core
* nixpkgs
* spack
* conda-forge
* FreeBSD ports
* pkgsrc
* postmarketOS

## Distribution model

What gets distributed?

**Source only** ships code that gets compiled or interpreted on the client. One artifact supports any platform.

* npmjs.com
* crates.io
* proxy.golang.org
* metacpan.org
* hex.pm
* hackage.haskell.org
* cran.r-project.org
* package.elm-lang.org
* pkgs.racket-lang.org
* clojars.org⁸
* luarocks.org
* galaxy.ansible.com
* artifacthub.io
* jsr.io

**Binary only** ships precompiled artifacts.

* central.sonatype.com
* nuget.org
* apt/dpkg
* yum/dnf
* pacman
* apk
* community.chocolatey.org
* winget-pkgs

**Mixed source and binary** provides source distributions plus prebuilt wheels/binaries. Native code gets platform-specific builds.

* pypi.org
* rubygems.org
* cocoapods.org
* anaconda.org
* homebrew-core⁹
* cache.nixos.org¹⁰
* spack¹¹
* FreeBSD ports
* pkgsrc

**Platform matrices** publish multiple artifacts per release: `cp39-manylinux_x86_64`, `cp310-macosx_arm64`, etc.

* pypi.org
* rubygems.org
* anaconda.org
* cache.nixos.org
* nuget.org¹²
* homebrew-core

## Registry governance

Who runs the registry?

**Non-profit foundations** operate registries as community infrastructure.

* pypi.org¹³
* crates.io¹⁴
* rubygems.org¹⁵
* central.sonatype.com¹⁶
* packagist.org¹⁷
* metacpan.org¹⁸
* hex.pm¹⁹
* clojars.org²⁰
* hackage.haskell.org²¹
* cran.r-project.org²²
* homebrew-core²³
* open-vsx.org²⁴
* artifacthub.io²⁵

**For-profit companies** run registries as products or strategic infrastructure.

* npmjs.com²⁶
* nuget.org²⁷
* pub.dev²⁸
* anaconda.org²⁹
* juliahub.com³⁰
* forge.puppet.com³¹
* galaxy.ansible.com³²
* community.chocolatey.org³³
* winget-pkgs³⁴
* proxy.golang.org³⁵
* deno.land³⁶
* jsr.io³⁶

**Community projects** run registries as volunteer efforts, often with fiscal sponsors.

* cocoapods.org
* conda-forge
* swiftpackageindex.com
* luarocks.org
* Carthage
* nixpkgs

**Distribution projects** maintain repositories as part of their distro.

* Debian
* Fedora³⁷
* Ubuntu³⁸
* Arch
* Alpine
* postmarketOS
* Adélie Linux
* spack
* FreeBSD
* pkgsrc

## Ecosystem scope

What kind of software does this package manager handle?

**Language-specific** registries serve a single programming language ecosystem.

* npmjs.com
* pypi.org
* rubygems.org
* crates.io
* hex.pm
* hackage.haskell.org
* metacpan.org
* clojars.org
* pub.dev
* cran.r-project.org
* juliahub.com
* package.elm-lang.org
* pkgs.racket-lang.org
* packagist.org
* proxy.golang.org
* central.sonatype.com
* luarocks.org
* jsr.io

**System-level** registries install operating system components and applications.

* apt/dpkg
* yum/dnf
* pacman
* apk
* homebrew-core
* nixpkgs
* Guix
* zypper
* Portage
* FreeBSD ports
* pkgsrc
* community.chocolatey.org
* winget-pkgs

**Domain-specific** registries serve particular use cases or industries.

* bioconductor.org
* conda-forge
* spack
* ROS
* forge.puppet.com
* registry.terraform.io
* galaxy.ansible.com
* artifacthub.io
* open-vsx.org

## Version retention

Does the registry keep old versions available? What happens when a published version needs to be removed?

**Keeps all versions** indefinitely. You can install any historical version.

* central.sonatype.com³⁹
* proxy.golang.org⁴⁰

**Yanking** marks a version as unavailable for new installs but keeps it accessible for existing lockfiles.

* crates.io
* rubygems.org
* hex.pm

**Time-limited deletion** allows removal within a window, then versions become permanent.

* npmjs.com⁴¹
* pypi.org⁴²
* nuget.org
* packagist.org
* clojars.org
* hackage.haskell.org
* metacpan.org
* pub.dev

**Latest only** or limited retention. Old versions disappear when new ones are published.

* homebrew-core⁴³
* apt/dpkg⁴⁴
* Arch⁴⁵
* Alpine⁴⁶

## Registry size

How many packages? Grouped by order of magnitude.

**10⁶+ (millions)**

* npmjs.com
* proxy.golang.org
* pypi.org

**10⁵ (hundreds of thousands)**

* central.sonatype.com
* nuget.org
* packagist.org
* rubygems.org
* crates.io
* cocoapods.org
* anaconda.org
* nixpkgs
* Arch AUR
* Fedora
* Debian
* Ubuntu

**10⁴ (tens of thousands)**

* pub.dev
* clojars.org
* hex.pm
* hackage.haskell.org
* cran.r-project.org
* FreeBSD ports
* Alpine

**10³ (thousands)**

* homebrew-core
* luarocks.org
* package.elm-lang.org

## Mirroring / Proxying

How hard is it to run your own registry or mirror?

**Trivial** means filesystem-based repos or source-host registries that need no special infrastructure.

* apt/dpkg
* yum/dnf
* proxy.golang.org
* metacpan.org
* cran.r-project.org

**Supported** means official tooling or documented processes exist for running mirrors or private registries.

* npmjs.com⁴⁷
* pypi.org⁴⁸
* central.sonatype.com⁴⁹
* nuget.org
* rubygems.org
* crates.io
* packagist.org
* hex.pm⁵⁰
* luarocks.org⁵¹
* clojars.org⁵²

1. Cargo originally required cloning the full crates.io-index git repo; the sparse index now allows fetching only needed entries.
2. JuliaHub has a database-backed front end but the underlying Julia General registry is a git repository.
3. pkgs.racket-lang.org stores packages as files, generates a JSON index, and serves via S3. It polls git sources for updates but doesn’t use git as its storage layer.
4. The AUR (Arch User Repository) is unreviewed; official repos are curated.
5. Requires proving domain ownership via DNS or hosting a file at the domain.
6. Elm enforces semantic versioning by diffing package APIs and rejecting publishes that break compatibility without a major version bump.
7. Three-stage automated review (validator, verifier, VirusTotal scan) plus human moderation.
8. Publishes JVM bytecode in JAR files, but these are built from source during the publish process.
9. Bottles are prebuilt binaries for common macOS versions.
10. Binary substitutes from cache.nixos.org avoid rebuilding from source.
11. Spack supports binary caches but defaults to building from source.
12. Runtime Identifiers (RIDs) specify platform-specific assets.
13. Python Software Foundation
14. Rust Foundation
15. Ruby Central
16. Originally Sonatype, now Linux Foundation
17. Funded by Private Packagist
18. Perl Foundation
19. Six Colors AB, community-funded
20. Clojurists Together
21. Haskell.org
22. R Foundation
23. Fiscally sponsored by Open Source Collective
24. Eclipse Foundation
25. Cloud Native Computing Foundation
26. GitHub/Microsoft
27. Microsoft
28. Google
29. Anaconda Inc
30. Julia Computing
31. Perforce
32. Red Hat.
33. Chocolatey Software.
34. Microsoft.
35. Google.
36. Deno Company.
37. Red Hat
38. Canonical
39. Maven Central does not allow deletion or modification of published artifacts.
40. Once cached by proxy.golang.org, modules remain available indefinitely.
41. 72-hour window for unpublishing, with exceptions for security issues.
42. Can delete files and releases; PEP 763 proposes limiting this to 72 hours.
43. Formulas point to the latest version; older versions require tapping homebrew-core history.
44. Each Debian/Ubuntu release has its own repository snapshot.
45. Rolling release model; only current versions are available.
46. Each Alpine release has its own repository.
47. Verdaccio is the most popular private npm registry.
48. devpi and Artifactory provide PyPI-compatible private registries.
49. Nexus and Artifactory are widely used for hosting private Maven repositories.
50. Official mirror documentation with geographic mirrors available.
51. Custom rock servers can be configured via `rocks_servers` in the config file.
52. Mirror documentation with instructions for running your own.
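The yanking behaviour described under version retention (a yanked version is hidden from new installs but still served to existing lockfiles) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Registry` class, its method names, and the `example-pkg` package are hypothetical, versions are ordered by publication rather than by semantic-version comparison, and none of this reflects how crates.io, rubygems.org, or hex.pm actually implement yanking.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Registry:
    """Toy in-memory registry (hypothetical, for illustration only)."""

    # package name -> versions in publication order (assumption: newest last)
    versions: dict = field(default_factory=dict)
    # set of (package, version) pairs that have been yanked
    yanked: set = field(default_factory=set)

    def publish(self, name: str, version: str) -> None:
        self.versions.setdefault(name, []).append(version)

    def yank(self, name: str, version: str) -> None:
        # Yanking only flags the version; the artifact itself is kept.
        self.yanked.add((name, version))

    def resolve(self, name: str, lockfile: Optional[dict] = None) -> str:
        """Return the version to install for `name`.

        A version pinned in the lockfile is honoured even if yanked;
        a fresh resolution picks the newest version that is not yanked.
        """
        published = self.versions.get(name, [])
        pinned = (lockfile or {}).get(name)
        if pinned is not None:
            if pinned not in published:
                raise LookupError(f"{name} {pinned} was never published")
            return pinned
        candidates = [v for v in published if (name, v) not in self.yanked]
        if not candidates:
            raise LookupError(f"no installable version of {name}")
        return candidates[-1]


reg = Registry()
reg.publish("example-pkg", "1.0.0")
reg.publish("example-pkg", "1.1.0")
reg.yank("example-pkg", "1.1.0")

print(reg.resolve("example-pkg"))                            # 1.0.0
print(reg.resolve("example-pkg", {"example-pkg": "1.1.0"}))  # 1.1.0
```

The point of the design is that existing builds stay reproducible while new projects are steered away from the bad release, which is the contrast with the time-limited deletion and latest-only models listed alongside it.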
nesbitt.io
December 29, 2025 at 12:04 PM
Reposted by hpc.social admins
Package Managers Devroom at FOSDEM 2026: Schedule Announced: https://blog.ecosyste.ms/2025/12/20/fosdem-2026-package-managers-devroom-schedule.html
Package Managers Devroom at FOSDEM 2026: Schedule Announced
Wolf Vollprecht and Andrew Nesbitt are co-organizing the Package Managers devroom at FOSDEM 2026, and the schedule is now live. We have nine talks covering supply chain security, dependency resolution, build reproducibility, and the economics of running package registries.

**Saturday, 31 January 2026**
Room K.3.201 (capacity 80) / 10:30-14:25

### A phishy case study

_Adam Harvey / 10:30-10:55_

Adam walks through a phishing attack that targeted owners of popular Rust crates in September 2024. The talk covers how the attack unfolded and how collaboration between the Rust Project, Rust Foundation, and Alpha-Omega helped shut it down quickly.

### Current state of attestations in programming language ecosystems

_Zach Steindler / 11:00-11:25_

Zach surveys how npm, PyPI, RubyGems, and Maven Central have adopted attestations to link packages to their source code and build instructions. He’ll explain Sigstore bundle verification, compare implementation approaches across registries, and discuss what this means for ecosystems that haven’t adopted attestations yet.

### Name resolution in package management systems

_Gábor Boskovits / 11:30-11:55_

Gábor examines how different package managers handle dependency resolution through the lens of reproducible builds. The talk compares language-specific lock files (Cargo), traditional distribution packaging (Debian), and declarative approaches (Nix, Guix).

### Package managers à la carte: A Formal Model of Dependency Resolution

_Ryan Gibb / 12:00-12:25_

Ryan introduces the Package Calculus, a formal framework for unifying how different package managers resolve dependencies. The talk addresses three problems: multi-language projects can’t express cross-language dependencies precisely, system and hardware dependencies remain implicit, and security vulnerabilities in full dependency graphs are hard to track.

### Trust Nothing, Trace Everything: Auditing Package Builds at Scale with OSS Rebuild

_Matthew Suozzo / 12:30-12:55_

Matthew argues that reproducible builds aren’t enough if you don’t understand what happens during the build itself. He presents OSS Rebuild’s open-source observability toolkit, including a transparent network proxy and an eBPF-based system analyzer for detecting suspicious build behavior. The talk responds to supply chain attacks like the XZ backdoor.

### PURL: From FOSDEM 2018 to international standard

_Philippe Ombredanne / 13:00-13:10_

Philippe traces Package-URL’s journey from its FOSDEM 2018 debut to becoming an international standard for referencing packages across ecosystems. PURL now appears in CVE formats for vulnerability tracking and is used by security tools, SCA platforms, and package registries for SBOM and VEX generation.

### Binary Dependencies: Identifying the Hidden Packages We All Depend On

_Vlad-Stefan Harbuz / 13:15-13:25_

Vlad tackles a gap in package management: while source dependencies are well documented, binary dependencies like numpy’s reliance on OpenBLAS binaries remain invisible. He proposes a global index of binary dependencies using a linker that tracks symbols across the open source ecosystem.

### The terrible economics of package registries and how to fix them

_Michael Winser / 13:30-13:55_

Michael examines why package registries struggle financially despite being used by almost all software. Most rely on grants, donations, and in-kind resources while facing increased costs and security expectations. He discusses how the Alpha-Omega project has funded security improvements and piloted sustainable revenue models with major registries.

### Package Management Learnings from Homebrew

_Mike McQuaid / 14:00-14:25_

Mike discusses Homebrew’s v5.0.0 release from November 2025, covering what other package managers could learn from Homebrew’s approach and what Homebrew has adopted from elsewhere.
See you in Brussels on January 31st.
blog.ecosyste.ms
December 20, 2025 at 4:56 PM
Reposted by hpc.social admins
Package managers keep using git as a database, it never works out.

https://nesbitt.io/2025/12/24/package-managers-keep-using-git-as-a-database.html
Package managers keep using git as a database, it never works out
Using git as a database is a seductive idea. You get version history for free. Pull requests give you a review workflow. It’s distributed by design. GitHub will host it for free. Everyone already knows how to use it.

Package managers keep falling for this. And it keeps not working out.

## Cargo

The crates.io index started as a git repository. Every Cargo client cloned it. This worked fine when the registry was small, but the index kept growing. Users would see progress bars like “Resolving deltas: 74.01%, (64415/95919)” hanging for ages, the visible symptom of Cargo’s libgit2 library grinding through delta resolution on a repository with thousands of historic commits.

The problem was worst in CI. Stateless environments would download the full index, use a tiny fraction of it, and throw it away. Every build, every time.

RFC 2789 introduced a sparse HTTP protocol. Instead of cloning the whole index, Cargo now fetches files directly over HTTPS, downloading only the metadata for dependencies your project actually uses. (This is the “full index replication vs on-demand queries” tradeoff in action.) By April 2025, 99% of crates.io requests came from Cargo versions where sparse is the default. The git index still exists, still growing by thousands of commits per day, but most users never touch it.

## Homebrew

GitHub explicitly asked Homebrew to stop using shallow clones. Updating them was “an extremely expensive operation” due to the tree layout and traffic of homebrew-core and homebrew-cask. Users were downloading 331MB just to unshallow homebrew-core. The .git folder approached 1GB on some machines. Every `brew update` meant waiting for git to grind through delta resolution.

Homebrew 4.0.0 in February 2023 switched to JSON downloads for tap updates. The reasoning was blunt: “they are expensive to git fetch and git clone and GitHub would rather we didn’t do that… they are slow to git fetch and git clone and this provides a bad experience to end users.” Auto-updates now run every 24 hours instead of every 5 minutes, and they’re much faster because there’s no git fetch involved.

## CocoaPods

CocoaPods is the package manager for iOS and macOS development. It hit the limits hard. The Specs repo grew to hundreds of thousands of podspecs across a deeply nested directory structure. Cloning took minutes. Updating took minutes. CI time vanished into git operations.

GitHub imposed CPU rate limits. The culprit was shallow clones, which force GitHub’s servers to compute which objects the client already has. The team tried various band-aids: stopping auto-fetch on `pod install`, converting shallow clones to full clones, sharding the repository.

The CocoaPods blog captured it well: “Git was invented at a time when ‘slow network’ and ‘no backups’ were legitimate design concerns. Running endless builds as part of continuous integration wasn’t commonplace.”

CocoaPods 1.8 gave up on git entirely for most users. A CDN became the default, serving podspec files directly over HTTP. The migration saved users about a gigabyte of disk space and made `pod install` nearly instant for new setups.

## Go modules

Grab’s engineering team went from 18 minutes for `go get` to 12 seconds after deploying a module proxy. That’s not a typo. Eighteen minutes down to twelve seconds.

The problem was that `go get` needed to fetch each dependency’s source code just to read its go.mod file and resolve transitive dependencies. Cloning entire repositories to get a single file.

Go had security concerns too. The original design wanted to remove version control tools entirely because “these fragment the ecosystem: packages developed using Bazaar or Fossil, for example, are effectively unavailable to users who cannot or choose not to install these tools.” Beyond fragmentation, the Go team worried about security bugs in version control systems becoming security bugs in `go get`. You’re not just importing code; you’re importing the attack surface of every VCS tool on the developer’s machine.

GOPROXY became the default in Go 1.13. The proxy serves source archives and go.mod files independently over HTTP. Go also introduced a checksum database (sumdb) that records cryptographic hashes of module contents. This protects against force pushes silently changing tagged releases, and ensures modules remain available even if the original repository is deleted.

## Beyond package managers

The same pattern shows up wherever developers try to use git as a database.

Git-based wikis like Gollum (used by GitHub and GitLab) become “somewhat too slow to be usable” at scale. Browsing directory structure takes seconds per click. Loading pages takes longer. GitLab plans to move away from Gollum entirely.

Git-based CMS platforms like Decap hit GitHub’s API rate limits. A Decap project on GitHub scales to about 10,000 entries if you have a lot of collection relations. A new user with an empty cache makes a request per entry to populate it, burning through the 5,000 request limit quickly. If your site has lots of content or updates frequently, use a database instead.

Even GitOps tools that embrace git as a source of truth have to work around its limitations. ArgoCD’s repo server can run out of disk space cloning repositories. A single commit invalidates the cache for all applications in that repo. Large monorepos need special scaling considerations.

## The pattern

The hosting problems are symptoms. The underlying issue is that git inherits filesystem limitations, and filesystems make terrible databases.

**Directory limits.** Directories with too many files become slow. CocoaPods had 16,000 pod directories in a single Specs folder, requiring huge tree objects and expensive computation. Their fix was hash-based sharding: split directories by the first few characters of a hashed name, so no single directory has too many entries. Git itself does this internally with its objects folder, splitting into 256 subdirectories. You’re reinventing B-trees, badly.

**Case sensitivity.** Git is case-sensitive, but macOS and Windows filesystems typically aren’t. Check out a repo containing both `File.txt` and `file.txt` on Windows, and the second overwrites the first. Azure DevOps had to add server-side enforcement to block pushes with case-conflicting paths.

**Path length limits.** Windows restricts paths to 260 characters, a constraint dating back to DOS. Git supports longer paths, but Git for Windows inherits the OS limitation. This is painful with deeply nested node_modules directories, where `git status` fails with “Filename too long” errors.

**Missing database features.** Databases have CHECK constraints and UNIQUE constraints; git has nothing, so every package manager builds its own validation layer. Databases have locking; git doesn’t. Databases have indexes for queries like “all packages depending on X”; with git you either traverse every file or build your own index. Databases have migrations for schema changes; git has “rewrite history and force everyone to re-clone.”

The progression is predictable. Start with a flat directory of files. Hit filesystem limits. Implement sharding. Hit cross-platform issues. Build server-side enforcement. Build custom indexes. Eventually give up and use HTTP or an actual database. You’ve built a worse version of what databases already provide, spread across git hooks, CI pipelines, and bespoke tooling.

None of this means git is bad. Git excels at what it was designed for: distributed collaboration on source code, with branching, merging, and offline work. The problem is using it for something else entirely. Package registries need fast point queries for metadata. Git gives you a full-document sync protocol when you need a key-value lookup.

If you’re building a package manager and git-as-index seems appealing, look at Cargo, Homebrew, CocoaPods, Go. They all had to build workarounds as they grew, causing pain for users and maintainers. The pull request workflow is nice. The version history is nice. You will hit the same walls they did.
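
The hash-based sharding mentioned under “Directory limits” can be sketched in a few lines. This is only an illustration of the general idea, not the layout any particular registry uses: the function name `shard_path`, the choice of SHA-256, and the defaults of three levels with one hex character each are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(name: str, levels: int = 3, width: int = 1) -> PurePosixPath:
    """Return a sharded relative path for a package name.

    Hypothetical helper: the name is hashed, and the first
    `levels * width` hex characters of the digest become nested
    directory names, so no single directory holds every package.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # Slice the digest into `levels` prefixes of `width` hex characters each.
    prefixes = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*prefixes, name)


if __name__ == "__main__":
    # Each name lands in a directory chosen by its hash, not its spelling.
    for pkg in ("AFNetworking", "Alamofire", "SnapKit"):
        print(shard_path(pkg))
```

With one hex character per level and three levels, entries spread over 16 × 16 × 16 = 4,096 leaf directories. The point of the article still stands: this is bookkeeping that a database index would do for you.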
nesbitt.io
December 24, 2025 at 4:49 PM
Reposted by hpc.social admins
Patrick Kennedy brings a huge amount of effort to bear to be able to test the NVIDIA ConnectX-8 NIC at 800 Gbps in current generation PCIe Gen5 servers that can only handle half that bandwidth in each x16 slot. https://www.servethehome.com/nvidia-connectx-8-dual-400gbe-400g-nic-review/
NVIDIA ConnectX-8 C8240 800G Dual 400G NIC Review This is a SuperNIC
In our NVIDIA ConnectX-8 SuperNIC review, we show full next-gen 800G performance from the C8240 dual 400GbE card in PCIe Gen5 servers.
www.servethehome.com
December 24, 2025 at 8:23 AM
The National Academies are hosting a hybrid workshop that will explore innovative ways that AI can enhance climate science and support decision-making for resilience and mitigation. The workshop will identify critical applications where AI can inform climate action at speed and scale, consider […]
Original post on mast.hpc.social
mast.hpc.social
December 22, 2025 at 10:38 PM
As the end of the year approaches, please consider becoming a sponsor of your favorite #hpc community platform. https://github.com/sponsors/hpc-social
Sponsor @hpc-social on GitHub Sponsors
The hpc-social resources provide an open platform for community engagement, including a variety of open source work on code and resources hosted on the hpc.social domain and associated community ca...
github.com
December 21, 2025 at 10:01 AM
Reposted by hpc.social admins
NERSC recently did a wholesale replacement of its FDR InfiniBand storage fabric with RoCE. The IB was a greenfield installation back when I started in 2015, and replacing it with a competing technology in production is quite the feat. Glad to hear it succeeded […]
Original post on mast.hpc.social
mast.hpc.social
December 17, 2025 at 12:20 AM
Submit to #pearc26! Submission details: https://pearc.acm.org/pearc26/submission-guidelines/

Key Dates
👉 Tutorials & Workshops: Feb 2, 2026
👉 Technical Tracks (Long Papers): Feb 9, 2026
👉 Presentation-Only Abstracts, BoF & Panels, Posters & Visuals, ICW, Short Papers: March 2026
December 11, 2025 at 6:55 PM
SCA/HPCAsia will be held January 26 - 29, 2026 in Osaka, Japan to bring together leading researchers, practitioners, and innovators from across the global HPC, AI, and QC communities. The conference chairs include Satoshi Matsuoka and Hiroyuki Takizawa. This year’s conference is held in […]
Original post on mast.hpc.social
mast.hpc.social
December 9, 2025 at 6:35 PM
HPSF Conference 2026 Call for Proposals Now Open https://hpsf.io/blog/2025/hpsfcon-2026-call-for-proposals-now-open/
HPSF Conference 2026: Call for Proposals Now Open
hpsf.io
December 3, 2025 at 7:51 PM
Reposted by hpc.social admins
I wrote up my notes from #SC25. Have a look: https://blog.glennklockwood.com/2025/12/sc25-recap.html

I’ll keep picking away at the editing, but would love to hear more from others about what stood out to them. I wasn’t at the conference itself as much this year as in the past, so I know I […]
Original post on mast.hpc.social
mast.hpc.social
December 1, 2025 at 7:27 PM
Tomorrow, after many spending-oriented days, you'll be urged to support a variety of organizations on #GivingTuesday. We encourage you to do that. Through organizational and individual support, HPC.social is nearly self-sustaining. It would be nice to get all the way there! […]
Original post on mast.hpc.social
mast.hpc.social
December 1, 2025 at 6:03 PM
Graduate students and postdoctoral scholars from institutions in Australia, Canada, Europe (* see below), Japan, South Africa, and the United States will be invited to apply for the 16th International High Performance Computing (HPC) Summer School, to be held July 12-17, 2026 in Perth, Australia.

* […]
Original post on mast.hpc.social
mast.hpc.social
December 1, 2025 at 5:43 PM
Reposted by hpc.social admins
@jannem Honestly I don't know why Slurm never implemented DRMAA, but it was a real loss. Most major schedulers implemented v1, and DRMAA 2 is very nice with several enhancements. Details at https://www.drmaa.org/ and https://www.drmaa.org/implementations.php

The best modern implementation is […]
Original post on mast.hpc.social
mast.hpc.social
November 28, 2025 at 4:11 PM