Damien Tournoud
damientournoud.bsky.social
This results in the allocation of a map with 2.1 million entries, and the corresponding 2.1 million short key strings. In total, 416 MB of allocation.
January 20, 2026 at 9:44 PM
This is a classic memory amplification issue: although Request.ParseForm by default refuses to parse request bodies larger than 10 MB, even a body within that limit can use a significant amount of memory when the input is specially crafted.

In this case, the input is URL-encoded form data with short keys and no values.
January 20, 2026 at 9:44 PM
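A minimal sketch of the kind of input described in the two posts above. The 4-character key length is an assumption: it is simply what makes a 10 MB body hold about 2.1 million `key&` pairs (10,485,760 / 5 = 2,097,152), and dividing the reported 416 MB by that count gives roughly 200 bytes allocated per 5 bytes of input. The helper names (`shortKey`, `craftBody`, `countKeys`) are hypothetical, and the sketch calls `url.ParseQuery` directly rather than going through `Request.ParseForm`. Some Go releases may cap the number of parameters and return an error instead, which the sketch simply prints.

```go
package main

import (
	"fmt"
	"net/url"
	"runtime"
	"strings"
)

const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// shortKey encodes i as a fixed-width 4-character key (distinct for i < 62^4).
func shortKey(i int) string {
	var b [4]byte
	for p := 3; p >= 0; p-- {
		b[p] = alphabet[i%len(alphabet)]
		i /= len(alphabet)
	}
	return string(b[:])
}

// craftBody builds a URL-encoded body of n distinct keys with no values,
// separated by '&' (hypothetical helper, not taken from the original posts).
func craftBody(n int) string {
	var sb strings.Builder
	sb.Grow(5 * n)
	for i := 0; i < n; i++ {
		if i > 0 {
			sb.WriteByte('&')
		}
		sb.WriteString(shortKey(i))
	}
	return sb.String()
}

// countKeys parses the body like a form and reports how many map entries result.
func countKeys(body string) (int, error) {
	values, err := url.ParseQuery(body)
	if err != nil {
		return 0, err
	}
	return len(values), nil
}

func main() {
	const maxFormSize = 10 << 20 // the 10 MB default mentioned in the post
	body := craftBody(maxFormSize / 5)

	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	n, err := countKeys(body)
	runtime.ReadMemStats(&after)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("body: %d bytes, keys: %d, allocated while parsing: %d MB\n",
		len(body), n, (after.TotalAlloc-before.TotalAlloc)>>20)
}
```

The measured allocation depends on the Go version and platform, so the figure printed here will not necessarily match the 416 MB quoted above.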
5/ That's not the only choice you can make. As noted in the Sync 1.1 release notes, if the repository were ordered in preorder depth-first search order, a reader implementation could both validate the cryptographic properties and iterate the keys in order.
January 6, 2026 at 10:36 PM
5/ Tap itself fetches the list of known repositories from the relay via the com.atproto.sync.listRepos method.

A full fetch of the list takes a few hours, because the API scales pretty poorly. This is what the rate of discovery of new repositories looked like the first time I ran this:
January 5, 2026 at 5:39 PM
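For illustration, a rough sketch of what paging through com.atproto.sync.listRepos with a cursor can look like. This is not Tap's actual code: the type and function names (`listReposPage`, `httpFetcher`, `walkRepos`) are hypothetical, the relay host is supplied by the caller, and the response fields shown (`cursor`, `repos`, `did`, `head`, `rev`) are assumed from the published lexicon.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strconv"
)

// repo holds the subset of fields this sketch reads from each listed repository.
type repo struct {
	DID  string `json:"did"`
	Head string `json:"head"`
	Rev  string `json:"rev"`
}

// listReposPage is one page of results; an empty Cursor means there are no more pages.
type listReposPage struct {
	Cursor string `json:"cursor"`
	Repos  []repo `json:"repos"`
}

// fetchPage returns the page that starts at the given cursor ("" for the first page).
type fetchPage func(cursor string) (listReposPage, error)

// httpFetcher builds a fetchPage that queries the given host over HTTP.
func httpFetcher(host string, limit int) fetchPage {
	return func(cursor string) (listReposPage, error) {
		q := url.Values{}
		q.Set("limit", strconv.Itoa(limit))
		if cursor != "" {
			q.Set("cursor", cursor)
		}
		resp, err := http.Get(host + "/xrpc/com.atproto.sync.listRepos?" + q.Encode())
		if err != nil {
			return listReposPage{}, err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return listReposPage{}, fmt.Errorf("listRepos: unexpected status %s", resp.Status)
		}
		var page listReposPage
		err = json.NewDecoder(resp.Body).Decode(&page)
		return page, err
	}
}

// walkRepos follows the cursor page by page and calls visit for every repository.
func walkRepos(fetch fetchPage, visit func(repo)) error {
	cursor := ""
	for {
		page, err := fetch(cursor)
		if err != nil {
			return err
		}
		for _, r := range page.Repos {
			visit(r)
		}
		if page.Cursor == "" || len(page.Repos) == 0 {
			return nil
		}
		cursor = page.Cursor
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: listrepos <relay base URL>")
		return
	}
	count := 0
	err := walkRepos(httpFetcher(os.Args[1], 1000), func(_ repo) { count++ })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("repositories discovered:", count)
}
```

Each page depends on the cursor returned by the previous one, so the walk in this sketch is strictly sequential.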
10/ The biggest collection excluding Bluesky (jp.5leaf.sync.mastodon) is only 280MB compressed in total.

At the tail end, there are a large number of very small collections, most of which appear spammy.
December 31, 2025 at 7:21 AM
9/ At the collection level, the Bluesky-related collections (app.bsky.* and chat.bsky.*) form the overwhelming bulk of the data stored on the network (literally 99.9% by size).

In decreasing order of size: likes (a majority of the data), then posts, then reposts and finally follows.
December 31, 2025 at 7:21 AM
8/ Speaking of compression, in my current storage (CARv1 format, zstd level 1), most repositories achieve a compression ratio above 2x. Still, 20% of the repositories compress very little, at a ratio of 1.51x.

That is surprising given the amount of redundancy in the data (i.e. structs are stored as maps).
December 31, 2025 at 7:21 AM
7/ On the other side of the size distribution, the biggest repository is 457MB compressed, and 1.18GB uncompressed.

Definitely some interesting outliers here.
December 31, 2025 at 7:21 AM
6/ Not unexpectedly, most of the repositories are very small. 20% of the repositories are below 912B while 50% of the repositories are below 6.34kB.

You have to wait until the 97th percentile to reach 1.04MB.
December 31, 2025 at 7:21 AM
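For readers who want to compute the same kind of size distribution on their own data, here is a small sketch of a nearest-rank percentile over repository sizes. The method is an assumption (the posts do not say how the percentiles were computed), and the sizes in `main` are made-up placeholders, not the real dataset.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100) of sizes:
// the smallest value such that at least p% of the values are less than or equal to it.
// It works on a sorted copy, so the caller's slice is left untouched.
func percentile(sizes []int64, p float64) int64 {
	if len(sizes) == 0 {
		return 0
	}
	sorted := append([]int64(nil), sizes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Placeholder sizes in bytes, for illustration only.
	sizes := []int64{512, 900, 2048, 6000, 7000, 15000, 40000, 120000, 900000, 5000000}
	for _, p := range []float64{20, 50, 97} {
		fmt.Printf("p%v: %d bytes\n", p, percentile(sizes, p))
	}
}
```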
GitHub, my boy. Dates are hard.
November 4, 2025 at 8:18 PM
What you are looking at here is the 301 response of Nginx 1.22.1.
September 16, 2025 at 4:27 AM