HTTP Archive 💾
@httparchive.org
750 followers 31 following 25 posts
Public dataset that tracks how the web is built. Maintained by @patmeenan.com, @paulcalvano.bsky.social, @tunetheweb.com, @maxostapenko.com, and Nurullah Demir
Pinned
httparchive.org
What do you think? Shall we do another Web Almanac this year? Diving deep into all the data we collect to see what’s changing in web trends?

Check out this post if interested in getting involved:
github.com/HTTPArchive/...

Or nominate your favorite experts that you’d love to see author a chapter.
Contribute to the 2025 Web Almanac · HTTPArchive almanac.httparchive.org · Discussion #4062
Dear all, We are excited to announce the Call for Contributions for the 2025 Web Almanac (6th Edition)! The Web Almanac is an annual report that provides an overview of the state of the web, based ...
github.com
Reposted by HTTP Archive 💾
paulcalvano.bsky.social
New @httparchive.org analysis about sites adding AI Bots and Crawlers to their robots.txt files. While robots.txt doesn't "block" bots by itself, it's a clear demonstration of the preferences of site owners and the sentiment towards AI crawlers on today's web. paulcalvano.com/2025-08-21-a...
AI Bots and Robots.txt
There’s been a lot of discussion lately around AI crawlers and bots, which are used to train LLMs and/or fetch content on behalf of their users. In the past few weeks I’ve seen blog posts about the am...
paulcalvano.com
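A rough illustration of the point that robots.txt states a preference rather than enforcing a block: the sketch below parses a made-up robots.txt with Python's standard `urllib.robotparser` and asks whether two user agents are permitted. The file contents, the `GPTBot` token and the example.com URL are assumptions chosen for illustration and are not taken from the linked analysis. Only a crawler that runs a check like this and honors the answer is actually kept out.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks one AI crawler to stay away, allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching; nothing forces it to.
ai_bot_allowed = parser.can_fetch("GPTBot", "https://example.com/page")
browser_allowed = parser.can_fetch("Mozilla/5.0", "https://example.com/page")

print("GPTBot allowed:", ai_bot_allowed)
print("Browser allowed:", browser_allowed)
```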
httparchive.org
We're pleased to welcome Nurullah Demir as the newest maintainer of the HTTP Archive project!

Nurullah has helped lead the Web Almanac over the last two years and we look forward to seeing what this year's edition comes up with!

Welcome to the team Nurullah!
httparchive.org
🚨 Calling all web experts! 🚨

The 2025 Web Almanac is still open for contributors!

Know someone perfect for it? Mention them here and help us reach the right folks. 🙌

📢 Please help us spread the word!

🔗 Learn more: github.com/HTTPArchive/...
Reposted by HTTP Archive 💾
tamethebots.com
It's that time of year where the web almanac is seeking contributors for the next edition.
It takes a fair amount of work, but it's ridiculously rewarding,
& guaranteed warm fuzzy feelings when you see the chapter you contributed towards published.

I encourage you to give it a go, see:
Contribute to the 2025 Web Almanac · HTTPArchive almanac.httparchive.org · Discussion #4062
Dear all, We are excited to announce the Call for Contributions for the 2025 Web Almanac (6th Edition)! The Web Almanac is an annual report that provides an overview of the state of the web, based ...
github.com
Reposted by HTTP Archive 💾
not-a-robot.com
Ask me how excited I got seeing @johnmu.com cite my and @tamethebots.com's Page Weight chapter at Search Central Live
John Mueller on stage in front of a graph of Page Weight by year
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
Day 010 #100DaysOfPerf: As an apt complement and follow-up to the page weight post: working with media is rife with challenges. Media loading, media formats, media codecs... But it's what makes the web so valuable.
✨ MEDIA ✨ chapter from the @httparchive.org is another worth checking 🧵⬇️
Illustration of a video camera projecting a PLAY BUTTON. Caption: "MEDIA"
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
Though I shared a link from the @httparchive.org and the 2024 Web Almanac, sharing a chart from the actual archive: over 5 yrs, the page weight has grown almost 30%, and there has never been an indication of slowing down. The almanac chapter covers much of the data proving so. 🧵⬇️
Chart indicating page weight of desktop and mobile pages between February 1 2020 and February 1 2025

MEDIAN DESKTOP
2678.0 KB 
up 28.7%

MEDIAN MOBILE
2409.8 KB
up 27.8%
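As a quick sanity check on those figures, the arithmetic below backs out the February 2020 medians implied by the chart. It assumes the "up" percentages are measured relative to the 2020 value, which the chart text does not state outright; the function name is made up for this sketch.

```python
def implied_baseline_kb(current_kb: float, growth_pct: float) -> float:
    """Return the starting value that grows by growth_pct to reach current_kb."""
    return current_kb / (1 + growth_pct / 100)

# Values quoted in the chart for February 1 2025.
desktop_2020_kb = implied_baseline_kb(2678.0, 28.7)
mobile_2020_kb = implied_baseline_kb(2409.8, 27.8)

print(f"Implied 2020 desktop median: {desktop_2020_kb:.1f} KB")
print(f"Implied 2020 mobile median: {mobile_2020_kb:.1f} KB")
```

Under that assumption the medians five years earlier come out at roughly 2.1 MB on desktop and 1.9 MB on mobile.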
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
Day 007 #100DaysOfPerf: Since I touched on the @httparchive.org Web Almanac, decided to continue sharing more of their content. A note: The HTTP Archive really started as a way to see how the web was built, and its performance. So it will always have a hint of performance discussion. Today? FONTS 🧵⬇️
Illustration of 'F' font being pushed down a conveyor belt by illustrated characters. Caption "fonts"
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
Day 006 #100DaysOfPerf: Let's keep it moving by highlighting a great piece from the @httparchive.org Web Almanac, and their #performance chapter. "No one ever complained about a fast website", is a classic quote over the years, and the Performance Chapter highlights web data and patterns 🧵⬇️
Illustration of a browser window, a stopwatch, and some resources. Caption: "part two, chapter 9, performance."
httparchive.org
So they may be the same as desktop pages, they may be responsive and deliver different content to mobiles (based on user-agent, or viewport size), or they may be mobile-optimized sites not visited by desktop devices (e.g. m.facebook.com).
httparchive.org
We get a list of origins visited by mobile devices (based on user-agent) from the Chrome User Experience Report (CrUX). We then crawl them with a mobile viewport, a mobile user agent, and a 4G setting, as detailed in our methodology: almanac.httparchive.org/en/2024/meth...
Methodology | The Web Almanac by HTTP Archive
Describes how the 2024 Web Almanac was put together: The Datasets and Tools used and how the project was run.
almanac.httparchive.org
httparchive.org
An enormous thanks to all 88 contributors who made the 2024 edition possible.

And we've already started talking about a 2025 edition. Check out our GitHub repo to get a sense of the project, and there's an interest form if you want to help build the 2025 edition:
github.com/HTTPArchive/...
github.com
httparchive.org
That marks the end of the 2024 Web Almanac. A little later than usual but worth it in the end.

It also means the ebook is now available:
almanac.httparchive.org/en/2024/tabl...

754 pages jam-packed with all sorts of nerdy goodness!!!
Table of Contents | Web Almanac 2024
Table of Contents for the 2024 Web Almanac, listing each section: Page Contents, User Experience, Content Publishing, Content Distribution.
almanac.httparchive.org
Reposted by HTTP Archive 💾
tammyeverts.com
I recently published my annual dive into the @httparchive.org, focusing on page growth, #webperf and #ux:

www.speedcurve.com/blog/page-bl...

A common question is "How big SHOULD my pages be?" According to analysis by @infrequently.org, the ideal page should be <1.4 MB with <365 KB coming from JS.
A table that gives ideal versus actual page size and JavaScript size:

The ideal JS weight is under 365 kilobytes. The actual median JS weight is 650 kilobytes. The actual JS weight at the 90th percentile is 1825 kilobytes.

The ideal total page weight is under 1.4 megabytes. The actual median page weight is 2.6 megabytes. The actual page weight at the 90th percentile is 11.1 megabytes.
Reposted by HTTP Archive 💾
tammyeverts.com
Just published the results of my annual dive into the @httparchive.org. Key findings:

😱 Med page has grown 8%
😱 90p page has grown 24%
😱 90p mobile page is 10MB
😱 Main culprits: JS & video

Dig in & learn what your page size targets should be and how to hit them: www.speedcurve.com/blog/page-bl...
SpeedCurve | Page bloat update: How does ever-increasing page size affect your business and your users?
The median web page has grown 8% in one year. How does this affect your Core Web Vitals, your search rank, your business and your users?
www.speedcurve.com
httparchive.org
Just about to start...
henrihelvetica.bsky.social
✨ The Web Almanac LIVE STREAM II ✨
featuring Chapters (authors):
🔸 SEO (Jamie, Mikael)
🔸 Privacy (Max O)
🔸 HTTP (Robin)
🔸 Cookies (Yana)
🔸 3rd Parties (Yash)
📆 Thursday, January 16th
⏰ 14h EST, everytimezone.com/s/6e8b3a3d
🔗 www.youtube.com/live/zCiMls2...
A 🔄 would be ✨🙏🏾✨.
Web Almanac
By HTTP Archive
THE WEB ALMANAC LIVE STREAM! II. The avatar of the host
The avatars for 6 persons. 
ONLINE
JANUARY 16, THURSDAY
time: 14h EST, 11h PST, 20h CET.
who: the authors!
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
HTTP/1.1/2/3 are all present in unison on the web today, w/ a 21/70/9 split. Or is it? @programmingart.bsky.social will shed light on the @httparchive.org Web Almanac data, and the reality of the protocol's adoption. Join us to hear him share findings this Thursday
🔗 ⬇️
bsky.app/profile/henr...
Http version usage per website
 on desktop: 22% H1, 71% H2, 7% H3
 on mobile: 21% H1, 70% H2, 9% H3
Reposted by HTTP Archive 💾
henrihelvetica.bsky.social
One of the @httparchive.org Web Almanac chapters we'll explore next week is SEO. Some interesting notes:
🔸 14% of robots.txt files return a 404 🤯 (1 in 7)
🔸 10%+ of pages have invalid elements in the <head>, which has important consequences.
🔹 Find out why + more 1 week today
🔗 See pinned tweet
an illustration of a search bar and web pages.