Nintendofan885
@nintendofan885.bsky.social
87 followers 140 following 180 posts
Also https://mastodon.social/@Nintendofan885
Pfp: fennec fox at a zoo
Banner: Towers in Scunthorpe
heh, web.archive.org/details/tld:... got updated with data up to May (previously didn't include the 2025 bar)

Captures: 2,204,810,318
URLs: 1,725,452,488
New URLs: 1,506,233,721

I feel like the Archive Team project may have helped a bit :P
Reposted by Nintendofan885
🌐 From the first web page created in 1991… to 1 trillion web pages archived today.

Every meme, blog, tweet & vanished site is part of our shared story. This is our collective memory. And it’s being saved.

Join in our celebration this October: blog.archive.org/trillion/

#Wayback1T #WaybackMachine
nice :)

FYI looks like you accidentally made a typo with the year
Reposted by Nintendofan885
We are wrapping up seed list nominations for #EOT2024.

Going forward, you can submit government URL nominations using a similar tool: digital2.library.unt.edu/nomination/G...

Crawls of post-EOT seeds won’t be part of the #EOTArchive, but will appear in the @archive.org Wayback Machine
Nomination Tool: About Project
digital2.library.unt.edu
Reposted by Nintendofan885
Ever wanted to restore a taken down site from a web archive, with link navigation and references intact?
Our latest tooling makes this simpler than ever!

Announcing govarchive.us - a dynamic, web archive-powered mirror of US government sites!

More details on our blog
webrecorder.net/blog/2025-03...
Introducing GovArchive.us & Mirroring Entire Sites with Web Archives • Webrecorder Blog
Introducing GovArchive.us and tooling to mirror web sites using web archives.
webrecorder.net
Reposted by Nintendofan885
ah, just realised that [something]-mil.govarchive.us works for .mil domains
nice :)

What about domains not on .gov? (e.g. military sites)
Reposted by Nintendofan885
🗽 Give us your hidden, your overlooked,

Your orphaned Gov URLs yearning to be preserved,

The forgotten databases of your civic shore.

Send these, the neglected, the soon-to-vanish, to us—

We lift our crawler beside the open web. 🗽

#EOTArchive #EOT2024
Reposted by Nintendofan885
We’re excited to share the first batch of US Government websites that Webrecorder has archived as part of the @eotarchive.org initiative. They’re now available on our public collections gallery: app.browsertrix.com/explore/usgo...

#WebArchiving #Browsertrix #EOTarchive
FYI the URL is now web.archive.org/collection-s... after the collection was fixed
sorry, I mean in relation to the Archive Team project
BTW did the ScienceBase URLs ever get run? (I remember mentioning them in the IRC on the first day)
thanks whoever replied to my email about it :)
Reposted by Nintendofan885
We just launched a 16TB archive of every dataset that has been available on data.gov since November. This will be updated day by day as new datasets appear. It can be freely copied, and we're sharing the code behind it to help others make their own archives of data they depend on.
Announcing the Data.gov Archive | Library Innovation Lab
Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complet...
lil.law.harvard.edu
Reposted by Nintendofan885
Penn is getting a lot of questions about Data Refuge. That effort no longer exists, but several efforts are currently active. I've created a doc from what I & others have suggested. I'll update as I hear more. Feel free to share or suggest: docs.google.com/document/d/1...
Data Rescue Efforts
Data / Website Rescue Efforts End of Term Crawl - The main coordinated effort to archive websites, but datasets have been more of a challenge. EDGI - They have been focused on environmental data. A ...
docs.google.com
It's still ongoing. The first crawl ran from September to the inauguration and the second crawl (post-inauguration) started on the 1st and continues until about April