scott b. weingart
scottbot.bsky.social
past: circus performer; historian of science; librarian; chief data officer at NEH.

present: dad; resident scholar at dartmouth; chief technology officer at the library of virginia.

personal account; views solely my own.

https://scottbot.github.io
Pinned
📌Hi! I'm Scott, a historian of science.

Before DOGE, I helped the US fund the humanities efficiently and impactfully, to reach the breadth of the American public.

Now I help make the Library of Virginia's rich collections and services digitally accessible to all.

Personal account, mostly silly.📌
Wait wait wait msnbc/msnow hasn't been associated with microsoft since 2005???
November 25, 2025 at 11:30 PM
Reposted by scott b. weingart
It was a remarkable feeling, working on a recent project, to have access to software that transcribed the marginalia as well as the printed text.
Nearly-perfect printed and handwritten text recognition is the most consequential technical contribution to the study of human culture of the last fifteen years, and it's not even close.

It fundamentally changes our (both lay and expert) relationship with the written past.
New issue of my newsletter: "The Writing Is on the Wall for Handwriting Recognition" — One of the hardest problems in digital humanities has finally been solved, and it's a good use of AI newsletter.dancohen.org/archive/the-...
November 25, 2025 at 7:19 PM
Reposted by scott b. weingart
Now this is an overview worth reading #skystorians. I’ve been running table models with pre-trained German language models (with pretty high CER) and it still took my total data entry time down at least 60-70%.
November 25, 2025 at 7:06 PM
Nearly-perfect printed and handwritten text recognition is the most consequential technical contribution to the study of human culture of the last fifteen years, and it's not even close.

It fundamentally changes our (both lay and expert) relationship with the written past.
New issue of my newsletter: "The Writing Is on the Wall for Handwriting Recognition" — One of the hardest problems in digital humanities has finally been solved, and it's a good use of AI newsletter.dancohen.org/archive/the-...
The Writing Is on the Wall for Handwriting Recognition
One of the hardest problems in digital humanities has finally been solved
newsletter.dancohen.org
November 25, 2025 at 6:14 PM
Reposted by scott b. weingart
MajinBook is a badly-needed catalog for shadow libraries. It provides metadata (e.g., date of first publication, popularity on Goodreads) for over half a million English-language books. arxiv.org/abs/2511.11412 +
MajinBook: An open catalogue of digital world literature with likes
This data paper introduces MajinBook, an open catalogue designed to facilitate the use of shadow libraries--such as Library Genesis and Z-Library--for computational social science and cultural analyti...
arxiv.org
November 21, 2025 at 2:24 PM
Reposted by scott b. weingart
“Blogs are one of the great literary inventions of our time. Coming somewhere between an essay and a diary entry, they are a form of personal journalism that is intimate and immediate... but they also have… a rough-edged informality that breaks down barriers. They are engaging.”
November 12, 2025 at 11:22 AM
Reposted by scott b. weingart
We are so proud of this work. Not only is it the first effort to publish & analyze **open-access data** derived from the entire text contents of digitized @britishlibrary.bsky.social newspapers, it presents a metadata-driven approach to understanding bias in big historical data. #dh #skystorians
November 11, 2025 at 5:07 PM
Reposted by scott b. weingart
!Stop Press! Article on bias in digitised newspaper collections: ’Whose News’, in the new journal of @comphumresearch.bsky.social by Kaspar Beelen, @jonhistorian61.bsky.social, @kmcdono.bsky.social and me. See blog for summary & 🧵 1/7

Article doi.org/10.1017/chr....

Blog is.gd/2IFc30

#dh #c19 🗃️
Whose news? Critical methods for assessing bias in large historical datasets | Computational Humanities Research | Cambridge Core
Whose news? Critical methods for assessing bias in large historical datasets - Volume 1
doi.org
November 11, 2025 at 4:05 PM
Reposted by scott b. weingart
Great news! This is out: Opening the black box of EEBO academic.oup.com/dsh/advance-...
Opening the black box of EEBO
Abstract. Digital archives that cover extended historical periods can create a misleading impression of comprehensiveness while in truth providing access t
academic.oup.com
November 9, 2025 at 10:30 AM
Reposted by scott b. weingart
“And then you have librarians who are experiencing a real existential crisis because they are getting asked by their jobs to promote [AI] tools that produce more misinformation. It's the most, like, emperor-has-no-clothes-type situation that I have ever witnessed.” - Alison Macrina
AI Is Supercharging the War on Libraries, Education, and Human Knowledge
"Fascism and AI, whether or not they have the same goals, they sure are working to accelerate one another."
www.404media.co
November 7, 2025 at 7:15 AM
Reposted by scott b. weingart
How do we navigate gaps and challenges in quantifying and understanding innovation and R&D in the creative sector? Excellent roundup from @suzannerblack.bsky.social on how difficult it is to count and map innovation, and "return on funder investment", in the creative industries.
November 7, 2025 at 12:03 PM
Reposted by scott b. weingart
The first (1955) Danish edition of Ray Bradbury’s FAHRENHEIT 451. Later editions did not convert the title, so this is the only SI-compatible edition! 🎢
November 1, 2025 at 7:55 PM
Reposted by scott b. weingart
Haven't seen any linguistics research on character limits, but perhaps someone who follows me might know of some!
Hey, @gretchenmcc.bsky.social, sorry to at you like this out of the blue, but couldn't think of anyone better to ask.

Are you aware of any research looking at how character limits influence word choice on social media?
November 1, 2025 at 11:10 PM
Reposted by scott b. weingart
A permanent post in my department. Closing date Dec 14th 2025, interviews in March. Please spread #histsci
Assistant Professor in History of Knowledge Pre-1400
Applications are invited for the position of Assistant Professor in History of Knowledge Pre-1400, in the Department of History and Philosophy of Science at the University of Cambridge. Please note
www.cam.ac.uk
October 30, 2025 at 10:07 AM
Reposted by scott b. weingart
As DH grows, it’s increasingly important to publish conference papers, but there hasn’t been a clear venue for that.

So I’m thrilled to share this new home for DH proceedings, which will include CHR papers & more.

Thanks to @taylor-arnold.bsky.social for leading this effort!

bit.ly/ach-anthology
October 29, 2025 at 3:39 PM
Reposted by scott b. weingart
Every time I discover a new piece on the dangers of LLMs, particularly for research and teaching, I add it to a Zotero library. I figure that I might as well share it, so here's my library of Cautionary AI Tales: www.zotero.org/groups/62758...
October 28, 2025 at 10:11 AM
Reposted by scott b. weingart
Online! Free!
@londonmedievalsoc.bsky.social London Medieval Society will be hosting its first colloquium of the academic year on the 22nd November 2025 on the subject of Women and Knowledge in the Middle Ages. 😊

You can find the Zoom link here: us02web.zoom.us/j/81823681150
October 27, 2025 at 1:38 AM
Reposted by scott b. weingart
'Cost-Effective Machine Learning for Automatically Processing Bibliographic Metadata' is a very readable account of using DistilBERT for specific DH tasks www.euppublishing.com/doi/full/10.... #AI4LAM
www.euppublishing.com
October 25, 2025 at 12:17 PM
Reposted by scott b. weingart
(I've occasionally heard the argument that even rejections add "value" to the general "market." For example, research applications help scholars gather thoughts, and can be reused for articles. Which is true, but may not outweigh the sunk costs & resource rebalancing from poor to rich institutions.)
October 24, 2025 at 7:00 PM
Reposted by scott b. weingart
Weirdly, this means in some cases that adding a (competitive, onerous, and popular) grant program where one didn't exist can actually cost a community more than if it didn't exist at all. So, that's cursed knowledge for you!
October 24, 2025 at 4:31 PM
Reposted by scott b. weingart
PSA: Grant programs with sufficiently low acceptance rates and payouts can cost more for applicants than they distribute to awardees.

Say it costs $5k in salaried hours to complete a proposal. 1k people apply. $1m is distributed overall. For every $1 spent, 20¢ of funding goes out.

This happens.
October 24, 2025 at 4:25 PM
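The arithmetic in the PSA above can be checked with a short sketch. The function and parameter names are illustrative assumptions, and the figures are the hypothetical ones from the post, not data from any real grant program.

```python
def funding_per_dollar_spent(cost_per_proposal, applicants, total_awarded):
    """Return dollars of funding distributed per dollar applicants spend applying."""
    # Collective cost of everyone writing a proposal, funded or not.
    total_applicant_cost = cost_per_proposal * applicants
    return total_awarded / total_applicant_cost


# Figures from the post: $5k per proposal, 1k applicants, $1m distributed.
ratio = funding_per_dollar_spent(5_000, 1_000, 1_000_000)
print(f"{ratio * 100:.0f} cents of funding per $1 of applicant effort")
```

Under these assumptions, applicants collectively spend $5m to compete for $1m. The program distributes as much as it costs to apply for only once the ratio reaches 1, for example if the same pot drew 200 applicants instead of 1,000.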
Reposted by scott b. weingart
A neat tool I just came across: Viabundus, a digital road map of northern Europe 1350-1650, that lets you calculate contemporary travel routes/times. In 1500, going Amiens → Köln by horse took almost 7 days and 13 toll payments.

#medievalsky

www.landesgeschichte.uni-goettingen.de/handelsstras...
October 24, 2025 at 10:58 PM
Reposted by scott b. weingart
1. We ( @jbakcoleman.bsky.social, @cailinmeister.bsky.social, @jevinwest.bsky.social, and I) have a new preprint up on the arXiv.

There we explore how social media companies and other online information technology firms are able to manipulate scientific research about the effects of their products.
October 24, 2025 at 12:47 AM