William J.B. Mattingly
@wjbmattingly.bsky.social
470 followers · 100 following · 300 posts
Digital Nomad · Historian · Data Scientist · NLP · Machine Learning · Cultural Heritage Data Scientist at Yale · Former Postdoc at the Smithsonian · Maintainer of Python Tutorials for Digital Humanities · https://linktr.ee/wjbmattingly
wjbmattingly.bsky.social
🚨Job ALERT🚨! My old postdoc is available!

I cannot emphasize enough how life-altering this position was for me. It gave me the experience that I needed for my current role. As a postdoc, I was able to define my projects and acquire a lot of new skills as well as refine some I already had.
Reposted by William J.B. Mattingly
bcgl.bsky.social
Excited to be co-editing a special issue of @dhquarterly.bsky.social on Artificial Intelligence for Digital Humanities: Research problems and critical approaches
dhq.digitalhumanities.org/news/news.html

We're inviting abstracts now - please feel free to reach out with any questions!
wjbmattingly.bsky.social
Ahh no worries!! Thanks! I hope you had a nice vacation
wjbmattingly.bsky.social
Something I've realized over the last couple weeks with finetuning various VLMs is that we just need more data. Unfortunately, that takes a lot of time. That's why I'm returning to my synthetic HTR workflow. This will be packaged now and expanded to work with other low-resource languages. Stay tuned
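A synthetic HTR workflow of this kind generally pairs rendered line images with their ground-truth text. The sketch below is a generic illustration, not the actual package mentioned above; the font path and example lines are placeholders.

```python
# Generic sketch of synthetic HTR data generation: render ground-truth text
# onto line images so (image, transcription) pairs can be used for finetuning.
# Font path and example lines are placeholders.
from pathlib import Path
from PIL import Image, ImageDraw, ImageFont

def render_line(text: str, font_path: str, font_size: int = 48) -> Image.Image:
    """Render a single line of text onto a white canvas with a small margin."""
    font = ImageFont.truetype(font_path, font_size)
    _, _, right, bottom = font.getbbox(text)
    img = Image.new("RGB", (right + 40, bottom + 40), "white")
    ImageDraw.Draw(img).text((20, 20), text, font=font, fill="black")
    return img

Path("synthetic").mkdir(exist_ok=True)
lines = ["in principio erat verbum", "et verbum erat apud deum"]  # placeholder text
for i, line in enumerate(lines):
    render_line(line, "fonts/medieval.ttf").save(f"synthetic/{i:05d}.png")
```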
wjbmattingly.bsky.social
No problem! It's hard to fit a good answer in 300 characters =) Feel free to DM me any time.
wjbmattingly.bsky.social
Also, whether you are doing a full finetune vs. LoRA adapters is another thing to consider. It also depends on the model arch.
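For context on the full-finetune-vs-LoRA distinction: a LoRA run only trains small adapter matrices injected into selected layers. A minimal, generic sketch with Hugging Face peft follows; the base model id and target_modules are assumptions and, as noted in the post, architecture-dependent.

```python
# Generic sketch of the LoRA-adapter alternative to a full finetune, using
# Hugging Face peft. Base model id and target_modules are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("some/base-model")  # placeholder id
lora_cfg = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # architecture-dependent choice
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of weights train
```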
wjbmattingly.bsky.social
I hate saying this, but it's true: it depends. For line-level medieval Latin (out of scope, but a small problem size), 1-3k examples seem to be fine. For page-level, out-of-scope problems, it becomes much more challenging and very model-dependent: 1-10k in my experience.
wjbmattingly.bsky.social
I've been getting asked for training scripts whenever a new VLM drops. Instead of scripts, I'm going to start updating this new Python package. It's not fancy. It's for full finetunes. This was how I first trained Qwen 2 VL last year.
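The package itself isn't named in the post, but a bare-bones full finetune of a VLM like Qwen2-VL with transformers roughly takes this shape. This is a sketch only: the checkpoint id, prompt, and collator details are assumptions, and dataset loading is omitted.

```python
# Bare-bones sketch of a full VLM finetune (every parameter trainable).
# Checkpoint id, prompt, and dataset handling are placeholders; train_dataset
# is assumed to yield {"image": PIL.Image, "text": str} items.
import torch
from transformers import (AutoProcessor, Qwen2VLForConditionalGeneration,
                          Trainer, TrainingArguments)

model_id = "Qwen/Qwen2-VL-2B-Instruct"  # placeholder checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def collate(batch):
    """Build chat-formatted (image, transcription) training examples."""
    texts, images = [], []
    for ex in batch:
        messages = [
            {"role": "user", "content": [
                {"type": "image"},
                {"type": "text", "text": "Transcribe this line."}]},
            {"role": "assistant", "content": [
                {"type": "text", "text": ex["text"]}]},
        ]
        texts.append(processor.apply_chat_template(messages, tokenize=False))
        images.append(ex["image"])
    inputs = processor(text=texts, images=images, return_tensors="pt", padding=True)
    inputs["labels"] = inputs["input_ids"].clone()  # simple full-sequence loss
    return inputs

args = TrainingArguments(
    output_dir="vlm-full-finetune",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
)
# train_dataset: any sequence of {"image": ..., "text": ...} dicts (not shown here)
trainer = Trainer(model=model, args=args, data_collator=collate, train_dataset=train_dataset)
trainer.train()
```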
wjbmattingly.bsky.social
Let's go! Training LFM2-VL 1.6B on the Catmus dataset on @hf.co now. Will start posting some benchmarks on this model soon.
wjbmattingly.bsky.social
Training on the full Catmus now, and the results after the first checkpoint are very promising. Character-level and massive word-level improvements.
wjbmattingly.bsky.social
LiquidAI cooked with LFM2-VL. At the risk of sounding like an X AI influencer, don't sleep on this model. I'm finetuning right now on Catmus. A small overnight test on only 3k examples is showing remarkable improvement. Training now on 150k samples. I see this as potentially replacing TrOCR.
wjbmattingly.bsky.social
New super lightweight VLM just dropped from Liquid AI in two flavors: 450M and 1.6B. Both models can work out-of-the-box with medieval Latin at the line level. I'm fine-tuning on Catmus/medieval right now on an h200.
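For anyone wanting a look at the data referenced in these posts, the line-level dataset can be inspected on the Hub before committing to a run. The split name is an assumption, and the column names are printed rather than guessed.

```python
# Quick inspection of the line-level dataset mentioned in the post.
# Dataset id as referenced in the post; check the dataset card for details.
from datasets import load_dataset

ds = load_dataset("CATMuS/medieval", split="train")
print(len(ds), "line images")
print(ds.column_names)  # which fields hold the image and the transcription
print(ds[0])
```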
Reposted by William J.B. Mattingly
aboutgeo.bsky.social
With #IMMARKUS, you can already use popular AI services for image transcription. Now, you can also use them for translation! Transcribe a historic source, select the annotation—and translate it with a click.
wjbmattingly.bsky.social
GLM-4.5V with line-level transcription of medieval Latin in Caroline minuscule. Inference was run through @hf.co Inference via Novita.
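For reference, calling a hosted VLM through Hugging Face Inference Providers with Novita as the provider looks roughly like this. The model repo id, image URL, and prompt below are assumptions, not the exact call used for the post.

```python
# Sketch of remote inference through Hugging Face Inference Providers with
# Novita as the provider. Model id, image URL, and prompt are placeholders.
from huggingface_hub import InferenceClient

client = InferenceClient(provider="novita")  # reads HF_TOKEN from the environment
response = client.chat_completion(
    model="zai-org/GLM-4.5V",  # assumed repo id for GLM-4.5V
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/caroline-minuscule-line.jpg"}},
            {"type": "text",
             "text": "Transcribe this line of medieval Latin exactly as written."},
        ],
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)
```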
wjbmattingly.bsky.social
Qwen3-4B Thinking finetune nearly ready to share. It can convert unstructured natural language, non-LinkedArt JSON, and HTML into LinkedArt JSON.
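For readers who haven't met the target format: LinkedArt (Linked Art) is a JSON-LD profile for cultural heritage data. A minimal hand-written record, shown here as a Python dict purely for illustration (not model output):

```python
# Illustrative minimal Linked Art-style JSON-LD record, hand-written for
# context on the target format; not output from the finetuned model.
import json

record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "type": "HumanMadeObject",
    "_label": "Illuminated manuscript leaf",
    "identified_by": [{"type": "Name", "content": "Illuminated manuscript leaf"}],
}
print(json.dumps(record, indent=2))
```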
wjbmattingly.bsky.social
I need to get back to my Voynich work soon! I will finally have time in a couple months I think.
Reposted by William J.B. Mattingly
wjbmattingly.bsky.social
Hmm, I think in those scenarios it may default to character parsing, but it wouldn't leverage the language model component very well. If you have some examples, I can test them out and see what happens.
wjbmattingly.bsky.social
Good question! I think it could handle the layout parsing aspect with enough training data (maybe 2k pages?). The problem is where to put the HTR/OCR output for reading order. Also, the quality of the HTR/OCR will depend on the language. Is this for medieval Latin?
wjbmattingly.bsky.social
5. Overall, this is a great model and at 1.7B I am seriously amazed at how well it handles two complex tasks (layout parsing and OCR/HTR) in tandem.
wjbmattingly.bsky.social
4. Getting Dots.OCR to learn the features and syntax of a new language is a daunting task. I have 2.3k pages (some bi-paginal) of Old Church Slavonic. This is an entirely unsupported language. It started to learn some of the new characters (ligatures) but struggled with the syntax.