Kathy
kathaem.bsky.social
53 followers 110 following 21 posts
Computational Linguistics / Multilingual Language Models Into SciFi, choir, cats (incomplete list of interests) they/them
@aclanthology.org not sure where to report, but in the last few months I've often had issues with long loading times/timeouts on aclanthology.org. It's particularly bad today---maybe related to the upcoming ARR deadline?
Idk about "primarily" mate
You mean the most popular *US* politicians on this list
Personally, sleeping more and vitamin D in the winter.

...sorry, not much of a baker
@aclrollingreview.bsky.social Why is the reviewing window (still) so short this cycle? Wasn't the cycle extended to ten weeks specifically to make the process more manageable? Wasn't it three weeks in past cycles? Instead reviewers don't even get two full weeks to handle 4+ submissions.
Reposted by Kathy
Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token." 🖼️➡️🔤 #VisualTokenization

Interested in tokenization? Join our workshop tokenization-workshop.github.io
The submission deadline is already May 30!
As a second language English speaker this also confused me for so long. Eventually I decided it must be from the phrase "having cake" which also means eating the cake
Just spent two days in Göttingen at #HumanCLAIM workshop! Re-presented my poster on surveying methods for cross-lingual representation alignment, got a city tour, heard cool talks and had interesting conversations 💬💭
Oh very nice to see a paper for this intuition, and the data could be very useful! Adding to the reading list 👀
Alignability is more predictive of cross-lingual transfer than divergence of literal token distributions, particularly for language pairs with disparate scripts.
Basically we argue that token overlap measures for predicting multilingual performance are too literal, and introduce the notion of **token alignability**, which can be measured via the scores of a statistical aligner over a corpus tokenised with a given tokeniser.
Reposted by Kathy
Following the MT Marathon, we're hosting a hackathon in Prague. Researchers and students from five institutions (+1 online) are working together to assess how robust #LLMs are to grammar errors in machine translation and related tasks. Thanks to EAMT for their support.
@queerinai.com Hi, I was invited to review for the workshop the other day but the email is not clear on when reviews will be due. This info will be important to decide if I'm able to serve; can you share the deadlines? Thanks!
Gotta say I'm not sure what pronunciation "luh-BOEV" is referring to but in my head it sounds like French beef
Germany. a) ground floor b) first floor. This matches how we count in German but the German terms basically treat the "upper floors" separately from the "ground floor"
Reposted by Kathy
Bill Labov died this morning. I'm not coherent enough to talk about how important and influential and brilliant he was. I am very sad.

I was so lucky to know him, and I am grateful every day that he (and Gillian, and Walt, etc) built an academic field where kindness is expected.
To add to the reviewing complaints 😅 Why do authors so often respond with an absolute wall of text? (Biggest response I got this time was four comments long.) As a reviewer, I find this very tough to engage with in the short discussion period, and as an author, I try to be concise in my responses.
5k is a small town, honestly 😂
Just wanted to say a quick thank you for organising a lovely social! 🎊🌈