Tokenization Workshop (TokShop) @ICML2025
@tokshop.bsky.social
Let's Talk about Tokenization https://tokenization-workshop.github.io
tokshop.bsky.social
🎥 Videos of our invited talks and the panel discussion are now also available on YouTube: www.youtube.com/@tokenizatio... ▶️
tokshop.bsky.social
🎥 Videos from our Tokenization Workshop are now live! Watch invited talks, panel discussions, and the best paper presentation at icml.cc/virtual/2025... #Tokenization #NLP #LLMs
tokshop.bsky.social
🏆 Announcing our Best Paper Awards!
🥇 Winner: "BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization" openreview.net/forum?id=AO7...
🥈 Runner-up: "One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression" openreview.net/forum?id=lC4...
Congrats! 🎉
tokshop.bsky.social
🔥 The Tokenization Workshop is happening NOW, and we have a packed room! It's great to see so much interest in tokenization research. #ICML2025 #Tokenization #LLM #NLP
tokshop.bsky.social
Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter @uvp.bsky.social, Desmond Elliott @delliott.bsky.social, and Adrian Łańcucki on cutting-edge tokenization research. Don't miss these keynote presentations! #ICML2025 tokenization-workshop.github.io/speakers
tokshop.bsky.social
🎤 Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at #ICML2025.
tokshop.bsky.social
The TokShop schedule is now live! Join us at #ICML2025 for invited talks, poster sessions, and a panel on the future of tokenization. tokenization-workshop.github.io/schedule #Tokenization #LLM #NLP
tokshop.bsky.social
TokShop @ #ICML2025 got way more submissions than expected! 📈 We could really use a few more reviewers to help out. If you have the capacity to review a #tokenization paper by Saturday, please fill out this form: forms.gle/32A6sQHQrMSb... 🙏
TokShop 2025
Registering interest in all things tokenization at TokShop @ ICML 2025 (July 18) Consider joining the Google group for future updates! https://groups.google.com/g/tokshop
forms.gle
tokshop.bsky.social
📣 We are extending the submission deadline by 24 hours to avoid a conflict with the ACL camera-ready deadline.

📅 New Submission Deadline: May 31, 2025 (23:59 AoE)

📩 OpenReview: openreview.net/group?id=ICM...
tokshop.bsky.social
Got a good tokenization paper under review at COLM, but the scores were a letdown? 😬

Why bother with a rebuttal when the perfect venue is right around the corner?

Submit your paper to the #ICML2025 Tokenization Workshop (TokShop) by May 30! 🚀
tokshop.bsky.social
Beyond text: Modern AI tokenizes images too! Vision models split photos into patches, treating each 16x16 pixel square as a "token." 🖼️➡️🔤 #VisualTokenization

Interested in tokenization? Join our workshop tokenization-workshop.github.io
The submission deadline is already May 30!
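
The patch idea from the post above can be sketched in a few lines. This is a minimal illustration, not any particular model's preprocessing: the `patchify` helper is hypothetical, and the 224x224 grayscale image is an assumed example size. Real vision models typically also flatten each patch and project it into an embedding.

```python
def patchify(image, patch_size=16):
    """Split an H x W image (a list of pixel rows) into non-overlapping
    patch_size x patch_size squares, returned in row-major order."""
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Slice patch_size rows, then patch_size columns from each row.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Assumed example: a synthetic 224 x 224 grayscale image.
image = [[(r * 224 + c) % 256 for c in range(224)] for r in range(224)]
patches = patchify(image)
print(len(patches))  # 196 patches, i.e. a 14 x 14 grid of "tokens"
```

Each of the 196 patches then plays the role that a subword token plays in a text model.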
tokshop.bsky.social
Got a tokenization paper rejected from ACL? Didn't submit to EMNLP/NeurIPS? Want to present your ACL/EMNLP/NeurIPS work non-archivally? Submit to TokShop @ ICML 2025!
The deadline is already May 30!
openreview.net/group?id=ICM...
tokenization-workshop.github.io
tokshop.bsky.social
Language matters: low-resource languages are severely overtokenized. While English uses ~1.2 tokens per word, a language like Tamil can require more tokens than characters, making #LLMs much costlier for billions of speakers! 💸🌍

Check out our ICML workshop 🔗 tokenization-workshop.github.io
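
How can a text need more tokens than characters? One way, sketched below under a simplifying assumption: a byte-level tokenizer that has learned no merges for a script falls back to one token per UTF-8 byte, and every Tamil code point takes three bytes. The `worst_case_byte_tokens` helper is hypothetical and models only that worst case, not the behavior of any specific tokenizer.

```python
def worst_case_byte_tokens(text: str) -> int:
    """Token count if every UTF-8 byte became its own token
    (assumed worst case for a byte-level tokenizer with no useful merges)."""
    return len(text.encode("utf-8"))


english = "Tamil"
# The word "Tamil" written in Tamil script, spelled out as 5 code points.
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

for word in (english, tamil):
    print(len(word), worst_case_byte_tokens(word))
# prints "5 5" for the English word and "5 15" for the Tamil word
```

Both words have five code points, but the Tamil one costs three times as many byte-level tokens before any merges apply.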
tokshop.bsky.social
Did you know BPE (Byte Pair Encoding), the most common LLM tokenizer, was originally a compression algorithm from 1994? #Tokenization #LLM #NLP

Want to find out more about tokenization? Attend our workshop at ICML! tokenization-workshop.github.io
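
For the curious, here is a minimal sketch of the core BPE loop: repeatedly find the most frequent adjacent pair and merge it into one symbol. It works on characters of a single string for readability. The original compression algorithm operated on bytes, and LLM tokenizers add pretokenization and a saved merge table, so treat the function names here as illustrative assumptions rather than any library's API.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most frequent adjacent pair, or None if there are no pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def bpe(text, num_merges):
    """Apply up to `num_merges` merge steps; return final tokens and merges."""
    tokens, merges = list(text), []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        merges.append(pair)
    return tokens, merges


tokens, merges = bpe("aaabdaaabac", num_merges=1)
print(tokens)  # ['aa', 'a', 'b', 'd', 'aa', 'a', 'b', 'a', 'c']
print(merges)  # [('a', 'a')]
```

After one merge the 11-character string is 9 symbols long. More merges shorten it further, which is the compression view of tokenization.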
tokshop.bsky.social
📝 Submit papers (up to 9 pages, shorter submissions welcome) via OpenReview: openreview.net/group?id=ICM...

🗓️ Important dates:
Deadline: May 30, 2025
Notifications: June 9, 2025
Workshop: July 18, 2025
Both archival and non-archival options available! #ICML2025 #TokShop #ML #NLP
tokshop.bsky.social
📣 Call for Papers Alert: TokShop @ ICML 2025
TokShop explores tokenization across all data modalities. Topics include: subword NLP techniques, multimodal approaches, multilingual challenges, post-training modification, alternative representations, and statistical perspectives.
ICML 2025 Workshop TokShop
Welcome to the OpenReview homepage for ICML 2025 Workshop TokShop
openreview.net
tokshop.bsky.social
Got a tokenization paper that just didn't make the cut for ICML? Submit it to the Tokenization Workshop TokShop at #ICML2025 -- we'd love to see it there!
tokenization-workshop.github.io
tokshop.bsky.social
In the upcoming weeks, we will announce an exciting line-up of invited talks and panelists. Follow our account @tokshop.bsky.social to stay tuned.

Join us at TokShop at #ICML2025!
tokshop.bsky.social
We're looking for papers on tokenization in text, vision, audio, multimodal, and more.

📝 Up to 9 pages (shorter welcome!)
🔍 Double-blind review
📚 Archival and non-archival options available
tokshop.bsky.social
There has been a lot of chatter about tokenization for LLMs over the last few months, but tokenization goes beyond text-based models.

It's time we bring the NLP and ML communities together to explore this foundational topic. Let's talk about tokenization at TokShop!
tokshop.bsky.social
🚨 NEW WORKSHOP ALERT 🚨

We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at #ICML2025 @icmlconf.bsky.social! 🎉

Submissions are open for work on tokenization across all areas of machine learning.

📅 Submission deadline: May 30, 2025
🔗 tokenization-workshop.github.io