sebastiandziadzio.com
go.bsky.app/NFbVzrA
Check out ✨ONEBench✨, where we show how sample-level evaluation is the solution.
🔎 arxiv.org/abs/2412.06745
Stay tuned for the official leaderboard and real-time personalised benchmarking release!
If you’re attending ACL or are generally interested in the future of foundation model benchmarking, happy to talk!
#ACL2025NLP #ACL2025
@aclmeeting.bsky.social
arxiv.org/abs/2412.06712
Model merging assumes all finetuned models are available at once. But what if they need to be created over time?
We study Temporal Model Merging through the TIME framework to find out!
🧵
So I'm really happy to present our large-scale study at #NeurIPS2024!
Come drop by to talk about all that and more!
1. Write a sentence.
2. Copy it to an LLM for edits, add a prompt explaining in simple words what I'm trying to say.
3. Realise my simple word explanation is actually what I need.
4. Copy it over to the paper, move on to the next sentence.
Turns out you can, and here is how: arxiv.org/abs/2411.15099
Really excited to share this work on multimodal pretraining for my first bluesky entry!
🧵 A short and hopefully informative thread:
go.bsky.app/TENRRBb
go.bsky.app/NFbVzrA
I have been informed about some pretty unfortunate oversights on my part and ultimately platformed some creators who should not have been platformed.