Alex Dimakis
@alexdimakis.bsky.social
UC Berkeley Professor working on AI. Co-Director, National AI Institute on the Foundations of Machine Learning (IFML). Cofounder of http://BespokeLabs.ai
Please let us know what you find out.
February 16, 2025 at 9:01 AM
Yes, we have been thinking of doing this. DCLM-7B is a fully open model (full pre-training data and code are open), and we could post-train it with OpenThoughts.
February 15, 2025 at 4:15 AM
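A minimal sketch of what that could look like, assuming the Hugging Face id apple/DCLM-7B for the base model and the ShareGPT-style "conversations" column described on the OpenThoughts-114k dataset card; this is not the Open Thoughts training recipe, just one way to wire it up with TRL's SFTTrainer.

```python
# Hedged sketch: supervised fine-tuning of the fully open DCLM 7B base model on
# the OpenThoughts-114k reasoning traces. The model id, column names, and
# hyperparameters are assumptions -- check the model and dataset cards.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

def to_text(example):
    # Flatten the multi-turn ShareGPT-style conversation into one training string.
    # DCLM 7B is a base model, so plain role tags are used instead of a chat template.
    turns = [f"{turn['from']}: {turn['value']}" for turn in example["conversations"]]
    return {"text": "\n\n".join(turns)}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

trainer = SFTTrainer(
    model="apple/DCLM-7B",       # assumed checkpoint id for the open DCLM 7B
    train_dataset=dataset,       # SFTTrainer trains on the "text" column by default
    args=SFTConfig(
        output_dir="dclm-7b-openthoughts",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
        model_init_kwargs={"trust_remote_code": True},  # DCLM checkpoints may ship custom code
    ),
)
trainer.train()
```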
Qwen32B. The post-training data is open and available as OpenThoughts-114k: huggingface.co/datasets/ope...
open-thoughts/OpenThoughts-114k · Datasets at Hugging Face
February 12, 2025 at 10:31 PM
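A quick, hedged way to pull the dataset linked above and inspect one reasoning trace; the "conversations" column with {"from", "value"} turns follows the dataset card at the time of writing, so verify the schema before relying on it.

```python
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)                                # row count and column names

example = ds[0]
for turn in example["conversations"]:    # ShareGPT-style {"from", "value"} turns
    print(turn["from"].upper())
    print(turn["value"][:300], "...\n")  # first 300 characters of each turn
```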
Congratulations, Constantine, on this big effort to support Greek students.
January 29, 2025 at 7:11 PM
For the 17k traces used in Berkeley Sky-T1, I joked that you could take a 1,000-student Berkeley course and give 17 homework problems to each student. On a more serious note, I think universities are a great way to collect such data.
January 29, 2025 at 9:00 AM
Our repo: github.com/open-thought...
Open code, open reasoning data (114k and growing), open-weight models.
Please let us know if you want to participate in the Open Thoughts community effort. (2/n)
www.openthoughts.ai
January 28, 2025 at 6:23 PM
Thanks for featuring us, Nathan!
January 28, 2025 at 3:12 AM
Answer. Another way it could be done: get the data by teaching a 1,000-student class and assigning 17 homework problems. Side benefit: make $10M by charging $10K tuition.
January 14, 2025 at 6:09 PM
Creating small specialized models is currently hard. Evaluation, post-training data curation, and fine-tuning are tricky, and better tools are needed. Still, it's good to go back to the Unix philosophy to inform our future architectures. (n/n)
January 8, 2025 at 11:28 PM
This is related to "Textbooks Are All You Need," but for narrow jobs like summarization, legal QA, and so on, as opposed to general-purpose small models. Research shows how to post-train using big models to create small models that are faster and outperform their big teachers on narrow tasks. (6/n)
January 8, 2025 at 11:28 PM
I believe that the best way to engineer AI systems will be to use post-training to specialize small Llama models for narrow, focused jobs. 'Programming' a specialized model can be done by building a post-training dataset from internal data, by prompting foundation models and distilling. (5/n)
January 8, 2025 at 11:28 PM
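A hedged sketch of that recipe: prompt a large teacher model over internal documents to build a narrow post-training set (here, summarization), then fine-tune a small model on the result. The teacher model name, prompt, and file names are placeholders, not a prescribed pipeline.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def teacher_summary(document: str) -> str:
    # Ask a big general-purpose model to do the narrow job we want to distill.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder teacher model
        messages=[
            {"role": "system", "content": "Summarize internal documents in three sentences."},
            {"role": "user", "content": document},
        ],
    )
    return response.choices[0].message.content

# internal_docs.jsonl is a hypothetical file of {"text": ...} records.
with open("internal_docs.jsonl") as docs, open("distill_sft.jsonl", "w") as out:
    for line in docs:
        doc = json.loads(line)["text"]
        pair = {"prompt": f"Summarize:\n{doc}", "completion": teacher_summary(doc)}
        out.write(json.dumps(pair) + "\n")

# distill_sft.jsonl can then be used to fine-tune a small Llama model,
# e.g. with an SFT setup like the one sketched earlier in this feed.
```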
Instead, I would like to make the case for Small Specialized Models following Unix philosophy:
1. Write programs that do one thing and do it well
2. Write programs to work together
3. Write programs to handle text streams, because that is a universal interface.
Replace programs with AI models. (4/n)
January 8, 2025 at 11:28 PM
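A toy illustration of that idea, with each specialized model wrapped as a text-in, text-out function and the system built by composing them; the two model ids are stand-ins for small models you would specialize yourself.

```python
from transformers import pipeline

# "Do one thing and do it well": one small model per narrow job.
summarize = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
classify = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def summarizer(text: str) -> str:
    return summarize(text, max_length=60, min_length=20)[0]["summary_text"]

def intent_detector(text: str) -> str:
    labels = ["complaint", "question", "feedback"]
    return classify(text, candidate_labels=labels)["labels"][0]

# "Work together" over "text streams": pipe one model's output into the next.
ticket = (
    "Customer says the app crashes every time they upload a receipt larger than "
    "5 MB, and they have already reinstalled twice. They want a refund if this "
    "is not fixed by Friday."
)
summary = summarizer(ticket)
print(summary)
print("intent:", intent_detector(summary))
```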
Monolithic AI systems are also extremely wasteful in terms of energy and cost: using GPT-4o as a summarizer, fact checker, or user-intent detector reminds me of the first days of the big data wave, when people were spinning up Hadoop clusters to process 1 GB of data. (3/n)
January 8, 2025 at 11:28 PM
This is not working very well. This monolithic view of AI is in contrast to how we teach engineers to build systems. To build complex systems, engineers create modular components. This makes systems reliable and helps teams coordinate, with specs that are easy to explain, engineer, and evaluate. (2/n)
January 8, 2025 at 11:28 PM
hello world
November 19, 2024 at 10:43 PM