I want to accelerate Mistral 3 Small (make a 2-3B with the same vocab). There doesn't seem to be a good existing model.
My notes here: simonwillison.net/2024/Dec/6/r...