This is the interview after we just launched 19,000 LPUs in Saudi Arabia. We built the largest inference cluster in the region.
Link to the interview in the comments below!
Build fast.
Yes. It's called Jevons Paradox and it's a big part of our business thesis.
In the 1860s, an Englishman wrote a treatise on coal where he noted that every time steam engines got more efficient people bought more coal.
🧵(1/5)
OpenAI, Anthropic, and Azure are the top 3 LLM API providers on LangChain
Groq is #4, and close behind Azure
Google, Amazon, Mistral, and Hugging Face are the next 4.
Ollama is for local development.
Now add three more 747s' worth of LPUs 😁
Groq's second B747 this week. How many LPUs and GroqRacks can we load into a jumbo jet? Take a look.
Have you been naughty or nice?
techcrunch.com/2024/12/06/m...
There was this guy who got in a lot of trouble once, his name was Galileo.
One side says it's 25 million, because we're going to get to 25 million tokens per second by the end of the year
On the other side, it says, “Make it real. Make it now. Make it wow.”
3 months back: Llama 8B running at 750 Tokens/sec
Now: Llama 70B model running at 3,200 Tokens/sec
We're still going to get a liiiiiiitle bit faster, but this is our V1 14nm LPU - how fast will V2 be? 😉