
Groq unveils lightning-fast LLM engine; developer base rockets past 280K in 4 months

Groq now lets you run lightning-fast queries and other tasks with leading large language models (LLMs) directly on its website. The company quietly introduced the capability last week, and the results are much faster, and smarter, than what it has demoed before. You can type your queries or speak them with voice commands. In my tests, Groq replied at around 1,256.54 tokens per second, a speed that feels almost instantaneous and that, according to Groq, GPU chips from companies like Nvidia cannot match. That is up from an already impressive 800 tokens per second the company showed off in April.

By default, Groq's site engine uses Meta's open source Llama3-8b-8192 LLM. It also lets you choose the larger Llama3-70b, as well as some Gemma (Google) and Mistral models, with support for others coming soon. The experience is significant because it shows developers and non-developers alike just how fast and flexible an LLM chatbot can be. Groq CEO Jonathan Ross says usage of LLMs will increase further once people see how easy it is to run them on Groq's fast engine. The demo also offers glimpses of what other tasks become easy at this speed, such as generating job postings or articles and changing them on the fly.
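For developers who want to try the same models programmatically rather than on the website, the sketch below shows one way to call Groq's OpenAI-compatible chat completions API and roughly estimate throughput client-side. It is a minimal illustration, not Groq's own demo code: the endpoint and the Llama3-8b-8192 model name come from Groq's public API, while the `ask_groq` helper and the wall-clock timing are assumptions added here for illustration.

```python
# Minimal sketch: query Groq's OpenAI-compatible chat endpoint and estimate
# tokens/second on the client. Assumes the `requests` package is installed
# and a GROQ_API_KEY environment variable is set.
import os
import time
import requests

API_URL = "https://api.groq.com/openai/v1/chat/completions"

def ask_groq(prompt: str, model: str = "llama3-8b-8192") -> None:
    headers = {
        "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
        "Content-Type": "application/json",
    }
    payload = {
        # Swap in "llama3-70b-8192" or the Gemma/Mistral models the
        # article mentions; model IDs here are illustrative.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

    start = time.monotonic()
    resp = requests.post(API_URL, headers=headers, json=payload, timeout=30)
    resp.raise_for_status()
    elapsed = time.monotonic() - start

    data = resp.json()
    completion_tokens = data["usage"]["completion_tokens"]
    print(data["choices"][0]["message"]["content"])
    # Rough client-side throughput; this includes network latency, so it
    # will understate the ~1,256 tokens/s the article reports for Groq's
    # own engine.
    print(f"~{completion_tokens / elapsed:.0f} tokens/s (wall clock)")

if __name__ == "__main__":
    ask_groq("Draft a short job posting for a backend engineer.")
```

For a sense of scale, at the reported ~1,256 tokens per second a 500-token reply would stream in under half a second, which is why the responses feel instantaneous.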

Full case study: Groq unveils lightning-fast LLM engine; developer base rockets past 280K in 4 months.