For better or for worse, Large Language Models (LLMs), the natural language processing engines behind commercial AI Platform-as-a-Service (PaaS) subscription offerings, are among the first “big data” applied technologies to become a crossover hit in the AI marketplace.

From a big data perspective, LLMs are gigantic data models. In AI terms, an LLM is a huge neural network whose size is measured by the number of parameters it contains. Parameters are the numeric values (weights and biases) that are repeatedly adjusted while the model is trained, and that the trained model then uses to generate its predictions. In general, the more parameters a model has, the more of the structure in its training data it can capture, which tends to improve the accuracy of its predictions.
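To make the idea of "parameter count" concrete, here is a minimal sketch that counts the trainable values in a plain fully connected network. The function name `count_dense_parameters` and the layer sizes are hypothetical, chosen only for illustration. Real LLMs use transformer architectures whose counts are computed differently, but the principle is the same: wider and deeper layers mean more values to adjust during training.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first.
    Each pair of adjacent layers contributes n_in * n_out weights
    plus n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total


# Two illustrative (made-up) networks: widening and deepening the
# hidden layers multiplies the number of trainable parameters.
small = count_dense_parameters([512, 1024, 512])
large = count_dense_parameters([512, 4096, 4096, 512])

print(f"small network: {small:,} parameters")
print(f"large network: {large:,} parameters")
```

Even in this toy case, growing the hidden layers from one layer of 1,024 units to two layers of 4,096 units raises the count from about one million to about 21 million parameters.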

In April 2020, the bleeding edge of innovation in this space was Blender, a chatbot that Facebook released as open source. Blender had 9.4 billion parameters and used an innovative training structure: it was trained on 1.5 billion publicly available Reddit conversations, supplemented by additional conversational datasets covering conversations that contained some kind of emotion, information-dense conversations, and conversations between people with distinct personas. Blender’s 9.4 billion parameters dwarfed Google’s Meena (released in January 2020) by almost 4X. (1)

OpenAI, a San Francisco-based research and deployment company, released GPT-3 in June 2020, and the results were instantly compelling: natural language processing (NLP) with a seeming mastery of language, able to generate sensible sentences and converse with humans via chatbots. By 2021, MIT Technology Review was proclaiming OpenAI’s GPT-3 a top 10 breakthrough technology, “a big step toward AI that can understand and interact with the human world.”

