Foundational Models: Building Blocks for Generative AI Applications

By Stefan Geirhofer and Scott McKinney

Generative artificial intelligence (GenAI) refers to a category of machine learning models that generate new content in response to a user’s input prompt. The generated content can be of different types, including text, computer source code, images, audio, and video.

Developing GenAI models from scratch requires substantial machine-learning expertise, vast training data, and significant computational resources to train the model on that data. While startups and smaller organizations might lack the resources to make such a sizeable upfront investment, widely and often freely available “foundational” models have emerged as a class of extensible, general-purpose GenAI models that can be adapted to various applications. Foundational models can be extended either by fine-tuning the model’s parameters through targeted, smaller-scale training on datasets selected for specific applications (without training from scratch) or by providing contextual information as part of the input prompt, such as valid question/answer pairs that the model can emulate in its response. Either way, the versatility of foundational models reduces barriers to entry for GenAI application developers with limited resources, who can use an existing model as a starting point rather than building one from scratch.
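As a minimal sketch of the second approach (in-context adaptation through the prompt), the example below assembles a prompt containing sample question/answer pairs followed by a new question. The questions, answers, and prompt format are purely illustrative assumptions and are not tied to any particular model or vendor API.

```python
# A minimal sketch of in-context adaptation: the model's parameters are not
# changed; instead, example question/answer pairs are placed directly in the
# prompt so the model can emulate the demonstrated pattern. The examples and
# the prompt format here are hypothetical.

examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

new_question = "What is the capital of Canada?"

prompt = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
prompt += f"\nQ: {new_question}\nA:"

print(prompt)
# The assembled prompt would then be submitted to a foundational model, which
# tends to continue the pattern and answer the new question in the same style.
```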

This article focuses on large language models (LLMs) for text-based applications as an example of large foundational GenAI models that can be used as building blocks for a variety of use cases ranging from machine translation to chatbots. Similar foundational models have been developed for other content types, and many of the observations in this article extend to such models as well.

The recent and rapid advance of GenAI models has several origins. The proliferation of cloud computing technology over the past decade and the development of computer chips optimized for machine-learning computations have played a key role. The availability of foundational models, and the relative ease with which they can be customized for specific use cases, has also contributed. In fact, foundational models have opened up the machine learning ecosystem to new developers who previously lacked the resources to develop full-blown GenAI models from scratch. In large part, this extensibility has led to the flurry of different GenAI applications launched in recent months.

High-Level Structure of Large Language Models

Foundational models are GenAI models that have been trained on large swaths of general-purpose data and are intended to be later adapted for specific applications. The structure of foundational models depends on whether their purpose is to generate text, images, or other content. For text-based applications involving natural language processing, LLMs with a “transformer” structure have replaced the previous generation of recurrent neural networks and have become an important element of popular foundational models, such as “GPT” (Generative Pre-trained Transformer) and “BERT” (Bidirectional Encoder Representations from Transformers).

Fundamentally, the purpose of an LLM is to predict the next word in a sequence of words based on the model’s parameters, which embody statistical information derived from vast amounts of training data, such as the entirety of Wikipedia, thousands of books, and myriad articles scraped from the web.
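As a toy illustration of this next-word prediction, the sketch below uses a hard-coded probability table standing in for the statistical information a real LLM would derive from its billions of learned parameters; the candidate words and probabilities are assumptions made up for the example.

```python
# A toy sketch of next-word prediction. A real LLM computes a probability
# distribution over its entire vocabulary from learned parameters; here the
# candidate words and probabilities are hard-coded purely for illustration.

next_word_probs = {
    "mat": 0.55,
    "sofa": 0.25,
    "roof": 0.15,
    "moon": 0.05,
}

# Greedy decoding: always pick the single most likely continuation.
prompt = "The cat sat on the"
best_word = max(next_word_probs, key=next_word_probs.get)
print(f"{prompt} {best_word}")  # -> "The cat sat on the mat"
```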

Many LLMs do not always pick the most likely “next word” but inject some randomness into the generated output text by picking a lower-ranked alternative with a certain probability. This randomness is why repeatedly issuing the same prompt to an LLM will generally lead to different outputs. For reasons that are not fully understood (even by the creators of LLM technology), the injected randomness tends to make the output appear more “creative” and interesting.
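One common way to inject such randomness is to sample from the model’s probability distribution rather than always taking the top-ranked word, often with a “temperature” setting that controls how adventurous the choice is. The sketch below, which reuses the toy distribution from the previous example, illustrates one typical mechanism under that assumption rather than describing any specific model.

```python
import math
import random

# A minimal sketch of sampled decoding with a "temperature" knob, reusing the
# toy distribution from the previous example. Lower temperatures concentrate
# probability on the top-ranked word; higher temperatures give lower-ranked
# words a better chance, so the same prompt can yield different outputs.

next_word_probs = {"mat": 0.55, "sofa": 0.25, "roof": 0.15, "moon": 0.05}

def sample_next_word(probs, temperature=1.0):
    # Rescale log-probabilities by the temperature, renormalize, and sample.
    scaled = [math.log(p) / temperature for p in probs.values()]
    total = sum(math.exp(s) for s in scaled)
    weights = [math.exp(s) / total for s in scaled]
    return random.choices(list(probs.keys()), weights=weights, k=1)[0]

# Repeated calls with the same prompt generally produce different continuations.
for _ in range(3):
    print("The cat sat on the", sample_next_word(next_word_probs, temperature=0.8))
```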