
Practical Generative AI with ChatGPT

In Chapter 1, while covering the latest trends and innovations, we introduced multimodality: the ability to process and generate different types of data, such as text, images, audio, and video. It is the defining feature of large multimodal models, a subset of large foundation models.
Definition
Large language models (LLMs) and large multimodal models (LMMs) both belong to the field of generative AI and are typically built on the Transformer architecture.
LLMs are trained on extensive textual data, enabling them to understand and generate human-like text. They are utilized in applications such as content creation, language translation, and customer service agents.
On the other hand, LMMs expand upon LLMs by processing and integrating multiple data types, including text, images, audio, and video. This allows them to generate images from textual descriptions, analyze videos with textual context, and create content that combines various data forms...