Patent Landscape Report - Generative Artificial Intelligence (GenAI)

Introduction 

Generative AI – new systems with a long history

Artificial intelligence technologies have seen a dramatic increase in public and media attention in recent years. However, AI is not a new field of research. US and UK scientists – including theoretical mathematician Alan Turing – were already working on machine learning in the 1930s and 1940s, although the term AI did not become popular until the 1950s (McCarthy et al. 2006).(1) The term “Artificial Intelligence” is closely associated with John McCarthy at Dartmouth, who co-organized the Dartmouth Summer Research Project on Artificial Intelligence in 1956 with Marvin Minsky. The 1950s and 1960s saw a surge of interest in many AI areas including natural language processing, machine learning and robotics. Some scientists at the time predicted that a machine as intelligent as a human would exist within a generation (Minsky 1967). These predictions proved to be overly optimistic. Progress stagnated because of the limitations of computing power and algorithmic approaches available at the time. As a result, research funding dried up, which led to the first “AI winter” in the 1970s. In the following decades, periods of high AI research intensity alternated with periods of lower activity.

For a long time, AI algorithms and software were developed for specific purposes, based on clear rules of logic and parameters specified by programmers. Even now, many AI applications rely on rule-based decisions: if this, then that. For example, virtual assistants (Siri, Alexa, etc.) are essentially command-and-control systems. They only understand a limited list of questions and requests and fail to adapt to new situations. They cannot apply their “knowledge” to new problems or deal with uncertainty.
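
To make the “if this, then that” pattern concrete, the following minimal Python sketch shows a rule-based command handler. It is purely illustrative – the commands and responses are invented – but it demonstrates why such systems fail on anything outside their predefined list.

# Toy illustration (not any vendor's actual code): a rule-based assistant only
# handles the requests its programmers anticipated.
RULES = {
    "what time is it": "It is 10:00.",
    "turn on the lights": "Turning on the lights.",
}

def rule_based_assistant(request: str) -> str:
    # If the request matches a known rule, return the scripted response;
    # otherwise the system cannot adapt or generalize.
    return RULES.get(request.lower().strip(), "Sorry, I don't understand that request.")

print(rule_based_assistant("Turn on the lights"))    # scripted answer
print(rule_based_assistant("Dim the lights a bit"))  # falls outside the rules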

AI in the 21st century

The modern AI boom started at the beginning of the 21st century and has been on an upward trajectory ever since. Today, AI and machine learning are used in countless applications, including search engines, recommendation systems, targeted advertising, virtual assistants, autonomous vehicles, automatic language translation, facial recognition and many more. The rise of AI has been driven mainly by the following factors: 

  • More powerful computers: In 1965, Gordon Moore observed that the number of transistors on computer chips doubles approximately every two years and predicted that this would continue for another 10 years (Moore 1965). His law has held true for more than half a century. This exponential growth translated into more and more powerful AI systems, often with AI-specific enhancements.

  • Big data: The availability of data has increased at a similarly exponential rate. This has provided a powerful source of training data for AI algorithms and has made it possible to train models with billions of images or a hundred billion tokens of text.(2) Tokens are common sequences of characters found in a set of text. Tokenization breaks text into smaller parts for easier machine analysis, helping AI models understand human language (a toy tokenization sketch follows this list).

  • Better AI/machine learning algorithms: New methods such as deep learning, which allow AI systems to learn from data in ways loosely inspired by human learning, have enabled breakthroughs in areas such as image recognition and natural language processing (WIPO 2019).
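
To illustrate the tokenization mentioned in the footnote above, the toy Python sketch below greedily splits text into pieces from a small, invented vocabulary. Real tokenizers, such as byte-pair encoders, use vocabularies learned from large text corpora, but the principle of breaking text into smaller parts is the same.

# Toy tokenizer: the vocabulary below is invented for demonstration only.
VOCAB = ["gener", "ative", " model", "s", " learn", " pattern", " token"]

def toy_tokenize(text: str) -> list[str]:
    """Greedily split text into the longest known pieces from VOCAB."""
    tokens, i = [], 0
    while i < len(text):
        piece = next((p for p in sorted(VOCAB, key=len, reverse=True)
                      if text.startswith(p, i)), text[i])
        tokens.append(piece)
        i += len(piece)
    return tokens

print(toy_tokenize("generative models learn"))
# ['gener', 'ative', ' model', 's', ' learn']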

Learning with examples rather than rules

The heart of modern AI is machine learning, in which computer systems learn without being explicitly programmed to do so. Modern AI models are fed with examples of input data and the desired outcomes, allowing them to build models or programs that can be applied to entirely new data. Machine learning excels at handling massive datasets and uncovering hidden patterns within them.
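
As a minimal sketch of this example-driven approach, the Python snippet below uses the open-source scikit-learn library (assumed to be installed). The tiny dataset is invented: each example is a pair [hours of daylight, temperature in °C] labelled with a season, and the learned model is then applied to new, unseen examples.

# Learning from examples rather than rules: a classifier is fitted to labelled
# examples and then applied to entirely new data.
from sklearn.tree import DecisionTreeClassifier

examples = [[16, 28], [15, 25], [8, -2], [9, 1]]        # input data
labels   = ["summer", "summer", "winter", "winter"]     # desired outcomes

model = DecisionTreeClassifier().fit(examples, labels)  # learn from the examples

print(model.predict([[14, 22], [10, 3]]))               # e.g. ['summer' 'winter']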

A powerful approach within machine learning is called deep learning. It leverages complex structures called artificial neural networks, loosely modeled after the human brain. These networks identify patterns within datasets. The more data they have access to, the better they learn and perform. Information flows through numerous layers of interconnected neurons, where it is processed and evaluated. Each layer refines the information, connecting and weighting it through nodes. Essentially, AI learns by continuously reassessing its knowledge, forming new connections and prioritizing information based on new data it encounters. The term deep learning refers to the vast number of layers these networks can utilize. Deep learning-powered AI has achieved remarkable advancements, especially in areas like image and speech recognition. However, its success comes with a drawback. While the accuracy of the results is impressive, the decision-making process remains unclear, even to AI experts. This lack of transparency stands in contrast to older rule-based systems.
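
The following NumPy sketch shows, in highly simplified form, how information flows through successive weighted layers of such a network. The layer sizes and random weights are illustrative only; real deep networks have far more layers and learn their weights from training data rather than drawing them at random.

# Minimal forward pass through a small multi-layer network.
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 8, 2]          # input layer, two hidden layers, output layer

# Each layer connects and weights the incoming information through its nodes.
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x: np.ndarray) -> np.ndarray:
    for w in weights:
        x = np.maximum(0, x @ w)    # weighted sum per node, then a simple activation
    return x

print(forward(rng.normal(size=4)))  # the network's output for one input example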

Modern generative AI (GenAI): the next level of AI

Generative AI (GenAI) has been an active area of research for a long time. Joseph Weizenbaum developed the very first chatbot, ELIZA, in the 1960s (Weizenbaum 1966). However, GenAI as we know it today was heralded by the advent of deep learning based on neural networks.

Today, GenAI is one of the most powerful examples of machine learning. Compared to older rule-based AI applications that could only perform a single task, modern GenAI models are trained on data from many different areas, without any limitations in terms of task. Because the amount of training data is so large – OpenAI’s GPT-3 was trained on more than 45 terabytes of compressed text data (Brown et al. 2020) – the models appear to be creative in producing outputs. For example, traditional chatbots follow scripted responses and rely on pre-defined rules to interact with users, making them suitable only for specific tasks. In contrast, modern GenAI chatbots such as ChatGPT or Google Gemini can generate human-like text, allowing for conversations that can adapt to many topics without being confined to a predetermined script. In addition, these modern chatbots can produce not only text, but also images, music and computer code, based on the dataset on which they were trained.
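
As a minimal sketch of this difference, the snippet below generates an open-ended text continuation with a small decoder-based language model (GPT-2, accessed through the open-source Hugging Face transformers library, assumed to be installed). This is not the model behind ChatGPT or Gemini; it merely illustrates that the response is generated rather than looked up in a script.

# Open-ended text generation with a small decoder-based language model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Generative AI differs from rule-based chatbots because"
result = generator(prompt, max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])  # the continuation varies from run to run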

The release of ChatGPT in 2022 was an iPhone moment for GenAI

In November 2022, OpenAI released ChatGPT (Chat Generative Pre-trained Transformer) to the public, which greatly increased public enthusiasm for GenAI. More than one million people signed up to use ChatGPT in just five days. A 2023 survey by auditing and consulting firm Deloitte found that nearly 61% of respondents in Switzerland who work with a computer already use ChatGPT or other GenAI programs in their daily work (Deloitte 2023). The ChatGPT release has been described by many, including Nvidia CEO Jen-Hsun Huang, as an “iPhone moment” for GenAI (VentureBeat 2023). This is partly because the platform made it easier for users to access advanced GenAI models, specifically decoder-based large language models.(3) See the next chapter for an overview and description of the different modern GenAI models. These models have demonstrated the potential for many real-world applications and have sparked a wave of research and development. Many companies are investing heavily in GenAI as these newer models reach a new level of capability.

Motivation of this report

This WIPO Patent Landscape Report provides observations on patenting activity and scientific publications in the field of GenAI. The analysis builds on the 2019 WIPO Technology Trends publication on Artificial Intelligence (WIPO 2019).

GenAI is expected to play an increasingly important role in various real-world applications and industries. It is therefore important to understand the technological trends in the field of GenAI in order to adapt business and intellectual property (IP) strategies. The aim of this report is to shed light on current technology developments, their changing dynamics and the applications in which GenAI technologies are expected to be used. It also identifies key research locations, companies and organizations.

As GenAI can be used for many different applications, we take a multi-angle approach to gain an in-depth understanding. In particular, the analysis is based on three perspectives, illustrated in Figure 1.

  • The first perspective covers the GenAI models. Patent filings related to GenAI are analyzed and assigned to different types of GenAI models (autoregressive models, diffusion models, generative adversarial networks (GANs), large language models (LLMs), variational autoencoders (VAEs) and other GenAI models).

  • The second perspective shows the different modes of GenAI. The term “mode” describes the type of input used and the type of output produced by these GenAI models. Based on keywords in the patent titles and abstracts, all patents are assigned to the corresponding modes: image/video, text, speech/voice/music, 3D image models, molecules/genes/proteins, software/code and other modes (a simplified keyword-matching sketch follows this list).

  • The third perspective analyzes the different applications for modern GenAI technologies. The real-world applications are numerous, ranging from agriculture to life sciences to transportation and many more.
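
As a simplified illustration of the keyword-based assignment described in the second perspective, the Python sketch below matches invented keyword lists against a patent title and abstract. The report's actual classification scheme is considerably more extensive; the keywords and patent record here are examples only.

# Illustrative keyword matching for assigning a patent to one or more modes.
MODE_KEYWORDS = {
    "image/video": ["image", "video", "picture"],
    "text": ["text", "document", "language"],
    "speech/voice/music": ["speech", "voice", "audio", "music"],
    "software/code": ["source code", "program code", "software"],
}

def assign_modes(title: str, abstract: str) -> list[str]:
    """Return every mode whose keywords appear in the title or abstract."""
    blob = f"{title} {abstract}".lower()
    matches = [mode for mode, words in MODE_KEYWORDS.items()
               if any(w in blob for w in words)]
    return matches or ["other modes"]

print(assign_modes("Generative model for speech synthesis",
                   "A neural vocoder producing natural-sounding voice output."))
# ['speech/voice/music']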