This is an informal case summary prepared for the purposes of facilitating exchange during the 2024 WIPO IP Judges Forum.
Session 1: Frontier Technologies and Intellectual Property Adjudication
Hamburg Regional Court, Germany [2024]: Robert Kneschke v. LAION e.V., Case No. 310 O 227/23
Date of judgment: September 27, 2024
Issuing authority: Hamburg Regional Court
Level of the issuing authority: First instance
Type of procedure: Judicial (Civil)
Subject matter: Copyright and Related Rights
Plaintiff: Robert Kneschke
Defendant: LAION e.V.
Keywords: Copyright, Artificial Intelligence, Training Data, Text and Data Mining Exception
Basic facts: The plaintiff is a professional photographer. The defendant is an association that made a data set with almost 6 billion image-text pairs available to the public free of charge. The data set consists of a spreadsheet with hyperlinks to images or image files that are publicly available on the Internet, as well as information about each image, including a textual description (also called alternative text). The data set could be used to train generative artificial intelligence.
To create the data set, the defendant used existing data from a third party that contained the respective image URLs and textual descriptions for a random cross-section of images available on the Internet. The defendant downloaded the images linked in the existing data set, used software to check whether the textual description matched the corresponding image, and filtered out those where the text and image did not sufficiently match. The defendant then extracted the metadata associated with the remaining images, in particular the URL of the image storage location and the image description, to create the new data set.
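The steps described above (download, compare image with description, filter, keep only metadata) can be sketched roughly as follows. This is an illustrative outline, not the defendant's actual software: the names `SourceRecord`, `DatasetRow`, `build_dataset`, `fetch_image`, `similarity` and `threshold` are all hypothetical, and the summary does not say which matching model or cut-off value was used.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class SourceRecord:
    """One entry of the pre-existing third-party data (hypothetical shape)."""
    url: str
    alt_text: str


@dataclass(frozen=True)
class DatasetRow:
    """One row of the new data set: metadata only, no image content."""
    url: str
    alt_text: str
    similarity: float


def build_dataset(
    records: Iterable[SourceRecord],
    fetch_image: Callable[[str], Optional[bytes]],
    similarity: Callable[[bytes, str], float],
    threshold: float,
) -> List[DatasetRow]:
    """Download each image, score it against its description, keep good matches.

    `fetch_image` and `similarity` are injected placeholders (assumptions):
    the first stands for the download step, the second for the software
    that checks whether text and image match.
    """
    rows: List[DatasetRow] = []
    for record in records:
        image = fetch_image(record.url)  # the download at issue in the case
        if image is None:
            continue  # image no longer reachable; skip it
        score = similarity(image, record.alt_text)
        if score >= threshold:
            # Only the URL and description go into the new data set.
            rows.append(DatasetRow(record.url, record.alt_text, score))
    return rows


# Toy stand-ins so the sketch runs without network access or a real model.
_FAKE_WEB = {
    "https://example.org/a.jpg": b"red bicycle street",
    "https://example.org/b.jpg": b"mountain lake",
}


def toy_fetch(url: str) -> Optional[bytes]:
    return _FAKE_WEB.get(url)


def toy_similarity(image: bytes, text: str) -> float:
    """Share of description words that also appear in the fake 'image'."""
    image_words = set(image.decode().split())
    text_words = set(text.lower().split())
    if not text_words:
        return 0.0
    return len(image_words & text_words) / len(text_words)


if __name__ == "__main__":
    source = [
        SourceRecord("https://example.org/a.jpg", "red bicycle"),
        SourceRecord("https://example.org/b.jpg", "city skyline"),
        SourceRecord("https://example.org/missing.jpg", "anything"),
    ]
    for row in build_dataset(source, toy_fetch, toy_similarity, threshold=0.5):
        print(row.url, row.alt_text, row.similarity)
```

In this sketch the download happens only so that the comparison can run; the resulting rows contain the image location and description but not the image itself, which mirrors the structure of the data set as described in the basic facts.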
As part of this process, an image copyrighted by the plaintiff and made available online via the website of a photo agency was recorded, downloaded, analyzed and included in the new data set with its metadata. The photo agency had issued a usage reservation in English in its terms of use, according to which visitors to the site were prohibited from "downloading" or "scraping" content from the site using automated programs.
The plaintiff demanded that the defendant refrain from reproducing the plaintiff's image for the creation of AI training data sets in the future.
Held: The Regional Court dismissed the action. The only issue before the Chamber was whether the defendant was permitted to download the disputed image in order to compare the image content with the pre-existing image description and create a new data set. The Chamber found that downloading the image in this context was covered by the copyright exception for text and data mining for the purposes of scientific research conducted by non-commercial research organizations (Section 60d of the German Copyright Act). The plaintiff failed to carry his burden of proving that the exception did not apply.
Text and data mining is defined in the law as the “automated analysis of single or multiple digital or digitized works in order to extract information from them, particularly about patterns, trends and correlations”. The Chamber found that the comparison of the image content with the pre-existing image description carried out by the defendant falls within this definition.
Although the Chamber did not need to determine whether the general exception for text and data mining (Section 44b of the Copyright Act) was also available to the defendant, it offered obiter dicta on its potential application. The general exception for text and data mining, unlike the more specific exception for text and data mining for the purposes of scientific research, permits the rights holder to reserve the use of its work for text and data mining through an express declaration. For works accessible online, the reservation of use is only effective if made in “machine-readable” form. The photo agency website from which the plaintiff’s photo was downloaded contained a reservation of use in “natural language”. The Chamber opined that the meaning of “machine-readable” should be assessed in light of the technology available at the time that the copyrighted work was reproduced. It further suggested that, at least at the time of the court’s decision, reservations of use in natural language should be regarded as “machine-readable”, but left open how this question would have been decided at the time of the defendant’s act of reproduction in 2021.
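For readers less familiar with the technical side of this question, the following minimal sketch shows what a conventionally machine-readable signal looks like to a crawler. It uses a robots.txt file purely as a familiar example; this is an assumption for illustration, and the summary does not indicate which formats the Chamber would regard as sufficient under Section 44b. The bot name `example-tdm-bot`, the host `photos.example` and the helper `crawler_may_fetch` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is barred from /images/,
# every other crawler is allowed everywhere.
ROBOTS_TXT = """\
User-agent: example-tdm-bot
Disallow: /images/

User-agent: *
Allow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://photos.example/images/1.jpg"
    print(crawler_may_fetch(ROBOTS_TXT, "example-tdm-bot", url))
    print(crawler_may_fetch(ROBOTS_TXT, "other-bot", url))
```

A rule of this kind can be evaluated by a few lines of standard-library code, whereas a prohibition written in the prose of a website's terms of use can only be acted upon by software capable of interpreting natural language. That difference is what makes the point in time at which “machine-readable” is assessed relevant.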
The Chamber did not consider the legality of any possible subsequent use of the plaintiff’s image to train generative artificial intelligence by virtue of its inclusion in the defendant’s new data set.
Relevant legislation: Sections 44a, 44b, 60d of the German Copyright Act; Arts. 3-4 of Directive (EU) 2019/790 (EU Digital Single Market Directive); Art. 5 of Directive (EC) 2001/29 (InfoSoc Directive); Art. 53(1)(c) of Regulation (EU) 2024/1689 (AI Act)