CURRENT EFFORTS CONCERNING THE IDENTIFICATION OF WORKS PROTECTED BY COPYRIGHT AND NEIGHBORING RIGHTS
1. International Standard Work Code (ISWC)
2. International Standard Recording Code (ISRC)
3. International Standard Music Number (ISMN)
4. International Standard Book and Serial Numbers (ISBN/ISSN)
5. Publisher Item Identifier (PII)
6. Serial and Book Item and Contribution Identifiers (SICI)
7. Compositeur, Auteur, Editeur Code (CAE/IPI)
8. Digital Object Identifier (DOI)
9. International Standard Audiovisual Number (ISAN)
10. Persistent Uniform Resource Identifiers (URN/PURLs)
CURRENT EFFORTS CONCERNING THE STANDARDIZATION OF METADATA
1. The Dublin Core
2. MARC
3. The INDECS Project
4. BIBLINK/NEDLIB
The International Standard Work Code (ISWC), a system from the International Confederation of Societies of Authors and Composers (CISAC), an umbrella organization representing a number of collective management organizations mainly in the music field, is being used for musical works and developed for literary works. The codes are "dumb" or "mute" numbers, in the sense that they do not in themselves contain any information. Unique for each object, the identification number is a key to a database where the relevant information is held. The version of this draft code used for music (ISWC-T) consists of the letter T followed by a sequentially allocated ten-digit numeric code, the last digit of which is a "check digit" that allows the computer to validate the other nine digits. Numbers for the literary system (ISWC-L) will be similar.
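The check-digit idea can be sketched as follows. This is a minimal illustration, not the confederation's implementation: it assumes the weighting later published for the ISWC in ISO 15707 (1 plus the sum of each digit multiplied by its position, modulo 10), which may differ from the draft described here. The function names are hypothetical.

```python
def iswc_check_digit(nine_digits: str) -> int:
    """Return the check digit for the nine-digit body of an ISWC.

    Assumption: the weighting is the one later published in ISO 15707.
    """
    if len(nine_digits) != 9 or not (nine_digits.isascii() and nine_digits.isdigit()):
        raise ValueError("expected exactly nine digits")
    # Each digit is weighted by its 1-based position; the constant 1 stands for the "T" prefix.
    total = 1 + sum(pos * int(d) for pos, d in enumerate(nine_digits, start=1))
    return (10 - total % 10) % 10


def is_valid_iswc(code: str) -> bool:
    """Validate a code such as 'T-034.524.680-1'; separators are optional."""
    compact = code.replace("-", "").replace(".", "").replace(" ", "").upper()
    if len(compact) != 11 or compact[0] != "T":
        return False
    body = compact[1:]
    if not (body.isascii() and body.isdigit()):
        return False
    # The last digit must match the value computed from the first nine.
    return iswc_check_digit(body[:9]) == int(body[9])
```

Because the number is "dumb," validation only detects keying errors; everything else about the work still has to be looked up in the database.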
According to the International Federation of the Phonographic Industry (IFPI), the material traveling on electronic networks does not consist of "works" in a pure copyright sense, but rather of "manifestations" of works (also referred to as "digital objects"). Such manifestations might include a recording of a specific performance of a musical work (which, in the United States, may become a new work), or an HTML or PDF version of a scientific article published on the Web, including graphs and illustrations from various sources. Current IFPI identifiers for manifestations include the ISO-recognized International Standard Recording Code (ISRC), which identifies a musical recording (e.g., a track on a CD). Although it was adopted by ISO more than 10 years ago, fewer than 50% of recordings on the market have an embedded code. It is likely that efforts concerning encryption and protection of music files over the Web will affect the standardization process.
Another identifier in the music field is the ISO-recognized International Standard Music Number, which is used for sheet music.
Books may be considered manifestations, although they are also finished commercial products. For over thirty years they have been identified using the International Standard Book Number (ISBN). The ISBN is composed of a one-digit "region" code, a publisher prefix, and then sequentially attributed numbers, followed by a check digit. Periodical publications are similarly identified at the title level by the International Standard Serial Number (ISSN), but that number applies to a periodical publication, not to the articles, graphs, charts, and images that it contains.
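To illustrate how the ISBN check digit works, here is a minimal sketch for the ten-digit ISBN in use when this paper was written. It assumes the standard modulus-11 scheme (weights 10 down to 2, with "X" standing for a check value of 10); the function names are hypothetical.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Return the check character for the first nine digits of a ten-digit ISBN."""
    if len(first_nine) != 9 or not (first_nine.isascii() and first_nine.isdigit()):
        raise ValueError("expected exactly nine digits")
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(w * int(d) for w, d in zip(range(10, 1, -1), first_nine))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """Validate an ISBN such as '0-306-40615-2'; hyphens and spaces are ignored."""
    compact = isbn.replace("-", "").replace(" ", "").upper()
    if len(compact) != 10:
        return False
    try:
        return isbn10_check_digit(compact[:9]) == compact[9]
    except ValueError:
        return False
```

The check digit says nothing about the book itself; it only lets a computer reject most mistyped numbers before a database lookup.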
In the book trade, probably due to the absence of a specific publisher identifier, some people identify publishing houses by their ISBN prefix.
Used in the publishing industry, the Publisher Item Identifier (PII) was developed in 1995 by an informal group of scientific and technical publishers: American Chemical Society, American Institute of Physics, American Physical Society, Elsevier Science and the Institute of Electrical and Electronics Engineers (IEEE). The Publisher Item Identifier is composed of seventeen alphanumeric characters that indicate publication type (whether it is a book or a journal), and other information depending on the type, such as the year of a serial publication. It contains no other intelligence, however, and is not linked to a central database.
The Serial Item and Contribution Identifier (SICI) is a recognized standard used by serial publishers, subscription agents, and libraries, but no one has found a way to use it in the digital environment because it does not identify individual articles. An expanded SICI and a new Book Item and Component Identifier (BICI) are now under development. They will be able to identify any part of a book or serial such as a chapter, an article, a foreword, an illustration, or a table.
The BICI is a flexible identification system with a fairly loose set of rules. The absence of firm rules here and in identifiers like the Digital Object Identifier (see 3.2.1.8) reflects the amorphous and changing nature of the data to be identified, and the way in which it is stored, made available, and used or reused.
The CAE code is used by collective management organizations in the music field to identify those who create music and, more recently, other forms of information. Created in 1992 by the International Confederation of Societies of Authors and Composers, the code has been superseded by the IP number, which identifies "Interested Parties" to a work, that is, a full range of rightholders. The format of the number itself did not change, and previously allocated CAE codes were converted into IP numbers. As with some other identifiers, the numbers convey no meaning. At present, use of and access to the IP database is restricted to confederation members. If it is made available, it could lead to a standard identifier for people, used by all copyright industries.
The Digital Object Identifier (DOI) is not an identifier per se, but it offers both a structure for an identifier and a persistent routing system to a database containing relevant information.
Launched by the Association of American Publishers in conjunction with the Corporation for National Research Initiatives at the 1997 Frankfurt Book Fair, the DOI was designed to "provide persistent and reliable identification of digital objects via a proven technology, the CNRI Handle System®, and an efficient administration system to link customers with publishers, facilitate electronic commerce, and enable automated copyright management systems." The CNRI Handle System is a distributed computer system that stores names of digital items and can quickly find the information necessary to locate and access the items. The DOI is thus mainly two things: an identification system, potentially applicable to any and all categories of works and manifestations (even though at present its beta users are mostly book and journal publishers), and a central directory or database which, when queried using a DOI number, will route the user to the appropriate source of information.
The DOI is very flexible, given that rights holders or other persons using it as an identifier can use any suffix, including other existing identifiers (the ISBN in the example given above). The DOI is functionally similar to a Uniform Resource Locator (URL) in that a user can click on it and go directly to the DOI Directory, which in turn seamlessly reroutes the user to the source of information corresponding to that DOI. Unlike a URL, the DOI can easily be rerouted. A rightholder who purchases rights to a work from another rights holder can update the Directory information to ensure that future clicks are routed to its system.
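The structure described above can be sketched in a few lines. This is an illustration only: it assumes the DOI syntax as it is used today (a prefix beginning with "10.", a slash, and a free-form suffix) and today's public resolver address, neither of which is spelled out in this paper. The prefix "10.9999" and the function names are hypothetical; the suffix reuses an ISBN to show an existing identifier embedded in a DOI.

```python
from typing import NamedTuple
from urllib.parse import quote


class Doi(NamedTuple):
    prefix: str  # assigned to the registrant, e.g. a publisher
    suffix: str  # chosen freely by the registrant, e.g. an ISBN


def parse_doi(doi: str) -> Doi:
    """Split a DOI at the first slash into prefix and suffix."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return Doi(prefix, suffix)


def resolver_url(doi: str, resolver: str = "https://doi.org/") -> str:
    """Build the URL that asks the central directory to route the request.

    Assumption: the resolver address is the one in public use today.
    """
    parsed = parse_doi(doi)
    # Percent-encode the suffix so that unusual characters survive in a URL.
    return f"{resolver}{parsed.prefix}/{quote(parsed.suffix, safe='/')}"


# Hypothetical DOI whose suffix is an existing ISBN.
example = parse_doi("10.9999/0306406152")
```

The user only ever sees the DOI; where the directory sends the request can change without the identifier changing.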
However, the DOI Foundation is still grappling with the issue of which digital objects the DOI should identify. The creative communities as well as some more traditional copyright industries see the point of departure as the creative work or its manifestation. They see the initial work as the "core" to be identified, acknowledging that it may have digital versions. Even from that viewpoint, however, the task is difficult, given that there is no uniform identification system for those works and manifestations and "no widely accepted data model defining all creative and publishing acts, necessary in placing [creations] in a digital world." If the original works are identified using DOIs, should the various "physical" manifestations receive DOIs? What about products such as books, journals, articles and abstracts? The information industry, on the other hand, starts with digital objects that can be traded and has no need or desire to go "upstream" back to the original work.
The conclusion drawn by the DOI Foundation is that no single identifier is capable of serving all purposes. This is not fatal, however, because the DOI is not "just" an identifier. Rather, it is a structure in which other identifiers can be used to create a new identifier. With this structure in place, it is likely that the DOI and interested parties will also be able to offer an electronic copyright-management system solution, at least for print publications (in paper or digital form), probably late in 1999.
The International Standard Audiovisual Number (ISAN) is a joint development of the International Confederation of Societies of Authors and Composers, the International Federation of Film Producers Associations, and the Association de Gestion Internationale Collective des Oeuvres Audiovisuelles. The audiovisual number has reached the level of "committee draft" within the International Organization for Standardization (ISO), and has been submitted to national ISO committees. The proposed identifier is a sixteen-digit dumb number that may be used to identify audiovisual works of all kinds. It is an identification number without any legal implication or meaning and has no prima facie evidence value as regards the copyright status or ownership of the work. It does not identify rights owners, even though it will be a tool used by people concerned with copyright management as well as by many people interested in precise identification of audiovisual works. In other words, the number is a mere pointer to a database where information necessary for the identification of content is maintained. The proposal is to affix the number to the work: on masters and copies, whether in analog or digital format, on packaging, contracts, etc. The system is administered by an ad hoc, non-profit-making, international agency. The system and the information in the identification database will be open to any interested user. A fee will be charged to access the database. Many collective management organizations active in the audiovisual field plan to use the number as a key feature of the International Database on Audiovisual Works, a database of rights ownership in audiovisual works to be used for collective rights-management purposes.
There are various proposals to upgrade the standard Internet Uniform Resource Locators (URLs). The problem is that when a digital resource moves from one "page" or file on a server to another or from one server to another, the URL also changes. A user who enters the original URL in the browser gets the infamous "error 404" message, meaning that the resource is no longer available at that address. Persistent URLs (PURLs) are URLs that point to a server that can be updated (a system not unlike the DOI directory). "Instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. The client can then complete the URL transaction in the normal fashion. In Web parlance, this is a standard HTTP 'redirect.'"
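The resolution step described in the quotation can be sketched as a lookup table that answers with a redirect. This is a toy model under stated assumptions, not the actual PURL service: the class name, the paths, and the example.org addresses are all hypothetical, and a real resolver would sit behind an HTTP server.

```python
from typing import Dict, Tuple


class PurlResolver:
    """Toy resolution service: maps a persistent path to the current URL."""

    def __init__(self) -> None:
        self._targets: Dict[str, str] = {}

    def register(self, purl_path: str, target_url: str) -> None:
        """Create a PURL, or point an existing one at a new location."""
        self._targets[purl_path] = target_url

    def resolve(self, purl_path: str) -> Tuple[int, Dict[str, str]]:
        """Return an HTTP status code and headers, as a redirect would."""
        target = self._targets.get(purl_path)
        if target is None:
            return 404, {}
        # 302 tells the client to fetch the resource at the Location given.
        return 302, {"Location": target}


resolver = PurlResolver()
resolver.register("/docs/report", "http://old.example.org/report.html")
# The resource moves; only the resolver's table changes, not the PURL.
resolver.register("/docs/report", "http://new.example.org/archive/report.html")
```

The persistent address published to users stays the same; only the entry held by the resolution service is updated when the resource moves.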
[Annex II follows]
The Dublin Core is an attempt to identify the "core" elements of metadata that are needed to satisfy the needs of all those involved in the exchange of or commerce in electronic-information resources. It was developed over a three-year period at workshops in which experts from the library world, the networking and digital library research communities, and a variety of content specialties participated. This Core was named after the city in Ohio in which the first meeting was held.
Originally, the Dublin Core contained fifteen core elements: Title, Subject, Description, Creator (or primary contributor), Contributor, Publisher, Date, Type, Format, Identifier, Source (previous resource), Language, Relation (to another resource), Coverage (geographical or temporal) and Rights. In further meetings, other elements were added including the concept of a sub-element, which is used to qualify an element (for example, "date" can refer to a date of publication, or of a revision); a scheme, a label used to identify the method followed to identify the data (e.g., Dewey or MARC); and the language in which the metadata is entered, as opposed to the language of the resource itself.
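As an illustration of how the fifteen elements and the three later additions fit together, here is a minimal sketch of one metadata statement. It is not an official Dublin Core encoding: the class name, field names, and sample values are hypothetical and chosen only to mirror the terms used above.

```python
from dataclasses import dataclass
from typing import Optional

# The fifteen core elements listed in the text.
DUBLIN_CORE_ELEMENTS = (
    "Title", "Subject", "Description", "Creator", "Contributor",
    "Publisher", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


@dataclass(frozen=True)
class Statement:
    """One element/value pair describing a resource."""
    element: str
    value: str
    sub_element: Optional[str] = None        # qualifies the element, e.g. "publication"
    scheme: Optional[str] = None             # method used to express the value, e.g. "Dewey"
    metadata_language: Optional[str] = None  # language of the metadata, not of the resource

    def __post_init__(self) -> None:
        if self.element not in DUBLIN_CORE_ELEMENTS:
            raise ValueError(f"unknown element: {self.element!r}")


# Hypothetical record: a list of statements about one resource.
record = [
    Statement("Title", "Annual Report", metadata_language="en"),
    Statement("Date", "1999-07-17", sub_element="publication"),
    Statement("Subject", "346.0482", scheme="Dewey"),
    Statement("Language", "fr"),
]
```

Note that the Language element describes the resource (here French), while `metadata_language` records the language in which the description itself was entered.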
A number of other groups are working on standards that could have a direct impact on the future of the Dublin Core. While those standards are not for metadata per se, they affect the way metadata is coded, transmitted, used, retrieved, and accessed. For example, the World Wide Web Consortium is developing new markup languages and a new language for representing metadata in XML, the markup language designed to replace HTML.
A well-known public repository of metadata in the United States is the machine-readable cataloging-records database (known as MARC or US MARC). The MARC formats are standards for the representation and communication of bibliographic and related information in machine-readable form. The MARC was developed by the Library of Congress, the Canadian National Library, and the American Library Association along with the Australian National Library, Online Computer Library Center (OCLC), the Music Library Association, and the Special Libraries Association. The US MARC database contains approximately seven million records of publisher titles. It can be searched online. A MARC record contains three elements: the record structure, the content designation, and the data content of the record. The MARC benefits from the fact that it already applies to a vast number of titles. The question is whether (and how) it could be extended to apply to other types of content.
On the INDECS Project, see the supplementary background paper prepared by Ms. Koskinen-Olsson.
BIBLINK is not a metadata definition project as such, but rather a project that aims to establish a relationship and encoding model between national bibliographic agencies and publishers of electronic material, in order to establish authoritative bibliographic information that will benefit both sectors. It is intended to deliver an interactive demonstration system that will enable publishers of electronic documents to input and transmit an agreed minimum level of data describing the documents to national bibliographic services. BIBLINK is funded by the European Commission. The Networked European Deposit Library (NEDLIB) is sponsored by a group of European national libraries and particularly the National Library of the Netherlands. It started where BIBLINK ended. Launched in January 1998 and funded by the European Commission, NEDLIB is not a metadata project. Its chief aim is to "construct the basic infrastructure upon which a networked European deposit library can be built." Further work on NEDLIB might provide useful guidance on the use of metadata in transactions between librarians and publishers.
(1) Dr. Gervais is Vice President, International at Copyright Clearance Center, Inc. and partner at the Montreal-based firm of Brouillette, Charpentier, Fortin.
(1bis) In copyright terminology, a "work" refers to the incorporeal creation of the mind. A manifestation of that work is a physical embodiment that allows others to access the work. For example, a literary work (e.g., a poem) might be printed in several different books or on a poster, published on a Web page (in HTML format), or saved as a computer file (word-processing, PDF, etc.). Each of these renditions is a separate manifestation of the underlying work.
(1ter) "Digital Rights and Wrongs. Computers were supposed to be threatening copyright. Instead, they may end up making it stronger", The Economist, 17 July 1999.
(2) For a discussion of the various collective management organization models, see Mihály Ficsor. Collective Administration of Copyright and Neighboring Rights. WIPO, 1990.
(3) While this type of application is newer, theatrical performance of theater plays has functioned under this model for a very long time, but does not enter the scope of this paper which focuses on diffusion techniques, i.e., on reception of material by users other than by direct personal access (presence at a live concert, etc.).
(4) The adoption of international exhaustion (a territory of exhaustion might be the European Union) may impact on the application of this principle, but the principle remains nonetheless.
(5) See for example the on-line contract for Mira uses, Article 3, http://www.mira.com/Services/MoreTermsConditions.htm.
(6) See Daniel Gervais, "The Law and Practice of Digital Encryption", op. cit.
(7) For the purposes of this section, "privacy" relates to protection of consumer data, while "confidentiality" applies in a corporate environment.
(8) Anne W. Branscomb. Anonymity, Autonomy and Accountability: Challenges to the First Amendment in Cyberspaces. (1995) 104 Yale Law Journal, 1639. Professor Julie Cohen (University of Pittsburgh) has taken a similar view.
(9) It must be said that some "newer" rightholders who started to play that role due to the possibilities of digital technology have (impatiently) held that view since the very beginning.
Association of American Publishers: http://www.publishers.org
Australian Copyright Agency Limited's Copyright Xpress: http://www.copyright.com.au
The BIBLINK Project: http://hosted.ukoln.ac.uk/biblink/
ByLine: The Global Online Journalism Bank: http://www.universalbyline.com/scoop.html
CNRI Handle System®: http://www.handle.net/index.html
Copyright Clearance Center: http://www.copyright.com/
Copyright Licensing Agency CLARCS: http://www.cla.co.uk/www/clarcs.html
Corporation for National Research Initiatives: http://www.cnri.reston.va.us/about_cnri.html
The Digital Object Identifier Initiative: Current Position and View Forward: http://www.doi.org/white-paper-3.pdf
Discuss-DOI: http://www.doi.org/mailman/listinfo/discuss-doi
DONOR Homepage: http://www.konbib.nl/donor/
Dublin Core Metadata Initiative: http://purl.oclc.org/dc/
The IMPRIMATUR Business Model V2.1: http://www.imprimatur.alcs.co.uk/IMP_FTP/BMv2.pdf
Inc. Republication Request Form: http://www.copyright.com/Republication/inc.html
International DOI Foundation: http://www.doi.org/DOI-Found-Recruit.html
International Federation of Reproduction Rights Organisations: http://www.copyright.com/ifrro/
Media Image Resource Alliance: http://www.mira.com/
Metadata Activity: http://www.w3.org/Metadata/Activity.html
MIRA Terms and Conditions: http://www.mira.com/Services/MoreTermsConditions.htm
Stuart Weibel, Erik Jul and Keith Shafer. "PURLs: Persistent Uniform Resource Locators:" http://purl.oclc.org/oclc/purl/summary
The Road to XML: Adapting SGML to the Web: http://www.xml.com/xml/pub/w3j/s1.discussion.html
Stanford Digital Library Metadata Architecture: http://www.parc.xerox.com/istl/members/baldonad/medoc97-infobus.pdf
Brian Green and Mark Bide. "Unique Identifiers: a brief introduction:" http://www.bic.org.uk/bic/uniquid.html
U.S. Fair Use Guidelines: http://lcweb.loc.gov/copyright/circs/circ21.pdf
U.S. MARC Database: http://lcweb.loc.gov/marc/
Web Naming and Addressing Overview (URIs, URLs, . . .): http://www.w3.org/Addressing/
World Intellectual Property Organization (WIPO): www.wipo.int
WIPO e-commerce initiatives: http://ecommerce.wipo.int/
[Document continues]