Proposals for extending Ogg Vorbis comments
Metadata has become a ubiquitous part of digital audio formats, since attempting to encode all relevant information into a filename can quickly become unwieldy. MPEG audio attempted to do this after the fact, with ID3 tags, but subsequent standards have attempted to find a better way to build metadata into the files. One of the novel features that vorbis comments offer is that, unlike metadata formats that offer a fixed list of possible data types, vorbis comments are of the form “name=value”, placing no restrictions on the field name, which makes the format easily extensible. Also, any comment field may appear any number of times, allowing for cleaner descriptions of multi-valued fields, such as when more than one artist contributed to a song. These extensible comments have found their way into standards besides Ogg Vorbis, such as the Free Lossless Audio Codec, so effective usage of vorbis comments can be relevant to anyone’s audio collection.
The developers of the vorbis format created a base set of comment field types, all of which are optional, leaving the task of specifying what should appear in metadata primarily up to the user. Essentially, all that is offered is a small amount of semantic information so that certain common fields can be understood among all vorbis users. Opinions on what role metadata should play vary widely, and the vorbis developers view comments as “the equivalent of a quick note scribbled on the bottom of a CDR,” so the current comment specification is not likely to change. This leaves those who desire a more complete, better defined comment system on their own to find that definition.
For the most common cases, the given set of comment fields is sufficient, but it is lacking for attempts to encode information from some types of audio, most notably with DJ mixes. Also, the fields are (intentionally) very generic, which makes it difficult to search for someone with a more specific role in the creation of the piece, such as a conductor or remixer. Finally, the roles of certain fields, such as artist and performer, become less clear when there is more than one class of either. Although all of these shortcomings are intentional decisions in the design of vorbis comments, they make it, in its barest form, unsuitable to concisely and clearly encode information about an audio work. The suggestions offered here attempt to alleviate this problem.
This document is a perpetual work in progress; music and its presentation change, and so should its metadata representation change to better encode relevant information. The primary goal is clarity. Any information that does not fit well into the base specification should be given its own field, and the meaning of old fields should be made more specific to prevent overlap. Field names should be kept as generic as is reasonable, so as to allow reuse when the same role is found in a different genre of audio, but the meaning of a field should be discernible from the name itself. In the case of multiple occurrences of a field, the order of the fields should carry no meaning. If additional information is needed, different field names should be used. Most importantly, I want your input. Feel free to email david@gophernet.org to discuss your ideas. It would be nice if, after writing all this on what vorbis comments could be, more than one person used it.
There is another attempt to extend vorbis comments, listed in the references [3], which has goals similar to mine, conveniently allowing me to steal several of its ideas, but it has some severe flaws. The concept of singleton tags, though logical for some fields, drastically limits the flexibility of the comments when applied to AUTHOR. Ironically, the example given for multi-valued fields in the original specification uses AUTHOR. The addition of a language tag smacks of localized fields, which could result in non-portable field names. I don’t hate people who don’t speak English, but we have to agree on something. As for the contents of the fields, most fields will be in the language of the artist, such as the artist’s name, title of the work, &c, and thus most of the fields are effectively language-independent. Perhaps I don’t quite understand the scope of the problem, but currently I think that the idea of language tags would only damage portability. My final gripe is that the author of the recommendations seems to have an irrational fear of spaces. Spaces are explicitly allowed in field names and should be used when appropriate instead of creating jumbled messes of letters.
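The “name=value” form, repeated fields, and spaces in field names can all be handled with very little code. What follows is a minimal sketch in Python, assuming the comments have already been read out of the file as decoded strings. The helper name parse_comments is hypothetical and not part of any Vorbis library. Field names are upper-cased because the Vorbis comment header treats them as case-insensitive.

```python
from collections import defaultdict


def parse_comments(lines):
    """Group "name=value" comment strings by field name.

    Field names are upper-cased (they are case-insensitive);
    values are kept verbatim.
    """
    fields = defaultdict(list)
    for line in lines:
        # Split on the first "=" only: values may contain "=" themselves.
        name, sep, value = line.partition("=")
        if not sep or not name:
            raise ValueError(f"not a name=value comment: {line!r}")
        fields[name.upper()].append(value)
    return dict(fields)


comments = parse_comments([
    "ARTIST=Johnny Cash",
    "PRODUCER=Bob Johnston",
    "PRODUCER=Bob Irwin",
    "RELEASE DATE=2000",
])
print(comments["PRODUCER"])      # ['Bob Johnston', 'Bob Irwin']
print(comments["RELEASE DATE"])  # ['2000']
```

The lists happen to keep the order in which fields were read, but since the order of repeated fields is meant to carry no meaning, callers should not rely on it.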
Involved People
The base specification provides ARTIST and PERFORMER. I find this a bit sparse.
- ARTIST
- Perhaps the best description for this field is “the person or group whose name is on the CD cover.” This is perhaps the most ambiguous of the fields, since the person thought of as responsible for the work varies by genre. In popular music, this would be the performing band. In classical music, this is usually the composer, but it may sometimes be a performer. In spoken word tracks (e.g., audio books), this is the author of the work. In DJ mixes, this is the DJ. Essentially, this field allows for the most prominent person or persons involved in the work to be easily displayed.
- SOURCE ARTIST
- The artist of the work being performed, when different from ARTIST. This field applies for DJ mixes, since the person or group in the ARTIST field will be making a performance of a recording done by another artist.
- COMPOSER
- The author of the work. This may be a composer in classical works sold by performer or a songwriter in popular music if different from ARTIST.
- PERFORMER
- A performer of a work not acting in another specific role, such as conductor or artist. This would include the speaker of audio works and sometimes the performer of classical works in cases when ARTIST or ENSEMBLE isn’t suitable. Instrument could be included in square brackets, such as in “PERFORMER[CLARINET]” or “PERFORMER[GUITAR]”, though this may make searching for a specific performer more difficult.
- ENSEMBLE
- The group performing the piece, when different from ARTIST. For example, this would be the orchestra in classical works.
- CONDUCTOR
- The leader of an ensemble performing the piece.
- REMIXER
- If the version of the sound recording is a remix, i.e., a new version created from the original tracks, the remixer should be included here, if known.
- PRODUCER
- The person responsible for the project, usually providing funding and some form of artistic direction.
- ENGINEER
- The person responsible for creating the mastered record from the recorded tracks, or, for live recordings, for keeping the sound levels properly balanced. Sometimes the sound engineer is also referred to as the “producer”; if these are the same people, PRODUCER is the preferred field.
- GUEST ARTIST
- Sometimes when doing a collaboration with another artist, there will be a primary artist in charge of the recording and a guest performer or guest performers. This field should be used for artists in a collaboration whose role should be distinguished from that of the primary ARTIST.
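The square-bracket form suggested for PERFORMER does make searching harder, but only slightly if the tool strips the subscript before comparing names. Here is a rough sketch of that idea in Python. The helpers split_subscript and find_people are hypothetical, the performer names are placeholders, and comments are assumed to arrive as (field name, value) pairs.

```python
import re

# FIELD or FIELD[SUBSCRIPT]; the subscript part is optional.
_SUBSCRIPTED = re.compile(r"(?P<base>[^\[\]]+?)(?:\[(?P<sub>[^\[\]]*)\])?")


def split_subscript(field_name):
    """Return (base name, subscript or None) for a comment field name."""
    match = _SUBSCRIPTED.fullmatch(field_name)
    if match is None:
        raise ValueError(f"malformed field name: {field_name!r}")
    return match.group("base").upper(), match.group("sub")


def find_people(comments, role):
    """Yield (person, subscript) for every field whose base name is `role`."""
    for name, value in comments:
        base, sub = split_subscript(name)
        if base == role.upper():
            yield value, sub


comments = [
    ("PERFORMER[CLARINET]", "First Player"),   # placeholder names
    ("PERFORMER", "Second Player"),
    ("CONDUCTOR", "Some Conductor"),
]
print(list(find_people(comments, "PERFORMER")))
# [('First Player', 'CLARINET'), ('Second Player', None)]
```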
Album information
These fields should be the same for every track in an album. Vorbis provides ALBUM and ORGANIZATION. ORGANIZATION is a bit ambiguous, though, since different companies are often involved in production and publishing, so PUBLISHER was added. To provide a unique identifier for an album in the spirit of ISRC for tracks, PRODUCTNUMBER and CATALOGNUMBER were added.
I know I said earlier that spaces should be used in field names where appropriate, which PRODUCTNUMBER and CATALOGNUMBER ignore, but TRACKNUMBER kind of set the trend for these.
- ALBUM
- The name of the album or collection from which this track was taken
- ORGANIZATION
- The organization responsible for producing the album; i.e., the record label.
- PUBLISHER
- The organization responsible for publishing the album. This is often the same as ORGANIZATION. If this is the case, both should be given to allow for searching by PUBLISHER and CATALOGNUMBER, but if PUBLISHER is missing, it should be assumed to be the same as ORGANIZATION.
- PRODUCTNUMBER
- The Universal Product Code, EAN, or JAN code used to identify the album if it is a retail product. These are given in the form of a bar code. CDs will almost always use either the 12-digit UPC-A or 13-digit EAN-13 forms. The various bar-code systems are intended to be compatible with one another, so no identification of the code type should be necessary.
- CATALOGNUMBER
- The number used by the publisher to uniquely identify a recording. On CDs, this is usually printed along the spine and is occasionally the same as the PRODUCTNUMBER.
- VOLUME
- The volume number for a multi-volume work, such as a multi-disc album or boxed set. This doesn’t necessarily have to be a number; it could, for example, be a subtitle for the volume.
- RELEASE DATE
- The date that the album was published. This can be used to distinguish among various remasters and re-releases of an album. See DATE for suggestions on format. This date is often also included, though in a different form, with the COPYRIGHT information.
- SOURCE MEDIUM
- The medium from which the track was ripped; e.g., CD, Radio, Cassette, Vinyl LP
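Two of the rules above are mechanical enough to sketch in code: the fallback from PUBLISHER to ORGANIZATION, and the claim that UPC-A and EAN-13 codes need no type marker. The Python below is illustrative only. The function names are hypothetical, and fields is assumed to be a mapping from upper-case field name to a list of values. The check-digit rule is the standard one shared by UPC-A and EAN-13 (weights 3, 1, 3, 1, ... counted from the right), which is why one routine can cover both lengths.

```python
def effective_publisher(fields):
    """Return PUBLISHER values, falling back to ORGANIZATION when absent."""
    return fields.get("PUBLISHER") or fields.get("ORGANIZATION", [])


def has_valid_check_digit(product_number):
    """Check the last digit of a 12-digit UPC-A or 13-digit EAN-13/JAN code."""
    if not (product_number.isascii() and product_number.isdigit()):
        return False
    if len(product_number) not in (12, 13):
        return False
    *body, check = (int(ch) for ch in product_number)
    # Counting from the right, body digits are weighted 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


print(effective_publisher({"ORGANIZATION": ["Columbia"]}))  # ['Columbia']
print(has_valid_check_digit("074646601723"))   # True  (UPC-A)
print(has_valid_check_digit("0074646601723"))  # True  (same code as EAN-13)
print(has_valid_check_digit("074646601724"))   # False
```

A UPC-A code padded with a leading zero is a valid EAN-13 code with the same check digit, so a tagger can store whichever form is printed on the case.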
Track information
These fields are used to identify the specific song, and should usually appear only once in a track. The main addition I stole from [3] is OPUS, to more specifically identify a classical work. Vorbis provides TITLE, VERSION, TRACKNUMBER, DESCRIPTION, GENRE, DATE, LOCATION and ISRC.
- TITLE
- The title of the work
- SUBTITLE
- This field is intended for use with FLAC, in order to connect specific titles with an embedded cue sheet. A single file can effectively contain multiple works, indexed by TRACK and INDEX in the case of cue sheets, and they can be specified using subscripts like “SUBTITLE[TRACK 3]” or “SUBTITLE[TRACK 7:INDEX 2]”. This should only be used for the case of multiple works in the same file and not for cases where a single work has multiple titles.
- PART
- When a single work is spread across multiple files, this is to be used for the title of the portion of the work. For example, a symphony with multiple movements could use TITLE for the name of the symphony and PART for the movement.
- VERSION
- This field should be used to differentiate multiple versions of a track in a collection or provide remix information.
- OPUS
- The number of the work, if applicable. This is not always referred to as “Opus”, and as such should include the name of the numbering system. For example, a Bach work may be referred to as “BWV 872”, and a Mozart work might be referred to as “KV 339”. For clarity, the same abbreviation should be used for each system consistently (e.g., pick one of K and KV for Mozart), and, preferably, conflicts in abbreviation should be avoided (K could be for Köchel or Kirkpatrick). I suggest you use “BWV” for Bach-Werke-Verzeichnis, “KV” (Köchelverzeichnis) for Köchel’s catalog of Mozart’s works, “K” for Kirkpatrick’s catalog of Scarlatti’s works, “RV” for Ryom-Verzeichnis when used for Vivaldi, and just spell out “Opus”.
- SOURCE WORK
- In the case of soundtracks or music inspired by a movie or play, this field is intended for the original work from which the music was taken.
- TRACKNUMBER
- The index number of the work in a collection.
- SPARS
- The Society of Professional Audio Recording Services designation of whether the recording process was analog or digital. It consists of three components, for the recording, mixing, and mastering of the recording. These designations are commonly seen on classical music CDs. DDD would be a completely digital process, ADD would be an analog recording that was digitally mixed and mastered, AAD would be analog recording and mixing, and so on.
- DESCRIPTION
- A short text description of the contents.
- GENRE
- A short text description of the genre
- DATE
- The date the track was recorded. I recommend using a variant of the ISO 8601 date and time representation, modified to look less stupid. Using YYYY/MM/DD HH:MM:SS as a basis, include as much information as known and omit the rest. “2004” would be a recording made in the year 2004, and “2002/04/01 13” would be a recording made at 1PM on April 1, 2002. The time should be given in the time zone local to LOCATION. Season can be used by spelling it out and placing it in the same order of specificity; for example, “2004 Fall”. Ranges can be given as date1-date2.
- LOCATION
- The recording studio, venue, or other physical location where the recording took place.
- ISRC
- The International Standard Recording Code number for the track. This is used to identify the track for royalty purposes, and, if used, is included as part of the table of contents of audio CDs.
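The DATE format recommended above (as much of YYYY/MM/DD HH:MM:SS as is known, a spelled-out season after the year, and ranges written date1-date2) is regular enough to parse with two patterns. The following is one possible sketch in Python, not a reference implementation. The function names and the list of accepted season names are my own assumptions, and no attempt is made to reject impossible values such as month 13.

```python
import re

# Year, then each finer component only if the coarser one is present.
_DATE = re.compile(
    r"(?P<year>\d{4})"
    r"(?:/(?P<month>\d{2})"
    r"(?:/(?P<day>\d{2})"
    r"(?: (?P<hour>\d{2})"
    r"(?::(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2}))?)?)?)?)?"
)
# Assumed season vocabulary; the text only gives "Fall" as an example.
_SEASON = re.compile(
    r"(?P<year>\d{4}) (?P<season>Spring|Summer|Fall|Autumn|Winter)"
)


def parse_date(text):
    """Return a dict holding only the components present in `text`."""
    season = _SEASON.fullmatch(text)
    if season:
        return {"year": int(season.group("year")),
                "season": season.group("season")}
    match = _DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized date: {text!r}")
    return {k: int(v) for k, v in match.groupdict().items() if v is not None}


def parse_date_field(value):
    """Return (start, end); end is None unless the value is a range."""
    start, sep, end = value.partition("-")
    return parse_date(start), (parse_date(end) if sep else None)


print(parse_date_field("2002/04/01 13"))
# ({'year': 2002, 'month': 4, 'day': 1, 'hour': 13}, None)
print(parse_date_field("1951/06-1954/03"))
# ({'year': 1951, 'month': 6}, {'year': 1954, 'month': 3})
```

Using “/” rather than “-” inside a date is what makes the range form unambiguous: the only hyphen in a value is the one separating the two dates.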
Copyright Information
Vorbis provides COPYRIGHT, LICENSE and CONTACT, and I see no reason to expand from these. I did, however, clarify LICENSE, since the inclusion of “All Rights Reserved” was more likely a half-hearted attempt to provide everyone a means of using LICENSE rather than a particular love of the Buenos Aires copyright convention. Also, for the COPYRIGHT field, I disagree with the method of display in vorbis-tools. The current version of vorbis-tools (1.1) will display COPYRIGHT fields as “Copyright field contents”, assuming that the field contents are always of the form “year copyright holder”, which is not always correct. Copyright statements on audio works often include the information for both the song’s copyright and the recording’s copyright, and they are often more complex than just a year and copyright holder, so I suggest that you include the appropriate symbol, © or ℗, and vorbis-tools should be modified to remove its assumption. (Windows seems to have trouble displaying the second symbol; it’s a circle-P, sound recording copyright, Unicode U+2117.)
These fields may be different for each track in an album, and COPYRIGHT and LICENSE, at least, should probably only appear once in a track.
- COPYRIGHT
- Copyright attribution; e.g., “© 2001 Nobody’s Band” or “℗ 2001 Lightning Records”.
- LICENSE
- License information for redistributable works. For example, “Any use permitted” or a URL to a license such as a Creative Commons license (Distributed under the terms of the Creative Commons Attribution license. See http://creativecommons.org/licenses/by/2.0/ for details.) or the EFF Open Audio License. Works not licensed for redistribution should not include this field.
- CONTACT
- Contact information for the creators or distributors of the track. This could be a URL, an email address, or the physical address of the producing label.
Examples
Here are a few examples, taken from my CD collection. Of course, I don’t listen to every possible type of music, so if you have some examples that you think would be helpful, feel free to send them along, especially if they aren’t from a CD.
ARTIST=Johnny Cash
COMPOSER=S. Silverstein
PRODUCER=Bob Johnston
PRODUCER=Bob Irwin
ENGINEER=Neil Wilburn
ENGINEER=Bob Breault
ALBUM=Johnny Cash at San Quentin
ORGANIZATION=Columbia
PUBLISHER=Columbia/Legacy
PRODUCTNUMBER=074646601723
CATALOGNUMBER=CK 66017
RELEASE DATE=2000
TITLE=A Boy Named Sue
TRACKNUMBER=11
GENRE=Country
DATE=1969/02/24
LOCATION=San Quentin Prison
ISRC=USSM19901986
COPYRIGHT=© Sony Music Entertainment Inc., ℗ 2000 Sony Music Entertainment Inc.
SOURCE MEDIUM=CD

ARTIST=Dieselboy
SOURCE ARTIST=Dom
SOURCE ARTIST=Kemal
REMIXER=D. Higgins
REMIXER=C. Ritter
REMIXER=K. Danner
PRODUCER=Damian Higgins
PRODUCER=Eric Silver
PRODUCER=Louis Montorio
ENGINEER=Rick Essig @ The Master Cutting Room, NYC
ALBUM=The Dungeonmaster's Guide
ORGANIZATION=Human Imprint Recordings
ORGANIZATION=System Recordings
PUBLISHER=The Greenwich Music Group
PRODUCTNUMBER=820997800823
CATALOGNUMBER=HUMA8008-2
VOLUME=The Dungeon Master's Guide
RELEASE DATE=2004
TITLE=Moulin Rouge
VERSION=Dieselboy + Kaos + Karl K Remix
TRACKNUMBER=14
GENRE=Drum and Bass
DATE=2004
LOCATION=Philadelphia, D.Cell
COPYRIGHT=℗ & © Moving Shadow LTD. Courtesy of Moving Shadow LTD.
SOURCE MEDIUM=CD

ARTIST=Wanda Landowska
COMPOSER=J.S. Bach
ENGINEER=Nathaniel S. Johnson
ENGINEER=James Nichols
ALBUM=The Well-Tempered Clavier, Book II
ORGANIZATION=RCA Victor
ORGANIZATION=Red Seal
PUBLISHER=BMG Classics
PRODUCTNUMBER=078635782523
CATALOGNUMBER=7825-2-RC
VOLUME=Disc 2
RELEASE DATE=1988
TITLE=Fugue IX in E Major
OPUS=BWV 878
TRACKNUMBER=2
GENRE=Baroque
DATE=1951/06-1954/03
LOCATION=Lakeville, Connecticut, Wanda Landowska's home
COPYRIGHT=© 1988, BMG Music, ℗ 1988, BMG Music.
SOURCE MEDIUM=CD

ARTIST=Beethoven
ENSEMBLE=Berlin Philharmonic Orchestra
CONDUCTOR=Andre Cluytens
PRODUCER=Ken Kahn
ALBUM=Best of the Great Composers
ORGANIZATION=Seraphim
PUBLISHER=CEMA Special Markets
PRODUCTNUMBER=077775786422
CATALOGNUMBER=S21-57864
RELEASE DATE=1992
TITLE=Symphony No. 9 in D minor
PART=II. Molto vivace
OPUS=Opus 125
TRACKNUMBER=3
GENRE=Classical
COPYRIGHT=℗ © 1992 CEMA Special Markets
SOURCE MEDIUM=CD

ARTIST=They Might Be Giants
PRODUCER=Danny Bramson
PRODUCER=Guy Oseary
PRODUCER=Pat Dillett
PERFORMER=Robin "Goldie" Goldwasser
ALBUM=Austin Powers, the Spy Who Shagged Me: More Music from the Motion Picture
ORGANIZATION=Maverick Recording Company
PUBLISHER=Maverick Recording Company
PRODUCTNUMBER=093624753827
CATALOGNUMBER=9 47538-2
RELEASE DATE=1999
TITLE=Dr. Evil
SOURCE WORK=Austin Powers: The Spy Who Shagged Me
TRACKNUMBER=11
GENRE=Pop
COPYRIGHT=Courtesy of New Line Music Co., ℗ 1999 New Line Productions, Inc.
SOURCE MEDIUM=CD
Remaining Ambiguities and Holes
Occasionally an artist will be given as something of the form “DJ Hejaz presents Sideproject Beatz”. This practice seems to be most common in electronic music, but might occur elsewhere. The artist is Sideproject Beatz, but it’s really just DJ Hejaz performing under a different name. Since this form could, theoretically, appear in any of the involved people roles, encoding this would require side-project fields for each possible involved person field, which would be a big mess and would fail to connect an artist to his side project in the case of multiple artists. I recommend simply dropping the information about DJ Hejaz and using “Sideproject Beatz” as the field content, or, if you really want the whole thing, just use the entire “presents” form as the artist name.
References and further reading
- Ogg Vorbis I format specification: comment field and header specification
- Goals and non-Goals of Vorbis Comments
- Ogg Vorbis Comment Field Recommendations
Credits
Initial version, 2004/09/28, David Shea. Thanks to David Cantrell and Simon Fowler for suggestions and proofreading.
2004/11/13, removed SOLOIST after Attila Bogár pointed out that the difference between it and PERFORMER is unclear.
2004/11/21, added PART and a suggestion for making LOCATION sortable, again from Attila Bogár.