PBCore sessions at the 2010 AMIA Conference

Written by Jack Brighton on Friday, October 29, 2010

The 2010 AMIA Conference is next week. (How’d it get to be November already?!) For those PBCore denizens attending, please note two sessions that will be of particular interest, both on Thursday, November 4th:

8:30am - 10:00am
Moving to a Digital Asset Management Environment: A Case Study on Fresh Air
Chair: Dave Rice - AudioVisual Preservation Solutions
Speakers: Daniel Pisarski - TelVue Corporation, and Julian Herzfeld - WHYY

Since 1975 WHYY’s production, Fresh Air, has generated thousands of 1/4” analog reels, DAT tapes, CDs, and digital files, as well as even more Microsoft Word and Excel documents reflecting a disconnected set of rights, inventory, descriptive, and technical information. This panel looks at all aspects of an initiative to assemble Fresh Air’s metadata collections under PBCore while bringing digital media and metadata into a production-oriented digital asset management system.

11:00am - 12:00pm
Coming Attraction: PBCore 2.0
Chair: Courtney Michael - WGBH Media Library & Archives
Chris Beer - WGBH Interactive
Speakers: Courtney Michael - WGBH Educational Foundation
Jack Brighton - University of Illinois
Katrina Dixon - Northeast Historic Film
Kara Van Malssen - Broadway Video Digital Media

A number of metadata standards are in use in the library and archival community. However, few are both adequate and easy to use for describing media collections. PBCore is a metadata standard that was developed specifically to describe media, and many in the moving image archival community have begun to use it. After a two-year development hiatus, a new initiative has launched to continue development of the standard and bring it to PBCore 2.0. This session will give an overview of PBCore: why it is a good standard for media collections, and the work to date toward PBCore 2.0. It will demo and tour the newly redesigned PBCore.org website, highlighting changes, navigation, and the community input features. Finally, several use cases will show practical use of PBCore in real archive projects. The session will end with a roundtable discussion to gather more feedback from the AMIA/IASA community and take questions.


Metadata as media

Written by Jack Brighton on Sunday, October 17, 2010

(As always, I speak only for myself as a media producer and archivist.)

I attended the panel discussion on PBCore at the recent Open Video Conference, and was struck by something that should have been obvious. Those of us pushing development of PBCore have failed to clarify one basic thing: What is PBCore for? I’ve been to workshops and sessions on PBCore over the past six years and have been on metadata panels at the AMIA Conference, the PBS Tech Conference, NETA, and iMA. We often focused on explaining the PBCore elements and why they are useful for cataloging media assets. But at the OVC, the question was raised: “Why do we need PBCore to catalog our stuff if we already have a good media database?” The question reveals a conflation of two distinct things: having a media database, and being able to easily interoperate with other databases.

Most of us producers are not, after all, experts in database administration, XML, or programming in general. During the American Archive Pilot Project I talked with people at other public TV and radio stations trying to “use” PBCore without adequate tools, and without understanding why they had to use it in the first place. If the answer is “you need a data model for cataloging your media assets,” there are many other catalog and data models. A better answer is you use PBCore to create shareable metadata. If you have a media collection and you want to combine it with other collections, PBCore provides a machine-readable translation layer between systems.

Some people have asked: why use PBCore instead of something simpler like RSS or Atom? I think that’s a really good question.

You can stuff lots of descriptive metadata into RSS or Atom. Their schemas are understood by a wide range of applications, and they are simple to implement. At the other end of the spectrum, MPEG-7 provides an exhaustive schema for describing multimedia content…emphasis on the word “exhaustive.”

PBCore is somewhere between the simplicity of RSS and the verbose complexity of MPEG-7. It provides a level of detail useful to media archives, without being ridiculous to implement. It’s sort of a “just right” format, allowing simple producers like me to share a great deal of useful metadata about my media assets with any system that can parse PBCore XML.

So in the example of the American Archive Pilot Project, my station used a MySQL-based Content Management System to catalog several hundred media assets. (See my earlier post for details.) With the CMS I could render web pages and an RSS feed, plus PBCore records for each asset. The AAPP project portal could just ingest my PBCore records into the national AAPP database, getting much more detail on each asset than would be provided by RSS.

Audio and video are media that connect human beings. RSS and PBCore are media that connect machines.

You can have a fantastic media database not designed around the PBCore standard. You can create a PBCore representation of that database by exporting XML records based on the PBCore schema. You can create other representations of your data based on RSS, JSON, and other formats. An example of this is the NPR API query generator, which provides multiple output format options including RSS, Atom, HTML, and NPR’s proprietary (and wonderfully detailed and useful) NPRML.
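As a rough sketch of that idea, the Python below renders one database row as both JSON and a PBCore-style XML record. The row fields, function names, and sample values are invented for illustration. The element names are an assumption modeled on the PBCore 1.x pattern, and the namespace is left out for brevity, so check everything against the published schema before relying on it.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical row from a local media database; field names are invented.
ASSET = {
    "id": "example-0001",
    "title": "Sample interview",
    "description": "A placeholder description.",
}


def to_json(asset):
    """Render the asset as JSON, e.g. for a web API."""
    return json.dumps(asset, sort_keys=True)


def to_pbcore_like_xml(asset):
    """Render the same asset as a PBCore-style XML record.

    Element names are assumptions modeled on PBCore 1.x;
    validate real output against the published schema.
    """
    root = ET.Element("PBCoreDescriptionDocument")

    ident = ET.SubElement(root, "pbcoreIdentifier")
    ET.SubElement(ident, "identifier").text = asset["id"]
    ET.SubElement(ident, "identifierSource").text = "local CMS"

    title = ET.SubElement(root, "pbcoreTitle")
    ET.SubElement(title, "title").text = asset["title"]

    desc = ET.SubElement(root, "pbcoreDescription")
    ET.SubElement(desc, "description").text = asset["description"]

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Two representations of the same underlying record.
    print(to_json(ASSET))
    print(to_pbcore_like_xml(ASSET))
```

The local database stays whatever shape suits the station; each output format is just another view generated from it.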

Given the flexibility of today’s tools, we can generate multiple different representations of our media database for different purposes. So what’s the use-case scenario with PBCore? With RSS or Atom, we know that many other systems can ingest our data. What systems can ingest PBCore?

A growing number of systems that speak PBCore have the word “archive” in them. Importantly, I hear the American Archive would adopt PBCore as a primary means for ingesting metadata from contributor collections. This would allow each contributor to use whatever database best suits their local needs, as long as each local system can create shareable metadata in the PBCore format.
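To make the ingest side concrete, here is a minimal sketch of how an archive might pull records from several contributors into one catalog. It is not how the American Archive actually works: the sample records, station names, and function names are all hypothetical, the element names are assumed from the PBCore 1.x pattern, and real records would carry a namespace that this sketch omits.

```python
import xml.etree.ElementTree as ET

# Hypothetical exports from two stations that run different local systems.
STATION_A = """<PBCoreDescriptionDocument>
  <pbcoreIdentifier><identifier>a-001</identifier></pbcoreIdentifier>
  <pbcoreTitle><title>Station A program</title></pbcoreTitle>
</PBCoreDescriptionDocument>"""

STATION_B = """<PBCoreDescriptionDocument>
  <pbcoreIdentifier><identifier>b-042</identifier></pbcoreIdentifier>
  <pbcoreTitle><title>Station B program</title></pbcoreTitle>
</PBCoreDescriptionDocument>"""


def ingest(xml_text, contributor):
    """Parse one PBCore-style record into a plain dict for the shared catalog."""
    root = ET.fromstring(xml_text)
    return {
        "contributor": contributor,
        "identifier": root.findtext("pbcoreIdentifier/identifier"),
        "title": root.findtext("pbcoreTitle/title"),
    }


def build_catalog(exports):
    """Combine records from many contributors, keyed by contributor name."""
    return [ingest(xml_text, name) for name, xml_text in exports.items()]


if __name__ == "__main__":
    catalog = build_catalog({"Station A": STATION_A, "Station B": STATION_B})
    for entry in catalog:
        print(entry)
```

The archive only needs one parser, because every contributor exports the same shape regardless of what database sits behind it.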

PBCore won’t be used by media consumers and consumer-level applications like iTunes. PBCore will be used by media archives and the systems that contribute to them. That’s what PBCore is for.

