PBCore.org site refresh a welcome sight
If you haven’t been to http://pbcore.org lately you’re in for a good surprise. The site has been completely rebuilt, and contains up-to-date documentation, news, case studies, and most importantly, the complete PBCore 2.0 schema. There, I buried the lead: PBCore 2.0 has been officially released!
I might quibble with the color scheme of the new site, and I see some obvious CSS tweaks that could improve readability. But hey, my own websites need lots of work so who am I to talk? I have to give the folks at WGBH, who rebuilt the new PBCore site from the bones and ashes of its 2005 incarnation, lots of credit for getting things basically right.
Things I really like: In addition to the 2.0 Schema, there’s a How To section (contained in the Documentation menu item in the sidebar navigation) which offers really good guidance, and a Training section with clear instructions and code examples on things like “How to express collections in PBCore,” “How to sequence records within relationships,” and “How to express time segments within a video.” None of these was even possible prior to the 2.0 schema, and now we have clear documentation on how to do them.
Another area of the PBCore site that stands out is the Elements section. This section provides concise details on each of the PBCore 2.0 elements: its usage rules (e.g. minOccurs), where it appears in the schema, and its available attributes. I find the Elements section highly usable, but I find myself Command-clicking element names to open them in a new browser tab (I’m on a Mac…) so I don’t lose the Elements index page in the original tab. It might be more usable to navigate the Elements more like the old PBCore User Guide, where clicking on an Element doesn’t take you away from the navigation. But that’s a minor quibble, I’m a geek, and am never completely happy.
One thing I am happy about is the inclusion of a “related discussions” link on each Element page, which includes a link to the home page of pbcoreresources.org. This leads to an idea about how pbcoreresources could directly add value to pbcore.org. As we discuss various PBCore elements here, our posts get aggregated in categories like pbcoreTitle. So for example on the pbcoreCollection Element page on pbcore.org, the “related discussions” link could go directly to the pbcoreCollection category page on pbcoreresources.org. This assumes enough of us are contributing to pbcoreresources with questions, answers, examples, and other useful conversation about PBCore elements. We have done that to some extent, and I’m suggesting we do it more. PBCore.org can then mine those discussions to enhance the official documentation over time.
Or maybe pbcore.org will continue to grow and supplant some of the stuff we have been doing here, and I’d be OK with that. The new pbcore.org site is built on WordPress, and it does allow comments in many places including the How To pages and the Element pages. Wherever it happens, I expect the user community will continue to build a shared understanding of how to move forward with PBCore. And above all, to keep the keepers of the PBCore standard in tune with the needs and realities of real-world media producers, publishers, and archivists who use it every day.
Metadata as media
(As always, I speak only for myself as a media producer and archivist.)
I attended the panel discussion on PBCore at the recent Open Video Conference, and was struck by something that should have been obvious. Those of us pushing development of PBCore have failed to clarify one basic thing: What is PBCore for? I’ve been to workshops and sessions on PBCore over the past six years and have been on metadata panels at the AMIA Conference, the PBS Tech Conference, NETA, and iMA. We often focused on explaining the PBCore elements and why they are useful for cataloging media assets. But at the OVC, the question was raised “Why do we need PBCore to catalog our stuff if we already have a good media database?” The question reveals a conflation of two distinct things: having a media database, and being able to easily interoperate with other databases.
Most of us producers are not, after all, experts in database administration, XML, or programming in general. During the American Archive Pilot Project I talked with people at other public TV and radio stations trying to “use” PBCore without adequate tools, and without understanding why they had to use it in the first place. If the answer is “you need a data model for cataloging your media assets,” there are many other catalog and data models. A better answer is that you use PBCore to create shareable metadata. If you have a media collection and you want to combine it with other collections, PBCore provides a machine-readable translation layer between systems.
Some people have asked, “Why use PBCore instead of something simpler like RSS or Atom?” I think that’s a really good question.
You can stuff lots of descriptive metadata into RSS or Atom. Their schemas are understood by a wide range of applications, and they are simple to implement. At the other end of the spectrum, MPEG-7 provides an exhaustive schema for describing multimedia content… emphasis on the word “exhaustive.”
PBCore is somewhere between the simplicity of RSS and the verbose complexity of MPEG-7. It provides a level of detail useful to media archives, without being ridiculous to implement. It’s sort of a “just right” format, allowing simple producers like me to share a great deal of useful metadata about my media assets with any system that can parse PBCore XML.
So in the example of the American Archive Pilot Project, my station used a MySQL-based Content Management System to catalog several hundred media assets. (See my earlier post for details.) With the CMS I could render web pages and an RSS feed, plus PBCore records for each asset. The AAPP project portal could just ingest my PBCore records into the national AAPP database, getting much more detail on each asset than would be provided by RSS.
Audio and video are media that connect human beings. RSS and PBCore are media that connect machines.
You can have a fantastic media database not designed around the PBCore standard. You can create a PBCore representation of that database by exporting XML records based on the PBCore schema. You can create other representations of your data based on RSS, JSON, and other formats. An example of this is the NPR API query generator, which provides multiple output format options including RSS, Atom, HTML, and NPR’s proprietary (and wonderfully detailed and useful) NPRML.
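To make the idea of multiple representations concrete, here is a minimal sketch in Python. It takes one row from a hypothetical local database and renders it two ways: as a bare-bones PBCore-style XML record and as JSON. The asset data, the function names, and the "local-cms" identifier source are all made up for illustration. The namespace URI and the choice of elements (pbcoreIdentifier, pbcoreTitle, pbcoreDescription) are my assumptions about a minimal record, and the output has not been validated against the official schema, so check it against pbcore.org before relying on it.

```python
import json
import xml.etree.ElementTree as ET

# PBCore namespace URI (an assumption here; confirm against the published schema).
PBCORE_NS = "http://www.pbcore.org/PBCore/PBCoreNamespace.html"

# Hypothetical row from a local media database.
asset = {
    "id": "example-0001",
    "title": "Sample Program",
    "description": "A short description of the program.",
}


def to_pbcore_xml(row, id_source="local-cms"):
    """Render one database row as a minimal PBCore-style XML record."""
    # Serialize PBCore elements in the default namespace (no prefix).
    ET.register_namespace("", PBCORE_NS)
    doc = ET.Element(f"{{{PBCORE_NS}}}pbcoreDescriptionDocument")
    # The identifier carries a "source" attribute naming who issued the ID.
    ident = ET.SubElement(
        doc, f"{{{PBCORE_NS}}}pbcoreIdentifier", {"source": id_source}
    )
    ident.text = row["id"]
    ET.SubElement(doc, f"{{{PBCORE_NS}}}pbcoreTitle").text = row["title"]
    ET.SubElement(doc, f"{{{PBCORE_NS}}}pbcoreDescription").text = row["description"]
    return ET.tostring(doc, encoding="unicode")


def to_json(row):
    """Render the same row as JSON for systems that prefer it."""
    return json.dumps(row, sort_keys=True)


pbcore_record = to_pbcore_xml(asset)
json_record = to_json(asset)
print(pbcore_record)
print(json_record)
```

The point is that the local database stays whatever shape suits you. PBCore is just one more export format, alongside JSON or RSS, generated from the same rows.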
Given the flexibility of today’s tools, we can generate multiple different representations of our media database for different purposes. So what’s the use-case scenario with PBCore? With RSS or Atom, we know that many other systems can ingest our data. What systems can ingest PBCore?
A growing number of systems that speak PBCore have the word “archive” in them. Importantly, I hear the American Archive would adopt PBCore as a primary means for ingesting metadata from contributor collections. This would allow each contributor to use whatever database best suits their local needs, as long as each local system can create shareable metadata in the PBCore format.
PBCore won’t be used by media consumers and consumer-level applications like iTunes. PBCore will be used by media archives and the systems that contribute to them. That’s what PBCore is for.