Dave Rice, with the support of the Dance Heritage Coalition’s Secure Media Network managed by Bay Area Video Coalition, has created a PBCore 1.3 to 2.0 XSL translator tool. For valid input PBCore 1.3 records, the tool generates valid 2.0 records as the output.
This process revealed some difficult mapping challenges that have yet to be fully resolved. The main issue is the move away from elements like genreAuthorityUsed to the @source attribute. At first glance, it makes sense that genreAuthorityUsed would become pbcoreGenre source=“authority” in PBCore 2.0. But wait, there was an @source attribute for genreAuthorityUsed! So if the value of genreAuthorityUsed is now 2.0 pbcoreGenre source=“authority”, what happens to 1.3 genreAuthorityUsed source=“name”? It’s lost!
Given that the definition of @source in PBCore 2.0 is “Attribute source provides the name of the authority used to declare data value of the element,” for this and a few other elements, it appears to be impossible to create a semantically lossless mapping.
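To make the loss concrete, here is a minimal Python sketch of the mapping described above. It is not the XSL translator itself. The sample record, the authority names, and the function name genre_13_to_20 are made up for illustration, the namespace is omitted for brevity, and the 1.3 layout (genre and genreAuthorityUsed as children of pbcoreGenre, with an @source on genreAuthorityUsed) is assumed from the description in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical PBCore 1.3 fragment; namespace omitted and values invented.
RECORD_13 = """
<pbcoreGenre>
  <genre>Documentary</genre>
  <genreAuthorityUsed source="Example Authority Registry">PBCore genre picklist</genreAuthorityUsed>
</pbcoreGenre>
"""


def genre_13_to_20(genre_13):
    """Map a 1.3-style genre to a 2.0-style one.

    Returns the new element plus a list of 1.3 values that had nowhere to go.
    """
    value = genre_13.findtext("genre", default="")
    authority = genre_13.find("genreAuthorityUsed")

    genre_20 = ET.Element("pbcoreGenre")
    genre_20.text = value

    dropped = []
    if authority is not None:
        # The 1.3 element value becomes the 2.0 @source attribute...
        if authority.text:
            genre_20.set("source", authority.text.strip())
        # ...which leaves no slot for the @source that 1.3 carried.
        old_source = authority.get("source")
        if old_source:
            dropped.append(("genreAuthorityUsed/@source", old_source))
    return genre_20, dropped


genre_20, dropped = genre_13_to_20(ET.fromstring(RECORD_13))
print(ET.tostring(genre_20, encoding="unicode"))
print("Dropped:", dropped)
```

Collecting the dropped values instead of silently discarding them is one possible design choice: a translator could then log them or write them into a comment for later review.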
We have posted the translator to GitHub for public review. You will see in the header comments of the XSL a number of issues that have come up in the mapping, including the genreAuthorityUsed problem. Issues have also been identified for mapping 1.3 titleType, descriptionType, and subjectAuthorityUsed.
You can find the translator here: https://github.com/avpreserve/random-pbcore-metadata-translators/blob/master/13_to_Pbcore2.xsl
We would love to get feedback on this from the PBCore community!
If you haven’t been to http://pbcore.org lately you’re in for a good surprise. The site has been completely rebuilt, and contains up-to-date documentation, news, case studies, and most importantly, the complete PBCore 2.0 schema. There, I buried the lead: PBCore 2.0 has been officially released!
If I’ve learned one thing about the PBCore user community, it’s that we’re not satisfied with the current state of PBCore. We’ve used it enough to discover its strengths in describing AV assets and creating shareable metadata, but we keep running into its gaps and flaws. We’ve been pushing for a change process, and have argued for specific changes. Common threads have emerged right here on this site:
- A need for PBCore to support multi-part instantiations, e.g. when you have one complete work comprised of several reels or tapes or files.
- A need to express rights information related to a specific Instantiation, instead of only the entire asset. For example, you might want to allow users to download an mpeg4 version of a film for personal use, but not grant the same kind of access to the actual film!
- Speaking of rights, the formatting of the pbcoreRightsSummary element disallows inclusion of metadata from existing standards such as ODRL or Creative Commons, which seems odd to say the least. If you already have structured rights data, why not simply reuse it?
- A need to show relationships between Instantiations. For example, when you digitize a film to 10-bit uncompressed digital video, then encode an mpeg4 file from the 10-bit uncompressed file, it seems important to show that in the PBCore record.
- With pbcoreContributor, you can say that Harrison Ford is an Actor, but you can’t say what role he plays in the film.
- There’s no way to uniquely identify a person, subject term, location, or other value that might have an actual URI.
- The lack of attributes of any kind! Everything is elements and sub-elements, which seems inefficient and makes parsing more difficult.
- The lack of a valid way to identify clip information within an asset, for example where in the timeline a particular subject is discussed or a specific person appears.
- The lack of any way to bundle multiple PBCore XML records together in a feed or collection, so you could export/import large groups of records between systems or use PBCore in RESTful web applications.
Well, good news, folks! PBCore 2.0 is on the way, and it solves all these issues.
The 2010 AMIA Conference is next week. (How’d it get to be November already?!) For those PBCore denizens attending, please note two sessions that will be of particular interest, both on Thursday, November 4th:
8:30am - 10:00am
Moving to a Digital Asset Management Environment: A Case Study on Fresh Air
Chair: Dave Rice - AudioVisual Preservation Solutions
Speakers: Daniel Pisarski - TelVue Corporation, and Julian Herzfeld - WHYY
Since 1975 WHYY’s production, Fresh Air, has generated thousands of 1/4” analog reels, DAT tapes, CDs, and digital files as well as even more Microsoft Word and Excel documents reflecting a disconnected set of rights, inventory, descriptive, and technical information. This panel looks at all aspects of an initiative to assemble Fresh Air’s metadata collections under PBCore while bringing digital media and metadata into a production-oriented digital asset management system.
11:00am - 12:00pm
Coming Attraction: PBCore 2.0
Chair: Courtney Michael - WGBH Media Library & Archives
Chris Beer - WGBH Interactive
Speakers: Courtney Michael - WGBH Educational Foundation
Jack Brighton - University of Illinois
Katrina Dixon - Northeast Historic Film
Kara Van Malssen - Broadway Video Digital Media
There are a number of metadata standards being used by the library and archival community. However, few are both adequate and easy to use for describing media collections. PBCore is a metadata standard that was developed specifically to describe media. Many in the moving image archival community have begun to utilize the standard. After a two-year development hiatus, a new initiative has launched to continue development of the standard and bring it to PBCore 2.0. This session will give an overview of PBCore: why it is a good standard to use for media collections, and the work to date to bring it to PBCore 2.0. It will demo and tour the newly redesigned PBCore.org website, highlighting changes, navigation, and the community input features. Finally, there will be several use cases showing practical use of PBCore in real archive projects. The session will end with a roundtable discussion to get more feedback from the AMIA/IASA community and take questions.
(As always, I speak only for myself as a media producer and archivist.)
I attended the panel discussion on PBCore at the recent Open Video Conference, and was struck by something that should have been obvious. Those of us pushing development of PBCore have failed to clarify one basic thing: What is PBCore for? I’ve been to workshops and sessions on PBCore over the past six years and have been on metadata panels at the AMIA Conference, the PBS Tech Conference, NETA, and iMA. We often focused on explaining the PBCore elements and why they are useful for cataloging media assets. But at the OVC, the question was raised: “Why do we need PBCore to catalog our stuff if we already have a good media database?” The question reveals a conflation of two distinct things: having a media database, and being able to easily interoperate with other databases.