PBCore is not dead

Written by Jack Brighton on Wednesday, July 15, 2009

A person might be forgiven for wondering lately whether PBCore is a dead end. As some of us have complained elsewhere, there has been little evidence that the Corporation for Public Broadcasting, which has funded the PBCore project from the beginning, understood that it was at risk. Until recently, that is.

Following conversations at the Open Video Conference, a meeting to discuss the future of PBCore was held on July 6th at CPB headquarters in DC. Rob Bole, who recently joined CPB as vice president of digital media strategy, pulled in folks from the original PBCore development project, plus a few of us beatnik geeks, to hit the reset button and map out a plan. The phrases "change management," "PBCore 2.0," and "resources will be allocated" were clearly spoken.

Basically, CPB is renewing its commitment to PBCore. This means funding to establish a project management entity, which will oversee further development of PBCore and provide training, tools, and community support. I think we can soon expect to see an RFP for the project management entity, although it may be tied in with the American Archive project, which is currently being piloted by opb.org. We shall see.

Meanwhile in the interest of sharing as much as possible, here are the notes from the July 6th PBCore meeting at CPB in PDF form. CPB has invited input from the PBCore community, and now would be a good time to push this forward.



PBCore licensing?

Written by Jack Brighton on Saturday, June 06, 2009

A discussion at Drupal Groups pertains to development of a Metadata module, possibly incorporating PBCore. The thing is, PBCore is currently licensed under a Creative Commons license, not the GPL, which would be required for a module distributed through Drupal. Which raises the question: How easy would it be to place PBCore and its documentation under the GNU GPL? Doing so would be in the interest of the PBCore community, and I would think in the interest of CPB, which funded development of the PBCore standard. Feedback would be appreciated!



How to reference a related web page?

Written by Jack Brighton on Friday, June 05, 2009

Say you have an audio archive of a radio news story that's on a web page. In the PBCore record for the audio archive, how do you reference the web page?

Would it be:

<pbcoreRelation>
   <relationType>Is Part Of</relationType>
   <relationIdentifier>http://will.illinois.edu/news/story-item-foo/</relationIdentifier>
</pbcoreRelation>

Or perhaps its relationType should be Is Referenced By.

I have a reservation about using pbcoreRelation for this, because it assumes the web page is a "media item" capable of being expressed in another PBCore record. (See: http://www.pbcore.org/PBCore/relationType.html)  I doubt PBCore was ever intended to catalog web pages, and the practice of doing that would seem...daunting.

Regardless, the question arises from a practical case: what if you have a media player capable of ingesting PBCore records and playing the media file, and you want to link to the original web page on which the media file was first published? This is not a hypothetical question. If pbcoreRelation isn't the way to do this, it would be important to settle the question very soon.
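
To make the media player case concrete, here is a minimal Python sketch of how a player might pull a related web page out of a record if pbcoreRelation were used this way. It is only an illustration: the function name related_urls is made up, the sample record is trimmed to the one container discussed above, and the sketch ignores XML namespaces, which real records may declare.

```python
import xml.etree.ElementTree as ET

# Trimmed sample record; only the pbcoreRelation container from the post is shown.
RECORD = """<PBCoreDescriptionDocument>
  <pbcoreRelation>
    <relationType>Is Referenced By</relationType>
    <relationIdentifier>http://will.illinois.edu/news/story-item-foo/</relationIdentifier>
  </pbcoreRelation>
</PBCoreDescriptionDocument>"""


def related_urls(xml_text, wanted_types=("Is Part Of", "Is Referenced By")):
    """Return (relationType, URL) pairs for relations that point at web pages.

    Hypothetical helper: assumes un-namespaced element names and treats any
    relationIdentifier starting with http:// or https:// as a web page.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for relation in root.iter("pbcoreRelation"):
        rel_type = (relation.findtext("relationType") or "").strip()
        identifier = (relation.findtext("relationIdentifier") or "").strip()
        if rel_type in wanted_types and identifier.startswith(("http://", "https://")):
            urls.append((rel_type, identifier))
    return urls


links = related_urls(RECORD)
```

A player could then render the first URL in links as a "view original story" link. Whichever relationType is settled on, the player only needs to know which values to look for.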

Thoughts?



Drupal developer updating Media module looking for metadata handling feedback

Written by Kevin Reynen on Thursday, June 04, 2009

The developer working to update media handling in Drupal through the Google SoC program is looking for feedback on metadata handling. This work is really going to influence how media handling and metadata work in Drupal 7. I've posted a response that includes some reference to PBCore, but any other feedback would be appreciated. If you'd like to see better support for PBCore in Drupal, this is going to be the time to help influence how that happens. http://groups.drupal.org/node/22915

Instantiations, Components, and Essence Tracks

Written by DaveRice on Friday, May 15, 2009

PBCore instantiation records work well for documenting renditions of an asset that are composed of a single tape or file, but when an instantiation requires multiple tapes, reels or files, what should the protocol be? How can PBCore be used to efficiently document a rendition of an asset that is composed of multiple objects, each with its own set of technical metadata? Disclaimer: this post is based on my own personal experience in using PBCore 1.2.1 and the conclusions I drew from it.

Within a PBCore asset record every element may be applied multiple times (one asset may have as many titles, contributors, and instantiations as one desires); however from the perspective of the instantiation (which is a single rendition of an asset) much of the descriptive information may only occur once. For instance, an instantiation may only have one formatDigital (i.e. one mime_type), one formatGenerations, one formatFileSize. The PBCore instantiation element appears to be designed to both document a single item or a single file and to document "all the details on how the asset is actualized" (quote from the PBCore 1.2.1 XSD). However, in some cases, in order to document how an instantiation actualizes the asset, multiple files or multiple items are necessary. Here are three situational examples:

- an asset describing a musical album may have an instantiation that is one CD, then the digitized version of that CD comprises 10 digital files each representing a track. The 10 digital items together represent the same asset as the single-item CD,

- an asset describing a film exists in a collection as two instantiations: a three-reel 35mm film print and a single Digibeta (this is similar to the example that Mary Miller describes at http://www.pbcoreresources.org/article/dealing_with_multi_part_instantiations/),

- an asset documenting a television episode contains two instantiations: one being a single Digibeta tape and another as two elementary stream files (an .m2v video stream and a .wav audio stream).

All three of these examples refer to audiovisual material that changes in the number of components needed to represent an asset over the reformatting process. In some types of reformatting the number goes from more to fewer (like example 2, the film transfer) and in some cases from fewer to more (like example 1, the digitization of a CD).

If PBCore instantiations are understood to only represent single-item instantiations then the individual digitized tracks of a CD or the individual reels of a film print would need to be documented in their own asset records, where one asset represents the CD and then 10 other assets represent the individual digitized tracks. This is obviously less efficient than treating the set of digitized tracks as one instantiation and the CD as another instantiation of the same asset. Another option could be to zip or tar the 10 tracks into one file, but this requirement for effective PBCore description has its own disadvantages. Alternatively a directory that contains the 10 file-based tracks could be defined by the instantiation.

Best practices for documenting multi-object instantiations are not clear. With the m2v and wav elementary streams, the two files need to work together to represent the asset, but they have their own unique values for 'formatDigital', 'formatFileSize', 'formatDataRate' and possibly their own 'formatLocation'. All of these values may only occur once per instantiation. For the m2v and wav elementary streams to be defined as a single instantiation some options are:

- the two files could be moved into a directory or folder, which would serve the role of an audiovisual wrapper. In this case the formatDigital would be 'application/x-not-regular-file' (referring to the directory), the formatFileSize could be the directory size, etc.

- or the data from the individual files could be shoehorned into the instantiation fields meant for individual files, thus formatDigital would be "video/mpeg audio/x-wav" and formatFileSize could be the sum of the two file sizes.

- or the m2v and wav files could be either zipped or tarred into a single file or multiplexed into an audiovisual wrapper, so that the collection is then represented by a single file (the analog equivalent would be splicing together film reels in order that the metadata more cleanly fits into an instantiation record).
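
The second option above (shoehorning) can be sketched in a few lines of Python. This is only an illustration of the idea; the function name shoehorn and the byte counts are made up for the example.

```python
def shoehorn(files):
    """Collapse per-file values into single instantiation-level fields.

    files is a list of (mime_type, size_in_bytes) tuples. The result keeps one
    value per field, as a single instantiation requires, at the cost of losing
    the per-file detail.
    """
    return {
        "formatDigital": " ".join(mime for mime, _ in files),
        "formatFileSize": sum(size for _, size in files),
    }


# Hypothetical sizes for the .m2v and .wav elementary streams.
combined = shoehorn([("video/mpeg", 1_200_000_000), ("audio/x-wav", 150_000_000)])
```

Note that once the values are merged there is no reliable way to tell from the record which file contributed which size.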

None of these options is ideal for describing a complex object: the resulting technical documentation becomes less precise, the implementation of instantiation becomes less standardized, or the metadata process burdens collection management. This is the same sort of challenge that occurred in pre-1.2 versions of PBCore, where discrete track-level metadata values had to be concatenated and labeled into single fields like formatDataRate = "Total 1930 kilobits/sec; Video 1700 kilobits/sec; Audio 230 kilobits/sec". This procedure was documented by pbcore.org at http://www.pbcore.org/PBCore/formatDataRate.html, which states that "the pbcoreInstantiation container should not be repeated in order to express a video data rate and an associated audio data rate. The two combined are part of a single instantiation for an asset".
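
To show what the concatenated approach costs, here is a small Python sketch of what a consumer has to do to get the discrete values back out of such a field. The pattern and the function name split_data_rate are my own and only fit strings shaped like the example above; free-text fields in the wild would not be this regular.

```python
import re

# Matches "<label> <number> kilobits/sec" segments, as in the example string.
RATE_PATTERN = re.compile(r"(\w+)\s+(\d+)\s+kilobits/sec")


def split_data_rate(value):
    """Split a concatenated formatDataRate string into {label: kilobits/sec}."""
    return {label: int(rate) for label, rate in RATE_PATTERN.findall(value)}


rates = split_data_rate(
    "Total 1930 kilobits/sec; Video 1700 kilobits/sec; Audio 230 kilobits/sec"
)
```

Every consumer has to agree on the labels, units and separators for this to work, which is the kind of informal convention that separate fields avoid.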

I have two suggestions regarding this potential challenge. The first would be documenting best practices that use PBCore 1.2.1 as is to document these complex objects in a way that fits the various examples above. The second suggestion would involve a modification to PBCore, which would be to integrate an additional element in between instantiation and essenceTrack, perhaps called 'component'. Typically PBCore would document single-component instantiations; however, in cases where a single instantiation is made up of multiple tapes, reels or files, the instantiation would have as many component records as needed, each with its own technical metadata.

In this arrangement some of the values currently attached to instantiation would move to the component level. Whereas PBCore 1.2.1 is

instantiation { {formatIdentifier, formatIdentifierSource } dateCreated, dateIssued, formatPhysical, formatDigital, formatLocation, formatMediaType, formatGenerations, formatFileSize, formatTimeStart, formatDurations, formatColors, formatTracks, formatChannelConfiguration, language, alternativeModes {essenceTrack see below } {dateAvailableStart, dateAvailableEnd } { annotation } }

essenceTrack {essenceTrackType, essenceTrackIdentifier, essenceTrackIdentifierSource, essenceTrackStandard, essenceTrackEncoding, essenceTrackDataRate, essenceTrackTimeStart, essenceTrackDuration, essenceTrackBitDepth, essenceTrackSamplingRate, essenceTrackFrameSize, essenceTrackAspectRatio, essenceTrackFrameRate, essenceTrackLanguage, essenceTrackAnnotation }

the incorporation of a component level of data could look like

instantiation { assemblyMode, formatMediaType, formatGenerations, formatFileSize, formatColors, formatChannelConfiguration, language, alternativeModes, {dateAvailableStart, dateAvailableEnd } { annotation } }

component { {componentIdentifier, componentIdentifierSource } dateCreated, dateIssued, componentPhysical, componentDigital, componentLocation, componentTimeStart, componentDuration, componentTracks, {essenceTrack see below } }

essenceTrack {essenceTrackType, essenceTrackIdentifier, essenceTrackIdentifierSource, essenceTrackStandard, essenceTrackEncoding, essenceTrackDataRate, essenceTrackTimeStart, essenceTrackDuration, essenceTrackBitDepth, essenceTrackSamplingRate, essenceTrackFrameSize, essenceTrackAspectRatio, essenceTrackFrameRate, essenceTrackLanguage, essenceTrackAnnotation }
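
As a rough illustration of the draft above, here is a Python sketch that builds an instantiation with a component level for the m2v and wav example. None of this is part of PBCore 1.2.1: assemblyMode, component and the component* fields are the proposed names, nesting component directly inside the instantiation is my reading of the draft, and the file names are made up.

```python
import xml.etree.ElementTree as ET

# Subset of the proposed component-level fields used in this sketch.
COMPONENT_FIELDS = (
    "componentIdentifier",
    "componentDigital",
    "componentLocation",
    "componentDuration",
)


def build_instantiation(assembly_mode, components):
    """Build a hypothetical instantiation element with one child per component."""
    instantiation = ET.Element("pbcoreInstantiation")
    ET.SubElement(instantiation, "assemblyMode").text = assembly_mode
    for values in components:
        component = ET.SubElement(instantiation, "component")
        for field in COMPONENT_FIELDS:
            if field in values:
                ET.SubElement(component, field).text = values[field]
    return instantiation


inst = build_instantiation(
    "multiplexion",
    [
        {"componentIdentifier": "episode.m2v", "componentDigital": "video/mpeg"},
        {"componentIdentifier": "episode.wav", "componentDigital": "audio/x-wav"},
    ],
)
print(ET.tostring(inst, encoding="unicode"))
```

Each file keeps its own format value, so nothing has to be merged or shoehorned at the instantiation level.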

In this draft I added a field called 'assemblyMode'. Something like assemblyMode would be needed to document how the components are related to each other. In the case of the digitized CD, the components would be assembled through concatenation and played back-to-back, so assemblyMode could equal "concatenation". With the m2v and wav elementary streams the assemblyMode would be "multiplexion", since the components need to be multiplexed for playback. In the case of "concatenation" the total duration of the instantiation would equal the sum of the durations of the components, whereas if the assemblyMode is "multiplexion" then the instantiation's duration is roughly equal to the duration of each component, so the value is relevant to how other pieces of metadata are determined.
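
The effect of assemblyMode on duration can be sketched in Python as follows. The two mode names are the ones suggested above; the function name is made up, and using the longest component for "multiplexion" is my own assumption for what "roughly equal" could mean in practice.

```python
def instantiation_duration(assembly_mode, component_durations):
    """Derive an instantiation's duration (in seconds) from its components."""
    if assembly_mode == "concatenation":
        # Components play back-to-back, so their durations add up.
        return sum(component_durations)
    if assembly_mode == "multiplexion":
        # Components play together; assume the longest one sets the duration.
        return max(component_durations)
    raise ValueError(f"unknown assemblyMode: {assembly_mode!r}")


# Hypothetical durations: three CD tracks, then a video and audio stream pair.
cd_duration = instantiation_duration("concatenation", [180, 200, 240])
episode_duration = instantiation_duration("multiplexion", [1800.0, 1799.5])
```

The same component durations give very different totals depending on the mode, which is why the value would need to be recorded rather than guessed.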

Since the instantiation should contain "all the details on how the asset is actualized" (as stated by the PBCore 1.2.1 XSD), adding an additional element level to accommodate multi-tape or multi-object instantiations would help achieve this goal with cleaner and more descriptive data. I'm interested to hear whether this is an issue other PBCore users are thinking about and whether there are any easier solutions that I'm missing.

David Rice

AudioVisual Preservation Solutions


