Handles

From Devwiki

For the VITAL system we use the Handle System (www.handle.net). We decided early on that the object as a whole and each of its datastreams should have their own unique handle. During development I wrestled with the problem of how to link a PID to a handle ID while still allowing individual datastreams to have handles.

My initial thought was to use the number portion of a PID for the local handle ID. For example llgc-id:100 would give the handle:

http://hdl.handle.net/10107/100

Here 10107 is our unique institution ID. The problem arises when you try to give datastreams a unique number that won't conflict with the PID numbers. I discovered that the local handle ID doesn't have to be numeric, so I used a - symbol as a divider, giving the following format:

http://hdl.handle.net/INSTITUTION_ID/PID-DATASTREAMID

e.g. the Dublin Core datastream for llgc-id:100 would be:

http://hdl.handle.net/10107/100-1
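
The naming scheme above can be sketched in a few lines of Java. This is only an illustration of the convention, not the VITAL implementation: the class and method names (HandleUrlSketch, localId, forObject, forDatastream) are hypothetical, and it assumes every PID has the form namespace:number.

```java
// Hypothetical helper illustrating the PID-DATASTREAMID convention;
// it is not part of the uk.org.llgc.handles library.
final class HandleUrlSketch {
    // Resolver address and institution ID as given in the text.
    static final String RESOLVER = "http://hdl.handle.net/";
    static final String INSTITUTION_ID = "10107";

    /** Returns the number portion of a PID such as "llgc-id:100". */
    static String localId(String pid) {
        int colon = pid.indexOf(':');
        if (colon < 0 || colon == pid.length() - 1) {
            throw new IllegalArgumentException("Expected namespace:number, got " + pid);
        }
        return pid.substring(colon + 1);
    }

    /** Handle URL for the object as a whole (no datastream suffix). */
    static String forObject(String pid) {
        return RESOLVER + INSTITUTION_ID + "/" + localId(pid);
    }

    /** Handle URL for a single datastream of the object. */
    static String forDatastream(String pid, int datastreamId) {
        if (datastreamId < 0) {
            throw new IllegalArgumentException("Datastream ID must not be negative");
        }
        return forObject(pid) + "-" + datastreamId;
    }

    public static void main(String[] args) {
        System.out.println(forObject("llgc-id:100"));
        System.out.println(forDatastream("llgc-id:100", 1));
    }
}
```

Because the object handle is a prefix of every datastream handle, the owning object can always be recovered by cutting the local ID at the first - symbol.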

I decided to use the following conventions for traceability:

Datastream IDs 0-9 are used for metadata:

* Not Present - Handle to the object
* 0 - Handle to the METS document
* 1 - Handle to the Dublin Core datastream
* 2 - Handle to the RELS-EXT datastream (RDF)
* 3 - XACML Policy for an object (Fedora) POLICY datastream only (ISSUE and PAGE policy do not require handles)
* 4 - METS rights for an object
* 5 - UKETD metadata for an object
* 6 - PREMISRights

IDs 10 onwards are used for data, e.g. archive-level files and derivatives:

* 10 - Archive level file
* 11 - Main Reference file (50PNG, JPEG2000, JPEG)
* 12 - zoom
* 13 - thumbnail
* 14 - text
* 15 - coordinates (ALTO)
* 16 - Handle to Benchmarking document
* 17 - AbbyFineReader_XML_output
* 18 - datastream_100_PNG
* 19 - TEI
* 20 - Article - followed by the number of the article, e.g. 20-1 (http://hdl.handle.net/10107/12345-20-1)
* 21 - 650pxPNG
* 22 - textMD

The metadata datastreams above are present in all objects and could be classed as administration datastreams. To keep the image datastreams unique, I decided to start their IDs at 10. The order of the images is decided by their order in the ingest METS document.
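
To show how these conventions make a handle traceable, here is a rough Java sketch that turns a local handle ID back into a description. The class DatastreamConventionSketch and its methods are hypothetical; the numbers and labels are copied from the lists above, and the rule that only datastream 20 may carry a trailing article number is my reading of the Article entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the VITAL code base.
final class DatastreamConventionSketch {
    // Datastream numbers and labels copied from the lists above.
    static final Map<Integer, String> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put(0, "METS document");
        LABELS.put(1, "Dublin Core");
        LABELS.put(2, "RELS-EXT (RDF)");
        LABELS.put(3, "XACML POLICY");
        LABELS.put(4, "METS rights");
        LABELS.put(5, "UKETD metadata");
        LABELS.put(6, "PREMISRights");
        LABELS.put(10, "Archive level file");
        LABELS.put(11, "Main reference file");
        LABELS.put(12, "zoom");
        LABELS.put(13, "thumbnail");
        LABELS.put(14, "text");
        LABELS.put(15, "coordinates (ALTO)");
        LABELS.put(16, "Benchmarking document");
        LABELS.put(17, "AbbyFineReader_XML_output");
        LABELS.put(18, "datastream_100_PNG");
        LABELS.put(19, "TEI");
        LABELS.put(20, "Article");
        LABELS.put(21, "650pxPNG");
        LABELS.put(22, "textMD");
    }

    /** IDs 0-9 are reserved for metadata; 10 onwards are data. */
    static boolean isMetadata(int datastreamId) {
        return datastreamId >= 0 && datastreamId <= 9;
    }

    /** Describes a local handle ID such as "100", "100-1" or "12345-20-1". */
    static String describe(String localId) {
        String[] parts = localId.split("-");
        if (parts.length == 1) {
            return "object " + parts[0]; // no suffix: handle to the object itself
        }
        int ds = Integer.parseInt(parts[1]);
        // Assumption: only the Article datastream (20) takes a third part.
        boolean hasArticle = parts.length == 3 && ds == 20;
        if (parts.length > 2 && !hasArticle) {
            throw new IllegalArgumentException("Unexpected local handle ID: " + localId);
        }
        String kind = isMetadata(ds) ? "metadata" : "data";
        String label = LABELS.getOrDefault(ds, "unassigned");
        String text = kind + " datastream " + ds + " (" + label + ") of object " + parts[0];
        return hasArticle ? text + ", article " + parts[2] : text;
    }

    public static void main(String[] args) {
        System.out.println(describe("100-1"));
        System.out.println(describe("12345-20-1"));
    }
}
```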

Some example handles for object llgc-id:21 (these only work internally)


Handle Record

There is more than just the URL stored in the handle record. When a handle points to an object, the handle record looks as follows:

index=100 type=HS_ADMIN 
index=1 type=URL rwr- "http://dams.llgc.org.uk:8080/fedora/get/llgc-id:21"
index=2 type=FEDORA-PID rwr- "llgc-id:21"

This should allow easy handle resolution. For datastreams the record looks as follows:

index=100 type=HS_ADMIN 
index=1 type=URL rwr- "http://dams.llgc.org.uk:8080/fedora/get/llgc-id:21/METS"
index=2 type=FEDORA-PID rwr- "llgc-id:21"
index=3 type=FEDORA-DS-ID rwr- "METS"
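
As a sketch of how the two record layouts relate, the following Java builds the URL, FEDORA-PID and FEDORA-DS-ID values for an object or a datastream and prints them in the notation used above. It deliberately does not use the handle.net client library: HandleRecordSketch, valuesFor and line are hypothetical names, and the HS_ADMIN value is left out because its contents are not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the record layout; plain strings only,
// no calls into the handle.net client library.
final class HandleRecordSketch {
    // Fedora "get" base URL as it appears in the records above.
    static final String FEDORA_BASE = "http://dams.llgc.org.uk:8080/fedora/get/";

    /**
     * Builds the values for a handle record. Pass null as datastreamName
     * for a handle that points to the object as a whole.
     */
    static List<String> valuesFor(String pid, String datastreamName) {
        List<String> values = new ArrayList<>();
        String url = FEDORA_BASE + pid
                + (datastreamName == null ? "" : "/" + datastreamName);
        values.add(line(1, "URL", url));
        values.add(line(2, "FEDORA-PID", pid));
        if (datastreamName != null) {
            // Only datastream handles carry the third value.
            values.add(line(3, "FEDORA-DS-ID", datastreamName));
        }
        return values;
    }

    /** Formats one value in the notation used on this page. */
    static String line(int index, String type, String data) {
        return "index=" + index + " type=" + type + " rwr- \"" + data + "\"";
    }

    public static void main(String[] args) {
        for (String value : valuesFor("llgc-id:21", "METS")) {
            System.out.println(value);
        }
    }
}
```

Storing the PID and datastream ID as separate values means a client can read them directly instead of parsing them out of the URL.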

Implementation

Code will be available soon. I have created a library which uses the handle.net client libraries available at http://www.handle.net/client_download.html. The library is located in the package uk.org.llgc.handles. More details to follow.