Ingest Preparation


Part 1 - Ingest preparation

The first part of the project is to create an ingest program that can handle four classes of data, which we have called Class 1, Class 2, Class 3 and Class 4.

Development work for each class is described below.

Class 01

The Tasks and results for this class are shown below.

Default METS Document

Tasks

  • Create default METS documents
    • Decide where the data is coming from (default, from JHove or from file?)
    • Place pointers in the METS documents that will be replaced after ingest

Result

Get MARC records from Catalogue

Tasks

  • Export records from Virtua into MARC ISO format then convert to MARCXML
    • Find unique searchable property from MARC
    • Run VTLS scripts to retrieve MARC records
    • Run MARC4j to convert MARC record to MARC XML record
    • Split MARCXML collection into individual records as 1 MARC record = 1 METS document

Result

  • MARC data for the John Thomas Collection extracted successfully from GEAC.
  • The MARC documents were created from GEAC so will have to be updated with VTLS identifiers
  • Split MARC record successfully
    • This program is mets_create/src/uk/org/llgc/utils/SplitMarc.java
    • Usage: java SplitMarc <Marc File> <Output Directory>
    • It creates files named BIB_ID.xml e.g. LLGCb13538389.xml in the output directory
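
SplitMarc.java itself is not reproduced on this page. As a rough illustration of the splitting step only, the sketch below uses the JDK's DOM API to cut a MARCXML collection into one XML string per record, keyed by the output file name. The class name SplitMarcSketch and the choice of control field 001 as the source of the BIB_ID are assumptions, and the real program may work quite differently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the project's SplitMarc class.
class SplitMarcSketch {
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Splits a MARCXML collection; returns file name -> XML of one record. */
    static Map<String, String> split(String collectionXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(collectionXml)));
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            Map<String, String> files = new LinkedHashMap<>();
            NodeList records = doc.getElementsByTagNameNS(MARC_NS, "record");
            for (int i = 0; i < records.getLength(); i++) {
                Element record = (Element) records.item(i);
                StringWriter xml = new StringWriter();
                serializer.transform(new DOMSource(record), new StreamResult(xml));
                // Assumption: control field 001 carries the BIB_ID used in the file name.
                files.put(controlField(record, "001") + ".xml", xml.toString());
            }
            return files;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not split MARCXML collection", e);
        }
    }

    /** Returns the text of the first control field with the given tag. */
    static String controlField(Element record, String tag) {
        NodeList fields = record.getElementsByTagNameNS(MARC_NS, "controlfield");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (tag.equals(field.getAttribute("tag"))) {
                return field.getTextContent().trim();
            }
        }
        throw new IllegalStateException("Record has no control field " + tag);
    }
}
```

Each map entry could then be written to the output directory under its key, which would give one file per MARC record and therefore one per METS document.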

Create 'Ingest METS' documents

Tasks

  • Create ingest METS documents
    • Inputs:
      • MARCXML Record
      • Default METS document
    • Output:
      • Ingest ready METS document 1 per object

Results

Created the following packages:

  • uk.org.llgc.checksums - Handles checksumming of objects
  • uk.org.llgc.jhove - Creates the JHove metadata wrapper for the JHove application
  • uk.org.llgc.mets.class01
    • Class01Mets.java - Main class which builds the ingest METS
    • MARCHandler.java - Contains a collection of MARC records
    • MARCRecord.java - Contains a MARC record with convenience methods for retrieving properties from MARC XML
  • uk.org.llgc.props
  • uk.org.llgc.xml
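
The code in uk.org.llgc.checksums is not shown on this page. Purely as an illustration of what checksumming an object involves, the sketch below computes a hex digest with the JDK's java.security.MessageDigest. This is a swapped-in technique chosen to keep the example self-contained, and the class name ChecksumSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the project's checksum handler.
class ChecksumSketch {
    /** Streams the input through the named digest ("MD5", "SHA-1") and returns lowercase hex. */
    static String hexDigest(InputStream in, String algorithm) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read); // large images are never held fully in memory
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Convenience overload for data that is already in memory. */
    static String hexDigest(byte[] data, String algorithm) {
        try {
            return hexDigest(new ByteArrayInputStream(data), algorithm);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Streaming the file in fixed-size chunks matters here because archive TIFF images can be large.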

Ingest METS documents into Fedora

Tasks

  • Ingest METS document into Fedora
    • Add datastream in Fedora for each Image datastream in METS
    • Copy Dublin Core to Dublin Core datastream in object and replace with a pointer
    • Replace certain attributes of the METS document with the handle of the object e.g. the attribute OBJID in the parent METS element.
    • Create a RELS-EXT datastream with the OAI ID for OAI harvesting and relate object to collections
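
To make the last task more concrete, here is a minimal sketch of what building a RELS-EXT datastream might look like. It assumes the usual Fedora conventions of an oai:itemID property and a rel:isMemberOf relationship. The class name RelsExtSketch and the example identifiers in the usage note are hypothetical, and the actual ingest code may build the datastream differently.

```java
// Hypothetical sketch, not the Bridge Project ingest code.
class RelsExtSketch {
    /** Builds RELS-EXT RDF/XML carrying an OAI item ID and one collection membership. */
    static String build(String pid, String oaiItemId, String collectionPid) {
        return "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n"
             + "         xmlns:rel=\"info:fedora/fedora-system:def/relations-external#\"\n"
             + "         xmlns:oai=\"http://www.openarchives.org/OAI/2.0/\">\n"
             + "  <rdf:Description rdf:about=\"info:fedora/" + escape(pid) + "\">\n"
             + "    <oai:itemID>" + escape(oaiItemId) + "</oai:itemID>\n"
             + "    <rel:isMemberOf rdf:resource=\"info:fedora/" + escape(collectionPid) + "\"/>\n"
             + "  </rdf:Description>\n"
             + "</rdf:RDF>\n";
    }

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

For example, RelsExtSketch.build("llgc-id:1", "oai:llgc.org.uk:1", "collection:geoff_charles") would relate a made-up object llgc-id:1 to the Geoff Charles collection.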

Results

The ingest mostly reuses existing code from the Bridge Project for ingesting a METS document into Fedora. Because each METS document has some unique features, the repository bridge ingest mechanism was designed to be flexible enough to handle each METS document differently. The advantage of using this program is that, once it has been set up, it handles all of the ingest processes.

The main directory structure is shown below:

 uk/org/llgc/fedora/ingestMets  -- Main files (Explained Below)
 uk/org/llgc/fedora/junit  -- This package tests the objects in Fedora to see if they have been 
                              added correctly and that the METS is correctly updated
 uk/org/llgc/fedora/metadata  -- Object which handles creating the Dublin Core datastream from the XML contained in the ingest METS
 uk/org/llgc/fedora/mets -- Contains helper classes which help link objects to disseminators

uk/org/llgc/fedora/ingestMets/Class01Handler.java -- Main Class which implements the bridge project Handler Adapter and should be able to handle all Class 1 Objects

uk/org/llgc/fedora/ingestMets/utils/IngestDirClass01.java -- A helper class which allows you to pass in a directory containing METS documents and ingests them all

Create Simple Dissemination Programs

Simple dissemination programs need to be created to allow users to check the data in Fedora. These disseminators should be generic enough to work with any collection and object. The three disseminators I created were:

Show Collection:

This displays a collection's Dublin Core and gives a link to all the objects in the collection. This can be seen here. It could be assigned to any collection that has the showObject disseminator.

Show Object:

This displays an object's Dublin Core and allows someone to look at all the datastreams of an object. This can be seen by clicking on one of the links in the show collection disseminator but an example is here. This disseminator requires each object to have the getFullMets disseminator.

Get Full METS:

When the METS document is stored in the repository, certain datastreams are extracted from it and a pointer is left in their place. This disseminator pulls all the datastreams back together into a METS document on dissemination. Currently it only pulls in the Dublin Core, which is stored in a separate datastream for OAI-PMH harvesting (and because Fedora requires it). In future this disseminator could also pull in the rights metadata.

Issues still needing to be addressed

  • METS Rights needs to be decided
    • A conversion from METS Rights to Fedora rights needs to be created
  • PREMIS dictionary needs to be discussed
    • List of actions used e.g. Object Ingested and Object Checksummed
  • DC Legacy data needs to be converted from DOC format to DC for inclusion in the METS


Problems During Ingest

Overall the ingest went very well and 4319 images were ingested successfully. Unfortunately there were a few problems:

Inaccessible Archive Images

The following 97 files contained a link to an archive image which was inaccessible. This could be because the image has not been digitised or because we do not have the rights to show the object:

     LLGCb13537270 LLGCb13537271 LLGCb13537384 LLGCb13537403 LLGCb13537407 LLGCb13537414
     LLGCb13537415 LLGCb13537416 LLGCb13537417 LLGCb13537418 LLGCb13537419 LLGCb13537420
     LLGCb13537421 LLGCb13537422 LLGCb13537423 LLGCb13537424 LLGCb13537425 LLGCb13537426
     LLGCb13537427 LLGCb13537428 LLGCb13537429 LLGCb13537434 LLGCb13537435 LLGCb13537436
     LLGCb13537437 LLGCb13537479 LLGCb13537490 LLGCb13537688 LLGCb13537902 LLGCb13537930
     LLGCb13538279 LLGCb13538307 LLGCb13538308 LLGCb13538309 LLGCb13539035 LLGCb13539063
     LLGCb13539064 LLGCb13539065 LLGCb13539105 LLGCb13539106 LLGCb13539107 LLGCb13539162
     LLGCb13539371 LLGCb13539372 LLGCb13539564 LLGCb13539565 LLGCb13540039 LLGCb13540118
     LLGCb13540297 LLGCb13540298 LLGCb13540301 LLGCb13540302 LLGCb13540303 LLGCb13540304
     LLGCb13540305 LLGCb13540306 LLGCb13540356 LLGCb13540363 LLGCb13540364 LLGCb13540398
     LLGCb13540461 LLGCb13540462 LLGCb13540463 LLGCb13540480 LLGCb13540481 LLGCb13540483
     LLGCb13540485 LLGCb13540486 LLGCb13540487 LLGCb13540488 LLGCb13540489 LLGCb13540522
     LLGCb13540523 LLGCb13540524 LLGCb13540525 LLGCb13540526 LLGCb13540527 LLGCb13540528
     LLGCb13540743 LLGCb13541380 LLGCb13541381 LLGCb13541498 LLGCb13541499 LLGCb13541709
     LLGCb13541728 LLGCb13541741 LLGCb13541742 LLGCb13543473 LLGCb13543600 LLGCb13543607
     LLGCb13543614 LLGCb13543617 LLGCb13543620 LLGCb13543623 LLGCb13543628 LLGCb13543635
     LLGCb13543657

No 856 link

The following 3 records did not have an 856 link in them:

     LLGCb13546932 LLGCb13546933 LLGCb13547456
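
Records like these could be caught before METS creation. The sketch below is an illustration only and not part of the existing code. It checks a MARCXML record for an 856 data field with a non-empty subfield u, which is where MARC 21 holds the link URI. The class name Field856Sketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of a pre-ingest check.
class Field856Sketch {
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Returns true if the record has an 856 field with a non-empty subfield u. */
    static boolean hasLink(String marcXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            NodeList fields = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(marcXml)))
                    .getElementsByTagNameNS(MARC_NS, "datafield");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                if (!"856".equals(field.getAttribute("tag"))) {
                    continue;
                }
                NodeList subfields = field.getElementsByTagNameNS(MARC_NS, "subfield");
                for (int j = 0; j < subfields.getLength(); j++) {
                    Element subfield = (Element) subfields.item(j);
                    if ("u".equals(subfield.getAttribute("code"))
                            && !subfield.getTextContent().trim().isEmpty()) {
                        return true;
                    }
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse MARCXML record", e);
        }
    }
}
```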

Time

Time to create the METS documents: unknown.

Time to ingest the METS documents: about 8 hours.

Class 02

The Tasks and results for this class are shown below.

Default METS Document

Tasks

  • Create default METS documents
    • Decide where the data is coming from (default, from JHove or from file?)
    • Place pointers in the METS documents that will be replaced after ingest
  • Create Parent METS with a link in the structural map to point to the location of the child METS documents
  • Create Child METS which will be the same as the Class01 objects

Result

Get MARC records from Catalogue

Tasks

  • Export records from Virtua into MARC ISO format then convert to MARCXML
    • Find unique searchable property from MARC
    • Run VTLS scripts to retrieve MARC records
    • Run MARC4j to convert MARC record to MARC XML record
    • Split MARCXML collection into individual records as 1 MARC record = 1 METS document

Result

  • MARC data for the Geoff Charles Collection extracted successfully from GEAC.
  • The MARC documents were created from GEAC so will have to be updated with VTLS identifiers
  • Split MARC record successfully
    • This program is mets_create/src/uk/org/llgc/utils/SplitMarc.java
    • Usage: java SplitMarc <Marc File> <Output Directory>
    • It creates files named BIB_ID.xml e.g. LLGCb13538389.xml in the output directory

Create 'Ingest METS' documents

Tasks

  • Create ingest METS documents
    • Inputs:
      • MARCXML Record
      • Default Parent METS document
      • Default Child METS document (Same as Class01 METS document)
    • Output:
      • Ingest ready METS document 1 Parent per MARC record and 1 Child document per image

Results

The METS creation functionality had to be rewritten. It is now split into components that add information from the MARC record to the default METS (a METSEnhancer) and components that process the datastreams (a METSProcessor). To be a METSEnhancer, a class must implement the uk.org.llgc.mets.util.METSEnhancer interface, which has the following methods:

 public void initalize(final XMLProperties pProps) throws IOException;
 public Document getMets();
 public void setMets(final Document pMets);
 public void process(final MARCRecord pMARCRecord) throws JDOMException, IOException;

The main program calls the methods in the following order: initalize, setMets, process and getMets.
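
The main program is not shown on this page, so the following is only a sketch of that calling order. It uses a simplified stand-in interface in which plain Strings replace the real XMLProperties, Document and MARCRecord types. Chaining several enhancers so that each receives the previous one's output is an assumption. The names EnhancerOrderSketch, Enhancer and Recording are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the calling order, not the project's main program.
class EnhancerOrderSketch {
    /** Simplified stand-in for uk.org.llgc.mets.util.METSEnhancer. */
    interface Enhancer {
        void initalize(String props);
        void setMets(String mets);
        void process(String marcRecord);
        String getMets();
    }

    /** Runs each enhancer in turn: initalize, setMets, process, then getMets. */
    static String run(List<? extends Enhancer> enhancers, String defaultMets,
                      String props, String marcRecord) {
        String mets = defaultMets;
        for (Enhancer enhancer : enhancers) {
            enhancer.initalize(props);    // 1. let the enhancer read its own properties
            enhancer.setMets(mets);       // 2. hand over the current METS document
            enhancer.process(marcRecord); // 3. copy values from the MARC record
            mets = enhancer.getMets();    // 4. collect the enhanced result
        }
        return mets;
    }

    /** Toy enhancer that appends a suffix and records the order of calls. */
    static class Recording implements Enhancer {
        final List<String> calls = new ArrayList<>();
        private final String suffix;
        private String mets;

        Recording(String suffix) { this.suffix = suffix; }

        public void initalize(String props) { calls.add("initalize"); }
        public void setMets(String mets) { calls.add("setMets"); this.mets = mets; }
        public void process(String marcRecord) { calls.add("process"); mets = mets + suffix; }
        public String getMets() { calls.add("getMets"); return mets; }
    }
}
```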

The initalize method is passed an XML tree of the METSEnhancer's individual properties (see the configuration file below). It gives the object a chance to set up its attributes before the process method is called.

Once the METS document has been passed to the METSEnhancer and process has been called, the result should be available from the getMets method.

The process method reads properties from the MARC record and puts them in the METS document. An example of a METSEnhancer is uk.org.llgc.marc.HeaderEnhancer, shown below:

 package uk.org.llgc.marc;
 import uk.org.llgc.mets.util.METSEnhancer;
 import org.jdom.Document;
 import org.jdom.Element;
 import org.jdom.JDOMException;
 import uk.org.llgc.props.XMLProperties;
 import uk.org.llgc.mets.class01.MARCRecord;
 import java.io.IOException;
 public class HeaderEnhancer implements METSEnhancer {
       protected Document _mets = null;
       public HeaderEnhancer() {
       }
       public void initalize(final XMLProperties pProps) throws IOException {
       }
       /**
        * Get mets.
        *
        * @return mets as Document.
        */
       public Document getMets() {
           return _mets;
       }
       /**
        * Set mets.
        *
        * @param pMets the value to set.
        */
       public void setMets(final Document pMets) {
            _mets = pMets;
       }
       public void process(final MARCRecord pMARCRecord) throws JDOMException {
               // Process Header
               Element tRoot = this.getMets().getRootElement();
               tRoot.setAttribute("ID", pMARCRecord.getID());
               tRoot.setAttribute("LABEL", pMARCRecord.getLabel());
       }
 }

Other METS enhancers include:

  • uk.org.llgc.marc.DublinCoreEnhancer - Places a Dublin Core datastream in the METS; values are from the MARC record using the Library of Congress MARC to DC XSL stylesheet conversion
  • uk.org.llgc.marc.MARCPointer - Places the location of the MARC record from Virtua into the METS record
  • uk.org.llgc.marc.MODSEnhancer - Converts some of the MARC attributes to the MODS record
  • uk.org.llgc.marc.PREMISIdentifiers - Adds some PREMIS meta data from the MARC record


METS Processors act in the same way. The methods that need to be implemented are as follows:

 public void initalize(final XMLProperties pProps) throws IOException;
 public Document getMets();
 public void setMets(final Document pMets);
 public void process(final HashMap pDatastreams) throws JDOMException, IOException;

As you can see, the method names are exactly the same as those of the METSEnhancer; the only difference is that a HashMap, pDatastreams, is passed to process instead of the MARC record. This map contains the following key-value pairs (example URLs only):

 Key => Value
 archive => http://tapedrive.llgc.org.uk/gch/00/00/1.tiff
 reference => http://fastimagedrive.llgc.org.uk/gch/00/00/1.jpg
 thumbnail => http://fastimagedrive.llgc.org.uk/gch/00/00/1_t.jpg

The keys correspond to the USE attribute in the METS file sections and the URLs point to the actual image locations. A simple example of a METSProcessor is uk.org.llgc.mets.util.AssignDatastreamURLs, which adds the mime type for the reference and thumbnail images rather than running them through JHove.

 public void process(final HashMap pDatastreams) throws JDOMException, IOException {
   Namespace METS = Namespace.getNamespace("mets", "http://www.loc.gov/METS/");
   Namespace XLINK = Namespace.getNamespace("xlink", "http://www.w3.org/1999/xlink");
   List tMETSandXLINKNS = new ArrayList();
   tMETSandXLINKNS.add(METS);
   tMETSandXLINKNS.add(XLINK);
   if (_mimeType == null) {
     throw new IllegalStateException("You must call initalize(XMLProperties) before calling process");
   }
   // Now set links: each child of fileSec carries a USE attribute that is a key in pDatastreams
   Element tFileSec = XMLUtilities.getXPathEl(this.getMets(), "//METS:fileSec", METS);
   Iterator tFiles = tFileSec.getChildren().iterator();
   while (tFiles.hasNext()) {
     Element tFile = (Element)tFiles.next();
     String tUse = tFile.getAttributeValue("USE");
     // Look up the image URL for this USE value once and reuse it below
     String tURL = (String)pDatastreams.get(tUse);
     /**/_logger.debug("Use is " + tUse);
     Attribute tLink = XMLUtilities.getXPathAttribute(tFile, "./METS:file/METS:FLocat/@xlink:href", tMETSandXLINKNS);
     // TODO Maybe these should be run through JHove to discover the mime-type
     tFile.getChild("file", METS).setAttribute("MIMETYPE", _mimeType.getMimeType(tURL));
     // Point the file location at the actual image URL
     tLink.setValue(tURL);
   }
 }
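
The _mimeType helper used above is not shown on this page. The following is a minimal sketch of how such a lookup might work, assuming the file named in mime-location uses the usual mime.types layout, where each line holds a type followed by its file extensions. The class name MimeTypeSketch and its constructor are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the project's mime type helper.
class MimeTypeSketch {
    private final Map<String, String> byExtension = new HashMap<>();

    /** Parses the text of a mime.types style file: "type ext1 ext2 ..." per line. */
    MimeTypeSketch(String mimeTypesContent) {
        for (String rawLine : mimeTypesContent.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = line.split("\\s+");
            for (int i = 1; i < parts.length; i++) {
                byExtension.put(parts[i].toLowerCase(Locale.ROOT), parts[0]);
            }
        }
    }

    /** Returns the mime type for the URL's file extension, or null if it is unknown. */
    String getMimeType(String url) {
        int dot = url.lastIndexOf('.');
        if (dot < 0 || dot == url.length() - 1) {
            return null;
        }
        return byExtension.get(url.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

Looking the type up by extension is cheap but trusts the file name, which is why the TODO above suggests JHove as a more reliable alternative.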


Which METSEnhancers and METSProcessors are applied to the METS document is controlled by the configuration file:

 <?xml version="1.0" encoding="UTF-8"?>
 <CONFIG xmlns:METS="http://www.loc.gov/METS/" xmlns:premis="http://www.loc.gov/standards/premis">
       <METS_PROCESSORS>
               <JHOVE class="uk.org.llgc.jhove.JhoveHandler">
                       <JHOVE_CONF>/home/gmr/development/mets_create/conf/jhove.conf</JHOVE_CONF>
                       <PREMIS>
                               <EVENT>
                                       <METS:digiprovMD ID="**FILL_ME**">
                                               <METS:mdWrap MDTYPE="PREMIS">
                                                       <METS:xmlData>
                                                               <premis:event>
                                                                       <premis:eventIdentifier>
                                                                               <premis:eventIdentifierType>WlAbNL</premis:eventIdentifierType>
                                                                               <premis:eventIdentifierValue>FORMAT_VALIDATION-001</premis:eventIdentifierValue>
                                                                       </premis:eventIdentifier>
                                                                       <premis:eventType>validation</premis:eventType>
                                                                       <premis:eventDateTime>**TO_FILL**</premis:eventDateTime>
                                                                       <premis:eventOutcomeInformation>
                                                                               <premis:eventOutcome>V-001</premis:eventOutcome>
                                                                       </premis:eventOutcomeInformation>
                                                                       <premis:agentIdentifier>
                                                                               <premis:agentIdentifierType>WlAbNL</premis:agentIdentifierType>
                                                                               <premis:agentIdentifierValue>http://hul.harvard.edu/jhove/</premis:agentIdentifierValue>
                                                                       </premis:agentIdentifier>
                                                               </premis:event>
                                                       </METS:xmlData>
                                               </METS:mdWrap>
                                       </METS:digiprovMD>
                               </EVENT>
                               <AGENT>
                                       <METS:digiprovMD ID="**FILL_ME**">
                                               <METS:mdWrap MDTYPE="PREMIS">
                                                       <METS:xmlData>
                                                               <premis:agent>
                                                                       <premis:agentIdentifier>
                                                                               <premis:agentIdentifierType>WlAbNL</premis:agentIdentifierType>
                                                                               <premis:agentIdentifierValue>http://hul.harvard.edu/jhove/</premis:agentIdentifierValue>
                                                                       </premis:agentIdentifier>
                                                                       <premis:agentName>Jhove version 1.0</premis:agentName>
                                                                       <premis:agentType>software</premis:agentType>
                                                               </premis:agent>
                                                       </METS:xmlData>
                                               </METS:mdWrap>
                                       </METS:digiprovMD>
                               </AGENT>
                       </PREMIS>
               </JHOVE>
               <CHECKSUMS class="uk.org.llgc.checksums.ChecksumHandler" generator="uk.org.llgc.checksums.UnixChecksums">
                       <command type="md5">/usr/bin/md5sum</command>
                       <command type="sha">/usr/bin/gpg --print-md sha1</command>
                       <PREMIS>
                               <EVENT>
                                       <METS:digiprovMD ID="**FILL_ME**">
                                               <METS:mdWrap MDTYPE="PREMIS">
                                                       <METS:xmlData>
                                                               <premis:event>
                                                                       <premis:eventIdentifier>
                                                                               <premis:eventIdentifierType>WlAbNL</premis:eventIdentifierType>
                                                                               <premis:eventIdentifierValue>MESSAGE_DIGEST_CALCULATION-001</premis:eventIdentifierValue>
                                                                       </premis:eventIdentifier>
                                                                       <premis:eventType>message digest calculation</premis:eventType>
                                                                       <premis:eventDateTime>**TO_FILL**</premis:eventDateTime>
                                                                       <premis:eventOutcomeInformation>
                                                                               <premis:eventOutcome>MDC-001</premis:eventOutcome>
                                                                       </premis:eventOutcomeInformation>
                                                                       <premis:agentIdentifier>
                                                                               <premis:agentIdentifierType>WlAbNL</premis:agentIdentifierType>
                                                                               <premis:agentIdentifierValue>UNIX_TOOLS-001</premis:agentIdentifierValue>
                                                                       </premis:agentIdentifier>
                                                               </premis:event>
                                                       </METS:xmlData>
                                               </METS:mdWrap>
                                       </METS:digiprovMD>
                               </EVENT>
                               <AGENT>
                                       <METS:digiprovMD ID="**FILL_ME**">
                                               <METS:mdWrap MDTYPE="PREMIS">
                                                       <METS:xmlData>
                                                               <premis:agent>
                                                                       <premis:agentIdentifier>
                                                                               <premis:agentIdentifierType>WlAbNL</premis:agentIdentifierType>
                                                                               <premis:agentIdentifierValue>UNIX_TOOLS-001</premis:agentIdentifierValue>
                                                                       </premis:agentIdentifier>
                                                                       <premis:agentName>UNIX checksum tools (/usr/bin/md5sum, /usr/bin/gpg --print-md sha1), see dev.llgc.org.uk</premis:agentName>
                                                                       <premis:agentType>software</premis:agentType>
                                                               </premis:agent>
                                                       </METS:xmlData>
                                               </METS:mdWrap>
                                       </METS:digiprovMD>
                               </AGENT>
                       </PREMIS>
               </CHECKSUMS>
               <ASSIGN_DATASTREAM_URLS class="uk.org.llgc.mets.util.AssignDatastreamURLs">
                       <mime-location>/etc/mime.types</mime-location>
               </ASSIGN_DATASTREAM_URLS>
       </METS_PROCESSORS>
       <METS_ENHANCERS>
               <HEADER class="uk.org.llgc.marc.HeaderEnhancer" />
               <DUBLIN_CORE class="uk.org.llgc.marc.DublinCoreEnhancer">
                       <MARC2DC>/home/gmr/development/mets_create/xsl/MARC21slim2OAIDC_NLW.xsl</MARC2DC>
                       <MARC2MODS>/home/gmr/development/mets_create/xsl/MARC21slim2MODS3.xsl</MARC2MODS>
               </DUBLIN_CORE>
               <MODS_ENHANCER class="uk.org.llgc.marc.MODSEnhancer" />
               <MARC_POINTER class="uk.org.llgc.marc.MARCPointer">
                       <URL>http://virtua.llgc.org.uk/access/**NUM**</URL>
               </MARC_POINTER>
               <PREMIS_IDENTIFIERS class="uk.org.llgc.marc.PREMISIdentifiers" />
       </METS_ENHANCERS>
 </CONFIG>

The HeaderEnhancer class shown earlier adds the ID attribute to the root of the METS document and sets it to the ID from the MARC record. The record's label is then placed in the LABEL attribute on the root of the METS document.
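
Returning to the configuration file: each child of METS_PROCESSORS and METS_ENHANCERS names its implementation in a class attribute. The loader itself is not shown on this page, so the sketch below is only one way such a file could be read. It lists the class names in a section in document order. The class name ConfigSketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the project's configuration loader.
class ConfigSketch {
    /** Returns the class attribute of each child of the named section, in order. */
    static List<String> classNames(String configXml, String section) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            List<String> names = new ArrayList<>();
            NodeList sections = doc.getElementsByTagName(section);
            if (sections.getLength() == 0) {
                return names; // section not present in this configuration
            }
            NodeList children = sections.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element && ((Element) child).hasAttribute("class")) {
                    names.add(((Element) child).getAttribute("class"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read configuration", e);
        }
    }
}
```

Each name could then be instantiated with Class.forName(name).getDeclaredConstructor().newInstance() and given its own configuration element through initalize. Adding or removing an enhancer would then be a configuration change rather than a code change.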

Created the following packages:

  • uk.org.llgc.checksums - Handles checksumming of objects
  • uk.org.llgc.jhove - Creates the JHove metadata wrapper for the JHove application
  • uk.org.llgc.mets.class01
    • Class01Mets.java - Main class which builds the ingest METS
    • MARCHandler.java - Contains a collection of MARC records
    • MARCRecord.java - Contains a MARC record with convenience methods for retrieving properties from MARC XML
  • uk.org.llgc.props
  • uk.org.llgc.xml

Ingest METS documents into Fedora

Tasks

  • Ingest METS document into Fedora
    • Add datastream in Fedora for each Image datastream in METS
    • Copy Dublin Core to Dublin Core datastream in object and replace with a pointer
    • Replace certain attributes of the METS document with the handle of the object e.g. the attribute OBJID in the parent METS element.
    • Create a RELS-EXT datastream with the OAI ID for OAI harvesting and relate object to collections

Results

The ingest mostly reuses existing code from the Bridge Project for ingesting a METS document into Fedora. Because each METS document has some unique features, the repository bridge ingest mechanism was designed to be flexible enough to handle each METS document differently. The advantage of using this program is that, once it has been set up, it handles all of the ingest processes.

The main directory structure is shown below:

 uk/org/llgc/fedora/ingestMets  -- Main files (Explained Below)
 uk/org/llgc/fedora/junit  -- This package tests the objects in Fedora to see if they have been 
                              added correctly and that the METS is correctly updated
 uk/org/llgc/fedora/metadata  -- Object which handles creating the Dublin Core datastream from the XML contained in the ingest METS
 uk/org/llgc/fedora/mets -- Contains helper classes which help link objects to disseminators

uk/org/llgc/fedora/ingestMets/Class01Handler.java -- Main Class which implements the bridge project Handler Adapter and should be able to handle all Class 1 Objects

uk/org/llgc/fedora/ingestMets/utils/IngestDirClass01.java -- A helper class which allows you to pass in a directory containing METS documents and ingests them all

Create Simple Dissemination Programs

Simple dissemination programs need to be created to allow users to check the data in Fedora. These disseminators should be generic enough to work with any collection and object. The three disseminators I created were:

Show Collection:

This displays a collection's Dublin Core and gives a link to all the objects in the collection. This can be seen here. It could be assigned to any collection that has the showObject disseminator.

Show Object:

This displays an object's Dublin Core and allows someone to look at all the datastreams of an object. This can be seen by clicking on one of the links in the show collection disseminator but an example is here. This disseminator requires each object to have the getFullMets disseminator.

Get Full METS:

When the METS document is stored in the repository, certain datastreams are extracted from it and a pointer is left in their place. This disseminator pulls all the datastreams back together into a METS document on dissemination. Currently it only pulls in the Dublin Core, which is stored in a separate datastream for OAI-PMH harvesting (and because Fedora requires it). In future this disseminator could also pull in the rights metadata.

Issues still needing to be addressed

  • METS Rights needs to be decided
    • A conversion from METS Rights to Fedora rights needs to be created
  • PREMIS dictionary needs to be discussed
    • List of actions used e.g. Object Ingested and Object Checksummed
  • DC Legacy data needs to be converted from DOC format to DC for inclusion in the METS


Problems During Ingest

Overall the ingest went very well and 4319 images were ingested successfully. Unfortunately there were a few problems:

Inaccessible Archive Images

The following 97 files contained a link to an archive image which was inaccessible. This could be because the image has not been digitised or because we do not have the rights to show the object:

     LLGCb13537270 LLGCb13537271 LLGCb13537384 LLGCb13537403 LLGCb13537407 LLGCb13537414
     LLGCb13537415 LLGCb13537416 LLGCb13537417 LLGCb13537418 LLGCb13537419 LLGCb13537420
     LLGCb13537421 LLGCb13537422 LLGCb13537423 LLGCb13537424 LLGCb13537425 LLGCb13537426
     LLGCb13537427 LLGCb13537428 LLGCb13537429 LLGCb13537434 LLGCb13537435 LLGCb13537436
     LLGCb13537437 LLGCb13537479 LLGCb13537490 LLGCb13537688 LLGCb13537902 LLGCb13537930
     LLGCb13538279 LLGCb13538307 LLGCb13538308 LLGCb13538309 LLGCb13539035 LLGCb13539063
     LLGCb13539064 LLGCb13539065 LLGCb13539105 LLGCb13539106 LLGCb13539107 LLGCb13539162
     LLGCb13539371 LLGCb13539372 LLGCb13539564 LLGCb13539565 LLGCb13540039 LLGCb13540118
     LLGCb13540297 LLGCb13540298 LLGCb13540301 LLGCb13540302 LLGCb13540303 LLGCb13540304
     LLGCb13540305 LLGCb13540306 LLGCb13540356 LLGCb13540363 LLGCb13540364 LLGCb13540398
     LLGCb13540461 LLGCb13540462 LLGCb13540463 LLGCb13540480 LLGCb13540481 LLGCb13540483
     LLGCb13540485 LLGCb13540486 LLGCb13540487 LLGCb13540488 LLGCb13540489 LLGCb13540522
     LLGCb13540523 LLGCb13540524 LLGCb13540525 LLGCb13540526 LLGCb13540527 LLGCb13540528
     LLGCb13540743 LLGCb13541380 LLGCb13541381 LLGCb13541498 LLGCb13541499 LLGCb13541709
     LLGCb13541728 LLGCb13541741 LLGCb13541742 LLGCb13543473 LLGCb13543600 LLGCb13543607
     LLGCb13543614 LLGCb13543617 LLGCb13543620 LLGCb13543623 LLGCb13543628 LLGCb13543635
     LLGCb13543657

No 856 link

The following 3 records did not have an 856 link in them:

     LLGCb13546932 LLGCb13546933 LLGCb13547456

Time

Time to create the METS documents: unknown.

Time to ingest the METS documents: about 8 hours.

Class 03

We are considering making child objects part of a collection. For example, in the Geoff Charles collection only the parent objects are members of collection:geoff_charles; perhaps all the images should be as well. This would make it easier to search child objects using fedoragsearch.

The following XSL is present in the fedoragsearch file demoFoxmlToLucene.xslt:

 <xsl:if test="substring(/foxml:digitalObject/foxml:objectProperties/foxml:extproperty[@NAME='http://dev.llgc.org.uk/digitisation/identifiers/nlw_id']/@VALUE, 1, 3)='gch'">
   <xsl:if test="/foxml:digitalObject/foxml:objectProperties/foxml:property[@NAME='info:fedora/fedora-system:def/model#contentModel' and @VALUE='METS-VITAL01']">
     <xsl:apply-templates mode="activeDemoFedoraObject"/>
   </xsl:if>
 </xsl:if>

It would be better if we could check the collection in the RELS-EXT datastream rather than having to check whether the nlw_id starts with gch.

This would mean changing the code which retrieves collections from:

 select $member $title from <#ri> where 
 $member <fedora-rels-ext:isMemberOf> <info:fedora/collection:digital_books> 
 and $member <dc:title> $title

to:

 select $member $title from <#ri> where 
 $member <fedora-rels-ext:isMemberOf> <info:fedora/collection:digital_books> 
 and $member <fedora-model:contentModel> 'METS-VITAL03-Parent'
 and $member <dc:title> $title
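
The code that retrieves collections is not shown here. As an illustration only, a query builder that covers both forms above might look like the sketch below, with the content model filter added only when one is supplied. The class and method names are hypothetical.

```java
// Hypothetical sketch of a query builder for the two forms shown above.
class CollectionQuerySketch {
    /**
     * Builds the member query for a collection. If contentModel is null the
     * original query is produced; otherwise members are also filtered by
     * content model, so children in the collection are not listed.
     */
    static String membersQuery(String collectionPid, String contentModel) {
        StringBuilder query = new StringBuilder();
        query.append("select $member $title from <#ri> where\n");
        query.append("$member <fedora-rels-ext:isMemberOf> <info:fedora/")
             .append(collectionPid).append(">\n");
        if (contentModel != null) {
            query.append("and $member <fedora-model:contentModel> '")
                 .append(contentModel).append("'\n");
        }
        query.append("and $member <dc:title> $title");
        return query.toString();
    }
}
```

Calling membersQuery("collection:digital_books", "METS-VITAL03-Parent") gives the second query above, so only parent objects are returned even if the child objects also become members of the collection.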

Class 04