Design of the Relationships Project

From Devwiki
Revision as of 12:39, 8 June 2007 by Pab (talk | contribs) (Design moved to Design of the Relationships Project: Name Unclear)

Outline Design

The system will be split into three separate modules:

  • Relationship Builder
  • Query Builder
  • Graph

Each module has its own area of responsibility, and all three work from the resource index and the RELS-EXT datastream. This means the system should work with a repository that is already populated with digital objects. The Relationship Builder should make creating relationships simpler and less error-prone. The Query Builder will make retrieving relationship metadata more intuitive and, because it generates ITQL queries automatically, should also act as a training tool for more complex relationship retrieval. The Graph module should improve the visualization of a repository and make error tracking and general use easier.

The project will be delivered over the web using HTML and the HTTP protocol. The alternatives are a stand-alone client or an applet. The disadvantage of a stand-alone client is that each installation has to be upgraded manually, which can leave clients on out-of-date versions. An applet requires a plug-in to be installed on the user's side and, although it only needs to be upgraded in one place, it is limited in what it can do for security reasons.

The project will use the Struts web framework, which is produced by the Apache Software Foundation. Documentation on Struts is located at:

 http://struts.apache.org/ 

The Fedora team are going to produce their own workflow system, called Fire, using this technology, so it is expected that this project will be more acceptable to the community if it follows their lead. It should be noted that Elated also uses the Struts framework and has many users in both the public and private sectors. The advantage of Struts is its abstraction of control, which differs from the conventional servlet approach: control is moved into Action classes, which are driven by XML configuration files. In the long run this makes maintaining the web application simpler, although it generally takes longer to create in the first place. Struts uses the Java Internationalization API to support different languages and to allow switching between them. With property files it is possible to quickly translate all text on a page into multiple languages. Unlike JavaServer Faces, Struts does not rely on JavaScript, so it is available to a wider set of web browsers. The other alternative is Cocoon, which is powered by XML. It also has a relatively wide following in the Fedora community, with many front ends being developed using it, but it was decided to follow the Fedora core team and choose Struts.
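
The property-file approach to translation can be illustrated with the standard Java resource bundle classes that Struts builds on. This is only a minimal sketch: the class name, the message keys and the two inline "files" are hypothetical, and a real application would load the property files from the classpath rather than from strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;

final class MessageCatalogSketch {
    // Stand-ins for two property files (hypothetical keys and text).
    static final String ENGLISH = "search.title=Find Object\nsearch.button=Search\n";
    static final String FRENCH = "search.title=Trouver un objet\nsearch.button=Rechercher\n";

    // Looks up one message key in the given property-file content.
    static String lookup(String properties, String key) {
        try {
            return new PropertyResourceBundle(new StringReader(properties)).getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The same key resolves to different text depending on the bundle used.
        System.out.println(lookup(ENGLISH, "search.button"));
        System.out.println(lookup(FRENCH, "search.button"));
    }
}
```

The page itself only ever refers to the key, so adding a language means adding one more property file rather than editing any pages.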

When creating the web application it is important to create a consistent and intuitive user interface. To this end, standards will be adhered to. The look and feel will resemble other web applications, and through the use of CSS style sheets it should be possible to keep the application consistent while allowing different institutions to customize their portal.

Where possible the application will interact only with the published Fedora web services, as these will be the most stable interface. When it is necessary to bypass them and go straight to the underlying database, the code will be kept as abstracted as possible so that future updates to it can be made without affecting the rest of the system.


Relationship Builder

The Relationship Builder has a number of workflows, with pages that may well be shared with the Query Builder and Graph View. The workflows are as follows:

Workflow 1

  1. Find Parent Object
  2. Find Child Object
  3. Select Link to create
  4. Store
  5. Add another child?

This workflow requires a find object page which should allow:

Search by PID, using the Fedora API-A interface and methods:

 fedora-types:ObjectProfile getObjectProfile( 
                   xsd:string pid, 
                   xsd:string asOfDateTime )
 fedora-types:MIMETypedStream getDatastreamDissemination( 
                   xsd:string pid, 
                   xsd:string dsID, 
                   xsd:string asOfDateTime )

Search by Dublin Core, using the Fedora API-A interface and method:

 fedora-types:FieldSearchResult findObjects( 
                   fedora-types:ArrayOfString resultFields, 
                   xsd:nonNegativeInteger maxResults, 
                   fedora-types:FieldSearchQuery query )

Search by Property (see Query Builder). This will use the resource index and run the following query:


Where ‘##IDENTIFIER##’ is the name of the property and ‘##VALUE##’ is the value of the property that is being searched on.
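
The query text itself is not reproduced on this page, so the following is only a sketch of how the two placeholders might be filled in. The template string, the class name and the escaping rule are assumptions for illustration, not the project's actual query.

```java
final class PropertyQuerySketch {
    // Hypothetical template; the real query is not shown in this document.
    static final String TEMPLATE =
        "select $object from <#ri> where $object <##IDENTIFIER##> '##VALUE##'";

    // Substitutes the property name and value into the template.
    static String build(String identifier, String value) {
        // Assumed escaping: backslash-escape backslashes and single quotes
        // so the value cannot break out of the quoted literal.
        String escaped = value.replace("\\", "\\\\").replace("'", "\\'");
        return TEMPLATE.replace("##IDENTIFIER##", identifier)
                       .replace("##VALUE##", escaped);
    }

    public static void main(String[] args) {
        System.out.println(build("http://purl.org/dc/elements/1.1/title", "Maps"));
    }
}
```

Keeping the template in one place means the generated query can also be shown to the user unchanged, which fits the training role described for the Query Builder.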

Any link type from the Fedora types will be allowed. The relationship will be created as RDF/XML and stored in the RELS-EXT datastream, which means the relationships should continue to work after Fedora upgrades and in other repositories.
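
As a rough illustration of the RDF/XML that would end up in RELS-EXT, the sketch below builds a single relationship with the standard Java XML APIs. The class and method names are hypothetical, the PIDs are made-up examples, and the namespace used for the predicate is assumed to be the Fedora relations-external ontology.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class RelsExtSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    // Assumed namespace of the Fedora relationship ontology.
    static final String REL_NS = "info:fedora/fedora-system:def/relations-external#";

    // Builds RDF/XML stating: <subjectPid> <predicate> <objectPid>.
    static String build(String subjectPid, String predicate, String objectPid) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element rdf = doc.createElementNS(RDF_NS, "rdf:RDF");
            // Declare both prefixes explicitly on the root element.
            rdf.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:rdf", RDF_NS);
            rdf.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:rel", REL_NS);
            doc.appendChild(rdf);

            Element description = doc.createElementNS(RDF_NS, "rdf:Description");
            description.setAttributeNS(RDF_NS, "rdf:about", "info:fedora/" + subjectPid);
            rdf.appendChild(description);

            Element relation = doc.createElementNS(REL_NS, "rel:" + predicate);
            relation.setAttributeNS(RDF_NS, "rdf:resource", "info:fedora/" + objectPid);
            description.appendChild(relation);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build RELS-EXT content", e);
        }
    }

    public static void main(String[] args) {
        // Example PIDs only: demo:2 is declared a member of demo:1.
        System.out.println(build("demo:2", "isMemberOf", "demo:1"));
    }
}
```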

Workflow 2

  1. Find all Objects of a Certain MIME Type
  2. Add in details of a collection object
  3. Create collection

This will use the query features explained below and should allow users to create a collection of images, audio or video. A collection parent object will be created to hold metadata about the collection.

Workflow 3

  1. Find all objects with certain Dublin Core attributes
  2. Select which objects you would like in the collection
  3. Add details of collection object (i.e. Dublin Core)
  4. Create Collection

This will use the Dublin Core search described under Workflow 1 but will allow the user to select which objects they would like to include in a collection. The relationships will be created reciprocally, so that when you view the child you can still find which collection it is a member of.
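
The reciprocal creation of relationships can be sketched as follows. Each selected member yields two statements, one to be stored with the child and one with the collection. The class name is hypothetical, statements are shown as plain strings for brevity, and the predicate pair isMemberOfCollection / hasCollectionMember is an assumed choice from the Fedora relationship ontology.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReciprocalLinksSketch {
    // Returns "subject predicate object" statements in both directions.
    static List<String> collectionStatements(String collectionPid, List<String> memberPids) {
        List<String> statements = new ArrayList<>();
        for (String member : memberPids) {
            // Stored with the child, so the child can find its collection.
            statements.add(member + " isMemberOfCollection " + collectionPid);
            // Stored with the collection, so it can list its members.
            statements.add(collectionPid + " hasCollectionMember " + member);
        }
        return statements;
    }

    public static void main(String[] args) {
        // Example PIDs only.
        for (String s : collectionStatements("demo:col", Arrays.asList("demo:1", "demo:2"))) {
            System.out.println(s);
        }
    }
}
```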

Query Builder

Workflows 4 and 5 should aid the understanding of ITQL: the user enters parameters, which interactively build the ITQL query, and can then see the results.

Workflow 4

  1. Select Parent Object
  2. Select Relationship Type
  3. Display Results
  4. Display generated ITQL query
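
A minimal sketch of how the query for this workflow might be generated from the two selections. The class name, the query shape and the direction of the relationship (children pointing at the parent) are all assumptions, and the predicate namespace is assumed to be the Fedora relations-external ontology.

```java
final class RelatedObjectsQuerySketch {
    // Assumed namespace of the Fedora relationship ontology.
    static final String REL_NS = "info:fedora/fedora-system:def/relations-external#";

    // Builds a query for every object that points at the parent
    // with the chosen relationship type.
    static String build(String parentPid, String relationship) {
        return "select $child from <#ri> where $child <" + REL_NS + relationship
             + "> <info:fedora/" + parentPid + ">";
    }

    public static void main(String[] args) {
        // Example PID only; the same string would be run and shown to the user.
        System.out.println(build("demo:1", "isMemberOf"));
    }
}
```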

Workflow 5

  1. Enter parameter’s name
  2. Enter parameter’s value
  3. Display Results
  4. Display generated ITQL query

Workflows 6 and 7 will need to be clearly separated from the rest of the relationship system, as they directly access the database that drives Fedora. This is the section that may need to be upgraded with new releases of Fedora. Within the Fedora database there is a list of MIME types and their associated PIDs. There is also a list of which objects use which disseminator. Using SQL queries it should be possible to retrieve this data.

Workflow 6

  1. Enter MIME Type
  2. Search
  3. Return and display Results

There is no equivalent ITQL query for this.

Workflow 7

  1. Enter PID of bDef
  2. Search
  3. Return and display Results

There is no equivalent ITQL query for this.
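
One way to keep Workflows 6 and 7 separate from the rest of the system is to hide the database access behind a small interface. The sketch below is an assumption about how that boundary could look: the interface, class and method names are hypothetical, and an in-memory implementation stands in for the SQL-backed one so that no Fedora table names need to be guessed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical boundary: the web layer depends only on this interface.
interface ObjectLookup {
    List<String> findByMimeType(String mimeType);
}

// In-memory stand-in; a database-backed version would run SQL here instead.
final class InMemoryObjectLookup implements ObjectLookup {
    private final Map<String, List<String>> pidsByMimeType = new LinkedHashMap<>();

    void register(String pid, String mimeType) {
        pidsByMimeType.computeIfAbsent(mimeType, key -> new ArrayList<>()).add(pid);
    }

    @Override
    public List<String> findByMimeType(String mimeType) {
        // Return a copy so callers cannot modify the index.
        return new ArrayList<>(pidsByMimeType.getOrDefault(mimeType, new ArrayList<>()));
    }
}

final class ObjectLookupDemo {
    public static void main(String[] args) {
        InMemoryObjectLookup lookup = new InMemoryObjectLookup();
        lookup.register("demo:1", "image/jpeg"); // example data only
        lookup.register("demo:2", "image/jpeg");
        System.out.println(lookup.findByMimeType("image/jpeg"));
    }
}
```

If a new Fedora release changes the database layout, only the implementation behind the interface would need to be rewritten.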

Graph View

Scalable Vector Graphics (SVG) will be used to draw the graphs, as high accuracy is required. The SVG can then be converted to either JPEG or PNG as needed using the Apache Batik toolkit:

http://xml.apache.org/batik

Batik is an SVG library written in Java and should be easy to integrate into the Struts framework. The following workflows should be available whenever someone views an object.
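
To give an idea of the kind of SVG the Graph View would produce, the sketch below draws a simple chain of objects as circles joined by lines. It uses plain string building rather than Batik, and the class name, layout and sizes are arbitrary assumptions; converting the result to JPEG or PNG would be a separate step.

```java
import java.util.Arrays;
import java.util.List;

final class SvgChainSketch {
    // Renders the labels left to right, each linked to the previous one.
    static String render(List<String> labels) {
        StringBuilder svg = new StringBuilder();
        svg.append("<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"")
           .append(120 * labels.size()).append("\" height=\"80\">\n");
        for (int i = 0; i < labels.size(); i++) {
            int cx = 60 + 120 * i; // centre of this node
            if (i > 0) {
                // Edge from the right side of the previous node to this one.
                svg.append("  <line x1=\"").append(cx - 90).append("\" y1=\"40\" x2=\"")
                   .append(cx - 30).append("\" y2=\"40\" stroke=\"black\"/>\n");
            }
            svg.append("  <circle cx=\"").append(cx)
               .append("\" cy=\"40\" r=\"30\" fill=\"none\" stroke=\"black\"/>\n");
            svg.append("  <text x=\"").append(cx).append("\" y=\"44\" text-anchor=\"middle\">")
               .append(escape(labels.get(i))).append("</text>\n");
        }
        return svg.append("</svg>\n").toString();
    }

    // Escapes the characters that are not allowed in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList("demo:1", "demo:2"))); // example PIDs
    }
}
```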

Workflow 8

  1. Select Object
  2. Decide how many levels
  3. Display Object

Once the object is selected, it will be necessary to go recursively through the repository to find which objects are related. When the relationships are very complicated this could take some time, so the user needs to be able to specify how far down the tree to go. As circular relationships are possible, care must be taken not to enter an infinite loop.
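
The depth limit and the guard against circular relationships can be sketched together as a breadth-first walk that records each object the first time it is seen. This is an illustration under assumptions: the class name is hypothetical, and the relationships are passed in as an in-memory map rather than fetched from the repository.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RelationshipWalkerSketch {
    // Returns every PID reachable from start within maxDepth links,
    // mapped to the depth at which it was first found.
    static Map<String, Integer> walk(Map<String, List<String>> links, String start, int maxDepth) {
        Map<String, Integer> depthByPid = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depthByPid.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String pid = queue.remove();
            int depth = depthByPid.get(pid);
            if (depth == maxDepth) {
                continue; // user-chosen limit reached; do not go further down
            }
            for (String related : links.getOrDefault(pid, Collections.<String>emptyList())) {
                // An object already recorded is never queued again,
                // so a circular relationship cannot cause an infinite loop.
                if (!depthByPid.containsKey(related)) {
                    depthByPid.put(related, depth + 1);
                    queue.add(related);
                }
            }
        }
        return depthByPid;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("demo:1", Arrays.asList("demo:2"));
        links.put("demo:2", Arrays.asList("demo:3"));
        links.put("demo:3", Arrays.asList("demo:1", "demo:4")); // circular link back to demo:1
        System.out.println(walk(links, "demo:1", 2));
    }
}
```

With a limit of 2 the walk in `main` stops at `demo:3` and never reaches `demo:4`; with a larger limit it visits all four objects once and then terminates despite the cycle.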

Workflow 9

  1. Select Repository Target
  2. Display graph of entire repository

This will show the graph of the entire repository. Depending on the size of the repository the graph may take quite a bit of time to generate, so this should only be used for gathering statistics.