


Harvester and metadata schemas

Open Harvester Systems support questions and answers, bug reports, and development issues.


Post by fredriley » Fri Sep 14, 2007 11:33 am

Hi

It's my first time here, so please be gentle with me :o)

I've just downloaded Harvester today to my MacBook to test it out, prior to deploying it on a couple of websites I run. I was very impressed by the ease of installation - no editing of config files, no chmodding, no database setup, truly wash 'n' go. It seems to work fine accessing the OAI service (http://www.rlo-cetl.ac.uk:8080/test/IntraLibrary-OAI) of the test repository of the organisation I do some work for (http://www.rlo-cetl.ac.uk), and it's harvested the records fine. However, I know that the objects in that repository are catalogued with the UK LOM Core, because I've added some myself and because that's the default schema for the repository software, Intralibrary (http://www.intralibrary.com). Yet in the Manage Archive form the only schema available is Dublin Core. Searching for "metadata format" on this forum dug up a thread which appeared to say that other schemas need plugins installed, but that if the repository supports schema X then Harvester will go ahead and download it. Is this the case, or do I need to find the appropriate schema plugin and install it manually? Does Harvester rely on the repository to tell it which schema is being used?

One suggestion for the user interface - as it can take ages to harvest a large collection, it might be worthwhile having a 'progress bar' of sorts, or just a simple 'Fetching archive data - please wait' notice, appear. On OS X with Firefox, the only indication I got that something was happening was the standard browser 'loading' message in the status bar, at least until I looked in the MySQL tables and found loads of records. This isn't a criticism, though - it's great software, and has saved me an awful lot of PHP programming.

Finally, is there an index anywhere of OAI-PMH archives together with their OAI-PMH service URLs? OAIster has a list of archives (http://www.oaister.org/viewcolls.html) but the archive info doesn't include the service URL - you have to go to the archive itself and trog around its website to find a service URL, if there is one.

Cheers

Fred
Learning Technologist
School of Nursing, University of Nottingham, UK
fredriley
 
Posts: 27
Joined: Fri Sep 14, 2007 10:47 am

Re: Harvester and metadata schemas

Post by asmecher » Fri Sep 14, 2007 3:33 pm

Hi Fred,

Glad to hear the Harvester is working out well so far.

The OAI protocol allows the Harvester to query any archive for its list of supported metadata formats, and the Harvester can then compare that list against its own set of supported formats. The metadata format pull-down only includes the formats that are supported by both.
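As a rough illustration of that comparison (a Python sketch, not the Harvester's own PHP code), the snippet below parses an OAI-PMH ListMetadataFormats response and keeps only the prefixes found on both sides. The sample response, the `example.org` URL, the function names, and the prefix strings chosen for DC, MARC, MARCXML and MODS are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# OAI-PMH 2.0 response namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Assumed metadataPrefix values for the DC, MARC, MARCXML and MODS schemas;
# the prefixes a real installation recognises may differ.
HARVESTER_PREFIXES = {"oai_dc", "oai_marc", "marcxml", "mods"}

# Made-up ListMetadataFormats response from a repository offering DC and LOM.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2007-09-14T15:33:00Z</responseDate>
  <request verb="ListMetadataFormats">http://example.org/oai</request>
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>lom</metadataPrefix>
      <schema>http://ltsc.ieee.org/xsd/lomv1.0/lom.xsd</schema>
      <metadataNamespace>http://ltsc.ieee.org/xsd/LOM</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def repository_prefixes(response_xml):
    """Return the metadataPrefix values advertised in a ListMetadataFormats response."""
    root = ET.fromstring(response_xml)
    found = root.findall(".//oai:metadataFormat/oai:metadataPrefix", OAI_NS)
    return [el.text.strip() for el in found if el.text]


def selectable_formats(response_xml, supported=HARVESTER_PREFIXES):
    """Keep only the prefixes that both the repository and the harvester support."""
    return [p for p in repository_prefixes(response_xml) if p in supported]


print(selectable_formats(SAMPLE_RESPONSE))  # ['oai_dc']
```

In this made-up case the repository advertises a LOM format, but because the harvester side has no matching schema, only `oai_dc` would be offered. A live response can be obtained by appending `?verb=ListMetadataFormats` to an archive's OAI base URL.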

The Harvester currently supports MARC, MODS, MARCXML, and DC schemas, but implementing a new one is typically not too complicated; have a look in plugins/schemas/dc for the implementation of the Dublin Core plugin as an example.

A status bar or some sort of progress indicator is an excellent idea for a future release. In the meantime, I'd suggest using the command-line harvester, which is more reliable for a potentially long-running harvest than invoking it via the web. Most hosts impose a script execution time limit, and harvesting via the web will often hit it.

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8347
Joined: Wed Aug 10, 2005 12:56 pm

Re: Harvester and metadata schemas

Post by fredriley » Mon Sep 17, 2007 6:20 am

Thanks for the reply. It's useful to clear up where the schema support comes from.

asmecher wrote:The Harvester currently supports MARC, MODS, MARCXML, and DC schemas, but implementing a new one is typically not too complicated; have a look in plugins/schemas/dc for the implementation of the Dublin Core plugin as an example.

Hmm, well it looks a bit complicated to me - you have to create a custom PHP class for the schema, and the UK LOM Core (http://zope.cetis.ac.uk/profiles/uklomcore) is pretty damn large, based as it is on the IEEE LOM. Plus my object-oriented programming in PHP is pretty basic, and restricted so far to just using existing classes (not helped by our sysadmins only having installed PHP 4.3 :( ). It would be easy enough to add LOM fields to the getFieldList() but it could be non-trivial to write methods to extract LOM data that's not expressed in DC (such as fields on educational content).
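To give a feel for what extracting the non-DC fields involves, here is a minimal Python sketch (not a Harvester plugin, and not its PHP plugin API) that pulls a few values, including two from the educational category, out of a LOM record. It assumes the record uses the IEEE LOM XML binding namespace `http://ltsc.ieee.org/xsd/LOM`; the sample record, the field names in `FIELD_PATHS`, and the function name are hypothetical, and a UK LOM Core record from a real repository may be structured differently.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the IEEE LOM XML binding.
LOM_NS = {"lom": "http://ltsc.ieee.org/xsd/LOM"}

# Hypothetical field names mapped to element paths relative to the <lom> root.
FIELD_PATHS = {
    "title": "lom:general/lom:title/lom:string",
    "learningResourceType": "lom:educational/lom:learningResourceType/lom:value",
    "typicalLearningTime": "lom:educational/lom:typicalLearningTime/lom:duration",
}

# Made-up record for illustration only.
SAMPLE_LOM = """\
<lom xmlns="http://ltsc.ieee.org/xsd/LOM">
  <general>
    <title><string language="en">Hand washing technique</string></title>
  </general>
  <educational>
    <learningResourceType>
      <source>LOMv1.0</source>
      <value>simulation</value>
    </learningResourceType>
    <typicalLearningTime><duration>PT15M</duration></typicalLearningTime>
  </educational>
</lom>"""


def extract_fields(record_xml, field_paths=FIELD_PATHS):
    """Return {field name: [values]} for every configured path in the record."""
    root = ET.fromstring(record_xml)
    fields = {}
    for name, path in field_paths.items():
        matches = root.findall(path, LOM_NS)
        # Fields absent from the record map to an empty list.
        fields[name] = [el.text.strip() for el in matches if el.text]
    return fields


for name, values in extract_fields(SAMPLE_LOM).items():
    print(name, values)
```

Driving the extraction from a table of paths means each additional LOM field is one more entry rather than one more method, which may make a large profile such as the UK LOM Core more tractable.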

At the risk of looking a gift horse in the mouth, I don't suppose that there are any 'ready-rolled' classes for the IEEE LOM or similar? If not, no big deal, I'll use DC for the time being. Chances are that our users would only search on basic DC fields anyway.

Cheers

Fred

Re: Harvester and metadata schemas

Post by asmecher » Mon Sep 17, 2007 9:00 am

Hi Fred,

There is a thread on IEEE LOM at http://pkp.sfu.ca/support/forum/viewtopic.php?t=2026; perhaps you could join forces.

Regards,
Alec Smecher
Public Knowledge Project Team

Re: Harvester and metadata schemas

Post by fredriley » Mon Sep 17, 2007 10:13 am

asmecher wrote:Hi Fred,

There is a thread on IEEE LOM at http://pkp.sfu.ca/support/forum/viewtopic.php?t=2026; perhaps you could join forces.

Regards,
Alec Smecher
Public Knowledge Project Team


Thanks for the pointer, Alec - I've had a skim and plainly it's pretty damn relevant. I'll look at it in more detail tomorrow. Thanks again for your help, and of course to the team for producing the Harvester in the first place :)

Cheers

Fred

