by Scott » Sun Jul 02, 2006 11:09 pm
Hi Alec (or other PKP developers),
OK, I had a look at the PKPDC plugin and have some follow-up queries. The PKPDC plugin appears to be more concerned with handling a different schema. In the ListSets case I'm still dealing with a standard DC metadata harvest, but harvesting from a different OAI request (i.e. the DC is in the setDescription element rather than the metadata element), so I think it is more an extension/enhancement of the existing OAI plugin than a new one. An example of a ListSets "record":
<set>
<setSpec>hdl_1030.58_1952</setSpec>
<setName>Chinese Revolution (New)</setName>
<setDescription>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Chinese Revolution</dc:title>
<dc:description>Scholarly Information Services/The Library at ANU holds a number of unique and in some cases rare and fragile collections in both print and microfilm relating to the Chinese Cultural Revolution period (1966 to 1976).</dc:description>
<dc:rights>http://www.anu.edu.au/legal/copyrit.html</dc:rights>
<dc:identifier>1030.58/1952</dc:identifier>
</oai_dc:dc>
</setDescription>
</set>
The setSpec content acts as the record identifier in this case (same purpose as identifier in ListRecords), and there is no datestamp information. The harvested information is the setDescription content (same as metadata in ListRecords).
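To make that mapping concrete, here is a minimal Python sketch (not the harvester's PHP code) that turns one set element into a record-like structure. The function name set_to_record and the dict layout are assumptions made for illustration only. The sample is a trimmed copy of the set above, and the un-namespaced set/setSpec lookups match that fragment; a full OAI-PMH response would put those elements in the OAI-PMH namespace.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the example set above.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DC_PREFIX = "{" + NS["dc"] + "}"

# Trimmed copy of the example set, without the OAI-PMH envelope.
SAMPLE = """<set>
  <setSpec>hdl_1030.58_1952</setSpec>
  <setName>Chinese Revolution (New)</setName>
  <setDescription>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Chinese Revolution</dc:title>
      <dc:rights>http://www.anu.edu.au/legal/copyrit.html</dc:rights>
      <dc:identifier>1030.58/1952</dc:identifier>
    </oai_dc:dc>
  </setDescription>
</set>"""


def set_to_record(set_elem):
    """Map one <set> element to a record-like dict (hypothetical layout)."""
    record = {
        "identifier": set_elem.findtext("setSpec"),  # setSpec plays the identifier role
        "datestamp": None,                           # ListSets carries no datestamp
        "metadata": {},                              # filled from setDescription
    }
    dc_root = set_elem.find("setDescription/oai_dc:dc", NS)
    if dc_root is not None:
        for child in dc_root:
            if child.tag.startswith(DC_PREFIX):
                name = child.tag[len(DC_PREFIX):]
                value = (child.text or "").strip()
                record["metadata"].setdefault(name, []).append(value)
    return record


record = set_to_record(ET.fromstring(SAMPLE))
print(record["identifier"])
print(record["metadata"]["title"])
```

Each DC element is stored as a list because Dublin Core elements are repeatable.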
I'd value the developers' opinion on how best to implement this so that it is in a useful form for the core code base; then I can contribute it back. I have noted the changes I've made to get it working at the end of the email. Other than modifying the OAI harvester, an alternative is to use a separate harvester altogether that only implements ListSets harvesting, but I'm not sure how much code duplication there would be or whether that is the right approach.
Anyhow, the mods I've made to get this to work in the harvester plugin include:
- adding ListSets to the Index Method dropdown
- showing only "All Sets" in the set selection list when the index method is ListSets
- passing a "noindex" parameter in the OAIXMLHandler instantiation in the getSets method (this method appears to be used only for retrieving a list of sets, so there are no side-effects in doing this). I needed to do this to avoid triggering a harvest when the set selection list is built
- modifying the OAIXMLHandler to harvest from the setDescription. This involved some slight changes to the case statements for the elements in the ListSets request
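The last two changes can be sketched roughly as follows. This is a Python illustration of the idea only, not the PHP OAIXMLHandler: the class name ListSetsHandler, its attributes, and the way "noindex" is honoured are all assumptions. With noindex set, the handler only collects (setSpec, setName) pairs for the selection list; without it, the DC elements found inside setDescription are also kept for indexing.

```python
import xml.sax


class ListSetsHandler(xml.sax.ContentHandler):
    """Hypothetical stand-in for the harvester's handler; not the PKP class."""

    def __init__(self, noindex=False):
        super().__init__()
        self.noindex = noindex
        self.sets = []      # (setSpec, setName) pairs for the set selection list
        self.indexed = []   # (setSpec, metadata) pairs that would be indexed
        self._in_description = False
        self._text = []
        self._spec = self._name = None
        self._metadata = {}

    def startElement(self, name, attrs):
        self._text = []
        if name == "set":
            self._spec = self._name = None
            self._metadata = {}
        elif name == "setDescription":
            self._in_description = True

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        if name == "setSpec":
            self._spec = text
        elif name == "setName":
            self._name = text
        elif name == "setDescription":
            self._in_description = False
        elif self._in_description and name.startswith("dc:"):
            # DC elements are only harvested from inside setDescription.
            self._metadata.setdefault(name[3:], []).append(text)
        elif name == "set":
            self.sets.append((self._spec, self._name))
            if not self.noindex:
                self.indexed.append((self._spec, self._metadata))
        self._text = []


# Trimmed sample; the ListSets wrapper here omits the OAI-PMH envelope.
SAMPLE = b"""<ListSets><set>
<setSpec>hdl_1030.58_1952</setSpec>
<setName>Chinese Revolution (New)</setName>
<setDescription>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Chinese Revolution</dc:title>
</oai_dc:dc>
</setDescription>
</set></ListSets>"""

listing = ListSetsHandler(noindex=True)   # building the set selection list
xml.sax.parseString(SAMPLE, listing)

harvest = ListSetsHandler()               # an actual ListSets harvest
xml.sax.parseString(SAMPLE, harvest)

print(listing.sets, listing.indexed)
print(harvest.indexed)
```

The sketch matches on the literal "dc:" prefix because Python's SAX parser reports raw qualified names by default; a more robust version would enable namespace processing and match on the namespace URI instead.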
The metadata updates, searching and flushing all appear to function correctly, and no side-effects appear to have been introduced into the existing OAI harvesting.
I can either post these changes to the list for a closer look or mail them to you directly if you want to have a look.
Scott.