[Beginner] Is there a way to export harvested records

Open Harvester Systems support questions and answers, bug reports, and development issues.


Postby mhawksey » Wed Dec 14, 2011 2:15 am

Hi, is there a way to export records out of OHS other than a MySQL dump?
Thanks,
Martin
mhawksey
 
Posts: 1
Joined: Wed Dec 14, 2011 2:00 am

Re: [Beginner] Is there a way to export harvested records

Postby alexukua » Wed Dec 14, 2011 2:23 am

Very simple:
Use phpMyAdmin.
Select the OHS database and export the records table.
alexukua
 
Posts: 32
Joined: Thu Oct 16, 2008 3:27 am

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Wed Dec 14, 2011 9:48 am

Hi all,

Alternately, recent versions of OHS can re-serve data via OAI; your OAI URL will look like http://url-to-ohs/path/to/index.php/oai.
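
For readers who want to script this, here is a minimal sketch of an OAI-PMH ListRecords harvest against that endpoint. It uses only the Python standard library. The base URL is the placeholder from above, and `list_records_url`, `next_token`, and `fetch_all` are hypothetical helper names, not part of OHS. It assumes the endpoint supports the standard `oai_dc` metadata prefix and pages long result sets with a `resumptionToken`.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the given OAI base URL."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken in a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


def fetch_all(base_url):
    """Yield each page of ListRecords XML (as bytes) until no token remains."""
    token = None
    while True:
        url = list_records_url(base_url, resumption_token=token)
        with urllib.request.urlopen(url) as response:
            page = response.read()
        yield page
        token = next_token(page)
        if token is None:
            break


if __name__ == "__main__":
    # Placeholder URL; substitute your own installation's OAI URL.
    base = "http://url-to-ohs/path/to/index.php/oai"
    for number, page in enumerate(fetch_all(base), start=1):
        with open(f"listrecords_{number}.xml", "wb") as handle:
            handle.write(page)
```

Each page is saved to its own file so the raw XML can be inspected or analysed later without harvesting again.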

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Wed Feb 15, 2012 8:44 pm

I would like to export the DC metadata so that I can perform a content analysis on particular DC elements. I've used the export function in phpMyAdmin, but when I attempt to convert the XML or open it in a table or MS Access database for analysis, no dice. I've even put the XML into a text file, but it just won't work. Any suggestions?

Thanks in advance!
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Thu Feb 16, 2012 9:46 am

Hi tlchristian,

Depending on how you've extracted the data from the Harvester, it might be in one of several formats. Depending on what kind of analysis you want to do, or how you want it to be represented in Access, you'll have to figure out a data conversion process. It won't be as simple as exporting the XML from one tool and importing it into another.

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Thu Feb 16, 2012 7:23 pm

Thank you for the info.

Maybe my approach is all wrong. What I'm trying to do is run a content analysis on dc:type, dc:identifier, and perhaps other DC elements. Basically, I want to figure out how each of the repositories whose metadata I've harvested is using those elements. What terms are they using to describe collection types? What are the most common identifier formats among the repositories?

Maybe there is a better way to perform such an analysis. Any advice would be much appreciated!
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Thu Feb 16, 2012 10:30 pm

Hi tlchristian,

Unfortunately, OHS's database isn't very suitable for this kind of analysis directly. You can get the raw XML for each record from the database by querying the contents column of the records table, but you'll still need to parse the XML for the particular fields you're looking for. (If you need some data crosswalked into Dublin Core, you're better off getting XML from the OAI interface as described above.)
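
As a rough illustration of that parsing step, the sketch below pulls every value of one Dublin Core element out of a single record's XML. It assumes the stored XML uses the standard Dublin Core element namespace; `dc_values` is a hypothetical helper name, and the sample record is made up for the example rather than taken from a real harvest.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace (assumed to be what the records use).
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def dc_values(record_xml, element):
    """Return the text of every dc:<element> found anywhere in one record's XML."""
    root = ET.fromstring(record_xml)
    return [
        (node.text or "").strip()
        for node in root.iter(DC_NS + element)
        if (node.text or "").strip()
    ]


# Made-up sample record, standing in for one row of the contents column.
sample = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:type>Text</dc:type>
  <dc:type>Thesis</dc:type>
  <dc:identifier>http://example.org/record/1</dc:identifier>
</oai_dc:dc>"""

print(dc_values(sample, "type"))        # ['Text', 'Thesis']
print(dc_values(sample, "identifier"))  # ['http://example.org/record/1']
```

Running the same function over every row returned from the records table would give one list of values per record, ready to be counted or written out as CSV.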

I tend to use command-line tools like grep, sort, uniq, and wc -- available on most *NIX and Mac OS X systems, but also available under e.g. Cygwin for Windows -- to do basic analysis from there, as they can operate directly on the XML files. However, they aren't particularly intuitive.
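
For anyone without those tools to hand, here is a small Python sketch of the same kind of tally, roughly what a `grep -o ... | sort | uniq -c | sort -rn` pipeline would produce. It matches tags as plain text, as grep would, so it assumes elements are written exactly as `<dc:type>...</dc:type>` with no attributes. `tally` is a hypothetical helper name and the sample text is made up.

```python
import re
from collections import Counter


def tally(xml_text, tag):
    """Count each distinct value of <tag>...</tag>, like grep -o | sort | uniq -c."""
    pattern = re.compile(r"<%s>([^<]*)</%s>" % (re.escape(tag), re.escape(tag)))
    return Counter(value.strip() for value in pattern.findall(xml_text))


# Made-up sample text, standing in for the contents of a harvested XML file.
sample_text = """
<dc:type>Text</dc:type>
<dc:type>Image</dc:type>
<dc:type>Text</dc:type>
<dc:identifier>http://example.org/record/1</dc:identifier>
"""

# Print values from most to least frequent, with the count first as uniq -c does.
for value, count in tally(sample_text, "dc:type").most_common():
    print(f"{count:7d} {value}")
```

A text match like this is quick but fragile; parsing the XML properly is safer if the tags carry attributes or use a different namespace prefix.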

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Fri Feb 17, 2012 12:27 pm

Thanks so much, Alec.

I executed a ListRecords request, which worked beautifully. With the XML in hand, I'll be able to do what I need to do for my analysis.

However, the ListRecords request only returned headers and not the actual DC metadata. In fact, the <metadata> tag has nothing in it. I must be doing something wrong, right?

Sorry for all the questions. School project deadlines...
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Fri Feb 17, 2012 1:08 pm

Hi tlchristian,

Hmm, are these records crosswalked from something that was not originally DC?

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Fri Feb 17, 2012 1:52 pm

No, I don't think so. I haven't even set up any crosswalks. Everything in the OHS GUI looks good; all of the DC metadata values are present. The contents field in the records table contains the XML. A missing link...
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Fri Feb 17, 2012 2:26 pm

Hi tlchristian,

I think I found it -- there's a weird reference quirk affecting newer releases of PHP that was causing this, at least on my system. Try applying the change described at https://github.com/pkp/harvester/commit/fe3ea4fd4ffe82fef8610daa8f69e4094b674b24. The Bugzilla entry for this is at http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=7157.

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Fri Feb 17, 2012 2:49 pm

Excellent! In the meantime, I found a workaround that involves putting the content data into a text file; inserting <ListRecords>, <record>, and <metadata> tags; saving as XML; then importing into MS Access. Not efficient, but it's working so far.
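
A workaround along these lines could also be scripted rather than done by hand. The sketch below is only one possible way to do it: it wraps each record's XML in <record> and <metadata> elements under a single <ListRecords> root. It assumes each fragment is a complete, well-formed XML document on its own; `wrap_records` is a hypothetical name and the fragments are made up.

```python
import xml.etree.ElementTree as ET


def wrap_records(fragments):
    """Wrap each record's XML in <record><metadata>, all inside one <ListRecords> root."""
    root = ET.Element("ListRecords")
    for fragment in fragments:
        record = ET.SubElement(root, "record")
        metadata = ET.SubElement(record, "metadata")
        # Parsing each fragment first means malformed XML fails here, not in Access.
        metadata.append(ET.fromstring(fragment))
    return ET.tostring(root, encoding="unicode")


# Made-up fragments, standing in for values copied from the contents column.
fragments = [
    "<dc><type>Text</type></dc>",
    "<dc><type>Image</type></dc>",
]

with open("records.xml", "w", encoding="utf-8") as handle:
    handle.write(wrap_records(fragments))
```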

I'll try your fix. I'm sure it'll be much more efficient!

Thank you so much for your help!!
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby tlchristian » Fri Feb 17, 2012 2:55 pm

Yes! Your fix worked like a champ.

THANK YOU!!
tlchristian
 
Posts: 6
Joined: Wed Feb 15, 2012 8:40 pm

Re: [Beginner] Is there a way to export harvested records

Postby asmecher » Fri Feb 17, 2012 3:12 pm

Hi tlchristian,

Great, thanks for confirming! It'll go into the next release.

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8857
Joined: Wed Aug 10, 2005 12:56 pm

Re: [Beginner] Is there a way to export harvested records

Postby diegospano » Wed Sep 11, 2013 7:27 am

Following on from the previous posts, I'm exporting the records with a simple PHP file, but I have a question: in the "records" table there is no reference to the set that each record belongs to. I'm harvesting through the command line with the following syntax:

php harvest.php 2 set=com_10469_899 skipExistingEntries
php harvest.php 2 set=com_10469_900 skipExistingEntries

From archive_id 2 I'm harvesting two different sets, but I can't find any way to tell their records apart in the records table!

Thanks in advance.

Diego
diegospano
 
Posts: 3
Joined: Wed Sep 11, 2013 7:10 am
