PKP Support Forum



dc.type

Open Harvester Systems support questions and answers, bug reports, and development issues.

Post by josipkp » Fri Feb 06, 2009 4:39 pm

Hi,

We are using Harvester 2.3 beta in both Windows and Unix environments.

Question 1)
Using the "Show metadata" function in our OJS installation, we can see both items:
type | status & genre | Article peer reviewed
type | method/approach | Scientific article
(for example).

When we harvest this journal, only the first dc.type appears in the Harvester's "View Record" function (on both Windows and Unix).
When we harvested another OJS installation, both dc.types were listed in the Windows Harvester, but not in the Unix one.

What could be wrong?

Question 2)
Under "manage archives", we can select specific sets for "update metadata index".
But we cannot find where this information is saved in MySQL.
Is the set information not saved?

Thank you in advance,
Josi Perez
josipkp
 
Posts: 61
Joined: Fri Jun 27, 2008 8:51 am

Re: dc.type

Post by asmecher » Fri Feb 06, 2009 6:46 pm

Hi josipkp,

Hope you're finding the beta useful -- we've already fixed a number of minor issues in CVS, and the results will be released around the end of the month. Feel free to pass along any feedback you've got.

Regarding the differences in dc.type, can you check the Harvester database to see if they appear there? Look in the "records" table (you can find the appropriate record_id in the URL of the Harvester when you're browsing that record), first at the "contents" column (which contains the raw XML source of the record), then at the "parsed_contents" column (which contains a quicker-to-read version of the record that is used for presentation). You should see it in the "parsed_contents" column as something like:
s:7:"subject";a:3:{i:0;s:25:"discipline/subdisciplines";i:1;s:17:"keywords; keyword";i:2;s:13:"subject class";}
The operative part is the first "subject" string; the rest are metadata values for the various fields.
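
For checking this by script rather than by eye: the value quoted above looks like PHP's serialize() format, so a minimal sketch of the check could look like the following. It assumes the column holds one complete serialized array; the helper name fieldValues and the outer array wrapped around the sample are illustrative, not part of the Harvester.

```php
<?php
// Hypothetical helper: return the values stored for one field of a
// PHP-serialized record, or an empty array if the field is absent.
function fieldValues($serialized, $field)
{
    $record = unserialize($serialized);
    if (!is_array($record) || !isset($record[$field])) {
        return array();
    }
    return (array) $record[$field];
}

// Sample built from the fragment quoted above, wrapped in an outer
// array (an assumption) so that it is a complete serialized value.
$sample = 'a:1:{s:7:"subject";a:3:{i:0;s:25:"discipline/subdisciplines";'
        . 'i:1;s:17:"keywords; keyword";i:2;s:13:"subject class";}}';

$subjects = fieldValues($sample, 'subject');
$types = fieldValues($sample, 'type');

// prints: 3 subject value(s), 0 type value(s)
echo count($subjects), ' subject value(s), ', count($types), " type value(s)\n";
```

An empty result for "type" here simply means the field is missing from the stored record. Note that unserialize() should only be run on data from a trusted database.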

Regards,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8597
Joined: Wed Aug 10, 2005 12:56 pm

Re: dc.type

Post by josipkp » Sun Feb 08, 2009 8:11 am

Alec Smecher, thank you very much for your answer.

The subject field is fine: it shows up the same as in OJS.
There are 3 subjects for this specific article, and all 3 lines appear correctly in the Harvester Record Details and in the records table.

The same does not happen with the type field.
In OJS we can see:
8. Tipo | Situação & gênero | Avaliado por Pares
8. Tipo | Tipo | trabalho de campo com observação participante e entrevistas
(in English: Type | Status & genre | Peer reviewed, and Type | Type | fieldwork with participant observation and interviews)

but in the Harvester Record Details we see only:
Date | 2008-09-29
Type | Avaliado por Pares
Format | application/pdf

and the words "trabalho de campo com observação participante e entrevistas" do not appear in the records table, in either the contents or the parsed_contents column.

We are also looking for the "set" (the name of the OJS section) that this article belongs to (shown in the set box when harvesting an archive), but we cannot find it.

The idea is to separate the harvested documents into categories (articles, reviews, theses and so on) using one of these fields.

Thanks in advance,
Josi Perez

Re: dc.type

Post by asmecher » Sun Feb 08, 2009 4:06 pm

Hi Josi,

Try doing a flush and re-harvest on that archive -- if the data isn't in the "contents" column of the "records" table, it didn't come into the Harvester in the first place. Have you been harvesting incrementally?

Regards,
Alec Smecher
Public Knowledge Project Team

Re: dc.type

Post by josipkp » Wed Feb 11, 2009 10:49 am

Alec Smecher, thank you very much for your answer.

Our environment is Harvester 2.3 beta.
I tried flushing each archive, without success. I even installed a clean Harvester environment.
Two languages are defined for our Harvester, English and Portuguese, but we also tried a clean installation, with the same result: no second line for dc.type.

Consider one example. In OJS I can see:
[...]
7. Data (YYYY-MM-DD) 2008-12-01
8. Tipo Situação & gênero Artigo Avaliado por Pares
8. Tipo Tipo Artigo científico

9. Formato Formato do Documento PDF
[...]

but when I harvest the archive containing this article, I cannot see "type: scientific article" in "record details":
[...]
Date 2009-02-08
Type Artigo Avaliado por Pares
Format application/pdf
[...]

The "parsed_contents" field in the "records" table does not contain this information either, so it is not only a display issue:
[...]"date";a:1:{i:0;s:10:"2009-02-08";}s:4:"type";a:1:{i:0;s:25:"Artigo Avaliado por Pares";}s:6:"format";a:1:{i:0;s:15:"application/pdf";}s:10:[...]

When you harvest an archive, do you get two lines of dc.type in "record details" in Harvester 2.3? Is this a problem specific to our installation?

I cannot find the "set" field either (the names shown in the box when harvesting an archive: the section names from an OJS archive, or the lowest-level communities in a DSpace archive).

We are looking for the "dc.type" or "set" information so that we can search for specific kinds of documents (articles, theses, and so on).

Waiting for a tip, thanks in advance,
Josi Perez

Re: dc.type

Post by asmecher » Wed Feb 11, 2009 11:27 am

Hi Josi,

I suspect you can track this back to OJS, not the Harvester -- have a look at the OAI data that it serves by going to a URL like the following: http://my-server-name-here/path/to/ojs2/index.php/index/oai?verb=ListRecords&metadataPrefix=oai_dc

See if you can find the record in question. It may be something quirky like the metadata language; are the values you mention entered in several languages?

Likewise, to see the set list, query http://my-server-name-here/path/to/ojs2/index.php/index/oai?verb=ListSets
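
To check a saved ListRecords response for dc:type values without reading the XML by eye, something like the following sketch might help. It uses PHP's DOM extension with the standard OAI-PMH and Dublin Core namespaces; the helper name dcTypesByRecord and the sample XML (including the identifier oai:example.org:article/1) are made up for illustration, and a real response would contain many records.

```php
<?php
// Hypothetical helper: map each record identifier in an OAI-PMH
// ListRecords response (oai_dc format) to its list of dc:type values.
function dcTypesByRecord($xml)
{
    $oaiNs = 'http://www.openarchives.org/OAI/2.0/';
    $dcNs = 'http://purl.org/dc/elements/1.1/';

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $result = array();
    foreach ($doc->getElementsByTagNameNS($oaiNs, 'record') as $record) {
        // The header identifier lives in the OAI namespace, so it is
        // not confused with any dc:identifier in the metadata.
        $ids = $record->getElementsByTagNameNS($oaiNs, 'identifier');
        $id = $ids->length > 0 ? $ids->item(0)->textContent : '';
        $result[$id] = array();
        foreach ($record->getElementsByTagNameNS($dcNs, 'type') as $type) {
            $result[$id][] = $type->textContent;
        }
    }
    return $result;
}

// Invented sample response with a single record and a single dc:type.
$sample = <<<'XML'
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:article/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:type>Artigo Avaliado por Pares</dc:type>
          <dc:format>application/pdf</dc:format>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
XML;

$typesByRecord = dcTypesByRecord($sample);
foreach ($typesByRecord as $id => $values) {
    echo $id, ': ', count($values), " dc:type value(s)\n";
}
```

A record that reports only one dc:type value at this stage never had the second value in the OAI output, which would point at the data provider rather than the Harvester.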

Regards,
Alec Smecher
Public Knowledge Project Team

Re: dc.type

Post by josipkp » Fri Mar 06, 2009 10:54 am

Alec Smecher, thank you very much for your answer. It was very enlightening and I learned a lot, but I have new questions.

1--------
When I tried to access the records using
http://revistas.univerciencia.org/turis ... fix=oai_dc

I received the message:
'output handler 'ob_gzhandler' conflicts with 'zlib output compression''
at line 67 of classes/oai/OAI.inc.php

We have zlib turned on, so I tried to add a check along the lines of "if zlib.output_compression = off", but without success, so I just commented the line out:
// Encode data with gzip, deflate, or none, depending on browser support
// ob_start('ob_gzhandler');

What is the better solution here?
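
One possible alternative to commenting the line out is to test the setting at run time with ini_get(), so that the handler is still used on servers where PHP is not already compressing output. This is only a sketch: gzHandlerIsSafe is a hypothetical helper, not an OJS function, and it assumes the line in question is the plain ob_start('ob_gzhandler') call quoted above.

```php
<?php
// Decide whether it is safe to start ob_gzhandler, given the raw value
// of the zlib.output_compression setting as returned by ini_get().
// The setting may be a boolean-like string ("", "0", "1", "On", "Off")
// or a buffer size such as "4096"; anything that is not clearly "off"
// is treated as enabled.
function gzHandlerIsSafe($iniValue)
{
    $value = strtolower(trim((string) $iniValue));
    return in_array($value, array('', '0', 'off', 'false', 'no'), true);
}

// Only install the handler when the zlib extension (which provides
// ob_gzhandler) is loaded and PHP is not already compressing output.
if (function_exists('ob_gzhandler')
    && gzHandlerIsSafe(ini_get('zlib.output_compression'))) {
    ob_start('ob_gzhandler');
}
```

Comparing against the string value matters because ini_get() returns text, which may be why a direct comparison with "off" did not behave as expected.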


2---
We ran some tests with the other formats provided by OJS and changed the code in what is probably the wrong way, because I don't know the right way…

http://revistas.univerciencia.org/index ... x=oai_marc
Notice: Undefined index: pt_BR in /var/www/ojs/classes/oai/format/OAIMetadataFormat_MARC.inc.php on line 36

In /classes/oai/format/OAIMetadataFormat_MARC.inc.php and
/classes/oai/format/OAIMetadataFormat_MARC21.inc.php
I changed line 36
from
$this->formatElement('260', ' ', ' ', 'b', $record->publishers[$record->primaryLocale]) .
to
$this->formatElement('260', ' ', ' ', 'b', '') .

In /classes/oai/format/OAIMetadataFormat_MARC21.inc.php I also changed line 46
from
$this->formatElement('540', ' ', ' ', 'a', $record->rights[$record->primaryLocale]) .
to
$this->formatElement('540', ' ', ' ', 'a', '') .


Easy, no? :(
How should I fill in the last parameter on these lines?
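
A less lossy variant of the change above would keep the value when the locale is present and fall back to an empty string only when it is missing. The sketch below shows that idea in plain PHP; localizedValue is a hypothetical helper rather than an OJS function, and the sample publisher is invented.

```php
<?php
// Hypothetical helper: return the value stored for $locale, or '' when
// that locale has no entry, instead of triggering "Undefined index".
function localizedValue($values, $locale)
{
    if (is_array($values) && isset($values[$locale])) {
        return $values[$locale];
    }
    return '';
}

// Invented sample data: a publisher recorded in English only.
$publishers = array('en_US' => 'Example University Press');

echo localizedValue($publishers, 'en_US'), "\n";
echo '[', localizedValue($publishers, 'pt_BR'), "]\n";
```

With such a helper, the last parameter on line 36 could become localizedValue($record->publishers, $record->primaryLocale), and on line 46 localizedValue($record->rights, $record->primaryLocale), assuming both properties are arrays keyed by locale as the notice suggests.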



3---
Back to the dc.type metadata.
Following your suggestion, I looked at the OJS code.
In classes/oai/ojs/OAIDAO.inc.php, the function _returnRecordFromRow(&$row) contains, at line 360:

$types = $this->stripAssocArray((array) $section->getIdentifyType(null));
$record->types = empty($types)?array(Locale::getLocale() => Locale::translate('rt.metadata.pkp.peerReviewed')):$types;

So the OJS code does not send the "Method Type" available in $article->getArticleType().
I tried to join both strings (the section and article variables), taking locales into account, without success.
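
Joining the two sources per locale could be sketched as below. This is not the patch attached to any Bugzilla entry, only an illustration in plain PHP: mergeTypesByLocale is a hypothetical helper, and it assumes both inputs are arrays keyed by locale, which the thread does not confirm for the article type.

```php
<?php
// Hypothetical helper: combine two locale-keyed arrays of type strings
// into one array that maps each locale to a list of non-empty values,
// section type first and article type second.
function mergeTypesByLocale($sectionTypes, $articleTypes)
{
    $merged = array();
    foreach (array($sectionTypes, $articleTypes) as $source) {
        foreach ((array) $source as $locale => $value) {
            if ($value === null || $value === '') {
                continue; // skip locales with no value entered
            }
            $merged[$locale][] = $value;
        }
    }
    return $merged;
}

// Sample values taken from the examples quoted in this thread.
$sectionTypes = array('pt_BR' => 'Artigo Avaliado por Pares');
$articleTypes = array(
    'pt_BR' => 'Artigo científico',
    'en_US' => 'Scientific article',
);

$types = mergeTypesByLocale($sectionTypes, $articleTypes);
echo count($types['pt_BR']), " type value(s) for pt_BR\n";
```

Whatever code writes the dc:type elements would then need to emit one element per value in each list, rather than one per locale.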


To answer your question: yes, the metadata are entered in several languages, usually Portuguese, English and Spanish; the metadata appear correctly in all languages, including "type". Now I can pin down my problem: "type" carries only the section information, not the article "type" (Type, method or approach).


What do you suggest?
Is there a technical or administrative reason for not sending this field?


Thank you,
Josi Perez

Re: dc.type

Post by asmecher » Tue Mar 10, 2009 9:45 pm

Hi Josi,

Sorry about the delay in responding.

Regarding #1 -- I haven't seen this before, but your solution looks good to me.

#2 -- There may have been some problems in the current release with metadata formats and multilingual data; we're currently beta-testing OJS 2.2.3, and it should be available around the end of the month. I'd suggest waiting for that release, or perhaps checking out its CVS tag (ojs2-branch-2_2_2).

#3 -- I've posted a Bugzilla entry for this at http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=4121 and expect to have a patch posted there shortly.

Regards,
Alec Smecher
Public Knowledge Project Team

Re: dc.type

Post by josipkp » Fri Mar 13, 2009 6:29 am

Delay??!! The PKP Team is very quick to answer.
Congratulations on both the software development and your availability in keeping this forum always up to date.


I made the code modification based on your Bugzilla entry, and we now have both dc.types in the Harvester.
Thank you.
Josi Perez

