Database issue


Re: Database issue

Postby khufu » Wed Jun 06, 2012 11:40 am

Jason, all,
I realized that I have been using importExport.php improperly.
I need to import news items, each defined as follows:
a TITLE
a BODY
a DATE
nothing else.
Browsing the news through the web pages is optional; what really matters is the Atom/RSS feed.
So now my question:
What should the XML file look like?
I mean the simplest form that does not overload the database.

thanks
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Wed Jun 06, 2012 2:42 pm

Hi khufu,

The importExport script can accept an issue_id on the command line during the import. You can do something like this:

Code:
php tools/importExport.php NativeImportExportPlugin import yourXmlFile.xml journalPath userName issue_id ##### section_abbrev #####


If you do it that way, all you need to assemble in the XML are the <articles><article...></article></articles> that you want to bring in. You'd specify the issue_id (the ##### bit) on the command line as well, and also the section abbreviation. Doing it this way would prevent the creation of a new issue each time.

You can get detailed usage instructions with:

Code:
php tools/importExport.php NativeImportExportPlugin usage


Regards,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am
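
For anyone scripting this, here is a minimal Python sketch that assembles the import command described above as an argument list. The function name build_import_command and its parameters are hypothetical; the argument order (XML file, journal path, user name, then the issue_id and section pairs) follows the command lines quoted elsewhere in this thread, and only the section_abbrev and section_id keys that appear in the thread are accepted.

```python
# Sketch only: builds the argument list for the OJS native import command
# discussed in this thread. Names and defaults here are illustrative.

def build_import_command(xml_file, journal_path, user_name, issue_id,
                         section_key, section_value,
                         script="tools/importExport.php"):
    """Return the command as a list suitable for subprocess.run()."""
    # Only the two section specifiers mentioned in the thread are allowed.
    if section_key not in ("section_abbrev", "section_id"):
        raise ValueError("unsupported section specifier: %s" % section_key)
    return [
        "php", script, "NativeImportExportPlugin", "import",
        xml_file, journal_path, user_name,
        "issue_id", str(issue_id),
        section_key, str(section_value),
    ]


if __name__ == "__main__":
    # Prints the command; pass the list to subprocess.run(cmd, check=True)
    # on a machine where OJS is installed to actually execute it.
    cmd = build_import_command("test.xml", "journalPath", "userName",
                               1, "section_abbrev", "ART")
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems if a file name or abbreviation contains spaces.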

Re: Database issue

Postby khufu » Wed Jun 06, 2012 11:20 pm

Jason,
As usual, your replies are so interesting!
The way you suggested for importing articles into the same issue is very good, but now I have some doubts about the XML file.
Should I use only the following as the XML file:
Code:
<article>
            <title locale="en_US">Jane Doe's Article Title</title>
            <abstract locale="en_US">Jane Doe's Article Abstract...</abstract>
            <indexing>
               <discipline locale="en_US">Education; Literature Education</discipline>
               <subject locale="en_US">Young adult literature; Holocaust</subject>
               <coverage>
                  <geographical locale="en_US">North America</geographical>
                  <chronological locale="en_US">Contemporary</chronological>
                  <sample locale="en_US">Adolescent readers</sample>
               </coverage>
            </indexing>
            <author primary_contact="true">
               <firstname>Jane</firstname>
               <lastname>Doe</lastname>
               <email>JaneDoe@sample.test</email>
               <biography locale="en_US">Jane Doe's  Bio statement...</biography>
            </author>
            <date_published>2004-10-05</date_published>
            <htmlgalley locale="en_US">
               <label>HTML</label>
               <file>
                  <href src="myfile.html" mime_type="text/html"/>
               </file>
               <stylesheet>
                  <href src="myfile.xsl" mime_type="text/xsl"/>
               </stylesheet>
               <image>
                  <embed encoding="base64" filename="myfilename.png" mime_type="image/png">(base64-encoded data would appear here)</embed>
               </image>
               <image>
                  <href src="myimage2.png" mime_type="image/png"/>
               </image>
            </htmlgalley>
            <galley locale="en_US">
               <label>PDF</label>
               <file>
                  <href src="mygalley.pdf" mime_type="application/pdf"/>
               </file>
            </galley>
         </article>


Or should I also add a header?
If so, could you please show me what the header looks like?

Another question: what should the section_abbrev parameter look like? Something like ART?

many thanks
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Thu Jun 07, 2012 2:55 am

Hi khufu,

You can wrap that <article> element in an <articles> container. For the format of your file, have a look at the native.dtd file in plugins/importexport/native/native.dtd, which describes the XML. It is perfectly okay for an <articles> element to have only one <article> inside it (and probably less error-prone if you decide to import more than one article at a time).

And yes, that would be an acceptable format for a section identifier.

Regards,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am

Re: Database issue

Postby khufu » Thu Jun 07, 2012 4:39 am

Jason,
If my XML file is structured like this:

Code:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE issue PUBLIC "-//PKP//OJS Articles and Issues XML//EN" "http://pkp.sfu.ca/ojs/dtds/native.dtd">
<issue published="true" current="true">
   <title>ANA_ISSUE</title>
   <volume> 1 </volume>
   <number> 1 </number>
   <year> 2012 </year>
   <section>
      <title locale="en_US">ANA_SECTION</title>
      <abbrev locale="en_US">ANA_ABR</abbrev>
      <article>
         <title> …article_title… </title>
         <abstract> …article_body… </abstract>
         <pages> 1 </pages>
         <date_published> …date... </date_published>
         <author primary_contact="true">
            <firstname>…name...</firstname>
            <middlename>…middlename...</middlename>
            <lastname>…surname…</lastname>
            <email>…email_address...</email>
         </author>
      </article>
   </section>
</issue>


and I launch the command:

Code:
php /var/www/ojs/tools/importExport.php NativeImportExportPlugin import test.xml journal_name user_name issue_id 1 section_abbrev ANA_ABR


It keeps creating a new issue for each article!

What is wrong?
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Thu Jun 07, 2012 5:11 am

Leave the issue stuff out of it :) Just start with an <articles> element and then your <article> elements inside. No need for the other stuff at all.

Code:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE articles PUBLIC "-//PKP//OJS Articles and Issues XML//EN" "http://pkp.sfu.ca/ojs/dtds/native.dtd">
<articles>
   <article>
      <title> …article_title… </title>
      <abstract> …article_body… </abstract>
      <pages> 1 </pages>
      <date_published> …date... </date_published>
      <author primary_contact="true">
         <firstname>…name...</firstname>
         <middlename>…middlename...</middlename>
         <lastname>…surname…</lastname>
         <email>…email_address...</email>
      </author>
   </article>
</articles>
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am
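
Since each news item is only a title, a body and a date, the <articles> file can be generated rather than written by hand. The following is a rough Python sketch, not an official tool: the function name build_articles_xml, the dictionary keys and the placeholder author are all assumptions made for illustration. The element names are taken from the examples in this thread, and native.dtd remains the authority on which elements are required and in what order.

```python
import xml.etree.ElementTree as ET

# Header lines as in the thread's example, with the DOCTYPE root name set to
# "articles" so that it matches the root element of the generated file.
HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE articles PUBLIC "-//PKP//OJS Articles and Issues XML//EN" '
    '"http://pkp.sfu.ca/ojs/dtds/native.dtd">\n'
)


def build_articles_xml(news_items, author):
    """Build an <articles> document from dicts with title, body and date."""
    root = ET.Element("articles")
    for item in news_items:
        article = ET.SubElement(root, "article")
        # The news body is stored in <abstract>, as in the thread's example.
        ET.SubElement(article, "title").text = item["title"]
        ET.SubElement(article, "abstract").text = item["body"]
        ET.SubElement(article, "date_published").text = item["date"]
        author_el = ET.SubElement(article, "author",
                                  {"primary_contact": "true"})
        for tag in ("firstname", "lastname", "email"):
            ET.SubElement(author_el, tag).text = author[tag]
    # ElementTree escapes &, < and > in the text automatically.
    return HEADER + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    items = [{"title": "First item", "body": "Body text", "date": "2012-06-07"}]
    author = {"firstname": "News", "lastname": "Desk",
              "email": "news@example.test"}
    with open("test.xml", "w", encoding="utf-8") as handle:
        handle.write(build_articles_xml(items, author))
```

Putting several items into one file means one import run per batch instead of one per article.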

Re: Database issue

Postby khufu » Thu Jun 07, 2012 5:21 am

Jason,
If I use the XML file as you suggested and launch:
php /var/www/ojs/tools/importExport.php NativeImportExportPlugin import test.xml journal_name user_name issue_id 1 section_abbrev ANA_ABR

the system answers:
ERROR:
No section matched the specifier "ANA_ABR"

even though the ANA_ABR section is in the database!

Any idea?
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Thu Jun 07, 2012 5:29 am

I hate to ask, but are you *sure* it exists? Check your section_settings table for an entry with a setting_name of 'abbrev' and a setting_value of what you've specified. The query that actually runs there is:

Code:
SELECT s.* FROM sections s, section_settings l WHERE l.section_id = s.section_id AND l.setting_name = 'abbrev' AND l.setting_value = ? AND s.journal_id = ?


The question marks are filled in with the section abbreviation and the journal ID. Are your other settings correct? What are you putting for the journal name? It should be the 'PATH' that shows up in the URL for your journal, not the actual title of the journal.

Regards,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am

Re: Database issue

Postby khufu » Thu Jun 07, 2012 6:09 am

Jason,
I reinstalled OJS from scratch. Starting from an empty database, I did the following:
1) Created a new journal called 'notizie_italiane'
2) Created (through the web page) a new issue called 'Notizie' (Volume = 1, Number = 1, Year = 2012). Checked with phpMyAdmin that its id = 1
3) In the Journal Sections there is one default section
Section title: articles
Abbreviation: ART
Using your XML example, I launched the command:
php /var/www/ojs/tools/importExport.php NativeImportExportPlugin import test.xml notizie_italiane admin issue_id 1 section_abbrev ART


The system answered :
ojs2 has produced an error
Message: WARNING: Missing argument 2 for SectionDAO::getSectionByAbbrev(), called in /var/www/ojs/plugins/importexport/native/NativeImportExportPlugin.inc.php on line 394 and defined
In file: /var/www/ojs/classes/journal/SectionDAO.inc.php
At line: 78
Stacktrace:
Server info:
OS: Linux
PHP Version: 5.3.10-1ubuntu3.1
Apache Version: N/A
DB Driver: mysql
DB server version: 5.5.22-0ubuntu1
NOTICE: Undefined variable: journalId (/var/www/ojs/classes/journal/SectionDAO.inc.php:80)
ERROR:
No section matched the specifier "ART".


Any idea ?
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby khufu » Thu Jun 07, 2012 6:39 am

Jason,
I solved it by using section_id instead of section_abbrev.
Now I am going to feed in the flow of news and see what happens...
I will let you know.

Many thanks
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Thu Jun 07, 2012 6:55 am

Hi again,

Hey, thanks for finding that bug. I'll fix that in a bit. For now, though, if you want to be able to go back to using your ART section_abbrev instead of the section_id, you can edit the NativeImportExportPlugin.inc.php file and change line 394 to:

Code:
$section =& $sectionDao->getSectionByAbbrev(($sectionIdentifier = array_shift($args)), $journal->getId());


Cheers,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am
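
To see why a missing journal ID leads to "No section matched", the lookup query quoted earlier in the thread can be run against a toy database. The Python sketch below uses an in-memory SQLite database as a stand-in for MySQL, and the two tables contain only the columns named in that query (the real OJS tables have more). Treating PHP's undefined $journalId as NULL is an assumption about how the value reaches the database; the point is only that the journal_id comparison cannot match.

```python
import sqlite3

# The lookup query as quoted in this thread.
QUERY = (
    "SELECT s.* FROM sections s, section_settings l "
    "WHERE l.section_id = s.section_id AND l.setting_name = 'abbrev' "
    "AND l.setting_value = ? AND s.journal_id = ?"
)


def make_demo_db():
    """Create a tiny stand-in schema with one journal and one section."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sections (section_id INTEGER, journal_id INTEGER)")
    conn.execute(
        "CREATE TABLE section_settings "
        "(section_id INTEGER, setting_name TEXT, setting_value TEXT)")
    conn.execute("INSERT INTO sections VALUES (1, 1)")
    conn.execute("INSERT INTO section_settings VALUES (1, 'abbrev', 'ART')")
    return conn


def find_section(conn, abbrev, journal_id):
    """Run the lookup with both placeholders bound."""
    return conn.execute(QUERY, (abbrev, journal_id)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    # Both arguments supplied: the section row is found.
    print(find_section(conn, "ART", 1))
    # Journal ID missing (bound as NULL): the comparison never matches.
    print(find_section(conn, "ART", None))
```

With the second argument restored, as in the one-line change above, the same abbreviation resolves to the section again.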

Re: Database issue

Postby khufu » Fri Jun 08, 2012 5:51 am

Jason,
Now the importExport.php plugin imports articles correctly into one issue, but after about 100 articles have been imported, each import takes too long. It is now around 10 seconds and getting worse. The server I am using now runs at 1.8 GHz. A second server that I tested runs at 3 GHz, but after 1000 articles it takes 10-15 seconds to import an article. Both machines have 2 GB of RAM. With so many articles (my goal is to have around 5000 articles in the database), I think the slow part is the indexing process. I actually only need the RSS feed of those articles, so I am wondering if there is a way to bypass the indexing process.
What do you think?
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am

Re: Database issue

Postby JasonNugent » Mon Jun 11, 2012 7:55 am

Hi khufu,

Are you running any index commands for the file types you import? You are loading PDFs, if I recall. Do you have commands to parse those files defined in your config.inc.php? If so, have you tried disabling them?

Regards,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am

Re: Database issue

Postby khufu » Mon Jun 11, 2012 2:13 pm

Jason,
I am not loading PDF files, just text items made of a title and a body.
The XML file that I give to the importExport.php plugin is in the same format you suggested.
I have the feeling that the importExport plugin or the database has not been tested with a large number of articles.
For the first 200-300 articles there is no problem; after that it becomes slower and slower.
This is pretty strange considering that the database is so small. My database has 2800 articles and is only 32 MB.
Importing a new article now takes 10-15 seconds, whereas when the database was empty it took less than one second.
You can easily replicate the problem with a simple bash script that imports files composed of random characters.
If you find the same problem, we can work together on a solution. I am available for any test.

What do you think ?

cheers
khufu
 
Posts: 23
Joined: Tue Mar 20, 2012 8:19 am
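
The reproduction described above could be sketched roughly as follows, in Python rather than bash. The harness generates random titles and bodies and records how long each import takes, so a slowdown shows up as growing numbers. The import_one callable is hypothetical: it stands for whatever writes the XML file and runs the importExport.php command on your installation. The lengths of 40 and 500 characters are arbitrary choices.

```python
import random
import string
import time


def random_text(length, rng):
    """Return a string of random lowercase letters and spaces."""
    alphabet = string.ascii_lowercase + " "
    return "".join(rng.choice(alphabet) for _ in range(length))


def time_imports(import_one, count, seed=0):
    """Call import_one(title, body) count times; return seconds per call."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    durations = []
    for _ in range(count):
        title = random_text(40, rng)
        body = random_text(500, rng)
        start = time.perf_counter()
        import_one(title, body)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in importer that does nothing; replace it with a function that
    # writes the XML file and runs the import command.
    timings = time_imports(lambda title, body: None, 5)
    for number, seconds in enumerate(timings, start=1):
        print("import %d: %.4f s" % (number, seconds))
```

Comparing the first and last few timings of a long run would show whether import time really grows with the number of articles already in the database.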

Re: Database issue

Postby JasonNugent » Mon Jun 11, 2012 2:42 pm

Hi khufu,

I can assure you that there are OJS installations with many thousands of articles. I've managed one that had more than 20 thousand.

If you believe that it is the article search index that is causing the problem, I suggest perhaps modifying the NativeImportDom.inc.php file in plugins/importexport/native by commenting out the lines beginning on line 900 (or thereabouts). That would prevent the article search index from being updated.

Regards,
Jason
JasonNugent
Site Admin
 
Posts: 877
Joined: Tue Jan 10, 2006 6:20 am
