PDF full-text indexing not working

Post by atlopes » Thu Jul 19, 2012 9:19 am

Solved: upgrading from 2.3.6 to 2.3.7 fixed this (found after digging deeper into the forum).

Hi, all.

I can't get PDF full-text indexing to work on our OJS installation.

I wrote a small PHP script to test whether pdftotext is working properly, replicating as closely as I could the OJS code that invokes the extractor. The test script is located at http://www.iuc-revistas.com/bin/info.php and looks like this:


<?php
// Test PDF and extractor wrapper used by the checks below
$pdf = '/hsphere/local/home/revistasiuc/iuc-revistas.com/bin/test.pdf';
$extractor = '/hsphere/local/home/revistasiuc/bin/pdftotext/pdftotext';

echo 'Safe mode status: ';
echo ini_get('safe_mode') ? 'ON' : 'OFF';

echo '<br />Mime type using mime_content_type function: ';
echo mime_content_type($pdf);

echo '<hr />Output from pdftotext:';
echo '<pre>';
// Run the extractor and print all of its output, not only the first line
$fp = popen($extractor . ' ' . escapeshellarg($pdf), 'r');
while ($fp && ($line = fgets($fp, 4096)) !== false) {
    echo htmlspecialchars($line);
}
if ($fp) pclose($fp);
echo '</pre>';


pdftotext is actually a wrapper that invokes the real program:

/hsphere/local/home/revistasiuc/bin/pdftotext/pdftotext.script -enc UTF-8 -nopgbrk "$1" -
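
For comparison, a slightly more defensive version of such a wrapper might look like the sketch below. It is only an illustration, not the script actually installed: the function name extract_text and the REAL_PDFTOTEXT variable are made up for this example, and the only parts taken from the setup above are the path and the pdftotext options. It checks that a readable file was passed before calling the real extractor, so a bad argument produces an error message instead of silent empty output.

```shell
#!/bin/sh
# Hypothetical wrapper sketch. REAL_PDFTOTEXT and extract_text are
# illustrative names, not part of OJS or pdftotext.
REAL_PDFTOTEXT="${REAL_PDFTOTEXT:-/hsphere/local/home/revistasiuc/bin/pdftotext/pdftotext.script}"

extract_text() {
    # Refuse to run without a readable input file
    if [ -z "$1" ] || [ ! -r "$1" ]; then
        echo "usage: extract_text FILE.pdf" >&2
        return 2
    fi
    # Quote the path so file names with spaces stay a single argument;
    # the trailing "-" sends the extracted text to standard output
    "$REAL_PDFTOTEXT" -enc UTF-8 -nopgbrk "$1" -
}

# In a real wrapper file the last line would call: extract_text "$1"
```

Running the wrapper by hand as the web server user, with a known PDF as its argument, would show whether the problem is in the wrapper or further up in OJS.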

I set the PDF index entry in the [search] section of the configuration file (config.inc.php) this way:

index[application/pdf] = "/hsphere/local/home/revistasiuc/bin/pdftotext/pdftotext %s"
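
To check an entry like this by hand, one option is to fill in the %s placeholder yourself and run the resulting command from a shell. The sketch below assumes OJS simply substitutes the file path for %s; I have not confirmed exactly how OJS quotes or escapes the path. build_index_command is a hypothetical helper written for this example, not an OJS function.

```shell
#!/bin/sh
# Hypothetical helper, not part of OJS: fills the %s placeholder of an
# index[...] command template with a single-quoted file path.
build_index_command() {
    template=$1
    file=$2
    # The template is deliberately used as the printf format string, so it
    # must contain exactly one %s and no other % or backslash sequences.
    # Paths containing a single quote are not handled by this sketch.
    printf "$template" "'$file'"
}

# Example (paths taken from the post above):
# cmd=$(build_index_command \
#     "/hsphere/local/home/revistasiuc/bin/pdftotext/pdftotext %s" \
#     "/hsphere/local/home/revistasiuc/iuc-revistas.com/bin/test.pdf")
# sh -c "$cmd" | head
```

If the command printed by the helper returns text when run as the web server user, the extractor side is probably fine and the problem lies elsewhere.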

Nevertheless, the PDF documents are not being indexed.

When I run the rebuildSearchIndex tool on my Windows box, against a local replica of our site, I can recreate the index for the article files, but I can't get it to work on the web installation or at publication time.

Is there anything I am missing here?

Thanks in advance,

António Lopes
