Search not via PDFtoText

Postby pashton » Mon Sep 03, 2007 5:28 pm

I would love a feature that allows full-text searching on a system where pdftotext cannot (or, I should say, will not) be installed.

Ideally, I could upload an HTML galley with a tick box that hides it from public view, or simply upload a text/HTML file that is used only for indexing.

P.S. I have seen the hack that grays out the HTML galley, but I am planning to bring a number of journals into the OJS framework, and hacks are undesirable when dealing with that many journals.


Paul Ashton

Re: Search not via PDFtoText

Postby JasonNugent » Tue Sep 04, 2007 5:26 am


A while ago, we implemented something like what you want because we were having pdftotext issues. It involves creating a text file containing the full text of the PDF, which is uploaded as a hidden file and handed to the indexer for searching.

I'm attaching a unified diff to my post, which contains the code. It's a diff against version 2.1.0 of OJS, and it may or may not apply cleanly against the latest branch. We've since moved away from it because our pdftotext issue has been resolved, and we now use pdftotext instead.
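
For anyone who just wants the general idea without reading the diff, here is a rough, language-neutral sketch of the approach described above: a plain-text companion file stands in for the PDF, and only that text is handed to the indexer. This is not the code from the diff and not OJS code (OJS is PHP); the names `SidecarIndex`, `index_sidecar`, `search`, and `tokenize` are made up for illustration, and the sample texts are placeholders.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SidecarIndex:
    """Toy inverted index (hypothetical): keyword -> set of article ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index_sidecar(self, article_id, sidecar_text):
        # The hidden text file replaces the output pdftotext would have
        # produced; readers never see it, only the indexer does.
        for word in tokenize(sidecar_text):
            self.postings[word].add(article_id)

    def search(self, query):
        # Return the articles whose hidden text contains every query word.
        words = tokenize(query)
        if not words:
            return set()
        results = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            results &= self.postings.get(word, set())
        return results


index = SidecarIndex()
index.index_sidecar(1, "Full text of the first PDF, about open access publishing.")
index.index_sidecar(2, "Full text of the second PDF, about peer review.")

print(index.search("open access"))  # only article 1 contains both words
print(index.search("full text"))    # both articles contain both words
```

The point of the design is that the galley readers download (the PDF) and the text that gets indexed are separate files, so nothing on the server has to be able to read PDFs.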


