PDF to text problem

Forum rules
The Public Knowledge Project Support Forum is moving to http://forum.pkp.sfu.ca

This forum will be maintained permanently as an archived historical resource, but all new questions should be added to the new forum. Questions will no longer be monitored on this old forum after March 30, 2015.

Postby carror » Sun Aug 25, 2013 4:41 pm

I have read all the information about the PDF TO TEXT command, and after 2 days of studying it I am still getting an error message!

Before I give up I will give it a try here in the forum. This is what I have done on my server:

a) found the directory where all my pages are

/files/journals/0/files-pdf/

b) I am running the command to convert the PDF to text:

/files/journals/0/files-pdf/ UTF-8-nopgbrk% s - | /usr/bin/tr '[: cntrl:]' ''

After that nothing happens, but the following appears in the log:

Could not write the PDF on remote host http://www.sitepor500.com.br
Check the permission table

So what? What does this mean? The remote host has permission 777, which allows reading and writing! What could be the problem?
carror
 
Posts: 2
Joined: Sun Aug 25, 2013 4:35 pm

Re: PDF to text problem

Postby JasonNugent » Mon Aug 26, 2013 6:16 am

Hi carror,

The pdf2text command doesn't actually convert a PDF file into a text file. It just extracts the text from it. OJS uses the command to build search indexes. What exactly are you trying to do with the command?

The command you typed:

/files/journals/0/files-pdf/ UTF-8-nopgbrk% s - | /usr/bin/tr '[: cntrl:]' ''


is missing the actual path to pdf2text. Is this exactly what you typed?

regards,
Jason
JasonNugent
Site Admin
 
Posts: 910
Joined: Tue Jan 10, 2006 6:20 am
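
Jason's point about the missing path can be illustrated with a minimal sketch. It assumes the extraction tool is pdftotext (from Poppler or Xpdf, referred to above as pdf2text), looks it up with `command -v` rather than guessing its location, and replaces control characters with a single space in the `tr` stage. The function names `strip_control` and `extract_pdf_text` are hypothetical and are not part of OJS.

```shell
# Hypothetical helper: turn control characters (tabs, newlines, form feeds)
# into spaces, like the tr stage of the command quoted above.
strip_control() {
    tr '[:cntrl:]' ' '
}

# Hypothetical helper: write the text of one PDF to stdout.
# Assumes pdftotext is installed; its location varies between systems,
# so it is resolved from PATH instead of being hard-coded.
extract_pdf_text() {
    pdftotext_bin=$(command -v pdftotext) || {
        echo "pdftotext not found in PATH" >&2
        return 1
    }
    # "-" as the output file sends the extracted text to stdout.
    "$pdftotext_bin" -enc UTF-8 -nopgbrk "$1" - | strip_control
}
```

Something like `extract_pdf_text article.pdf > article.txt` (file names made up for illustration) would then save the extracted text. The key difference from the command quoted above is that the pipeline starts with an executable, not with a directory.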

Re: PDF to text problem

Postby carror » Mon Aug 26, 2013 6:47 pm

Friend,

The command I used is an old one my brother gave me to crawl PDF files and put their content into a protected directory, so that users can search for words in it. The protected directory is loaded into a database daily so I can run queries.

I called my brother and he told me that whenever he ran that command, files like "01file-pdf.txt" were saved in the destination folder, containing only the plain-text content. After I asked for his help, he solved my problem! So I will share the solution in case someone has the same problem:

If you export the content to a directory, it must be in the same domain and must have the 777 permission. I was exporting to another domain (my FTP server) and had not granted write permission. So, right after I changed the domain to "localhost", the files "01file-pdf.txt" started showing up!

I would like to thank Jason for trying to help me since nobody else did!
carror
 
Posts: 2
Joined: Sun Aug 25, 2013 4:35 pm
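
The fix carror describes comes down to the export directory being local and writable by the user who runs the command. A minimal sketch of a check that could be run before exporting follows. The function name `can_write_dir` and the example path are hypothetical.

```shell
# Hypothetical check: succeeds only if the argument is an existing local
# directory that the current user is allowed to write to.
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Example use (the path is made up for illustration):
# can_write_dir /path/to/export || echo "cannot write to export directory" >&2
```

The check only tells you whether the current user can write to the directory. It does not change any permissions.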

