Fetch archive metadata with Proxy

evadelhi
Posts: 2
Joined: Fri Nov 19, 2010 3:49 am

Postby evadelhi » Fri Nov 19, 2010 4:13 am

Hello,

I am working on a project in Spain to publish all the scientific journals of a public research organization. We are building a web page where anyone can search the published research articles. Everything works in our lab environment, but when we launch the site on the production server, the harvesting itself fails.
When we try to fetch archive metadata, it does not work. On the beta testing server we use a public IP address and have no problems. I think the main problem on the production server is that it sits behind a proxy server. Thanks to your forum, I discovered that I needed to modify this part of config.inc.php:

; The HTTP proxy configuration to use
http_host = 10.20.5.72
http_port = 8080
; proxy_username = username
; proxy_password = password

I have already done that, but without success. While debugging, I ran "Fetch archive metadata" and got an "Undeclared identity warning" at the top left, and at the bottom of the page I had:

Loaded locale file "plugins/generic/ipban/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/generic/pkpdc/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/harvesters/oai/locale/en_US/locale.xml" from XML
Loaded locale list "registry/locales.xml" from XML
Load schema map "registry/schemaMap.xml" from XML
Loaded locale file "plugins/preprocessors/typemap/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/preprocessors/languagemap/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/preprocessors/regex/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/schemas/dc/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/schemas/mods/locale/en_US/locale.xml" from XML
Loaded locale file "plugins/schemas/marc/locale/en_US/locale.xml" from XML
Loaded locale file "locale/en_US/locale.xml" from XML

Moreover, I updated the metadata index and obtained:

An unknown header tag was received from the server: html
An unknown header tag was received from the server: head
An unknown header tag was received from the server: meta
An unknown header tag was received from the server: link
An unknown header tag was received from the server: title
An unknown header tag was received from the server: script
An unknown header tag was received from the server: body
An unknown header tag was received from the server: div
An unknown header tag was received from the server: h1
An unknown header tag was received from the server: a
An unknown header tag was received from the server: h2
An unknown header tag was received from the server: ul
An unknown header tag was received from the server: li
An unknown header tag was received from the server: span
An unknown header tag was received from the server: form
An unknown header tag was received from the server: label
An unknown header tag was received from the server: select
An unknown header tag was received from the server: option
An unknown header tag was received from the server: input
An unknown header tag was received from the server: br
An unknown header tag was received from the server: img
An unknown header tag was received from the server: p
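
The tags listed above (html, head, body, form, and so on) suggest that the harvester was handed an HTML page rather than an OAI-PMH XML document. A minimal sketch of how a captured response body could be checked is shown here. It is written in Python for illustration and is not part of the Harvester; the function name `classify_response` is hypothetical, and the only protocol fact it relies on is the OAI-PMH 2.0 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the root <OAI-PMH> element in an OAI-PMH 2.0 response.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def classify_response(body: bytes) -> str:
    """Return 'oai', 'html', or 'other' for a captured response body.

    Hypothetical helper for diagnosis only.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML. Most real-world HTML pages end up here,
        # so fall back to a crude look at the start of the body.
        head = body[:512].lower()
        if b"<html" in head or b"<!doctype html" in head:
            return "html"
        return "other"

    if root.tag == f"{{{OAI_NS}}}OAI-PMH":
        return "oai"

    # Drop any namespace prefix and compare the local name (covers XHTML).
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    return "html" if local_name == "html" else "other"
```

If a body saved from the production server is classified as "html", the page itself (for example a proxy login or error page) is the thing to read next.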


I have been working on this for a long time.
I would be very grateful if you could help me.

Sincerely yours,

Eva

evadelhi
Posts: 2
Joined: Fri Nov 19, 2010 3:49 am

Re: Fetch archive metadata with Proxy

Postby evadelhi » Thu Nov 25, 2010 7:20 am

To continue:
Since my last post, the customer decided to set allow_url_fopen to On, and it is working now. However, I have read that this is not recommended because it is a security vulnerability.
The customer also sent me the network topology and asked whether it is OK to work with this type of topology. I have attached it as well.
We have a test server where the Harvester works perfectly; the difference is that it has a public IP and no proxy. I sniffed our traffic (htmlxmlresponse.jpg), and they sniffed theirs before setting allow_url_fopen to On. According to the results, which I have also attached, it seems the proxy (IP: 10.20.5.72, port: 8080) is not sending a properly formatted XML file to the server (htmltextresponse.jpg).
Attachments
networktopology.JPG (78.24 KiB)
htmlxmlresponse.JPG (284.51 KiB)
htmltextresponse.JPG (353.92 KiB)

asmecher
Posts: 10015
Joined: Wed Aug 10, 2005 12:56 pm

Re: Fetch archive metadata with Proxy

Postby asmecher » Mon Nov 29, 2010 1:02 pm

Hi Eva,

I suspect your proxy server is sending back a notice of some kind (maybe about authentication, or something similar) rather than the XML OAI response that the Harvester is expecting. Can you capture that response and see what it contains? You can do this through network sniffing tools, or perhaps your proxy log, and if that doesn't help, then let me know and I'll suggest how you can do it with a PHP modification.
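
For readers without a packet sniffer, one way to capture that response is a small script run on the production server. The following is a minimal sketch in Python, not the PHP modification mentioned above and not part of OHS. The proxy address is the one from the configuration earlier in this thread; `BASE_URL` is a placeholder for the archive's OAI endpoint, and the function names are hypothetical. It assumes the proxy needs no credentials; a 407 reply would indicate otherwise.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode

PROXY = "http://10.20.5.72:8080"      # proxy from the thread's config.inc.php
BASE_URL = "http://example.org/oai"   # placeholder: the archive's OAI base URL


def identify_url(base_url: str) -> str:
    """Build an OAI-PMH Identify request, the simplest request to test with."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def fetch_via_proxy(url: str, proxy: str = PROXY, timeout: float = 30):
    """Fetch url through the HTTP proxy.

    Returns (status, content_type, body). Error replies such as
    407 Proxy Authentication Required are returned, not raised, so the
    page the proxy sent back can be inspected.
    """
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", ""), resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Content-Type", ""), err.read()
```

Calling `fetch_via_proxy(identify_url(BASE_URL))` and printing the status, content type, and the first few hundred bytes of the body should show whether the proxy returned OAI XML or an HTML notice.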

Also, what version of OHS (as the Harvester is now called) are you using?

Regards,
Alec Smecher
Public Knowledge Project Team

