Bug 8815 - OAI + Mod_Rewrite + getRequestPath() + $_SERVER['PATH_INFO'] + getBaseUrl()
Status: NEW
Product: OJS
Classification: Unclassified
Component: Framework
To be determined
Platform: All All
Importance: P3 normal
Assigned To: PKP Support
Reported: 2014-06-19 20:12 PDT by QUT Library eServices
Modified: 2014-08-10 19:45 PDT
Version Reported In: 2.3.7

Screenshot of Request Url + Base Url reporting (191.87 KB, image/jpeg)
2014-06-19 20:12 PDT, QUT Library eServices

Description QUT Library eServices 2014-06-19 20:12:49 PDT
Created attachment 4034: Screenshot of Request Url + Base Url reporting

SUMMARY : [PKPRequest.inc.php]
Request::getRequestUrl() / Request::getRequestPath() do not honour base_url[jld] when restful_urls is On.

CONFIG SETTINGS : [config.inc.php]
base_url = http://www.jld.edu.au
base_url[jld] = http://www.jld.edu.au
disable_path_info = Off 
restful_urls = On
force_ssl = Off
repository_id = "jld.www.jld.edu.au"

httpd.conf RewriteRule : RewriteRule ^(.*)$ /index.php/jld$1 [QSA,L]

Separate instances of OJS on same server using virtualhost configuration
The system(s) work using this configuration

3.5 Register Journal for Indexing (Metadata Harvesting)
>>> To register, you will need the base URL for your repository: https://www.jld.edu.au/oai <<< [This is correct] 

http://www.jld.edu.au/oai >> displays the expected page, e.g. a badVerb error
http://www.jld.edu.au/oai?verb=Identify  >> works as do the other links

Request URL >> https://www.jld.edu.au/jld/oai
Base    URL >> https://www.jld.edu.au/jld/oai
See attachment (OAI-Issue-1.jpg)
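
The two values above can be read out of an OAI-PMH Identify response, where the protocol carries them in the `request` and `baseURL` elements. The sketch below is only an illustration in Python: the sample XML is hand-written and trimmed to those two elements (it is not the server's actual output), and `reported_urls` and `has_journal_prefix` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written, trimmed sample shaped like an Identify response.
# Only the two URLs come from the report; this is not real server output.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">https://www.jld.edu.au/jld/oai</request>
  <Identify>
    <baseURL>https://www.jld.edu.au/jld/oai</baseURL>
  </Identify>
</OAI-PMH>
"""


def reported_urls(identify_xml):
    """Return (request URL, base URL) as reported in an Identify response."""
    root = ET.fromstring(identify_xml)
    request_url = root.find("oai:request", OAI_NS).text
    base_url = root.find("oai:Identify/oai:baseURL", OAI_NS).text
    return request_url, base_url


def has_journal_prefix(url, journal_path):
    """True if the URL path starts with '/<journal_path>/'."""
    return urlparse(url).path.startswith("/" + journal_path + "/")


if __name__ == "__main__":
    for url in reported_urls(SAMPLE_IDENTIFY):
        print(url, "has /jld/ prefix:", has_journal_prefix(url, "jld"))
```

A check like this makes the symptom easy to see: the registration page advertises .../oai, while the URLs reported in the response carry the extra /jld segment.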

CAUSE : [PKPRequest.inc.php]
The request URL built from Request::getRequestPath() is derived from $_SERVER['PATH_INFO'], which holds the path as rewritten by mod_rewrite in httpd.conf (i.e. including /jld) rather than the path the client actually requested.
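
To make the cause concrete, here is a rough Python model of the chain described above. It is a simplified sketch, not the code in PKPRequest.inc.php: it assumes the quoted rewrite rule is the one in effect and that the URL is assembled from the host plus PATH_INFO. All function names are hypothetical.

```python
import re

SCRIPT_NAME = "/index.php"


def apply_restful_rewrite(request_uri):
    """Model `RewriteRule ^(.*)$ /index.php/jld$1` from the report."""
    match = re.match(r"^(.*)$", request_uri)
    return "/index.php/jld" + match.group(1)


def path_info(rewritten_uri):
    """Return the part after the script name, as PATH_INFO would hold it."""
    if rewritten_uri.startswith(SCRIPT_NAME):
        return rewritten_uri[len(SCRIPT_NAME):]
    return ""


def naive_request_url(scheme, host, rewritten_uri):
    """Build a URL from host + PATH_INFO, ignoring any base_url override."""
    return f"{scheme}://{host}{path_info(rewritten_uri)}"


if __name__ == "__main__":
    rewritten = apply_restful_rewrite("/oai")
    print(rewritten)  # /index.php/jld/oai
    print(naive_request_url("https", "www.jld.edu.au", rewritten))
```

In this model the client asks for /oai, but the URL rebuilt from PATH_INFO ends in /jld/oai, which matches the values reported above.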

There appears to be no code in OJS that specifically handles RESTful URLs, apart from a flag that is used as switch logic to determine which part(s) of the request URL are returned.
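
For illustration only, the kind of handling that seems to be missing could look roughly like the Python sketch below: when a journal has its own base_url override, the leading journal segment is dropped from the path before the URL is assembled. This is an assumption about one possible approach, not OJS code or a proposed patch; `request_url` and its parameters are hypothetical.

```python
def request_url(path_info, journal_path, default_base, overrides):
    """Assemble a request URL, honouring a per-journal base_url override.

    overrides maps journal paths to base URLs, mirroring the
    base_url[jld] style of setting in config.inc.php.
    """
    base = overrides.get(journal_path)
    if base is None:
        # No override: keep the journal segment in the path.
        return default_base.rstrip("/") + path_info

    prefix = "/" + journal_path
    # Strip '/<journal_path>' only when it is a whole leading segment.
    if path_info == prefix or path_info.startswith(prefix + "/"):
        path_info = path_info[len(prefix):]
    return base.rstrip("/") + path_info


if __name__ == "__main__":
    overrides = {"jld": "http://www.jld.edu.au"}
    print(request_url("/jld/oai", "jld", "http://www.jld.edu.au", overrides))
```

With the override present, /jld/oai maps to http://www.jld.edu.au/oai; without it, the journal segment is left in place.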

Later versions, up to the OJS 3 alpha release, have not changed the Request::getRequestPath() code in [PKPRequest.inc.php], and the lack of similar reports in the forum suggests that this mod_rewrite path issue is relatively unique to our installation.

Comment 1 Clinton Graham 2014-07-17 12:26:22 PDT
Can you provide more detail on your virtualhost setup in apache, and any effective re-write rules?  I'm having trouble duplicating this issue.

I assume you're using OJS 2.4.4-1 ?
Comment 2 QUT Library eServices 2014-07-22 20:06:14 PDT
We are running Version 2.3.7 

The rewrite rule is in the original post:
httpd.conf RewriteRule : RewriteRule ^(.*)$ /index.php/jld$1 [QSA,L]

Separate instances of OJS on same server using virtualhost configuration
The system(s) work using this configuration

Vhost Conf file:
# Journal of Learning Design : Production Host
# MG 20120717
# 2014-06-24 : MK : remove cronolog

    DocumentRoot /www/www.jld.edu.au/webroot
    ServerName www.jld.edu.au
    ServerAlias jld.edu.au
    RewriteEngine On

    # Rewrite OAI-PMH requests
    RewriteRule ^/(jld)/oai /index.php/$1/oai [QSA,L]

    # Redirect all other traffic to https
    RewriteCond %{HTTPS} !^on$
    # ... but don't redirect requests to /lib/* so that user agents
    # can load the XSL for the OAI-PMH response
    RewriteCond %{REQUEST_URI} !^/lib/
    RewriteRule (./*) https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]



    SSLEngine on

    DocumentRoot /www/www.jld.edu.au/webroot
    ServerName www.jld.edu.au
    ServerAlias jld.edu.au

    DocumentRoot /www/www.jld.edu.au/webroot
    ServerName www.jld.edu.au
    ServerAlias jld.edu.au
    RewriteEngine On

    # Redirect requests for root to default journal homepage
    RewriteCond %{REQUEST_URI} =/
    RewriteRule ^.*$ /index

    # Rewrite RESTful site admin URLs
    # don't redirect real files
    # don't redirect real folders
    # only paths starting in '/index/'
    RewriteCond %{REQUEST_URI} ^/index/
    RewriteRule ^(.*)$ /index.php/$1 [QSA,L]

    # Rewrite requests for RESTful URLs to full URLs,
    # e.g., /about to /index.php/jld/about
    # don't redirect real files
    # don't redirect real folders
    # don't redirect SPEP URLs
    RewriteCond %{REQUEST_URI} !^(/spep/|/index\.php)
    RewriteRule ^(.*)$ /index.php/jld$1 [QSA,L]

# The old journal site, www.jld.qut.edu.au, is maintained here to
# support continued redirection.

    DocumentRoot /www/www.jld.edu.au/webroot
    ServerName www.jld.qut.edu.au

    RewriteEngine On

    # About
    RewriteRule ^/about/contact.jsp https://www.jld.edu.au/about/contact [R=301,L]
    RewriteRule ^/about/ https://www.jld.edu.au/about [R=301,L]

    # Issues
    RewriteRule ^/publications/vol1no1/ https://www.jld.edu.au/issue/view/2 [R=301,L]
    RewriteRule ^/publications/vol1no2/ https://www.jld.edu.au/issue/view/5 [R=301,L]
    RewriteRule ^/publications/vol1no3/ https://www.jld.edu.au/issue/view/6 [R=301,L]
    RewriteRule ^/publications/vol2no1/ https://www.jld.edu.au/issue/view/7 [R=301,L]
    RewriteRule ^/publications/vol2no2/ https://www.jld.edu.au/issue/view/8 [R=301,L]
    RewriteRule ^/publications/vol3no1/ https://www.jld.edu.au/issue/view/9 [R=301,L]
    RewriteRule ^/publications/vol3no2/ https://www.jld.edu.au/issue/view/10 [R=301,L]
    RewriteRule ^/publications/vol3no3/ https://www.jld.edu.au/issue/view/11 [R=301,L]
    RewriteRule ^/publications/vol4no1/ https://www.jld.edu.au/issue/view/12 [R=301,L]
    RewriteRule ^/publications/vol4no2/ https://www.jld.edu.au/issue/view/13 [R=301,L]
    RewriteRule ^/publications/vol4no3/ https://www.jld.edu.au/issue/view/14 [R=301,L]
    RewriteRule ^/publications/vol4no4/ https://www.jld.edu.au/issue/view/15 [R=301,L]
    RewriteRule ^/publications/ https://www.jld.edu.au/issue/archive [R=301,L]

    # Other key pages
    RewriteRule ^/editorial/ https://www.jld.edu.au/about/displayMembership/1 [R=301,L]
    RewriteRule ^/international/ https://www.jld.edu.au/about/displayMembership/2 [R=301,L]
    RewriteRule ^/author/ https://www.jld.edu.au/information/authors [R=301,L]
    RewriteRule ^/copyright/ https://www.jld.edu.au/about/submissions#copyrightNotice [R=301,L,NE]

    # Catch any un-rewritten URLs
    RewriteRule ^/ https://www.jld.edu.au/index [R=301,L]


Please advise if any other information is required
Comment 3 QUT Library eServices 2014-07-22 20:17:15 PDT
Also Apache : 2.2.3
Linux : Red Hat Enterprise Linux Server release 5.10 (Tikanga)
Comment 4 Clinton Graham 2014-07-23 07:33:26 PDT
2.3.7 is a really old version.  Have you considered an upgrade to 2.4?  While Request::getRequestPath() has not changed, it depends on Request::getBasePath(), which has.  Specifically, there were a number of changes attached to Bug #8014 that might end up being relevant.

As an aside, I don't think a rewrite rule applied in httpd.conf outside of a VirtualHost context with matching VirtualHost directives will be applied, so I think only your rewrites from the Vhost conf file are effective.  Note that this re-writes your OAI-PMH requests to HTTPS, contrary to your documentation.
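
The effect of the first block of rules in the Vhost conf file (apparently the plain-HTTP host) can be modelled with a short Python sketch. It assumes the three directives are evaluated in the order quoted and that [L] ends processing; `handle_http_request` is a hypothetical name and `/lib/example.xsl` a made-up path.

```python
import re


def handle_http_request(request_uri, server_name="www.jld.edu.au", https=False):
    """Return (action, target) for the rules quoted in the Vhost conf file."""
    # RewriteRule ^/(jld)/oai /index.php/$1/oai [QSA,L]
    match = re.search(r"^/(jld)/oai", request_uri)
    if match:
        return ("rewrite", f"/index.php/{match.group(1)}/oai")

    # RewriteCond %{HTTPS} !^on$
    # RewriteCond %{REQUEST_URI} !^/lib/
    if not https and not re.search(r"^/lib/", request_uri):
        # RewriteRule (./*) https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]
        if re.search(r"(./*)", request_uri):
            return ("redirect", f"https://{server_name}{request_uri}")

    # No rule applied: the request is served unchanged.
    return ("pass", request_uri)


if __name__ == "__main__":
    for uri in ("/oai", "/jld/oai", "/lib/example.xsl"):
        print(uri, "->", handle_http_request(uri))
```

Under these assumptions a plain-HTTP request for /oai never reaches the OAI rewrite and is answered with a 301 to https, while /jld/oai is rewritten internally and /lib/ paths are left alone.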
Comment 5 Clinton Graham 2014-08-04 13:07:35 PDT
I used your provided configuration to test against OJS 2.4.4, both with and without the proposed patch in #8797.  Both tests were successful, returning OAI-PMH base URLs without the "/jld/" root.

Have you been able to test version 2.4?
Comment 6 Alec Smecher 2014-08-05 10:20:38 PDT
(See also bug #8773, which may resolve this issue.)
Comment 7 QUT Library eServices 2014-08-10 19:45:43 PDT
Thanks for the updates. I haven't tested with 2.4 yet. We haven't planned the 2.4 upgrade yet, so I assume we'll be running 2.3.7 for at least another few months. I'll have a look at the patches from bug #8773 and see whether they offer a quick fix.