
Bug 7166 - Investigate web cache file name generation algorithm
Product: OJS
Classification: Unclassified
Component: General
Hardware: All All
Importance: P3 normal
Assigned To: Alec Smecher
Reported: 2012-02-21 17:10 PST by Matthew Crider
Modified: 2013-01-02 13:45 PST

Description Matthew Crider 2012-02-21 17:10:58 PST
It seems like the method used to generate web cache files is not working properly -- one journal using the feature has generated over 250k cache files.
Comment 1 Matthew Crider 2012-02-23 11:27:22 PST
From what I can tell, the caching is working fine and can't be optimized any further (but I'm hardly an expert with this code).  The only problem I see is that expired cache files just hang around until they are regenerated, which may never happen.  This might be a good candidate for a scheduled task (periodic cleanup of cache files older than web_cache_hours).  Though I think only megajournals (like the one I was working with) will really see major disk usage from web caching.
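A minimal sketch of the periodic cleanup suggested above, assuming cache files are named wc-*.html and sit directly in the cache directory. The function name purge_web_cache and its arguments are hypothetical and not part of OJS; the age argument is assumed to mirror the web_cache_hours setting. Note that -maxdepth and -mmin are GNU/BSD find extensions rather than POSIX.

```shell
#!/bin/sh
# Hypothetical helper (not part of OJS): delete web cache files older than
# a given number of hours. Assumes cache files match wc-*.html and live
# directly in the cache directory.
purge_web_cache() {
    cache_dir=$1      # e.g. ~/ojs/cache
    max_age_hours=$2  # assumed to mirror the web_cache_hours setting
    # -mmin takes minutes; +N matches files modified more than N minutes ago.
    find "$cache_dir" -maxdepth 1 -type f -name 'wc-*.html' \
        -mmin "+$((max_age_hours * 60))" -exec rm -f {} +
}
```

A scheduled job could then call, for example, purge_web_cache ~/ojs/cache 24.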
Comment 2 Alec Smecher 2012-02-23 11:46:05 PST
Even for a megajournal, it does seem like there were an unreasonable number of files there.

In any case, I think a periodic cleanup as you describe would be helpful.
Comment 3 Alec Smecher 2013-01-02 12:01:30 PST
(Started logging the bd install in /tmp/wc.log; see lib/pkp/classes/core/PKPPageRouter.inc.php for an error_log statement. Will check back after some time has elapsed to log weird boundary cases. Suspect there are 404s getting logged or something similar that causes the potential URL space to be limitless.)
Comment 4 Alec Smecher 2013-01-02 13:30:02 PST
Added note for web_cache CRON setup
Comment 5 Alec Smecher 2013-01-02 13:30:02 PST
Added note for web_cache CRON setup
Comment 6 Alec Smecher 2013-01-02 13:40:30 PST
This looks to be working OK to me too. For journals that use web_cache, I'm adding the following line to their crontabs:

@daily find ~/ojs/cache -maxdepth 1 -name wc-\*.html -mtime +1 -exec rm "{}" ";"

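This runs once a day and removes cache files more than a day old, which should solve the current cache maintenance headaches. (Strictly, find discards fractional days when computing file age, so -mtime +1 matches files last modified at least 48 hours ago.)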

I've installed this in all accounts using web_cache on lib-journals[x].

I also added a note in config.TEMPLATE.inc.php to help others with this config.
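Before installing a cleanup line like this on another server, it may help to see what it would remove. The following is a hedged dry-run sketch that uses the same find tests as the cron line but prints the matches instead of deleting them. The function name preview_web_cache_cleanup is hypothetical and not part of OJS.

```shell
#!/bin/sh
# Hypothetical dry run (not part of OJS): list the files the cron line
# would delete, without deleting anything.
preview_web_cache_cleanup() {
    cache_dir=$1  # e.g. ~/ojs/cache
    # Same tests as the cron line, but -print replaces -exec rm.
    find "$cache_dir" -maxdepth 1 -name 'wc-*.html' -mtime +1 -print
}
```

For example, preview_web_cache_cleanup ~/ojs/cache | wc -l reports how many files the next cron run would remove.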
Comment 10 Alec Smecher 2013-01-02 13:45:02 PST
Added note for web_cache CRON setup