PKP Bugzilla – Bug 7166
Investigate web cache file name generation algorithm
Last modified: 2013-01-02 13:45:02 PST
It seems like the method used to generate web cache files is not working properly -- one journal using the feature has generated over 250k cache files.
From what I can tell, the caching is working fine and can't be optimized any further (but I'm hardly an expert with this code). The only problem I see is that expired cache files just hang around until they are regenerated, which may never happen. This might be a good candidate for a scheduled task (periodic cleanup of cache files older than web_cache_hours). Though I think only megajournals (like the one I was working with) will really see major disk usage from web caching.
Even for a megajournal, it does seem like there were an unreasonable number of files there. In any case, I think a periodic cleanup as you describe would be helpful.
(Started logging the bd install in /tmp/wc.log; see lib/pkp/classes/core/PKPPageRouter.inc.php for an error_log statement. Will check back after some time has elapsed to log weird boundary cases. Suspect there are 404s getting logged or something similar that causes the potential URL space to be limitless.)
Added note for web_cache CRON setup https://github.com/pkp/ojs/commit/76de03f284fb1fb3dd69282e1ecaf840e0ad629f
Added note for web_cache CRON setup https://github.com/pkp/ojs/commit/8858f0686a6bef69d938ce3a7b3fcf2ef34794ed
This looks to be working OK to me too. For journals that use webcache, I'm installing a crontab line that reads:

@daily find ~/ojs/cache -maxdepth 1 -name wc-\*.html -mtime +1 -exec rm "{}" ";"

This removes stale cache files once a day and solves the current cache maintenance headaches. (find counts file age in whole days, so -mtime +1 matches files last modified at least 48 hours ago, not 24.) I've installed this in all accounts using web_cache on lib-journals[x]. I also added a note in config.TEMPLATE.inc.php to help others with this config.
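For anyone who would rather tie the cleanup to web_cache_hours than to whole days, here is a minimal sketch of the same idea as a shell function. The function name purge_web_cache and its two arguments are hypothetical, not part of OJS. It assumes the cache files follow the wc-*.html pattern from the crontab line above, and it relies on -maxdepth and -mmin, which GNU and BSD find provide but POSIX does not.

```shell
#!/bin/sh
# Hypothetical helper (not part of OJS): delete web cache files older
# than a given number of hours.
#   $1 = cache directory (e.g. ~/ojs/cache)
#   $2 = maximum age in hours (e.g. the web_cache_hours setting)
purge_web_cache() {
    cache_dir=$1
    max_age_hours=$2
    # find's -mmin works in minutes, so convert hours to minutes.
    max_age_minutes=$((max_age_hours * 60))
    # Only regular files directly inside the cache directory that match
    # wc-*.html are considered; everything else is left alone.
    find "$cache_dir" -maxdepth 1 -type f -name 'wc-*.html' \
        -mmin "+$max_age_minutes" -exec rm -f {} +
}
```

A call such as purge_web_cache ~/ojs/cache 24 could be placed in a small script and run from cron in place of the one-liner. Quoting the pattern as 'wc-*.html' has the same effect as the backslash-escaped form in the crontab line.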
Typo https://github.com/pkp/ojs/commit/f7064eb50793d3033f2ed798fb0e7cf67f45f274
Typo https://github.com/pkp/ojs/commit/348dbd5d989bb7d4a92b0eff71b49bcf209a94c1
Typo https://github.com/pkp/omp/commit/1b2c6c638a873eff46e829d59f224e5c5a51352c
Added note for web_cache CRON setup https://github.com/pkp/omp/commit/ae699c0b9df3f476b89a2a847e89a7b51d0bf8d8