PKP Bugzilla – Bug 6835
published_articles 'views' count inconsistent
Last modified: 2012-09-21 14:55:05 PDT
It appears that the views count in the published_articles table is updated inconsistently. In some cases in my local install, I'm seeing a single galley view add two increments to an article view count, and in other cases I'm not seeing any increment at all. Also, there doesn't seem to be any relation between what you see in published_articles.views and article_galleys.views, even taking into consideration that published_articles.views *should* include abstract counts, while article_galleys.views *doesn't* include abstract counts and separates views distinctly by galley type.

I've been doing a bit of testing on nepjol for this. Here are the results of two queries:

SELECT article_id, views FROM article_galleys ORDER BY views DESC LIMIT 10;

article_id  views
1362        78926
290         18900
2088        17312
247         12889
34          12222
484         11277
491         9960
2693        9684
37          9676
43          9562

SELECT article_id, views FROM published_articles ORDER BY views DESC LIMIT 10;

article_id  views
4100        19830
4099        9718
1029        6657
1031        5164
159         3972
1363        3426
596         3388
1034        2163
1618        2156
34          2059

Note a few things:

1) Nepjol only publishes PDFs, so you don't have to worry about adding different galley view counts together to approximate a total close to the published article view count;
2) even though IIRC the published article view count should include abstract views, in this case the view counts are actually LOWER across the board for published_articles;
3) the article_ids don't match up, making me think that one or the other view count algorithm is markedly inconsistent, not just incorrect -- if it were merely incorrect (e.g. inflated or deflated one way or another), you'd probably see a similar article_id order.

From what little testing I've done, my suspicion is that the article_galleys.views counts are correct, but that something's borked with the published_articles.views count algorithm. I can provide a DB dump if need be; alternatively, nepjol is probably a good place to take a look.
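Sorting each table separately makes the two counters hard to compare. A per-article join puts them side by side instead. The following is only a rough sketch using Python's sqlite3 with an in-memory database: the table layouts are cut down to the columns used in the two queries (plus an assumed galley_id key), the helper name compare_view_counts is hypothetical, and the rows are made-up placeholders, not NepJOL data.

```python
import sqlite3


def compare_view_counts(conn):
    """Return (article_id, article_views, galley_views) for each published article."""
    query = """
        SELECT pa.article_id,
               pa.views AS article_views,
               COALESCE(SUM(ag.views), 0) AS galley_views
        FROM published_articles pa
        LEFT JOIN article_galleys ag ON ag.article_id = pa.article_id
        GROUP BY pa.article_id, pa.views
        ORDER BY pa.article_id
    """
    return conn.execute(query).fetchall()


# Simplified, assumed schema with placeholder rows (not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE published_articles (
        article_id INTEGER PRIMARY KEY,
        views      INTEGER NOT NULL
    );
    CREATE TABLE article_galleys (
        galley_id  INTEGER PRIMARY KEY,
        article_id INTEGER NOT NULL,
        views      INTEGER NOT NULL
    );
    INSERT INTO published_articles (article_id, views) VALUES (1, 5), (2, 3);
    INSERT INTO article_galleys (galley_id, article_id, views)
        VALUES (1, 1, 4), (2, 1, 3);
""")

rows = compare_view_counts(conn)
for article_id, article_views, galley_views in rows:
    print(article_id, article_views, galley_views)
```

The LEFT JOIN keeps articles that have no galley rows at all, and COALESCE reports their galley total as 0 rather than NULL, so every published article shows up in the comparison.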
James, I think NepJOL is going to be too confusing a data source for chasing down this bug. The install was updated from an older version, and IIRC there was a problem in OJS < 2.3.3-2 with counts being recorded twice; that bug will still be reflected in view counts. I also wouldn't expect any correlation between high numbers of abstract views and high numbers of PDF views; a high search engine ranking of one over the other would explain widely diverging numbers, and therefore the different order of article IDs you're getting from your queries (since you're sorting by view counts).

If possible, I'd suggest doing some testing with a less convoluted source of data to make sure you're not too bogged down in legacy stuff.
(In reply to comment #1)
> James, I think NepJOL is going to be too confusing a data source for chasing
> down this bug. The install was updated from an older version, and IIRC there
> was a problem in OJS < 2.3.3-2 with counts being recorded twice; that bug will
> still be reflected in view counts. I also wouldn't expect any correlation
> between high numbers of abstract views and high numbers of PDF views; a high
> search engine ranking of one over the other would explain widely diverging
> numbers, and therefore the different order of article IDs you're getting from
> your queries (since you're sorting by view counts).
>
> If possible, I'd suggest doing some testing with a less convoluted source of
> data to make sure you're not too bogged down in legacy stuff.

Points taken, Alec -- I just tried from a totally clean install, using an imported issue (from our demo journal) as data. It appears that the public_galleys.views field is only incremented when an abstract is viewed. Before I continue testing any further, could you give me a pointer to where this behaviour is controlled in the code, and tell me whether the public_galleys.views count is meant to count only abstract views (although with a pointer to the code I should be able to figure that out)?
...public_galleys?
(In reply to comment #3)
> ...public_galleys?

What the? Yeah, I meant published_articles.
That's right, published_articles.views is only supposed to count abstract views.
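Read that way, the two counters are independent: an abstract view touches only published_articles.views, and a galley view touches only article_galleys.views. Here is a toy model of that rule in Python. It is not OJS code, and the class and method names are made up for illustration.

```python
from collections import Counter


class ViewCounters:
    """Toy model of the two counters; not how OJS implements them."""

    def __init__(self):
        # article_id -> abstract views (stands in for published_articles.views)
        self.published_articles = Counter()
        # (article_id, galley_id) -> galley views (stands in for article_galleys.views)
        self.article_galleys = Counter()

    def view_abstract(self, article_id):
        # An abstract view increments only the article-level counter.
        self.published_articles[article_id] += 1

    def view_galley(self, article_id, galley_id):
        # A galley view increments only that galley's counter.
        self.article_galleys[(article_id, galley_id)] += 1


counters = ViewCounters()
counters.view_abstract(34)   # one abstract view
counters.view_galley(34, 1)  # two views of the same galley
counters.view_galley(34, 1)

print(counters.published_articles[34], counters.article_galleys[(34, 1)])
```

Under this rule, neither number bounds the other, so sorting the two tables by views need not give a similar article_id order.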
Needs discussion.