You are viewing the PKP Support Forum | PKP Home Wiki



Public IDs (aka URLs and DOIs) notation and SEO improvements


Postby mbria » Wed Jun 26, 2013 10:12 am

Hi all,

Over the last year (building on earlier work done by the UOC fellows) we have been reflecting on "public ID notation".

The UOC people proposed following a few basic SEO practices to build URLs that "make sense" to Google (et al.) and suggested a smart notation, which we adopted and adapted a little. The notation can be summarized as:

    * Issue: v[volume]-n[number]-[year]
    * Article: v[volume]-n[number]-[first_author_surname]-[second_author_surname]...
    * Article with more than 3 authors: v[volume]-n[number]-[first_author_surname]-et-al...
    * Downloadable resource: pdf | html | whatever...
    * Downloadable resource in multilang: [extension]-[3_letter_lang]

Disclaimer: Sorry in advance for publishing it in summarized form rather than as a formal grammar. :-)
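
Editor's sketch: the notation above could be generated roughly as follows. This is only an illustration in Python, not OJS code; the function names (`slug`, `issue_id`, `article_id`, `galley_id`) and the accent-stripping rule are assumptions, not part of the proposal.

```python
import re
import unicodedata


def slug(text):
    """Lowercase, strip accents, and join the remaining a-z/0-9 runs with '-'."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def issue_id(volume, number, year):
    """Issue: v[volume]-n[number]-[year]."""
    return f"v{volume}-n{number}-{year}"


def article_id(volume, number, surnames):
    """Article: v[volume]-n[number]-[surnames...], or first surname + et-al."""
    if len(surnames) > 3:
        # More than 3 authors: keep only the first surname, then "et-al".
        names = [slug(surnames[0]), "et-al"]
    else:
        names = [slug(s) for s in surnames]
    return "-".join([f"v{volume}", f"n{number}", *names])


def galley_id(extension, lang=None):
    """Downloadable resource: [extension] or [extension]-[3_letter_lang]."""
    return f"{extension}-{lang}" if lang else extension


print(issue_id(1, 1, 2013))
print(article_id(1, 1, ["Bria", "Smith", "Chen"]))
print(galley_id("pdf", "eng"))
```

Only hyphens are used as separators inside each identifier, which matches the SEO reasoning given in the post.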

In OJS 2.3.6 you can configure your system to build nice URLs as follows:


Our magazines rarely publish supplementary files, so we didn't pay much attention to them. Proposals are welcome.

Why those URLs? Google (et al.) only distinguishes 3 special characters in a URL, and those chars ("-", "." and "/") are used as separators.
This is why every word is separated by "-": to facilitate the "perfect match".

We also aim to build URLs shorter than 100 chars (which is also an SEO recommendation).

So, for instance, an article's URL will include separate strings for the magazine name/tag ("mymagazine"), the basic issue info ("v1" and "n1") and the authors' surnames ("bria-smith-chen"), and the resources will include the MIME type ("pdf") and the language ("eng")...

I believe this is a good practice, and this is why I'm sharing it... BUT first, I also have some questions/proposals: :-)

    * Why is the "/" char wrongly parsed? When you try to include a slash it's translated to %2F, which is OK in general to avoid security issues, but "/" is also a good element to play with in URLs (and also DOIs), isn't it? So my suggestion is not to escape this char and to let the editors add "/" to the "public IDs" if they like.
    * Why do galley names need to be unique? In new OJS versions "Public IDs" have been moved into plugins (and this is a really good idea), but with this change "galley" tags (and supplementary files) need to be unique, and this breaks the former notation. Is there a reason for this? Will the downloaded file keep the personalized name? I mean, the article URL needs to be unique, but the downloadable resource URL will be a combination of the article's URL + the resource "public ID", so the final address will contain duplication (e.g. http://mysite/mymagazine/article/view/v ... 1-bria.pdf)
    * Why not include a %x variable for DOIs? It would be nice to have a variable for building patterns that generate a "DOI suffix" including "custom identifiers". It would let us create DOIs such as 10.1234/magazines/mymagazine/v1-n1-bria-smith-chen (with a pattern like "magazines/%j/%x"). Once again, the slash is "forbidden" here, since it will be translated to %2F. :-(
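
Editor's sketch: the proposed %x variable can be pictured as plain placeholder substitution in the suffix pattern. The Python below is a hypothetical illustration, not the DOI plugin's code; `expand_suffix_pattern` and `build_doi` are made-up names, and the meaning of %j (the magazine tag) is taken from the example in the post.

```python
def expand_suffix_pattern(pattern, values):
    """Replace each %<letter> placeholder in the pattern with its value.

    `values` maps a placeholder letter (e.g. "j", "x") to its replacement.
    Unknown placeholders raise KeyError instead of being passed through.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            key = pattern[i + 1]
            if key not in values:
                raise KeyError(f"unknown placeholder %{key}")
            out.append(values[key])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def build_doi(prefix, pattern, values):
    """DOI = prefix + "/" + expanded suffix pattern."""
    return f"{prefix}/{expand_suffix_pattern(pattern, values)}"


doi = build_doi(
    "10.1234",
    "magazines/%j/%x",
    {"j": "mymagazine", "x": "v1-n1-bria-smith-chen"},
)
print(doi)
```

Under these assumptions the pattern "magazines/%j/%x" yields the DOI given in the post, with the slashes coming from the pattern rather than from the custom identifier.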

That's all...

Thanks for your comments,
m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Tue Jul 16, 2013 1:55 am

bumping
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby asmecher » Wed Jul 17, 2013 11:41 am

Hi Marc,

mbria wrote:Why is the "/" char wrongly parsed? When you try to include a slash it's translated to %2F, which is OK in general to avoid security issues, but "/" is also a good element to play with in URLs (and also DOIs), isn't it? So my suggestion is not to escape this char and to let the editors add "/" to the "public IDs" if they like.

I'm assuming you mean "/" characters placed in custom identifiers. This is a reflection of the way OJS URLs are parsed and handled; given two of the examples above:
...how would OJS know in the second example whether we were referring to a PDF belonging to article "v1-n1-bria", or an article landing page with "v1-n1-bria/pdf" as the public identifier?
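
Editor's sketch: the ambiguity can be reproduced with a toy resolver. This Python is a hypothetical model and not how OJS routes requests; `interpretations` and its lookup tables are assumptions made for illustration.

```python
def interpretations(path, article_ids, galley_ids):
    """Return every (article_id, galley_id) reading of the part after article/view/.

    article_ids: set of article public identifiers.
    galley_ids: dict mapping an article identifier to its set of galley identifiers.
    A galley_id of None means "article landing page".
    """
    readings = []
    # Reading 1: the whole path is one article identifier.
    if path in article_ids:
        readings.append((path, None))
    # Reading 2: some "/" separates the article identifier from the galley.
    parts = path.split("/")
    for i in range(1, len(parts)):
        article = "/".join(parts[:i])
        galley = "/".join(parts[i:])
        if article in article_ids and galley in galley_ids.get(article, set()):
            readings.append((article, galley))
    return readings


# If "/" were allowed inside identifiers, both of these could exist at once.
article_ids = {"v1-n1-bria", "v1-n1-bria/pdf"}
galley_ids = {"v1-n1-bria": {"pdf"}}

for reading in interpretations("v1-n1-bria/pdf", article_ids, galley_ids):
    print(reading)
```

With both identifiers present, the same path has two valid readings, which is the question being asked above.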

For DOIs, there was a recent adjustment to the handling of "/" in suffixes; see http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=8190. This fix will be released in OJS 2.4.3 (but you can patch it yourself if you like).

mbria wrote:Why do galley names need to be unique?
There may be implications related to the relevant PID standards; I'll check with the developer who was most recently working on those.
mbria wrote:Why not include a %x variable for DOIs?
That seems like a good idea, and I can't think of any reason not to do it. I've filed it at http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=8320.

Thanks,
Alec Smecher
Public Knowledge Project Team
asmecher
 
Posts: 8851
Joined: Wed Aug 10, 2005 12:56 pm

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Wed Jul 17, 2013 12:33 pm

Hi guys! :-)

The galley custom identifiers now have to be unique so that galley DOIs using custom identifiers as their suffix are unique. Earlier, there was no possibility to assign a DOI to a galley, so this wasn't important, I suppose, but now...

I also see no reason at the moment why %x shouldn't be considered.

Thanks and best wishes!
Bozana :-)
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Mon Jul 22, 2013 2:59 am

Hi fellows !!

Nice to read you both. :-D

Answers below:

Marc: Why is the "/" char wrongly parsed? When you try to include a slash it's translated to %2F, which is OK in general to avoid security issues, but "/" is also a good element to play with in URLs (and also DOIs), isn't it? So my suggestion is not to escape this char and to let the editors add "/" to the "public IDs" if they like.

Alec: I'm assuming you mean "/" characters placed in custom identifiers...how would OJS know whether we were referring to a PDF belonging to article "v1-n1-bria", or an article landing page with "v1-n1-bria/pdf" as the public identifier?


Hmm... I understand now. I don't see an easy solution, then. :-(
I was just wondering (and wishing :-P)

Alec: For DOIs, there was a recent adjustment to the handling of "/" in suffixes; see http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=8190.
This fix will be released in OJS 2.4.3 (but you can patch it yourself if you like).


Good news. Thanks.

Alec: There may be implications related to the relevant PID standards; I'll check with the developer who was most recently working on those.

Bozana: The galley custom identifiers now have to be unique so that galley DOIs using custom identifiers as their suffix are unique. Earlier, there was no possibility to assign a DOI to a galley, so this wasn't important, I suppose, but now...


I didn't know you were talking about Bozana. ;-)

I'm not complaining about DOIs (we are only introducing DOIs right now, so I can't say much about them); my real concern is URLs.

Until now, magazines were able to assign a NON-unique public ID to galleys (in OJS 2.3.x or lower, that means the same public ID for DOI & URL).
Since OJS doesn't warn about this, I suspect that a lot of magazines apply a syntax like:
    http://mysite/mymagazine/article/view/article-public-ID/doc-type

If the galley public ID now needs to be unique, we will break backward compatibility.

More than this... if the URL is built the same way as in former versions, the result will be URLs with duplication, such as:

    http://mysite/mymagazine/article/view/v1-n1-bozana-et-al/v1-n1-bozana-et-al.pdf

That is not nice for humans to read and will probably incur SEO penalties (because of length, redundancy...)

Bozana: I also see no reason at the moment why %x shouldn't be considered.


Good news. :-)
Bozana, let me know if you need help with it... I trust your coding skills more than mine :-), but if you need help I think I can submit a patch for it after the vacation.

Marc: ... we have been reflecting on "public ID notation".


BTW, no comments about the "public ID notation"?
It would be nice to have some kind of "official" recommendations.
Should I split the post? (DOI & URLs)

Cheers,
m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Mon Jul 22, 2013 4:44 am

Hello :-)

Hmmm... I am not sure how best to solve this problem. At the moment I think the following solutions are possible:
1. Remove the uniqueness constraint for the galley custom identifiers as well as the possibility to use those custom identifiers for DOI suffixes (there will still be a possibility to use custom DOI suffixes),
2. Remove the uniqueness constraint for the galley custom identifiers and automatically add the article best identifier to the galley DOI suffix (when the custom identifiers are used for it), for example:
article custom identifier = v1-n1-bria,
galley custom identifier = pdf,
DOI suffix = v1-n1-bria_pdf
And, of course, explain it to the users.
3. Leave it as it is, although I think the way of naming that Marc mentions makes lots of sense for the URLs.
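
Editor's sketch: option 2 amounts to prefixing the galley identifier with the article identifier. The Python below is a hypothetical illustration only; `galley_doi_suffix` is a made-up name, and the second article identifier ("v1-n1-smith") is invented for the example.

```python
def galley_doi_suffix(article_custom_id, galley_custom_id, separator="_"):
    """Galley DOI suffix = article identifier + separator + galley identifier."""
    return f"{article_custom_id}{separator}{galley_custom_id}"


# Two galleys that share the non-unique identifier "pdf".
suffix_a = galley_doi_suffix("v1-n1-bria", "pdf")
suffix_b = galley_doi_suffix("v1-n1-smith", "pdf")  # hypothetical second article
print(suffix_a)
print(suffix_b)
```

One caveat with this scheme: if identifiers may themselves contain the separator, two different pairs can still produce the same suffix (article "a_b" with galley "c", and article "a" with galley "b_c"), so the separator would need to be reserved or the result checked for uniqueness.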

I think the situation is the same for the supplementary files, but there it is maybe OK, i.e. it could be left as it is, because supp files are something different from galleys.

What do you think?
Thanks a lot!
Bozana
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Mon Jul 22, 2013 7:49 am

Hi,

Thanks for the fast answer. :wink:

Just thinking together... is there a 4th option?

    4. Remove the uniqueness constraint for the galley custom identifiers... and check the full URL & DOI to be sure it's unique (instead of just checking the PID).
Does it make sense?
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Mon Jul 22, 2013 8:33 am

Hi,

There is 1) the possibility to use the object custom identifier for the DOI suffix and 2) the possibility to use a custom DOI suffix.
If #1 is chosen, then the object custom identifier (as it is) is used for the suffix (DOI suffix = custom identifier) and the DOI = DOI prefix + suffix, i.e. DOI = DOI prefix + custom identifier. There is no possibility to change/edit it here. Thus there is no uniqueness check for the DOI in this case, because it wouldn't achieve anything, I think. For example:
DOI prefix = 10.12345
DOIs should be assigned to galleys
There is an article 1 galley named 'pdf' and an article 2 galley named 'pdf'
Then the two DOIs, for both galleys, would be the same: 10.12345/pdf. Checking this without the possibility to edit it wouldn't achieve anything, right?
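
Editor's sketch: the collision in this example is easy to reproduce. The Python below only models the rule "DOI = DOI prefix + custom identifier" described above; `doi_from_custom_id` and the data layout are assumptions, not plugin code.

```python
from collections import Counter


def doi_from_custom_id(prefix, custom_id):
    """Suffix option under discussion: DOI = DOI prefix + "/" + custom identifier."""
    return f"{prefix}/{custom_id}"


# (article, galley custom identifier) pairs from the example above.
galleys = [("article 1", "pdf"), ("article 2", "pdf")]

dois = [doi_from_custom_id("10.12345", galley_id) for _, galley_id in galleys]
duplicates = [doi for doi, count in Counter(dois).items() if count > 1]

print(dois)
print(duplicates)
```

Both galleys end up with 10.12345/pdf, so a check could report the clash but, with no way to edit the suffix, could not resolve it.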

If #2 is chosen (which has nothing to do with the custom identifiers from #1), then there is a DOI uniqueness check.

Best,
Bozana
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Tue Jul 23, 2013 7:56 am

Hi Bozana,

Please, let me step back a little before jumping ahead. ;-)

I feel like I'm missing something here, so let's see if I'm following you:

    0) In DOIs as well as in URLs, what needs to be unique is the full identifier (that is, the full "path"), not only the last element

In former versions of OJS (2.3.x or lower):
    1) ...there was only one "public identifier" used for DOI & URLs.
    2) ...the minimal addressable unit was the article's summary (galleys and supplementary files didn't have a DOI or custom-id)

In new versions of OJS (2.4.x or greater):
    3) ..."Public Identifier Plugins" were introduced so DOI & URL could be now independent.
    5) ...the elements that could be referenced now are: issues, article-summary, article-galley, article-sup-files.
    6) ...two "Public Identifier Plugins" were developed, to manage DOIs & URNs.
    7) ...DOI Plugin is able to use "custom Identifier" (see point 4.) as a suffix. Then DOI=DOI prefix + "custom-id".
    8) ...DOI Plugin is able to use individual item's PIDs as a suffix that is independent from the custom-id (in other words, URLs).

Am I right? Please correct any of the statements above if something is wrong.

IMHO, the point here is how the DOI and the URL are checked: if I'm following you, right now we are checking the ID against the DB, instead of checking the full path (see point 0.)

If we all agree that URLs like "http://mysite/mymagazine/article/view/n1-v2-bozana/pdf" are a good idea, we need to check the full URL and not only the "custom-id" (in this case "pdf") to be sure they are unique.

If we check the full path and we include the "custom-id" as a variable for suffix patterns, I think we will have the best of both worlds. ;-)
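
Editor's sketch: the difference between checking only the last element and checking the full path can be shown in a few lines. This Python is a hypothetical illustration, not OJS's validation code; the function names and the second article identifier ("n1-v3-bria") are invented.

```python
def duplicate_last_elements(records):
    """Galley identifiers that occur more than once (the per-ID check)."""
    seen, dups = set(), set()
    for _, galley_id in records:
        if galley_id in seen:
            dups.add(galley_id)
        seen.add(galley_id)
    return sorted(dups)


def duplicate_full_paths(records):
    """article-id/galley-id paths that occur more than once (the full-path check)."""
    seen, dups = set(), set()
    for article_id, galley_id in records:
        path = f"{article_id}/{galley_id}"
        if path in seen:
            dups.add(path)
        seen.add(path)
    return sorted(dups)


# (article custom-id, galley custom-id) pairs; every PDF galley is just "pdf".
records = [("n1-v2-bozana", "pdf"), ("n1-v3-bria", "pdf")]

print(duplicate_last_elements(records))
print(duplicate_full_paths(records))
```

The per-ID check rejects the second "pdf", while the full-path check accepts both records because the combined paths differ.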

What do you think?

Cheers,
m.

PS: BTW, congratulations Bozana if the DOI & URN plugins are yours. You did incredible work.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Tue Jul 23, 2013 9:05 am

Hi Marc :-)

mbria wrote:Hi Bozana,

Please, let me go a little back before jumping. ;-)

I feel like I'm missing something here, so let's see if I'm following you:

    0) In DOIs as well as in URLs, what needs to be unique is the full identifier (that is, the full "path"), not only the last element


True.

mbria wrote:In former versions of OJS (2.3.x or lower):
    1) ...there was only one "public identifier" used for DOI & URLs.


Let's just say it this way: the article custom/public identifier could be used for the DOI suffix, because DOIs could only be assigned to articles.

mbria wrote:
    2) ...the minimal addressable unit was the article's summary (galleys and supplementary files didn't have a DOI or custom-id)


Not totally true -- issues, galleys and supp files could also have custom/public identifiers -- the only addressable unit/item/object for DOIs was the article.

mbria wrote:In new versions of OJS (2.4.x or greater):
    3) ..."Public Identifier Plugins" were introduced so DOI & URL could be now independent.


Not totally true -- the pubIds plug-ins were introduced for better handling and code structure. The dependency between DOIs and custom/public identifiers stayed the same.

mbria wrote:


True. As it was earlier.
And just for the information: the custom/public identifier feature should be implemented as a pubIds plug-in in the future (I think).

mbria wrote:
    5) ...the elements that could be referenced now are: issues, article-summary, article-galley, article-sup-files.


True, for custom/public identifiers (like it was earlier) as well as for DOIs and URNs (now).

mbria wrote:
    6) ...two "Public Identifier Plugins" were developed, to manage DOIs & URNs.


True.

mbria wrote:
    7) ...DOI Plugin is able to use "custom Identifier" (see point 4.) as a suffix. Then DOI=DOI prefix + "custom-id".


True. It is the 3. option for DOI suffix in the DOI plug-in settings.

mbria wrote:
    8) ...DOI Plugin is able to use individual item's PIDs as a suffix that is independent from the custom-id (in other words, URLs).


Hmmm... What do you mean by 'individual item's PIDs'? For me, 'custom' and 'public' identifier are the same thing here -- 'custom identifier' is used in the UI (setup 4) and 'public identifier' in the code (I think).
Do you maybe mean: since OJS 2.4.0 there is also a possibility to create and use a custom DOI suffix (independent of the object/item custom/public identifier above)? This is true.

mbria wrote:Am I right? Please, correct any of the former sentences if something is wrong.


Done :-)

mbria wrote:IMHO, the point here is how the DOI and the URL are checked: if I'm following you, right now we are checking the ID against the DB, instead of checking the full path (see point 0.)

If we all agree that URLs like "http://mysite/mymagazine/article/view/n1-v2-bozana/pdf" are a good idea, we need to check the full URL and not only the "custom-id" (in this case "pdf") to be sure they are unique.


Yes, you are right for the URLs, but for DOIs:
If 'Galleys' is chosen for the journal content in the DOI plug-in settings and
if the 3. option is chosen for the DOI suffix in the DOI plug-in settings
then
all galley DOIs, for all galleys, will be constructed like this: DOI prefix + custom identifier (= DOI prefix + 'pdf'), which will make all DOIs the same, i.e. not unique.

mbria wrote:If we check the full path and we include the "custom-id" as a variable for suffix patterns, I think we will have the best of both worlds. ;-)


True. If we introduce the custom/public identifier as a variable for the DOI suffix and then construct a pattern for galleys (in the 1. option for the DOI suffix in the DOI plug-in settings) using this variable but also something else (this variable alone wouldn't be enough -- it would be the same problem we are talking about now), everything will be fine.
The problem is this 3. option for the DOI suffix in the DOI plug-in settings.

mbria wrote:What do you think?


I still suppose that we have those 3 solutions I mentioned earlier :-)

mbria wrote:Cheers,
m.

PS: BTW, congratulations Bozana if the DOI & URN plugins are yours. You did incredible work.


Thanks a lot! It wasn't just me -- Florian Grandel was also working on this, so I will tell him -- he will surely be, just like me, very glad to hear it :-)
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Thu Jul 25, 2013 3:34 am

Everything is clearer now.
Thanks for your time Bozana.

Summarizing:

mbria wrote:If we check the full path and we include the "custom-id" as a variable for suffix patterns, I think we will have the best of both worlds. ;-)

Bozana: True. If we introduce the custom/public identifier as a variable for the DOI suffix and then construct a pattern for galleys (in the 1. option for the DOI suffix in the DOI plug-in settings) using this variable but also something else (this variable alone wouldn't be enough -- it would be the same problem we are talking about now), everything will be fine. The problem is this 3. option for the DOI suffix in the DOI plug-in settings (option 3 at settings).


So, if it's OK with you, let's "divide & conquer": we can settle DOIs for galleys first, before we talk about supplementary files and URLs.

For galley DOIs you pointed out the following solutions:
1. No custom-id, just doi-id: Remove the uniqueness constraint for the galley custom identifiers as well as the possibility to use those custom identifiers for DOI suffixes (there will still be a possibility to use custom DOI suffixes),
2. "Auto doi" based on article & galley: Remove the uniqueness constraint for the galley custom identifiers and automatically add the article best identifier to the galley DOI suffix (when the custom identifiers are used for it), for example: article custom-id = v1-n1-bria, galley custom-id = pdf ---> DOI suffix = v1-n1-bria_pdf
3. "Do nothing": Leave it as it is, although I think the way of naming that Marc mentions makes lots of sense for the URLs.


I'm not sure which one of the above solutions is the one that allows %x (the custom-id variable), but I will vote for it. :-)
BTW, you point out that a suffix made of "only this variable wouldn't be enough -- it would be the same as the problem we are talking now".
I agree: a simple pattern such as "%x" will be the same as "option 3 at settings", but that won't be a problem for editors who like this syntax.

Finally, I think solution 3 isn't really an option.
It's not only about making URLs look nicer or SEO-friendly... my concern is backward compatibility.

Cheers,
m.

PS: Then, congrats and kudos to Florian Grandel too ;-)
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Thu Jul 25, 2013 4:59 am

Hi Marc,

The solutions I mentioned are listed in the order I would prefer.
Also, before starting to change something, I would like to hear Alec's opinion. Aaaaaalec, are you still there? :-)

The %x will be introduced anyway, independently of this problem and its solutions :-)

Thanks!
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Thu Jul 25, 2013 7:43 am

Great. :-)

Just to be sure, one last question about the first solution:

No custom-id, just doi-id: Remove ... the possibility to use those custom identifiers for DOI suffixes


If I understand you correctly, this is not completely true, since "%x" will be a valid pattern.
And if it's valid as a pattern, why do we need to remove it as an option in the DOI suffix settings?

Did I miss something?

Cheers,
m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby swing » Thu Jul 25, 2013 8:08 am

Hi Marc,

the first solution:
"Remove the uniqueness constraint for the galley custom identifiers as well as the possibility to use those custom identifiers for DOI suffixes (there will still be a possibility to use custom DOI suffixes)"

means:
The galley custom identifiers will not have to be unique, i.e. the pattern you were describing (always using 'pdf' for each PDF galley) will be possible.
The problematic 3. option for the DOI suffix should be removed, because it could lead to non-unique DOIs. In the example above they would always look like DOI prefix + 'pdf'. This is what I meant by "the possibility to use those custom identifiers for DOI suffixes".
If a journal is manually controlling the uniqueness of its galley custom IDs and would like its DOI suffixes to be the same as the custom identifiers, then there is still the existing possibility to use custom DOI suffixes (4. option for the DOI suffix) and to enter the appropriate custom identifier for a DOI manually. Also, there will be a better/easier way -- using %x -- once this is implemented. /* I just didn't want to mention it above, because it is not there yet. */

This is what/how I've meant it :-)

Best,
Bozana
swing
 
Posts: 142
Joined: Tue Oct 09, 2007 2:59 am

Re: Public IDs (aka URLs and DOIs) notation and SEO improvem

Postby mbria » Mon Jul 29, 2013 12:38 am

Great Bozana. Thanks a lot for the explanations.

Alec, now it's up to you. :-)

Cheers,
m.
mbria
 
Posts: 306
Joined: Wed Dec 14, 2005 4:15 am
