Difference between revisions of "Talk:Authorization Framework"

From PKP Wiki
(Added lots of background information (concepts considered, requirements, questions asked, etc.) to the discussion page.)
m (Fixed bug in page structure)

Revision as of 22:47, 1 September 2010

Florian's opinion about the role GUI/code design currently proposed for OMP

Commenting about "Flexible Roles" etc.

While giving feedback on http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=4987, I had a long discussion with Tyler about the proposed authorization system for OMP.

Here is my understanding of the current approach:

The basic permission building block is the handler operation. Basic access "roles" are then "hard coded" into the operations ("validation"). The number of roles will be considerably increased but the basic concept remains. Every operation is assigned to exactly one fixed role. This means that most of the time pre-defined roles are intersection-free: an operation that is part of one role will not be part of another role at the same time.

The "flexible role" approach enables users to assign synonyms to these "hard coded" roles.

The full system would look something like this:

  1. Users are assigned to role synonyms ("flexible roles") or pre-defined base roles.
  2. "Flexible role" synonyms will be resolved to the pre-defined "base roles" if necessary.
  3. "Validation" (authorization) is executed at the handler operation level by checking whether the pre-defined role required for a given operation is among the user's assigned roles.
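
The three steps above can be sketched as follows. This is a minimal Python illustration of my reading of the approach, not OMP code (which is PHP); every role name, operation name and function name in it is hypothetical.

```python
# Hypothetical data; none of these names come from the OMP code base.
BASE_ROLES = {"editor", "author", "reviewer"}

# Step 2: each "flexible role" is a synonym for exactly one base role.
FLEXIBLE_ROLE_SYNONYMS = {
    "acquisitions_editor": "editor",
    "volume_editor": "author",
}

# Each handler operation is hard-wired to exactly one base role.
REQUIRED_ROLE = {
    "submissions": "editor",
    "submit": "author",
}


def resolve_role(role):
    """Resolve a flexible role to its base role; base roles map to themselves."""
    if role in BASE_ROLES:
        return role
    return FLEXIBLE_ROLE_SYNONYMS[role]


def is_authorized(user_roles, operation):
    """Step 3: compare the operation's required base role with the user's roles."""
    resolved = {resolve_role(role) for role in user_roles}
    return REQUIRED_ROLE[operation] in resolved
```

Note that a synonym can only ever grant what its single base role grants, which is the limitation discussed below.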

The difficulties I have with this approach are:

  • The terminology chosen for the approach is not very accessible for somebody not part of the PKP team. What PKP calls "validation" is usually called "authorization". (The term "validation" is usually used in the context of checking form values.) What PKP calls "roles" in the current approach would probably be termed "access object group" by other frameworks or projects. Of course words are just labels. But it certainly helps to keep the code base self-documenting and accessible if we use terminology that is not specific to our project (where possible).
  • If I understand the proposed system correctly then there are no roles (in the classic sense) at all but permission groups are directly assigned to users. The difference between a "real" role and an access object group is that roles can have intersections and access object groups cannot. In other words: I can assign permission to the same access object to two distinct roles but I cannot usually assign the same access object to two access object groups. Access object groups are usually strictly hierarchical while roles can have intersections. Introducing "real" roles between access object groups and users will drastically reduce the number of required roles and the number of necessary assignments per user and thereby increase end-user usability. There have been purely hierarchical access object group systems historically but they are no longer "state of the art"...
  • Implementing custom roles by assigning "synonyms" to pre-defined roles is IMO quite unorthodox. It is ok to use this approach if custom roles will always be 1:1 assignable to pre-defined roles in the future. But IMO a more flexible approach is possible without extra cost. I'm also afraid that usability may suffer if we don't use an authorization approach that is well known to users.
  • Hard coding roles into handler operations is IMO also not the best solution. I guess that the new role system will force us to touch all validation methods anyway. Why not take this opportunity to refactor authorization into a central place (e.g. somewhere in the dispatcher)? This is generally considered best practice for an authorization system.

An alternative implementation proposal

The approach I propose is very similar to what has been implemented with great success by major frameworks and libraries like CakePHP, phpGACL, Spring Framework, FLOW3 and probably hundreds of other projects. I believe that the approach I propose is even cheaper to implement than the currently proposed approach (the way I understand it). And above all: I'm quite sure that it will considerably improve usability of custom role configuration and day-to-day user-to-role assignments.

I propose to establish the following entities to implement authorization:

  1. access objects (=currently page or component handlers, can be extended to other objects)
  2. access operations (=currently page or component operations, can be extended to other operations)
  3. access objects and access operations can be grouped together into access object groups (=in our case this could be workflows like submission, proofreading, etc.)
  4. access object groups are assigned to roles (=editor role, author role or any "flexible" custom role)
  5. users are assigned to roles
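
The five entities above could be modeled roughly like this. It is only a sketch in Python (OMP itself is PHP); the class names, handler names and example workflows are assumptions made for illustration. The point it shows is that two roles may reference the same access object group, which the synonym approach cannot express.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    """An (access object, access operation) pair, e.g. a handler and one operation."""
    access_object: str
    operation: str


@dataclass
class AccessObjectGroup:
    """A named bundle of permissions, e.g. one workflow."""
    name: str
    permissions: set = field(default_factory=set)


@dataclass
class Role:
    """A role references any number of groups; two roles may share a group."""
    name: str
    groups: list = field(default_factory=list)


def is_allowed(user_roles, access_object, operation):
    """Whitelist check: allow only if some group of some role grants the pair."""
    wanted = Permission(access_object, operation)
    return any(
        wanted in group.permissions
        for role in user_roles
        for group in role.groups
    )


# Hypothetical example data.
submission = AccessObjectGroup("submission", {Permission("SubmissionHandler", "submit")})
proofreading = AccessObjectGroup("proofreading", {Permission("ProofHandler", "view")})
author = Role("author", [submission, proofreading])
proofreader = Role("proofreader", [proofreading])  # intersects with "author"
```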
  • The GUI for custom role administration could IMO be greatly simplified:
| Drop down: role                                  |
|                                                  |
| Available workflows:       Assigned workflows:   |
| _____________________      _____________________ |
| |workflow 1         |      |workflow 6         | |
| |workflow 2         |      |workflow 7         | |
| |workflow 3         |  ->  |workflow 8         | |
| |workflow 4         |  <-  |                   | |
| |workflow 5         |      |                   | |
| |...                |      |                   | |
| ---------------------      --------------------- |
|                                                  |
| Button: Create new role | Drop down: based on    |

This will replace all six role configuration sections in press setup step 3 (Workflows). The workflows can also be graphically grouped together into subgroups if we have too many of them. (Think of a tree-like arrangement on the left side.) As they are strictly hierarchical, this is not a problem. The many new "roles" that have been created for OMP will be labeled "workflows" (=access object groups) instead.

  • The role-to-user assignment would remain exactly as it is (with custom roles appearing as any other role).
  • On the database side we need two new tables. One that assigns access object groups to roles and another that assigns low-level access object-operation tuples to access object groups. The existing user to role assignment does not need to be touched. The access objects (handlers) and access operations (handler operations) can be quickly retrieved in a semi-automatic way from the existing index.php pages for page handlers to create the initial database entries in the permission tables.
  • Authorization (="user validation") can be removed (with a few exceptions) from the handlers and replaced by a single authorization call in the dispatcher (after the handler operation has been identified but before the handler is actually called). It comes down to 1 line of code in the dispatcher and a single "AuthorizationManager" class with maybe 50-100 lines of code that does all the work.
  • A nice side-effect: Once we have the operation to handler mapping in the database we can also easily get rid of the index.php files altogether and retrieve the contained routing information from the database instead (solves #4876).
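
To make the database and dispatcher points above more concrete, here is a rough sketch using Python and an in-memory SQLite database. It is not the OMP implementation: the table names, column names and the `dispatch` function are assumptions, and `user_roles` merely stands in for the existing user-to-role assignment. Only the class name "AuthorizationManager" is taken from the proposal.

```python
import sqlite3

# user_roles stands in for the existing assignment; the other two tables are
# the two new ones the proposal calls for. All names are hypothetical.
SCHEMA = """
CREATE TABLE user_roles (user_id INTEGER, role TEXT);
CREATE TABLE role_groups (role TEXT, group_name TEXT);
CREATE TABLE group_permissions (group_name TEXT, handler TEXT, operation TEXT);
"""


def build_demo_database():
    """Create an in-memory database with one user, one role and one workflow."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    connection.execute("INSERT INTO user_roles VALUES (1, 'author')")
    connection.execute("INSERT INTO role_groups VALUES ('author', 'submission')")
    connection.execute(
        "INSERT INTO group_permissions VALUES ('submission', 'SubmissionHandler', 'submit')"
    )
    return connection


class AuthorizationManager:
    def __init__(self, connection):
        self.connection = connection

    def is_authorized(self, user_id, handler, operation):
        """Whitelist lookup: user -> roles -> groups -> (handler, operation)."""
        row = self.connection.execute(
            """
            SELECT 1
            FROM user_roles ur
            JOIN role_groups rg ON rg.role = ur.role
            JOIN group_permissions gp ON gp.group_name = rg.group_name
            WHERE ur.user_id = ? AND gp.handler = ? AND gp.operation = ?
            LIMIT 1
            """,
            (user_id, handler, operation),
        ).fetchone()
        return row is not None


def dispatch(manager, user_id, handler, operation, handlers):
    """Authorize after the operation is identified but before it is called."""
    if not manager.is_authorized(user_id, handler, operation):
        raise PermissionError(f"{handler}.{operation} denied for user {user_id}")
    return handlers[(handler, operation)]()
```

The single `is_authorized` call in `dispatch` corresponds to the "1 line of code in the dispatcher"; anything without a matching row is denied, which gives the whitelist behaviour.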

Generally I think this kind of design would greatly improve the flexibility of our authorization system while simplifying it at the same time. We'd drastically reduce the amount of authorization code necessary. We'd move from decentralized to centralized authorization and from a blacklist to a whitelist approach, both of which should reduce the probability of security breaches. And we'd IMO get a much more usable and easy-to-understand user interface for the end user. We'd get fully flexible "custom roles" that can be configured in a more intuitive way (=appealing to concepts already known by the user) than they are now. And "custom roles" would no longer be restricted to mere "synonyms" of hard coded roles.

Tyler's response

Flexible roles

The 'flexible' in flexible roles does not come from any ability to support granular access divisions (they have none), but from the fact that the roles are now in the database and can be renamed, disabled, and restored. It was thought that someday, the 'flexible roles' label could change to something simple like 'roles'. The seemingly more appropriate label of 'custom roles' did not suit these entities because, in order to support use cases like renaming roles, and to avoid having to deal with two types of roles separately (i.e. custom vs. shipped), every role is a flexible role, each of which can be associated with a user by way of the 'roles' table.


I think your distinction between roles and access object groups is important. Access rights to certain parts of the system are what is really being associated with a user when a role is assigned to that user in PKP. I think using the term role has worked well so far, as it is a familiar concept to most (i.e. when logging-in in the morning, people identify more closely with 'it is my duty to perform the actions associated with this role'); however, I agree that, at the system level, using the more security-oriented language is more appropriate, as it reserves the term 'roles' for a different purpose and brings the terminology closer to what programmers might expect.


The use cases we have so far only involve intersection-free roles. All 'press-type' roles have access to the same operations in the Role handler. The 'author-type' roles are a bit of a special case in that they need access to author functions, which seemingly boils down to a routing issue. Besides creating more intuitive role spaces, these synonyms do not pack much else in terms of functionality at the moment, but I think they provide a basis for extension and an implementation of something that allows users to be more creative with roles.

In general

In general, I think that it would be good to work in the direction of your proposal, Florian. Pre-handler authorization checks in the dispatcher, as far as I can tell, would be great. Also, since we might not have the energy for another overhaul at this stage in the development of OMP, we could review what we have to make sure that it can easily be refactored into new security patterns in the future and that it uses terminology we'll be able to live with, while at the same time coming up with a definite approach to the more general design issues. Building custom roles, in the sense that you're describing, should definitely be part of the plan, but an overhaul probably isn't necessary for the current use cases.

Tylkkn 20:06, 7 January 2010 (UTC)

Some background for the implemented framework (Florian)

Requirements collected from different sources were

  • whitelist approach
  • we need a way to implement "public pages" (no log-in required), e.g. via a dummy user id
  • implement "restricted site access" (subscription model), publishing mode (none, subscription, open), even differentiated per series/issue/etc.
  • validate the presence of a context (e.g. journal, press), even a specific context instance (e.g. "index" for admin functions)
  • we need a way to enforce SSL
  • role based access
  • messages depending on the type of validation, even on the handler/operation, etc.
  • the validate() method has been abused as an initialize() method. We shouldn't do that in OMP; use the initialize() method instead
  • the validate() method should be called by the framework and not by the operation (this is enforced for operations, can we enforce this for handlers as well?)

Questions Asked (and answered in our spec, see main page)

  • can we join various targets (operations, handlers) into groups with the same authorization requirements thereby saving space?
  • does a database-approach make sense (=changing authorization config)?
  • where is it most intuitive for developers to define authorization requirements?
  • should we cache assigned roles in the session so that we don't have to query them on all requests? how would we make sure that changes in roles will be reflected in the session?
  • do we want/need central auditing/logging of user actions?
  • how do we handle security for components that are shared between applications if the policies differ? (app-specific base classes or declarative approach)
  • how do we combine several rules? (deny overrides)
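
As an illustration of the last question, a "deny overrides" combination could look roughly like the sketch below. It is written in Python, the names `Effect`, `deny_overrides` and `is_permitted` are made up for this example, and treating "no applicable rule" as a denial is my assumption, chosen to match the whitelist requirement.

```python
from enum import Enum


class Effect(Enum):
    PERMIT = "permit"
    DENY = "deny"
    NOT_APPLICABLE = "not applicable"


def deny_overrides(effects):
    """Combine rule results: any DENY wins; otherwise PERMIT if some rule permits."""
    effects = list(effects)
    if Effect.DENY in effects:
        return Effect.DENY
    if Effect.PERMIT in effects:
        return Effect.PERMIT
    return Effect.NOT_APPLICABLE


def is_permitted(effects):
    """Whitelist reading: only an explicit PERMIT grants access."""
    return deny_overrides(effects) is Effect.PERMIT
```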

Implementation options I'm aware of

declarative security with annotations & AOP

  • pro: intuitive declaration at the class/method level, easy to read for developers
  • pro: very flexible
  • pro: we can encapsulate arbitrarily complex validation checks in annotation classes
  • contra: performance overhead
  • contra: AOP is not a very common concept in the PHP world and may be difficult to explain to community developers
  • contra: relatively high initial development cost
  • contra: I don't know of any annotations implementation that is compatible with PHP 4
  • contra: difficult to debug
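
To give a feel for what a method-level declaration looks like, here is a loose analogy using a Python decorator. This is not PHP annotations or a real AOP framework; the decorator `requires_role` and the handler class are hypothetical, and the wrapper only plays the part that call interception would play.

```python
import functools


def requires_role(*allowed_roles):
    """Declare, at the method level, which roles may call an operation."""
    def decorator(operation):
        @functools.wraps(operation)
        def wrapper(self, user_roles, *args, **kwargs):
            # Interception point: runs before the operation body.
            if not set(allowed_roles) & set(user_roles):
                raise PermissionError(
                    f"{operation.__name__} requires one of {allowed_roles}"
                )
            return operation(self, user_roles, *args, **kwargs)
        # Keep the declaration readable by tooling, like annotation metadata.
        wrapper.allowed_roles = allowed_roles
        return wrapper
    return decorator


class SubmissionHandler:
    @requires_role("author", "editor")
    def submit(self, user_roles, title):
        return f"submitted: {title}"
```

The declaration sits right next to the operation, which is the readability advantage listed above; the extra wrapper call on every invocation is a small instance of the performance overhead.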

declarative security with nomenclature convention & AOP

  • pro: no explicit declaration or configuration necessary, less code
  • pro: all security code is in one location and can be centrally managed
  • contra: not very flexible, rigid nomenclature
  • contra: not obvious, AOP adds "magic" in the background
  • contra: difficult to debug
  • contra: difficult to configure
  • contra: performance overhead

ACL approach, ACL rules in the database

  • pro: highly configurable, even by the end-user
  • pro: all security code is in one location and can be centrally managed
  • contra: performance overhead
  • contra: configuration overhead
  • contra: security rules are not obvious when looking at the code

Imperative/programmatic vs. declarative security

The declarative approach allows security definitions to be decoupled from the application

  • pro: security can be defined by the user without changing the code
  • contra: declarative security adds a level of abstraction and tends to be non-intuitive from a developer's point of view because security attributes are not visible in the code
  • contra: declarative security usually has a performance impact as it introduces a level of indirection to code execution (e.g. through method call or object instantiation interception)

The programmatic approach makes sense if it is improbable that users want to modify the standard security policy

  • a declarative approach is required if components should be re-used across different security contexts (e.g. different applications, different pages, etc.)

Implementation components

  • decision request
  • rules grouped into policies: a target (subject, action, resource, environment attributes), an effect, optional: additional condition, obligations and advice
  • attributes of the environment, subject, action and resource (attributes can be multi-valued or based on the content of the resource)
  • policy source
  • actions to be performed before/after a policy decision
  • policy administration point (pap)
  • policy decision point (pdp) - renders authorization decisions based on policy
  • policy enforcement point (pep) - makes decision requests and enforces authorization decisions
  • policy information point (pip) - source of attribute values
  • attribute: has a name and type
  • which rule-combining algorithm do we choose?
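
How the components listed above might fit together is sketched below in Python. This is an illustration under assumptions, not the implemented framework: all class and attribute names are hypothetical, the policy source is a plain list, the policy administration point, obligations and advice are left out, and deny-overrides with a whitelist default is simply picked as the rule-combining algorithm.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionRequest:
    subject: str
    action: str
    resource: str


@dataclass(frozen=True)
class Rule:
    matches: Callable    # target: does the rule apply to this request?
    condition: Callable  # additional condition evaluated on attributes
    effect: str          # "permit" or "deny"


class PolicyInformationPoint:
    """Source of attribute values (here: a hypothetical in-memory role table)."""
    def __init__(self, roles_by_subject):
        self.roles_by_subject = roles_by_subject

    def roles(self, subject):
        return self.roles_by_subject.get(subject, set())


class PolicyDecisionPoint:
    """Renders a decision from the rules, combined with deny-overrides."""
    def __init__(self, rules, pip):
        self.rules = rules
        self.pip = pip

    def decide(self, request):
        effects = [
            rule.effect
            for rule in self.rules
            if rule.matches(request) and rule.condition(request, self.pip)
        ]
        if "deny" in effects:
            return "deny"
        # Whitelist default: no applicable permit means deny.
        return "permit" if "permit" in effects else "deny"


class PolicyEnforcementPoint:
    """Makes the decision request and enforces the result."""
    def __init__(self, pdp):
        self.pdp = pdp

    def enforce(self, request, operation):
        if self.pdp.decide(request) != "permit":
            raise PermissionError(f"{request.action} on {request.resource} denied")
        return operation()


# Hypothetical example policy: editors may edit submissions.
pip = PolicyInformationPoint({"alice": {"editor"}, "bob": {"author"}})
rules = [
    Rule(
        matches=lambda r: r.resource == "submission" and r.action == "edit",
        condition=lambda r, info: "editor" in info.roles(r.subject),
        effect="permit",
    ),
]
pep = PolicyEnforcementPoint(PolicyDecisionPoint(rules, pip))
```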