Multiple access levels: How to get there

The plugins Quasi Access and Granular Access provide a couple of ways to associate multiple ACLs with content. I have an internal plugin that does something similar (it manages an ACL binding zero or more users and ACLs) and a design for how I think Elgg core could provide this flexibility.

My knowledge of Granular Access is limited. My reading of Quasi Access is that it builds a new "quasi" ACL (and object) for every possible combination of ACLs and doesn't actually populate those ACLs' members. When it's time to build an access query, it looks up all the ACLs the current user should see and stuffs them into the existing list going into the SQL. This certainly seems better than giving each piece of content its own ACL and linking all the right users to it (which requires rebuilding that list each time a user leaves or joins one of the groups, etc.).

I'd like to know the longer-term scaling challenges with these approaches; where we think we'll hit pain and what we could do at that point.

I'm also wondering if there's a path for slowly integrating my preferred schema into the current system. What if Elgg wrote to both storage schemas, but required a switch to read from the new one?

Part of me thinks that getting from here to there is hopeless without fully rebuilding Elgg, in which case we should just endorse some plugin and pour all effort into optimizing it. Or what if we could add a table or two to hack this in in a way that was as fast as possible?

Where do we go from here?

  • +1 for writing to both storage systems and using a switch to determine which to read from. This is how we do it at Google when we have a data migration to do... Can add it during 2.x pretty safely and then remove relationships down the line when we feel confident enough that the new system is ready.
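A minimal sketch of the write-to-both, read-by-switch idea, not Elgg code. The class and its two dictionaries are hypothetical stand-ins for the current single-access-id storage and a new multi-ACL grants table; keeping only the first ACL in the legacy store is an assumption made for illustration.

```python
class DualWriteAccessStore:
    """Writes each grant to both schemas; a flag picks which one serves reads."""

    def __init__(self, read_from_new=False):
        # Hypothetical stand-in for the current schema: one access id per entity.
        self.legacy = {}
        # Hypothetical stand-in for the new schema: a set of ACL ids per entity.
        self.new = {}
        self.read_from_new = read_from_new

    def grant(self, entity_guid, acl_ids):
        acl_ids = list(acl_ids)
        if not acl_ids:
            raise ValueError("at least one ACL id is required")
        # Assumption: the legacy schema can hold only one ACL, so keep the first.
        self.legacy[entity_guid] = acl_ids[0]
        self.new[entity_guid] = set(acl_ids)

    def acls_for(self, entity_guid):
        if self.read_from_new:
            return set(self.new.get(entity_guid, set()))
        legacy_id = self.legacy.get(entity_guid)
        return set() if legacy_id is None else {legacy_id}


store = DualWriteAccessStore()
store.grant(23, [3, 4, 5])
legacy_view = store.acls_for(23)   # served from the old schema
store.read_from_new = True         # flip the switch
new_view = store.acls_for(23)      # served from the new schema
```

Because both stores are always written, flipping the switch back is safe as long as the old storage has not been removed.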

  • +1. I've always considered Granular Access a hack. It works, and it uses real ACLs, so it scales the same as default Elgg, except that there are just *more* ACLs floating around. I actually prefer the Quasi Access method of injecting the extra ACL ids into the SQL: it's lighter on the DB and probably better in the long run. Still, I would prefer a real solution to either.

  • Oh my. I didn't think anyone would dare open this Pandora box :)

    I like your preferred schema. I think it gives enough flexibility to create custom logic, however, I am not sure the dynamic ACL part is necessarily more efficient than an SQL approach (e.g. /123/followers, aka people who added me as a friend, would potentially result in thousands of list ids, unless you maintain duplicate tables for each type of revamped relationship).

    We also need to consider that certain ACLs are exclusive, e.g. you can't have something both Public and something else.

    It would also be nice to allow exceptions, e.g. allow members in Group A and B, except members who are also in C.

    I think dumping relationships is not feasible, and I would hate to see them go. They are not perfect, but they make a lot of things possible. I would instead build upon the relationship model: pretty much any group of users can be retrieved from the relationships table, and it's easier to maintain the relationships table than to maintain the access collections table.

    access_collections

    • id

    • target_guid

    • relationship_name

    • inverse_relationship

     

    This would effectively make the whole system dynamic. We can upgrade custom ACLs to entities and add members by relationship.

    No relationship name would mean self.

    We would then build SQL by joining the relationships table on relationship_name, inverting the direction if needed. So the convention would be similar to what we do with ACCESS_FRIENDS now (and to Quasi Access, but without the need for extra metacollection objects, which would move to the acl_grants_entity table).

    This would also work well with roles. E.g. a site could implement ACCESS_ADMINS type by creating the 'is_admin' relationship for existing admins and adding some listeners. This is similar to how ACLs are currently managed, but it removes the overhead for existing relationship types (e.g. friends), we just grab the data directly from the relationships table.

    Let's say we want to allow access to object 23 to friends and followers of user 123, as well as members of group 234, as well as users 456 and 457.

    access_collections

    id   target_guid   relationship_name   inverse_relationship

    1    0
    2    1             member_of_site
    3    123           friend              0
    4    123           friend              1
    5    234           member              1
    6    456
    7    457

     

    acl_grants_entity

    guid  access_collection_id

    23     3
    23     4
    23     5
    23     6
    23     7

    All we need is just one query! 

    SELECT DISTINCT e.*
    FROM entities e
    JOIN acl_grants_entity ag ON (e.guid = ag.entity_guid)
    WHERE ag.access_collection_id IN (
                SELECT id FROM access_collections acl
                WHERE (e.owner_guid = $user_guid) OR (acl.target_guid = 0) OR ($user_guid = acl.target_guid AND acl.relationship_name = '') OR EXISTS(SELECT * FROM entity_relationships er1 WHERE er1.guid_two = $user_guid AND er1.guid_one = acl.target_guid AND er1.relationship = acl.relationship_name AND acl.inverse_relationship = 0) OR EXISTS(SELECT * FROM entity_relationships er2 WHERE er2.guid_one = $user_guid AND er2.guid_two = acl.target_guid AND er2.relationship = acl.relationship_name AND acl.inverse_relationship = 1));

    Public can be implemented by setting target_guid to 0. Logged in could be implemented by using member_of_site relationship (not in the above query).
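To check that the query behaves as described, here is a small sketch that loads the example into an in-memory SQLite database and runs the same query. It is not Elgg code. The table layouts are trimmed down, the owner guid 999 and users 500 to 502 and 600 are made up for the test data, and empty relationship names are stored as '' rather than NULL because the query compares against ''.

```python
import sqlite3

SCHEMA = """
CREATE TABLE entities (guid INTEGER PRIMARY KEY, owner_guid INTEGER);
CREATE TABLE entity_relationships (guid_one INTEGER, relationship TEXT, guid_two INTEGER);
CREATE TABLE access_collections (
    id INTEGER PRIMARY KEY,
    target_guid INTEGER,
    relationship_name TEXT NOT NULL DEFAULT '',
    inverse_relationship INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE acl_grants_entity (entity_guid INTEGER, access_collection_id INTEGER);
"""

# Same logic as the query in the post, selecting only the guid.
VISIBLE_SQL = """
SELECT DISTINCT e.guid
FROM entities e
JOIN acl_grants_entity ag ON e.guid = ag.entity_guid
WHERE ag.access_collection_id IN (
    SELECT acl.id
    FROM access_collections acl
    WHERE e.owner_guid = :user_guid
       OR acl.target_guid = 0
       OR (acl.target_guid = :user_guid AND acl.relationship_name = '')
       OR EXISTS (
            SELECT 1 FROM entity_relationships er1
            WHERE er1.guid_two = :user_guid
              AND er1.guid_one = acl.target_guid
              AND er1.relationship = acl.relationship_name
              AND acl.inverse_relationship = 0)
       OR EXISTS (
            SELECT 1 FROM entity_relationships er2
            WHERE er2.guid_one = :user_guid
              AND er2.guid_two = acl.target_guid
              AND er2.relationship = acl.relationship_name
              AND acl.inverse_relationship = 1)
)
ORDER BY e.guid
"""


def build_example_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entities VALUES (23, 999)")  # owner 999 is made up
    # Rows 3-7 of the access_collections example (the ones granted to object 23).
    conn.executemany(
        "INSERT INTO access_collections VALUES (?, ?, ?, ?)",
        [(3, 123, "friend", 0), (4, 123, "friend", 1),
         (5, 234, "member", 1), (6, 456, "", 0), (7, 457, "", 0)],
    )
    conn.executemany(
        "INSERT INTO acl_grants_entity VALUES (23, ?)",
        [(3,), (4,), (5,), (6,), (7,)],
    )
    # Made-up relationships: 123 friended 500, 501 friended 123, 502 joined group 234.
    conn.executemany(
        "INSERT INTO entity_relationships VALUES (?, ?, ?)",
        [(123, "friend", 500), (501, "friend", 123), (502, "member", 234)],
    )
    return conn


def visible_guids(conn, user_guid):
    return [row[0] for row in conn.execute(VISIBLE_SQL, {"user_guid": user_guid})]
```

One thing the sketch makes visible: because the owner check sits inside the subquery, an owner only sees an entity that has at least one row in acl_grants_entity.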

     

     

     

  • This is just a braindump, but it can be implemented without breaking existing functionality, and I think it will be a performance gain.

  • I am not sure the dynamic ACL part is necessarily more efficient than an SQL approach

    In the entity lists spec, there'd usually be only 2 dynamic ACLs in use: /1/logged_in and /current_guid/user. Something like /123/followers would be populated in the DB.

    We also need to consider that certain ACLs are exclusive, e.g. you can't have something both Public and something else.

    The API would need to enforce that. The schema doesn't need to know about the meanings of ACLs.

    It would also be nice to allow exceptions, e.g. allow members in Group A and B, except members who are also in C.

    You could have an acl_hide_items table that functions the same way. Not sure what the query would look like though. I think the hide would have to take precedence over any grants.
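One possible shape for that query, as a rough sketch only: require at least one matching grant, then exclude anything with a matching hide via NOT EXISTS, so hides win over grants. Only acl_hide_items and acl_grants_entity are names from this thread. The list_members table, the list id strings, and all guids are hypothetical, and the sketch assumes list membership is populated in the DB.

```python
import sqlite3

HIDE_AWARE_SQL = """
SELECT DISTINCT g.entity_guid
FROM acl_grants_entity g
JOIN list_members m ON m.list_id = g.list_id AND m.user_guid = :user_guid
WHERE NOT EXISTS (
    SELECT 1
    FROM acl_hide_items h
    JOIN list_members hm ON hm.list_id = h.list_id
    WHERE h.entity_guid = g.entity_guid
      AND hm.user_guid = :user_guid
)
ORDER BY g.entity_guid
"""


def build_hide_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE list_members (list_id TEXT, user_guid INTEGER);      -- hypothetical
        CREATE TABLE acl_grants_entity (entity_guid INTEGER, list_id TEXT);
        CREATE TABLE acl_hide_items (entity_guid INTEGER, list_id TEXT);
    """)
    # Groups A, B, C are modelled as lists /10/members, /11/members, /12/members.
    conn.executemany("INSERT INTO list_members VALUES (?, ?)", [
        ("/10/members", 1),                       # user 1 is in A
        ("/11/members", 2),                       # user 2 is in B
        ("/10/members", 3), ("/12/members", 3),   # user 3 is in A and C
    ])
    # Entity 23: allow A and B, except members who are also in C.
    conn.executemany("INSERT INTO acl_grants_entity VALUES (23, ?)",
                     [("/10/members",), ("/11/members",)])
    conn.execute("INSERT INTO acl_hide_items VALUES (23, '/12/members')")
    return conn


def visible_with_hides(conn, user_guid):
    return [row[0] for row in conn.execute(HIDE_AWARE_SQL, {"user_guid": user_guid})]
```

With this data, users 1 and 2 see entity 23, user 3 is excluded by the hide on group C, and user 4 (in no group) has no grant at all.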

    I think dumping relationships is not feasible, and I would hate to see them go.

    In the entity lists model they're just stored in a more normalized way. The relationship name and guid_two combine to form a list, then guid_one are the list items. The API wouldn't need to change at all.

    That's probably the low hanging fruit as far as migration difficulty.

    This would also work well with roles. 

    Right, each list can also be thought of as a role. /234/members is the relationship "members" for group 234, the ACL given to content visible only by the group, AND can be thought of as users who have the "member" role within group 234.

    Though I'd probably separate roles by a prefix: /234/role:moderator = users with moderator role within group 234.

    Let's say we want to allow access to object 23 to friends and followers of user 123, as well as members of group 234, as well as users 456 and 457

    So basically acl_grants_entity would link entity 23 to these lists:

    • /123/friends
    • /123/followers
    • /234/members
    • /456/user
    • /457/user

    If I were logged in as user 457, my query would have 2 dynamic ACL list_ids (/457/user and /1/logged_in) and the subquery would link my GUID to other list_ids. The query then lets me see anything in any of those lists.
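The lookup described above can be sketched in a few lines. This is an illustration, not the spec: the function names are hypothetical, a plain dictionary stands in for the stored list memberships that the subquery would read, and the members 500, 501 and 502 are made up.

```python
def list_ids_for_user(user_guid, stored_lists, site_guid=1):
    """Return every list id the user belongs to: two dynamic ones plus stored ones."""
    # Dynamic ACLs are computed on the fly and never stored.
    dynamic = {f"/{site_guid}/logged_in", f"/{user_guid}/user"}
    # Stored lists play the role of the subquery linking a GUID to list ids.
    stored = {list_id for list_id, members in stored_lists.items()
              if user_guid in members}
    return dynamic | stored


def can_see(user_guid, entity_list_ids, stored_lists):
    """The entity is visible if it is granted to any list the user is in."""
    return not list_ids_for_user(user_guid, stored_lists).isdisjoint(entity_list_ids)


# Made-up memberships for the stored lists in the example.
STORED_LISTS = {
    "/123/friends": {500},
    "/123/followers": {501},
    "/234/members": {502},
}

# The lists entity 23 is granted to, as in the example above.
ENTITY_23_LISTS = {
    "/123/friends", "/123/followers", "/234/members", "/456/user", "/457/user",
}
```

Here user 457 gets in through the dynamic /457/user list alone, while user 500 gets in through the stored /123/friends list.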

  • Why would we need two ACLs for people friended by user 123?

    I guess I don't see why the inverse_relationship bit is necessary.

  • Cleaning up your query a bit:

    SELECT DISTINCT e.*
    FROM entities e
    JOIN acl_grants_entity ag ON (e.guid = ag.entity_guid)
    WHERE ag.access_collection_id IN (
      SELECT id
      FROM access_collections acl
      WHERE (e.owner_guid = $user_guid)
         OR (acl.target_guid = 0)
         OR ($user_guid = acl.target_guid AND acl.relationship_name = '')
         OR EXISTS(
           SELECT 1
           FROM entity_relationships er1
           WHERE er1.guid_two = $user_guid
             AND er1.guid_one = acl.target_guid
             AND er1.relationship = acl.relationship_name
             AND acl.inverse_relationship = 0
         )
         OR EXISTS(
           SELECT 1
           FROM entity_relationships er2
           WHERE er2.guid_one = $user_guid
             AND er2.guid_two = acl.target_guid
             AND er2.relationship = acl.relationship_name
             AND acl.inverse_relationship = 1
         )
    );

    This is definitely intriguing. I suspect the inverse_relationship is just compensating for the fact that early stuff built on relationships used guid_one where they should've used guid_two.

  • Oh, sorry. I understood how your approach would implement this. I was just thinking out loud about my approach.

    Your approach basically suggests that all relationships are bilateral. In our current schema, 1:friend:2 translates to 1 friended 2; it does not translate to 2 friended 1. Thus inverse_relationship in my approach, and thus my comment about followers with regard to your approach. My understanding of dynamic is something that can be generated on the fly, e.g. retrieving a list of friends from a direct relationship.

     

  • ...but I think you may be right that we can't ignore the inverse issue; there's not always a right direction to store the relationship, and even if there is, it doesn't always match the way we'd use it as an ACL.

    In the case of friends, the system creates (123, friend, 234) when user 123 friends 234. This is confusing due to the name, but the direction makes sense as it matches the action. And in this case user 123 would want to use the ACL "/123/friend" to share stuff with people she has friended.

    Now consider groups: If 123 joins group 345, that creates (123, member, 345). And in this case you'd want the ACL "/345/member". The opposite direction.

    If we just pick one direction to use for ACLs, not only would we need to swap some existing data, but also every dev who wanted to make a relationship would need to stop and consider which direction it would be used as an ACL.

    So the pragmatic solution is, as you have, to store the ACL direction explicitly.
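A small sketch of what storing the direction explicitly buys, using the two cases above. The function name is hypothetical, and the meaning of the inverse flag follows the query earlier in the thread: without it the ACL target is guid_one, with it the target is guid_two.

```python
def acl_members(triples, target_guid, relationship, inverse):
    """Resolve the users covered by an ACL from (guid_one, name, guid_two) triples."""
    if inverse:
        # Target is guid_two, e.g. (123, member, 345) puts 123 in "/345/member".
        return {one for one, name, two in triples
                if name == relationship and two == target_guid}
    # Target is guid_one, e.g. (123, friend, 234) puts 234 in "/123/friend".
    return {two for one, name, two in triples
            if name == relationship and one == target_guid}


TRIPLES = [
    (123, "friend", 234),   # user 123 friended 234
    (123, "member", 345),   # user 123 joined group 345
]

friends_of_123 = acl_members(TRIPLES, 123, "friend", inverse=False)
members_of_345 = acl_members(TRIPLES, 345, "member", inverse=True)
```

Neither relationship has to be stored the other way round, and a dev adding a new relationship only chooses the direction when defining an ACL on it.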

  • Ah, I missed the "friends and followers" bit. That explains two separate ACLs for (123, friends) in opposite directions.

Feedback and Planning

Discussions about the past, present, and future of Elgg and this community site.