API for serving files from filestore

A question for plugin developers. I am working on an API that will allow us to serve any asset from the filestore (inline or as an attachment). It is currently a plugin, but once I have received enough feedback, I will make a PR to core: https://github.com/hypeJunction/Elgg-proxy

The new API will have a single handler, so plugins will no longer need to implement their own handlers à la icondirect.php. The URL of the handler will include the path to the file relative to dataroot, so this will ultimately expose the directory structure of the filestore. Does anyone use sensitive or identifying user information as directory names on the filestore? Any opposition to such an approach?

The URL will be something like:

https://hj.dev/mod/proxy/e0/l1447268209/di/kc/KMlO-oRuC9q-9Bepu6zYRKYAPJLEp2e2-SaFXQm1oL4/1/43/ads/1447268210giphy.gif

e = expiration time (0 for no expiration)

l = last modified time

d = disposition (inline or attachment)

HMAC signature of the request

path to file on filestore
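
To make the layout above concrete, here is a minimal sketch that splits the part of the example URL after /mod/proxy/ into its components. The function name is hypothetical and not taken from the plugin. The "kc" segment is not explained in the post, so the sketch keeps it as an opaque value, and it assumes "i" stands for inline and anything else for attachment.

```php
<?php
// Hypothetical helper, not part of the plugin: splits the part of the URL
// that follows /mod/proxy/ into the components listed above.
function example_parse_proxy_path(string $path): array {
    // The first five segments are parameters; the remainder is the filestore path.
    $segments = explode('/', trim($path, '/'), 6);
    if (count($segments) < 6) {
        throw new InvalidArgumentException('Not enough URL segments');
    }
    list($expires, $modified, $disposition, $flag, $signature, $file) = $segments;

    return [
        'expires' => (int) substr($expires, 1),        // "e0" => 0 (no expiration)
        'last_modified' => (int) substr($modified, 1), // "l1447268209" => 1447268209
        // Assumption: "i" means inline, anything else means attachment
        'disposition' => substr($disposition, 1) === 'i' ? 'inline' : 'attachment',
        'flag' => $flag,           // "kc": not explained in the post, kept as-is
        'signature' => $signature, // HMAC signature of the request
        'path' => $file,           // path relative to dataroot
    ];
}

$parts = example_parse_proxy_path(
    'e0/l1447268209/di/kc/KMlO-oRuC9q-9Bepu6zYRKYAPJLEp2e2-SaFXQm1oL4/1/43/ads/1447268210giphy.gif'
);
echo $parts['path'], "\n"; // 1/43/ads/1447268210giphy.gif
```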

 

  • i don't recall seeing any plugins that use sensitive data in that way, no.

    are you intending to add security / authorisation to this too? to ensure that privacy is maintained for assets?

  • The HMAC is used to prevent access to another file? So HMAC x gives access only to file x, and HMAC y only to file y?

    If you eventually make a PR for core, will it include an option to disable this feature? For now it's a plugin, so I can choose to enable or disable it as I or my client want.

    The whole point of dataroot was to have Elgg be in control of access to the files. If this is still valid, go for it!

  • There are two parameters that are used:

    1. Expiration time - you can pass an expiration time for the URL, so if anyone attempts to access the file after that time, they will get a 403 page

    2. Cookie - you can build a URL that relies on the current session cookie, so if anyone attempts to access the file after their current session has expired, they will get a 403 page

    So, for file downloads, you would generate a link with a short expiration time and a cookie. If the file's access ID changes, there may be a short window in which the user can still access the file, but the reasoning is that once served, the file would have been downloaded or cached by the browser anyway.
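
The two checks described in this reply could look roughly like the sketch below. It is not the plugin's code: the function name is hypothetical, and it assumes a cookie-bound URL remembers the session ID it was generated for, which the post does not spell out.

```php
<?php
// Hypothetical check, not the plugin's code: decides between 200 and 403
// from the two parameters described above.
function example_proxy_status(int $expires, ?string $bound_session, ?string $current_session, int $now): int {
    // An expiration time of 0 means the URL never expires
    if ($expires !== 0 && $now > $expires) {
        return 403;
    }
    // Assumption: a cookie-bound URL remembers the session it was built for
    if ($bound_session !== null && $bound_session !== $current_session) {
        return 403;
    }
    return 200;
}

$now = 1447268209;
// Download link: short expiration (5 minutes) plus session cookie
echo example_proxy_status($now + 300, 'abc', 'abc', $now + 60), "\n";  // 200: still valid, same session
echo example_proxy_status($now + 300, 'abc', 'abc', $now + 600), "\n"; // 403: expired
echo example_proxy_status($now + 300, 'abc', null, $now + 60), "\n";   // 403: session is gone
```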

  • HMAC signs the entire request, including the path, expiration time, disposition, and whether the cookie is used. If any of those variables change, the URL becomes invalid because the HMAC will not validate.

    Perhaps the initial dataroot design made sense, but nowadays I feel that actual file entities account for a very small percentage of what it is being used for. There is too much boilerplate in how we serve files from the store, and that results in wrong headers, security issues etc.

    You would have to use the API explicitly, i.e. call elgg_proxy_get_url(). So if you don't want to use that API in a plugin, you don't have to, and you can write your own handler. I imagine over time all core will be migrated to a standard approach (if this API makes it to core).

    This is the first step to creating a unified API for handling entity media, i.e. icons, covers etc.
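
The signing described at the start of this reply can be sketched with PHP's built-in hash_hmac() and hash_equals(). Everything else here is an assumption for illustration: the function names, the secret, the field order and separator, and the choice of SHA-256 with URL-safe base64 (which yields 43 characters, the same length as the signature in the example URL, but the post does not confirm the algorithm).

```php
<?php
// Hypothetical helpers; the secret, field order and separator are assumptions.
function example_sign_request(string $secret, int $expires, int $modified, string $disposition, bool $use_cookie, string $path): string {
    $data = implode('|', [$expires, $modified, $disposition, (int) $use_cookie, $path]);
    // Raw SHA-256 HMAC, encoded as URL-safe base64 without padding
    $mac = hash_hmac('sha256', $data, $secret, true);
    return rtrim(strtr(base64_encode($mac), '+/', '-_'), '=');
}

function example_verify_request(string $secret, string $signature, int $expires, int $modified, string $disposition, bool $use_cookie, string $path): bool {
    $expected = example_sign_request($secret, $expires, $modified, $disposition, $use_cookie, $path);
    // hash_equals() compares the two strings in constant time
    return hash_equals($expected, $signature);
}

$secret = 'site-secret-for-illustration';
$sig = example_sign_request($secret, 0, 1447268209, 'inline', true, '1/43/ads/1447268210giphy.gif');

// Same parameters: the signature validates
var_dump(example_verify_request($secret, $sig, 0, 1447268209, 'inline', true, '1/43/ads/1447268210giphy.gif')); // bool(true)
// Different file: the signature for file x does not validate for file y
var_dump(example_verify_request($secret, $sig, 0, 1447268209, 'inline', true, '1/43/ads/other.gif')); // bool(false)
// Different disposition: also rejected
var_dump(example_verify_request($secret, $sig, 0, 1447268209, 'attachment', true, '1/43/ads/1447268210giphy.gif')); // bool(false)
```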

  • i'm all for unifying these types of features as much as possible, sounds good.

  • Also, whenever file/entity access changes, we can use touch() on all assets to change modified time, that will invalidate all generated URLs.
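
A small sketch of why touch() invalidates existing URLs, assuming the handler compares the "l" (last modified) value from the URL with the file's current modification time. The function name is hypothetical; touch(), filemtime() and clearstatcache() are standard PHP functions.

```php
<?php
// Hypothetical check: the "l" value signed into the URL must still match
// the file's current modification time.
function example_url_is_current(string $file, int $signed_mtime): bool {
    clearstatcache(true, $file); // PHP caches stat() results within a request
    return filemtime($file) === $signed_mtime;
}

$file = tempnam(sys_get_temp_dir(), 'proxy');
touch($file, 1447268209);   // modification time when the URL was generated
$signed_mtime = 1447268209; // the "l" segment of that URL

var_dump(example_url_is_current($file, $signed_mtime)); // bool(true)

touch($file, 1447268209 + 60); // access changed: bump the modified time
var_dump(example_url_is_current($file, $signed_mtime)); // bool(false)

unlink($file);
```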

  • Thanks for the explanation. I say go for it ;)

  • "this will ultimately expose the directory structure on filestore"

    Maybe to be more clear, only files explicitly published through this plugin API would be exposed, and with those files, their paths from dataroot.

  • I'd like the API to designate some predefined scenarios with sensible defaults, and allow configuring from there. E.g.

    <?php
    
    // returns Elgg\FileService\File that tracks session cookie and expires in 1hr
    $file_svc = elgg_get_download_service($path);
    
    $file_svc->setExpires(86400); // allow 24hr 
    $file_svc->getUrl();
    
    
    // long expires, do not track cookie
    $file_svc = elgg_get_avatar_service($path);
    
    // hidden group, let's be paranoid and require session cookie
    $file_svc->bindSession(true);
    

    Some other common scenarios?

  • one that i have encountered recently and that was an issue was the need for a way to store temporary files for a length of time that do not explicitly connect to an existing entity. e.g. a thumbnail for a bookmark that has not yet been created as an elgg entity. i attached this to the user object, but my implementation is hacky. a way to handle temporary files reliably would be helpful.

Feedback and Planning

Discussions about the past, present, and future of Elgg and this community site.