RESTful API Framework for Elgg

This is a little doc I put together about a new REST API framework I might build for Elgg. It's not fully fleshed out, but I wanted to get some feedback from those interested in getting involved.

Note that despite the use of "We" and third-person references to "Evan", I am the sole author of this document and it doesn't necessarily reflect the ideas of other developers. Just throwing it out there for feedback.

You can comment here or on the original Google Doc.

 

Summary

I want to expose a high-quality, standard REST API in Elgg that can be easily extended and/or customized by plugins and consumed by clients. This will help Elgg and Elgg-based sites stay relevant in our multi-client world (web, iOS, Android, etc.).

Background

Until 1.9, Elgg shipped with an API for defining what were supposedly “RESTful” web services. In practice, this API encouraged users to define more of an RPC API. We recently extracted it to the “web_services” plugin for Elgg 1.9 due to (perceived) lack of adoption.

 

There are existing efforts at a standard RESTful web services API, but those never seem to gain traction, perhaps because they are not RESTful in the first place, being built on Elgg’s previous API library.

 

Much of my proposed design is inspired by this article on the Resource-Method-Representation approach to designing APIs (as opposed to MVC). I recommend reading that first if you aren’t familiar with it.

Requirements

Testability

No global state or static methods. Ugh, so annoying that modern frameworks out there still do this. Never compromise on testability.

Validation

Input validation should be easy yet customizable. Controllers should never have to worry about validation rules -- the rules should be defined separately from the controller to avoid the need to override the controller just to change a validation rule. In other words, controllers only ever accept valid data and just focus on the business logic.

Permissions

Checking permissions should be decoupled and predictable for all possible requests. The client should not require deep knowledge of the backend resolution of permissions to present a useful UI to its users. E.g. clients need to know to hide the edit button if there is no chance the user can edit the given resource.

Intuitiveness

It should be really easy to understand how the system works just by glancing at it. Devs don’t want to learn something new, and our goal is not to be groundbreaking; it’s to get out of devs’ way and make them productive. This intuition should also lead devs to follow best practices when using the framework. This can be accomplished by strongly embracing standards.

Decoupling

The JS client should not need to know too much about the underpinnings of the API. In particular, clients should never have to know:

  • Whether a resource and subresource are connected by container/relationship/etc.

  • What the validation rules are for a certain method-resource combo.

 

It is acceptable if the client can make a request to a well-known endpoint in order to get this information, but the point is that the rules and/or logic should not be duplicated on clients and on the server.

Non-goals

URL renaming

It should be easy to add new RESTful resources (and therefore their associated URLs) to the system, but renaming existing ones is a non-issue because all of these URLs are meant for developers, not end-users.

Redirects

There is no point in doing these -- we expect all access to be Ajax-based because we don't send back HTML content from the API. If there is another, more appropriate URL for the given resource, that URL should be included in the response and marked as the canonical link for that resource.

 

Prior Art

Many many many PHP frameworks already exist that claim to be “RESTful.” Evan examined several of these in his research and found all of them to be lacking in one or more deal-breaking ways.

 

  • CodeIgniter

  • Doo

  • Epiphany

  • Fat-Free

  • Fuel

  • Kohana

  • Laravel

  • Limonade

  • PRADO

  • Recess

  • Restler -- most promising, but still slightly off what we’re looking for...

  • Slim

  • Symfony

  • Tonic

  • Zend

  • Yii

 

Common deal-breakers:

  1. Uses static functions or global state. Evan likes DI for straightforward testability.

  2. Knowledge of output format leaks into controllers (e.g. “renderHtml” methods).

  3. Controller methods don’t map directly to HTTP methods.

  4. Similar: Mapping method/URL pairs to functions, rather than mapping URLs to controllers.
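
Deal-breakers 3 and 4 can be illustrated with a small sketch in which URL patterns map to resource classes and the HTTP method is only resolved afterwards. The route table, `BlogsResource`, `BlogResource`, and `resolveRoute()` are hypothetical names made up for illustration; they are not an existing Elgg or Symfony API.

```php
<?php
// Hypothetical resource classes; each would expose get()/post()/put()/delete().
class BlogsResource {}
class BlogResource {}

// One entry per URL pattern. Note there is no HTTP method in the table:
// the method is chosen later by calling the matching method on the resource.
$routes = [
    '#^/blogs$#'       => 'BlogsResource',
    '#^/blogs/(\d+)$#' => 'BlogResource',
];

// Returns the matched class name and captured URL parameters, or null for no match.
function resolveRoute(array $routes, string $path): ?array {
    foreach ($routes as $pattern => $class) {
        if (preg_match($pattern, $path, $matches) === 1) {
            return ['class' => $class, 'params' => array_slice($matches, 1)];
        }
    }
    return null;
}
```

With this shape, adding a resource means adding one route, regardless of how many HTTP methods it supports.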

 

This one is the closest to Evan’s original vision that he’s come across:

https://github.com/BarFoo/Resource-Framework

 

Note that we should probably still utilize some other framework’s features under the hood of our API. E.g. we could use a Symfony router under the hood to do URL mappings, but it’s still nice to be in control of the API.

Detailed Design

Elgg REST API Quick Reference

 

The Big Idea: We’ll use a Resource-Method-Representation approach to strongly encourage RESTful best practices.

 

Each Resource is a PHP class whose methods map directly to HTTP methods:

 

namespace Elgg\Rest;

// One method per HTTP verb
class Resource {
 function get() {}
 function post() {}
 function put() {}
 function delete() {}
}
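
As a rough sketch of how a framework might call such a class, the dispatcher below looks up the method named after the HTTP verb and answers 405 when the resource does not define it. `PingResource`, `dispatch()`, and the array-shaped response are assumptions made for this example, not part of the proposal.

```php
<?php
// Hypothetical example resource: it only supports GET and POST.
class PingResource {
    public function get(array $input): array {
        return ['status' => 200, 'body' => ['ping' => 'pong']];
    }

    public function post(array $input): array {
        return ['status' => 201, 'body' => $input];
    }
}

// Dispatch a request to the resource method named after the HTTP method.
// Unknown verbs, or verbs the resource does not implement, yield 405.
function dispatch($resource, string $httpMethod, array $input = []): array {
    $allowed = ['get', 'post', 'put', 'delete'];
    $method = strtolower($httpMethod);
    if (!in_array($method, $allowed, true) || !method_exists($resource, $method)) {
        return ['status' => 405, 'body' => ['error' => 'Method Not Allowed']];
    }
    return $resource->$method($input);
}
```

The resource never inspects the request method itself; it only implements the verbs it supports.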

Content-type negotiation

Each resource will have exactly one canonical URL endpoint. This is because URLs are identity. The identity of resources should not change with representation or API version, so these factors should not be present in the resource’s URL.

 

Bad example:

https://api.elgg.org/v1/xml/resources/12345.xml

 

Good example:

https://api.elgg.org/resources/12345

 

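The bad URL contains superfluous information: the API version (v1) and the format, which appears twice (the xml segment and the .xml extension). The good URL identifies the same logical resource; only the representation varies, based on the content type the client requests via the Accept header, not on the URL.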
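
To make the idea concrete, here is a minimal sketch of choosing a representation from the Accept header while the URL stays the same. `negotiate()` is a hypothetical helper; it is deliberately simplified (it honors q-values and the catch-all range, but ignores partial wildcards such as `text/*`) and assumes `$supported` is non-empty.

```php
<?php
// Pick the best supported media type for an Accept header, or null if none match.
// Simplified: honors q-values and the catch-all range, ignores other parameters.
function negotiate(string $accept, array $supported): ?string {
    $best = null;
    $bestQ = 0.0;
    foreach (explode(',', $accept) as $range) {
        $parts = array_map('trim', explode(';', $range));
        $type = strtolower(array_shift($parts));
        $q = 1.0;
        foreach ($parts as $param) {
            if (strpos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        // The catch-all range means "anything": fall back to our first choice.
        if ($type === '*/*') {
            $type = $supported[0];
        }
        if ($q > $bestQ && in_array($type, $supported, true)) {
            $best = $type;
            $bestQ = $q;
        }
    }
    return $best;
}
```

A null result would typically translate into a 406 Not Acceptable response.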

 

Input/model validation

Having a RESTful API means that input validation directly maps to model validation, because the inputs are representations of the model. We will strive to reduce code duplication by specifying the validation rules for models in a single, central location as opposed to validating inputs for each separate controller.

 

The model validation logic will be HTML5-inspired and JSON-based so that it can be easily shared with clients for generating forms that are synced with the server-side rules. For example, the following rules would specify that a user’s name must be at least 2 characters long and is a plain text value (as opposed to HTML).

 

{
 "user": {
   "name": {
     "type": "text",
     "minlength": 2
   }
 }
}
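
A minimal sketch of how the server could apply such rules before a controller ever runs. `validate()` is a hypothetical function, and two things are assumptions of this example rather than decisions in the proposal: "text" is interpreted as "contains no HTML tags", and length is counted in bytes for simplicity.

```php
<?php
// Rules mirror the JSON example above; in practice they would live in one shared file.
$rulesJson = '{"user": {"name": {"type": "text", "minlength": 2}}}';
$rules = json_decode($rulesJson, true);

// Returns a list of error messages; an empty list means the input is valid.
function validate(array $rules, string $model, array $input): array {
    $errors = [];
    foreach ($rules[$model] ?? [] as $field => $rule) {
        $value = $input[$field] ?? '';
        if (!is_string($value)) {
            $errors[] = "$field must be a string";
            continue;
        }
        // strlen() counts bytes; a real implementation would count characters.
        if (isset($rule['minlength']) && strlen($value) < $rule['minlength']) {
            $errors[] = "$field must be at least {$rule['minlength']} characters";
        }
        // Assumption: "text" means the value contains no HTML markup.
        if (($rule['type'] ?? '') === 'text' && $value !== strip_tags($value)) {
            $errors[] = "$field must be plain text";
        }
    }
    return $errors;
}
```

Because the rules are plain data, the same JSON could be served to clients for form generation without duplicating any logic.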



Permissions

Permissions have their own history in Elgg, and there have recently been some attempts to reboot them:

 

The problem with these, as I see it, is that there is a (strong) possibility that they can’t handle every case. If we approach this from the client perspective, we can boil permissions/validation/etc. down to a simple question:

 

If I make this request, will it succeed (barring server failure, etc.)?

 

In other words, we need a “dry run” mode. We provide a way to make a request against a mock database, but still run the same validation and permissions hooks, just to see what would happen. Nothing is saved, no emails are sent, no notifications are sent, but we get a response back that indicates whether the request is valid. That would be the ultimate permissions system because it gives clients a standard way to ask for what they actually care about (will my request succeed) without understanding anything further about the system.

 

For example, if you want to test if this request will work:

POST https://api.elgg.org/example/resource

{
 "resource": "body"
}

 

You would issue this request:

POST https://api.elgg.org/dry-run

{
 "method": "POST",
 "url": "https://api.elgg.org/example/resource",
 "body": { "resource": "body" }
}
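
One way to sketch the dry-run idea: route all writes through a storage interface, and have the dry-run endpoint replay the described request against a store that discards everything. `Store`, `MemoryStore`, `NullStore`, `handlePost()`, `dryRun()`, and the `can_post` flag are all hypothetical names invented for this example.

```php
<?php
// Hypothetical storage abstraction so the same code path can run with or without side effects.
interface Store {
    public function save(array $record): void;
}

// Stand-in for the real database.
class MemoryStore implements Store {
    public $records = [];

    public function save(array $record): void {
        $this->records[] = $record;
    }
}

// Dry-run store: accepts writes and discards them.
class NullStore implements Store {
    public function save(array $record): void {}
}

// Hypothetical handler: runs permission and validation checks, then writes.
function handlePost(Store $store, array $user, array $body): array {
    if (empty($user['can_post'])) {
        return ['status' => 403];
    }
    if (!isset($body['resource']) || $body['resource'] === '') {
        return ['status' => 400];
    }
    $store->save($body);
    return ['status' => 201];
}

// The dry-run endpoint replays the described request against a store that saves nothing.
function dryRun(array $user, array $described): array {
    return handlePost(new NullStore(), $user, $described['body']);
}
```

The important property is that the real request and the dry run share one code path, so their answers cannot drift apart.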

 

Alternative approach: GET for resource returns which other requests are valid:

 

GET /example/resource

Host: api.elgg.org

 

{
 "actions": {
   "like": {
     "can": true,
     "method": "POST",
     "url": "/example/resource/likes"
   },
   "unlike": {
   },
   "delete": false
 }
}
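
A minimal sketch of how a server might assemble such an "actions" block, assuming each resource exposes optional `can<Method>()` checks. `ExampleResource`, `describeActions()`, and the user flags are hypothetical, and the sketch covers only plain HTTP methods, not named actions like "like".

```php
<?php
// Hypothetical resource with permission checks named after HTTP methods.
class ExampleResource {
    public function canPost(array $user): bool {
        return !empty($user['logged_in']);
    }

    public function canDelete(array $user): bool {
        return !empty($user['is_admin']);
    }
}

// Build an "actions" map by asking the resource about each write method it knows.
function describeActions($resource, array $user, string $url): array {
    $actions = [];
    foreach (['post', 'put', 'delete'] as $method) {
        $check = 'can' . ucfirst($method);
        if (method_exists($resource, $check)) {
            $actions[$method] = [
                'can' => $resource->$check($user),
                'method' => strtoupper($method),
                'url' => $url,
            ];
        }
    }
    return $actions;
}
```

A client could then hide the delete button whenever `actions.delete.can` is false, without knowing why.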



Strawman example for enforcing permissions on the server:

 

```php
class Blogs_Resource {
 function canPost($user, $input) {
   // return boolean for whether or not user has permission
   // Can we return a more structured response that would tell
   // the user how to gain the needed permissions?
 }

 function post($input) {
   // Permissions and input validation are complete at this point
   // Just focus on business logic
   // Inputs still require escaping before inserting into DB
 }
}
```
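
Building on the strawman, the sketch below shows one possible answer to the "more structured response" question: the check returns either `true` or an array describing the denial, and the framework runs it before the business logic. `BlogsResource`, `guardedCall()`, the `can_blog` flag, and the shape of the denial array are assumptions for illustration only.

```php
<?php
class BlogsResource {
    // Returns true, or an array explaining the denial (a hypothetical structured form).
    public function canPost($user, array $input) {
        if (empty($user['can_blog'])) {
            return ['reason' => 'blogging_disabled', 'hint' => 'Ask an admin to enable blogging.'];
        }
        return true;
    }

    // Only reached once the permission check has passed.
    public function post(array $input): array {
        return ['status' => 201, 'body' => $input];
    }
}

// Run the matching can<Method>() check, if any, before calling the method itself.
function guardedCall($resource, string $method, $user, array $input): array {
    $check = 'can' . ucfirst($method);
    if (method_exists($resource, $check)) {
        $verdict = $resource->$check($user, $input);
        if ($verdict !== true) {
            return ['status' => 403, 'body' => $verdict];
        }
    }
    return $resource->$method($input);
}
```

Keeping the check outside `post()` means a plugin could replace the permission rule without overriding the business logic.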

 

Ideas for future improvements

Content type negotiation

It’s possible to send/request various formats of information to/from a web server using HTTP’s Accept and Content-Type headers. The first iteration of the API will focus on speaking a custom JSON format; later on, support can be added for accepting and exporting the data in other formats, especially standard formats such as JSON-LD, ActivityStreams, RSS, Atom, etc. This would be done in such a way that business logic written against the v1 API doesn’t need to change at all.

 

Partial responses/updates

It should eventually be possible for the client to request partial representations of any resource in a standard way. The framework would completely handle this in such a way that API authors never have to worry about the fact that a partial response is being requested.

 

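Similarly, it would be nice to support partial updates, that is, sending only the fields of the resource that have changed instead of PUTing the whole thing. HTTP’s PATCH method (RFC 5789) exists for partial modifications, but there doesn’t seem to be a single agreed-upon REST convention for the request body, as far as Evan is aware as of Jan 12, 2014.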

 

Both of these are good for:

  • bandwidth-constrained environments

  • when the resource representations are otherwise huge

 

That being said, neither partial updates nor partial responses are actually necessary for v1.
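
For a sense of how little the framework would need to do in the simplest case, here is a sketch of both ideas on flat, array-shaped representations. `selectFields()` and `applyPartialUpdate()` are hypothetical helpers, and how the client names the fields it wants (for example, a query parameter) is left open.

```php
<?php
// Partial response: keep only the requested top-level fields.
// An empty field list means "return everything".
function selectFields(array $representation, array $fields): array {
    if ($fields === []) {
        return $representation;
    }
    return array_intersect_key($representation, array_flip($fields));
}

// Partial update: overwrite only the fields the client sent, keep the rest.
function applyPartialUpdate(array $stored, array $changes): array {
    return array_replace($stored, $changes);
}
```

Since both run outside the resource class, API authors would keep returning and accepting full representations.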

 

Pushing data / piggybacking requests

For the sake of performance, the server should eventually be able to push responses to the client for requests it thinks the client is likely to make in the near future. This is most useful when it could cut down on RTTs for the client. Network time is precious.

 

Use case 1: Permissions

If a client GETs a resource, the client is very likely going to want to know whether the current user can POST or PUT to that resource as well.

 

Use case 2: Subresources

Similarly, if a client GETs a resource, there may be a subresource that is commonly requested that the server could push to the client, along with permissions about whether the user is able to put/post to those subresources.

 

Concrete Example: The river

The river displays the latest happenings on the site. When items in the river are loaded, you very commonly might want to know:

  • Which of the items shown have I liked (/users/me/likes/12345?)

  • What are the latest comments on the returned objects (/river/12345/comments)

  • Which of my close friends have liked it (/river/12345/likes?sort=actor.affinity)

  • Am I allowed to comment on a post, join a group, etc...

  • Can I post to the river, if so, what can I post?

 

The problem is, each of these things requires the client to know the ID of the item in question before it can request information about it. Instead of waiting for another RTT, the server can just push that info to the client and hopefully speed up the user experience in doing so.
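
A rough sketch of what the client side of this could look like, assuming the server attaches pre-answered follow-up requests to a response under a `pushed` key indexed by URL. The envelope shape and `fetchOrReuse()` are made up for this example.

```php
<?php
// Answer from pushed data when the server already included it;
// otherwise fall back to a real request via the supplied callable.
function fetchOrReuse(array $response, string $url, callable $fetch) {
    if (array_key_exists($url, $response['pushed'] ?? [])) {
        return $response['pushed'][$url];
    }
    return $fetch($url);
}

// Hypothetical river response: the item plus one piggybacked subresource.
$response = [
    'body' => ['id' => 12345],
    'pushed' => [
        '/river/12345/comments' => [['text' => 'Nice!']],
    ],
];
```

Clients that ignore the `pushed` key would still work; they would simply pay the extra round trips.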

 

While this all seems like it would be really useful, it should be justified with real-life data and have a real-life user asking for it before we put the effort into implementing something that could be very complex. Premature optimization is the root of all evil, so we’re punting on this feature (for now).

 
