I'm working on an Elgg (1.11.2) backend intended for use as an API for a mobile client. The problem I'm having is the "sanitization" of input. When sending messages through a web service, characters such as < or > become &lt; and &gt; and get stored in the database as such. This was obviously done with HTML in mind; however, this practice of storing sanitized data in the database is generally shunned* even in common HTML use cases. In my case (displaying messages on Android), the messages are not even rendered as HTML, so they are displayed incorrectly.
For now, I have disabled the htmLawed plugin; however, that's not a solution. While my code might then do its own sanitization when needed, I cannot rely on core code doing the same. Also, this might work if I were only using the web services API, but I also use the web interface as an admin backend, and I need the core code to be safe. Another solution might be to de-sanitize every time data is loaded from the database to serve it to the mobile client, but that's more of a hack than a solution. If I could somehow disable it only for web services, that would be an acceptable solution (although still not good for the framework as a whole), although I would still likely need to mess with the core WS code.
* A cursory search for "htmlspecialchars in database" (since htmlspecialchars performs a similar function to the htmLawed plugin) gives plenty of results advising against storing sanitized data in the database:
http://stackoverflow.com/a/4882317/1358631
http://stackoverflow.com/a/28859383/1358631
http://stackoverflow.com/a/7245495/1358631
http://stackoverflow.com/a/14148954/1358631
Should I open this up as an issue on github?
- Juho Jaakkola (@juho.jaakkola)
- Miloš Milutinović (@knezmilos13)
- Juho Jaakkola (@juho.jaakkola)
- Miloš Milutinović (@knezmilos13)
- Steve Clay (@steve_clay)
- ihayredinov (@ihayredinov)
This has been discussed at length here: https://github.com/Elgg/Elgg/issues/561
So, if I'm reading this right, this will not be solved in the foreseeable future, or ever, because of backwards compatibility with old plugins and to avoid problems with old data in the database? What would you suggest then as the best solution that doesn't entirely disable htmLawed? I'm now thinking of disabling sanitization by changing the get_parameters_for_method function in web_services/lib/web_services.php so that it does not use get_input.
Yes, I'm afraid this won't be solved any time soon.
Perhaps you could look into altering the htmLawed configuration using the ['config', 'htmlawed'] plugin hook.
Your plugin hook handler would return a different configuration depending on e.g. the context:
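A handler along these lines might look like the sketch below. It assumes the htmlawed plugin passes its config array as the hook's return value (the third handler argument). The helper name webservices_htmlawed_config is hypothetical, and the deny_attribute override is only a placeholder to show the mechanism, not a setting known to change how < and > are escaped.

```php
<?php
// Hypothetical helper: pick the htmLawed config for the current request.
// $overrides holds whatever settings you want for API requests only.
function webservices_htmlawed_config(array $config, $in_api_context, array $overrides) {
	if (!$in_api_context) {
		// Web interface: leave the default config untouched.
		return $config;
	}
	// API request: later keys win, so $overrides replaces matching defaults.
	return array_merge($config, $overrides);
}

// Plugin hook handler with the usual Elgg 1.x signature.
// Assumption: $config is the array the htmlawed plugin is about to use.
function alter_htmlawed_for_webservices($hook, $type, $config, $params) {
	// Placeholder override, for illustration only.
	$overrides = array('deny_attribute' => 'class, on*, style');
	return webservices_htmlawed_config((array) $config, elgg_in_context('api'), $overrides);
}

// Register only when running inside Elgg, so the file also loads standalone.
if (function_exists('elgg_register_plugin_hook_handler')) {
	elgg_register_plugin_hook_handler('config', 'htmlawed', 'alter_htmlawed_for_webservices', 500);
}
```

Keeping the decision in a plain function makes it possible to check the behaviour without a running Elgg site; the handler itself only supplies the context check.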
I haven't used contexts before, but I think I've got it - so I just do this to register my plugin hook:
elgg_register_plugin_hook_handler('config', 'htmlawed', 'alter_htmlawed_for_webservices', 500);
And I found this in the web services plugin, so I just need to check for the 'api' context:
elgg_set_context('api');
The only remaining thing is to find an htmLawed config that makes it do nothing. I'll probably try this out, thanks for the help!
I think in an Elgg installation it's insane to disable htmLawed filtering on any content. There are just too many plugins available that assume content is safe HTML.
Using something like elgg_html_decode seems the least bad solution.
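Decoding on output for the mobile client could be sketched as follows. This standalone version uses PHP's html_entity_decode so it runs outside Elgg; inside Elgg you would presumably call elgg_html_decode on the stored text instead. The function name decode_for_mobile is hypothetical.

```php
<?php
// Hypothetical helper: turn stored, entity-encoded text back into plain text
// before sending it to a client that does not render HTML.
function decode_for_mobile($stored) {
	// ENT_QUOTES also decodes &quot; and &#039;.
	return html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
}

echo decode_for_mobile('1 &lt; 2 &amp;&amp; 3 &gt; 2'), "\n"; // prints: 1 < 2 && 3 > 2
```

The idea is to decode only at the output boundary for non-HTML clients, so the stored data stays filtered for the web interface and for plugins that assume safe HTML.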
FYI, if you use contexts, use push/pop instead of set.
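A minimal sketch of the push/pop pattern, assuming Elgg's elgg_push_context and elgg_pop_context and PHP 5.5+ for try/finally. The wrapper name run_in_context is hypothetical.

```php
<?php
// Hypothetical helper: run $callback with $context pushed onto Elgg's
// context stack, and always restore the previous context afterwards.
function run_in_context($context, callable $callback) {
	elgg_push_context($context);
	try {
		return $callback();
	} finally {
		// Runs even if $callback throws, so the stack never stays unbalanced.
		elgg_pop_context();
	}
}
```

Unlike elgg_set_context, which overwrites the current context, pushing and popping leaves whatever context was active before the call intact.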
This particular issue is the by-product of weird stuff that is done with the input in the web-services plugin, which instead of relying on input sanitization, builds input into a query string and then parses it.
https://github.com/Elgg/Elgg/blob/2.x/mod/web_services/lib/web_services.php#L233