optimising elgg 1.8 with nginx: sharing experiences; 3+ years of exploration & refinement

when i first installed elgg, way back in v1.7.4 - i was unaware of the extent to which i would need to learn to optimise servers just to be able to run elgg in a useful way.

i started with shared hosting and after using several different hosts i realised that i just couldn't get the performance i needed, even just for my own testing, let alone for actual production use. so i moved to a small vps - with 512MB RAM and 1 cpu. at this point i realised that apache was the bottleneck and thankfully nginx had stabilised and was ready for use. so i switched. there was no official elgg configuration for nginx at that point, however i found one on a blog and it worked.

the site was running smoother and faster than with apache, with less hardware use, though the performance was still not great and i didn't know why. i assumed that the elgg code would be optimised to use best practices for server/browser caching and that the structuring of page templates and plugin flow would use similar best practices for the layout of css and javascript. however, my assumption was incorrect (reminding me that to assume makes an ass out of u and me!). at that point though, i was more focussed on building the site theme and the design aspects, so performance was not high on my list of priorities. recently i have moved away from design and have focused on making changes to ensure the site runs quickly and reliably and sharing some of what i have learned is the purpose of this page.

when i started optimising about 6 months ago, i was commonly seeing pages take more than 5 seconds to load, sometimes 10 and sometimes 20 or more! i had no explanation for this since there was no simple tool available by default (that i knew of) with nginx & php that allowed me to view the causes of slow performance on the server.

 

newrelic - php/server benchmarking

i found the php / server analysis tool from http://newrelic.com/ and used their free trial period to locate bottlenecks in the code. these were PHP issues in the elgg code of certain plugins (mainly videolist and some other plugins as i recall - including my own 'related items' plugin). newrelic allowed me to drill down into quite low level detail to find the sources of the slow areas in the code. i found some major delays were occurring with thumbnail generation for videos and applied a fix. after fixing bugs and poor design decisions in 5 or 6 plugins, plus removing some other plugins that were not necessary, the pages were loading about 50% faster, but still not fast enough. (n.b. i just found that zend have a service that might be a free equivalent to newrelic and i will test it soon, once they release an update that is php5.5 compatible).

 

hardware

around this point i upgraded to 1GB of RAM on the server and found that more of the slowdown had been coming from me running nginx, PHP, anti-virus, stat tracking (piwik) & email server all on that one VPS with not enough RAM to cover it all. with the extra RAM capacity the site sped up some more, though was still not fast enough for enjoyable, multi-user production use.

 

caching

after the free trial of newrelic ran out i shifted my approach and looked more closely at the server configuration and caching. i began by testing varnish cache, which did speed the site up nicely and without any configuration being done by me at all. i realised though that nginx is supposed to have a similar ability to varnish so i sought instead to not use varnish (i later discovered that varnish doesn't support https connections so it would have been no use to me anyway without another layer being added to translate - which is inefficient).

i had not previously realised that the default settings for elgg (plus plugins) and also for nginx do not apply helpful caching headers to the various file types we are using on our websites and so most files were being reloaded on every page load, which is highly inefficient! after learning the syntax for nginx configuration files i applied sensible caching to js/css/images/html and all the other types of files that my elgg site is serving. this made another dramatic improvement to speed of use.
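
to illustrate the kind of rule described above, here is a minimal sketch of an nginx location block that sets long-lived caching headers on static files. the extension list and the 30 day lifetime are assumptions chosen for the example, not the exact settings used on this site. files that elgg serves dynamically through php (such as its cached js/css views and icons) will not be handled correctly by a plain static rule like this and need their own treatment.

```nginx
# sketch only: long-lived browser caching for static assets.
# the extension list and the lifetime are illustrative assumptions.
location ~* \.(?:css|js|png|jpe?g|gif|ico)$ {
    expires 30d;                        # sets Expires and Cache-Control: max-age
    add_header Cache-Control "public";  # allow shared caches to store the file
    access_log off;                     # optional: skip logging for static hits
}
```

a shorter lifetime is the safer choice for any file whose url does not change when its content changes.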

after upgrading to PHP5.5 (which i recommend for performance, security and reliability), i enabled and configured the opcode cache that is bundled with it (OPcache). this provides a level of caching within the server that reduces the amount of PHP processing by keeping compiled scripts in memory, so that commonly run code does not have to be parsed and compiled again on every request. this improved performance perhaps by 30-40% in some cases.

i have now also applied 2 levels of caching in nginx directly. microcaching is used in some areas, so pages are cached for several seconds - ensuring that refreshes and concurrent requests can be made from multiple visitors without incurring extra overhead, while maintaining near real-time data availability from elgg. file caching in nginx is also enabled (using the open_file_cache series of server directives).
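
the two levels of nginx caching described above might look something like the following. this is a sketch under stated assumptions rather than the configuration running on this site: the cache path, the zone name 'microcache', the sizes and the 5 second lifetime are all made up for the example, and the session cookie name 'Elgg' is assumed, so check the name your own site sets.

```nginx
# sketch only: these directives go in the http {} context.
# the path, zone name and sizes are illustrative assumptions.
fastcgi_cache_path /var/cache/nginx/microcache levels=1:2
                   keys_zone=microcache:10m max_size=100m inactive=10m;
fastcgi_cache_key  "$scheme$request_method$host$request_uri";

# file caching: keep descriptors and metadata of frequently used files in memory
open_file_cache          max=2000 inactive=20s;
open_file_cache_valid    30s;
open_file_cache_min_uses 2;
open_file_cache_errors   on;

# sketch only: these directives go inside the existing location that
# passes requests to php (next to fastcgi_pass).
fastcgi_cache           microcache;
fastcgi_cache_valid     200 5s;                   # keep good responses for a few seconds
fastcgi_cache_use_stale updating error timeout;   # serve a stale copy while refreshing

# assumption: visitors who carry a session cookie named "Elgg" skip the cache
fastcgi_cache_bypass $cookie_Elgg;
fastcgi_no_cache     $cookie_Elgg;
```

note that nginx will not store a response that carries a Set-Cookie header or no-cache style headers unless it is told to ignore them with fastcgi_ignore_headers, and doing that carelessly can leak one visitor's page to another, so test any microcache with logged-in and logged-out sessions before relying on it.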

i also explored using cloudflare and incapsula for a while as CDNs and thus globally cached data was available in different continents for my site - even though i only use one server for elgg. i found this was helpful for speed, however i also found that since i value privacy and security for my site's purpose, i was not willing to add the potential security risk of essentially 'employing a doorman/security guard' for my site who would not let me see, moment to moment, the exact details of what it is doing. so i no longer use these CDN services, though they may be of use to you.

encryption / ssl

on the topic of privacy/security, i chose to encrypt the entire site recently and saw that this also slowed the site - mostly on connection and not so much after that. after exploring various options i created a key and certificate request using openssl and had the certificate signed by startssl.com for free. thus i now have a free method of encrypting my site and the site has the 'green safe logo / padlock' for all pages (i needed to make sure that all links in the site code and database point to https links and not http). after reading some tutorials on this and also after listening to many of the 'black hat' conference talks via youtube i realised that ssl has been mostly broken for a long while and that a lot of the information online is either incorrect or is deliberate mis-information, so i needed to experiment and explore more.

i found that i needed to reduce the number of certificates in the ssl chain to be as small as possible to optimise the ssl connection time. this page offered some extra tips for optimising nginx with ssl (http://www.igvita.com/2013/12/16/optimizing-nginx-tls-time-to-first-byte/). i also included many settings in the nginx setup to optimise the tls encryption process.
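
as an example of the sort of tls settings that article discusses, here is a minimal sketch for the https server block. the values, the certificate path and the resolver address are assumptions for illustration, not the settings used on this site.

```nginx
# sketch only: inside the https server {} block. values are illustrative assumptions.
ssl_session_cache   shared:SSL:10m;   # let returning visitors resume a session
ssl_session_timeout 10m;              # instead of repeating the full handshake
ssl_buffer_size     4k;               # smaller tls records can lower time to first byte
                                      # (directive available from nginx 1.5.9)

ssl_stapling            on;           # send the ocsp response with the handshake
ssl_stapling_verify     on;
ssl_trusted_certificate /etc/nginx/ssl/ca-chain.pem;  # hypothetical path
resolver                127.0.0.1;    # assumption: a local dns resolver is running

ssl_prefer_server_ciphers on;         # the server's cipher order wins
```

session reuse and ocsp stapling both aim at the same thing as shortening the certificate chain: fewer round trips and fewer bytes before the first page data arrives.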

after using the ssl testing tool at qualys ssl labs and a few weeks of learning about and tweaking the ssl configuration, i now see my site has an A rating (https://www.ssllabs.com/ssltest/analyze.html?d=infiniteeureka.com) for encryption. forward secrecy is the practice (as i comprehend it currently) of using a fresh, temporary key for each session, such that if the server's long-term key is broken at any point then the previously transmitted encoded messages will not automatically become accessible - the same level of code breaking is required for each session. i am no expert in this since i do not code encryption cyphers directly - however, i see now that most browsers can connect to my site with forward secrecy and thus the security is, in that sense, comparable to the busiest encrypted sites on the web.

compression

the use of gzip compression went a long way to reduce the size of the pages on my site. through tweaking the settings in nginx (which are widely available online) i reduced the page size by at least 50%.
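
a minimal sketch of such gzip settings follows. the compression level, the minimum length and the list of types are assumptions for the example and not the exact values used on this site.

```nginx
# sketch only: http {} context. level, minimum length and types are assumptions.
gzip            on;
gzip_comp_level 5;      # mid-range trade-off between cpu use and output size
gzip_min_length 256;    # skip tiny responses where compression gains little
gzip_vary       on;     # add "Vary: Accept-Encoding" for intermediate caches
gzip_proxied    any;    # also compress responses to proxied requests
gzip_types
    text/plain
    text/css
    text/xml
    application/xml
    application/json
    application/javascript
    application/x-javascript;
# text/html is always compressed once gzip is on, so it is not listed
```

images such as png and jpeg are already compressed, which is why they are left out of the type list.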

 

performance benchmarking and load testing

i have used a variety of tools and apps along the way to help me in this task of optimisation. the one i am currently using for load testing is loadimpact (https://loadimpact.com/), which shows how the server handles multiple concurrent connections. the results there are not amazing currently, so i will continue to refine the server configuration, while accepting that a hardware boost is probably required to improve the performance under load.

while optimising the nginx configuration i found that the process described here (http://seravo.fi/2013/optimizing-web-server-performance-with-nginx-and-php) for graphing server response helped a lot to gain clear measurements of the effects of small changes. i created a series of 20+ graph lines that show how each of my changes gradually sped up (or in some cases slowed down) the server response with multiple concurrent connections - the response line is now relatively flat, so that the response times are fairly similar regardless of how many visitors connect, whereas when i started the process i was seeing a nearly 45 degree incline as the site slowed down with more visitors hitting it concurrently.

 

image optimisation


since most of the images in elgg are served dynamically via the elgg engine, any optimisation of them needs to be done within elgg or possibly via a server extension such as google's pagespeed module for nginx. i have recently seen that google has been somewhat unkind here in that they have offered pagespeed for free during its creation process and many coders are adding to it on github, yet they now have a notice saying that the pagespeed service will be a paid service 'at some future point'. so i will not use their tool for optimising here.

instead i have optimised static files as much as possible (see related links below for a useful optimiser for png files) and for a while i used the lazy_load_images elgg plugin to only load the images that are visible on the page - which can speed up page loads for longer pages. currently i have disabled the lazy loading since there is an intermittent bug with chrome/chromium and the activity page in elgg when lazy loading images is enabled, which results in images not always loading.

i think there is considerable room for optimising elgg's serving of images, particularly through applying optimised headers in tidypics and for icons in elgg directly. this ticket in github is one which i think should be resolved asap to increase performance /caching of images in elgg (https://github.com/Elgg/Elgg/issues/4279).

 

page layout - javascript and css

i found the google pagespeed insights tool to be helpful in identifying bottlenecks in the site's javascript and css design (http://developers.google.com/speed/pagespeed/insights/?url=https%3A%2F%2Fwww.infiniteeureka.com). once i combined the javascript files from my site into one large file (i am still refining this as some files are still loaded separately), the page weight decreased by around 40-50KB and the speed increased too. i still need to optimise the css file for my elgg theme as currently all css is loaded for all pages - leaving 1000+ tags being loaded unnecessarily for many pages.

i just recently found the html5 boilerplate files (http://html5boilerplate.com/) which offer some nice config tips and theming ideas for performance and reliability - i carefully read through the nginx files and other areas and grabbed parts that i didn't already use that are useful.

i also replaced the default jquery and jquery-ui files in elgg with the versions available from the google CDN, so that most (some/many) visitors to my site will not download jquery since it will already be available in their browser cache - in theory. if they do download these files they will be from google's own servers and thus the delivery will be highly optimised for free.

 

the results?

currently, the site is loading rapidly for visitors after they have made the initial ssl connection and downloaded the js/css that is reused throughout the site. the initial connection and page load takes between 3 and 5 seconds usually and subsequent loads are faster - between 1 and 3 seconds usually if the visitor is located within range of my european server. i have not yet seen the site under the weight of a lot of traffic - i am sure i will need to make more changes then.

 

in summation

coding and configuring elgg for performance needs to be a priority for anyone who intends to use elgg for any task more than just playing at home. the tools are available for anyone with enough time / focus / intention to find them and use them in a helpful way. the raw code of many plugins and also the core has space for improvement and the server configuration shared for nginx can be greatly enhanced.

what now?

  • i will improve the ssl connection speed somehow - i think that future releases of nginx will improve this - so i may wait for them to see what they do before diving too deeply into that.
  • i have not yet successfully activated memcached for use with elgg - my previous attempts were not successful. i see there are some current tickets in github related to this, so i will probably wait for elgg 1.9 before i do that.
  • the minds.com team are due to release a fork of elgg which uses a nosql database in place of mysql which is offering enhanced performance, so i may switch to that when they are ready.
  • i am aware that the caching headers for some plugins and images for elgg core need to be improved - i will look to inspire that via github or will do it myself when i have the time/space available.

at some point i will probably make the config files available on github to inspire learning and improvement within the elgg community (and the files themselves). i know that the lorea group has a version of their nginx files online there too. maybe you have some tips and thoughts that you can share here that may help too?

 

more useful links

http://gzipwtf.com/ - test your server's use of gzip

https://www.ssllabs.com/ssltest/index.html - ssl / encryption testing

https://tinypng.com/ - highly efficient image optimisation for static png files

http://www.webpagetest.org/ - waterfall breakdown of page load, timing and server responses

http://tools.pingdom.com/fpt/ - another performance analysis tool, including waterfall breakdown and other useful metrics