15 Sep 2010

Load Balancing (Nginx / Apache) and Cache

On our larger sites, such as Subaru (and a couple more I can’t name), we’ve got a nice setup with Nginx handling static files and everything else getting passed off to Apache.

Nginx takes over

Normally you would have Apache listening on port 80, passing everything through to your CGI environment/scripts (PHP, Python etc.), but with all of those modules & shared objects loaded, an Apache process or thread can have quite a large memory footprint.

To avoid this we put Nginx on port 80 with a very minimal config, then create a host file for the site (similar to an Apache vhost, but with different syntax). Then comes the clever bit: you check whether the URL being requested maps to a file on disk, and if it doesn’t, you use proxy_pass to send the request on to Apache.
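As a rough illustration of that host file, here is a minimal sketch rather than our actual config. The server name, document root and backend port are all placeholders, and it assumes Apache has been moved to 127.0.0.1:8080. One thing to watch: if your scripts live under the same root, a plain file check would serve them as source, so this sketch sends anything ending in .php straight to Apache.

```nginx
server {
    listen 80;
    server_name example.com;             # placeholder host name
    root /var/www/example.com/public;    # placeholder document root

    # Pass the original host and client address through to Apache.
    # Set at server level so both locations below inherit them.
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    location / {
        # If the URL maps to a file on disk, Nginx serves it itself;
        # otherwise the request falls through to the @apache location.
        try_files $uri @apache;
    }

    # Assumption: scripts sit under the same root, so never serve them raw.
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8080;
    }

    location @apache {
        proxy_pass http://127.0.0.1:8080;
    }
}
```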

It doesn’t sound like much, but if you have 3 style sheets (core, CSS3 fancy bits and one for old IE), a few images and some JavaScript files (tabs, sliders, all those other pretty UI bits), you end up with in the region of 10-15 files per page that you no longer need to bother Apache about.

Considering the reduction in requests going to Apache, that’s a whole lot of RAM being saved.

Apache sees double

With the proxy pass in place on Nginx, Apache still needs to be running to handle the heavy lifting, so move it over to another available port (typically 8080) and make sure your vhost for the site is listening on localhost (127.0.0.1) as well as the system IP.
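On the Apache side the change might look something like the fragment below. Treat it as a sketch under assumptions: the host name, document root and the 192.0.2.10 address are placeholders for your own values, and where the Listen line lives depends on your distribution (ports.conf on Debian-style layouts).

```apache
# Move Apache off port 80 so Nginx can have it.
Listen 8080

# On Apache 2.2 with name-based vhosts you may also need matching
# NameVirtualHost lines for these address:port pairs.
<VirtualHost 127.0.0.1:8080 192.0.2.10:8080>
    ServerName example.com                     # placeholder host name
    DocumentRoot /var/www/example.com/public   # placeholder document root
</VirtualHost>
```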

Once you’ve done that you’re good to go.

There’s a good (informative & accurate) but ugly guide on doing this at Debian Administration, and I’m sure there are plenty of other just as good (but less ugly) guides to be found via the magic of Google.

Still not fast enough?

Say you have a really data-intensive part of a site, say listing all the prices, specs, features, accessories & images for a car (yet another Subaru example...), and you’re doing the kind of processing and display of that data which would melt your CPU if you had to do it on every page load. Well, cache is your friend.

Without sounding too much like a RoR programmer, throw hardware at it; in this case RAM.

By adding a little bit of abstraction to the code, memcache can be used to reduce database load, circumvent heavy looping algorithms and much more.

Let’s say you have part of a page that queries, loops over and displays 3k features/specs; the database query behind that would be a significant drain on resources, so let’s avoid doing it. In fact, avoid doing the loop as well. We can run this query once, store the result under a unique key in memcache, and serve it from there until the entry expires or the underlying data changes.
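To make the idea concrete, here is a small Python sketch of that "run it once, store it under a key" pattern. It is not the code we run in production. InMemoryCache is a hypothetical stand-in so the example runs without a memcached server; its get/set methods are loosely modelled on what most memcached clients offer. The helper names (cache_key, cached, render_spec_table) and the "example-model" parameter are made up for illustration.

```python
import hashlib
import json
import time


class InMemoryCache:
    """Hypothetical stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._store[key]  # entry is stale, treat it as a miss
            return None
        return value

    def set(self, key, value, expire=0):
        # expire=0 means "never expire" in this sketch
        expires_at = time.time() + expire if expire else None
        self._store[key] = (value, expires_at)


def cache_key(prefix, **params):
    """Build a unique, order-independent key from the query parameters."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return prefix + ":" + hashlib.sha256(payload).hexdigest()


def cached(cache, key, build, expire=300):
    """Return the cached value, calling build() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = build()
        cache.set(key, value, expire=expire)
    return value


calls = []  # records how often the expensive work actually runs


def render_spec_table():
    """Pretend this is the heavy query plus the 3k-row loop."""
    calls.append(1)
    rows = "".join("<li>spec %d</li>" % i for i in range(3000))
    return "<ul>" + rows + "</ul>"


cache = InMemoryCache()
key = cache_key("spec_table", model="example-model")
first = cached(cache, key, render_spec_table)   # miss: runs the loop
second = cached(cache, key, render_spec_table)  # hit: served from RAM
```

The second call never touches render_spec_table, which is the whole point: both the query and the loop are skipped until the entry expires. Swapping the stand-in for a real client should only mean replacing the cache object, provided it exposes similar get and set calls.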

Memcache isn’t the only tool out there that does this kind of job, but it was one of the earliest and is among the most widely used. You could even go as far as sticking something like Redis in front of your normal SQL queries, as long as you have enough RAM free, that is.

Last time I checked, the response time for the initial page response on Subaru was down to between 50 & 80 ms (average time to blink is about 300 ms).
