A High Availability Architecture
Making our web applications resilient
Craig R Webster
Software Engineer, BBC A&Mi
This talk
- How we evolved our architecture
- Why we did it and why you should use it
- No implementation details - they'll appear in future talks
Where we were
- Apache on each box
- MySQL on each box
- Assets served through application server
- Hard work in request-response cycle
- One server per application
- User-generated content stored locally
- Logs stored on each box
- No caching
Apache on each box
- Waste of application server RAM and CPU
Move Apache to a separate box. Apache already proxies to the application via HTTP, so why not do it across the network?
- Requires setup for each application
We can use the new Apache box to proxy back to several applications
- Apache crashed? Application unavailable
High Availability Web Tier
Wait, a what now?
High Availability
- Multiple boxes running the same service (e.g. Apache)
- One master node
- Several slave nodes
- Floating IP address
- If master fails one slave takes over automatically
- Software to do this for us: Heartbeat
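A toy model of the failover idea, for illustration only. This is not how Heartbeat is configured or implemented; the class names, node names and IP address are all made up. It only shows the rule "the floating IP always belongs to exactly one live node".

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool = True


class Cluster:
    """Toy model: the floating IP is always held by exactly one live node."""

    def __init__(self, floating_ip, nodes):
        self.floating_ip = floating_ip      # hypothetical address
        self.nodes = list(nodes)
        self.master = self.nodes[0]         # first node starts as master

    def check_heartbeat(self):
        """Promote the first live node if the master has stopped responding."""
        if self.master.alive:
            return self.master
        for node in self.nodes:
            if node.alive:
                self.master = node          # this node now owns the floating IP
                return node
        raise RuntimeError("no live nodes left to hold " + self.floating_ip)


cluster = Cluster("10.0.0.100", [Node("web1"), Node("web2"), Node("web3")])
cluster.nodes[0].alive = False              # the master fails
new_master = cluster.check_heartbeat()      # a slave takes over automatically
print(new_master.name)
```

Clients only ever talk to the floating IP, so they never need to know which box is currently the master.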
After that change we have...
- High Availability Apache
- MySQL on each box
- Assets served through application server
- Hard work in request-response cycle
- One server per application
- User-generated content stored locally
- Logs stored on each box
- No caching
MySQL on each box
- Waste of application server RAM and CPU
Move MySQL to a separate box. Application can easily talk to MySQL via TCP/IP
- Requires setup for each application
We can use the new MySQL box to host several databases.
- MySQL crashed? Application unavailable
High Availability Database Tier
Same idea as Apache
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through application server
- Hard work in request-response cycle
- One server per application
- User-generated content stored locally
- Logs stored on each box
- No caching
Assets served through the application server
- Application requests are expensive and are not needed to serve static assets
- Apache does this much faster
Copy assets up to High Availability Web Tier and invoke some mod_rewrite voodoo.
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard work in request-response cycle
- One server per application
- User-generated content stored locally
- Logs stored on each box
- No caching
Hard work in request-response cycle
- External services are slow
- External services are unreliable
- External services are hard work
Much faster to give client an IOU and ask something else to do the job later
Use an Enterprise Message Bus to send job requests
Great. What's an Enterprise Message Bus?
Enterprise Message Bus
- Put simply, it's a queue
- Messages are put in one end (by your web application, the producer)
- Messages are read from the other end (by any process that's interested, the consumers)
- Possible to have several producers and consumers
- Write a consumer to do the hard stuff. Let the web application concentrate on serving application requests.
Digital Fabric will provide pan-BBC bus. We should take advantage of that. Until then, Service Management provide this.
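A minimal sketch of the producer / consumer split, assuming an in-process Python queue as a stand-in for the real message bus. The job payload, the "resize" work and the 202 status are hypothetical; a real bus would sit on the network between separate processes.

```python
import queue
import threading

jobs = queue.Queue()    # stands in for the message bus
results = []            # where the consumer records finished work


def consumer():
    """Consumer: does the slow work outside the request-response cycle."""
    while True:
        job = jobs.get()
        if job is None:                 # sentinel value: shut down
            jobs.task_done()
            break
        results.append("resized " + job["image"])   # hypothetical slow job
        jobs.task_done()


def handle_request(image):
    """Producer: queue the job and hand the client an IOU straight away."""
    jobs.put({"image": image})
    return "202 Accepted"


worker = threading.Thread(target=consumer)
worker.start()

status = handle_request("cat.jpg")      # returns without doing the hard work

jobs.put(None)                          # ask the consumer to stop
jobs.join()                             # wait until every queued job is handled
worker.join()
```

The web application returns as soon as the message is queued; how long the consumer takes no longer affects response time.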
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard or slow work done offline
- One server per application
- User-generated content stored locally
- Logs stored on each box
- No caching
One server per application
- Hard to scale - vertical only
- Server crashed? Application unavailable
- Server maintenance? Application unavailable
- Server too busy? Application unavailable
Add more application boxes! Apache's mod_proxy_balancer doesn't care how many there are, and will detect unavailable servers and drop them from the cluster. MySQL is already accessed across the network, so every application box talks to the same database.
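A rough sketch of what "detect and drop unavailable servers" means, assuming simple round-robin selection. mod_proxy_balancer's real scheduling and health detection are more involved; the class and the backend names here are hypothetical.

```python
import itertools


class Balancer:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        for backend in self._cycle:
            if backend in self.healthy:
                return backend


lb = Balancer(["app1", "app2", "app3"])
lb.mark_down("app2")                        # app2 crashed or is in maintenance
picks = [lb.pick() for _ in range(4)]
print(picks)
```

Requests keep flowing to app1 and app3 while app2 is out; marking it up again puts it back into rotation.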
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard or slow work done offline
- Many servers per application
- User-generated content stored locally
- Logs stored on each box
- No caching
User-generated content stored locally
- Content only visible to one application server
- Requests handled by several application servers
- Content sometimes appears, sometimes doesn't
- Server disk dies? Boom! Data loss.
- What happens when disk fills up? Shouldn't need to take down application to add more space.
Stop-gap: rsync assets to web tier. Newly uploaded assets can be missing on some requests until rsync next runs. Not a nice user experience.
Real solution: document store
Distributed Filestore
- Simple add / modify / delete operations, just like filesystem
- Doesn't use local disk - uses network. Easily accessed by all application boxes
- Easy to scale - just add more storage nodes
- Fault tolerant - spray-files-everywhere approach (disk is cheap!)
We went with MogileFS. Built by Danga for LiveJournal (massive traffic).
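A toy illustration of the "spray files everywhere" idea, assuming each file is written to two storage nodes. This is not the MogileFS API; the class, method names and node names are invented for the example.

```python
class Filestore:
    """Toy replicated store: every file is written to several storage nodes."""

    def __init__(self, node_names, copies=2):
        self.nodes = {name: {} for name in node_names}   # node name -> files
        self.down = set()                                # nodes currently dead
        self.copies = copies

    def store(self, key, data):
        # Prefer the emptiest live nodes; break ties by name.
        live = sorted(
            (n for n in self.nodes if n not in self.down),
            key=lambda n: (len(self.nodes[n]), n),
        )
        if len(live) < self.copies:
            raise RuntimeError("not enough live storage nodes")
        for name in live[: self.copies]:
            self.nodes[name][key] = data

    def fetch(self, key):
        # Any live node holding a copy will do.
        for name, files in self.nodes.items():
            if name not in self.down and key in files:
                return files[key]
        raise KeyError(key)

    def delete(self, key):
        for files in self.nodes.values():
            files.pop(key, None)


fs = Filestore(["store1", "store2", "store3"])
fs.store("avatar.png", b"png bytes")
fs.down.add("store1")                   # a disk dies
data = fs.fetch("avatar.png")           # still served from the other copy
```

Losing one node loses no data, and adding capacity is a matter of adding another entry to the node list.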
Assets served by application!
- Now that we have a document store, everything is peachy, right?
- No! Assets stored in MogileFS are being served by the application. Boo!
Perlbal can "re-proxy" MogileFS.
Perlbal
- Software based load balancer
- Sort of like Apache's mod_proxy_balancer
- Except it can watch for special HTTP headers coming back from your application and send files from MogileFS to the client based on those
- This needs to be High Availability too or we introduce another single point of failure
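A simplified model of re-proxying, to show the shape of the idea. Perlbal recognises an X-REPROXY-URL response header; everything else here (the dictionaries standing in for HTTP responses, the storage URL, the function names) is a hypothetical stand-in rather than Perlbal's own code or configuration.

```python
# Pretend storage node: URL -> file contents (hypothetical URL).
STORAGE = {"http://store1.example/avatar.png": b"file bytes from storage node"}


def application(path):
    """The app decides WHICH file to send, but does not stream the bytes."""
    return {
        "status": 200,
        "headers": {"X-REPROXY-URL": "http://store1.example/avatar.png"},
        "body": b"",
    }


def proxy(path, app=application, storage=STORAGE):
    """The balancer spots the special header and serves the file itself."""
    response = app(path)
    url = response["headers"].pop("X-REPROXY-URL", None)
    if url is None:
        return response                 # ordinary response: pass it through
    return {
        "status": 200,
        "headers": response["headers"],
        "body": storage[url],           # fetched from the filestore, not the app
    }


reply = proxy("/users/1/avatar")
print(reply["body"])
```

The application process is free again as soon as it has returned the header, which is the point of the exercise.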
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard or slow work done offline
- Many servers per application
- Document store for user-generated content
- Logs stored on each box
- No caching
Logs stored on each box
- Several boxes per application
- Hard to find out which server to tail the logs on
- Consumes disk space - sometimes alarmingly fast - which can cause application to fail when disk is full
Send them across the network to a central log server. Application logs to syslog, syslog streams to log-host.
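A minimal sketch of the application side, assuming a Python application and a syslog daemon accepting UDP on port 514. The host, port, logger name and message are assumptions; here the handler points at localhost, and forwarding from there to the log host is left to the syslog daemon's own configuration.

```python
import logging
import logging.handlers


def build_logger(name, syslog_host="localhost", port=514):
    """Return a logger that writes to syslog instead of a local log file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # UDP is the handler's default transport; host and port are assumptions.
    handler = logging.handlers.SysLogHandler(address=(syslog_host, port))
    handler.setFormatter(logging.Formatter(name + ": %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


logger = build_logger("myapp")
logger.info("user uploaded avatar")
```

Nothing is written to the application box's disk, so a chatty application can no longer fill it up.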
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard or slow work done offline
- Many servers per application
- Document store for UGC
- Logs sent to log host
- No caching
No caching
- Slow changing or less time-sensitive data can be cached
- No need to hit application & database for each:
    select count(*) from brands;
Memcache provides an easy-to-use in-memory LRU cache.
Don't just cache pages... cache everything
That said, caching is hard to get right and it may not be appropriate to use an LRU cache for your data (think sessions). Start with the requests that hit your application hardest.
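A small in-process sketch of the pattern, assuming a hand-rolled LRU cache in place of Memcache. The cache key, the capacity and the fake database call (which just returns 42) are hypothetical; a real setup would use a Memcache client and would usually set an expiry time as well.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache; oldest unused entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)        # mark as recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False) # drop the least recently used


db_hits = 0


def count_brands_from_db():
    """Stand-in for: select count(*) from brands;"""
    global db_hits
    db_hits += 1
    return 42                               # made-up result


cache = LRUCache(capacity=100)


def count_brands():
    value = cache.get("brands:count")
    if value is None:                       # cache miss: ask the database once
        value = count_brands_from_db()
        cache.set("brands:count", value)
    return value


first = count_brands()
second = count_brands()                     # served from cache, no database hit
```

The eviction rule is also why an LRU cache is a poor home for sessions: an entry can vanish simply because other keys were used more recently.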
After that change we have...
- High Availability Web Tier
- High Availability Database Tier
- Assets served through web tier
- Hard or slow work done offline
- Many servers per application
- Document store for UGC
- Logs sent to log host
- Slow-changing pages served from cache
System Diagram (Current at 2008-12-10)
Are we done?
- Already used by PIT, MIT, Peel, VR and Tags2
- Not perfect, but good enough to use
- Still need to:
  - make sure DB can handle many hundreds of connections
  - investigate adding more database nodes (using MySQL Proxy?)
  - make the message bus failover nicely (PIT & SM are doing this now)
  - decide if using Perlbal makes Apache unnecessary
  - investigate possibility of moving to Forge
  - work out how to estimate Memcache memory requirements
  - document how to scale MogileFS
  - show everyone how to use the services introduced in this talk
  - ... and probably more