It's nice when your website continues to be served even when something catastrophic happens. Running two Apache nodes and Heartbeat will help - if one server blows up, the other will take over in short order.

Prelude

You'll need two boxes and three IP addresses. I use virtual machines from Xeriom Networks. I've firewalled them and opened the HTTP port to the world.

sudo iptables -I INPUT 3 -p tcp --dport http -j ACCEPT
sudo sh -c "iptables-save -c > /etc/iptables.rules"
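Note that the saved rules aren't reloaded automatically after a reboot. One common approach on Ubuntu (assuming eth0 with DHCP and the /etc/iptables.rules path used above) is to restore them when the interface comes up, in /etc/network/interfaces:

```
auto eth0
iface eth0 inet dhcp
    pre-up iptables-restore < /etc/iptables.rules
```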

For the purpose of this post, let's assume that the following IP addresses are available.

  • 193.219.108.236 - Node 1 (craig-02.vm.xeriom.net)
  • 193.219.108.237 - Node 2 (craig-03.vm.xeriom.net)
  • 193.219.108.238 - Not assigned

Simple Service

First we'll set up Apache on both boxes. Nothing complex - we just want to make sure that we can serve something to HTTP clients.

Run the following command on both boxes.

sudo apt-get install apache2 --yes

Now fire up a browser and hit the IP addresses assigned to Node 1 and Node 2. You should see the default Apache page stating "It works!". If you don't, check that your firewall allows www traffic. Your firewall rules should look like the output below - note the line ending in tcp dpt:www.

sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www
DROP       all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
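If you'd rather check from a shell than a browser, a curl loop does the same job. This is just a convenience sketch - the check_node helper is my own, not part of the post, and the addresses are the ones from the prelude:

```shell
# Report whether each node answers on HTTP within two seconds
check_node() {
  if curl -s --max-time 2 -o /dev/null "http://$1/"; then
    echo "$1 responding"
  else
    echo "$1 not responding"
  fi
}
check_node 193.219.108.236
check_node 193.219.108.237
```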

Adding resilience

Apache can serve web pages from your machines now - that's great, but it doesn't protect against one of the machines dying. For that, we use a tool called heartbeat.

Install and configure Heartbeat on both boxes.

sudo apt-get install heartbeat

Next we'll copy and customise the authkeys, ha.cf and haresources files from the sample documentation to the configuration directory.

sudo cp /usr/share/doc/heartbeat/authkeys /etc/ha.d/
sudo sh -c "zcat /usr/share/doc/heartbeat/ha.cf.gz > /etc/ha.d/ha.cf"
sudo sh -c "zcat /usr/share/doc/heartbeat/haresources.gz > /etc/ha.d/haresources"

The authkeys file should be readable only by root because it's going to contain a valuable password.

sudo chmod go-rwx /etc/ha.d/authkeys

Edit /etc/ha.d/authkeys and add a password of your choice so that it looks like the example below.

auth 2
2 sha1 your-password-here
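Any hard-to-guess string will do as the password. One way to generate a random one (a sketch of my own, not from the post - the PASS variable name is arbitrary):

```shell
# Generate a random 40-character hex secret and print the two
# authkeys lines using it
PASS=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{print $1}')
printf 'auth 2\n2 sha1 %s\n' "$PASS"
```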

Configure ha.cf according to your network. In this case the nodes are craig-02.vm.xeriom.net and craig-03.vm.xeriom.net. To find out what your node names are, run uname -n on each of the nodes. These must match the values you use in the node directives in the configuration file.

logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
initdead 120
bcast eth0
udpport 694
auto_failback on
node craig-02.vm.xeriom.net
node craig-03.vm.xeriom.net
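Since the node directives must match uname -n exactly, a quick sanity check on each box doesn't hurt. The warning message here is my own, not Heartbeat's:

```shell
# Warn if this box's node name isn't listed in ha.cf
name=$(uname -n)
grep -q "^node $name" /etc/ha.d/ha.cf 2>/dev/null \
  || echo "warning: $name is not listed in /etc/ha.d/ha.cf"
```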

We need to tell Heartbeat we want it to look after Apache. Edit haresources and make it look like the following - still on both machines.

craig-02.vm.xeriom.net 193.219.108.238 apache2

This file must be identical on both nodes - even the hostname, which should be the output of uname -n on node 1. The IP address should be the unassigned IP address given above in the prelude section.
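One way to confirm the copies really are identical is to compare checksums across the boxes. A sketch, assuming SSH access between the nodes (hostnames are this post's; the "could not reach" message is my own):

```shell
# Print a checksum of haresources on each node; the two sums must match
for host in craig-02.vm.xeriom.net craig-03.vm.xeriom.net; do
  ssh -o ConnectTimeout=5 "$host" md5sum /etc/ha.d/haresources 2>/dev/null \
    || echo "could not reach $host"
done
```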

In ha.cf we told Heartbeat to use UDP port 694 to communicate, but because we're all nicely firewalled this port is blocked. Open it on both boxes and save the rules again.

sudo iptables -I INPUT 2 -p udp --dport 694 -j ACCEPT
sudo sh -c "iptables-save -c > /etc/iptables.rules"

Your iptables rules should now look similar to the output below.

sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
ACCEPT     udp  --  anywhere             anywhere            udp dpt:694 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www 
DROP       all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Now create a file on each box that tells us which webserver we're looking at.

# Node 1 (craig-02.vm.xeriom.net)
echo "craig-02.vm.xeriom.net" | sudo tee /var/www/index.html
# Node 2 (craig-03.vm.xeriom.net)
echo "craig-03.vm.xeriom.net" | sudo tee /var/www/index.html

Check that this file shows up on each box by hitting the nodes' IP addresses in the browser. If that works, it's time to flip the switch.

It lives... IT LIVES!

Start heartbeat on the master (node 1 / craig-02.vm.xeriom.net) then the slave (node 2 / craig-03.vm.xeriom.net).

sudo /etc/init.d/heartbeat start

This process takes quite a while to start up. Run tail -f /var/log/ha-log on both boxes to watch what's happening. After a while you should see node 1 say something like this.

heartbeat[6792]: 2008/06/24_11:06:21 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[6867]:   2008/06/24_11:06:22 INFO:  Running OK
heartbeat[6832]: 2008/06/24_11:06:22 info: Local Resource acquisition completed.

Testing for a broken heart

If you now check the output of ifconfig eth0:0 on both boxes you should see output like below.

# Node 1
sudo ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:16:3e:3c:70:25  
          inet addr:193.219.108.238  Bcast:193.219.108.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
# Node 2
sudo ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:16:3e:92:ad:78  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

Node 1 has taken over our virtual IP address. If you kill Node 1, Node 2 will take it over. You can simulate this by taking down the Heartbeat process on Node 1.

# Node 1
sudo /etc/init.d/heartbeat stop

Checking ifconfig again, you should see that the virtual IP address has swapped nodes. If you bring Node 1 back up (start Heartbeat again) you should see the IP address swap back to that node.

If you got this far with no problems then congratulations, Heartbeat is running and your web tier will survive failure of a node. You can skip to the next section to see it working in the browser.

If you see some lines in the ha-log file telling you that the message queue is filling up then it's likely the two nodes can't communicate with each other. Check that you opened UDP port 694 on the firewall of both boxes.

heartbeat[6148]: 2008/06/24_11:05:09 ERROR: Message hist queue is filling up (500 messages in queue)

Check that the firewall rules look like the output below - the important line is the one ending in udp dpt:694.

sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
ACCEPT     udp  --  anywhere             anywhere            udp dpt:694 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www 
DROP       all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

The proof is in the pudding

Mmm, cake.

Fire up your browser and hit the virtual IP address (193.219.108.238 in this post). You should see a page telling you that you're on Node 1.

Stop heartbeat (or shutdown Node 1) and hit the IP address again in the browser. You should now see that you're hitting Node 2.

Finally, bring Heartbeat back up on Node 1 (or start the box if you stopped it) and hit the IP address again. You should now be hitting Node 1 again.
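You can watch the whole failover from the command line too. This bounded loop polls the virtual IP (the address from this post); the "no answer" message is my own, and you'd see it briefly while neither node holds the address:

```shell
# Poll the virtual IP a few times; during failover you'll see "no answer"
# for a moment, then the other node's page
for i in 1 2 3 4 5; do
  curl -s --max-time 2 http://193.219.108.238/ || echo "no answer"
  sleep 1
done
```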

Love me!

If you've found this article useful I'd appreciate beer and recommendations at Working With Rails.

written by
Craig
published
2008-06-24
Disagree? Found a typo? Got a question?
If you'd like to have a conversation about this post, email craig@barkingiguana.com. I don't bite.
You can verify that I've written this post by following the verification instructions:
curl -LO http://barkingiguana.com/2008/06/24/high-availability-apache-on-ubuntu-804.html.orig
curl -LO http://barkingiguana.com/2008/06/24/high-availability-apache-on-ubuntu-804.html.orig.asc
gpg --verify high-availability-apache-on-ubuntu-804.html.orig{.asc,}