Penguin LVS Manager installation instructions

Brian Martin <bmartin@penguincomputing.com>

Contents:

  1 - network topology
  2 - software requirements
  3 - installation
  4 - configuration
  5 - known problems/limitations

+----------------------+
|                      |
|   Network Topology   |
|                      |
+----------------------+

Before beginning the software installation, you need to make a few decisions to get your network topology in order. There are many possible valid network configurations for lvsm, as well as many possible invalid ones. The network topology of the cluster is closely tied to the forwarding method in use. For instance, when using lvs-nat, all realservers must be on a private network with their default routes pointing at the NAT ip address.

At the very least, lvsm assumes that you will have a network that connects your cluster together. It is the interface for this network that you should supply when the lvsm configurator asks for the 'heartbeat udp interface'. In the case of lvs-nat, this will be a private network (eg 192.168.x.x). Otherwise, this network will need routable ip addresses so that the realservers can send direct replies to clients.

In most cases you will want any machines that you intend to act as directors (either primary or secondary) to have a second nic connected to the external network. Heartbeat automatically chooses the proper interface to alias ip addresses to, so this is not an issue.

Once you have your network topology figured out, you should know which machines will act as directors, and which machines will act only as realservers.

There is a good discussion of how to choose your lvs forwarding type at

  http://www.linuxvirtualserver.org/Joseph.Mack/HOWTO/LVS-HOWTO-5.html

The same howto also contains some example network topologies. When reading the howto, remember that plvsd is designed to do most of these things for you, with the exception of running network cable :)

What you will have to do is give each machine an ip address (maybe more than one), and make sure that all the default and network routes are set up properly across the cluster. Do not try to bring up any of the service ip addresses manually. In the case of lvs-nat, you will need to configure your realservers with a default route that points at a floating ip address on the private network. Heartbeat will manage this ip for you, but you need to set up the routes on your realservers.

+---------------------------+
|                           |
|   Software Requirements   |
|                           |
+---------------------------+

An LVSM installation is composed of five major components:

  1 - LVS kernel code
  2 - Heartbeat
  3 - Mon
  4 - plvsd
  5 - web gui

LVSM currently runs only on redhat linux. The reasons for this are small, and it may very well work on other distributions. Support for other distributions will be added as necessary. You will also need to install the included Digest::SHA1 perl module on all machines in the cluster, including the web server machine.

1. Kernel

For your scheduler nodes, you will need the ipvs kernel patches from www.linuxvirtualserver.org. There are patches for both the latest 2.2 and 2.4 kernels. Be aware that for 2.4, 0.8.0 is the stable version and 0.9.0 is for development. LVSM has been developed with the 2.2 kernel, but should work with 2.4 as well. Because the stock 2.4 kernel lacks interface hiding, you will need to patch the 2.4 kernel for this. Fortunately the patch is included in the ipvs tarballs.
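Once the patched kernel is running (the LVS-HOWTO excerpt below shows how to apply the patch), the hiding itself is normally switched on through /proc on each realserver that will receive traffic via direct routing or ip tunneling. plvsd maintains the realserver aliases for you, but as a rough sketch of what the hidden flag looks like (the /proc paths are those added by the stock hidden patch):

  # keep realservers from answering arp for the vip
  echo 1 > /proc/sys/net/ipv4/conf/all/hidden
  echo 1 > /proc/sys/net/ipv4/conf/lo/hidden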
From the LVS-HOWTO:

  The 2.4.x "hidden" patch is now being actively maintained and is
  included in ipvs-x.x.x/contrib/patches/hidden-x.x.x.diff

  To apply it:

    cd /usr/src/linux
    patch -p1 <../ipvs-0.2.5/contrib/patches/hidden-2.4.2-1.diff

Also, at this time the destination hash and source hash schedulers are not supported. This is only temporary.

2. Heartbeat

LVSM is tightly integrated with heartbeat, and currently runs with a patched version of heartbeat 0.4.9. In addition, LVSM requires some non-standard compile time options. The patches are a temporary measure until Alan Robertson (the heartbeat maintainer) works out some problems I found. So for now the lvsm tarball contains a heartbeat directory. This is the only heartbeat which will work with lvsm, so you should use it :)

3. Mon

Mon is a general monitoring daemon by Jim Trocki. LVSM uses it for service level monitoring of the realservers. Any recent version should work; so far I have been using version 0.38.21. Mon is available from

  http://kernel.org/software/mon/

4. plvsd

PLVSD is the penguin lvs daemon. It is the glue that holds everything together. It is in charge of maintaining the lvs routing tables, collecting system statistics, maintaining /etc/ha.d/haresources, and providing an interface for clients to talk to the cluster. It is designed to run on every node in the cluster, which makes it very easy to maintain the proper ip aliases on the realservers when forwarding via ip tunneling or direct routing.

5. web gui

The web gui runs under mod_perl, and only needs tcp access to at least one node in the cluster which runs plvsd. For security reasons it is best to run the lvsm web gui from an ssl server, since cleartext passwords are passed between the browser and the webserver.

+----------------------------+
|                            |
|   Software Installation    |
|                            |
+----------------------------+

1. Kernel

LVSM requires that any nodes you wish to use as directors be running the ipvs kernel patches. Nodes that are not running these patches will only be allowed to act as realservers.

The first thing to do is decide which kernel branch (2.2/2.4) you wish to use, then get the ipvs patches for your kernel from www.linuxvirtualserver.org. Follow the directions there for patching and configuring your kernel, then compile and install both your new kernel and the ipvsadm utility from the ipvs tarball.

It is important to make sure that when you type ipvsadm you are not running an old version left over from the os install. Redhat's piranha packages include a version of ipvsadm that is quite likely not compatible with the kernel you are about to compile.

  rpm -qa | grep piranha

should tell you if you need to do anything special.
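If piranha is installed, the simplest approach is usually to remove it (or at least get its ipvsadm out of your path) and then confirm that the ipvsadm built from the ipvs tarball is the one being found. A rough sketch (package names and paths may differ on your install):

  rpm -e piranha      # remove Red Hat's bundled lvs tools if present
  which ipvsadm       # should point at the ipvsadm built from the ipvs tarball
  ipvsadm -L -n       # once the patched kernel is booted, lists the (empty)
                      # virtual server table and the running ipvs version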
To compile your kernel, patch it as described in your ipvs tarball, and then make sure you have at least the following config options set.

For 2.2.x:

  CONFIG_EXPERIMENTAL=y
  CONFIG_MODULES=y
  CONFIG_NETLINK=y
  CONFIG_IP_MASQUERADE=y
  CONFIG_IP_MASQUERADE_VS=y

You should compile at least one of these schedulers:

  CONFIG_IP_MASQUERADE_VS_RR=m
  CONFIG_IP_MASQUERADE_VS_WRR=m
  CONFIG_IP_MASQUERADE_VS_LC=m
  CONFIG_IP_MASQUERADE_VS_WLC=m
  CONFIG_IP_MASQUERADE_VS_LBLC=m
  CONFIG_IP_MASQUERADE_VS_LBLCR=m

And if you want ip tunneling to work:

  CONFIG_NET_IPIP=m

For 2.4.x:

  CONFIG_EXPERIMENTAL=y
  CONFIG_MODULES=y
  CONFIG_NETLINK=y
  CONFIG_NETFILTER=y
  CONFIG_IP_NF_CONNTRACK=y
  CONFIG_IP_NF_IPTABLES=y
  CONFIG_IP_NF_NAT=y
  CONFIG_IP_NF_NAT_NEEDED=y
  CONFIG_IP_NF_TARGET_MASQUERADE=y
  CONFIG_IP_NF_TARGET_REDIRECT=y
  CONFIG_IP_NF_MANGLE=y
  CONFIG_IP_VS=y

You should compile at least one of these schedulers:

  CONFIG_IP_VS_RR=m
  CONFIG_IP_VS_WRR=m
  CONFIG_IP_VS_LC=m
  CONFIG_IP_VS_WLC=m
  CONFIG_IP_VS_LBLC=m
  CONFIG_IP_VS_LBLCR=m
  CONFIG_IP_VS_DH=m
  CONFIG_IP_VS_SH=m

And if you want ip tunneling to work:

  CONFIG_NET_IPIP=m

Be aware that the 2.4 kernel has not been extensively tested with lvsm.

2. Heartbeat

At present, LVSM requires that heartbeat be running on all cluster nodes. Note that this does _not_ limit lvsm to 2 node clusters, as lvsm comes with a patched ResourceManager script for heartbeat. Heartbeat will be built as part of the plvsd build process.

3. Mon

Mon should be installed in /usr/bin, and it should also have its monitors and alerts under /usr/lib/mon. These are the defaults. Mon only needs to be installed on director nodes. You can get it from

  http://kernel.org/software/mon

4. plvsd

plvsd needs to run on all nodes. To compile:

  cd lvsm ; ./configure ; make

should do it. Note that this will compile the included heartbeat as well. When you are ready,

  make install

on each cluster node will put plvsd in /usr/local/bin, as well as patch some scripts in /usr/lib/heartbeat/. Heartbeat is installed as part of this step. Currently the make install target assumes redhat init script locations (/etc/rc.d/init.d).

5. Web gui

The web gui can run anywhere that can establish tcp sockets to at least one of the cluster machines. It requires apache+mod_perl. You will need to copy the lvsm/client directory to somewhere in your apache document root (I use /lvsm), and also add the following to your httpd.conf (adjust /lvsm to wherever you put the client directory):

  <Location /lvsm>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options ExecCGI
    allow from all
    PerlSendHeader On
  </Location>

Go ahead and restart apache and you should be all set. Be aware that for security reasons, the web gui really should be run over ssl (https).

+----------------------------+
|                            |
|   Software Configuration   |
|                            |
+----------------------------+

Once all the software has been built and installed, you still need to run the configurator on each node. The configurator is the file lvsm/scripts/writeconfigs.pl. It handles writing /etc/ha.d/ha.cf, /etc/ha.d/authkeys, and /etc/plvsd.conf. /etc/ha.d/haresources is managed by plvsd. You should be able to run the script with

  perl writeconfigs.pl

It will ask you a series of questions, and then write out the config files as per your specification. Here is a quick summary of the options:

- heartbeat udp port
  This is the port which heartbeat should use for its udp communications. It should be the same across all nodes.

- heartbeat udp interface
  This is the interface by which heartbeat should attempt to communicate with the rest of the cluster.
  It should be the interface of a network which connects the entire cluster. This interface is also important because the ip address that each node reports to the system is the first address on this interface. These ips are used as the realserver addresses in the lvs routing tables.

- heartbeat auth method
  Heartbeat uses message digests in an attempt to keep out unwanted (unauthorized) packets. This should be the same on all machines. For most cases with udp over ethernet you will want to choose sha1. If your servers are _really_ low on cpu you could try md5, but be aware that attacks against it are known. Only on a (physically) isolated network would you want to use crc.

- heartbeat auth key
  This is the shared secret that heartbeat uses in generating its message digests. It should be some arbitrary (fairly long) string.

- plvsd tcp port
  This is the tcp port where plvsd listens for client connections.

- plvsd uid
  The uid that plvsd should run as. After forking away from its controlling terminal, plvsd will attempt to setuid() to this uid.

- administration password
  This is the password that is required to log into the web client. It is stored in /etc/plvsd.conf as a sha-1 hash.

After you have run writeconfigs.pl on each cluster node you should have config files appropriate to your configuration.

Next the init scripts need to be dealt with. The init scripts are copied to /etc/rc.d/init.d by the make install target. If this directory doesn't exist on your system, let me know and I'll fix it up for your distribution. If you are running redhat, you should be able to make lvsm come up on boot by running

  /sbin/chkconfig --add heartbeat
  /sbin/chkconfig --add plvsd

on each node.

Now you should be able to

  tail -f /var/log/plvsd.log

as well as /var/log/ha-log (heartbeat) to see that the system is functioning. You should also be able to log into the web gui now. Point your browser at

  http://webhost/lvsm/login.pl

and see what happens. If you get http errors, check your /var/log/apache/error_log. If you get 'server is refusing connections' then you need to make sure that the server you are trying to connect to is really listening on the port you want to connect to, and that you have tcp connectivity between the hosts.

If all goes well you should now be able to configure a vip via the gui. Click on the save button and your cluster should be operational.

If there are any problems/questions, feel free to contact me (Brian) via email or phone:

  bmartin@penguincomputing.com
  (415) 358 2605 (direct line)

+--------------------------------+
|                                |
|   Known Issues / Limitations   |
|                                |
+--------------------------------+

Nothing is ever perfect and lvsm is a work in progress, so there are currently some things that either it can't do or that don't work right.

* tcp services only

  While lvs supports both tcp and udp services, lvsm currently only supports tcp services. Adding udp support is not a big deal but is not currently a big priority.

* plvsd needs to run on every cluster node

  This is for a few reasons. Firstly, it lets us easily manage ip aliases on the realservers and account for things like the arp problem of lvs-ipip and lvs-dr. It also lets us easily collect statistics from all nodes in the cluster.

  It should be possible to set up lvsm to use arbitrary realservers not running plvsd (or even not running unix, for that matter), simply based upon their ip. This is not currently implemented in lvsm. Such a setup would require manual maintenance of aliases for lvs-dr and lvs-ipip and of tunnel devices for ipip (see the sketch below); with lvs-nat the realservers only need to have their default route pointed at the nat ip.
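For reference, here is a rough sketch of the kind of manual realserver setup that plvsd otherwise handles for you. The addresses are made-up examples (vip 10.0.0.100, nat floating ip 192.168.1.254); on a plvsd-managed cluster you should not do this by hand:

  # lvs-dr: bring the vip up on a loopback alias
  # (with the hidden flag enabled as shown in the kernel section)
  ifconfig lo:0 10.0.0.100 netmask 255.255.255.255 up

  # lvs-ipip: bring the vip up on the tunnel device
  # (requires ipip support, CONFIG_NET_IPIP)
  ifconfig tunl0 10.0.0.100 netmask 255.255.255.255 up

  # lvs-nat: only the default route matters
  route add default gw 192.168.1.254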
* plvsd only runs on redhat linux for now

  Well, it should easily run on any linux, except for some stupid init script issues that have yet to be worked out. It would be nice to also have, say, a solaris port of plvsd, but as of this writing heartbeat itself does not compile cleanly on solaris, so that will have to wait.

* Vips with the same ip must be hosted on the same primary/secondary director pair

  This is because heartbeat manages resources at the ip level, so we need to fail them both over at the same time.

* When using multiple NAT vips, the same directors must be used if any realservers are shared between vips

  This is because each realserver can have only one default route, and for lvs-nat that route must point at the nat ip. So if a realserver is to be in service under more than one vip, those vips must share their nat ip. Since that nat ip must fail over to the same director as its vips, the same primary/secondary director pair must be used.

* NAT forwarding can only be selected on vips whose default forwarding is NAT

  This is because each NAT director must maintain both an ipchains MASQ rule and an internal floating ip address for the realservers to use as their default gateway. Vips with default NAT forwarding can still use direct routing and/or ip tunneling to forward to selected realservers; you just can't choose nat forwarding for a realserver unless you are running under a NAT vip.

* Plvsd may get confused if heartbeat has multiple udp interfaces defined

  So if you want redundant communications, it is best not to use another ethernet interface for udp (see the ha.cf sketch below). This will be fixed in the future.
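For reference, a minimal ha.cf with a single udp interface looks roughly like the following. This is only a sketch: writeconfigs.pl generates /etc/ha.d/ha.cf for you, and the port, interface, and node names here are made-up examples.

  # /etc/ha.d/ha.cf (illustrative only -- written by writeconfigs.pl)
  logfile /var/log/ha-log
  keepalive 2
  deadtime 10
  udpport 694
  udp eth1
  node node1
  node node2
  # one "node" line per cluster node (uname -n)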