LVS (IPVS) Clusters Continued…

Okay. So it’s been a while since I’ve written about my idea of how to manage an LVS cluster, but with good reason. That pesky thing called school came around and pulled me away from the project. Now that I’ve graduated, I’m getting back into my old projects. While the project hasn’t progressed much since I last wrote, I did have to go back and relearn what I had learned before. With that in mind, I decided to write down some of the details so that I don’t have to do it all over again.

While I’m sure I was abundantly clear in my previous posts about what my goals were, I’ll reiterate them here. The LVS project allows services to be load balanced across other physical servers, allowing for greater capacity. I became interested in this project when I started looking into hosting environments’ uptimes and their configurations. I was surprised to find that many hosting companies don’t cluster their systems, but leave them standalone. A standalone system will rarely, if ever, achieve 99.99% or higher uptime, and in many cases can barely achieve 99.9%. LVS along with heartbeats can achieve uptimes of 99.99% or higher when configured correctly. While I’m no expert in designing systems for uptime, I’ve attempted it anyway. In the image below are the basics of my idea.
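To put those uptime percentages in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes a 365-day year, and the redundancy formula assumes servers fail independently and that one surviving server can carry the service, which real clusters only approximate. The function names are mine.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of downtime per year allowed at a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single: float, nodes: int) -> float:
    """Availability of `nodes` redundant servers, assuming independent
    failures and that one surviving server keeps the service up."""
    return 1.0 - (1.0 - single) ** nodes


if __name__ == "__main__":
    for a in (0.999, 0.9999):
        minutes = downtime_minutes_per_year(a)
        print(f"{a:.2%} uptime allows about {minutes:.0f} minutes of downtime a year")
    print(f"two 99.9% servers together: {combined_availability(0.999, 2):.4%}")
```

In other words, 99.9% leaves a bit under nine hours of downtime a year, while 99.99% leaves less than one hour, which is very little room for a single machine that needs reboots and hardware swaps.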

IPVS Cluster Design

The flow of traffic would be as follows: the user’s browser hits the IPVS load balancer, the load balancer decides which server in the cluster to send the request to, and that server processes the request and sends the response back out through the load balancer. That part is all standard configuration. I wanted to alleviate the burden of managing the cluster (especially if it grew to more than, say, 15 or 20 servers). That is when I started working on booting a server over the network, but instead of using NFS for the root filesystem like LTSP does, I opted to load the root filesystem into memory. This was the part that took some work, since there is little if any documentation on how to do it.
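To illustrate the "load balancer decides which server" step, here is a toy sketch of two common scheduling ideas, round robin and least connections. IPVS does its scheduling inside the kernel, so this is not how it is implemented, only what the decision looks like. The server addresses and function names are made up for the example.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Hand out the real servers in turn, forever."""
    if not servers:
        raise ValueError("need at least one real server")
    return cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server that currently has the fewest active connections."""
    if not active:
        raise ValueError("need at least one real server")
    return min(active, key=active.get)


if __name__ == "__main__":
    # Hypothetical real servers behind the load balancer.
    pool = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    picker = round_robin(pool)
    for _ in range(4):
        print("next request goes to", next(picker))
    print("least loaded:", least_connections({"10.0.0.11": 7, "10.0.0.12": 2}))
```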

For my setup I decided to build the filesystem that would be downloaded onto the server with SquashFS. Since SquashFS is read-only, I use UnionFS to make it writable. I also didn’t want to have to deal with kernel modules, so I compiled a custom kernel that included all the drivers I would need for the system. That meant patching the kernel so that SquashFS and UnionFS could be compiled in. I made sure to add DHCP client capabilities to the kernel as well so that I wouldn’t have to assign addresses to each server.

I decided to use Ubuntu for my server distro. I used debootstrap to create a basic root filesystem. I then proceeded to strip it of all packages that were not essential. Once I felt good about the size I had brought it down to, I installed the necessary applications (Apache and PHP) on the image. Using the Dapper distribution, I was able to bring the root filesystem down to about 120MB. I then used the mksquashfs application to create the SquashFS image I would download onto the diskless server. The resulting image was about 42MB.

I then needed to create a custom initial ramdisk (initrd) to download the SquashFS image, create the root filesystem using UnionFS, change root to that new filesystem and then call the init process of that filesystem to start up the necessary services. In order to do this, I ended up compiling a statically linked version of curl to download the compressed root filesystem.
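The order of those initrd steps matters, so here is a rough sketch that just spells out the commands such a script might run. This is not my actual init script. The mount points (/ro, /rw, /newroot), the tmpfs used as the writable branch and the image URL are assumptions for illustration, and pivot_root or switch_root are common alternatives to the plain chroot shown.

```python
from typing import List


def initrd_steps(image_url: str, init: str = "/sbin/init") -> List[str]:
    """Return, in order, the shell commands an initrd script might run."""
    return [
        # Fetch the compressed root filesystem into the initrd's /tmp.
        f"curl -o /tmp/root.squashfs {image_url}",
        # Mount the read-only SquashFS image.
        "mount -t squashfs -o loop /tmp/root.squashfs /ro",
        # A tmpfs holds whatever gets written at runtime (assumed layout).
        "mount -t tmpfs tmpfs /rw",
        # Stack the writable branch on top of the read-only one.
        "mount -t unionfs -o dirs=/rw=rw:/ro=ro unionfs /newroot",
        # Change root and hand control to the real init.
        f"exec chroot /newroot {init}",
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for the boot server.
    for command in initrd_steps("http://192.0.2.10/root.squashfs"):
        print(command)
```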

Once I had all that ready, I installed a DHCP server and a TFTP server, and downloaded PXELINUX. I configured the DHCP server to hand out addresses and to tell the PXE client where to get its environment. I placed the necessary files (the SquashFS filesystem, custom kernel, initrd and PXELINUX files) in the working directory of the TFTP server and created the config file for PXELINUX to use. With that in place, I was prepared to boot the diskless server.
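For a sense of what the PXELINUX config file contains, here is a small sketch that generates one. PXELINUX looks for it under pxelinux.cfg/ in the TFTP directory (for example pxelinux.cfg/default). The file names and the label are placeholders, and rooturl= is a hypothetical parameter that the initrd script would have to read from /proc/cmdline itself. ip=dhcp is the standard kernel option that turns on the in-kernel DHCP client.

```python
def pxelinux_config(kernel: str, initrd: str, image_url: str) -> str:
    """Build the text of a minimal PXELINUX config with a single boot entry."""
    lines = [
        "DEFAULT diskless",
        "PROMPT 0",
        "LABEL diskless",
        f"  KERNEL {kernel}",
        # rooturl= is a made-up parameter for the initrd script, not a kernel option.
        f"  APPEND initrd={initrd} ip=dhcp rooturl={image_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder file names; both files sit next to pxelinux.0 on the TFTP server.
    print(pxelinux_config("vmlinuz-diskless", "initrd-diskless.img",
                          "http://192.0.2.10/root.squashfs"), end="")
```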

For a hosting provider I could see using an algorithm with mod_rewrite that would point the incoming request to the correct document root for that domain. The file servers would likely be NFS servers that either share disks (SAN) or mirror their disks between each other. The database servers would most likely be clustered to provide performance and high availability.
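Here is a sketch of the kind of domain-to-document-root algorithm I have in mind, written in Python rather than as rewrite rules so the logic is easy to follow. The /srv/www base and the two-level directory layout are purely hypothetical, and the sketch only handles plain host names (no IPv6 literals).

```python
import posixpath

BASE = "/srv/www"  # hypothetical mount point of the shared file servers


def document_root(host_header: str, base: str = BASE) -> str:
    """Map a Host header to a document root such as /srv/www/e/ex/example.com."""
    # Normalize: lower-case, drop any :port suffix and a leading "www.".
    host = host_header.strip().lower().split(":", 1)[0]
    if host.startswith("www."):
        host = host[4:]
    # Refuse anything that could escape the base directory.
    if not host or "/" in host or ".." in host:
        raise ValueError(f"bad host: {host_header!r}")
    # Two levels of prefix directories keep any one directory from getting huge.
    return posixpath.join(base, host[0], host[:2], host)


if __name__ == "__main__":
    print(document_root("WWW.Example.com:80"))
```

Because every web server in the cluster boots the same image and mounts the same file servers, the same mapping works on all of them without any per-site configuration.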

With this setup, it would be possible to perform updates on all systems with a simple script that would create the new root filesystem and then restart servers individually or in groups (not all at once) to receive the updates. If the file servers were done correctly, they could have limitless capacity, expanding as the need arises.
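The "restart in groups, not all at once" part could be sketched like this. It only works out the batches. Actually draining a server from the load balancer, rebooting it and checking that it came back are left out, and the server names are made up.

```python
from typing import List, Sequence


def restart_batches(servers: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split the cluster into restart groups of at most `batch_size` servers."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Never schedule the whole (multi-server) cluster in a single batch.
    if len(servers) > 1 and batch_size >= len(servers):
        raise ValueError("batch would take the whole cluster down at once")
    return [list(servers[i:i + batch_size])
            for i in range(0, len(servers), batch_size)]


if __name__ == "__main__":
    cluster = ["web1", "web2", "web3", "web4", "web5"]  # hypothetical names
    for number, batch in enumerate(restart_batches(cluster, 2), start=1):
        print(f"batch {number}: restart {', '.join(batch)}")
```

A real update script would wait for each batch to boot the new image and pass a health check before moving on to the next one.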

Hopefully in the near future I will be able to get around to organizing my config files and placing them for download.
