Member "fake-1.1.11/docs/redundant_linux.txt" (8 Jul 2009, 18467 Bytes) of package /linux/misc/old/fake-1.1.11.tar.gz:

    1 Creating Redundant Linux Servers   - TEXT VERSION
    3 Horms (Simon Horman)
    4 horms@zip.com.au
    5 (c) 1998
    7 To be presented at
    8 The 4th Annual Linux Expo
    9 The Bryan University Center
   10 Duke University
   11 Durham
   12 North Carolina USA
   13 Thursday 28th - Saturday 30th May 1998
   15 http://linuxexpo.zip.com.au/
   17 I would like to thank my employer, Zip Internet Professionals
   18 http://www.zip.com.au./, for the assistance and patience that enabled
   19 this presentation to come together.
   21 Additionally I would like to thank Mr. O'Brien, Gus, Miss Kim, K and
   22 Raster for their help along the way.
   28 For an organisation of any size fault tolerance is an important issue. A server
   29 going down should not leave users twiddling their thumbs. A simple solution
   30 to this is to create backup servers that can be switched in when a server goes
   31 down. Using Linux this can be easily achieved using either existing servers
   32 or dedicated backup servers.
   34 Many services have good redundancy built in. Examples of this include mail
   35 servers and name servers. However, services such as POP and manual proxies,
   36 which require end users to specify a host to connect to, are not afforded such
   37 fault tolerance. It is for services such as these that providing backup servers
   38 becomes crucial.
   40 The idea is to create a backup server that when called upon assumes the
   41 identity of the failed server in addition to any existing identities. The backup
   42 server is given an IP alias for the failed host and uses ARP spoofing to
   43 convince the rest of the network that the backup server is in fact the failed
   44 server.
   46 This method of creating backup servers can be supplemented by using a
   47 TCP/IP Switch that allows content based services such as POP3 to be
   48 sourced from servers that may have other inaccessible services on them.
   49 Additionally housing the content for services such as HTTP on a dedicated
   50 NFS server enables a backup HTTP server to serve a site as well as the
   51 primary server.
   53 These are clearly quick and dirty solutions to creating backup servers. They
   54 have, however, proved to be quite successful in practice and require little or
   55 no outlay for additional hardware.
   61  Introduction 
   63  ARP Spoofing
   64    Background
   65    Activation 
   66    Deactivation 
   67    Automation 
   68    Improvements 
   70  TCP/IP Switch 
   71    HTTP Accelerator 
   72    POP3 Switch 
   73    A Generic Switch 
   75  NFS Backbone 
   77  Choosing a Backup Box 
   79  Testing 
   81  Discussion 
   83  Glossary
   89 While working for an ISP with Linux servers it became apparent that the built
   90 in redundancy in many key services was either inadequate or non-existent.
   91 Of particular concern was redundancy in proxy servers. As bandwidth in
   92 Australia is relatively expensive, mandatory proxies for HTTP are imposed
   93 by many ISPs. Manual proxies and the issuing of automatic proxy
   94 configuration files are particularly lacking in redundancy. To address this, a
   95 method of backing up HTTP and proxy servers was investigated. What was
   96 required was a generic method for a backup server to take over the role of a
   97 lame server.
   99 The idea initially proposed was to update DNS records as required. This
  100 would change the IP address of the lame server to that of the backup server.
  102 This was found to be unsatisfactory on the following counts:
  104   The time to live on the zone files would need to be turned down severely
  105   to account for users of DNS servers other than the master or secondary
  106   for the zone in which the servers lie, as only those can easily be reset.
  108   Users may access servers using an IP address rather than a host name.
  110   Users may use non-DNS methods such as an /etc/hosts file to map
  111   server host names to IP addresses.
  113 After some investigation it was found that a solution where the backup server
  114 would assume the IP address of the lame server would be ideal. This elimi-
  115 nated the difficulties related to the DNS based solution. The only remaining
  116 difficulty was to convince other boxen on the LAN of the change in circum-
  117 stance and this is where ARP Spoofing came into the game [YV].
  119 ARP spoofing is a method often employed by hackers to assume the identity
  120 of a host on a LAN. For this application ARP spoofing allows the backup
  121 server to take over the IP address of the lame server.
  128 Background
  130 To implement a redundant server in Linux using ARP spoofing is a relatively
  131 simple task. The existing server is given a second interface such that the
  132 server can still be accessed when the backup server is in operation. This is
  133 best achieved using a second physical interface as this gives better hardware
  134 redundancy [HM]. However in most situations using IP aliasing is quite
  135 satisfactory.
  137 [Figure 1 Original and Backup Server Interfaces] (Omitted)
  140 Activation
  142 When the backup server is brought into operation it sets up an interface with
  143 the IP address of the server it is to back up. Again this can be an additional
  144 physical interface or an IP alias. The backup server then uses ARP spoofing
  145 for the duration of its operation to ensure that it receives all packets 
  146 directed to the server it is backing up.
  148 The spoofed ARP packets that are sent announce the hardware address of
  149 the backup server that has an interface for the now lame server's IP address.
  150 These ARP packets are addressed to the broadcast hardware address. This
  151 is known as a Gratuitous ARP, as a machine makes an ARP request for its
  152 own IP address.
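
As a rough illustration of the packet described above, the following Python
sketch builds (but does not send) a Gratuitous ARP frame. The function names,
the example addresses and the use of the broadcast address in the ARP target
hardware field are assumptions made for this example; they are not taken from
any particular tool.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_gratuitous_arp(backup_mac: str, served_ip: str) -> bytes:
    """Return an Ethernet frame carrying a gratuitous ARP request."""
    mac = mac_to_bytes(backup_mac)
    ip = socket.inet_aton(served_ip)
    # Ethernet header: broadcast destination, backup server as source.
    ethernet_header = struct.pack("!6s6sH", BROADCAST_MAC, mac, ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,              # hardware type: Ethernet
        0x0800,         # protocol type: IPv4
        6,              # hardware address length
        4,              # protocol address length
        ARP_REQUEST,    # opcode: request
        mac,            # sender hardware address: the backup server
        ip,             # sender IP address: the lame server's address
        BROADCAST_MAC,  # target hardware address (assumed: broadcast)
        ip,             # target IP address: the same, hence "gratuitous"
    )
    return ethernet_header + arp_payload


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    frame = build_gratuitous_arp("00:11:22:33:44:55", "192.0.2.10")
    print(len(frame), frame.hex())
```

Actually putting such a frame on the wire needs a raw link-layer socket and
root privileges, which is deliberately left out of this sketch.
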
  154 ARP is central to the functioning of a LAN as it enables the hardware address
  155 of a machine to be found given its IP address. Once the hardware address
  156 of a machine is known, packets can be sent to it over the LAN. Machines keep
  157 a cache of hardware to IP address mappings so that a fresh ARP request
  158 doesn't need to be sent out for each IP packet. The hardware address in the
  159 most recent ARP reply for a given IP address will be used. Hence by using
  160 Gratuitous ARP it is possible to force this cache to be updated, redirecting
  161 IP packets to a different hardware address and hence in this case a different
  162 machine.
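
The caching behaviour described above can be pictured with a toy model in
Python. This is only a sketch: the class name, the 60 second timeout and the
addresses are assumptions for illustration, and real ARP implementations
differ in their expiry rules.

```python
class ArpCache:
    """Toy model of a host's ARP cache: IP address -> (hardware address, time)."""

    def __init__(self, timeout: float = 60.0):
        self.timeout = timeout  # assumed expiry in seconds
        self.entries = {}

    def learn(self, ip: str, mac: str, now: float) -> None:
        # The most recent ARP packet for an IP address replaces the old mapping.
        self.entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at > self.timeout:
            # Entry has expired; a real host would now send a fresh ARP request.
            del self.entries[ip]
            return None
        return mac


if __name__ == "__main__":
    cache = ArpCache()
    cache.learn("192.0.2.10", "00:00:00:00:00:01", now=0.0)   # lame server
    cache.learn("192.0.2.10", "00:00:00:00:00:02", now=10.0)  # backup server
    print(cache.lookup("192.0.2.10", now=20.0))
```
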
  164 It is important that the ARP packets are sent frequently enough that the
  165 ARP cache of other boxen on the LAN does not expire. If the ARP cache
  166 did expire then an ARP request for the hardware address of the lame server
  167 would be issued. If the lame server is in a state where it is able to answer
  168 ARP requests then a race condition would be created between the lame server
  169 and the backup server, as shown in Figure 2.
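
A minimal sketch of the repeated announcement, in Python. The function and
parameter names are hypothetical, and the cache timeout must be supplied by
the caller since it varies between systems. The send step is passed in as a
function so that the loop itself can be tried without touching the network.

```python
import time
from typing import Callable


def announce_until_stopped(
    send: Callable[[], None],
    should_stop: Callable[[], bool],
    interval: float,
    cache_timeout: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() every `interval` seconds until should_stop() is true.

    Returns the number of announcements sent.
    """
    if interval >= cache_timeout:
        # Entries could expire between announcements, allowing the race
        # between the lame server and the backup server.
        raise ValueError("interval must be shorter than the ARP cache timeout")
    sent = 0
    while not should_stop():
        send()
        sent += 1
        sleep(interval)
    return sent
```
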
  172 [Figure 2 Race Condition for ARP replies] (Omitted)
  175 Deactivation
  177 Once the existing server is ready to be used again it is simply a matter of
  178 removing the additional interface on the backup server and stopping ARP
  179 spoofing. Finally additional spoofed ARP packets are sent out pointing the
  180 existing server's IP address back to the original hardware address.
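
The activation and deactivation steps might be scripted along the following
lines. This sketch assumes a present-day Linux box with the iproute2 ip
command and the iputils arping utility, neither of which is named in this
paper; the device name, alias label and address are placeholders. The
functions only build the command lines and do not run them.

```python
from typing import List


def activation_commands(ip_cidr: str, device: str, label: str) -> List[List[str]]:
    """Commands to add the lame server's address and announce it."""
    address = ip_cidr.split("/")[0]
    return [
        # Add the address as an alias on the given device.
        ["ip", "addr", "add", ip_cidr, "dev", device, "label", label],
        # Send unsolicited ARP packets so neighbours update their caches.
        ["arping", "-U", "-c", "3", "-I", device, address],
    ]


def deactivation_commands(ip_cidr: str, device: str) -> List[List[str]]:
    """Command to remove the address again once the original server is back."""
    return [["ip", "addr", "del", ip_cidr, "dev", device]]


if __name__ == "__main__":
    for command in activation_commands("192.0.2.10/24", "eth0", "eth0:0"):
        print(" ".join(command))
    for command in deactivation_commands("192.0.2.10/24", "eth0"):
        print(" ".join(command))
```

Pointing the address back at the original hardware address on deactivation
needs a tool that can send an ARP packet with an arbitrary sender hardware
address, and is not covered by this sketch.
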
  183 Automation
  185 The process of turning the backup server on and off is easily automated
  186 such that if the existing server fails the backup server is activated. Such
  187 automation takes two stages. Firstly the status of the server is gauged by
  188 attempting to access key services it provides. Secondly in a failure situation
  189 scripts to enable the second interface on the backup server and kick off ARP
  190 Spoofing are activated. Similarly, by accessing the lame server via its second
  191 interface it can be ascertained when the backup server can be deactivated.
  192 This is done by running scripts that deactivate the second interface on the
  193 backup server and stop ARP Spoofing.
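
The two stages can be sketched in Python as a health check followed by a
decision. The function names and the returned action strings are hypothetical,
and a plain TCP connection attempt is assumed to be an adequate test of a
service, which will not hold for every daemon.

```python
import socket
from typing import Iterable


def service_is_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def server_is_healthy(host: str, ports: Iterable[int], timeout: float = 3.0) -> bool:
    """A server counts as healthy only if every key service answers."""
    return all(service_is_up(host, port, timeout) for port in ports)


def next_action(backup_active: bool, primary_healthy: bool) -> str:
    """Decide what the monitoring script should do next.

    While the backup is active, primary_healthy should be measured via the
    primary server's second interface, as its usual address is taken over.
    """
    if not backup_active and not primary_healthy:
        return "activate"
    if backup_active and primary_healthy:
        return "deactivate"
    return "none"
```
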
  199 Improvements
  201 The ARP based solution is particularly well suited to services which act as a
  202 relay. Proxies and SMTP relays fall into this category and the users should
  203 not be able to tell when the backup server is in operation. With this in
  204 mind other complementary methods of creating redundant servers have been
  205 investigated. The use of some sort of TCP/IP switch on servers, backup or
  206 otherwise, would allow a more powerful backup scheme to be developed as
  207 content could still be sourced from servers where it is still available.
  210 HTTP Accelerator
  212 The popular Squid proxy daemon comes with a facility that allows a single
  213 server to act as a front end to web servers [OP]. This works by having
  214 clients connect to the Squid server as if it were an HTTP server and then
  215 farming requests onto the real web server or servers. This can be used to
  216 share load around multiple servers on high volume sites as illustrated in
  217 Figure 3 or to protect HTTP servers that contain sensitive data by placing
  218 them behind a firewall such that the Squid server can access the HTTP server
  219 but other hosts on the Internet cannot.
  221 Though primarily intended to allow load sharing on high volume sites this
  222 can also be used to provide some form of redundancy. The HTTP accelerator
  223 server can be a front end for multiple back end HTTP servers, hence the loss
  224 of an HTTP server should not result in a site being down. And of course,
  225 as the HTTP accelerator itself has no content, it can be backed up using the
  226 ARP based method of creating redundant servers. On small sites this extra
  227 layer between users and the web server may just be another potential point
  228 of failure; however, the switching idea presented is an interesting one.
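
To illustrate the idea of a front end that shares requests among several back
end servers and skips any that are down, here is a small Python sketch. It is
not how Squid itself chooses a server; the function name and the host names
are made up for this example, and the is_up test is supplied by the caller.

```python
from itertools import cycle
from typing import Callable, List, Optional


def make_backend_picker(
    backends: List[str], is_up: Callable[[str], bool]
) -> Callable[[], Optional[str]]:
    """Return a function that picks back ends in turn, skipping dead ones."""
    rotation = cycle(backends)

    def pick() -> Optional[str]:
        # Try each back end at most once per request.
        for _ in range(len(backends)):
            candidate = next(rotation)
            if is_up(candidate):
                return candidate
        return None  # every back end is down

    return pick


if __name__ == "__main__":
    down = {"web2"}
    pick = make_backend_picker(["web1", "web2", "web3"], lambda host: host not in down)
    print([pick() for _ in range(4)])
```
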
  231 [Figure 3 HTTP Accelerator] (Omitted)
  234 POP3 Switch
  236 It is quite common for the SMTP and POP3 servers to be the same box so
  237 mail is delivered and collected from a spool directory controlled by a single
  238 localised system. In a situation where the SMTP daemon is incapacitated it
  239 is desirable to switch to the backup server so users can still send mail.
  240 However, even if the POP3 daemon is still operable, the backup server
  241 invariably does not have access to the mail spool, so after the switch POP3
  242 becomes unavailable as well. A POP3 Switch can overcome this.
  244 A POP3 Switch is simply a data pipe that accepts a list of foreign host-port
  245 pairs and tries them in turn until a connection can be made as shown in
  246 Figure 4. So in our situation the POP3 Switch may first try to contact the
  247 POP3 port of the lame host and then go to a dummy POP3 server listening
  248 on a port on the local host.
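
A data pipe of this kind can be sketched in a few lines of Python. This is an
illustration under assumptions rather than the switch used in practice: the
function names are hypothetical, there is no logging or listening loop, and
each connection is relayed with two threads.

```python
import socket
import threading
from typing import Iterable, Optional, Tuple


def connect_first(
    targets: Iterable[Tuple[str, int]], timeout: float = 3.0
) -> Optional[socket.socket]:
    """Try each (host, port) pair in turn and return the first connection."""
    for host, port in targets:
        try:
            upstream = socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # this pair is unreachable, try the next one
        upstream.settimeout(None)  # back to blocking mode for the relay
        return upstream
    return None


def _copy(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end of stream along
        except OSError:
            pass


def pipe(client: socket.socket, upstream: socket.socket) -> None:
    """Relay data in both directions until both sides have finished."""
    threads = [
        threading.Thread(target=_copy, args=(client, upstream)),
        threading.Thread(target=_copy, args=(upstream, client)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    client.close()
    upstream.close()
```

In the situation described above the list of targets would hold the POP3 port
of the lame host first and a dummy POP3 server on the local host second.
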
  251 [Figure 4 POP3 Switch] (Omitted)
  254 A Generic Switch
  256 Of course the POP3 switch described is just a TCP/IP data pipe and hence
  257 is extensible to just about any protocol that uses TCP/IP. The only penalty
  258 is that the further down the list of possible host-port pairs the switch has
  259 to go before making a connection the longer the connection time becomes.
  260 However some sort of caching mechanism by which a bad host-port pair is not
  261 tried again for a time could improve this.
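
One possible shape for such a caching mechanism is sketched below in Python.
The class name, the 60 second retry period and the injectable clock are
assumptions made for this example.

```python
import time
from typing import Callable, Dict, Tuple

Target = Tuple[str, int]


class BadTargetCache:
    """Remember host-port pairs that failed and skip them for a while."""

    def __init__(self, retry_after: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.retry_after = retry_after  # assumed period in seconds
        self.clock = clock
        self._failed_at: Dict[Target, float] = {}

    def mark_bad(self, target: Target) -> None:
        """Record that a connection to target has just failed."""
        self._failed_at[target] = self.clock()

    def should_try(self, target: Target) -> bool:
        """Return False while target is still within its retry period."""
        failed_at = self._failed_at.get(target)
        if failed_at is None:
            return True
        if self.clock() - failed_at >= self.retry_after:
            del self._failed_at[target]  # give the pair another chance
            return True
        return False
```

The switch would call should_try before each connection attempt and mark_bad
whenever an attempt fails.
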
  263 Hence we are able to swap in backup servers using ARP spoofing and have
  264 them point to content where it is still available using TCP/IP data pipes.
  270 So far a method for switching backup servers in to assume the IP address of
  271 a lame server has been found and a way to source services from otherwise
  272 lame servers has been explored. However if we are trying to back up a service
  273 that provides a large amount of relatively dynamic data and the service goes
  274 down we still do not have an adequate solution.
  276 An example of such a service is an HTTP server. It is not necessarily practical
  277 to keep multiple copies of a web site on different hosts due to the dynamic
  278 nature of most sites and the cost in terms of disk space. A solution that
  279 enables a backup server to access the content of a service such as HTTP
  280 when the main server goes down is to have the content situated on a third
  281 server and mounted via NFS.
  283 If the NFS server is set up such that it does nothing but serve NFS it should
  284 be quite stable and a low risk single point of failure. Additionally, by placing
  285 the NFS server on a physically separate network or on a different segment
  286 of the LAN and giving servers that use it a second network card, the extra NFS
  287 traffic is kept off the main network.
  289 Therefore the content for the service can be accessed regardless of whether
  290 the main server or the backup server is in operation. In the case of an HTTP
  291 server, for which this solution is particularly well suited, this means the web
  292 site should remain accessible.
  298 Although none of the solutions discussed requires a dedicated backup
  299 server it is advisable to have one. If a server that has other tasks to perform
  300 is run as a backup server then the additional load placed on the server when it
  301 is running the services of another box may cause an unacceptable slow down
  302 or raise reliability issues. For this reason it is advisable to have a backup
  303 server on which very little is running.
  309 As with any system it is important to test that the backup server functions as
  310 expected. Your testing regime should include a full production test including
  311 having any automated aspects run their due course. Although this will result
  312 in some disruption of service to users it is better for a brief outage to occur
  313 under controlled circumstances than for some unexpected behavior to surface
  314 in a crisis situation.
  316 It is also a good plan to have a regular testing procedure in place. The
  317 nature of the backup server is that it hardly ever gets used and is likely to
  318 be used for other purposes from time to time. As such it is very easy for
  319 one configuration or another to get altered and go unnoticed. By conducting
  320 regular, possibly automated tests you can ensure that the backup server is
  321 always in good shape.
  327 The ARP based solution is particularly well suited to services which act as a
  328 relay. Proxies and SMTP relays fall into this category and the users should
  329 not be able to tell when the backup server is in operation.
  331 When the service that is to be backed up is a source of data this method of
  332 creating redundant servers, though not well suited, can still be successfully
  333 applied. A backup POP3 or IMAP server could be configured such that an
  334 email explaining the current situation is delivered. Key parts of a web site
  335 can be duplicated and warning pages issued in lieu of unavailable pages.
  336 When the ARP based solution is coupled with a TCP/IP switch then services
  337 that provide content can also be made more redundant. Finally by housing
  338 content on an NFS server, backup servers can have access to content and serve
  339 it accordingly.
  341 The redundant servers created can be used in a variety of situations. First
  342 and foremost their activation can be automated such that the backup servers
  343 are called into service in emergency situations. Automation is particularly
  344 attractive here as such situations typically occur around 2 am. Additionally
  345 the redundancy can be used to prevent disruption to users when system
  346 maintenance and hardware upgrades are being undertaken.
  348 We can see that using simple utilities coupled with the power of Linux,
  349 redundant servers are easy to realise even for small organisations. This
  350 redundancy can be used to provide a more constant and stable level of service
  351 to users. This increases their satisfaction while reducing your support burden.
  353 While it is obvious that the solutions presented are targeted towards low end
  354 applications there is no reason why these concepts could not be scaled up.
  357 What is important to realise is that the power of Linux enables us to create
  358 solutions that suit our needs rather than modifying our needs to fit with the
  359 solutions available.
  365 ARP: Address Resolution Protocol. Protocol used to map an interface's IP
  366      address to the hardware address of the network card.
  368 Daemon: A programme that runs in the background and performs a specific task. 
  369      A Web server is usually implemented as a daemon.
  371 Data Pipe: Daemon that accepts a TCP/IP connection and forwards
  372      it to another host and port. Note that the host can be any host including
  373      the host on which the daemon is running.
  375 DNS: Domain Name System. Distributed database used to map host names
  376      to IP addresses and vice-versa.
  378 Hardware Address: Unique number associated with each network card
  379      used with low level protocols.
  381 Host: A computer on the Internet.
  383 Localhost: Interface on a computer that loops back to the computer on
  384     which the interface resides.
  386 HTTP: HyperText Transfer Protocol. Protocol used by the World Wide Web.
  388 IMAP: Internet Message Access Protocol. Protocol used to view mail in
  389     remote mail boxes.
  391 Interface: Software access point to network hardware.
  393 IP: Internet Protocol. The underlying protocol used to transfer data on the
  394     Internet.
  396 IP Address: Unique number assigned to each interface on the Internet.
  397 IP Aliasing: Kernel option that allows multiple interfaces to be assigned
  398    to a single network card.
  400 ISP: Internet Service Provider. An organisation that provides internet
  401    connectivity and other related services.
  403 LAN: Local Area Network. Network used to connect boxen at close proximity.
  405 NFS: Network File System. Method of making a directory and its contents
  406    available to other boxen on a network.
  408 Redundancy: The ability to keep functioning at some level after a failure.
  410 POP3: Post Office Protocol version 3. Protocol used to download mail from
  411     remote mail boxes.
  413 Proxy: A service by which requests for information from services such as
  414     HTTP are done on behalf of clients and the information is returned to the
  415     client. The information collected on behalf of the client may be kept in a
  416     local cache on the proxy server.
  418 Port: A software access point to a host. Hosts have multiple ports and
  419     daemons typically listen on a specific port or ports for connections from
  420     clients.
  422 Service: A source of information that users access, e.g. an HTTP server
  423     provides web pages.
  425 SMTP: Simple Mail Transfer Protocol. Protocol used to transfer email over
  426    the internet.
  428 SMTP relay: SMTP server that forwards email from one box to another.
  430 TCP/IP: Transmission Control Protocol/Internet Protocol. Pair of protocols
  431     that provide a connection based service used on the internet for protocols
  432     such as HTTP and SMTP.
  434 /etc/hosts: A file on Unix systems that maps host names to IP addresses.
  439 References
  441 [HM] hm@seneca.muc.de Harald Milz. Linux High Availability Howto.
  442      http://www.muc.de./~hm/linux/HA/High-Availability-HOWTO.html,
  443      http://sunsite.unc.edu./pub/Linux/ALPHA/linux_ha/High-Availability-HOWTO.html,
  444      February 1998.
  446 [OP] oskar@is.co.za Oskar Pearson. Squid Users Guide.
  447      http://cache.is.co.za./squid/, September 1997.
  449 [YV] jmcdonal@unf.edu Yuri Volobuev. Playing Redir Games With ARP
  450      And ICMP. http://www.rootshell.com./.