Member "freeha-1.0/INSTALL" (23 Nov 2006, 6139 Bytes) of package /linux/privat/old/freeha-1.0.tar.gz:
INSTALL documentation for "FreeHA"
-----------------------------------

FreeHA currently supports running a 'service' on a two-node
cluster, in active/standby mode.
Its concept of a 'single service' is rather flexible. You can actually
have it handle a collection of actual services; the only limitation is
that it is an all-or-nothing affair. Either ALL the services are running,
or the box has somehow 'failed', and the other box should start up services.



********COMPILING**********************

After a quick glance through the top of the Makefile, you should be able
to just do a plain old

    make && make install

However, you will then have to customize the three master scripts
heavily, according to what 'services' you want to run. See
"SETTING UP CLUSTERED SERVICES", below.

You will also need to create a custom startup script to launch the 'freehad'
demon at boot-time, and to set what networks to use for cluster communication.

See 'startdemon' for an example.

>>> YOU MUST SET THE 'PATH' VAR if you write your own 'startdemon' <<<


PHYSICAL SETUP
--------------------

To run a service with 'high availability', you first need to
connect two machines with multiple network cards, and configure
unique IP addresses between them, on unique networks.

For example:


|--------|                                    |--------|
| box 1  |-10.1.1.3-----ha_net1--10.1.1.5-----| box 2  |
|        |                                    |        |
|        |-192.168.1.3--ha_net2--192.168.1.5--|        |
|        |                                    |        |
|        |                                    |        |
|        |-20.14.3.6               20.14.3.11-|        |
|        |     |                        |     |        |
|--------|     |                        |     |--------|
               |                        |
  ------general-network-with-other-machines------------


10.1.1.255 is then the broadcast address FreeHA will use for the first
private channel of communication, and 192.168.1.255 is the broadcast address
for the second channel.

"Private channel" can be translated as "a network crossover cable
directly connected to each machine", or whatever works for your site.

You would start freehad on box 1 as

    freehad -a 10.1.1.3 -A 10.1.1.255 -b 192.168.1.3 -B 192.168.1.255

[although if you don't specify the broadcast, freehad will default it
to be a class C style broadcast anyway]

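As a rough illustration of the startup command above, here is a minimal
sketch of a 'startdemon'-style script for box 1. It is NOT the shipped
'startdemon'. The PATH value, the variable names, the FREEHA_RUN switch and
the classc_broadcast helper are all assumptions made for this example; only
the -a/-A/-b/-B options and the addresses come from the text above.

```shell
#!/bin/sh
# Hypothetical startdemon-style sketch; not the script shipped with FreeHA.

# Set PATH explicitly, since boot-time environments may provide very little.
# These directories are assumptions; adjust them to where freehad lives.
PATH=/opt/freeha/bin:/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# Addresses of this node on the two heartbeat networks (box 1 in the diagram).
HA_ADDR1=10.1.1.3
HA_ADDR2=192.168.1.3

# classc_broadcast: replace the last octet of a dotted-quad address with 255.
classc_broadcast() {
    echo "${1%.*}.255"
}

# Build the same command line as shown above.
freehad_cmd="freehad -a $HA_ADDR1 -A $(classc_broadcast "$HA_ADDR1") -b $HA_ADDR2 -B $(classc_broadcast "$HA_ADDR2")"

# Dry run by default: print the command. Set FREEHA_RUN=1 to really run it.
if [ "${FREEHA_RUN:-0}" = 1 ]; then
    exec $freehad_cmd
else
    echo "$freehad_cmd"
fi
```

Printing the command first makes it easy to check the addresses on each
box before wiring the script into your boot sequence.
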
============SETTING UP CLUSTERED SERVICES===================================

Services are controlled by scripts which are usually in
/opt/freeha/bin

Doing a "make install" will copy default versions of the required
scripts to that directory.
The top-level scripts are:


 - starthasrv
 - stophasrv
 - monitorhasrv


For each service you plan to run, you must add a line (or two) to each of
the three scripts to handle it.

A major goal of the FreeHA project is to provide easy-to-use utility
scripts for all common services people are interested in clustering.
That way, line entries could be as simple as

  starthasrv:    vip.start hme0 1.2.3.4
  stophasrv:     vip.stop hme0 1.2.3.4
  monitorhasrv:  vip.monitor 1.2.3.X

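To make the all-or-nothing idea concrete, here is a small sketch of how a
top-level script could run its per-service lines and fail as soon as any one
of them fails. The run_all helper is hypothetical, and the sketch assumes a
nonzero exit status is how a script reports failure, which this file does
not spell out. 'true' stands in for real per-service commands.

```shell
#!/bin/sh
# run_all: run each command string in order. Stop and return failure at the
# first one that exits nonzero, so the overall result is all-or-nothing.
run_all() {
    for cmd in "$@"; do
        if ! sh -c "$cmd"; then
            echo "failed: $cmd" >&2
            return 1
        fi
    done
    return 0
}

# Hypothetical service list for illustration only; replace 'true' with your
# own per-service lines.
if run_all "true" "true"; then
    echo "all services OK"
else
    echo "service failure"
fi
```
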
A sample fake service is provided, so that you can see the demon in action.

For your convenience, there is a sample boot-time startup script for the
freehad demon, named "startdemon".



***NOTE ON HA STARTUP***
Please note that BOTH NODES must be running before the service
will be auto-started.
Once both nodes are running, services will normally be auto-started
on the 'alphabetically first' node. Thus, if you have 3 systems named
"ha1", "ha2", and "ha3", then 'ha3' can be considered the "primary"
system.


STATUS of a node
Status of nodes can be found by reading the status file on any node.
The location of the status file defaults to

    /var/run/freeha.status

or /var/freeha/freeha.status if there is no /var/run,
or whatever you specify to be the status file when you start up freehad.

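The lookup order for the status file can be sketched in a few lines of
shell. The freeha_status_file function and its arguments are made up for
this example (the second argument exists only so the fallback can be tried
out), and the format of the file itself is not described here, so the
sketch just prints it.

```shell
#!/bin/sh
# freeha_status_file [explicit-path] [run-dir]
# Hypothetical helper: print the path where the status file is expected.
#   explicit-path: a path you passed to freehad yourself, if any
#   run-dir:       directory to probe, /var/run unless overridden
freeha_status_file() {
    override=${1:-}
    rundir=${2:-/var/run}
    if [ -n "$override" ]; then
        echo "$override"
    elif [ -d "$rundir" ]; then
        echo "$rundir/freeha.status"
    else
        echo /var/freeha/freeha.status
    fi
}

# Show the status file if it is readable on this machine.
f=$(freeha_status_file)
if [ -r "$f" ]; then
    cat "$f"
else
    echo "no readable status file at $f"
fi
```
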
==============================CAVEATS===============================

>>
Make your monitoring scripts run FAST. Heartbeats are sent between monitor
runs. If your monitoring hangs, heartbeats will not be sent, which will
eventually lead to the node being set to timed-out state by other nodes.
At which point, another node will try to TAKE OVER SERVICES!!!
Adjust timeout seconds to be longer, if monitoring is unavoidably slow.
Timeout is 120 seconds by default, so you have a good amount of leeway
to begin with.

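One way to keep a check from hanging is to put a hard time limit on it.
The sketch below assumes the GNU coreutils 'timeout' utility is installed;
it is not part of FreeHA and may be missing on some systems. The
bounded_check name and the 10 second limit are arbitrary choices for
illustration, and 'true' stands in for a real check command.

```shell
#!/bin/sh
# bounded_check LIMIT COMMAND [ARGS...]
# Run a check command, but give up after LIMIT seconds.
# Relies on the GNU coreutils 'timeout' utility (an assumption).
bounded_check() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# A check that finishes in time passes. One that runs past the limit is
# killed and reported as a failure.
if bounded_check 10 true; then
    echo "check passed"
else
    echo "check failed or took too long"
fi
```

Keep the limit well under the heartbeat timeout, so that a stuck check is
cut off long before the other node gives up on this one.
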
>>
Similarly to the above... be REALLY careful using timesync software
on clustered nodes. You should always adjust time in small increments
(eg: "date -a", or "ntpdate -B") rather than jumping to a new time.
Jumping to a new time will cause timeouts of heartbeats from
the other system, and cause split-brain hell.
In other words, the local demon on the time-adjusted machine will think it
needs to take over services, because the other side has not responded during
the gap of the time adjustment.

>>
***Do not*** run multiple clusters of FreeHA on the same subnet.
That is to say, make sure that the 'heartbeat' subnets
are private subnets shared only between the machines in
a particular cluster.
Or make sure to change the port numbers each cluster uses, so that
they do not conflict with each other, even if they are using the same
broadcast address destination.

>>
This software is by no means 'secure'. It uses a simple UDP protocol.
If someone wanted to, they could easily 'spoof' the states, and
cause your cluster to go down.
Firewalls are Good. Private networks are Better.

>>
ID for a node is encoded in 'heartbeat' packets as the hostname of the
machine, as returned by 'uname -n'. Nodes are *automatically added* to the
overall in-memory state of the cluster, if heartbeats are detected from new
nodes. (There is no automatic deletion.)
Be wary of messing with the hostname of your machines.

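Before starting the demon on a box, it can be worth checking what node ID
that box will present. This tiny sketch only prints the 'uname -n' value
mentioned above; the node_id variable name is an assumption for the example.

```shell
#!/bin/sh
# Per the caveat above, the node ID is the hostname from 'uname -n'.
node_id=$(uname -n)
echo "this node's cluster ID should be: $node_id"
```
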
169 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
170 FreeHA, July 2005 -- Philip Brown
171 http://www.bolthole.com/freeha/