"Fossies" - the Fresh Open Source Software Archive 
Member "mosshe/README.txt" (6 Oct 2020, 19484 Bytes) of package /linux/privat/old/mosshe.tar.gz:
1 -----------------------------------------------------------------------
2 MoSShE v20.9.26
3 2003-2020 by Volker Tanger <volker.tanger@wyae.de>
4 -----------------------------------------------------------------------
5
6 This software is being retired and replaced by
7 http://www.wyae.de/software/moshel/
8
9
10 MoSShE (MOnitoring in Simple SHell Environment) is a simple,
11 lightweight (both in size and system requirements) server monitoring
12 package designed for secure and in-depth monitoring of single or
13 multiple typical/critical internet systems.
14
15 As most of the servers/services I want to monitor are remote systems,
16 traditional NMS (relying on close-looped and/or unencrypted sessions) are
17 not a good fit: they are big, complicated to install for safe remote
18 monitoring, resource-intensive (when doing remote checks), lack a status
19 history, or a combination thereof.
20
21 Thus I wrote this small, easily configured system. It was originally
22 intended for monitoring a single system or a handful of typical internet
23 systems. With the more recent system and grouping features, monitoring
24 of serious numbers of systems is easily possible.
25
26 MoSShE supports email alerts and SLA monitoring out of the box - and
27 whatever you can script.
28
29 The system is programmed in plain (Bourne) SH and is meant to be compatible
30 with BASH and Busybox so it can easily be deployed on embedded systems.
31
32 Monitoring is designed to be distributed over multiple systems,
33 usually running locally. As no parameters are accepted from outside,
34 checks cannot be tampered with or misused from outside.
35
36 The system is designed to allow decentralized checks and evaluation as
37 well as classical agent-based checks with centralized data
38 accumulation.
39
40 Agent data is transferred via HTTP (pull-mode) or FTP, SSH, SCP, ...
41 in push-mode, so available web servers can be co-used for agent data
42 transfer. Additionally each agent creates simple (static) HTML pages
43 with full and condensed status reports on each system, allowing simple
44 local checks.
45
46
47 Requirements for MoSShE:
48 * Unix Shell (BASH, soon Bourne-SH, Busybox)
49 * standard Unix text tools (fgrep, cut, head, mail, time, date, ...)
50 * "netcat" networking tool
51
52 for single checks only if performed:
53 * "pstree" for tree view of process list
54 * "dig" for DNS check
55 * "free" memory display for memory check
56 * "lpq" BSD(compatible) printing for printing check
57 * "lynx" web browser for HTTP check or server/client architectures
58 * "curl" or "wget" web downloader for server/client architectures
59 * "mailq" if running the mail queue check
60 * "mbmon" or "lm-sensors" motherboard check for temp/fan check
61 * "smbclient" for samba check
62 * [future] "snmp" networking tools (especially "snmpget") for SNMP check
63 * /proc/mdstat for Linux MD0 SoftRAID checks
64 * "smartctl" (smartmontools) for HD health checks
65 * "tw_cli" from 3ware (now: LSI) for Raid3ware checks
66 * "mysqladmin" for MySQL checks
67 * "apcaccess" for UPS checks
68 * postfix and dovecot checks only work on SYSTEMD systems
69
70 for web interface:
71 * webserver - which can serve static files (= nearly any)
72 * the "dygraphs" JavaScript library. Included in the archive
73 within the /plotdata/ directory
74
75 for PUSH configuration:
76 * ftp server with incoming directory
77 * SCP server with incoming directory
78 * fileserver (SMB) with incoming directory
79
80
81 Hardware requirements:
82 A difficult question. As the checks are run and evaluated
83 locally on each system, it is nearly impossible to "overload"
84 the server, as can happen with other monitoring systems.
85
86 The system is a shell script, so no big size components here,
87 either. For a webserver (nearly) any HTTPD is fine. No
88 database needed - everything is plain text.
89
90
91
92 KNOWN ISSUES:
93 - currently (13.5.14 and newer) only works in BASH, but not in
94 BOURNE shell / Busybox, needs compatibility cleanup
95
96
97 Updates will be available at http://www.wyae.de/software/mosshe/
98 Please check there for updates prior to submitting patches!
99
100 There is a user/developer mailing list available. To subscribe send a
101 mail with "subscribe mosshe" as subject to minimalist@wyae.de
102
103 For bug reports and suggestions or if you just want to talk to me
104 please contact me at volker.tanger@wyae.de
105
106
107 -----------------------------------------------------------------------
108 Monitoring server Setup
109 -----------------------------------------------------------------------
110
111 Get and unzip the archive - usually in /usr/local/lib/mosshe.
112
113 Copy the whole plotdata/ directory to WWWDIR (see below).
114
115
116 Edit the MOSSHE file and set the environment
117
118 MYNAME HOSTname of this server
119
120 MYDOM DOMAINname of this server
121
122 MYGROUP GROUPname of this server
123
124 WWWDIR where the HTML reports and status file are saved to
125
126 DATADIR location of MOSSHE scripts (/usr/local/lib/mosshe)
127
128 TEMPDIR for temporary files (default: /tmp)
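
As a minimal sketch, the environment block in the MOSSHE script might look
like the following. The host, domain, group and web directory values are
placeholders chosen for illustration (assumptions, not shipped defaults);
only the DATADIR and TEMPDIR values are the ones named above.

```shell
# Hypothetical example values - adapt them to your own system.
MYNAME="zeus"                      # host name of this server (placeholder)
MYDOM="example.com"                # domain name of this server (placeholder)
MYGROUP="LanSrv"                   # group name used in the reports (placeholder)
WWWDIR="/var/www/mosshe"           # assumed web directory for reports and status file
DATADIR="/usr/local/lib/mosshe"    # location of the MOSSHE scripts
TEMPDIR="/tmp"                     # temporary files (default)
```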
129
130
131 In the MOSSHE shell script file you now can configure the checks to be
132 run - usually you can set warning and alert trigger levels
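
To illustrate what warning and alert trigger levels mean for a "maximum"
style check, here is a small sketch. The function name classify_max and the
use of ">=" as the comparison are assumptions for illustration; the checks
shipped in the MOSSHE script may compare differently.

```shell
# classify_max VALUE WARN ALERT
# Prints OK, WARN or ALERT for a check where higher values are worse.
# awk does the comparison so that fractional values (e.g. a load of 0.80)
# are handled correctly, which plain "test -gt" cannot do.
classify_max() {
    awk -v v="$1" -v w="$2" -v a="$3" 'BEGIN {
        if (v + 0 >= a + 0)      print "ALERT"
        else if (v + 0 >= w + 0) print "WARN"
        else                     print "OK"
    }'
}

# Example: load 0.80 with warning level 5 and alert level 10
classify_max 0.80 5 10
```

A "minimum" style check (free disk space, free RAM) would simply reverse
the comparisons.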
133
134
135 #=========================================================
136 # Local Checks
137 #=========================================================
138
139 DaysUpCheck notify of recent reboot
140
141 DebianUpdatesAvailable status whether updates are available (debian)
142 needs hourly cron job - included in the TAR
143 FedoraYumUpdatesAvailable status whether updates are available (yum)
144 needs hourly cron job - included in the TAR
145 FedoraDnfUpdatesAvailable status whether updates are available (dnf)
146 needs hourly cron job - included in the TAR
147 ArchlinuxUpdatesAvailable status whether updates are available (archlinux)
148 needs hourly cron job - included in the TAR
149 UbuntuUpdatesAvailable number of package updates available (ubuntu)
150 UbuntuReleaseUpgrade is a release upgrade available? (ubuntu)
151 UbuntuRebootRequired is a reboot required according to system? (ubuntu)
152
153 HDCheck minimum free space on a filesystem in MB
154 (string match on "df" command)
155 HDCheckGB minimum free space on a filesystem in GB
156 HDfreeMB minimum free space on a filesystem in MB (mount point)
157 HDfreeGB minimum free space on a filesystem in GB (mount point)
158 HDparmState no alert, only records active/standby state of discs
159
160 LoadCheck maximum load of a system
161 LoadHektoCheck maximum load of a system (= load value * 100)
162 MemCheck minimum free RAM (MB)
163 MemCheckLinux minimum free RAM (MB) under Linux
164
165 ProcessCheck maximum processes running
166 ZombieCheck maximum zombie processes
167 ShellCheck maximum shells for root / other users
168
169 NetworkErrorsCheck percentage of errors on interface
170 NetworkTrafficCheck maximum kbit/s network throughput
171 NetworkBandwidth maximum GByte/month bandwidth usage
172 (momentary use projected to month values)
173 NetworkConnections number of established connections (Warn Alert)
174 NetworkConnectionsNetstat ditto, using "netstat" instead of "ss"
175
176 FileCheck check file existing (check PIDs or named pipes)
177 ProcCheck check for process existing
178
179 FileTooOld check whether file was modified not too long ago
180 (e.g. for checking whether a backup has run)
181 FileTooBig check for files growing too much - esp. useful
182 for logfiles (no logrotate/galloping problems)
183 FileLines check whether a file exceeded a number of lines
184
185 MailqCheck maximum number of mails in queue
186 PrintCheck maximum number of print jobs in queue
187
188 MBMonCheck Motherboard-checks: maximum temperature, fan speeds (mbmon)
189 HardwareSensor Hardware-Check: sensor warn alert
190 HardwareSensorBetween Hardware-Check: sensor min max
191
192 SmartMonHealth health status of hard discs (reads test state)
193 HDhardwareSmart alerts on hardware failure counters
194 Raid3ware OK status of 3ware RAID controllers
195 RaidCheck checks md0 RAID (WARN=syncing, ALERT=fail)
196
197 ApcUpsValueTooHigh checks UPS health if a value is too high
198 ApcUpsValueTooLow checks UPS health if a value is too low
199 ApcUpsStatus checks UPS health if a value is equivalent to a given value
200
201 LogEntryCheck maximum number of message matches in logfiles
202 (used to check for bruteforcing, see examples in MOSSHE)
203
204 CheckFileChanges compare current file to known-good copy
205 CheckConfigChanges compare config (command) to known-good copy
206
207
208 #=========================================================
209 # Network Checks
210 #=========================================================
211
212 PingPartner maximum ping loss and avg. roundtrip
213 PingTime max roundtrip time regardless of loss
214 PingLoss max % loss regardless of roundtrip
215
216 TCPing generic TCP connect ping
217
218 SAMBA checks for Microsoft file server (SMB/CIFS/Samba)
219
220 HTTPheader http server return code
221 HTTPheadermatch checks for named return code (usually 302-Moved)
222 HTTPcontentmatch check for web site content
223
224 FTPcheck checks for FTP service
225
226 SSHcheck checks for SSH service
227
228 POP3check checks for POP3 service
229 IMAPcheck checks for IMAP service
230 SMTPcheck checks for SMTP mail service
231
232 RBLcheckIP checks whether an IP address is listed on RBL
233 RBLcheckFQDN checks whether a named system is listed on RBL
234
235 DNSquery checks whether a DNS response is given
236 DNSmatch checks a DNS response against expected value
237
238
239 #=========================================================
240 # MySQL Checks
241 #=========================================================
242
243 MySQLThreads number of Threads running
244 MySQLQueries number of Queries/second
245
246
247 #=========================================================
248 # Mail checks - all per the last 5 minutes (WARN/ALERT)
249 #=========================================================
250
251 PostfixOutTLS number of outgoing TLS connections
252 PostfixInTLS number of incoming TLS connections
253 PostfixInConnections number of incoming connections
254 PostfixNoqueue number of rejected incoming mails
255 PostfixSent number of sent mails
256
257 DovecotStored number of mails stored by dovecot without sieve
258 DovecotSieved number of mails handled by sieve
259 DovecotLoginFailed number of failed logins
260
261
262 #=========================================================
263 # VIRTUALization Checks
264 #=========================================================
265
266 CheckVserverDown verifies if Linux VSERVER is shut down
267 CheckVserverUp verifies if Linux VSERVER is up and running
268 VserverLoad measures individual Linux VSERVER uptime * 100
269
270 VZbeancounter checks usage (percent) of OpenVZ/Virtuozzo beancounters
271
272
273 #=========================================================
274 # Import agent data *from* other servers
275 #=========================================================
276
277 Typical setup is to monitor multiple scattered servers from behind
278 a (DSL) router/firewall.
279
280 With this function you can establish one or multiple central servers
281 by including the data from other MoSShE agents into the local one.
282 Just be careful not to do circular inclusions or your logfile size
283 might explode!
284
285 ImportAgent URL to the index.csv file, which can include
286 username and password as in
287 http://user:passwd@remote.server.test/mosshe/index.csv
288
289 ImportAgentCurl - see above, but using curl instead of lynx
290
291 ImportAgentWget - see above, but using wget instead of lynx
292
293 ImportServerInfo import the server info txt file for a server
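
As a rough sketch of the pull step, fetching an agent's index.csv with curl
could look like this. The function name fetch_agent_csv and the 20 second
limit are assumptions for illustration; this is not necessarily how
ImportAgentCurl is implemented. The options used (--silent, --fail,
--max-time) are standard curl options.

```shell
# fetch_agent_csv URL
# Prints the remote index.csv on stdout. Returns non-zero on HTTP errors
# or when the whole transfer takes longer than 20 seconds (assumed limit),
# so a dead agent cannot stall the monitoring run.
fetch_agent_csv() {
    curl --silent --fail --max-time 20 "$1"
}

# Example, using the URL form described above:
# fetch_agent_csv "http://user:passwd@remote.server.test/mosshe/index.csv"
```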
294
295
296
297 #=========================================================
298 # Centralize data *to* other servers
299 #=========================================================
300
301 Typical setup is to monitor multiple customer servers without opening
302 a TCP listener on them to reduce possible attack surface on those
303 systems. Instead have them send the information files to your own,
304 dedicated incoming monitoring system using battle-proven file transfer
305 system servers and methods: ftp-incoming, ssh/scp.
306
307 Or to monitor systems within a LAN without having to run additional
308 network services (except maybe the network file system mounter).
309
310 You can combine centralizing functions sequentially. You can set up an
311 "internet monitoring" server in a DMZ, receiving monitoring data from
312 customers' servers via FTP and SCP - and pulling other info off other
313 hosting systems via ImportAgent. Using separate (password-protected)
314 customer incoming monitoring directories, you even can offer split
315 monitoring: you pull all your customers from the incoming server - and
316 each customer can pull the already accumulated monitoring for their
317 systems from that machine, too.
318
319 You can mix and combine ad lib - just make damn sure not to create
320 loops, otherwise your logs will explode.
321
322 You need to set up a secured incoming server - I suggest ftp (incoming
323 directory mechanism) or SCP with password-free keys (but make sure to
324 disable shell access). On LANs you may already have a common NAS
325 (network drive) mounted, where you can directly use a dedicated
326 monitoring drop-off directory.
327
328 Examples of drop-off script snippets to include into the MOSSHE script:
329
330 ### via file system mount
331 cp $WWWDIR/index.csv /mnt/nfsmount/mosshe/zeus.example.com.csv
332
333 ### via password-free ssh key
334 scp $WWWDIR/index.csv mosshe@central.example.com:zeus.example.com.csv
335
336 ### via ftp-upload
337 ftp-upload --host central.example.com --user USER --password PASSWD \
338 --passive --no-ls --dir /incoming \
339 --as zeus.example.com.csv $WWWDIR/index.csv
340
341
342
343 On the importing/monitoring server the ReapPassiveChecks script checks
344 for the existence of the check file and its age. If the file is too
345 old (given in minutes), something with the (passively) monitored
346 system is probably wrong and an alert is raised.
347
348 # MYGROUP="Externals"
349 #### reap from servername, max.age , file location
350 # ReapPassiveChecks zeus.example.com 10 /home/ftp/zeus.example.com.csv
351 # ReapPassiveChecks hera.example.com 10 /home/ssh/hera.example.com.csv
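
The age test described above can be sketched as follows. This is an
illustration only, not the actual ReapPassiveChecks code: the function name
passive_file_status is hypothetical, and reporting a missing file as UNDEF
is an assumption. It relies on "find -mmin", which is not POSIX but is
available in GNU, BSD and Busybox find.

```shell
# passive_file_status FILE MAXAGE_MINUTES
# Prints OK    if FILE exists and was modified recently enough,
#        ALERT if its last modification is more than MAXAGE minutes ago,
#        UNDEF if FILE does not exist (assumed behaviour).
passive_file_status() {
    file="$1"; maxage="$2"
    if [ ! -f "$file" ]; then
        echo "UNDEF"
    elif [ -n "$(find "$file" -mmin +"$maxage" 2>/dev/null)" ]; then
        echo "ALERT"      # find printed the file, so it is too old
    else
        echo "OK"
    fi
}

# Example: the drop-off file must not be older than 10 minutes
# passive_file_status /home/ftp/zeus.example.com.csv 10
```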
352
353
354
355
356 #=========================================================
357 # Alerting, Logging
358 #=========================================================
359
360
361 LogTo write full log to given filename
362 LogToDaily as above, but with date appended
363 LogToWeekly as above, but with calendar week appended
364 LogToMonthly as above, but with month appended
365
366 SyslogOnChange log status changes to syslog
367
368 AlertMailAlways send alert whenever a service IS down
369 AlertMailOnChange send alert mail only if something changed
370 (whenever a service GOES up or down)
371 AlertMailOnChangeFor as above, but with pattern matching e.g.
372 server or group name
373 e.g. for alerting different admins
374
375 SLA_Eval builds log extract for downtime documentation
376
377 CreateDataFiles create data files with N maximum data sets
378 for each host and check (see $WWWDIR/datalog/)
379
380 PlotDataFiles create data files with N maximum data sets
381 (no longer uses GNUplot)
382
383 PlotAvgDataFiles create data files with values averaged over X
384 points with N maximum data sets
385 ONLY RUN AFTER PlotDataFiles
386 (no longer uses GNUplot)
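
The "on change" idea behind the alerting functions can be illustrated with
a small sketch that compares the previous and the current status file. The
function name changed_status is hypothetical and this is not the code used
by AlertMailOnChange; it assumes status lines with the fields
date;time;group;system;check;status;numeric;long, as in the example output
further down in this README.

```shell
# changed_status PREVIOUS_FILE CURRENT_FILE
# Prints every line of CURRENT_FILE whose status (field 6) differs from the
# line with the same group;system;check key in PREVIOUS_FILE.
# Checks that did not exist in PREVIOUS_FILE are printed as well.
changed_status() {
    awk -F';' '
        FILENAME == ARGV[1] { prev[$3 ";" $4 ";" $5] = $6; next }
        {
            key = $3 ";" $4 ";" $5
            if (!(key in prev) || prev[key] != $6) print
        }
    ' "$1" "$2"
}
```

The output could then be piped into "mail" only when it is not empty, so
that a mail is sent when a service goes up or down rather than on every run.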
387
388
389 -----------------------------------------------------------------------
390 Usage
391 -----------------------------------------------------------------------
392
393 Adapt the "mosshe" script.
394
395 Place the CRON.D_MOSSHE file into /etc/cron.d
396 or adapt it accordingly so mosshe is called periodically.
397
398 Via the web interface you can view the overall status - full and
399 abbreviated status. But you cannot modify anything - which makes it
400 quite safe for even non-admin multiuser use...
401 ;-)
402
403
404
405 Quick setup:
406 ------------
407 * make sure you have NMAP installed
408 * change to the TOOLS directory.
409 * run ./create_mosshe.sh MYNETWORKFILE ipaddress/mask
410 * adapt MYNETWORKFILE (especially setting the right mail addresses and
411 paths!) and rename it to ../mosshe
412 * copy CRON.D_MOSSHE to /etc/cron.d/mosshe and reload CRON
413
414 For example running
415 ./create_mosshe.sh ../mosshe 192.168.0.0/16
416 will scan your local network (in this example: 192.168.0.0/16) and
417 create a basic monitoring configuration from the services found.
418
419
420 -----------------------------------------------------------------------
421 Known/common Problems and Maintenance
422 -----------------------------------------------------------------------
423
424 During the first run (usually including every reboot of the system)
425 MoSShE will complain about nonexistent previous files it tries to compare
426 its status and values to. This is normal and expected.
427
428 Only if you consistently see errors popping up in the CRON logs/mails
429 that do not clearly relate to actual system errors (e.g. network
430 outages) does something need fixing.
431
432
433
434 -----------------------------------------------------------------------
435 Customizing Checks & Writing your own
436 -----------------------------------------------------------------------
437
438 Writing your own:
439
440 A check must terminate within a given (short) timeframe regardless of
441 circumstances - so make sure there are timeouts built in or configured.
442 If not, your complete MoSShE might hang when this check stalls.
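
One way to enforce such a limit is to wrap the command in the "timeout"
utility. This is a sketch under assumptions: "timeout" (GNU coreutils) is
not listed in the requirements above, some older Busybox builds use a
different argument syntax, and the function name run_with_timeout is
hypothetical.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND and terminates it if it has not finished after SECONDS.
# With GNU timeout the exit status is 124 when the limit was hit,
# otherwise it is the exit status of COMMAND.
run_with_timeout() {
    limit="$1"; shift
    timeout "$limit" "$@"
}

# Example: give a single ping at most 5 seconds
# run_with_timeout 5 ping -c 1 www.example.com
```

Tools with their own timeout options (for example curl --max-time) can of
course use those instead.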
443
444 Scripts (better: shell functions) must write a status line to
445 $TEMPDIR/tmp.$$.collected.tmp
446
447 A check *must* give back the results in ONE LINE PER STATUS ONLY in
448 the format:
449 date;time;groupname;systemname;check;status;numeric;long
450
451
452 DATE in ISO format: yyyy-mm-dd
453 with yyyy = 4digit year, mm=2digit month, dd=2digit day
454
455 TIME HH:MM:SS - 24hour time, all 2digit
456 this is the time local to the MoSShE server for all PING and service
457 checks, but local time of the server checked when using imported
458 checks
459
460 GROUPNAME
461 Domain name or some group name for the system as configured in mosshe
462
463 SYSTEMNAME
464 Host name or IP address of the system as configured in mosshe
465
466 CHECK (short) name of the check.
467
468 STATUS one of: OK, INFO, WARN, ALERT, UNDEF
469
470 NUMERIC the numeric value of the test, e.g. LOAD number, free megabytes, etc.
471 It must be a valid FLOAT or INT number to be displayed nicely.
472
473 LONG A longer text with details to the status. Should be short enough to
474 fit into one line of the web display for nicer display, though.
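
A minimal sketch of a custom check that follows these rules is shown below.
The helper write_status and the check UserCountCheck are hypothetical names
invented for this illustration, as is the threshold of 10 users. The field
order (date;time;group;system;check;status;numeric;long) is the one used in
the example output.

```shell
# write_status GROUP SYSTEM CHECK STATUS NUMERIC LONG
# Prints exactly one status line. Date and time come from a single
# "date" call so both always belong to the same moment.
write_status() {
    printf '%s;%s;%s;%s;%s;%s;%s\n' \
        "$(date '+%Y-%m-%d;%H:%M:%S')" "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical check: number of logged-in users, WARN above 10
UserCountCheck() {
    users=$(who | wc -l | tr -d ' ')
    if [ "$users" -gt 10 ]; then status="WARN"; else status="OK"; fi
    write_status "$MYGROUP" "$MYNAME" "users" "$status" "$users" \
        "Users logged in: $users"
}

# Inside the MOSSHE script the line is appended to the collection file:
# UserCountCheck >> "$TEMPDIR/tmp.$$.collected.tmp"
```

The LONG text must not contain a semicolon, otherwise the line can no
longer be split into its fields reliably.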
475
476
477 Here is an example of the output of a number of checks - the first 6 checks
478 after PING are all from a single LOCALCHECK script, btw.
479
480 2004-07-23;23:55:32;LanSrv;kali;ping;OK;1;host up
481 2004-07-23;23:55:32;LanSrv;kali;/dev/hda1;OK;4054;Disk free
482 2004-07-23;23:55:32;LanSrv;kali;/dev/hda2;OK;1395;Disk free
483 2004-07-23;23:55:32;LanSrv;kali;/dev/hdb3;OK;2817;Disk free
484 2004-07-23;23:55:32;LanSrv;kali;load;OK;0.80;Load: 0.80
485 2004-07-23;23:55:32;LanSrv;kali;processes;OK;76;Total processes: 76
486 2004-07-23;23:55:32;LanSrv;kali;zombies;OK;0;Zombie processes: 0 = ok
487 2004-07-23;23:55:34;LanSrv;hermes;ping;OK;1;host up
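
Because the output is plain semicolon-separated text, standard Unix tools
are enough to evaluate it. As a small sketch (the function name
list_problems is hypothetical), the following prints all entries whose
status is WARN or ALERT:

```shell
# list_problems FILE
# Prints "system check status long-text" for every status line in FILE
# whose status (field 6) is WARN or ALERT.
list_problems() {
    awk -F';' '$6 == "WARN" || $6 == "ALERT" { print $4, $5, $6, $8 }' "$1"
}

# Example, assuming the status file is $WWWDIR/index.csv:
# list_problems "$WWWDIR/index.csv"
```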
488
489
490 Please keep in mind that MoSShE is designed to be lean, small, efficient.
491 Thus having to install a JSP/EJB server only to install one singular check
492 usually is not considered overly adequate.
493
494 Small, simple, secure - that's the way we should go.
495
496
497 If you have a nice (free) check that could be of use to other people, please
498 send it to me so I can include it into the distribution.
499
500
501 -----------------------------------------------------------------------
502 Shortcut: Distributable under GPL
503 -----------------------------------------------------------------------
504 Copyright (C) 2003- Volker Tanger
505
506 This program is free software; you can redistribute it and/or modify
507 it under the terms of the GNU General Public License as published by
508 the Free Software Foundation; either version 2 of the License, or (at
509 your option) any later version.
510
511 This program is distributed in the hope that it will be useful, but
512 WITHOUT ANY WARRANTY; without even the implied warranty of
513 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
514 General Public License for more details.
515
516 You should have received a copy of the GNU General Public License
517 along with this program; if not, write to the Free Software
518 Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
519 USA, or see their website http://www.gnu.org/copyleft/gpl.html
520
521 -----------------------------------------------------------------------
522