Member "mosshe/README.txt" (6 Oct 2020, 19484 Bytes) of package /linux/privat/old/mosshe.tar.gz:


    1 -----------------------------------------------------------------------
    2       			MoSShE  v20.9.26
    3  	   2003-2020 by Volker Tanger <volker.tanger@wyae.de>
    4 -----------------------------------------------------------------------
    5 
    6 This software is being retired and replaced by 
    7 http://www.wyae.de/software/moshel/
    8 
    9 
   10 MoSShE (MOnitoring in Simple SHell Environment) is a simple,
   11 lightweight (both in size and system requirements) server monitoring
   12 package designed for secure and in-depth monitoring of single or
   13 multiple typical/critical internet systems. 
   14 
   15 Most of the servers/services I want to monitor are remote systems.
   16 Traditional NMS (relying on closed-loop and/or unencrypted sessions) are
   17 either big, complicated to install for safe remote monitoring,
   18 resource-intensive (when doing remote checks), lack a status history,
   19 or a combination thereof.
   20 
   21 Thus I wrote this small, easily configured system. It originally was
   22 intended for monitoring a single or a handful of typical internet
   23 systems. With the more recent system and grouping features, monitoring
   24 large numbers of systems is easily possible.
   25 
   26 MoSShE supports email alerts and SLA monitoring out of the box - and
   27 whatever you can script. 
   28 
   29 The system is written in plain (Bourne) SH and is meant to be compatible
   30 with BASH and Busybox, so it can easily be deployed on embedded systems.
   31 
   32 Monitoring is designed to be distributed over multiple systems,
   33 usually running locally. As no parameters are accepted from outside,
   34 checks cannot be tampered with or misused from outside. 
   35 
   36 The system is designed to allow decentralized checks and evaluation as
   37 well as classical agent-based checks with centralized data
   38 accumulation. 
   39 
   40 Agent data is transferred via HTTP (pull-mode) or FTP, SSH, SCP, ...
   41 in push-mode, so available web servers can be co-used for agent data
   42 transfer. Additionally each agent creates simple (static) HTML pages
   43 with full and condensed status reports on each system, allowing simple
   44 local checks.
   45 
   46 
   47 Requirements for MoSSHe:
   48 	* Unix Shell (BASH, soon Bourne-SH, Busybox)
   49 	* standard Unix text tools (fgrep, cut, head, mail, time, date, ...)
   50 	* "netcat" networking tool
   51 	
   52 For individual checks (only needed if the check is used):
   53 	* "pstree" for tree view of process list
   54 	* "dig" for DNS check
   55 	* "free" memory display for memory check
   56 	* "lpq" BSD(compatible) printing for printing check
   57 	* "lynx" web browser for HTTP check or server/client architectures
   58 	* "curl" or "wget" web downloader for server/client architectures
   59 	* "mailq" if running the mail queue check
   60 	* "mbmon" or "lm-sensors" motherboard check for temp/fan check
   61 	* "smbclient" for samba check
   62 	* [future] "snmp" networking tools (especially "snmpget") for SNMP check
   63 	* /proc/mdstat for Linux MD0 SoftRAID checks
   64 	* "smartctl" (smartmontools) for HD health checks
   65 	* "tw_cli" from 3ware (now: LSI) for Raid3ware checks
   66 	* "mysqladmin" for MySQL checks
   67 	* "apcaccess" for UPS checks
   68 	* postfix and dovecot checks only work on SYSTEMD systems
   69 	
   70 for web interface:
   71 	* webserver - which can serve static files (= nearly any)
   72 	* the "dygraphs" JavaScript library. Included in the archive
   73 	  within the /plotdata/ directory
   74 
   75 for PUSH configuration: 
   76 	* ftp server with incoming directory 
   77 	* SCP server with incoming directory 
   78 	* fileserver (SMB) with incoming directory
   79 
   80 
   81 Hardware requirements:
   82 	A difficult question. As the checks are run and evaluated
   83 	locally on each system, it is nearly impossible to "overload"
   84 	the server, as can happen with other monitoring systems. 
   85 
   86 	The system is a shell script, so no big size components here,
   87 	either. For a webserver (nearly) any HTTPD is fine. No
   88 	database needed - everything is plain text. 
   89 
   90 
   91 
   92 KNOWN ISSUES:
   93 	- currently (13.5.14 and newer) only works in BASH, but not in 
   94 	  BOURNE shell / Busybox, needs compatibility cleanup
   95 
   96 
   97 Updates will be available at   http://www.wyae.de/software/mosshe/
   98 Please check there for updates prior to submitting patches!
   99 
  100 There is a user/developer mailing list available. To subscribe send a
  101 mail with "subscribe mosshe" as subject to minimalist@wyae.de
  102 
  103 For bug reports and suggestions or if you just want to talk to me
  104 please contact me at volker.tanger@wyae.de
  105 
  106 
  107 -----------------------------------------------------------------------
  108 Monitoring server Setup
  109 -----------------------------------------------------------------------
  110 
  111 Get and unpack the archive - usually in /usr/local/lib/mosshe.
  112 
  113 Copy the whole plotdata/ directory to WWWDIR (see below).
  114 
  115 
  116 Edit the MOSSHE file and set the environment:
  117 
  118 	MYNAME	HOSTname of this server
  119 	
  120 	MYDOM	DOMAINname of this server
  121 	
  122 	MYGROUP	GROUPname of this server
  123 	
  124 	WWWDIR	where the HTML reports and status file are saved to
  125 	
  126 	DATADIR	location of MOSSHE scripts (/usr/local/lib/mosshe)
  127 
  128 	TEMPDIR	for temporary files (default: /tmp)
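
As a minimal sketch, the environment block in the MOSSHE file might look
like this. All values are placeholders for illustration: the host, domain
and group names and the /var/www/mosshe path are assumptions, not defaults
shipped with MoSShE.

```shell
# Hypothetical example values - adapt them to your own system.
MYNAME="zeus"                      # host name of this server
MYDOM="example.com"                # domain name of this server
MYGROUP="LanSrv"                   # group name of this server
WWWDIR="/var/www/mosshe"           # assumed web directory for reports and status file
DATADIR="/usr/local/lib/mosshe"    # location of the MOSSHE scripts
TEMPDIR="/tmp"                     # temporary files (default)
```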
  129 
  130 
  131 In the MOSSHE shell script file you can now configure the checks to be
  132 run - usually you can set warning and alert trigger levels.
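
How a measured value and the two trigger levels turn into a status can be
sketched as follows. This is an illustration only, not the code used inside
MOSSHE: the helper names eval_max and eval_min are hypothetical, and
treating the thresholds as inclusive is an assumption.

```shell
# eval_max VALUE WARN ALERT
# For "maximum" checks (load, processes, ...): higher values are worse.
eval_max() {
    awk -v v="$1" -v w="$2" -v a="$3" 'BEGIN {
        if (v + 0 >= a + 0)      print "ALERT"
        else if (v + 0 >= w + 0) print "WARN"
        else                     print "OK"
    }'
}

# eval_min VALUE WARN ALERT
# For "minimum" checks (free disk space, free RAM, ...): lower values are worse.
eval_min() {
    awk -v v="$1" -v w="$2" -v a="$3" 'BEGIN {
        if (v + 0 <= a + 0)      print "ALERT"
        else if (v + 0 <= w + 0) print "WARN"
        else                     print "OK"
    }'
}

eval_max 0.80 4 8        # load 0.80, warn at 4, alert at 8
eval_min 900 2000 500    # 900 MB free, warn at 2000 or less, alert at 500 or less
```

awk is used for the comparison because plain shell arithmetic cannot
compare fractional values such as a load of 0.80.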
  133 
  134 
  135 #=========================================================
  136 # Local Checks
  137 #=========================================================
  138 
  139 DaysUpCheck		notify of recent reboot
  140 
  141 DebianUpdatesAvailable	status whether updates are available (debian) 
  142 			needs hourly cron job - included in the TAR
  143 FedoraYumUpdatesAvailable	status whether updates are available (yum) 
  144 			needs hourly cron job - included in the TAR
  145 FedoraDnfUpdatesAvailable	status whether updates are available (dnf) 
  146 			needs hourly cron job - included in the TAR
  147 ArchlinuxUpdatesAvailable	status whether updates are available (archlinux) 
  148 			needs hourly cron job - included in the TAR
  149 UbuntuUpdatesAvailable	number of package updates available (ubuntu)
  150 UbuntuReleaseUpgrade	is a release upgrade available? (ubuntu)
  151 UbuntuRebootRequired	is a reboot required according to system? (ubuntu)
  152 
  153 HDCheck 		minimum free space on a filesystem in MB
  154 			(string match on "df" command)
  155 HDCheckGB 		minimum free space on a filesystem in GB
  156 HDfreeMB		minimum free space on a filesystem in MB (mount point)
  157 HDfreeGB		minimum free space on a filesystem in GB (mount point)
  158 HDparmState 		no alert, only records active/standby state of discs
  159 
  160 LoadCheck		maximum load of a system
  161 LoadHektoCheck		maximum load of a system (= load * 100)
  162 MemCheck		minimum free RAM (MB)
  163 MemCheckLinux		minimum free RAM (MB) under Linux
  164 
  165 ProcessCheck		maximum processes running
  166 ZombieCheck		maximum zombie processes
  167 ShellCheck		maximum shells for root / other users
  168 
  169 NetworkErrorsCheck	percentage of errors on interface
  170 NetworkTrafficCheck	maximum kbit/s network throughput
  171 NetworkBandwidth	maximum GByte/month bandwidth usage 
  172 			(momentary use projected to month values)
  173 NetworkConnections	number of established connections (Warn Alert)
  174 NetworkConnectionsNetstat	ditto, using "netstat" instead of "ss"
  175 
  176 FileCheck		check that a file exists (e.g. PID files or named pipes)
  177 ProcCheck		check that a process exists
  178 
  179 FileTooOld 		check whether file was modified not too long ago
  180 			(e.g. for checking whether a backup has run)
  181 FileTooBig		check for files growing too much - esp. useful
  182 			for logfiles (no logrotate/galloping problems) 
  183 FileLines		check whether a file exceeded a number of lines
  184 
  185 MailqCheck		maximum number of mails in queue
  186 PrintCheck		maximum number of print jobs in queue
  187 
  188 MBMonCheck		Motherboard-checks: maximum temperature, fan speeds (mbmon)
  189 HardwareSensor		Hardware-Check: sensor warn alert
  190 HardwareSensorBetween	Hardware-Check: sensor min max
  191 
  192 SmartMonHealth		health status of hard discs (reads test state)
  193 HDhardwareSmart		alerts on hardware failure counters
  194 Raid3ware		OK status of 3ware RAID controllers
  195 RaidCheck		checks md0 RAID  (WARN=syncing, ALERT=fail)
  196 
  197 ApcUpsValueTooHigh	checks UPS health if a value is too high
  198 ApcUpsValueTooLow	checks UPS health if a value is too low
  199 ApcUpsStatus		checks UPS health if a value is equivalent to a given value
  200 
  201 LogEntryCheck		maximum number of message matches in logfiles
  202 			(used to check for bruteforcing, see examples in MOSSHE)
  203 
  204 CheckFileChanges	compare current file to known-good copy
  205 CheckConfigChanges	compare config (command) to known-good copy
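
NetworkBandwidth projects the momentary throughput to a monthly volume. The
arithmetic can be sketched like this, assuming a 30-day month and decimal
units (1 GByte = 10^9 bytes). The actual check may use other conventions,
and the function name is hypothetical.

```shell
# project_gbyte_month KBIT_PER_SECOND
# kbit/s -> bytes/s (x 1000 / 8) -> bytes per 30-day month -> GByte
project_gbyte_month() {
    awk -v kbps="$1" 'BEGIN {
        seconds_per_month = 30 * 24 * 60 * 60
        printf "%.1f\n", kbps * 1000 / 8 * seconds_per_month / 1e9
    }'
}

project_gbyte_month 1000    # a steady 1000 kbit/s projects to 324.0 GByte/month
```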
  206 
  207 
  208 #=========================================================
  209 # Network Checks
  210 #=========================================================
  211 
  212 PingPartner		maximum ping loss and avg. roundtrip
  213 PingTime 		max roundtrip time regardless of loss
  214 PingLoss		max % loss regardless of roundtrip
  215 
  216 TCPing			generic TCP connect ping
  217 
  218 SAMBA			checks for Microsoft file server (SMB/CIFS/Samba)
  219 
  220 HTTPheader		http server return code
  221 HTTPheadermatch		checks for named return code (usually 302-Moved)
  222 HTTPcontentmatch	 checks for web site content
  223 
  224 FTPcheck		checks for FTP service
  225 
  226 SSHcheck		checks for SSH service
  227 
  228 POP3check		checks for POP3 service
  229 IMAPcheck		checks for IMAP service
  230 SMTPcheck		checks for SMTP mail service
  231 
  232 RBLcheckIP		checks whether an IP address is listed on RBL
  233 RBLcheckFQDN		checks whether a named system is listed on RBL
  234 
  235 DNSquery		checks whether a DNS response is given
  236 DNSmatch		checks a DNS response against expected value
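
To illustrate what HTTPheader evaluates, here is a small sketch that
extracts the numeric return code from a set of response headers. It is not
the implementation from MOSSHE. The function name is hypothetical, and
fetching the headers (for example with "curl -sI URL") is left out so that
the example runs without a network.

```shell
# http_status_code: read HTTP response headers on stdin and print the
# numeric code from the status line (e.g. "HTTP/1.1 302 Found" -> 302).
http_status_code() {
    awk 'NR == 1 && $1 ~ /^HTTP\// { print $2; exit }'
}

# Canned response instead of a live server:
printf 'HTTP/1.1 302 Found\r\nLocation: /new/\r\n\r\n' | http_status_code
```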
  237 
  238 
  239 #=========================================================
  240 # MySQL Checks
  241 #=========================================================
  242 
  243 MySQLThreads		number of Threads running 
  244 MySQLQueries		number of Queries/second
  245 
  246 
  247 #=========================================================
  248 # Mail checks - all per the last 5 minutes (WARN/ALERT)
  249 #=========================================================
  250 
  251 PostfixOutTLS 		number of outgoing TLS connections
  252 PostfixInTLS 		number of incoming TLS connections
  253 PostfixInConnections 	number of incoming connections
  254 PostfixNoqueue 		number of rejected incoming mails
  255 PostfixSent 		number of sent mails
  256 
  257 DovecotStored 		number of mails stored by dovecot without sieve
  258 DovecotSieved 		number of mails handled by sieve
  259 DovecotLoginFailed	number of failed logins
  260 
  261 
  262 #=========================================================
  263 # VIRTUALization Checks
  264 #=========================================================
  265 
  266 CheckVserverDown	verifies if Linux VSERVER is shut down
  267 CheckVserverUp		verifies if Linux VSERVER is up and running
  268 VserverLoad		measures individual Linux VSERVER load * 100
  269 
  270 VZbeancounter		checks usage (percent) of OpenVZ/Virtuozzo beancounters
  271  
  272 
  273 #=========================================================
  274 # Import agent data *from* other servers
  275 #=========================================================
  276 
  277 A typical setup is to monitor multiple scattered servers from behind
  278 a (DSL) router/firewall.
  279 
  280 With this function you can establish one or multiple central servers
  281 by including the data from other MoSShE agents into the local one.
  282 Just be careful not to do circular inclusions or your logfile size
  283 might explode!
  284 
  285 ImportAgent	URL to the index.csv file, which can include
  286 		username and password as in
  287 		http://user:passwd@remote.server.test/mosshe/index.csv
  288 
  289 ImportAgentCurl - see above, but using curl instead of lynx
  290 
  291 ImportAgentWget - see above, but using wget instead of lynx
  292 
  293 ImportServerInfo  import the server info txt file for a server
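
Conceptually, a pull import fetches the remote index.csv and adds its lines
to the local results. The sketch below is not the ImportAgentCurl
implementation. It assumes that index.csv contains status lines with eight
semicolon-separated fields, as in the example output further down, and it
drops anything else (such as an HTML error page). The helper name
keep_status_lines is hypothetical.

```shell
# keep_status_lines: pass through only lines with at least the eight
# semicolon-separated fields of a status line.
keep_status_lines() {
    awk -F';' 'NF >= 8'
}

# Hypothetical usage with curl (URL taken from the example above):
#   curl -s --max-time 30 "http://user:passwd@remote.server.test/mosshe/index.csv" \
#       | keep_status_lines >> "$TEMPDIR/tmp.$$.collected.tmp"

# Self-contained demonstration with canned input:
printf '%s\n' \
    '2004-07-23;23:55:34;LanSrv;hermes;ping;OK;1;host up' \
    '<html>404 Not Found</html>' \
    | keep_status_lines
```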
  294 
  295 
  296 
  297 #=========================================================
  298 # Centralize data *to* other servers
  299 #=========================================================
  300 
  301 A typical setup is to monitor multiple customer servers without opening
  302 a TCP listener on them, to reduce the possible attack surface on those
  303 systems. Instead, have them send the information files to your own
  304 dedicated incoming monitoring system using battle-proven file transfer
  305 servers and methods: ftp-incoming, ssh/scp.
  306 
  307 Another is to monitor systems within a LAN without having to run additional
  308 network services (except maybe the network file system mounter).
  309 
  310 You can combine centralizing functions sequentially. You can set up an
  311 "internet monitoring" server in a DMZ, receiving monitoring data from
  312 customers' servers via FTP and SCP - and pulling other information off other
  313 hosting systems via ImportAgent. Using separate (password-protected)
  314 customer incoming monitoring directories, you can even offer split
  315 monitoring: you pull all your customers from the incoming server - and
  316 each customer can pull the already accumulated monitoring for their
  317 systems from that machine, too.
  318 
  319 You can mix and combine ad lib - just make damn sure not to create
  320 loops, otherwise your logs will explode.
  321 
  322 You need to set up a secured incoming server - I suggest ftp (incoming
  323 directory mechanism) or SCP with password-free keys (but make
  324 sure to disable shell access). On LANs you may already have a common 
  325 NAS (network drive) mounted, where you can directly use a dedicated
  326 monitoring drop-off directory. 
  327 
  328 Examples of drop-off script snippets to include into the MOSSHE script: 
  329 
  330 ### via file system mount
  331 cp "$WWWDIR/index.csv" /mnt/nfsmount/mosshe/zeus.example.com.csv
  332 
  333 ### via password-free ssh key
  334 scp "$WWWDIR/index.csv" mosshe@central.example.com:zeus.example.com.csv
  335 
  336 ### via ftp-upload
  337 ftp-upload --host central.example.com --user USER --password PASSWD \
  338 	--passive --no-ls --dir /incoming \
  339 	--as zeus.example.com.csv "$WWWDIR/index.csv"
  340 
  341 
  342 
  343 On the importing/monitoring server the ReapPassiveChecks script checks
  344 for the existence of the check file and its age. If the file is too
  345 old (given in minutes), something with the (passively) monitored
  346 system is probably wrong and an alert is raised. 
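
The existence-and-age test described above can be sketched as follows. This
is an illustration, not the ReapPassiveChecks code. The function name is
hypothetical, and the sketch relies on "find -mmin", which GNU, BSD and
Busybox find usually provide but POSIX does not require.

```shell
# passive_file_status FILE MAX_AGE_MINUTES
# Prints OK if FILE exists and was modified within the allowed time,
# ALERT if it is missing or older than MAX_AGE_MINUTES.
passive_file_status() {
    file=$1
    max_age=$2
    if [ ! -f "$file" ]; then
        echo "ALERT"            # file never arrived
    elif [ -n "$(find "$file" -mmin +"$max_age")" ]; then
        echo "ALERT"            # file is older than allowed
    else
        echo "OK"
    fi
}

passive_file_status /home/ftp/zeus.example.com.csv 10
```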
  347 
  348 # MYGROUP="Externals"
  349 #### reap from       servername,   max.age , file location
  350 # ReapPassiveChecks  zeus.example.com  10  /home/ftp/zeus.example.com.csv 
  351 # ReapPassiveChecks  hera.example.com  10  /home/ssh/hera.example.com.csv
  352 
  353 
  354 
  355 
  356 #=========================================================
  357 # Alerting, Logging
  358 #=========================================================
  359 
  360 
  361 LogTo			write full log to given filename
  362 LogToDaily		as above, but with date appended 
  363 LogToWeekly		as above, but with calendar week appended
  364 LogToMonthly		as above, but with month appended 
  365 
  366 SyslogOnChange 		log status changes to syslog 
  367 
  368 AlertMailAlways		send alert whenever a service IS down
  369 AlertMailOnChange	send alert mail only if something changed
  370 			(whenever a service GOES up or down)
  371 AlertMailOnChangeFor	as above, but with pattern matching e.g.
  372 			server or group name 
  373 			e.g. for alerting different admins  
  374 
  375 SLA_Eval		builds log extract for downtime documentation 
  376 
  377 CreateDataFiles		create data files with N maximum data sets 
  378 			for each host and check (see $WWWDIR/datalog/)
  379 
  380 PlotDataFiles		create data files with N maximum data sets
  381 			(no longer uses GNUplot)
  382 
  383 PlotAvgDataFiles	create data files with values averaged over X 
  384 			points with N maximum data sets
  385 			ONLY RUN AFTER PlotDataFiles
  386 			(no longer uses GNUplot)
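
The idea behind AlertMailOnChange - only report a check when its status
differs from the previous run - can be sketched like this. It is not the
code from MOSSHE. The function name and the two file arguments are
hypothetical, and the status is assumed to be the sixth field of each line,
as in the example output further down.

```shell
# changed_status PREVIOUS_FILE CURRENT_FILE
# Prints every line of CURRENT_FILE whose status (field 6) differs from the
# status recorded for the same group/system/check in PREVIOUS_FILE.
# Checks that are new in CURRENT_FILE are not reported.
changed_status() {
    awk -F';' '
        NR == FNR { prev[$3 FS $4 FS $5] = $6; next }
        ($3 FS $4 FS $5) in prev && prev[$3 FS $4 FS $5] != $6
    ' "$1" "$2"
}

# Demonstration with two small status files:
demo=$(mktemp -d)
printf '%s\n' '2004-07-23;23:50:32;LanSrv;kali;ping;OK;1;host up' \
              '2004-07-23;23:50:32;LanSrv;kali;load;OK;0.80;Load: 0.80' > "$demo/prev"
printf '%s\n' '2004-07-23;23:55:32;LanSrv;kali;ping;ALERT;0;host down' \
              '2004-07-23;23:55:32;LanSrv;kali;load;OK;0.75;Load: 0.75' > "$demo/cur"
changed_status "$demo/prev" "$demo/cur"    # prints only the ping line
rm -rf "$demo"
```

The output could then be piped into "mail" to notify an admin only when
something actually changed.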
  387 
  388 
  389 -----------------------------------------------------------------------
  390 Usage
  391 -----------------------------------------------------------------------
  392 
  393 Adapt the "mosshe" script. 
  394 
  395 Place the CRON.D_MOSSHE file into /etc/cron.d 
  396 or adapt it accordingly so mosshe is called periodically.
  397 
  398 Via the web interface you can view the overall status - full and
  399 abbreviated status. But you cannot modify anything - which makes it
  400 quite safe for even non-admin multiuser use...  
  401 ;-)
  402 
  403 
  404 
  405 Quick setup:
  406 ------------
  407 * make sure you have NMAP installed
  408 * change to the TOOLS directory.
  409 * run  ./create_mosshe.sh MYNETWORKFILE ipaddress/mask
  410 * adapt MYNETWORKFILE (especially setting the right mail addresses and 
  411   paths!) and rename it to ../mosshe
  412 * copy CRON.D_MOSSHE to /etc/cron.d/mosshe and reload CRON
  413 
  414 For example, running
  415 	./create_mosshe.sh ../mosshe 192.168.0.0/16
  416 will scan your local network (in this example: 192.168.0.0/16) and 
  417 create a basic monitoring configuration from the services found.
  418 
  419 
  420 -----------------------------------------------------------------------
  421 Known/common Problems and Maintenance
  422 -----------------------------------------------------------------------
  423 
  424 During the first run (usually including every reboot of the system)
  425 MoSShE will complain about nonexistent previous files that it tries to
  426 compare its status and values to. This is normal and expected. 
  427 
  428 Only if you consistently see errors popping up in the CRON logs/mails
  429 that do not clearly relate to actual system errors (e.g. network
  430 outages) does something need fixing.
  431 
  432 
  433 
  434 -----------------------------------------------------------------------
  435 Customizing Checks & Writing your own
  436 -----------------------------------------------------------------------
  437 
  438 Writing your own:
  439 
  440 A check must terminate within a given (short) timeframe regardless of
  441 circumstances - so make sure there are timeouts built in or configured.
  442 If not, your complete MoSSHe might hang when this check stalls.
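
One way to enforce such a time limit is the "timeout" command from GNU
coreutils, which exits with status 124 when it had to kill the command. The
wrapper below is a sketch under that assumption. The function name is
hypothetical, and mapping a timeout to UNDEF is a choice made for this
example, not something MoSShE prescribes.

```shell
# run_check_with_timeout SECONDS COMMAND [ARGS...]
# Prints OK if COMMAND succeeded in time, UNDEF if it had to be killed
# after SECONDS, and ALERT if it failed on its own.
run_check_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    case $? in
        0)   echo "OK" ;;
        124) echo "UNDEF" ;;
        *)   echo "ALERT" ;;
    esac
}

run_check_with_timeout 5 true        # finishes in time
run_check_with_timeout 1 sleep 10    # is killed after one second
```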
  443 
  444 Scripts (better: shell functions) must write a status line to
  445 	$TEMPDIR/tmp.$$.collected.tmp
  446 
  447 A check *must* give back the results in ONE LINE PER STATUS ONLY in
  448 the format: 
  449 date;time;groupname;systemname;check;status;numeric;long
  450 
  451 
  452 DATE	in ISO format: yyyy-mm-dd   
  453 	with yyyy = 4digit year, mm=2digit month, dd=2digit day
  454 	
  455 TIME	HH:MM:SS - 24hour time, all 2digit
  456 	this is the time local to MoSSHe server for all PING and service
  457 	checks, but local time of the server checked when using imported
  458 	checks
  459 
  460 GROUPNAME
  461 	Domain name or some group name for the system as configured in mosshe
  462 
  463 SYSTEMNAME
  464 	Host name or IP address of the system as configured in mosshe
  465 
  466 CHECK	(short) name of the check. 
  467 
  468 STATUS	any status of: OK, INFO, WARN, ALERT, UNDEF
  469 
  470 NUMERIC	the numeric value of the test, e.g. LOAD number, free megabytes, etc.
  471 	It must be a valid FLOAT or INT number to be displayed nicely.
  472 
  473 LONG	A longer text with details to the status. Should be short enough to
  474 	fit into one line of the web display for nicer display, though.
  475 
  476 
  477 Here is an example of the output of a number of checks - the first 6 checks
  478 after PING are all from a single LOCALCHECK script, btw.
  479 
  480 	2004-07-23;23:55:32;LanSrv;kali;ping;OK;1;host up
  481 	2004-07-23;23:55:32;LanSrv;kali;/dev/hda1;OK;4054;Disk free
  482 	2004-07-23;23:55:32;LanSrv;kali;/dev/hda2;OK;1395;Disk free
  483 	2004-07-23;23:55:32;LanSrv;kali;/dev/hdb3;OK;2817;Disk free
  484 	2004-07-23;23:55:32;LanSrv;kali;load;OK;0.80;Load: 0.80
  485 	2004-07-23;23:55:32;LanSrv;kali;processes;OK;76;Total processes: 76
  486 	2004-07-23;23:55:32;LanSrv;kali;zombies;OK;0;Zombie processes: 0 = ok
  487 	2004-07-23;23:55:34;LanSrv;hermes;ping;OK;1;host up
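
A custom check producing lines like the ones above could be sketched as
follows. Everything here is illustrative: write_status and UsersCheck are
hypothetical names, the thresholds are arbitrary, and the defaults for
TEMPDIR, MYGROUP and MYNAME are only set so that the sketch runs on its own.

```shell
# Assumed defaults; in MOSSHE these come from the environment settings.
TEMPDIR="${TEMPDIR:-/tmp}"
MYGROUP="${MYGROUP:-LanSrv}"
MYNAME="${MYNAME:-kali}"

# write_status CHECK STATUS NUMERIC LONG
# Appends one status line for this host to the collection file.
write_status() {
    printf '%s;%s;%s;%s;%s;%s;%s\n' \
        "$(date '+%Y-%m-%d;%H:%M:%S')" \
        "$MYGROUP" "$MYNAME" "$1" "$2" "$3" "$4" \
        >> "$TEMPDIR/tmp.$$.collected.tmp"
}

# Hypothetical check: logged-in users, WARN above 5, ALERT above 10.
UsersCheck() {
    users=$(who | wc -l | tr -d ' ')
    if   [ "$users" -gt 10 ]; then status="ALERT"
    elif [ "$users" -gt 5 ];  then status="WARN"
    else                           status="OK"
    fi
    write_status "users" "$status" "$users" "Logged-in users: $users"
}

UsersCheck
tail -n 1 "$TEMPDIR/tmp.$$.collected.tmp"    # show the line just written
```

The check only reads local data and returns immediately, so it needs no
extra timeout handling.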
  488 
  489 
  490 Please keep in mind that MoSSHe is designed to be lean, small, efficient.
  491 Thus having to install a JSP/EJB server only to install one singular check
  492 usually is not considered overly adequate. 
  493 
  494 Small, simple, secure - that's the way we should go.
  495 
  496 
  497 If you have a nice (free) check that could be of use to other people, please
  498 send it to me so I can include it into the distribution.
  499 
  500 
  501 -----------------------------------------------------------------------
  502 Shortcut: Distributable under  GPL
  503 -----------------------------------------------------------------------
  504 Copyright (C) 2003- Volker Tanger
  505 
  506 This program is free software; you can redistribute it and/or modify
  507 it under the terms of the GNU General Public License as published by
  508 the Free Software Foundation; either version 2 of the License, or (at
  509 your option) any later version.
  510 
  511 This program is distributed in the hope that it will be useful, but
  512 WITHOUT ANY WARRANTY; without even the implied warranty of
  513 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  514 General Public License for more details.
  515 
  516 You should have received a copy of the GNU General Public License
  517 along with this program; if not, write to the Free Software
  518 Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
  519 USA. or on their website http://www.gnu.org/copyleft/gpl.html
  520 
  521 -----------------------------------------------------------------------
  522