"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "doc/15-troubleshooting.md" between
icinga2-2.11.5.tar.gz and icinga2-2.12.0.tar.gz

About: Icinga 2 is an enterprise grade monitoring system which keeps watch over networks and any conceivable network resource.

15-troubleshooting.md  (icinga2-2.11.5):15-troubleshooting.md  (icinga2-2.12.0)
skipping to change at line 619 skipping to change at line 619
StartLimitInterval=10 StartLimitInterval=10
StartLimitBurst=3 StartLimitBurst=3
``` ```
Using the watchdog can also help with monitoring Icinga 2, to activate and use it add the following to the override: Using the watchdog can also help with monitoring Icinga 2, to activate and use it add the following to the override:
``` ```
WatchdogSec=30s WatchdogSec=30s
``` ```
This way systemd will kill Icinga 2 if does not notify for over 30 seconds, a timout of less than 10 seconds is not This way systemd will kill Icinga 2 if it does not notify for over 30 seconds. A timeout of less than 10 seconds is not
recommended. When the watchdog is activated, `Restart=` can be set to `watchdog` to restart Icinga 2 in the case of a recommended. When the watchdog is activated, `Restart=` can be set to `watchdog` to restart Icinga 2 in the case of a
watchdog timeout. watchdog timeout.
Run `systemctl daemon-reload && systemctl restart icinga2` to apply the changes. Run `systemctl daemon-reload && systemctl restart icinga2` to apply the changes.
Now systemd will always try to restart Icinga 2 (except if you run Now systemd will always try to restart Icinga 2 (except if you run
`systemctl stop icinga2`). After three failures in ten seconds it will stop `systemctl stop icinga2`). After three failures in ten seconds it will stop
trying because you probably have a problem that requires manual intervention. trying because you probably have a problem that requires manual intervention.
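The override settings discussed above can be collected into a single drop-in file. The following is a minimal sketch, not the documented procedure: `write_icinga2_override` is a hypothetical helper, the `[Service]` section placement and the `Restart=always` default are assumptions, and the target directory is normally the unit's drop-in directory (assumed here to be `/etc/systemd/system/icinga2.service.d`, which requires root).

```shell
#!/bin/sh
# Sketch: render a systemd drop-in containing the restart and watchdog settings.
# write_icinga2_override is a hypothetical helper, not part of Icinga 2.
# $1: target directory (assumed: /etc/systemd/system/icinga2.service.d)
# $2: value for Restart= (assumed default "always"; "watchdog" restarts
#     only after a watchdog timeout)
write_icinga2_override() {
    dir="$1"
    restart="${2:-always}"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<EOF
[Service]
Restart=$restart
StartLimitInterval=10
StartLimitBurst=3
WatchdogSec=30s
EOF
}

# Example: write into a scratch directory and inspect the result.
# write_icinga2_override /tmp/icinga2.service.d watchdog
# cat /tmp/icinga2.service.d/override.conf
```

After placing such a file in the real drop-in directory, `systemctl daemon-reload && systemctl restart icinga2` applies it, as described above.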
### Late Check Results <a id="late-check-results"></a> ### Late Check Results <a id="late-check-results"></a>
skipping to change at line 1028 skipping to change at line 1028
* Subject with the common name (CN) matches the client endpoint name and its FQDN. * Subject with the common name (CN) matches the client endpoint name and its FQDN.
* v3 extensions must set the basic constraint for `CA:TRUE` (ca.crt) or `CA:FALSE` (client certificate). * v3 extensions must set the basic constraint for `CA:TRUE` (ca.crt) or `CA:FALSE` (client certificate).
* Subject Alternative Name is set to the resolvable DNS name (required for REST API and browsers). * Subject Alternative Name is set to the resolvable DNS name (required for REST API and browsers).
Navigate into the local certificate store: Navigate into the local certificate store:
``` ```
$ cd /var/lib/icinga2/certs/ $ cd /var/lib/icinga2/certs/
``` ```
Print the CA certificate: Make sure to verify the agents' certificate and its stored `ca.crt` in `/var/lib/icinga2/certs` and ensure that
all instances (master, satellite, agent) are signed by the **same CA**.
Compare the `ca.crt` file from the agent node and compare it to your master's `ca.crt` file.
Since 2.12, you can use the built-in CLI command `pki verify` to perform TLS certificate validation tasks.
> **Hint**
>
> The CLI command uses exit codes aligned to the [Plugin API specification](05-service-monitoring.md#service-monitoring-plugin-api).
> Run the commands followed with `echo $?` to see the exit code.
These CLI commands can be used on Windows agents too without requiring the OpenSSL binary.
#### Print TLS Certificate <a id="troubleshooting-certificate-verification-print"></a>
Pass the certificate file to the `--cert` CLI command parameter to print its details.
This prints a shorter version of `openssl x509 -in <file> -text`.
``` ```
$ openssl x509 -in ca.crt -text $ icinga2 pki verify --cert icinga2-agent2.localdomain.crt
Certificate: information/cli: Printing certificate 'icinga2-agent2.localdomain.crt'
Data:
Version: 3 (0x2) Version: 3
Serial Number: 1 (0x1) Subject: CN = icinga2-agent2.localdomain
Signature Algorithm: sha256WithRSAEncryption Issuer: CN = Icinga CA
Issuer: CN=Icinga CA Valid From: Feb 14 11:29:36 2020 GMT
Validity Valid Until: Feb 10 11:29:36 2035 GMT
Not Before: Feb 23 14:45:32 2016 GMT Serial: 12:fe:a6:22:f5:e3:db:a2:95:8e:92:b2:af:1a:e3:01:44:c4:70:e
Not After : Feb 19 14:45:32 2031 GMT 0
Subject: CN=Icinga CA
Subject Public Key Info: Signature Algorithm: sha256WithRSAEncryption
Public Key Algorithm: rsaEncryption Subject Alt Names: icinga2-agent2.localdomain
Public-Key: (4096 bit) Fingerprint: 40 98 A0 77 58 4F CA D1 05 AC 18 53 D7 52 8D D7 9C 7F 5A 2
Modulus: 3 B4 AF 63 A4 92 9D DC FF 89 EF F1 4C
...
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Basic Constraints: critical
CA:TRUE
Signature Algorithm: sha256WithRSAEncryption
...
``` ```
Print the client public certificate: You can also print the `ca.crt` certificate without any further checks using the `--cert` parameter.
#### Print and Verify CA Certificate <a id="troubleshooting-certificate-verification-print-verify-ca"></a>
The `--cacert` CLI parameter allows to check whether the given certificate file is a public CA certificate.
``` ```
$ openssl x509 -in icinga2-agent1.localdomain.crt -text $ icinga2 pki verify --cacert ca.crt
Certificate: information/cli: Checking whether certificate 'ca.crt' is a valid CA certificate
Data: .
Version: 3 (0x2)
Serial Number: Version: 3
86:47:44:65:49:c6:65:6b:5e:6d:4f:a5:fe:6c:76:05:0b:1a:cf:34 Subject: CN = Icinga CA
Signature Algorithm: sha256WithRSAEncryption Issuer: CN = Icinga CA
Issuer: CN=Icinga CA Valid From: Jul 31 12:26:08 2019 GMT
Validity Valid Until: Jul 27 12:26:08 2034 GMT
Not Before: Aug 20 16:20:05 2016 GMT Serial: 89:fe:d6:12:66:25:3a:c5:07:c1:eb:d4:e6:f2:df:ca:13:6e:dc:e
Not After : Aug 17 16:20:05 2031 GMT 7
Subject: CN=icinga2-agent1.localdomain
Subject Public Key Info: Signature Algorithm: sha256WithRSAEncryption
Public Key Algorithm: rsaEncryption Subject Alt Names:
Public-Key: (4096 bit) Fingerprint: 9A 11 29 A8 A3 89 F8 56 30 1A E4 0A B2 6B 28 46 07 F0 14 1
Modulus: 7 BD 19 A4 FC BD 41 40 B5 1A 8F BF 20
...
Exponent: 65537 (0x10001) information/cli: OK: CA certificate file 'ca.crt' was verified successfully.
X509v3 extensions: ```
X509v3 Basic Constraints: critical
CA:FALSE In case you pass a wrong certificate, an error is shown and the exit code is `2`
X509v3 Subject Alternative Name: (Critical).
DNS:icinga2-agent1.localdomain
Signature Algorithm: sha256WithRSAEncryption ```
... $ icinga2 pki verify --cacert icinga2-agent2.localdomain.crt
information/cli: Checking whether certificate 'icinga2-agent2.localdomain.crt' is a valid CA certificate.
Version: 3
Subject: CN = icinga2-agent2.localdomain
Issuer: CN = Icinga CA
Valid From: Feb 14 11:29:36 2020 GMT
Valid Until: Feb 10 11:29:36 2035 GMT
Serial: 12:fe:a6:22:f5:e3:db:a2:95:8e:92:b2:af:1a:e3:01:44:c4:70:e0
Signature Algorithm: sha256WithRSAEncryption
Subject Alt Names: icinga2-agent2.localdomain
Fingerprint: 40 98 A0 77 58 4F CA D1 05 AC 18 53 D7 52 8D D7 9C 7F 5A 23 B4 AF 63 A4 92 9D DC FF 89 EF F1 4C
critical/cli: CRITICAL: The file 'icinga2-agent2.localdomain.crt' does not seem to be a CA certificate file.
``` ```
Make sure to verify the client's certificate and its received `ca.crt` in `/var/lib/icinga2/certs` and ensure that both instances are signed by the **same CA**.
#### Verify Certificate is signed by CA Certificate <a id="troubleshooting-certificate-verification-signed-by-ca"></a>
Pass the certificate file to the `--cert` CLI parameter, and the `ca.crt` file to the `--cacert` parameter.
Common troubleshooting scenarios involve self-signed certificates and untrusted agents resulting in disconnects.
``` ```
$ openssl verify -verbose -CAfile /var/lib/icinga2/certs/ca.crt /var/lib/icinga2 $ icinga2 pki verify --cert icinga2-agent2.localdomain.crt --cacert ca.crt
/certs/icinga2-master1.localdomain.crt
information/cli: Verifying certificate 'icinga2-agent2.localdomain.crt'
Version: 3
Subject: CN = icinga2-agent2.localdomain
Issuer: CN = Icinga CA
Valid From: Feb 14 11:29:36 2020 GMT
Valid Until: Feb 10 11:29:36 2035 GMT
Serial: 12:fe:a6:22:f5:e3:db:a2:95:8e:92:b2:af:1a:e3:01:44:c4:70:e0
icinga2-master1.localdomain.crt: OK Signature Algorithm: sha256WithRSAEncryption
Subject Alt Names: icinga2-agent2.localdomain
Fingerprint: 40 98 A0 77 58 4F CA D1 05 AC 18 53 D7 52 8D D7 9C 7F 5A 23 B4 AF 63 A4 92 9D DC FF 89 EF F1 4C
information/cli: with CA certificate 'ca.crt'.
Version: 3
Subject: CN = Icinga CA
Issuer: CN = Icinga CA
Valid From: Jul 31 12:26:08 2019 GMT
Valid Until: Jul 27 12:26:08 2034 GMT
Serial: 89:fe:d6:12:66:25:3a:c5:07:c1:eb:d4:e6:f2:df:ca:13:6e:dc:e7
Signature Algorithm: sha256WithRSAEncryption
Subject Alt Names:
Fingerprint: 9A 11 29 A8 A3 89 F8 56 30 1A E4 0A B2 6B 28 46 07 F0 14 17 BD 19 A4 FC BD 41 40 B5 1A 8F BF 20
information/cli: OK: Certificate with CN 'icinga2-agent2.localdomain' is signed by CA.
``` ```
#### Verify Certificate matches Common Name (CN) <a id="troubleshooting-certificate-verification-common-name-match"></a>
This allows to verify the common name inside the certificate with a given string parameter.
Typical troubleshooting involve upper/lower case CNs (Windows).
``` ```
$ openssl verify -verbose -CAfile /var/lib/icinga2/certs/ca.crt /var/lib/icinga2/certs/icinga2-agent1.localdomain.crt $ icinga2 pki verify --cert icinga2-agent2.localdomain.crt --cn icinga2-agent2.localdomain
icinga2-agent1.localdomain.crt: OK information/cli: Verifying common name (CN) 'icinga2-agent2.localdomain in certificate 'icinga2-agent2.localdomain.crt'.
Version: 3
Subject: CN = icinga2-agent2.localdomain
Issuer: CN = Icinga CA
Valid From: Feb 14 11:29:36 2020 GMT
Valid Until: Feb 10 11:29:36 2035 GMT
Serial: 12:fe:a6:22:f5:e3:db:a2:95:8e:92:b2:af:1a:e3:01:44:c4:70:e0
Signature Algorithm: sha256WithRSAEncryption
Subject Alt Names: icinga2-agent2.localdomain
Fingerprint: 40 98 A0 77 58 4F CA D1 05 AC 18 53 D7 52 8D D7 9C 7F 5A 23 B4 AF 63 A4 92 9D DC FF 89 EF F1 4C
information/cli: OK: CN 'icinga2-agent2.localdomain' matches certificate CN 'icinga2-agent2.localdomain'.
``` ```
Fetch the `ca.crt` file from the client node and compare it to your master's `ca.crt` file: In the example below, the certificate uses an upper case CN.
``` ```
$ scp icinga2-agent1:/var/lib/icinga2/certs/ca.crt test-client-ca.crt $ icinga2 pki verify --cert icinga2-agent2.localdomain.crt --cn icinga2-agent2.localdomain
$ diff -ur /var/lib/icinga2/certs/ca.crt test-client-ca.crt
information/cli: Verifying common name (CN) 'icinga2-agent2.localdomain in certificate 'icinga2-agent2.localdomain.crt'.
Version: 3
Subject: CN = ICINGA2-agent2.localdomain
Issuer: CN = Icinga CA
Valid From: Feb 14 11:29:36 2020 GMT
Valid Until: Feb 10 11:29:36 2035 GMT
Serial: 12:fe:a6:22:f5:e3:db:a2:95:8e:92:b2:af:1a:e3:01:44:c4:70:e0
Signature Algorithm: sha256WithRSAEncryption
Subject Alt Names: ICINGA2-agent2.localdomain
Fingerprint: 40 98 A0 77 58 4F CA D1 05 AC 18 53 D7 52 8D D7 9C 7F 5A 23 B4 AF 63 A4 92 9D DC FF 89 EF F1 4C
critical/cli: CRITICAL: CN 'icinga2-agent2.localdomain' does NOT match certificate CN 'icinga2-agent2.localdomain'.
``` ```
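When a CN check fails, it can help to tell a pure upper/lower case difference apart from a genuinely different name. The following is a minimal sketch under stated assumptions: `cn_compare` is a hypothetical helper, and the actual CN is assumed to be copied by hand from the `Subject:` line of the `pki verify` output shown above.

```shell
#!/bin/sh
# Sketch: classify how an expected CN relates to the CN found in a certificate.
# cn_compare is a hypothetical helper, not part of the icinga2 CLI.
# $1: expected CN (for example the endpoint name)
# $2: actual CN (copied from the certificate's Subject line)
cn_compare() {
    expected="$1"
    actual="$2"
    if [ "$expected" = "$actual" ]; then
        echo "match"
    elif [ "$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')" = \
           "$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')" ]; then
        # The names differ only in letter case.
        echo "case-mismatch"
    else
        echo "mismatch"
    fi
}

# Example with the names from the output above:
# cn_compare icinga2-agent2.localdomain ICINGA2-agent2.localdomain
# prints "case-mismatch"
```

A `case-mismatch` result points at the upper/lower case scenario mentioned for Windows agents, while `mismatch` suggests the wrong certificate file or endpoint name.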
### Certificate Signing <a id="troubleshooting-certificate-signing"></a> ### Certificate Signing <a id="troubleshooting-certificate-signing"></a>
Icinga offers two methods: Icinga offers two methods:
* [CSR Auto-Signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing) which uses a client (an agent or a satellite) ticket generated on the master as trust identifier. * [CSR Auto-Signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing) which uses a client (an agent or a satellite) ticket generated on the master as trust identifier.
* [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing) which allows to sign pending certificate requests on the master. * [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing) which allows to sign pending certificate requests on the master.
Whenever a signed certificate is not received on the requesting clients, ensure to check the following: Whenever a signed certificate is not received on the requesting clients, ensure to check the following:
skipping to change at line 1498 skipping to change at line 1581
[2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//.checksums' from config sync staging to production zones directory. [2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//.checksums' from config sync staging to production zones directory.
[2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//.timestamp' from config sync staging to production zones directory. [2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//.timestamp' from config sync staging to production zones directory.
[2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//director/001-director-basics.conf' from config sync staging to production zones directory. [2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//director/001-director-basics.conf' from config sync staging to production zones directory.
[2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//director/host_templates.conf' from config sync staging to production zones directory. [2019-08-01 09:20:26 +0200] information/ApiListener: Copying file 'director-global//director/host_templates.conf' from config sync staging to production zones directory.
... ...
[2019-08-01 09:20:26 +0200] notice/Application: Got reload command, forwarding to umbrella process (PID 4236) [2019-08-01 09:20:26 +0200] notice/Application: Got reload command, forwarding to umbrella process (PID 4236)
``` ```
In case the received configuration updates are equal to what is running in production, a different message is logged and the validation/reload is skipped.
```
[2020-02-05 15:18:19 +0200] information/ApiListener: Received configuration updates (4) from endpoint 'icinga2-master1.localdomain' are equal to production, skipping validation and reload.
```
#### Syncing Binary Files is Denied <a id="troubleshooting-cluster-config-sync-binary-denied"></a> #### Syncing Binary Files is Denied <a id="troubleshooting-cluster-config-sync-binary-denied"></a>
The config sync is built for syncing text configuration files, wrapped into JSON-RPC messages. The config sync is built for syncing text configuration files, wrapped into JSON-RPC messages.
Some users have started to use this as binary file sync instead of using tools built for this: Some users have started to use this as binary file sync instead of using tools built for this:
rsync, git, Puppet, Ansible, etc. rsync, git, Puppet, Ansible, etc.
Starting with 2.11, this attempt is now prohibited and logged. Starting with 2.11, this attempt is now prohibited and logged.
``` ```
[2019-08-02 16:03:19 +0200] critical/ApiListener: Ignoring file '/etc/icinga2/zones.d/global-templates/forbidden.exe' for cluster config sync: Does not contain valid UTF8. Binary files are not supported. [2019-08-02 16:03:19 +0200] critical/ApiListener: Ignoring file '/etc/icinga2/zones.d/global-templates/forbidden.exe' for cluster config sync: Does not contain valid UTF8. Binary files are not supported.
skipping to change at line 1530 skipping to change at line 1619
outside in `/etc/icinga2/zones.conf`. outside in `/etc/icinga2/zones.conf`.
If you for example create a "Zone Inception" with defining the If you for example create a "Zone Inception" with defining the
`satellite` zone in `zones.d/master`, the config compiler does not `satellite` zone in `zones.d/master`, the config compiler does not
re-run and include this zone config recursively from `zones.d/satellite`. re-run and include this zone config recursively from `zones.d/satellite`.
Since v2.11, the config compiler is only including directories where a Since v2.11, the config compiler is only including directories where a
zone has been configured. Otherwise it would include renamed old zones, zone has been configured. Otherwise it would include renamed old zones,
broken zones, etc. and those long-lasting bugs have been now fixed. broken zones, etc. and those long-lasting bugs have been now fixed.
A more concrete example: Masters and Satellites still need to know the Zone hierarchy outside of zones.d synced configuration. A more concrete example: Masters and Satellites still need to know the Zone hierarchy outside of `zones.d` synced configuration.
**Doesn't work** **Doesn't work**
``` ```
vim /etc/icinga2/zones.conf vim /etc/icinga2/zones.conf
object Zone "master" { object Zone "master" {
endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ] endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
} }
``` ```
skipping to change at line 1606 skipping to change at line 1695
The thing you can do: For `command_endpoint` agents like inside the Director: The thing you can do: For `command_endpoint` agents like inside the Director:
Host -> Agent -> yes, there is no config sync for this zone in place. Therefore Host -> Agent -> yes, there is no config sync for this zone in place. Therefore
it is valid to just sync their zones via the config sync. it is valid to just sync their zones via the config sync.
#### Director Changes #### Director Changes
The following restores the Zone/Endpoint objects as config objects outside of `zones.d` The following restores the Zone/Endpoint objects as config objects outside of `zones.d`
in your master/satellite's zones.conf with rendering them as external objects in the Director. in your master/satellite's zones.conf with rendering them as external objects in the Director.
[Example](06-distributed-monitoring.md#three-levels-with-masters-satellites-and-agents) [Example](06-distributed-monitoring.md#distributed-monitoring-scenarios-master-satellite-agents)
for a 3 level setup with the masters and satellites knowing about the zone hierarchy for a 3 level setup with the masters and satellites knowing about the zone hierarchy
outside defined in [zones.conf](#zones-conf): outside defined in [zones.conf](04-configuration.md#zones-conf):
``` ```
object Endpoint "icinga-master1.localdomain" { object Endpoint "icinga-master1.localdomain" {
//define 'host' attribute to control the connection direction on each instance //define 'host' attribute to control the connection direction on each instance
} }
object Endpoint "icinga-master2.localdomain" { object Endpoint "icinga-master2.localdomain" {
//... //...
} }
skipping to change at line 1780 skipping to change at line 1869
``` ```
critical/TcpSocket: Invalid socket: 10055, "An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full." critical/TcpSocket: Invalid socket: 10055, "An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full."
``` ```
Windows is blocking Icinga 2 and as such, no more TCP connection handling is possible. Windows is blocking Icinga 2 and as such, no more TCP connection handling is possible.
Depending on the version, patch level and installed applications, Windows is changing its Depending on the version, patch level and installed applications, Windows is changing its
range of [ephemeral ports](https://en.wikipedia.org/wiki/Ephemeral_port#Range). range of [ephemeral ports](https://en.wikipedia.org/wiki/Ephemeral_port#Range).
In order to solve this, raise the the `MaxUserPort` value in the registry. In order to solve this, raise the `MaxUserPort` value in the registry.
``` ```
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
Value Name: MaxUserPort Value Value Name: MaxUserPort Value
Type: DWORD Type: DWORD
Value data: 65534 Value data: 65534
``` ```
More details in [this blogpost](https://www.netways.de/blog/2019/01/24/windows-blocking-icinga-2-with-ephemeral-port-range/) More details in [this blogpost](https://www.netways.de/blog/2019/01/24/windows-blocking-icinga-2-with-ephemeral-port-range/)
 End of changes. 20 change blocks. 
65 lines changed or deleted 194 lines changed or added
