icinga

Icinga module.

exception spicerack.icinga.IcingaCheckError[source]

Bases: SpicerackCheckError

Custom exception class for check errors of this module.

exception spicerack.icinga.IcingaError[source]

Bases: SpicerackError

Custom exception class for errors of this module.

exception spicerack.icinga.IcingaStatusNotFoundError(hostnames: collections.abc.Sequence[str])[source]

Bases: IcingaError

Custom exception class for a host missing from the Icinga status.

Initializes an IcingaStatusNotFoundError instance.

Parameters:

hostnames (collections.abc.Sequence[str]) -- the hostnames not found in the Icinga status.

exception spicerack.icinga.IcingaStatusParseError[source]

Bases: IcingaError

Custom exception class for errors while parsing the Icinga status.

class spicerack.icinga.CommandFile(icinga_host: RemoteHosts, *, config_file: str = '/etc/icinga/icinga.cfg')[source]

Bases: str

String class to represent an Icinga command file path with cache capabilities.

Get the Icinga host command file path where to write the commands and cache it.

Parameters:
  • icinga_host -- the Icinga host instance.

  • config_file -- the Icinga configuration file to check for the command file directive.

Raises:

spicerack.icinga.IcingaError -- if unable to get the command file path.

class spicerack.icinga.HostStatus(*, name: str, state: str, optimal: bool, downtimed: bool, notifications_enabled: bool, failed_services: collections.abc.Sequence[collections.abc.Mapping] | None = None, services: collections.abc.Sequence[collections.abc.Mapping] | None = None)[source]

Bases: object

Represent the status of all Icinga checks for a single host.

Initialize the instance.

Either services or failed_services may be present, depending on the flags passed to icinga-status.

Parameters:
STATE_UP: str = 'UP'

The Icinga value for a host that is up and running. The other values for the Icinga host state are DOWN and UNREACHABLE.

property acked_services: list[str]

Return a list of services which have failed, but are acknowledged in Icinga.

property failed_services: list[str]

Return the list of service names that are failing.

class spicerack.icinga.HostsStatus[source]

Bases: dict

Represent the status of all Icinga checks for a set of hosts.

property acked_services: dict[str, list[str]]

Return, for each host, the list of services that have failed but are acknowledged in Icinga.

property failed_hosts: list[str]

Return the list of hostnames that are not up and running. They can either be down or unreachable.

property failed_services: dict[str, list[str]]

Return the list of service names that are failing for each host that has at least one failing service.

property non_optimal_hosts: list[str]

Return the list of hostnames that are not in an optimal state.

They are either not up and running or have at least one failed service.

property optimal: bool

Returns True if all the hosts are optimal, False otherwise.
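The aggregated properties above can be illustrated with a minimal sketch. It does not import spicerack: FakeHostStatus and FakeHostsStatus are hypothetical stand-ins, and deriving optimal from the host state and failed services is an assumption made for this example (the real HostStatus receives optimal as a constructor argument).

```python
from dataclasses import dataclass, field


@dataclass
class FakeHostStatus:
    """Hypothetical stand-in for HostStatus, holding only what the sketch needs."""

    name: str
    state: str
    failed_services: list[str] = field(default_factory=list)

    @property
    def optimal(self) -> bool:
        # Assumption: a host is optimal when it is UP and has no failed services.
        return self.state == "UP" and not self.failed_services


class FakeHostsStatus(dict):
    """Hypothetical stand-in for HostsStatus: maps hostname to FakeHostStatus."""

    @property
    def failed_hosts(self) -> list[str]:
        # Hosts that are DOWN or UNREACHABLE, i.e. anything but UP.
        return [name for name, status in self.items() if status.state != "UP"]

    @property
    def failed_services(self) -> dict[str, list[str]]:
        # Only hosts with at least one failing service appear in the result.
        return {name: s.failed_services for name, s in self.items() if s.failed_services}

    @property
    def non_optimal_hosts(self) -> list[str]:
        return [name for name, status in self.items() if not status.optimal]

    @property
    def optimal(self) -> bool:
        return all(status.optimal for status in self.values())


statuses = FakeHostsStatus(
    web1=FakeHostStatus("web1", "UP"),
    web2=FakeHostStatus("web2", "UP", ["disk space"]),
    web3=FakeHostStatus("web3", "DOWN"),
)
print(statuses.failed_hosts)
print(statuses.failed_services)
print(statuses.non_optimal_hosts)
print(statuses.optimal)
```

Note that a host can be non-optimal without being a failed host: web2 is UP but has a failing service.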

class spicerack.icinga.IcingaHosts(icinga_host: spicerack.remote.RemoteHosts, target_hosts: spicerack.typing.TypeHosts, *, verbatim_hosts: bool = False, dry_run: bool = True) -> None[source]

Bases: object

Class to manage the Icinga checks of a given set of hosts.

Initialize the instance.

Parameters:
downtime(reason: spicerack.administrative.Reason, *, duration: datetime.timedelta = datetime.timedelta(seconds=14400)) -> None[source]

Downtime hosts on the Icinga server for the given time with a message.

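Parameters:
  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on the Icinga server.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.
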
Return type:

None

downtime_services(service_re: str, reason: spicerack.administrative.Reason, *, duration: datetime.timedelta = datetime.timedelta(seconds=14400)) -> None[source]

Downtime services on the Icinga server for the given time with a message.

If there are multiple target_hosts, the set of matching services may vary from host to host (e.g. because a hostname, DB section, or other unique fact is included in the service name) and downtime_services will downtime each service on the correct target_host. If some hosts happen to have no matching services, they will be safely skipped. But if no hosts have matching services, IcingaError is raised (because the regex is probably wrong).

Parameters:
  • service_re (str) -- the regular expression matching service names to downtime.

  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on the Icinga server.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

Raises:
  • re.error -- if service_re is an invalid regular expression.

  • spicerack.icinga.IcingaError -- if no services on any target host match the regular expression.

Return type:

None
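The per-host matching rule described for downtime_services can be sketched as follows. This is not spicerack's implementation: match_services and SketchIcingaError are hypothetical names, and the use of re.search (rather than re.match or re.fullmatch) is an assumption.

```python
import re


class SketchIcingaError(Exception):
    """Hypothetical stand-in for spicerack.icinga.IcingaError."""


def match_services(services_by_host: dict[str, list[str]], service_re: str) -> dict[str, list[str]]:
    """Return the matching service names per host, skipping hosts without matches."""
    pattern = re.compile(service_re)  # raises re.error on an invalid expression
    matched: dict[str, list[str]] = {}
    for host, services in services_by_host.items():
        names = [name for name in services if pattern.search(name)]
        if names:  # hosts with no matching services are skipped
            matched[host] = names
    if not matched:
        # No host had a match at all: the regex is probably wrong.
        raise SketchIcingaError(f"No services matched {service_re!r} on any host")
    return matched


services = {
    "db1": ["MariaDB replica lag s1", "SSH"],
    "db2": ["MariaDB replica lag s2"],
    "web1": ["SSH"],
}
print(match_services(services, r"replica lag"))
```

Here web1 is skipped because it has no matching service, while db1 and db2 each get their own, differently named, service selected.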

downtimed(reason: spicerack.administrative.Reason, *, duration: datetime.timedelta = datetime.timedelta(seconds=14400), remove_on_error: bool = False) -> collections.abc.Iterator[None][source]

Context manager to perform actions while the hosts are downtimed on Icinga.

Parameters:
  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on the Icinga server.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

  • remove_on_error (bool, default: False) -- should the downtime be removed even if an exception was raised.

Yields:

None -- it just yields control to the caller once Icinga has been downtimed and removes the downtime once control returns.

Return type:

collections.abc.Iterator[None]
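The effect of remove_on_error can be sketched with a small context manager. FakeIcinga and this downtimed function are hypothetical stand-ins that only record calls; the sketch assumes the downtime is always removed on a clean exit and, when an exception is raised, only if remove_on_error is True.

```python
from collections.abc import Iterator
from contextlib import contextmanager


class FakeIcinga:
    """Hypothetical stand-in that records the calls it receives."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def downtime(self) -> None:
        self.calls.append("downtime")

    def remove_downtime(self) -> None:
        self.calls.append("remove_downtime")


@contextmanager
def downtimed(icinga: FakeIcinga, *, remove_on_error: bool = False) -> Iterator[None]:
    icinga.downtime()
    try:
        yield  # the caller's block runs here, while the hosts are downtimed
    except BaseException:
        # On failure the downtime is kept unless the caller asked to remove it.
        if remove_on_error:
            icinga.remove_downtime()
        raise
    else:
        icinga.remove_downtime()


icinga = FakeIcinga()
with downtimed(icinga):
    pass  # perform the maintenance work here
print(icinga.calls)
```

Keeping the downtime after a failure avoids alerts firing for hosts that were left in a broken state; passing remove_on_error=True restores alerting immediately instead.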

get_status(service_re: str = '') -> spicerack.icinga.HostsStatus[source]

Get the current status of the given hosts from Icinga.

Parameters:

service_re (str, default: '') -- if non-empty, the regular expression matching service names

Raises:
Return type:

spicerack.icinga.HostsStatus

recheck_all_services() -> None[source]

Force recheck of all services associated with a set of hosts.

Return type:

None

recheck_failed_services() -> None[source]

Force recheck of all failed services associated with a set of hosts.

Return type:

None

remove_downtime() -> None[source]

Remove a downtime from a set of hosts.

Return type:

None

remove_service_downtimes(service_re: str) -> None[source]

Remove downtimes for services from a set of hosts.

If there are multiple target_hosts, this method has the same behavior as downtime_services. If any matching service is not downtimed, it's silently skipped. (If one or more services exist matching the regex, but none of them is downtimed, this method does nothing.)

Parameters:

service_re (str) -- the regular expression matching service names to un-downtime.

Raises:
  • re.error -- if service_re is an invalid regular expression.

  • spicerack.icinga.IcingaError -- if no services on any target host match the regular expression.

Return type:

None

run_icinga_command(command: str, *args: str) -> None[source]

Execute an Icinga command on the Icinga server for all the current hosts.

This lower level API is meant to be used when the higher level API exposed in this class does not cover a given use case. The arguments passed to the underlying Icinga command will be the hostname plus all the arguments passed to this method. Hence it can be used only with Icinga commands that require a hostname. See the link below for more details on the available Icinga commands and their arguments.

Parameters:
  • command (str) -- the Icinga command to execute.

  • *args (str) -- optional positional arguments to pass to the command.

Return type:

None
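How the hostname is combined with the extra arguments can be sketched as below. build_commands is a hypothetical helper, and the line layout, "[timestamp] COMMAND;arg;arg", is the external command format of Nagios-compatible servers; whether spicerack composes its lines exactly this way is an assumption.

```python
def build_commands(hostnames: list[str], command: str, *args: str, timestamp: int) -> list[str]:
    """Build one external command line per host: the hostname comes first, then args."""
    return [
        f"[{timestamp}] " + ";".join([command, hostname, *args])
        for hostname in hostnames
    ]


lines = build_commands(
    ["web1", "web2"],
    "SCHEDULE_FORCED_HOST_CHECK",
    "1700000000",  # check time, passed through as an extra argument
    timestamp=1700000000,
)
for line in lines:
    print(line)
```

Because the hostname is always inserted as the first argument, commands that do not take a hostname as their first parameter cannot be expressed this way.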

services_downtimed(service_re: str, reason: spicerack.administrative.Reason, *, duration: datetime.timedelta = datetime.timedelta(seconds=14400), remove_on_error: bool = False) -> collections.abc.Iterator[None][source]

Context manager to perform actions while services are downtimed on Icinga.

Parameters:
  • service_re (str) -- the regular expression matching service names to downtime.

  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on the Icinga server.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

  • remove_on_error (bool, default: False) -- should the downtime be removed even if an exception was raised.

Yields:

None -- it just yields control to the caller once Icinga has been downtimed and removes the downtime once control returns.

Return type:

collections.abc.Iterator[None]

wait_for_downtimed() -> None[source]

Poll the Icinga status to verify that the hosts got effectively downtimed.

Raises:

spicerack.icinga.IcingaError -- if unable to verify that all hosts got downtimed.

Return type:

None

wait_for_optimal(*, skip_acked: bool = False) -> None[source]

Waits for an Icinga optimal status, else raises an exception.

This method first instructs Icinga to recheck all failed services and then waits until all services are in an optimal status. If an optimal status is not reached within 6 minutes, IcingaError is raised.

Parameters:

skip_acked (bool, default: False) -- ignore any acknowledged alerts when determining if a device is in optimal state.

Raises:

IcingaError -- if the status is not optimal.

Return type:

None
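The recheck-then-poll behavior can be sketched as a loop with a deadline. This is an illustration rather than the library's code: the callables, SketchIcingaError, and the 10-second polling interval are all assumptions; only the 6-minute limit (360 seconds) comes from the description above.

```python
import time
from collections.abc import Callable


class SketchIcingaError(Exception):
    """Hypothetical stand-in for spicerack.icinga.IcingaError."""


def wait_for_optimal(
    recheck_failed: Callable[[], None],
    is_optimal: Callable[[], bool],
    *,
    timeout: float = 360.0,  # 6 minutes
    interval: float = 10.0,  # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Trigger one recheck, then poll until optimal or until the timeout expires."""
    recheck_failed()
    deadline = clock() + timeout
    while True:
        if is_optimal():
            return
        if clock() >= deadline:
            raise SketchIcingaError("Icinga status is not optimal")
        sleep(interval)


# Drive the loop with a fake clock so the example finishes instantly.
now = [0.0]
answers = iter([False, False, True])
rechecks: list[str] = []
wait_for_optimal(
    lambda: rechecks.append("recheck"),
    lambda: next(answers),
    clock=lambda: now[0],
    sleep=lambda seconds: now.__setitem__(0, now[0] + seconds),
)
print(rechecks, now[0])
```

Injecting clock and sleep keeps the sketch testable without waiting in real time.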

class spicerack.icinga.IcingaStatus(value)[source]

Bases: Enum

Enum class to represent an Icinga status.

CRITICAL: int = 2

Critical status.

OK: int = 0

Ok status.

UNKNOWN: int = 3

Unknown status.

WARNING: int = 1

Warning status.

class spicerack.icinga.ServiceStatus(dict=None, /, **kwargs)[source]

Bases: UserDict

Represent the current status of a service.

property acked: bool

Returns True if the service is acknowledged, False otherwise.

property failed: bool

Check if the service is not in optimal state.

Returns:

True if the service is not in an IcingaStatus.OK status, False otherwise.
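The relationship between the status enum and the failed property can be sketched as follows. SketchStatus and SketchServiceStatus are hypothetical stand-ins, and the "state" key used to store the numeric status is an assumption about the underlying data, not the documented layout.

```python
from collections import UserDict
from enum import Enum


class SketchStatus(Enum):
    """Hypothetical mirror of IcingaStatus, using the values documented above."""

    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


class SketchServiceStatus(UserDict):
    """Hypothetical stand-in for ServiceStatus backed by a plain mapping."""

    @property
    def failed(self) -> bool:
        # Anything other than OK (WARNING, CRITICAL, UNKNOWN) counts as failed.
        return SketchStatus(self["state"]) is not SketchStatus.OK


ssh = SketchServiceStatus(name="SSH", state=0)
disk = SketchServiceStatus(name="disk space", state=2)
print(ssh.failed, disk.failed)
```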

spicerack.icinga.ICINGA_DOMAIN: str = 'icinga.wikimedia.org'

The Icinga website FQDN.

spicerack.icinga.MIN_DOWNTIME_SECONDS: int = 60

Minimum time in seconds the downtime can be set to.
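As an illustration of how this minimum relates to the default duration used by the downtime methods (datetime.timedelta(seconds=14400), i.e. 4 hours), here is a minimal sketch. downtime_seconds is a hypothetical helper, and raising an error for too-short durations is an assumption about how the minimum is enforced.

```python
from datetime import timedelta

MIN_DOWNTIME_SECONDS = 60  # same value as the module constant documented above
DEFAULT_DURATION = timedelta(seconds=14400)  # default of the downtime methods


def downtime_seconds(duration: timedelta = DEFAULT_DURATION) -> int:
    """Convert a duration to whole seconds, rejecting values below the minimum."""
    seconds = int(duration.total_seconds())
    if seconds < MIN_DOWNTIME_SECONDS:
        # Assumption: a ValueError stands in for whatever the library raises.
        raise ValueError(f"Downtime must be at least {MIN_DOWNTIME_SECONDS} seconds, got {seconds}")
    return seconds


print(downtime_seconds())
print(downtime_seconds(timedelta(minutes=30)))
```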