alertmanager

Alertmanager module.

exception spicerack.alertmanager.AlertmanagerError(message: str, response: requests.models.Response | None = None) -> None

Bases: SpicerackError

Custom exception class for errors of this module.

Initializes an AlertmanagerError instance with the API response instance.

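Parameters:
  • message (str) -- the exception message.

  • response (requests.models.Response | None, default: None) -- the API response instance, if any.
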
class spicerack.alertmanager.Alertmanager(*, dry_run: bool = True) -> None

Bases: object

Operate on Alertmanager via its APIs.

Initialize the instance.

When using Alertmanager in high availability (cluster) make sure to pass all hosts in your cluster as alertmanager_urls.

Parameters:

dry_run (bool, default: True) -- whether this is a DRY-RUN.

downtime(reason: spicerack.administrative.Reason, *, matchers: collections.abc.Sequence[dict[str, Union[str, int, float, bool]]], duration: datetime.timedelta = datetime.timedelta(seconds=14400)) -> str

Issue a new downtime.

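Parameters:
  • reason (spicerack.administrative.Reason) -- the downtime reason.

  • matchers (collections.abc.Sequence[dict[str, typing.Union[str, int, float, bool]]]) -- the list of matchers to be applied to the downtime. The downtime will match alerts that match all the matchers provided, as they are ANDed by Alertmanager.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.
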
Return type:

str

Returns:

The downtime ID.

Raises:
downtimed(reason: spicerack.administrative.Reason, *, matchers: collections.abc.Sequence[dict[str, Union[str, int, float, bool]]], duration: datetime.timedelta = datetime.timedelta(seconds=14400), remove_on_error: bool = False) -> collections.abc.Iterator[None]

Context manager to perform actions while the matching alerts are downtimed on Alertmanager.

Parameters:
  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on Alertmanager.

  • matchers (collections.abc.Sequence[dict[str, typing.Union[str, int, float, bool]]]) -- the list of matchers to be applied to the downtime. The downtime will match only alerts that satisfy all the matchers provided, as they are ANDed by Alertmanager.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

  • remove_on_error (bool, default: False) -- whether the downtime should be removed even if an exception was raised.

Yields:

None -- it yields control to the caller once Alertmanager has received the downtime, and deletes the downtime once control is returned.

Return type:

collections.abc.Iterator[None]
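
The interaction between the context manager and remove_on_error can be illustrated with a minimal, self-contained sketch. FakeAlertmanager below is a hypothetical in-memory stand-in written for this example; it is not the spicerack class and makes no API calls. The plain-string reason and the name/value/isRegex matcher keys are also assumptions made for illustration.

```python
from contextlib import contextmanager
from datetime import timedelta


class FakeAlertmanager:
    """Hypothetical in-memory stand-in; it is not the spicerack class."""

    def __init__(self):
        self.active = {}  # downtime ID -> matchers
        self._counter = 0

    def downtime(self, reason, *, matchers, duration=timedelta(hours=4)):
        # Record the downtime locally and return its ID.
        self._counter += 1
        downtime_id = f"fake-{self._counter}"
        self.active[downtime_id] = list(matchers)
        return downtime_id

    def remove_downtime(self, downtime_id):
        del self.active[downtime_id]

    @contextmanager
    def downtimed(self, reason, *, matchers, duration=timedelta(hours=4), remove_on_error=False):
        downtime_id = self.downtime(reason, matchers=matchers, duration=duration)
        try:
            yield
        except BaseException:
            # On failure the downtime is kept unless remove_on_error is set.
            if remove_on_error:
                self.remove_downtime(downtime_id)
            raise
        else:
            # On success the downtime is always removed.
            self.remove_downtime(downtime_id)


def run_with_downtime(fail, remove_on_error):
    """Return how many downtimes are still active after the block exits."""
    manager = FakeAlertmanager()
    matchers = [{"name": "alertname", "value": "ExampleAlert", "isRegex": False}]
    try:
        with manager.downtimed("maintenance", matchers=matchers, remove_on_error=remove_on_error):
            if fail:
                raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    return len(manager.active)
```

With the default remove_on_error=False, a failure inside the block leaves the downtime in place, which keeps alerts silenced while the failure is investigated.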

remove_downtime(downtime_id: str) -> None

Remove a downtime.

Parameters:

downtime_id (str) -- the downtime ID to remove.

Raises:

spicerack.alertmanager.AlertmanagerError -- if none of the alertmanager_urls APIs returned a success.

Return type:

None
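
The error condition above ("none of the alertmanager_urls APIs returned a success") suggests that several endpoints are tried before giving up. The sketch below shows one way such a fallback could look. It is an assumption about the general pattern, not the spicerack implementation: call_first_available, ExampleError, and fake_request are hypothetical names, and the request is an injected callable so that no network access is needed.

```python
URLS = (
    "http://alertmanager-eqiad.wikimedia.org",
    "http://alertmanager-codfw.wikimedia.org",
)


class ExampleError(Exception):
    """Hypothetical stand-in for AlertmanagerError."""


def call_first_available(urls, request):
    """Try each URL in order and return the first successful result.

    `request` is any callable that takes a URL and either returns a result
    or raises an exception.
    """
    errors = []
    for url in urls:
        try:
            return request(url)
        except Exception as error:
            # Remember the failure and move on to the next endpoint.
            errors.append(f"{url}: {error}")
    raise ExampleError("no endpoint returned a success: " + "; ".join(errors))


def fake_request(url):
    """Simulated request: the first endpoint is unreachable, the second works."""
    if "eqiad" in url:
        raise ConnectionError("unreachable")
    return f"deleted via {url}"


result = call_first_available(URLS, fake_request)
```

In this sketch an error is raised only when every endpoint has failed, which mirrors the documented condition.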

class spicerack.alertmanager.AlertmanagerHosts(target_hosts: spicerack.typing.TypeHosts, *, verbatim_hosts: bool = False, dry_run: bool = True) -> None

Bases: Alertmanager

Operate on Alertmanager for a list of hosts via its APIs.

Initialize the instance.

Parameters:
  • target_hosts (typing.TypeVar(TypeHosts, collections.abc.Sequence[str], ClusterShell.NodeSet.NodeSet)) -- the target hosts either as a NodeSet instance or a sequence of strings.

  • verbatim_hosts (bool, default: False) -- if True, use the hosts exactly as passed. If False (the default), treat the given target hosts as FQDNs and extract their hostnames to be used in Alertmanager.

  • dry_run (bool, default: True) -- whether this is a DRY-RUN.

Raises:

spicerack.alertmanager.AlertmanagerError -- if no target hosts are provided.
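
As a rough illustration of the verbatim_hosts behaviour described above, the sketch below derives the names that would be used in Alertmanager from a sequence of strings. The alertmanager_names helper and the example.org hostnames are hypothetical, and ValueError stands in for AlertmanagerError; the actual extraction logic in spicerack may differ.

```python
def alertmanager_names(target_hosts, verbatim_hosts=False):
    """Hypothetical helper mirroring the documented verbatim_hosts behaviour."""
    hosts = [str(host) for host in target_hosts]
    if not hosts:
        # The documented class raises AlertmanagerError in this case.
        raise ValueError("no target hosts provided")
    if verbatim_hosts:
        # Use the hosts exactly as passed.
        return hosts
    # Treat each host as an FQDN and keep only the part before the first dot.
    return [host.split(".", 1)[0] for host in hosts]


short_names = alertmanager_names(["host1.example.org", "host2.example.org"])
verbatim_names = alertmanager_names(["host1.example.org"], verbatim_hosts=True)
```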

downtime(reason: spicerack.administrative.Reason, *, matchers: collections.abc.Sequence[dict[str, Union[str, int, float, bool]]] = (), duration: datetime.timedelta = datetime.timedelta(seconds=14400)) -> str

Issue a new downtime for the given hosts.

Parameters:
  • reason (spicerack.administrative.Reason) -- the downtime reason.

  • matchers (collections.abc.Sequence[dict[str, typing.Union[str, int, float, bool]]], default: ()) -- an optional list of matchers to be applied to the downtime. They will be added to the matcher automatically generated to match the target_hosts of the current instance. For this reason the provided matchers cannot target the instance property. The downtime will match only alerts that satisfy all the matchers provided, as they are ANDed by Alertmanager.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

Returns:

the downtime ID.

Return type:

str

Raises:
downtimed(reason: spicerack.administrative.Reason, *, matchers: collections.abc.Sequence[dict[str, Union[str, int, float, bool]]] = (), duration: datetime.timedelta = datetime.timedelta(seconds=14400), remove_on_error: bool = False) -> collections.abc.Iterator[None]

Context manager to perform actions while the hosts are downtimed on Alertmanager.

Parameters:
  • reason (spicerack.administrative.Reason) -- the reason to set for the downtime on Alertmanager.

  • matchers (collections.abc.Sequence[dict[str, typing.Union[str, int, float, bool]]], default: ()) -- an optional list of matchers to be applied to the downtime. They will be added to the matcher automatically generated to match the target_hosts of the current instance. For this reason the provided matchers cannot target the instance property. The downtime will match only alerts that satisfy all the matchers provided, as they are ANDed by Alertmanager.

  • duration (datetime.timedelta, default: datetime.timedelta(seconds=14400)) -- the length of the downtime period.

  • remove_on_error (bool, default: False) -- whether the downtime should be removed even if an exception was raised.

Yields:

None -- it yields control to the caller once Alertmanager has received the downtime, and deletes the downtime once control is returned.

Return type:

collections.abc.Iterator[None]
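
The way extra matchers combine with the automatically generated host matcher can be sketched as follows. This is an illustration under stated assumptions, not spicerack code: build_matchers and alert_matches are hypothetical helpers, the name/value/isRegex keys are assumed for the matcher dictionaries, and the AND evaluation is simulated locally instead of being performed by Alertmanager.

```python
import re

PORT_REGEX = r"(\..+)?(:[0-9]+)?"


def build_matchers(hostnames, extra_matchers=()):
    """Prepend a generated instance matcher to the caller's matchers."""
    for matcher in extra_matchers:
        # The instance label is reserved for the generated host matcher.
        if matcher["name"] == "instance":
            raise ValueError("matchers cannot target the instance label")
    alternatives = "|".join(re.escape(name) for name in hostnames)
    instance_matcher = {
        "name": "instance",
        "value": f"^({alternatives}){PORT_REGEX}$",
        "isRegex": True,
    }
    return [instance_matcher, *extra_matchers]


def alert_matches(labels, matchers):
    """Return True only if every matcher is satisfied (AND semantics)."""
    for matcher in matchers:
        actual = labels.get(matcher["name"], "")
        if matcher["isRegex"]:
            if re.fullmatch(matcher["value"], actual) is None:
                return False
        elif actual != str(matcher["value"]):
            return False
    return True


matchers = build_matchers(
    ["host1"],
    [{"name": "severity", "value": "critical", "isRegex": False}],
)
critical_alert = {"instance": "host1.example.org:9100", "severity": "critical"}
warning_alert = {"instance": "host1.example.org:9100", "severity": "warning"}
```

Here critical_alert satisfies both matchers, while warning_alert is on the right host but fails the severity matcher, so it would not be covered by the downtime.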

spicerack.alertmanager.ALERTMANAGER_URLS: tuple[str, str] = ('http://alertmanager-eqiad.wikimedia.org', 'http://alertmanager-codfw.wikimedia.org')

All the alertmanager instances to contact.

spicerack.alertmanager.PORT_REGEX: str = '(\\..+)?(:[0-9]+)?'

The regular expression used to match FQDNs and port numbers in the instance labels.
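
To see what PORT_REGEX accepts, the sketch below appends it to an escaped hostname and checks a few instance labels. How spicerack itself composes the final pattern is not shown in this section, so label_matches_host is a hypothetical helper and the labels are made-up examples.

```python
import re

PORT_REGEX = r"(\..+)?(:[0-9]+)?"


def label_matches_host(hostname, instance_label):
    """Check whether an instance label is the hostname plus an optional domain and port."""
    pattern = re.escape(hostname) + PORT_REGEX
    return re.fullmatch(pattern, instance_label) is not None


labels = [
    "host1",
    "host1.example.org",
    "host1:9100",
    "host1.example.org:9100",
    "host10:9100",
    "host1-backup",
]
matched = [label for label in labels if label_matches_host("host1", label)]
```

The first four labels match. "host10:9100" and "host1-backup" do not, because anything following the hostname must start with a dot (domain part) or a colon (port).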