Configuring Service Checks

Scap can perform service checks during a deployment to detect problems caused by the new code or configuration. When a check fails, scap alerts the deployer early in the process and offers the option to roll back to the previously deployed version. Checks can also be used to run any arbitrary command.

Scap defines two additional variables in the environment in which a check executes: $SCAP_FINAL_PATH and $SCAP_REV_PATH. $SCAP_FINAL_PATH is the final path of the code after deployment is complete. $SCAP_REV_PATH is the path of the code currently being deployed, which varies from one deployed revision to the next.
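As a minimal sketch of how a check might use these variables, the shell function below confirms that the revision being deployed is present on disk. Only $SCAP_REV_PATH and $SCAP_FINAL_PATH come from scap; the function name check_rev_path, the messages, and the assumption that $SCAP_REV_PATH points at a directory are illustrative.

```shell
#!/bin/sh
# Hypothetical check helper. SCAP_REV_PATH and SCAP_FINAL_PATH are set by scap;
# the function name and messages are illustrative only.

check_rev_path() {
    # Fail clearly when run outside of a scap check environment.
    if [ -z "${SCAP_REV_PATH:-}" ]; then
        echo "SCAP_REV_PATH is not set; run this as a scap check" >&2
        return 2
    fi
    # Assumption: the revision path is a directory on the target.
    if [ ! -d "$SCAP_REV_PATH" ]; then
        echo "revision directory not found: $SCAP_REV_PATH" >&2
        return 1
    fi
    echo "deploying $SCAP_REV_PATH (final path: ${SCAP_FINAL_PATH:-unset})"
}
```

A script that uses this helper would end by calling check_rev_path, so that the function's return status becomes the exit status of the script.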

Logging and Monitoring

Service checks give you real-time status information about your deployments with minimal effort, without leaving the deployment console session.

See also

  • scap deploy-log - monitor output from scap deployment target(s)

NRPE Checks

The nrpe check type is a simple wrapper around existing Icinga/NRPE checks so that your deployments can utilize monitoring infrastructure that is already in place with minimal effort.

By default, the check commands are loaded and registered from definitions in /etc/nagios/nrpe.d. You must reference these checks in your checks.yaml in order to tell scap which checks are relevant to the service you are deploying.

The directory /etc/nagios/nrpe.d/ should contain config files (generated by puppet) that define the commands needed to run the various Nagios check plugins. For example, the directory should look something like this:

/etc/nagios/nrpe.d:

├── check_cassandra.cfg
├── check_check_dhclient.cfg
├── check_check_eth.cfg
├── check_check_salt_minion.cfg
├── check_disk_space.cfg
├── check_dpkg.cfg
├── check_endpoints_restbase.cfg
├── check_puppet_checkpuppetrun.cfg
├── check_raid.cfg
└── check_root_disk_space.cfg

NRPE checks can be referenced in checks.yaml using type: nrpe and command: {check_name}. The value of {check_name} must match the name of a file in /etc/nagios/nrpe.d, without the file extension.

Example checks.yaml:

checks:
  service_endpoints:
    type: nrpe
    command: check_service_endpoints
    stage: promote
    timeout: 60 # default is 30 seconds
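To see which values are usable as {check_name} on a given target, the file names in the NRPE directory can be listed with the extension removed. The sketch below assumes, as described above, that check names correspond to the .cfg file names; the function name list_nrpe_checks is hypothetical and not part of scap.

```shell
#!/bin/sh
# Hypothetical helper: print candidate NRPE check names, one per line.
# Assumes check names correspond to *.cfg file names, as described above.

list_nrpe_checks() {
    # Default to the directory named in this document; allow an override.
    dir="${1:-/etc/nagios/nrpe.d}"
    for cfg in "$dir"/*.cfg; do
        # Skip the unexpanded pattern when the directory has no .cfg files.
        [ -e "$cfg" ] || continue
        # Strip the directory and the .cfg extension.
        basename "$cfg" .cfg
    done
}
```

Running list_nrpe_checks with no arguments on a deployment target prints one name per line, and any of those names can be used as the command value of an nrpe check.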

Script Checks

The script check type allows users to run scripts after any stage of a deployment. In the past this was achieved with the command check; the script check type provides an easier way to execute scripts that may change between revisions of a repository.

Script checks will only run executable files in the scap/scripts directory.

Script checks can be referenced in checks.yaml using type: script and command: {basename_of_executable_file}. The value of {basename_of_executable_file} will be executed by the ssh_user after the stage specified by the stage option.

The example below assumes that the repository being deployed contains a file scap/scripts/build_venv.sh that is executable by the ssh_user.

Example checks.yaml:

checks:
  build_venv:
    type: script
    stage: promote
    command: build_venv.sh
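Because a script check only runs executable files in scap/scripts, it can be useful to confirm both conditions in a working copy before deploying. The following is a minimal sketch; the function name script_check_ready is hypothetical, and the executable test applies to the user running the function, which may differ from the ssh_user on the target.

```shell
#!/bin/sh
# Hypothetical pre-deployment helper, not part of scap.
# Usage: script_check_ready <repo_dir> <script_basename>

script_check_ready() {
    repo_dir="$1"
    script_name="$2"
    path="$repo_dir/scap/scripts/$script_name"
    # The script must exist under scap/scripts in the repository.
    if [ ! -f "$path" ]; then
        echo "missing: $path" >&2
        return 1
    fi
    # The script must be executable (tested for the current user only).
    if [ ! -x "$path" ]; then
        echo "not executable: $path (try: chmod +x)" >&2
        return 1
    fi
    echo "ok: $path"
}
```

For the example above, script_check_ready . build_venv.sh run from the repository root reports whether scap/scripts/build_venv.sh is in place.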

Command Checks

The command check type allows users to define shell commands to run after each stage of deployment.

Command checks can be referenced in checks.yaml using type: command and command: {shell_command}. The value of {shell_command} will be executed by the ssh_user after the stage specified by the stage option.

Example checks.yaml:

checks:
  mockbase_responds:
    type: command
    stage: promote
    command: curl -Ss localhost:1134

Check stages

Not all of the stages listed below are run for every deployment. The basic stages that you might want to write checks for are fetch and promote.

NRPE, script, and command checks may be executed following any stage of deployment. The stage is specified using the stage option in the checks.yaml file:

  1. restart_service - a service is restarted
  2. config_deploy - templated configuration files are rendered
  3. config_diff - each file is compared to the deployed version; called during scap deploy --dry-run
  4. fetch - target repository has been checked-out
  5. finalize - final deployment cleanup
  6. promote - make the new deployment active
  7. rollback - target is rolled back to the last deployed revision
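A mistyped stage value is easy to miss when editing checks.yaml by hand. As a small sketch, the function below accepts only the seven stage names listed above; the name is_valid_stage is hypothetical, and the function says nothing about how scap itself treats an unknown stage.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for the stage names listed in this document.

is_valid_stage() {
    case "$1" in
        restart_service|config_deploy|config_diff|fetch|finalize|promote|rollback)
            return 0 ;;
        *)
            echo "unknown stage: $1" >&2
            return 1 ;;
    esac
}
```

For example, is_valid_stage promote succeeds, while is_valid_stage promot prints an error and returns a non-zero status.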