elasticsearch_cluster

ElasticsearchCluster module.

class spicerack.elasticsearch_cluster.ElasticsearchCluster(elasticsearch, remote, dry_run=True)[source]

Bases: object

Class to manage elasticsearch cluster.

__init__(elasticsearch, remote, dry_run=True)[source]

Initialize ElasticsearchCluster.

Parameters:
  • elasticsearch (elasticsearch.Elasticsearch) -- elasticsearch instance.
  • remote (spicerack.remote.Remote) -- the Remote instance, pre-initialized.
  • dry_run (bool, optional) -- whether this is a DRY-RUN.

Todo

The self._hostname class member will be replaced by the formatted message obtained via Reason. This can't be done right now because it needs to stay in line with what the MW maint script and the Icinga check currently do.

_do_cluster_routing(cluster_routing)[source]

Performs cluster routing of shards.

Parameters:cluster_routing (curator.ClusterRouting) -- Curator's cluster routing object.
_force_allocation_of_shard(shard, nodes)[source]

Force allocation of shard.

Parameters:
  • shard (dict) -- shard of an index to be relocated.
  • nodes (list) -- list of nodes to allocate shards to.

Todo

It was found that forcing allocation of shards may be faster than letting elasticsearch do its recovery on its own. We should verify from time to time whether elasticsearch recovery performance has improved, and remove this step if it proves unnecessary.

_freeze_writes(reason)[source]

Stop writes to all elasticsearch indices.

Parameters:reason (spicerack.administrative.Reason) -- Reason for freezing writes.
_get_unassigned_shards()[source]

Fetch unassigned shards.

Returns:list of unassigned shards from the cluster.
Return type:list
_start_replication()[source]

Starts cluster replication.

_stop_replication()[source]

Stops cluster replication.

_unfreeze_writes()[source]

Enable writes on all elasticsearch indices.

flush_markers(timeout=datetime.timedelta(0, 60))[source]

Flush markers unsynced.

flush + flush_synced is called here because, from experience, it results in fewer shards not syncing. This also makes recovery faster.

Parameters:timeout (datetime.timedelta) -- timedelta object for elasticsearch request timeout.
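
The default in the signature, datetime.timedelta(0, 60), passes days and seconds positionally, so it is a 60-second timeout. A small standard-library sketch to confirm the equivalence (the variable names are only illustrative):

```python
import datetime

# timedelta's first two positional arguments are days and seconds.
default_timeout = datetime.timedelta(0, 60)

# The keyword form is equivalent and easier to read at a call site.
same_timeout = datetime.timedelta(seconds=60)

timeout_seconds = default_timeout.total_seconds()
```

The same reading applies to the other defaults in this module: timedelta(0, 900) is 15 minutes and timedelta(0, 3600) is one hour.
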
force_allocation_of_all_unassigned_shards()[source]

Manual allocation of unassigned shards.

frozen_writes(reason)[source]

Stop writes to all elasticsearch indices and enable them on exit.

Parameters:reason (spicerack.administrative.Reason) -- Reason for freezing writes.
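
The "stop on entry, enable on exit" behaviour can be sketched as a context manager. This is a minimal illustration and not the module's implementation: FakeCluster and its freeze/unfreeze methods are hypothetical stand-ins, and re-enabling writes even when the block raises is an assumption that this page does not confirm.

```python
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical stand-in that only records the calls made on it."""

    def __init__(self):
        self.calls = []

    def freeze(self, reason):
        self.calls.append(("freeze", reason))

    def unfreeze(self):
        self.calls.append(("unfreeze",))


@contextmanager
def frozen_writes(cluster, reason):
    """Freeze writes, yield to the caller, and unfreeze on exit."""
    cluster.freeze(reason)
    try:
        yield
    finally:
        # Assumption: writes are re-enabled even if the with-block raises.
        cluster.unfreeze()


cluster = FakeCluster()
try:
    with frozen_writes(cluster, "rolling restart"):
        raise RuntimeError("simulated failure while writes are frozen")
except RuntimeError:
    pass
```

After the block, cluster.calls holds one freeze call followed by one unfreeze call, even though the body raised.
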
get_nodes()[source]

Get all Elasticsearch Nodes.

Returns:dictionary of elasticsearch nodes in the cluster.
Return type:dict
is_green()[source]

Cluster health status.

is_node_in_cluster_nodes(node)[source]

Checks if node is in a list of elasticsearch cluster nodes.

Parameters:node (str) -- the elasticsearch host.
Returns:True if node is present and False if not present.
Return type:bool
static split_node_name(node_name)[source]

Split node name into hostname and cluster group name.

Parameters:node_name (str) -- node name containing hostname and cluster name separated by '-'
Returns:dict containing the node name and the cluster name
Return type:dict
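
A minimal sketch of the split described above. The dictionary keys ('name', 'cluster_group') and the choice to split on the first '-' only are assumptions made for illustration; the page does not state either.

```python
def split_node_name(node_name):
    """Split 'hostname-clustergroup' into a dict of its two parts."""
    # partition() splits on the first '-' only (an assumption of this sketch).
    hostname, _, cluster_group = node_name.partition("-")
    # Key names are hypothetical, chosen for readability.
    return {"name": hostname, "cluster_group": cluster_group}


result = split_node_name("el5-alpha")
```
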
stopped_replication()[source]

Context manager to perform actions while the cluster replication is stopped.

exception spicerack.elasticsearch_cluster.ElasticsearchClusterError[source]

Bases: spicerack.exceptions.SpicerackError

Exception class for errors of this module.

class spicerack.elasticsearch_cluster.ElasticsearchClusters(clusters, remote, dry_run=True)[source]

Bases: object

Class to manage elasticsearch clusters.

__init__(clusters, remote, dry_run=True)[source]

Initialize ElasticsearchClusters.

Parameters:
  • clusters (list of spicerack.elasticsearch_cluster.ElasticsearchCluster) -- list of elasticsearch clusters.
  • remote (spicerack.remote.Remote) -- the Remote instance, pre-initialized.
  • dry_run (bool, optional) -- whether this is a DRY-RUN.
static _append_to_nodegroup(nodes_group, cluster_node, cluster)[source]

Merge nodes of different clusters.

Parameters:
_get_nodes_group()[source]

Create nodes_group for each node.

Returns:merged clusters nodes, e.g.:
{'el5':
    {'name': 'el5', 'clusters': ['alpha', 'beta'],
     'clusters_instances': [spicerack.elasticsearch_cluster.ElasticsearchCluster],
     'row': 'row2', 'oldest_start_time': 10
    }
}
Return type:dict
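
One way to picture the merge that produces a structure like the example above. This is a sketch under stated assumptions: the input shape (pairs of cluster name and node dict), the 'start_time' key, and the merge_nodes name are all hypothetical, and the 'clusters_instances' entry is left out to keep the example self-contained.

```python
def merge_nodes(cluster_nodes):
    """Merge (cluster_name, node) pairs into one entry per host name."""
    nodes_group = {}
    for cluster_name, node in cluster_nodes:
        entry = nodes_group.setdefault(
            node["name"],
            {
                "name": node["name"],
                "clusters": [],
                "row": node["row"],
                "oldest_start_time": node["start_time"],
            },
        )
        entry["clusters"].append(cluster_name)
        # Keep the earliest start time seen across this host's instances.
        entry["oldest_start_time"] = min(
            entry["oldest_start_time"], node["start_time"]
        )
    return nodes_group


merged = merge_nodes([
    ("alpha", {"name": "el5", "row": "row2", "start_time": 10}),
    ("beta", {"name": "el5", "row": "row2", "start_time": 25}),
])
```
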
static _node_has_been_restarted(node, since)[source]

Check if node has been restarted.

Parameters:
  • node (dict) -- elasticsearch node.
  • since (datetime.datetime) -- the time after which we check whether the node has been restarted.
Returns:

True if the node has been restarted after since, False otherwise.

Return type:

bool
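
The check can be sketched as a comparison between the node's start time and since. The 'oldest_start_time' key comes from the nodes_group example on this page, but its unit is not documented here; the sketch assumes seconds since the Unix epoch in UTC, which may differ from what the module uses.

```python
import datetime


def node_has_been_restarted(node, since):
    """Return True if the node's oldest start time is later than `since`."""
    # Assumption: 'oldest_start_time' is seconds since the Unix epoch (UTC).
    started = datetime.datetime.fromtimestamp(
        node["oldest_start_time"], tz=datetime.timezone.utc
    )
    return started > since


since = datetime.datetime(2020, 1, 1, tzinfo=datetime.timezone.utc)
old_node = {"name": "el1", "oldest_start_time": 10}
new_node = {"name": "el2", "oldest_start_time": since.timestamp() + 60}
```
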

static _to_rows(nodes)[source]

Arrange nodes in rows, so that each node is placed in its respective row.

Parameters:nodes (list) -- list containing dicts of elasticsearch nodes.
Returns:
dict object containing the normalized rows of elasticsearch nodes.
E.g. {'row1': [{'name': 'el1'}, {'name': 'el2'}], 'row2': [{'name': 'el6'}]}
Return type:dict
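
Grouping nodes by row can be sketched with a defaultdict. The sketch assumes each node dict carries a 'row' key, as in the nodes_group example on this page, and keeps each node dict whole rather than reducing it to its name.

```python
from collections import defaultdict


def to_rows(nodes):
    """Group node dicts by the value of their 'row' key."""
    rows = defaultdict(list)
    for node in nodes:
        rows[node["row"]].append(node)
    # Return a plain dict so missing rows raise KeyError instead of appearing.
    return dict(rows)


rows = to_rows([
    {"name": "el1", "row": "row1"},
    {"name": "el2", "row": "row1"},
    {"name": "el6", "row": "row2"},
])
```
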
flush_markers(timeout=datetime.timedelta(0, 60))[source]

Flush markers on all clusters.

Parameters:timeout (datetime.timedelta, optional) -- timedelta object for elasticsearch request timeout.
force_allocation_of_all_unassigned_shards()[source]

Force allocation of unassigned shards on all clusters.

frozen_writes(reason)[source]

Freeze all writes to the clusters and then perform operations before unfreezing writes.

Parameters:reason (spicerack.administrative.Reason) -- Reason for freezing writes.
get_next_clusters_nodes(started_before, size=1)[source]

Get the next set of cluster nodes for cookbook operations such as upgrades or rolling restarts.

Parameters:
  • started_before (datetime.datetime) -- the cutoff time: nodes that started before it are considered not yet restarted.
  • size (int, optional) -- number of not-yet-restarted nodes to pick from a single row.
Returns:

next eligible nodes for ElasticsearchHosts.

Return type:

spicerack.elasticsearch_cluster.ElasticsearchHosts
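
A rough sketch of the selection idea: take up to size nodes, all from one row, that have not been restarted since the cutoff. Everything here is an assumption for illustration (the function name, the numeric timestamps, and visiting rows in sorted order); the real method returns an ElasticsearchHosts instance rather than a list.

```python
def next_nodes(rows, started_before, size=1):
    """Pick up to `size` nodes from one row that started before the cutoff."""
    for row_name in sorted(rows):
        pending = [
            node for node in rows[row_name]
            if node["oldest_start_time"] < started_before
        ]
        if pending:
            # Never mix rows: return nodes from the first row with work left.
            return pending[:size]
    return []


rows = {
    "row1": [
        {"name": "el1", "oldest_start_time": 200},
        {"name": "el2", "oldest_start_time": 200},
    ],
    "row2": [
        {"name": "el6", "oldest_start_time": 10},
        {"name": "el7", "oldest_start_time": 20},
        {"name": "el8", "oldest_start_time": 30},
    ],
}
selected = next_nodes(rows, started_before=100, size=2)
selected_names = [node["name"] for node in selected]
```

In this example every node in row1 started after the cutoff, so the selection comes from row2 and is capped at two nodes.
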

stopped_replication()[source]

Stops replication for all clusters.

wait_for_green(timeout=datetime.timedelta(0, 3600))[source]

Wait for green on all clusters.

Parameters:timeout (datetime.timedelta, optional) -- timedelta object to represent how long to wait for green status on all clusters.
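
Waiting for a condition with a timedelta timeout can be sketched as a polling loop. This is a generic illustration, not the module's code: wait_until, its interval parameter, and returning False on timeout are all assumptions, and this page does not say what the real method does when the timeout expires.

```python
import datetime
import time


def wait_until(check, timeout, interval=datetime.timedelta(milliseconds=10)):
    """Poll `check` until it returns True or `timeout` has elapsed."""
    deadline = time.monotonic() + timeout.total_seconds()
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval.total_seconds())


# Simulated health statuses returned by successive polls.
statuses = iter(["red", "yellow", "green"])
became_green = wait_until(
    lambda: next(statuses) == "green", datetime.timedelta(seconds=1)
)
never_green = wait_until(lambda: False, datetime.timedelta(milliseconds=30))
```
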
class spicerack.elasticsearch_cluster.ElasticsearchHosts(remote_host, next_nodes, dry_run=True)[source]

Bases: spicerack.remote.RemoteHostsAdapter

RemoteHosts adapter for managing elasticsearch nodes.

__init__(remote_host, next_nodes, dry_run=True)[source]

After calling the super's constructor, initialize other instance variables.

Parameters:
  • remote_host (spicerack.remote.RemoteHosts) -- the instance with the target hosts.
  • next_nodes (list) -- list of dicts describing the clusters that the hosts belong to.
  • dry_run (bool, optional) -- whether this is a DRY-RUN.
depool_nodes()[source]

Depool the hosts.

get_remote_hosts()[source]

Returns elasticsearch remote hosts.

Returns:RemoteHosts instance for this adapter.
Return type:spicerack.remote.RemoteHosts
pool_nodes()[source]

Pool the hosts.

restart_elasticsearch()[source]

Restarts all elasticsearch instances.

start_elasticsearch()[source]

Starts all elasticsearch instances.

stop_elasticsearch()[source]

Stops all elasticsearch instances.

wait_for_elasticsearch_up(timeout=datetime.timedelta(0, 900))[source]

Check if elasticsearch instances on each node are up.

Parameters:timeout (datetime.timedelta, optional) -- represent how long to wait for all instances to be up.
spicerack.elasticsearch_cluster.create_elasticsearch_clusters(clustergroup, remote, dry_run=True)[source]

Create ElasticsearchClusters instance.

Parameters:
  • clustergroup (str) -- name of cluster group.
  • remote (spicerack.remote.Remote) -- the Remote instance.
  • dry_run (bool, optional) -- whether this is a DRY-RUN.
Raises:

spicerack.elasticsearch_cluster.ElasticsearchClusterError -- Thrown when the requested cluster configuration is not found.

Returns:

ElasticsearchClusters instance.

Return type:

spicerack.elasticsearch_cluster.ElasticsearchClusters