MediaWiki REL1_36
Provide a given client with protection against visible database lag.
Public Member Functions
__construct (BagOStuff $store, array $client, $posIndex, $secret='')
applySessionReplicationPosition (ILoadBalancer $lb)
Apply the "session consistency" DB replication position to a new ILoadBalancer.
getClientId ()
getTouched (ILoadBalancer $lb)
Get the UNIX timestamp when the client last touched the DB, if they did so recently.
setEnabled ($enabled)
setLogger (LoggerInterface $logger)
setMockTime (&$time)
setWaitEnabled ($enabled)
shutdown (&$cpIndex=null)
Save any recorded "session consistency" DB replication positions to persistent storage.
stageSessionReplicationPosition (ILoadBalancer $lb)
Update the "session consistency" DB replication position for an end-of-life ILoadBalancer.
Public Attributes
const POSITION_COOKIE_TTL = 10
Seconds to store position write index cookies (safely less than POSITION_STORE_TTL)
Protected Member Functions
getCurrentTime ()
getStartupSessionPositions ()
getStartupSessionTimestamps ()
lazyStartup ()
Load the stored DB replication positions and touch timestamps for the client.
mergePositions ($storedValue, array $shutdownPositions, array $shutdownTimestamps, &$cpIndex=null)
Merge the new DB replication positions with the currently stored ones (highest wins)
Protected Attributes
string $clientId
Hash of client parameters.
string[] $clientLogInfo
Map of client information fields for logging.
bool $enabled = true
Whether reading/writing session consistency replication positions is enabled.
string $key
Storage key name.
LoggerInterface $logger
bool $positionWaitsEnabled = true
Whether waiting on DB servers to reach replication positions is enabled.
array<string, DBMasterPos> $shutdownPositionsByMaster = []
Map of (DB master name => position)
array<string, float> $shutdownTimestampsByCluster = []
Map of (DB cluster name => UNIX timestamp)
array<string, DBMasterPos> $startupPositionsByMaster = []
Map of (DB master name => position)
float|null $startupTimestamp
UNIX timestamp when the client data was loaded.
array<string, float> $startupTimestampsByCluster = []
Map of (DB cluster name => UNIX timestamp)
BagOStuff $store
int|null $waitForPosIndex
Expected minimum index of the last write to the position store.
Private Attributes
float|null $wallClockOverride
const FLD_POSITIONS = 'positions'
const FLD_TIMESTAMPS = 'timestamps'
const FLD_WRITE_INDEX = 'writeIndex'
const LOCK_TIMEOUT = 3
Lock timeout to use for key updates.
const LOCK_TTL = 6
Lock expiry to use for key updates.
const POSITION_INDEX_WAIT_TIMEOUT = 5
Max seconds to wait for positions write indexes to appear (e.g. replicate) in storage.
const POSITION_STORE_TTL = 60
Seconds to store replication positions.
Provide a given client with protection against visible database lag.
This class tries to hide visible effects of database lag. It does this by temporarily remembering the database positions after a client makes a write, so that their next web request prefers non-lagged database replicas. When replica connections are established, we wait up to a few seconds for sufficient replication to have occurred if they have not yet caught up to that same point.
This ensures a consistent ordering of events as seen by a client. Kind of like Hawking's Chronology Protection Agency.
For performance and scalability reasons, almost all data is queried from replica databases. Only queries that write data are sent to a database master. When rendering a web page with content or activity feeds on it, the very latest information may thus not yet be there. That's okay in general, but if, for example, a client recently changed their preferences or submitted new data, we do our best to make sure their next web response does reflect at least their own recent changes.
To explain how it works, we will look at an example lifecycle for a client.
A client is browsing the site. Their web requests are generally read-only and display data from database replicas, which may be a few seconds out of date if a client elsewhere in the world recently modified that same data. If the application is run from multiple data centers, then these web requests may be served from the nearest secondary DC.
A client performs a POST request, perhaps to publish an edit or change their preferences. This request is routed to the primary DC (this is the responsibility of infrastructure outside the web app). There, the data is saved to the database master, after which the database host will asynchronously replicate this to its replicas in the same and any other DCs.
Toward the end of the response to this POST request, the application takes note of the database master's current "position" and saves it under a "clientId" key in the ChronologyProtector store. The web response also sets two cookies that are similarly short-lived (about ten seconds): UseDC=master and cpPosIndex=<clientId>.
The ten-second window is meant to account for the time needed for the database writes to have replicated across all active database replicas, including the cross-DC latency for those further away in any secondary DCs.
Future web requests from the client should fall in one of two categories:
The store used by ChronologyProtector, as configured via $wgChronologyProtectorStash, should meet the following requirements:
These are the expectations a site administrator must meet for chronology protection:
Web requests that use the POST verb, or carry a UseDC=master cookie, must be routed to the primary DC only.
An exception is requests carrying the Promise-Non-Write-API-Action: true header, which use the POST verb for large read queries, but don't actually require the primary DC.
If you have legacy extensions deployed that perform queries on the master database during GET requests, then you will have to identify a way to route their relevant URLs to the primary DC as well, or accept that their reads do not enjoy chronology protection and that their writes may be slower (due to cross-DC latency). See T91820 for Wikimedia Foundation's routing.
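The routing expectations above can be sketched as a small predicate. This is only an illustration: mustRouteToPrimaryDc() is a hypothetical helper, not part of MediaWiki, and in practice the decision is made by infrastructure outside the web app. The sketch assumes the Promise-Non-Write-API-Action exception applies only to the POST rule, not to requests that carry the UseDC=master cookie.

```php
<?php
/**
 * Hypothetical helper (not a MediaWiki API): decide whether a request
 * must be served by the primary DC under the rules listed above.
 *
 * @param string $method HTTP verb
 * @param array<string,string> $cookies Map of (cookie name => value)
 * @param array<string,string> $headers Map of (header name => value)
 * @return bool
 */
function mustRouteToPrimaryDc(string $method, array $cookies, array $headers): bool
{
    // Header names are case-insensitive, so normalize before the lookup.
    $headers = array_change_key_case($headers, CASE_LOWER);
    $promisesNoWrite = ($headers['promise-non-write-api-action'] ?? '') === 'true';

    // A client that wrote recently carries UseDC=master for about ten seconds.
    if (($cookies['UseDC'] ?? '') === 'master') {
        return true;
    }

    // POST implies a write unless the request promises otherwise.
    // Assumption: the exception does not override the cookie check above.
    return strtoupper($method) === 'POST' && !$promisesNoWrite;
}
```

For example, mustRouteToPrimaryDc('GET', ['UseDC' => 'master'], []) is true, while a POST carrying the Promise-Non-Write-API-Action: true header and no cookie may be served from a secondary DC.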
Definition at line 135 of file ChronologyProtector.php.
Wikimedia\Rdbms\ChronologyProtector::__construct ( BagOStuff $store, array $client, $posIndex, $secret = '' )
BagOStuff $store
array $client: Map of (ip: <IP>, agent: <user-agent> [, clientId: <hash>])
int|null $posIndex: Write counter index
string $secret: Secret string for HMAC hashing [optional]
Definition at line 192 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$store, and BagOStuff\makeGlobalKey().
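As a rough illustration of how the $client map and $secret could combine into the "hash of client parameters" stored in $clientId, consider the sketch below. deriveClientId() is a hypothetical function; the choice of MD5, the newline separator, and the fallback to a plain hash when no secret is given are assumptions made for this example, not a statement of what the constructor does.

```php
<?php
/**
 * Hypothetical helper (not a MediaWiki API): derive a stable client ID.
 *
 * @param array $client Map of (ip: <IP>, agent: <user-agent> [, clientId: <hash>])
 * @param string $secret Secret string for HMAC hashing [optional]
 * @return string
 */
function deriveClientId(array $client, string $secret = ''): string
{
    // An explicitly supplied client ID takes precedence over hashing.
    if (isset($client['clientId'])) {
        return $client['clientId'];
    }

    // Assumption: IP and user agent are joined with a newline before hashing.
    $data = $client['ip'] . "\n" . $client['agent'];

    // With a secret, use a keyed hash so the ID cannot be computed by outsiders.
    return $secret !== '' ? hash_hmac('md5', $data, $secret) : md5($data);
}
```

The same IP, user agent, and secret always yield the same ID, which is what lets a later request from that client find the positions saved by an earlier one.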
Wikimedia\Rdbms\ChronologyProtector::applySessionReplicationPosition ( ILoadBalancer $lb )
Apply the "session consistency" DB replication position to a new ILoadBalancer.
If the stash has a previous master position recorded, this will try to make sure that the next query to a replica DB of that master will see changes up to that position by delaying execution. The delay may time out and allow stale data if no non-lagged replica DBs are available.
ILoadBalancer $lb
Definition at line 255 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), Wikimedia\Rdbms\ILoadBalancer\getServerName(), Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions(), Wikimedia\Rdbms\ILoadBalancer\getWriterIndex(), and Wikimedia\Rdbms\ILoadBalancer\waitFor().
Wikimedia\Rdbms\ChronologyProtector::getClientId ( )
Definition at line 222 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$clientId.
Referenced by Wikimedia\Rdbms\LBFactory\shutdown().
Wikimedia\Rdbms\ChronologyProtector::getCurrentTime ( ) [protected]
Definition at line 546 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$wallClockOverride.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup(), and Wikimedia\Rdbms\ChronologyProtector\stageSessionReplicationPosition().
Wikimedia\Rdbms\ChronologyProtector::getStartupSessionPositions ( ) [protected]
Definition at line 401 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$startupPositionsByMaster, and Wikimedia\Rdbms\ChronologyProtector\lazyStartup().
Referenced by Wikimedia\Rdbms\ChronologyProtector\applySessionReplicationPosition().
Wikimedia\Rdbms\ChronologyProtector::getStartupSessionTimestamps ( ) [protected]
Definition at line 410 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$startupTimestampsByCluster, and Wikimedia\Rdbms\ChronologyProtector\lazyStartup().
Referenced by Wikimedia\Rdbms\ChronologyProtector\getTouched().
Wikimedia\Rdbms\ChronologyProtector::getTouched ( ILoadBalancer $lb )
Get the UNIX timestamp when the client last touched the DB, if they did so recently.
ILoadBalancer $lb
Definition at line 372 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), and Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
Wikimedia\Rdbms\ChronologyProtector::lazyStartup ( ) [protected]
Load the stored DB replication positions and touch timestamps for the client.
Definition at line 421 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\FLD_POSITIONS, Wikimedia\Rdbms\ChronologyProtector\FLD_TIMESTAMPS, Wikimedia\Rdbms\ChronologyProtector\FLD_WRITE_INDEX, Wikimedia\Rdbms\ChronologyProtector\getCurrentTime(), and Wikimedia\Rdbms\ChronologyProtector\POSITION_INDEX_WAIT_TIMEOUT.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions(), and Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
Wikimedia\Rdbms\ChronologyProtector::mergePositions ( $storedValue, array $shutdownPositions, array $shutdownTimestamps, &$cpIndex = null ) [protected]
Merge the new DB replication positions with the currently stored ones (highest wins)
array<string,mixed>|false $storedValue: Current DB replication position data
array<string,DBMasterPos> $shutdownPositions: New DB replication positions
array<string,float> $shutdownTimestamps: New DB post-commit shutdown timestamps
int|null &$cpIndex: New position write index
Definition at line 501 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\FLD_POSITIONS, Wikimedia\Rdbms\ChronologyProtector\FLD_TIMESTAMPS, and Wikimedia\Rdbms\ChronologyProtector\FLD_WRITE_INDEX.
Referenced by Wikimedia\Rdbms\ChronologyProtector\shutdown().
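The "highest wins" merge can be illustrated with a simplified stand-in. mergePositionsSketch() is hypothetical: it models replication positions as plain integers rather than DBMasterPos objects, and it assumes that timestamps are merged the same way and that the write index is the stored index plus one. Only the field names ('positions', 'timestamps', 'writeIndex') come from the FLD_* constants documented on this page.

```php
<?php
/**
 * Hypothetical stand-in for mergePositions(), using integer positions.
 *
 * @param array<string,mixed>|false $storedValue Current stored data, or false if none
 * @param array<string,int> $shutdownPositions Map of (DB master name => position)
 * @param array<string,float> $shutdownTimestamps Map of (DB cluster name => UNIX timestamp)
 * @param int|null &$cpIndex Receives the new position write index
 * @return array<string,mixed> Value to write back to the store
 */
function mergePositionsSketch($storedValue, array $shutdownPositions, array $shutdownTimestamps, &$cpIndex = null): array
{
    $stored = is_array($storedValue) ? $storedValue : [];
    $positions = $stored['positions'] ?? [];
    $timestamps = $stored['timestamps'] ?? [];

    foreach ($shutdownPositions as $masterName => $pos) {
        // Highest wins: never move a stored position backwards.
        if (!isset($positions[$masterName]) || $pos > $positions[$masterName]) {
            $positions[$masterName] = $pos;
        }
    }

    foreach ($shutdownTimestamps as $clusterName => $timestamp) {
        // Assumption: the most recent touch timestamp per cluster is kept.
        $timestamps[$clusterName] = max($timestamps[$clusterName] ?? 0.0, $timestamp);
    }

    // Assumption: each update bumps the write counter by one.
    $cpIndex = ($stored['writeIndex'] ?? 0) + 1;

    return [
        'positions' => $positions,
        'timestamps' => $timestamps,
        'writeIndex' => $cpIndex,
    ];
}
```

Keeping the higher position matters when two requests from the same client finish out of order: the later write to the store must not replace a newer position with an older one.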
Wikimedia\Rdbms\ChronologyProtector::setEnabled ( $enabled )
bool $enabled: Whether reading/writing session replication positions is enabled
Definition at line 230 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$enabled.
Wikimedia\Rdbms\ChronologyProtector::setLogger ( LoggerInterface $logger )
Definition at line 214 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$logger.
Referenced by Wikimedia\Rdbms\LBFactory\getChronologyProtector().
Wikimedia\Rdbms\ChronologyProtector::setMockTime ( &$time )
float|null &$time: Mock UNIX timestamp
Definition at line 562 of file ChronologyProtector.php.
Wikimedia\Rdbms\ChronologyProtector::setWaitEnabled ( $enabled )
bool $enabled: Whether session replication position wait barriers are enabled
Definition at line 238 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$enabled.
Wikimedia\Rdbms\ChronologyProtector::shutdown ( &$cpIndex = null )
Save any recorded "session consistency" DB replication positions to persistent storage.
int|null &$cpIndex: DB position key write counter; incremented on update
Definition at line 315 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$shutdownPositionsByMaster, and Wikimedia\Rdbms\ChronologyProtector\mergePositions().
Referenced by Wikimedia\Rdbms\LBFactory\shutdownChronologyProtector().
Wikimedia\Rdbms\ChronologyProtector::stageSessionReplicationPosition ( ILoadBalancer $lb )
Update the "session consistency" DB replication position for an end-of-life ILoadBalancer.
This records the replication position of the master DB if this request made writes to it using the provided ILoadBalancer instance.
ILoadBalancer $lb
Definition at line 283 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), Wikimedia\Rdbms\ChronologyProtector\getCurrentTime(), Wikimedia\Rdbms\ILoadBalancer\getReplicaResumePos(), Wikimedia\Rdbms\ILoadBalancer\getServerName(), Wikimedia\Rdbms\ILoadBalancer\getWriterIndex(), Wikimedia\Rdbms\ILoadBalancer\hasOrMadeRecentMasterChanges(), and Wikimedia\Rdbms\ILoadBalancer\hasStreamingReplicaServers().
Referenced by Wikimedia\Rdbms\LBFactory\shutdownChronologyProtector().
string Wikimedia\Rdbms\ChronologyProtector::$clientId [protected]
Hash of client parameters.
Definition at line 144 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getClientId().
string[] Wikimedia\Rdbms\ChronologyProtector::$clientLogInfo [protected]
Map of client information fields for logging.
Definition at line 146 of file ChronologyProtector.php.
bool Wikimedia\Rdbms\ChronologyProtector::$enabled = true [protected]
Whether reading/writing session consistency replication positions is enabled.
Definition at line 151 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\setEnabled(), and Wikimedia\Rdbms\ChronologyProtector\setWaitEnabled().
string Wikimedia\Rdbms\ChronologyProtector::$key [protected]
Storage key name.
Definition at line 142 of file ChronologyProtector.php.
LoggerInterface Wikimedia\Rdbms\ChronologyProtector::$logger [protected]
Definition at line 139 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\setLogger().
bool Wikimedia\Rdbms\ChronologyProtector::$positionWaitsEnabled = true [protected]
Whether waiting on DB servers to reach replication positions is enabled.
Definition at line 153 of file ChronologyProtector.php.
array<string, DBMasterPos> Wikimedia\Rdbms\ChronologyProtector::$shutdownPositionsByMaster = [] [protected]
Map of (DB master name => position)
Definition at line 160 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\shutdown().
array<string, float> Wikimedia\Rdbms\ChronologyProtector::$shutdownTimestampsByCluster = [] [protected]
Map of (DB cluster name => UNIX timestamp)
Definition at line 164 of file ChronologyProtector.php.
array<string, DBMasterPos> Wikimedia\Rdbms\ChronologyProtector::$startupPositionsByMaster = [] [protected]
Map of (DB master name => position)
Definition at line 158 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions().
float|null Wikimedia\Rdbms\ChronologyProtector::$startupTimestamp [protected]
UNIX timestamp when the client data was loaded.
Definition at line 155 of file ChronologyProtector.php.
array<string, float> Wikimedia\Rdbms\ChronologyProtector::$startupTimestampsByCluster = [] [protected]
Map of (DB cluster name => UNIX timestamp)
Definition at line 162 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
BagOStuff Wikimedia\Rdbms\ChronologyProtector::$store [protected]
Definition at line 137 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\__construct().
int|null Wikimedia\Rdbms\ChronologyProtector::$waitForPosIndex [protected]
Expected minimum index of the last write to the position store.
Definition at line 148 of file ChronologyProtector.php.
float|null Wikimedia\Rdbms\ChronologyProtector::$wallClockOverride [private]
Definition at line 167 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getCurrentTime().
const Wikimedia\Rdbms\ChronologyProtector::FLD_POSITIONS = 'positions' [private]
Definition at line 181 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup(), and Wikimedia\Rdbms\ChronologyProtector\mergePositions().
const Wikimedia\Rdbms\ChronologyProtector::FLD_TIMESTAMPS = 'timestamps' [private]
Definition at line 182 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup(), and Wikimedia\Rdbms\ChronologyProtector\mergePositions().
const Wikimedia\Rdbms\ChronologyProtector::FLD_WRITE_INDEX = 'writeIndex' [private]
Definition at line 183 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup(), and Wikimedia\Rdbms\ChronologyProtector\mergePositions().
const Wikimedia\Rdbms\ChronologyProtector::LOCK_TIMEOUT = 3 [private]
Lock timeout to use for key updates.
Definition at line 177 of file ChronologyProtector.php.
const Wikimedia\Rdbms\ChronologyProtector::LOCK_TTL = 6 [private]
Lock expiry to use for key updates.
Definition at line 179 of file ChronologyProtector.php.
const Wikimedia\Rdbms\ChronologyProtector::POSITION_COOKIE_TTL = 10
Seconds to store position write index cookies (safely less than POSITION_STORE_TTL)
Definition at line 170 of file ChronologyProtector.php.
const Wikimedia\Rdbms\ChronologyProtector::POSITION_INDEX_WAIT_TIMEOUT = 5 [private]
Max seconds to wait for positions write indexes to appear (e.g. replicate) in storage.
Definition at line 174 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup().
const Wikimedia\Rdbms\ChronologyProtector::POSITION_STORE_TTL = 60 [private]
Seconds to store replication positions.
Definition at line 172 of file ChronologyProtector.php.