MediaWiki master
Provide a given client with protection against visible database lag.
Inherits LoggerAwareInterface.
Public Member Functions | |
__construct ( $cpStash=null, $secret=null, $cliMode=null, $logger=null) | |
getClientId () | |
getSessionPrimaryPos (ILoadBalancer $lb) | |
Yield client "session consistency" replication position for a new ILoadBalancer. | |
getTouched (ILoadBalancer $lb) | |
Get the UNIX timestamp when the client last touched the DB, if they did so recently. | |
persistSessionReplicationPositions (&$clientPosIndex=null) | |
Persist any staged client "session consistency" replication positions. | |
setEnabled ( $enabled) | |
setLogger (LoggerInterface $logger) | |
setMockTime (&$time) | |
setRequestInfo (array $info) | |
stageSessionPrimaryPos (ILoadBalancer $lb) | |
Update client "session consistency" replication position for an end-of-life ILoadBalancer. | |
Static Public Member Functions | |
static | getCPInfoFromCookieValue (?string $value, int $minTimestamp) |
Parse a string conveying the client and write index of the chronology protector data. | |
static | makeCookieValueFromCPIndex (int $writeIndex, int $time, string $clientId) |
Build a string conveying the client and write index of the chronology protector data. | |
Public Attributes | |
const | POSITION_COOKIE_TTL = 10 |
Seconds to store position write index cookies (safely less than POSITION_STORE_TTL) | |
Protected Member Functions | |
getCurrentTime () | |
getStartupSessionPositions () | |
getStartupSessionTimestamps () | |
lazyStartup () | |
Load the stored replication positions and touch timestamps for the client. | |
mergePositions ( $storedValue, array $shutdownPositions, array $shutdownTimestamps, ?int &$clientPosIndex=null) | |
Merge the new replication positions with the currently stored ones (highest wins) | |
Protected Attributes | |
string | $clientId |
Hash of client parameters. | |
string[] | $clientLogInfo |
Map of client information fields for logging. | |
bool | $enabled = true |
Whether reading/writing session consistency replication positions is enabled. | |
string | $key |
Storage key name. | |
LoggerInterface | $logger |
array< string, DBPrimaryPos > | $shutdownPositionsByPrimary = [] |
Map of (primary server name => position) | |
array< string, float > | $shutdownTimestampsByCluster = [] |
Map of (DB cluster name => UNIX timestamp) | |
array< string, DBPrimaryPos > | $startupPositionsByPrimary = [] |
Map of (primary server name => position) | |
float|null | $startupTimestamp |
UNIX timestamp when the client data was loaded. | |
array< string, float > | $startupTimestampsByCluster = [] |
Map of (DB cluster name => UNIX timestamp) | |
int|null | $waitForPosIndex |
Expected minimum index of the last write to the position store. | |
Provide a given client with protection against visible database lag.
This class tries to hide visible effects of database lag. It does this by temporarily remembering the database positions after a client makes a write, and on their next web request we will prefer non-lagged database replicas. When replica connections are established, we wait up to a few seconds for sufficient replication to have occurred, if they were not yet caught up to that same point.
This ensures a consistent ordering of events as seen by a client. Kind of like Hawking's Chronology Protection Agency.
For performance and scalability reasons, almost all data is queried from replica databases. Only queries relating to writing data are sent to a primary database. When rendering a web page with content or activity feeds on it, the very latest information may thus not yet be there. That's okay in general, but if, for example, a client recently changed their preferences or submitted new data, we do our best to make sure their next web response reflects at least their own recent changes.
To explain how it works, we will look at an example lifecycle for a client.
A client is browsing the site. Their web requests are generally read-only and display data from database replicas, which may be a few seconds out of date if a client elsewhere in the world recently modified that same data. If the application is run from multiple data centers, then these web requests may be served from the nearest secondary DC.
A client performs a POST request, perhaps to publish an edit or change their preferences. This request is routed to the primary DC (this is the responsibility of infrastructure outside the web app). There, the data is saved to the primary database, after which the database host will asynchronously replicate this to its replicas in the same and any other DCs.
Toward the end of the response to this POST request, the application takes note of the primary database's current "position" and saves it under a "clientId" key in the ChronologyProtector store. The web response also sets two cookies that are similarly short-lived (about ten seconds): UseDC=master and cpPosIndex=<posIndex>@<write time>#<clientId>.
The ten-second window is meant to account for the time needed for the database writes to replicate across all active database replicas, including the cross-DC latency for those further away in any secondary DCs. The "clientId" is placed in the cookie to handle the case where the client's IP address frequently changes between web requests.
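The cpPosIndex cookie layout described above can be illustrated with a small round-trip. This is a minimal sketch, not the class's own code: the function names buildCpCookieValue() and parseCpCookieValue(), the returned array keys, and the assumption that the client ID is a lowercase hex string are all hypothetical.

```php
<?php
// Hypothetical stand-ins for illustration; not the MediaWiki implementation.

/** Build a value of the form "<posIndex>@<write time>#<clientId>". */
function buildCpCookieValue( int $writeIndex, int $time, string $clientId ): string {
	return "{$writeIndex}@{$time}#{$clientId}";
}

/**
 * Parse a cookie value. Returns null if the value is missing, malformed,
 * or was written before $minTimestamp (that is, the cookie is stale).
 *
 * @return array{index:int,time:int,clientId:string}|null
 */
function parseCpCookieValue( ?string $value, int $minTimestamp ): ?array {
	// Assumption: the client ID is a lowercase hexadecimal hash.
	if ( $value === null
		|| !preg_match( '/^(\d+)@(\d+)#([0-9a-f]+)$/', $value, $m )
	) {
		return null;
	}
	$time = (int)$m[2];
	if ( $time < $minTimestamp ) {
		return null;
	}
	return [ 'index' => (int)$m[1], 'time' => $time, 'clientId' => $m[3] ];
}

$now = 1700000000;
$cookie = buildCpCookieValue( 7, $now, 'a1b2c3' );
// With a ten-second lifetime, anything written before $now - 10 counts as expired.
$info = parseCpCookieValue( $cookie, $now - 10 );
```

Embedding the write time in the value lets the reader of the cookie reject stale values on its own, without relying on the browser to honor the cookie's expiry.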
Future web requests from the client should fall in one of two categories:
The store used by ChronologyProtector, as configured via $wgMicroStashType, should meet the following requirements:
These are the expectations a site administrator must meet for chronology protection:
Web requests that use the POST verb, or carry a UseDC=master cookie, must be routed to the primary DC only.
An exception is requests carrying the Promise-Non-Write-API-Action: true header, which use the POST verb for large read queries, but don't actually require the primary DC.
If you have legacy extensions deployed that perform queries on the primary database during GET requests, then you will have to either identify a way to route their relevant URLs to the primary DC as well, or accept that their reads do not enjoy chronology protection and that their writes may be slower (due to cross-DC latency). See T91820 for the Wikimedia Foundation's routing.
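The routing expectations above amount to a small decision rule. The following is a rough sketch of that rule only, since the routing itself lives in infrastructure outside the web app. The function name mustRouteToPrimaryDc() and its parameters are hypothetical. The sketch also assumes that a UseDC=master cookie takes precedence over the Promise-Non-Write-API-Action header, which the text does not spell out.

```php
<?php
/**
 * Hypothetical routing check, for example in an edge proxy test harness.
 *
 * @param string $method HTTP verb
 * @param array<string,string> $cookies Cookie name => value
 * @param array<string,string> $headers Lower-cased header name => value
 */
function mustRouteToPrimaryDc( string $method, array $cookies, array $headers ): bool {
	// Assumption: the sticky cookie wins even if the request promises not to write.
	if ( ( $cookies['UseDC'] ?? '' ) === 'master' ) {
		return true;
	}
	if ( strtoupper( $method ) === 'POST' ) {
		// POST used for a large read query may stay in the local DC.
		$promise = $headers['promise-non-write-api-action'] ?? '';
		return strtolower( $promise ) !== 'true';
	}
	return false;
}

$plainGet = mustRouteToPrimaryDc( 'GET', [], [] );
$plainPost = mustRouteToPrimaryDc( 'POST', [], [] );
$stickyGet = mustRouteToPrimaryDc( 'GET', [ 'UseDC' => 'master' ], [] );
$readOnlyPost = mustRouteToPrimaryDc(
	'POST', [], [ 'promise-non-write-api-action' => 'true' ]
);
```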
Definition at line 134 of file ChronologyProtector.php.
Wikimedia\Rdbms\ChronologyProtector::__construct ( $cpStash = null, $secret = null, $cliMode = null, $logger = null )
BagOStuff|null | $cpStash |
string|null | $secret | Secret string for HMAC hashing [optional]
bool|null | $cliMode | Whether the context is CLI; setting it to true disables CP
LoggerInterface|null | $logger |
Definition at line 205 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$logger.
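The $secret parameter is described as a secret string for HMAC hashing, and $clientId as a hash of client parameters. One plausible way such an ID could be derived is sketched below. Which client parameters go into the hash, the choice of SHA-256, and the function name makeClientId() are assumptions made for illustration, not details taken from this page.

```php
<?php
/**
 * Hypothetical derivation of a client ID from request parameters.
 * Assumption: the ID is keyed on IP address and user agent.
 */
function makeClientId( string $ip, string $userAgent, string $secret ): string {
	// A keyed hash keeps the ID stable per client but unguessable without the secret.
	return hash_hmac( 'sha256', $ip . "\n" . $userAgent, $secret );
}

$idA = makeClientId( '203.0.113.5', 'ExampleBrowser/1.0', 'site-secret' );
$idB = makeClientId( '203.0.113.5', 'ExampleBrowser/1.0', 'site-secret' );
$idC = makeClientId( '203.0.113.6', 'ExampleBrowser/1.0', 'site-secret' );
```

The same inputs always give the same ID, while a change in any input (or in the secret) gives a different one.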
Wikimedia\Rdbms\ChronologyProtector::getClientId ( )
Definition at line 272 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$clientId.
Wikimedia\Rdbms\ChronologyProtector::getCPInfoFromCookieValue ( ?string $value, int $minTimestamp ) | static
Parse a string conveying the client and write index of the chronology protector data.
string|null | $value | Value of "cpPosIndex" cookie |
int | $minTimestamp | Lowest UNIX timestamp that a non-expired value can have |
Definition at line 645 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$clientId.
Wikimedia\Rdbms\ChronologyProtector::getCurrentTime ( ) | protected
Definition at line 573 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\lazyStartup(), and Wikimedia\Rdbms\ChronologyProtector\stageSessionPrimaryPos().
Wikimedia\Rdbms\ChronologyProtector::getSessionPrimaryPos ( ILoadBalancer $lb )
Yield client "session consistency" replication position for a new ILoadBalancer.
If the stash has a previous primary position recorded, this will try to make sure that the next query to a replica server of that primary will see changes up to that position by delaying execution. The delay may time out and allow stale data if no non-lagged replica servers are available.
ILoadBalancer | $lb |
Definition at line 298 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), Wikimedia\Rdbms\ILoadBalancer\getServerName(), Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions(), and Wikimedia\Rdbms\ServerInfo\WRITER_INDEX.
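The "delay, then possibly time out and allow stale data" behavior described for this method can be pictured as a bounded polling loop. The sketch below is not the load balancer's actual wait logic. Positions are simplified to integers, and the function name waitForReplica() and its injected callables are hypothetical, chosen so the example runs without a database or real sleeping.

```php
<?php
/**
 * Hypothetical bounded wait for a replica to reach a target position.
 *
 * @param int $targetPos Position the client must be able to see
 * @param float $timeout Maximum seconds to wait
 * @param callable $getReplicaPos Returns the replica's current position (int)
 * @param callable $now Returns the current time in seconds (float)
 * @param callable $sleep Sleeps for the given number of seconds
 * @return bool True if caught up; false if the wait timed out (stale reads allowed)
 */
function waitForReplica(
	int $targetPos, float $timeout,
	callable $getReplicaPos, callable $now, callable $sleep
): bool {
	$deadline = $now() + $timeout;
	do {
		if ( $getReplicaPos() >= $targetPos ) {
			return true;
		}
		$sleep( 0.1 );
	} while ( $now() < $deadline );
	return false;
}

// Simulated clock and replica: each "sleep" advances time and replication.
$clock = 0.0;
$replicaPos = 95;
$caughtUp = waitForReplica(
	100, 1.0,
	function () use ( &$replicaPos ) {
		return $replicaPos;
	},
	function () use ( &$clock ) {
		return $clock;
	},
	function ( float $seconds ) use ( &$clock, &$replicaPos ) {
		$clock += $seconds;
		$replicaPos += 2;
	}
);
```

Returning false rather than throwing mirrors the documented trade-off: a lagged replica degrades to possibly stale data instead of failing the request.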
Wikimedia\Rdbms\ChronologyProtector::getStartupSessionPositions ( ) | protected
Definition at line 446 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$startupPositionsByPrimary, and Wikimedia\Rdbms\ChronologyProtector\lazyStartup().
Referenced by Wikimedia\Rdbms\ChronologyProtector\getSessionPrimaryPos().
Wikimedia\Rdbms\ChronologyProtector::getStartupSessionTimestamps ( ) | protected
Definition at line 455 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$startupTimestampsByCluster, and Wikimedia\Rdbms\ChronologyProtector\lazyStartup().
Referenced by Wikimedia\Rdbms\ChronologyProtector\getTouched().
Wikimedia\Rdbms\ChronologyProtector::getTouched ( ILoadBalancer $lb )
Get the UNIX timestamp when the client last touched the DB, if they did so recently.
ILoadBalancer | $lb |
Definition at line 416 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), and Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
Wikimedia\Rdbms\ChronologyProtector::lazyStartup ( ) | protected
Load the stored replication positions and touch timestamps for the client.
Definition at line 466 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\getCurrentTime().
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions(), and Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
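Wikimedia\Rdbms\ChronologyProtector::makeCookieValueFromCPIndex ( int $writeIndex, int $time, string $clientId ) | static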
Build a string conveying the client and write index of the chronology protector data.
int | $writeIndex | |
int | $time | UNIX timestamp; can be used to detect stale cookies (T190082) |
string | $clientId | Client ID hash from ILBFactory::shutdown() |
Definition at line 628 of file ChronologyProtector.php.
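Wikimedia\Rdbms\ChronologyProtector::mergePositions ( $storedValue, array $shutdownPositions, array $shutdownTimestamps, ?int &$clientPosIndex = null ) | protected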
Merge the new replication positions with the currently stored ones (highest wins)
array<string,mixed>|false | $storedValue | Current replication position data
array<string,DBPrimaryPos> | $shutdownPositions | New replication positions
array<string,float> | $shutdownTimestamps | New DB post-commit shutdown timestamps
int|null | &$clientPosIndex | New position write index
Definition at line 528 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\persistSessionReplicationPositions().
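A "highest wins" merge of stored and new positions might look roughly like the following. This is a simplified sketch: real positions are DBPrimaryPos objects, which are replaced here by plain integers, and the stored value's layout (the 'positions', 'timestamps', and 'writeIndex' keys) and the function name mergeHighestWins() are hypothetical.

```php
<?php
/**
 * Hypothetical merge where the more advanced position per primary is kept.
 *
 * @param array|false $storedValue Previously stored data, or false if none
 * @param array<string,int> $newPositions Primary server name => position
 * @param array<string,float> $newTimestamps DB cluster name => UNIX timestamp
 * @param int|null &$clientPosIndex Set to the new write index
 * @return array Merged value to store
 */
function mergeHighestWins(
	$storedValue, array $newPositions, array $newTimestamps, ?int &$clientPosIndex = null
): array {
	$stored = is_array( $storedValue ) ? $storedValue : [];
	$positions = $stored['positions'] ?? [];
	$timestamps = $stored['timestamps'] ?? [];

	foreach ( $newPositions as $primary => $pos ) {
		// Never move a stored position backwards.
		if ( !isset( $positions[$primary] ) || $pos > $positions[$primary] ) {
			$positions[$primary] = $pos;
		}
	}
	foreach ( $newTimestamps as $cluster => $ts ) {
		$timestamps[$cluster] = max( $timestamps[$cluster] ?? 0.0, $ts );
	}

	// Assumption: the write index is a counter bumped on every update.
	$clientPosIndex = ( $stored['writeIndex'] ?? 0 ) + 1;

	return [
		'positions' => $positions,
		'timestamps' => $timestamps,
		'writeIndex' => $clientPosIndex,
	];
}

$stored = [
	'positions' => [ 'db1' => 120, 'db2' => 300 ],
	'timestamps' => [ 'c1' => 1000.0 ],
	'writeIndex' => 4,
];
$index = null;
$merged = mergeHighestWins(
	$stored, [ 'db1' => 150, 'db2' => 250 ], [ 'c1' => 1005.5 ], $index
);
```

Keeping the higher position matters when two requests from the same client finish out of order: the later store write must not undo the progress recorded by the earlier one.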
Wikimedia\Rdbms\ChronologyProtector::persistSessionReplicationPositions ( &$clientPosIndex = null )
Persist any staged client "session consistency" replication positions.
int|null | &$clientPosIndex | DB position key write counter; incremented on update |
Definition at line 361 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$shutdownPositionsByPrimary, and Wikimedia\Rdbms\ChronologyProtector\mergePositions().
Wikimedia\Rdbms\ChronologyProtector::setEnabled ( $enabled )
bool | $enabled | Whether reading/writing session replication positions is enabled |
Definition at line 281 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$enabled.
Wikimedia\Rdbms\ChronologyProtector::setLogger ( LoggerInterface $logger )
Definition at line 263 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ChronologyProtector\$logger.
Wikimedia\Rdbms\ChronologyProtector::setMockTime ( &$time )
float|null | &$time | Mock UNIX timestamp |
Definition at line 589 of file ChronologyProtector.php.
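Because $time is taken by reference, a test can presumably advance the mock clock simply by changing its own variable. The following sketch shows that general PHP pattern in isolation. The class name MockableClock and its property are hypothetical and only stand in for the real class.

```php
<?php
// Hypothetical stand-in showing a reference-bound mock clock.
class MockableClock {
	/** @var float|null Bound by reference to the caller's variable */
	private $mockTime = null;

	/** @param float|null &$time Mock UNIX timestamp */
	public function setMockTime( &$time ): void {
		// Bind by reference so later changes by the caller are visible here.
		$this->mockTime =& $time;
	}

	public function getCurrentTime(): float {
		// Fall back to the wall clock when no mock time is set.
		return $this->mockTime ?? microtime( true );
	}
}

$clock = new MockableClock();
$fakeNow = 1700000000.0;
$clock->setMockTime( $fakeNow );
$before = $clock->getCurrentTime();
$fakeNow += 5.0; // Advance time without calling the clock object again.
$after = $clock->getCurrentTime();
```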
Wikimedia\Rdbms\ChronologyProtector::setRequestInfo ( array $info )
Definition at line 255 of file ChronologyProtector.php.
Wikimedia\Rdbms\ChronologyProtector::stageSessionPrimaryPos ( ILoadBalancer $lb )
Update client "session consistency" replication position for an end-of-life ILoadBalancer.
This records the replication position of the primary DB if this request made writes to it using the provided ILoadBalancer instance.
ILoadBalancer | $lb |
Definition at line 328 of file ChronologyProtector.php.
References Wikimedia\Rdbms\ILoadBalancer\getClusterName(), Wikimedia\Rdbms\ChronologyProtector\getCurrentTime(), Wikimedia\Rdbms\ILoadBalancer\getPrimaryPos(), Wikimedia\Rdbms\ILoadBalancer\getServerName(), Wikimedia\Rdbms\ILoadBalancer\hasOrMadeRecentPrimaryChanges(), Wikimedia\Rdbms\ILoadBalancer\hasStreamingReplicaServers(), and Wikimedia\Rdbms\ServerInfo\WRITER_INDEX.
string Wikimedia\Rdbms\ChronologyProtector::$clientId | protected
Hash of client parameters.
Definition at line 148 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getClientId(), and Wikimedia\Rdbms\ChronologyProtector\getCPInfoFromCookieValue().
string[] Wikimedia\Rdbms\ChronologyProtector::$clientLogInfo | protected
Map of client information fields for logging.
Definition at line 150 of file ChronologyProtector.php.
bool Wikimedia\Rdbms\ChronologyProtector::$enabled = true | protected
Whether reading/writing session consistency replication positions is enabled.
Definition at line 155 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\setEnabled().
string Wikimedia\Rdbms\ChronologyProtector::$key | protected
Storage key name.
Definition at line 146 of file ChronologyProtector.php.
LoggerInterface Wikimedia\Rdbms\ChronologyProtector::$logger | protected
Definition at line 143 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\__construct(), and Wikimedia\Rdbms\ChronologyProtector\setLogger().
array<string,DBPrimaryPos> Wikimedia\Rdbms\ChronologyProtector::$shutdownPositionsByPrimary = [] | protected
Map of (primary server name => position)
Definition at line 162 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\persistSessionReplicationPositions().
array<string,float> Wikimedia\Rdbms\ChronologyProtector::$shutdownTimestampsByCluster = [] | protected
Map of (DB cluster name => UNIX timestamp)
Definition at line 166 of file ChronologyProtector.php.
array<string,DBPrimaryPos> Wikimedia\Rdbms\ChronologyProtector::$startupPositionsByPrimary = [] | protected
Map of (primary server name => position)
Definition at line 160 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionPositions().
float|null Wikimedia\Rdbms\ChronologyProtector::$startupTimestamp | protected
UNIX timestamp when the client data was loaded.
Definition at line 157 of file ChronologyProtector.php.
array<string,float> Wikimedia\Rdbms\ChronologyProtector::$startupTimestampsByCluster = [] | protected
Map of (DB cluster name => UNIX timestamp)
Definition at line 164 of file ChronologyProtector.php.
Referenced by Wikimedia\Rdbms\ChronologyProtector\getStartupSessionTimestamps().
int|null Wikimedia\Rdbms\ChronologyProtector::$waitForPosIndex | protected
Expected minimum index of the last write to the position store.
Definition at line 152 of file ChronologyProtector.php.
const Wikimedia\Rdbms\ChronologyProtector::POSITION_COOKIE_TTL = 10
Seconds to store position write index cookies (safely less than POSITION_STORE_TTL)
Definition at line 185 of file ChronologyProtector.php.