MediaWiki REL1_41
FileRepo Architecture

Some quick notes on the file/repository architecture.

Functionality is, as always, driven by data model.

  • The repository object stores configuration information about a file storage method.
  • The file object is a process-local cache of information about a particular file.

Thus the file object is the primary public entry point for obtaining information about files, since access via the file object can be cached, whereas access via the repository should not be cached.
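
As a rough illustration of this split, here is a minimal sketch in Python rather than MediaWiki's PHP. The method names (fetch_size, get_size), the lookups counter and the in-memory dictionary standing in for a storage backend are all assumptions made for the example, not part of the real classes.

```python
class Repo:
    """Configuration for one storage method; does no caching itself."""

    def __init__(self, name, sizes):
        self.name = name
        self._sizes = sizes   # hypothetical stand-in for the storage backend
        self.lookups = 0      # counts uncached backend hits

    def fetch_size(self, title):
        # Every call goes to the backend: access via the repo is not cached.
        self.lookups += 1
        return self._sizes[title]


class File:
    """Process-local cache of information about one particular file."""

    def __init__(self, repo, title):
        self.repo = repo
        self.title = title
        self._size = None     # filled in on first access

    def get_size(self):
        # Ask the repo once, then answer from the cached value.
        if self._size is None:
            self._size = self.repo.fetch_size(self.title)
        return self._size


repo = Repo("local", {"Example.png": 2048})
file = File(repo, "Example.png")
sizes = [file.get_size(), file.get_size(), file.get_size()]
```

Three reads through the file object cost a single backend lookup here, which is the reason callers are pointed at the file object rather than the repository.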

Functions which can act on any file specified in their parameters typically find their place either in the repository object, where reference to repository-specific configuration is needed, or in static members of File or FileRepo, where no such configuration is needed.
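
The placement rule might look like the following Python sketch. Both is_valid_name and path_for are hypothetical names chosen for illustration, and the validity check is deliberately simplistic.

```python
class FileRepo:
    def __init__(self, root):
        self.root = root  # repository-specific configuration

    @staticmethod
    def is_valid_name(name):
        # Needs no repository configuration, so it can be a static member.
        return name != "" and "/" not in name

    def path_for(self, name):
        # Depends on this repository's configured root, so it belongs
        # on the repository object.
        return f"{self.root}/{name}"


ok = FileRepo.is_valid_name("Example.png")
bad = FileRepo.is_valid_name("dir/Example.png")
path = FileRepo("/srv/images").path_for("Example.png")
```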

File objects are generated by a factory function from the repository. The repository thus has full control over the behavior of its subsidiary file class, since it can subclass the file class and override functionality at its whim. Thus there is no need for the File subclass to query its parent repository for information about repository-class-dependent behavior – the file subclass is generally fully aware of the static preferences of its repository. Limited exceptions can be made to this rule to permit sharing of functions, or perhaps even entire classes, between repositories.
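
A minimal sketch of the factory arrangement, again in Python and under stated assumptions: the file_class attribute, new_file and is_local are invented for the example, and only the class names come from the text above.

```python
class File:
    def __init__(self, repo, title):
        self.repo = repo
        self.title = title

    def is_local(self):
        return False


class FileRepo:
    file_class = File  # which file class this repository hands out

    def new_file(self, title):
        # Factory function: the repository decides the concrete class.
        return self.file_class(self, title)


class LocalFile(File):
    def is_local(self):
        # The subclass already knows its repository's static preferences,
        # so it does not need to ask the repository.
        return True


class LocalRepo(FileRepo):
    file_class = LocalFile  # override the subsidiary file class


generic = FileRepo().new_file("Example.png")
local = LocalRepo().new_file("Example.png")
```

Callers only ever ask the repository for a file; which subclass they get back, and therefore which behaviour, is decided entirely by the repository.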

These rules alone still leave some ambiguity – it may not be clear whether to implement some functionality in a repository function with a filename parameter, or in the file object itself.

So we introduce the following rule: the file subclass is smarter than the repository subclass. The repository should in general provide a minimal API needed to access the storage backend efficiently.

In particular, note that I have not implemented any database access in LocalRepo.php. LocalRepo provides only file access, and LocalFile provides database access and higher-level functions such as cache management.
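
The division of labour could be sketched as follows. This is a Python illustration under stated assumptions, not the actual LocalRepo or LocalFile code: store, read, load, upload and purge_cache are hypothetical names, and plain dictionaries stand in for the storage backend and the database.

```python
class LocalRepo:
    """Minimal storage API only: no database access here."""

    def __init__(self):
        self._blobs = {}  # hypothetical stand-in for the storage backend

    def store(self, name, data):
        self._blobs[name] = data

    def read(self, name):
        return self._blobs[name]


class LocalFile:
    """Higher-level logic: database rows and cache management."""

    def __init__(self, repo, name, db):
        self.repo = repo
        self.name = name
        self.db = db      # hypothetical stand-in for a table: name -> row
        self._row = None  # process-local cache of the database row

    def load(self):
        if self._row is None:
            self._row = self.db[self.name]
        return self._row

    def purge_cache(self):
        self._row = None

    def upload(self, data, uploader):
        # The file object coordinates storage, database and cache;
        # the repository is only asked to store the bytes.
        self.repo.store(self.name, data)
        self.db[self.name] = {"size": len(data), "uploader": uploader}
        self.purge_cache()


db = {}
repo = LocalRepo()
f = LocalFile(repo, "Example.png", db)
f.upload(b"abcd", "Alice")
first = f.load()
f.upload(b"abcdefgh", "Bob")
second = f.load()
```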

Tim Starling, June 2007