blob: 8d10c3d0a681bd04ff26a71702d80d4735e57a10 [file] [log] [blame]
This file will describe the design, layouts, and file formats of a
libsvn_fs_x repository.
Since FSX is still in a very early phase of its development, all sections
either subject to major change or simply "TBD".
Design
------
TBD.
Similar to FSFS format 7 but using a radically different on-disk format.
In FSFS, each committed revision is represented as an immutable file
containing the new node-revisions, contents, and changed-path
information for the revision, plus a second, changeable file
containing the revision properties.
To reduce the size of the on-disk representation, revision data gets
packed, i.e. multiple revision files get combined into a single pack
file of smaller total size. The same strategy is applied to revprops.
In-progress transactions are represented with a prototype rev file
containing only the new text representations of files (appended to as
changed file contents come in), along with a separate file for each
node-revision, directory representation, or property representation
which has been changed or added in the transaction. During the final
stage of the commit, these separate files are marshalled onto the end
of the prototype rev file to form the immutable revision file.
Layout of the FS directory
--------------------------
The layout of the FS directory (the "db" subdirectory of the
repository) is:
revs/ Subdirectory containing revs
<shard>/ Shard directory, if sharding is in use (see below)
<revnum> File containing rev <revnum>
<shard>.pack/ Pack directory, if the repo has been packed (see below)
pack Pack file, if the repository has been packed (see below)
manifest Pack manifest file, if a pack file exists (see below)
revprops/ Subdirectory containing rev-props
<shard>/ Shard directory, if sharding is in use (see below)
<revnum> File containing rev-props for <revnum>
<shard>.pack/ Pack directory, if the repo has been packed (see below)
<rev>.<count> Pack file, if the repository has been packed (see below)
manifest Pack manifest file, if a pack file exists (see below)
revprops.db SQLite database of the packed revision properties
transactions/ Subdirectory containing transactions
<txnid>.txn/ Directory containing transaction <txnid>
txn-protorevs/ Subdirectory containing transaction proto-revision files
<txnid>.rev Proto-revision file for transaction <txnid>
<txnid>.rev-lock Write lock for proto-rev file
txn-current File containing the next transaction key
locks/ Subdirectory containing locks
<partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest
<digest> File containing locks/children for path with <digest>
current File specifying current revision and next node/copy id
fs-type File identifying this filesystem as an FSFS filesystem
write-lock Empty file, locked to serialise writers
pack-lock Empty file, locked to serialise 'svnadmin pack' (f. 7+)
txn-current-lock Empty file, locked to serialise 'txn-current'
uuid File containing the UUID of the repository
format File containing the format number of this filesystem
fsx.conf Configuration file
min-unpacked-rev File containing the oldest revision not in a pack file
min-unpacked-revprop File containing the oldest revision of unpacked revprop
rep-cache.db SQLite database mapping rep checksums to locations
Files in the revprops directory are in the hash dump format used by
svn_hash_write.
The format of the "current" file is a single line of the form
"<youngest-revision>\n" giving the youngest revision for the
repository.
The "write-lock" file is an empty file which is locked before the
final stage of a commit and unlocked after the new "current" file has
been moved into place to indicate that a new revision is present. It
is also locked during a revprop propchange while the revprop file is
read in, mutated, and written out again. Furthermore, it will be used
to serialize the repository structure changes during 'svnadmin pack'
(see also next section). Note that readers are never blocked by any
operation - writers must ensure that the filesystem is always in a
consistent state.
The "pack-lock" file is an empty file which is locked before an 'svnadmin
pack' operation commences. Thus, only one process may attempt to modify
the repository structure at a time while other processes may still read
and write (commit) to the repository during most of the pack procedure.
It is only available with format 7 and newer repositories. Older formats
use the global write-lock instead which disables commits completely
for the duration of the pack process.
The "txn-current" file is a file with a single line of text that
contains only a base-36 number. The current value will be used in the
next transaction name, along with the revision number the transaction
is based on. This sequence number ensures that transaction names are
not reused, even if the transaction is aborted and a new transaction
based on the same revision is begun. The only operation that FSFS
performs on this file is "get and increment"; the "txn-current-lock"
file is locked during this operation.
"fsx.conf" is a configuration file in the standard Subversion/Python
config format. It is automatically generated when you create a new
repository; read the generated file for details on what it controls.
When representation sharing is enabled, the filesystem tracks
representation checksum and location mappings using a SQLite database in
"rep-cache.db". The database has a single table, which stores the sha1
hash text as the primary key, mapped to the representation revision, offset,
size and expanded size. This file is only consulted during writes and never
during reads. Consequently, it is not required, and may be removed at an
abritrary time, with the subsequent loss of rep-sharing capabilities for
revisions written thereafter.
Filesystem formats
------------------
TBD.
The "format" file defines what features are permitted within the
filesystem, and indicates changes that are not backward-compatible.
It serves the same purpose as the repository file of the same name.
So far, there is only format 1.
Node-revision IDs
-----------------
A node-rev ID consists of the following three fields:
node_revision_id ::= node_id '.' copy_id '.' txn_id
At this level, the form of the ID is the same as for BDB - see the
section called "ID's" in <../libsvn_fs_base/notes/structure>.
In order to support efficient lookup of node-revisions by their IDs
and to simplify the allocation of fresh node-IDs during a transaction,
we treat the fields of a node-rev ID in new and interesting ways.
Within a new transaction:
New node-revision IDs assigned within a transaction have a txn-id
field of the form "t<txnid>".
When a new node-id or copy-id is assigned in a transaction, the ID
used is a "_" followed by a base36 number unique to the transaction.
Within a revision:
Within a revision file, node-revs have a txn-id field of the form
"r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII
decimal) number of bytes from the start of the revision file to the
start of the node-rev.
During the final phase of a commit, node-revision IDs are rewritten
to have repository-wide unique node-ID and copy-ID fields, and to have
"r<rev>/<offset>" txn-id fields.
This uniqueness is done by changing a temporary
id of "_<base36>" to "<base36>-<rev>". Note that this means that the
originating revision of a line of history or a copy can be determined
by looking at the node ID.
The temporary assignment of node-ID and copy-ID fields has
implications for svn_fs_compare_ids and svn_fs_check_related. The ID
_1.0.t1 is not related to the ID _1.0.t2 even though they have the
same node-ID, because temporary node-IDs are restricted in scope to
the transactions they belong to.
Copy-IDs and copy roots
-----------------------
Copy-IDs are assigned in the same manner as they are in the BDB
implementation:
* A node-rev resulting from a creation operation (with no copy
history) receives the copy-ID of its parent directory.
* A node-rev resulting from a copy operation receives a fresh
copy-ID, as one would expect.
* A node-rev resulting from a modification operation receives a
copy-ID depending on whether its predecessor derives from a
copy operation or whether it derives from a creation operation
with no intervening copies:
- If the predecessor does not derive from a copy, the new
node-rev receives the copy-ID of its parent directory. If the
node-rev is being modified through its created-path, this will
be the same copy-ID as the predecessor node-rev has; however,
if the node-rev is being modified through a copied ancestor
directory (i.e. we are performing a "lazy copy"), this will be
a different copy-ID.
- If the predecessor derives from a copy and the node-rev is
being modified through its created-path, the new node-rev
receives the copy-ID of the predecessor.
- If the predecessor derives from a copy and the node-rev is not
being modified through its created path, the new node-rev
receives a fresh copy-ID. This is called a "soft copy"
operation, as distinct from a "true copy" operation which was
actually requested through the svn_fs interface. Soft copies
exist to ensure that the same <node-ID,copy-ID> pair is not
used twice within a transaction.
Unlike the BDB implementation, we do not have a "copies" table.
Instead, each node-revision record contains a "copyroot" field
identifying the node-rev resulting from the true copy operation most
proximal to the node-rev. If the node-rev does not itself derive from
a copy operation, then the copyroot field identifies the copy of an
ancestor directory; if no ancestor directories derive from a copy
operation, then the copyroot field identifies the root directory of
rev 0.
Revision file format
--------------------
TBD
A revision file contains a concatenation of various kinds of data:
* Text and property representations
* Node-revisions
* The changed-path data
That data is aggregated in compressed containers with a binary on-disk
representation.
Transaction layout
------------------
A transaction directory has the following layout:
props Transaction props
props-final Final transaction props (optional)
next-ids Next temporary node-ID and copy-ID
changes Changed-path information so far
node.<nid>.<cid> New node-rev data for node
node.<nid>.<cid>.props Props for new node-rev, if changed
node.<nid>.<cid>.children Directory contents for node-rev
<sha1> Text representation of that sha1
txn-protorevs/rev Prototype rev file with new text reps
txn-protorevs/rev-lock Lockfile for writing to the above
The prototype rev file is used to store the text representations as
they are received from the client. To ensure that only one client is
writing to the file at a given time, the "rev-lock" file is locked for
the duration of each write.
The three kinds of props files are all in hash dump format. The "props"
file will always be present. The "node.<nid>.<cid>.props" file will
only be present if the node-rev properties have been changed. The
"props-final" only exists while converting the transaction into a revision.
The <sha1> files' content is that of text rep references:
"<rev> <offset> <length> <size> <digest>"
They will be written for text reps in the current transaction and be
used to eliminate duplicate reps within that transaction.
The "next-ids" file contains a single line "<next-temp-node-id>
<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
assignments (without the leading underscores). The next node-ID is
also used as a uniquifier for representations which may share the same
underlying rep.
The "children" file for a node-revision begins with a copy of the hash
dump representation of the directory entries from the old node-rev (or
a dump of the empty hash for new directories), and then an incremental
hash dump entry for each change made to the directory.
The "changes" file contains changed-path entries in the same form as
the changed-path entries in a rev file, except that <id> and <action>
may both be "reset" (in which case <text-mod> and <prop-mod> are both
always "false") to indicate that all changes to a path should be
considered undone. Reset entries are only used during the final merge
phase of a transaction. Actions in the "changes" file always contain
a node kind.
The node-rev files have the same format as node-revs in a revision
file, except that the "text" and "props" fields are augmented as
follows:
* The "props" field may have the value "-1" if properties have
been changed and are contained in a "props" file within the
node-rev subdirectory.
* For directory node-revs, the "text" field may have the value
"-1" if entries have been changed and are contained in a
"contents" file in the node-rev subdirectory.
* For the directory node-rev representing the root of the
transaction, the "is-fresh-txn-root" field indicates that it has
not been made mutable yet (see Issue #2608).
* For file node-revs, the "text" field may have the value "-1
<offset> <length> <size> <digest>" if the text representation is
within the prototype rev file.
* The "copyroot" field may have the value "-1 <created-path>" if the
copy root of the node-rev is part of the transaction in process.
Locks layout
------------
Locks in FSX are stored in serialized hash format in files whose
names are MD5 digests of the FS path which the lock is associated
with. For the purposes of keeping directory inode usage down, these
digest files live in subdirectories of the main lock directory whose
names are the first 3 characters of the digest filename.
Also stored in the digest file for a given FS path are pointers to
other digest files which contain information associated with other FS
paths that are beneath our path (an immediate child thereof, or a
grandchild, or a great-grandchild, ...).
To answer the question, "Does path FOO have a lock associated with
it?", one need only generate the MD5 digest of FOO's
absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
for a file located like so:
/path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8
And then see if that file contains lock information.
To inquire about locks on children of the path FOO, you would
reference the same path as above, but look for a list of children in
that file (instead of lock information). Children are listed as MD5
digests, too, so you would simply iterate over those digests and
consult the files they reference for lock information.