|  | = How to interpret Subversion dumpfiles = | 
|  |  | 
|  | Version 1.1, 2013-02-02 | 
|  |  | 
|  | == Introduction == | 
|  |  | 
|  | The Subversion dumpfile format is a serialized description of the | 
|  | actions required to (re)build a version history from scratch. | 
|  |  | 
|  | This document aims to give people writing dumpfile interpreters | 
|  | enough information to emulate the actions a dumpfile describes on | 
|  | a versioned filesystem-like store, such as another version-control | 
|  | system.  It derives from and incorporates some incomplete notes from | 
|  | before r39883. | 
|  |  | 
|  | === Unresolved questions === | 
|  |  | 
|  | 1. In interpreting a Node record which has both a copyfrom source and | 
|  | a property section, it is possible that the copy source node itself | 
|  | has a property section.  How are they to be combined? | 
|  |  | 
|  | 2. The section on the semantics of kinds of operations documents a | 
|  | minor bug at r39883 in the behavior of "add".  Has this been fixed? | 
|  |  | 
|  | Portions of text relevant to these questions are tagged with FIXME. | 
|  |  | 
|  | == Syntax == | 
|  |  | 
|  | === Encoding and delimiters === | 
|  |  | 
|  | Subversion dumpfiles are plain byte streams. The structural parts are | 
|  | ASCII.  Text sections and property key/value pairs may be interpreted | 
|  | as binary data in any encoding by client tools. | 
|  |  | 
|  | A dumpfile consists of four kinds of records.  A record is a group of | 
|  | RFC822-style header lines (each consisting of a key, followed by a | 
|  | colon, followed by text data to end of line), followed by an empty | 
|  | spacer line, followed optionally by a body section.  If the body | 
|  | section is present, another empty spacer line separates it from the | 
|  | following record. | 
|  |  | 
|  | For forward compatibility, unrecognized headers are ignored. | 
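
The header layout described above can be sketched in Python as follows. This is an illustrative sketch, not the reference implementation: the names `read_headers`, `recognized`, and `KNOWN_HEADERS` are hypothetical, and decoding header values as UTF-8 is an assumption.

```python
import io

def read_headers(stream):
    """Read header lines up to the empty spacer line into a dict."""
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"", b"\n"):  # end of stream, or the empty spacer line
            break
        key, _, value = line.rstrip(b"\n").partition(b": ")
        # Assumption: header values (including paths) decode as UTF-8.
        headers[key.decode("ascii")] = value.decode("utf-8")
    return headers

# Hypothetical set of headers this interpreter understands.
KNOWN_HEADERS = {"Revision-number", "Prop-content-length", "Content-length"}

def recognized(headers):
    """Drop headers the interpreter does not know, for forward compatibility."""
    return {k: v for k, v in headers.items() if k in KNOWN_HEADERS}

sample = io.BytesIO(
    b"Revision-number: 1422\n"
    b"X-Future-Header: ignored\n"
    b"Content-length: 80\n"
    b"\n"
)
parsed = recognized(read_headers(sample))
```

Here `parsed` keeps only the two recognized headers; the made-up `X-Future-Header` line is silently dropped rather than treated as an error.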
|  |  | 
|  | === Record types === | 
|  |  | 
|  | Dumpfiles include four record types.  Two, the version stamp and UUID | 
|  | record, consist of single header lines. The bulk of a dumpfile | 
|  | consists of Revision and Node records. | 
|  |  | 
|  | ==== Version stamp records ==== | 
|  |  | 
|  | A version stamp record is always the first line of the file and | 
|  | looks like this: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | SVN-fs-dump-format-version: <N>\n | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | where <N> is replaced by the dump format version. Except where | 
|  | specified, the descriptions in this document apply to all | 
|  | versions of the format. | 
|  |  | 
|  | ==== UUID records ==== | 
|  |  | 
|  | Versions 2 and later may have a UUID record following the version | 
|  | stamp. It is of the form | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | UUID: <hex-string> | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | where the <hex-string> is the UUID of the originating repository. | 
|  | An example UUID is "7bf7a5ef-cabf-0310-b7d4-93df341afa7e". | 
|  |  | 
|  | ==== Revision records ==== | 
|  |  | 
|  | A Revision record has up to three headers and is usually followed by a | 
|  | property section.  Expect the following form and sequence: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | Revision-number: <N> | 
|  | [Prop-content-length: <P>] | 
|  | Content-length: <L> | 
|  | ! | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | with the Revision-number header always first and the '!' indicating | 
|  | a mandatory empty spacer line.  <P> gives the length in bytes of the | 
|  | following property section. <L> gives the body length of the entire | 
|  | Revision record.  These two numbers will be *identical* for a Revision | 
|  | record; the Content-length header is added for the benefit of software | 
|  | that can parse RFC822 messages. | 
|  |  | 
|  | A revision record is followed by one or more Node records (see below). | 
|  |  | 
|  | ==== Node records ==== | 
|  |  | 
|  | Node records have the following sequence of header lines: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | Node-path: <path/to/node/in/filesystem> | 
|  | [Node-kind: {file | dir}] | 
|  | Node-action: {change | add | delete | replace} | 
|  | [Node-copyfrom-rev: <rev>] | 
|  | [Node-copyfrom-path: <path>] | 
|  | [Text-copy-source-md5: <blob>] | 
|  | [Text-copy-source-sha1: <blob>] | 
|  | [Text-content-md5: <blob>] | 
|  | [Text-content-sha1: <blob>] | 
|  | [Text-content-length: <T>] | 
|  | [Prop-content-length: <P>] | 
|  | [Content-length: <Y>] | 
|  | ! | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | Bracketing in [] indicates optional lines; { | } is an alternation group. | 
|  |  | 
|  | Dump decoders should be prepared for the optional lines after | 
|  | Node-action to be in any order, except that Content-length is | 
|  | always last if it is present. | 
|  |  | 
|  | A Node record describes an action on a path relative to the repository | 
|  | root, and always begins with the Node-path specification. | 
|  |  | 
|  | The Node-kind line indicates whether the path is a file or directory. | 
|  | The header value will be one of the strings "file" or "dir". | 
|  | This header may be (and usually is) absent if the node action is a delete. | 
|  |  | 
|  | The Node-action line is always present and specifies the type of | 
|  | operation for this node.  The header value is one of the strings | 
|  | "change", "add", "delete", or "replace".  These operations will be | 
|  | described in detail later in this document. | 
|  |  | 
|  | Either both the Node-copyfrom-rev and Node-copyfrom-path lines will be | 
|  | present, or neither will be.  They pair to describe a copy source for | 
|  | the node. Copy-source semantics will be described in detail later in | 
|  | this document. | 
|  |  | 
|  | The Text-content-{md5,sha1} and Text-copy-source-{md5,sha1} lines are | 
|  | hash integrity checks and will be present only if Text-content-length | 
|  | and the copyfrom pair (respectively) are also present. A decoder may | 
|  | use them to verify that the source content they refer to has not been | 
|  | corrupted. | 
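
A decoder might check the Text-content hashes along these lines. This sketch assumes the `<blob>` values are hexadecimal digests and that the record is not a delta (so the digest covers the text section bytes as written); `verify_text` is a hypothetical helper name.

```python
import hashlib

def verify_text(headers, text):
    """Return True when every Text-content-* checksum present matches `text`.

    Assumption: checksum header values are hexadecimal digests.
    """
    checks = (
        ("Text-content-md5", hashlib.md5),
        ("Text-content-sha1", hashlib.sha1),
    )
    for name, algorithm in checks:
        expected = headers.get(name)
        if expected is not None and algorithm(text).hexdigest() != expected.lower():
            return False
    return True  # absent checksum headers are simply not checked
```

The same pattern would apply to the Text-copy-source hashes, computed over the content of the copy source instead of the record's own text section.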
|  |  | 
|  | Text-content-length will be present only when there is a text section. | 
|  | Zero is a legal value for this length, indicating an empty file. | 
|  |  | 
|  | Prop-content-length will be present only when there is a properties section. | 
|  |  | 
|  | Content-length will be present if there is either a text or a | 
|  | properties section.  This is not always the case.  In particular, | 
|  | a delete operation cannot have either.  Some other operations that use | 
|  | copyfrom sources may also not have either. | 
|  |  | 
|  | Again, the '!' stands in for a mandatory empty line following the | 
|  | RFC822-style headers. A body may follow. | 
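
Once the headers are parsed, the body can be divided using the length headers. The sketch below assumes the property section precedes the text section and that Content-length is the sum of the two lengths, as in the example later in this document; `split_body` is a hypothetical name and `headers` is assumed to be a dict of header name to string value.

```python
def split_body(headers, body):
    """Split a record body into (props, text); each is bytes or None."""
    prop_len = headers.get("Prop-content-length")
    text_len = headers.get("Text-content-length")
    content_len = int(headers.get("Content-length", 0))
    if len(body) != content_len:
        raise ValueError("body length does not match Content-length")
    offset = 0
    props = text = None
    if prop_len is not None:
        # Assumption: the property section comes first in the body.
        props = body[:int(prop_len)]
        offset = int(prop_len)
    if text_len is not None:
        text = body[offset:offset + int(text_len)]
        offset += int(text_len)
    if offset != content_len:
        raise ValueError("section lengths do not add up to Content-length")
    return props, text
```

Returning `None` for an absent section and `b""` for a zero-length text section keeps the distinction between "no text section" and "empty file" that the header description draws.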
|  |  | 
|  | === Property sections === | 
|  |  | 
|  | A Revision record *may* have a property section, and a Node record *may* | 
|  | have a property section. Every record with a property section has | 
|  | a Prop-content-length header. | 
|  |  | 
|  | A property section consists of pairs of key and value records and | 
|  | is ended by a fixed trailer.  Here is an example attached to a | 
|  | Revision record: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | Revision-number: 1422 | 
|  | Prop-content-length: 80 | 
|  | Content-length: 80 | 
|  |  | 
|  | K 6 | 
|  | author | 
|  | V 7 | 
|  | sussman | 
|  | K 3 | 
|  | log | 
|  | V 33 | 
|  | Added two files, changed a third. | 
|  | PROPS-END | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | The fixed trailer is "PROPS-END\n" and its length is included in the | 
|  | Prop-content-length. Before it, each K and V record consists of a | 
|  | header line giving the length of the key or value content in bytes. | 
|  | The content follows.  The content is itself always followed by \n. | 
|  |  | 
|  | In version 3 of the format, a third type 'D' of property record is | 
|  | introduced to describe property deletion. This feature will be | 
|  | described later, in the specification of delta dumps. | 
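
The K/V layout above can be parsed with a small counted-read loop. This is a sketch under the stated format rules; the function name `parse_props` and the `("set" | "delete", key, value)` tuple shape are illustrative choices, and the 'D' branch only matters for version 3 streams.

```python
def parse_props(data):
    """Parse a property section; return (records, bytes_consumed)."""
    records = []
    pos = 0

    def read_counted():
        """Read one 'X <length>' header line plus its counted content."""
        nonlocal pos
        eol = data.index(b"\n", pos)
        kind, length = data[pos:eol].split(b" ")
        start = eol + 1
        end = start + int(length)
        if data[end:end + 1] != b"\n":
            raise ValueError("counted content is not followed by a newline")
        pos = end + 1
        return kind, data[start:end]

    while not data.startswith(b"PROPS-END\n", pos):
        kind, key = read_counted()
        if kind == b"K":
            value_kind, value = read_counted()
            if value_kind != b"V":
                raise ValueError("K record not followed by a V record")
            records.append(("set", key, value))
        elif kind == b"D":  # version 3 property deletion
            records.append(("delete", key, None))
        else:
            raise ValueError("unexpected property record type")
    return records, pos + len(b"PROPS-END\n")

section = (
    b"K 6\nauthor\nV 7\nsussman\n"
    b"K 3\nlog\nV 33\nAdded two files, changed a third.\n"
    b"PROPS-END\n"
)
records, consumed = parse_props(section)
```

For the Revision record shown above, `consumed` comes out equal to the Prop-content-length of 80, trailer included.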
|  |  | 
|  | == Semantics == | 
|  |  | 
|  | === The kinds of things === | 
|  |  | 
|  | There are four kinds of things described by a dumpfile: paths, | 
|  | properties, content, and flows.  The distinctions among content, | 
|  | paths, and flows matter for understanding some operations. | 
|  |  | 
|  | A path is a filesystem location (a file or directory).  There are two | 
|  | kinds of paths in a dumpfile: node paths and copy sources. | 
|  |  | 
|  | Properties are key-value pairs associated with revisions or paths. | 
|  | Subversion interprets and reserves some properties, those beginning | 
|  | with "svn:". Others are not interpreted by Subversion; they may | 
|  | be set and read for the convenience of other applications, such | 
|  | as repository browsers or translators. | 
|  |  | 
|  | A flow is a sequence of actions on a file or directory path that is | 
|  | considered to be a single history for change-tracking purposes. | 
|  | Creating a flow tells Subversion that you want to track the history of | 
|  | the path or paths it contains. Destroying a flow breaks the chain of | 
|  | history; changes will not be tracked across the break, even if another | 
|  | flow is created at the same path.  A copy operation creates a new | 
|  | flow connected to the flow from which it was copied. | 
|  |  | 
|  | Content is what file paths point at (one timewise slice of a flow). It | 
|  | is the payload of program source code, documents, images, and so forth | 
|  | that a version control system actually manages. | 
|  |  | 
|  | A Node record describes a change in properties, the addition or deletion | 
|  | of a flow, or a change in content.  It must do at least one of these things, | 
|  | otherwise it would be a no-op and omitted. | 
|  |  | 
|  | When no copyfrom is present, and the action isn't an add or copy, then | 
|  | the kind of the thing identified by (PATH, REVISION) must agree with | 
|  | the kind of the thing identified by (PATH, REVISION-1). | 
|  |  | 
|  | Terminological note: in Subversion-speak, the term "node" is | 
|  | historically ambiguous.  Sometimes it refers to what this document | 
|  | calls a "flow", and sometimes it refers to the internal per-revision | 
|  | structure that a Node record represents (that is, just one action in a | 
|  | flow).  For clarity, most of this document avoids the term "node" in | 
|  | favor of the more specific "flow" and "Node record", but knowing | 
|  | about this issue will help if you read the Ancient History section. | 
|  |  | 
|  | === The kinds of operations === | 
|  |  | 
|  | .File operations | 
|  | |====================================================================== | 
|  | |                           |   add    | delete | replace  |  change  | | 
|  | |Can have text section?     | optional |   no   | optional | optional | | 
|  | |Can have property section? | optional |   no   | optional | optional | | 
|  | |Can have copy source?      | optional |   no   | optional |    no    | | 
|  | |Fails on existent path     |   yes*   |   no   |    no    |    no    | | 
|  | |Fails on non-existent path |    no    |  yes   |   yes    |   yes    | | 
|  | |====================================================================== | 
|  |  | 
|  | FIXME: As of December 2011 there is a minor bug: Adding a file with history | 
|  | twice _in two different revisions_ succeeds silently. | 
|  |  | 
|  | .Directory operations | 
|  | |====================================================================== | 
|  | |                           |   add    | delete | replace  |  change  | | 
|  | |Can have text section?     |    no    |   no   |    no    |    no    | | 
|  | |Can have property section? | optional |   no   | optional | required | | 
|  | |Can have copy source?      | optional |   no   | optional |    no    | | 
|  | |Fails on existent path     |   yes    |   no   |    no    |    no    | | 
|  | |Fails on non-existent path |    no    |  yes   |   yes    |   yes    | | 
|  | |====================================================================== | 
|  |  | 
|  | A Node record represents an operation that does one of four things: add, | 
|  | delete, change, or replace. | 
|  |  | 
|  | Node records can carry content in one (or both!) of two ways: from a text | 
|  | section or from a copy source (that is, a copy-path and copy-revision | 
|  | pair). | 
|  |  | 
|  | Giving a copy source appends the node to the flow of which that source | 
|  | is part; when you 'add' or 'replace' with a copy source, the content | 
|  | at the path becomes a copy of the source (but see below for a | 
|  | qualification about directories). | 
|  |  | 
|  | Giving a text section also changes the content of the flow. In the | 
|  | (unusual) case that a node has both a copy source and a text section, | 
|  | the correct semantics is to attach the path to the source flow and | 
|  | then change the content. | 
|  |  | 
|  | An add operation creates a new flow for a file or directory. See the | 
|  | table above for possible operand combinations. | 
|  |  | 
|  | A delete operation deletes a flow and its content. If the path is a | 
|  | file, the file is deleted.  If the path is a directory, the directory | 
|  | and all its children are deleted. A subsequent add at the same path | 
|  | will create a new and different flow with its own history. | 
|  |  | 
|  | A change operation changes properties on a file or directory path. See the | 
|  | table above for possible operand combinations. | 
|  |  | 
|  | A replace operation behaves exactly like a delete followed by an add | 
|  | (destroying an old flow, producing a new one) when it has no copy | 
|  | source. When a replace has a copy source, it produces a new flow | 
|  | with history extending back through the copy source. A Node record | 
|  | representing a replace operation may have a property section. | 
|  |  | 
|  | The main reason "replace" exists is that it helps sequential | 
|  | processors of the dump stream avoid possibly notifying about multiple | 
|  | actions on the same path. | 
|  |  | 
|  | It is even possible to have a replace with a copyfrom source *and* | 
|  | text, such as would result from this on the client side: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | $ svn rm dir/file.txt | 
|  | $ svn cp otherdir/otherfile.txt dir/file.txt | 
|  | $ echo "Replacement text" > dir/file.txt | 
|  | $ svn ci -m "Replace dir/file.txt with a copy of otherdir/otherfile.txt and replace its text, too." | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | Subversion filesystems do not allow the root directory ("/") to be | 
|  | deleted or replaced. | 
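
The existence rules from the tables above can be sketched as checks against a path map. This is a simplified illustration: `apply_action` is a hypothetical name, the map holds only each path's kind, and copy sources, content, properties, and the root-directory restriction are deliberately left out.

```python
def apply_action(tree, path, action, kind=None):
    """Apply one Node action to `tree`, a dict mapping path -> "file" or "dir"."""
    exists = path in tree
    if action == "add":
        if exists:
            raise ValueError(f"add on existing path: {path}")
        tree[path] = kind
    elif action in ("delete", "replace", "change"):
        if not exists:
            raise ValueError(f"{action} on missing path: {path}")
        if action in ("delete", "replace"):
            # Deleting a directory removes all of its children as well.
            prefix = path + "/"
            for child in [p for p in tree if p.startswith(prefix)]:
                del tree[child]
            del tree[path]
        if action == "replace":
            # Replace behaves like delete followed by add at the same path.
            tree[path] = kind
    else:
        raise ValueError(f"unknown action: {action}")
    return tree
```

For example, applying "delete" to "trunk" in a map holding "trunk" and "trunk/a.txt" removes both entries, while "change" on a path that is not in the map raises an error.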
|  |  | 
|  | === Some details about copyfroms === | 
|  |  | 
|  | The source and target of a copyfrom are always of like kind; that is, | 
|  | Subversion dump will never generate a node with a source type of file | 
|  | and a target type of directory or vice-versa. | 
|  |  | 
|  | Interpreting copyfrom_path for file copies is straightforward; the | 
|  | target pathname gets the contents of the source pathname. | 
|  |  | 
|  | Directory copies (the primitive beneath branching and tagging) are | 
|  | tricky.  For each source path under the source directory, a new path | 
|  | is generated by removing the head segment of the pathname that is | 
|  | the source directory.  That new path under the target directory gets | 
|  | the content of the source path. | 
|  |  | 
|  | After this operation: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | Node-path: x/y/z | 
|  | Node-kind: dir | 
|  | Node-action: add | 
|  | Node-copyfrom-rev: 10 | 
|  | Node-copyfrom-path: a/b/c | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | the file a/b/c/d will have been copied to x/y/z/d. | 
|  |  | 
|  | A single revision may include multiple copyfrom Node records, even multiple | 
|  | copyfroms to the same directory, even mixed directory and file copies | 
|  | to the same directory. | 
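
The directory-copy path rewriting described above can be sketched as follows. The sketch assumes each revision's state is held as a dict from path to content (with `None` standing in for a directory); `copy_directory` and that representation are illustrative choices, not part of the format.

```python
def copy_directory(source_tree, source_dir, target_tree, target_dir):
    """Copy source_dir and everything under it into target_tree at target_dir."""
    prefix = source_dir + "/"
    target_tree[target_dir] = source_tree[source_dir]
    # list() lets source_tree and target_tree be the same dict.
    for path, content in list(source_tree.items()):
        if path.startswith(prefix):
            # Strip the source directory and re-root under the target.
            target_tree[target_dir + "/" + path[len(prefix):]] = content
    return target_tree

revision_10 = {"a/b/c": None, "a/b/c/d": b"data", "a/b/other": b"x"}
current = copy_directory(revision_10, "a/b/c", {}, "x/y/z")
```

After the call, `current` contains "x/y/z" and "x/y/z/d" but nothing for "a/b/other", which lies outside the copied directory.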
|  |  | 
|  | === Properties and persistence === | 
|  |  | 
|  | The properties section of a Revision record consists of some (possibly | 
|  | empty) subset of the three reserved revision properties: svn:author, | 
|  | svn:date, and svn:log, along with any other revision properties. | 
|  |  | 
|  | The revision properties do not persist to later revisions.  Each revision | 
|  | has exactly the revision properties specified in its revision record, or | 
|  | no revision properties if there is no property section. | 
|  |  | 
|  | The key thing to know about Node properties is that they are | 
|  | persistent, once set, until modified by a future property | 
|  | section on the same path. | 
|  |  | 
|  | Normally, a dumpfile re-lists the entire property set for a directory | 
|  | or file in every Node record that changes any part of it. (But see | 
|  | the material on delta dumps for an exception.) | 
|  |  | 
|  | This implies that to delete a given property from a path, a dumpfile | 
|  | generator will issue a Node record with all other properties listed in it; | 
|  | to delete all properties from a path, the dumpfile generator will | 
|  | simply issue a Node record with an empty properties section. Note that this | 
|  | is different from an *absent* properties section, which will change | 
|  | no properties and will be associated with a change to content! | 
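
The difference between an empty and an absent properties section can be made concrete with a small sketch. It applies only to non-delta dumps; `props_after` is a hypothetical name, and representing an absent section as `None` is an assumption of this example.

```python
def props_after(previous, section):
    """Return a path's property set after a non-delta Node record.

    `section` is None when the record has no properties section, or a dict
    (possibly empty) holding the re-listed property set.
    """
    if section is None:
        return dict(previous)  # absent section: properties persist unchanged
    return dict(section)       # present section: it is the whole new set

before = {"svn:executable": "on", "svn:keywords": "LastChangedDate"}
unchanged = props_after(before, None)
one_removed = props_after(before, {"svn:keywords": "LastChangedDate"})
cleared = props_after(before, {})
```

Here `unchanged` equals `before`, `one_removed` has lost "svn:executable" because the record did not re-list it, and `cleared` is empty.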
|  |  | 
|  | === Representation of symbolic links === | 
|  |  | 
|  | When the Subversion client sends a content blob representing a | 
|  | symbolic link (that is, with the svn:special property), the content of | 
|  | the blob is not just the link's target path. It will have the prefix | 
|  | "link ".  The client likewise interprets this prefix at checkout time. | 
|  |  | 
|  | In the future, other special blob formats with other prefix keywords may | 
|  | be defined.  None such yet exist as of revision 1441992 (February 2013). | 
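
An interpreter could recover a link target from such a blob roughly as follows. This sketch only tests for the presence of svn:special and does not inspect its value; `symlink_target` is a hypothetical helper name.

```python
LINK_PREFIX = b"link "

def symlink_target(props, content):
    """Return the link target bytes, or None if this is not a symlink blob."""
    if "svn:special" in props and content.startswith(LINK_PREFIX):
        return content[len(LINK_PREFIX):]
    return None  # ordinary file, or an unknown special format
```

Returning `None` for an unrecognized prefix leaves room for other special blob formats to be handled separately if any are defined later.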
|  |  | 
|  | === Implementation pragmatics === | 
|  |  | 
|  | Because directory operations with copyfroms don't specify all the file | 
|  | paths they modify, an interpreter for this format must build a map of | 
|  | the paths in the file store it is manipulating, and update that map as | 
|  | it processes each Node record. | 
|  |  | 
|  | On a repository with thousands of commits, the per-revision list of | 
|  | maps can become quite large. For space economy, the file map for each | 
|  | revision can be discarded after it is processed *unless it is a source | 
|  | revision for a copyfrom*. | 
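
One way to realize this space economy is a pre-pass that collects every copyfrom revision, followed by pruning as revisions are processed. The two-pass design, the helper names, and the choice to also keep the current revision's map (so the next revision can be built from it) are assumptions of this sketch.

```python
def copy_source_revisions(node_headers):
    """Collect every revision named by a Node-copyfrom-rev header."""
    return {
        int(headers["Node-copyfrom-rev"])
        for headers in node_headers
        if "Node-copyfrom-rev" in headers
    }

def prune_maps(maps, needed, current_rev):
    """Drop per-revision path maps that no copyfrom refers to."""
    for rev in list(maps):
        if rev != current_rev and rev not in needed:
            del maps[rev]
    return maps

# Header dicts for all Node records in the dump, gathered in a first pass.
all_nodes = [
    {"Node-path": "x/y/z", "Node-copyfrom-rev": "10", "Node-copyfrom-path": "a/b/c"},
    {"Node-path": "bar/foo.c"},
]
needed = copy_source_revisions(all_nodes)
maps = prune_maps({9: {}, 10: {}, 11: {}, 12: {}}, needed, current_rev=12)
```

With revision 10 as the only copy source, pruning at revision 12 leaves just the maps for revisions 10 and 12.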
|  |  | 
|  | == An example == | 
|  |  | 
|  | Here's an example of revision 1422, which added a new directory | 
|  | "baz", added a new file "bop" inside it, and modified the file "foo.c": | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | Revision-number: 1422 | 
|  | Prop-content-length: 80 | 
|  | Content-length: 80 | 
|  |  | 
|  | K 6 | 
|  | author | 
|  | V 7 | 
|  | sussman | 
|  | K 3 | 
|  | log | 
|  | V 33 | 
|  | Added two files, changed a third. | 
|  | PROPS-END | 
|  |  | 
|  | Node-path: bar/baz | 
|  | Node-kind: dir | 
|  | Node-action: add | 
|  | Prop-content-length: 35 | 
|  | Content-length: 35 | 
|  |  | 
|  | K 10 | 
|  | svn:ignore | 
|  | V 4 | 
|  | TAGS | 
|  | PROPS-END | 
|  |  | 
|  |  | 
|  | Node-path: bar/baz/bop | 
|  | Node-kind: file | 
|  | Node-action: add | 
|  | Prop-content-length: 76 | 
|  | Text-content-length: 54 | 
|  | Content-length: 130 | 
|  |  | 
|  | K 14 | 
|  | svn:executable | 
|  | V 2 | 
|  | on | 
|  | K 12 | 
|  | svn:keywords | 
|  | V 15 | 
|  | LastChangedDate | 
|  | PROPS-END | 
|  | Here is the text of the newly added 'bop' file. | 
|  | Whee. | 
|  |  | 
|  | Node-path: bar/foo.c | 
|  | Node-kind: file | 
|  | Node-action: change | 
|  | Text-content-length: 102 | 
|  | Content-length: 102 | 
|  |  | 
|  | Here is the fulltext of my change to an existing /bar/foo.c. | 
|  | Notice that this file has no properties. | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | == Format variants == | 
|  |  | 
|  | === Version 3 format === | 
|  |  | 
|  | Version 3 format is a delta dump; text changes are represented | 
|  | as diffs against the original file, and properties as incremental | 
|  | changes to a persistent set (that is, a property section does not | 
|  | necessarily implicitly clear the property set on a path before the | 
|  | new property settings are evaluated). | 
|  |  | 
|  | This change is a space optimization. It requires additional | 
|  | computing time to integrate the diff history. | 
|  |  | 
|  | Version 3 is generated by SVN versions 1.1.0-present, if requested by the user. | 
|  |  | 
|  | This format is equivalent to the version 2 format except for the | 
|  | following: | 
|  |  | 
|  | 1. The format starts with the new version number of the dump format | 
|  | ("SVN-fs-dump-format-version: 3\n"). | 
|  |  | 
|  | 2. There are several new optional headers for Node records: | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | [Text-delta: true|false] | 
|  | [Prop-delta: true|false] | 
|  | [Text-delta-base-md5: blob] | 
|  | [Text-delta-base-sha1: blob] | 
|  | [Text-copy-source-sha1: blob] | 
|  | [Text-content-sha1: blob] | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | The default value for the boolean headers is "false".  If the value is | 
|  | set to "true", then the text and property contents will be treated | 
|  | as deltas against the previous contents of the flow (as determined | 
|  | by copy history for adds with history, or by the value in the | 
|  | previous revision for changes--just as with commits). | 
|  |  | 
|  | Property deltas have the same format as regular property lists except | 
|  | that (1) properties with the same value as in the previous contents of | 
|  | the flow are not printed, and (2) deleted properties will be written | 
|  | out as | 
|  |  | 
|  | ------------------------------------------------------------------- | 
|  | D <name length> | 
|  | <name> | 
|  | ------------------------------------------------------------------- | 
|  |  | 
|  | just as a regular property is printed, but with the "K " changed to a | 
|  | "D " and with no value part. | 
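
Applying a property section with and without Prop-delta can be sketched as below. The record shape, a list of `("set", key, value)` and `("delete", key, None)` tuples, and the name `apply_props` are assumptions made for this example.

```python
def apply_props(previous, records, prop_delta=False):
    """Return the property set after applying parsed property records.

    With prop_delta false the records are the complete new set; with it
    true they are changes against the previous contents of the flow.
    """
    props = dict(previous) if prop_delta else {}
    for operation, key, value in records:
        if operation == "delete":
            props.pop(key, None)  # a 'D' record: remove the property
        else:
            props[key] = value    # a K/V pair: set or overwrite
    return props

previous = {"svn:ignore": "TAGS", "svn:keywords": "LastChangedDate"}
delta = [("delete", "svn:ignore", None), ("set", "svn:eol-style", "native")]
merged = apply_props(previous, delta, prop_delta=True)
```

In this example `merged` keeps "svn:keywords", which the delta did not mention, drops "svn:ignore", and gains "svn:eol-style".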
|  |  | 
|  | Text deltas are written out as a series of svndiff0 windows.  If | 
|  | Text-delta-base-md5 is provided, it is the checksum of the base to | 
|  | which the text delta is applied; note that older versions (pre-1.5) of | 
|  | 'svnadmin load' may ignore the checksum. | 
|  |  | 
|  | Text-delta-base-sha1, Text-copy-source-sha1, and Text-content-sha1 are not | 
|  | currently used by the loader.  They are written by 1.6-and-later versions of | 
|  | Subversion so that future loaders can optionally choose which checksum to | 
|  | use for checking for corruption. | 
|  |  | 
|  | === Archaic version 1 format === | 
|  |  | 
|  | There are actually two types of version 1 dump streams. The regular ones | 
|  | are generated since r2634 (svn 0.14.0). Older ones also claim to be | 
|  | version 1, but lack the Prop-content-length and Text-content-length | 
|  | fields in the block header. In those days there *always* was a | 
|  | properties block. | 
|  |  | 
|  | This note is included for historical completeness only, as it is highly | 
|  | unlikely that any Subversion instances that old remain in production. | 
|  |  | 
|  | == Implementation choices for optional behaviour == | 
|  |  | 
|  | This section lists some of the ways existing implementations interpret the | 
|  | optional aspects of the specification. | 
|  |  | 
|  | When a Revision record has no revision properties, svnadmin and svnrdump | 
|  | write an empty properties section whereas svndumpfilter omits the properties | 
|  | section. (At least in Subversion 1.0 through 1.8.) | 
|  |  | 
|  | == Ancient history == | 
|  |  | 
|  | Old discussion: | 
|  |  | 
|  | (This file started as a proposal, preserved here for posterity.) | 
|  |  | 
|  | A proposal for an svn filesystem dump/restore format. | 
|  |  | 
|  | === Two problems we want to solve === | 
|  |  | 
|  | 1.  When we change our node-id schema, we need to migrate all of our | 
|  | data (by dumping and restoring). | 
|  |  | 
|  | 2.  Serves as a backup format.  Could be read by other software tools | 
|  | someday. | 
|  |  | 
|  |  | 
|  | === Design Goals === | 
|  |  | 
|  | A.  Written as two new public functions in svn_fs.h.  To be invoked | 
|  | by new 'svnadmin' subcommands. | 
|  |  | 
|  | B.  Format uses only timeless fs concepts. | 
|  |  | 
|  | The dump format needs to reference concepts that we *know* are | 
|  | general enough to never change.  These concepts must exist | 
|  | independently of any internal node-id schema, or any DB storage | 
|  | backend.  In other words, we're talking about the basic ideas in | 
|  | our original "design spec" from May 2000. | 
|  |  | 
|  | === Format Semantics === | 
|  |  | 
|  | Here are the timeless semantics of our fs design -- the things that | 
|  | would be stored in our dump format. | 
|  |  | 
|  | - A filesystem is an array of trees. | 
|  | Each tree is called a "revision" and has unversioned properties attached. | 
|  |  | 
|  | - A revision has a tree of "nodes" hanging off of it. | 
|  | Actually, the nodes in the filesystem form a DAG.  A revision | 
|  | always points to an initial node that represents the 'root' of some tree. | 
|  |  | 
|  | - The majority of a tree's nodes are hard-links (references) to | 
|  | nodes that were created in earlier trees. | 
|  |  | 
|  | - A node contains | 
|  |  | 
|  | - versioned text | 
|  | - versioned properties | 
|  | - predecessor history:  "which node am I a variant of?" | 
|  | - copy history:  "which node am I a copy of?" | 
|  |  | 
|  | The history values can be non-existent (meaning the node is | 
|  | completely new), or can have a value of {revision, path}. | 
|  |  | 
|  | === Refinement of proposal #2: === | 
|  |  | 
|  | (after discussion with gstein) | 
|  |  | 
|  | Each node starts with RFC822-style headers at the top.  The final | 
|  | header is a 'Content-length:', followed by the content, so record | 
|  | boundaries can be inferred. | 
|  |  | 
|  | The content section has two implicit parts: a property hash, and the | 
|  | fulltext.  The division between these two sections is implied by the | 
|  | "PROPS-END\n" tag at the end of the prophash.  In the case of a | 
|  | directory node or a revision, only the prophash is present. | 
|  |  | 
|  | //End of document. |