blob: 27a9f5bef85a276eb8d48f147f67db195a3337b1 [file] [log] [blame]
[ At the time of writing (1.7-dev), FSFS format 5 was current. Since then
format 5 has been retracted; it was never released. 1.7 shipped with f4 and
1.8 will most likely ship with f6. Format 5 was current during 1.7-dev and
packed revprops into an authoritative SQLite database. ]
Introduction
------------
The FSFS external data format can be changed to allow for
significantly reduced I/O overhead without changing the
fundamental ideas behind FSFS.
The whole idea is to rearrange packed revision info in a
new FSFS format "6". A two-way conversion between format
"5" and "6" should be possible as well as supporting both
formats for different repositories on the same server.
Revision Order
--------------
FSFS format "5" packs revisions by putting revision 0 at
the beginning of the file and concatenating the others in
ascending order. This intuitive design is counter-productive
because we always trace data HEAD -> r0 and scanning a file
backwards is expensive.
+-------+
| rev 0 |
+-------+
| rev 1 |
+-------+
: ... :
+-------+
| rev N |
+-------+
<EOF>
A counter-intuitive but more efficient reversed order (de-
scending revision order in a packed file) results in always
reading forward through a file.
+-------+
| rev N |
+-------+
: ... :
+-------+
| rev 1 |
+-------+
| rev 0 |
+-------+
<EOF>
Don't Interleave Revision and Content Data
------------------------------------------
Format "5" keeps each revision as a single block. When
re-constructing a node content from the diffs, we jump
through a number of distant revision blocks. For each node.
+--------------+
| rev N Reps |
+--------------+
| rev N Header |
+--------------+
: ... :
+--------------+
| rev 1 Reps |
+--------------+
| rev 1 Header |
+--------------+
| rev 0 Reps |
+--------------+
| rev 0 Header |
+--------------+
<EOF>
Instead, delta-info should be removed from the revision
blocks and put at the end of the file. This speeds up
header-only operations like "log" without impairing node
content lookup. And it lays the foundation for further
improvements.
+--------------+
| rev N Header |
+--------------+
: ... :
+--------------+
| rev 1 Header |
+--------------+
| rev 0 Header |
+--------------+
| rev N Reps |
+--------------+
: ... :
+--------------+
| rev 1 Reps |
+--------------+
| rev 0 Reps |
+--------------+
<EOF>
The only downside is that putting headers first is somewhat
more computationally expensive since offsets needs to be
calculated in advance. This is made even more difficult by
using a variable length encoding for those offsets.
Group the Representations by Delta Tree
---------------------------------------
All diffs that are based on the same node within the packed
revision file are put in one place. Re-constructing a node
would then involve reading the revision headers (relatively
close together) and the content deltas after that (again
with high locality).
+------------------+
| rev N Header |
+------------------+
: ... :
+------------------+
| rev 1 Header |
+------------------+
| rev 0 Header |
+------------------+
| node tree A Reps |
+------------------+
| node tree B Reps |
+------------------+
: ... :
+------------------+
| node tree Z Reps |
+------------------+
<EOF>
We can optimize that even further by exploiting the skip-
delta information: the "walks" through the delta tree for a
given node will merge bit by bit into a main "trunk". The
nodes on these paths can be re-ordered such that very few
seek() operations are required on average to reconstruct
the node content in this file.
Example of 8 changes in revs r0 .. r7 (for simplicity),
looking at the delta-info only ("->" means "stored as diff
against"):
r0
r1->r0
r2->r0
r3->r2
r4->r0
r5->r4
r6->r4
r7->r6
Default ordering r0,r1,r2,r3,r4,r5,r6,r7 requires
1/1/2/2/2/2/3/3 seeks. For 2^N changes (N integer > 0),
we need (N+1)/2 seeks on average.
A path-optimized ordering r0,r4,r6,r7,r5,r2,r3,r1
requires 1/1/1/1/2/2/2/2 seeks. It's (N+3)/4 on average.
That optimized ordering can easily be constructed by
* select the highest rev not covered, yet
* add its diff path to the existing result
(starting with the first revision not
already is in the result)
* repeat until all revs are covered
Note that revision paths start with the low end of
the revision range because contents gets reconstructed
from r0 upwards.
Move the Manifest Info into the Packed File
-------------------------------------------
Once we mastered the offset pre-calculation issue (see
above), we may as well put the manifest info at the be-
ginning of the file. This will benefit "log" in particular
as only one consecutive chunk of data needs to be read
from only one file.
+------------------+
| rev 0 Offset |
+------------------+
| rev 1 Offset |
+------------------+
: ... :
+------------------+
| rev N Offset |
+------------------+
| rev N Header |
+------------------+
: ... :
+------------------+
| rev 1 Header |
+------------------+
| rev 0 Header |
+------------------+
| node tree A Reps |
+------------------+
| node tree B Reps |
+------------------+
: ... :
+------------------+
| node tree Z Reps |
+------------------+
<EOF>