blob: e6133a128ef38c23ef8b4f5a01261d402c6ae856 [file] [log] [blame]
(writ by Ben, 7/21/00)
Problem
-------
* A client checks out a working copy of a tree from a repository.
We call this tree `B' (for `base' revision). We call the client's
tree `WC' (for `working copy').
* The client procedes to commit a few changes to the repository.
The client also makes some local modifications.
* Meanwhile, other people continue to commit changes to the repository.
* Now the client wishes to update WC, to bring all of WC up-to-date
with the latest tree in the repository (`L').
The issue is that WC is an ugly hodgepodge of files; many files are
still based on the original base tree B, but some files have been
individually committed and match the contents of repository trees that
are descendants of B. And some files are just locally modified, not
yet committed.
In other words, given that WC's tree is a unique mix of files unknown
to the server, what exchange of information must take place so that
the server can send a tree-delta that converts WC into L?
(Note: this problem also describes the situation where the client is
attempting to commit WC, and the server first needs to check WC's
`up-to-date-ness' and point out conflicts.)
Triangulation
-------------
Here's the plan of attack.
d2
---> B ------> L
\ ^
\ |
\ | ?? d3
d1 \ |
\ |
-> WC
* The client sends a delta `d1' which describes how to convert B
into WC. (It can do this because each change has been
meticulously tracked in the working copy's administrative files.)
* The server generates a delta `d2' which describes how to convert B
into L. (It can do this because the repository has also been
meticulously tracking each change from B to L.)
* In theory, therefore, knowledge of d1 and d2 should allow us to
create d3, which describes how to change WC into L.
Defining `Change'
-----------------
Deltas only talk about entities that have changed.
We say that two deltas *conflict* iff they each change one or more of
the same entity.
Define a function C() which, given a delta, returns an unordered list
of all node numbers that the delta changes.
Here are the rules that C() follows:
1. If a file's contents change, then that file's node has changed.
2. If a file is added, renamed, or deleted, then both the file's node
*and* it's parent node have changed.
Notation --
C(P->Q) = { 3, 7, 9, 21 }.
This statement means that a certain delta converts tree P into
tree Q, and the node numbers changed in the process are 3, 7, 9,
and 21.
The Process
-----------
1. The client generates a `special' delta
What do we mean by `special'?
The client tracks local changes made to a working copy, queueing them
up as the user works. Normally, when it's time to commit, the client
generates a delta describing all the queued local changes, and submits
them to the server. This makes sense, because the client only cares
about discovering conflicts that are *immediately* relevant to the
newest changes.
But this list of local changes is *not* enough to create a delta
B->WC! To be perfectly clear, there are two types of information that
must included in a delta B->WC:
* local changes that are queued, but not yet committed
* local changes that *have* been committed since WC == B
Therefore, the client's administrative files must keep track of both
types of changes.
2. Reconstruction of WC
Next, the server receives B->WC and effectively "rebuilds" an internal
representation of the oh-so-unique WC. The server's internal
reconstruction of WC needn't be some large, complex object on disk or
in RAM; it's just a representation of the WC tree structure, complete
with each node's number and internal version number.
We call this reconstructed WC `RWC'.
3. The server generates a second delta.
Examining node numbers (and their internal version numbers), the
server creates a delta RWC->L.
4. The server runs C() on both deltas
The server compares the output of C(B->WC) and C(RWC->L), looking for
any intersection of the sets. Any intersection of the sets indicates
conflicting deltas.
Examples
--------
Example 1: No conflict
C(B->WC) = { 2, 5, 9, 13, 21, 87 }
C(RWC->L) = { 3, 4, 6, 22, 54, 99, 102 }
Because the two deltas have changed *completely* different sets of
nodes, there's no conflict. If the client were attempting to
commit, the delta would be approved. If the client were asking
for an update, the server would send back RWC->L.
Example 2: One conflict
C(B->WC) = { 2, 5, 9, 13, 21, 87 }
C(RWC->L) = { 3, 4, 6, 22, 54, 87, 102 }
Both deltas have changed node 87. If the client were attempting
to commit, the entire transaction would be denied. If the client
were asking for an update, the server would point out the
conflict. In both cases, it's up to the client to describe the
conflict either with conflict markers (in the case of a changed
file) or with an informational message (in the case of a changed
directory).
Example 3: Client adds a file
Suppose that completely new file has been added to the working
copy, but not yet committed.
C(B->WC) = { 2, 5, 9, 13, 21, 87, ?? }
C(RWC->L) = { 3, 4, 6, 22, 54, 99, 102 }
The question marks indicate the fact that while C() was
traversing the delta and looking up node numbers, it ran into an
unrecognized pathname. There's no node number for the file,
because it doesn't yet exist in the repository!
This case demonstrates that mere node numbers aren't quite
enough; pathnames must be also be examined when looking for
conflicts between two deltas. E.g., suppose that the new file
has the exact same pathname as node 102 in L?
Example 4: Client deletes a file
Deleting a file shows up in a tree-delta as a "change" to a file,
and thus this file's node shows up in C(B->WC), so this is the
same as example 2.
Example 5: Client moves a file or directory.
This action would show up as *two* changed node numbers in
C(B->WC), and the rest would play out like example 2.