| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" |
| "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml"> |
| <head> |
| <style type="text/css"> /* <![CDATA[ */ |
| @import "../branding/css/tigris.css"; |
| @import "../branding/css/inst.css"; |
| /* ]]> */</style> |
| <link rel="stylesheet" type="text/css" media="print" |
| href="../branding/css/print.css"/> |
| <script type="text/javascript" src="../branding/scripts/tigris.js"></script> |
| <title>Merge Tracking Functional Specification</title> |
| </head> |
| |
| <body> |
| <div class="h1"> |
| <h1>Merge Tracking Functional Specification</h1> |
| |
| <p style="color: red">*** UNDER CONSTRUCTION ***</p> |
| |
| <p><a href="index.html">Merge tracking</a> functional |
| specification.</p> |
| |
| <p style="color: red">TODO: Incorporate <a |
| href="summit.html">CollabNet Summit findings</a>. Describe how each |
| <a href="requirements.html">requirement</a> will actually function for |
| Subversion. Remove redundancies.</p> |
| |
| <div class="h2" id="diff-status"> |
| <h2>Diff/Status operations</h2> |
| |
| <p>Output is shown the same as pre-Merge Tracking, except for:</p> |
| |
| <ul> |
| <li>Diffs pretty-print changes to merge info in an easily |
| human-readable form.</li> |
| |
| <li>Status represents changes to the merge info for the root of a |
| tree as a property change.</li> |
| </ul> |
| |
| </div> <!-- diff-status --> |
| |
| |
| <div class="h2" id="copy-move"> |
| <h2>Copy/Move operations</h2> |
| |
| <p>Copy and move operations handle two types of merge info:</p> |
| |
| <dl> |
| <dt>Explicit</dt> |
| <dd>The pre-existing value of the <code>svn:mergeinfo</code> |
| property on the source path.</dd> |
| |
| <dt>Implicit</dt> |
| <dd>All revisions represented by the object at the source path (from |
| its "appeared in" revision to its current revision).</dd> |
| </dl> |
| |
| <div class="h3" id="ra-copy-move"> |
| <h3>Repository Access operation</h3> |
| |
| <p>Copy/move operations which contact the repository include:</p> |
| |
| <ul> |
| <li>WC to URL</li> |
| <li>URL to WC</li> |
| <li>URL to URL</li> |
| </ul> |
| |
| <p>These operations always propogate both explicit and implicit merge |
| info. Other than the inclusion of merge info, operation is |
| effectively the same as pre-Merge Tracking.</p> |
| |
| </div> <!-- ra-copy-move --> |
| |
| <div class="h3" id="wc-wc-copy-move"> |
| <h3>Working Copy to Working Copy operation</h3> |
| |
| <p>Pre-Merge Tracking, WC to WC operations occurred offline (e.g. with |
| no repository access). This is a typical behavior of refactoring |
| tools (e.g. IDEs like Eclipse), and is very useful when offline |
| (e.g. on an airplane or subway, or at a cafe).</p> |
| |
| <p>However, to propogate merge info during copy/move operations, |
| access to both a path's comprehensive merge info and its history is |
| necessary. To preserve offline operation, the Merge Tracking |
| implementation supports two modes:</p> |
| |
| <ul> |
| <li>A compatibility mode, which neither contacts the repository, nor |
| does any merge info propogation (unless a copy source's merge info |
| has been locally modified, in which its value is propogated the as |
| any Subversion revision property).</li> |
| |
| <li>A mode which requires repository access (e.g. isn't offline), |
| but which propogates all merge info from source path to |
| destination.</li> |
| </ul> |
| |
| <p>This behavior is comparable to the difference between <code>svn |
| status</code> and <code>svn status -u</code>.</p> |
| |
| <p>While some state indicating delayed merge info retrieval and |
| handling could instead be stored in WC to preserve offline operation, |
| there are complications with this when subsequent uncommited revert |
| operations should change the merge info (we'd have to store negative |
| merge info in the WC).</p> |
| |
| </div> <!-- wc-wc-copy-move --> |
| |
| </div> <!-- copy-move --> |
| |
| <div class="h2" id="repeated-merge"> |
| <h2>Repeated Merge</h2> |
| |
| <p>There are two general schemes for solving the <a |
| href="requirements.html#repeated-merge">repeated merge</a> |
| problem.</p> |
| |
| <div class="h3" id="mrca-merge"> |
| <h3>The Most Recent Common Ancestor approach (MRCA)</h3> |
| |
| <p>In this scheme, you record an optional set of merge sources in each |
| node-revision. When asked to do a merge with only one source (that |
| is, just "svn merge URL", with no second argument), you compute the |
| most recent ancestor and do a three-way merge between the common |
| ancestor, the given URL, and the WC.</p> |
| |
| <p>To compute the most recent ancestor, you chain off the immediate |
| predecessors of each node-revision. The immediate predecessors are |
| the direct predecessor (the most recent node-revision within the node) |
| and the merge sources. An interleaved breadth-first search should |
| find the most recent common ancestor.</p> |
| |
| </div> <!-- mrca-merge --> |
| |
| <div class="h3" id="as-merge"> |
| <h3>The Ancestry Set approach (AS)</h3> |
| |
| <p>In this scheme, you record the full ancestry set for each |
| node-revision -- that is, the set of all changes which are accounted |
| for in that node-revision. (How you store this ancestry set is |
| unimportant; the point is, you need a reasonably efficient way of |
| determining it when asked.) If you are asked to "svn merge URL", you |
| apply the changes present in URL's ancestry but absent in WC's |
| ancestry. Note that this is not a single three-way merge; you may |
| have to apply a large number of disjoint changes to the WC.</p> |
| |
| <p>For a longer description of this approach, see the <a |
| href="/design.html#model.merging-and-ancestry">"Merging and Ancestry" |
| section</a> of the original <a href="/design.html">design doc</a>.</p> |
| |
| <div class="h4" id="aslb-merge"> |
| <h4>Ancestry-Sensitive Line-Based Merge</h4> |
| |
| <p>Make 'hunks' of contextually-merged text sensitive to ancestry.</p> |
| |
| <p>A high-resolution version of <a |
| href="requirements.html#repeated-merge">repeated merge</a>. Rather |
| than tracking whole changesets, we track the lineage of specific lines |
| of code within a file. The basic idea is that when re-merging a |
| particular hunk of code, the contextual-merging process is aware that |
| certain lines of code already represent the merging of particular |
| lines of development. Jack Repenning has a great example of this from |
| ClearCase (see ASCII diagram below).</p> |
| |
| <p>See the <a href="../variance-adjusted-patching.html">variance |
| adjusted patching</a> document for an extended discussion of how to |
| implement this by composing diffs; see <a |
| href="http://svn.collab.net/svn-doxygen/svn__diff_8h.html#a11" |
| ><code>svn_diff_diff4()</code></a> for an implementation of same. We |
| may be closer to ancestry-sensitive merging than we think.</p> |
| |
| <p>Here's an example demonstrating how individual lines of code can be |
| tracked. In this diagram, we're drawing the lineage of a single file, |
| with time flowing downwards. The file begins life with three lines of |
| text, "1\n2\n\3\n". The file then splits into two lines of |
| development.</p> |
| |
| <pre> |
| 1 |
| 2 |
| 3 |
| / \ |
| / \ |
| / \ |
| one 1 |
| two 2.5 |
| three 3 |
| | \ | |
| | \ | |
| | \ | |
| | \ | |
| | \ one ## This node is a human's |
| | two-point-five ## merge of two sides. |
| | three |
| | | |
| | | |
| | | |
| one one |
| Two two-point-five |
| three newline |
| \ three |
| \ | |
| \ | |
| \ | |
| \ | |
| \ | |
| \ | |
| \ | |
| \ | |
| one ## This node is a human's |
| Two-point-five ## merge of the changes |
| newline ## since the last merge. |
| three |
| </pre> |
| |
| <p>It's the second merge that's important here.</p> |
| |
| <p>In a system like Subversion, the second merge of the left branch to |
| the right will fail miserably: the whole file's contents will be |
| placed within conflict markers. That's because it's trying to dumbly |
| apply a patch that changes "1\n2\n3" to "one\nTwo\nthree", and the |
| target file has no matching lines at all.</p> |
| |
| <p>A smarter system (like Clearcase) would remember that the previous |
| merge had happened, and specifically notice that the lines "one" and |
| "three" are the results of that previous merge. Therefore, it would |
| ask the human only to deal with the "Two" versus "two-point-five" |
| conflict; the earlier changes ("1\n2\n3" to "one\ntwo\nthree") would |
| already be accounted for.</p> |
| |
| </div> <!-- aslb-merge --> |
| |
| </div> <!-- as-merge --> |
| |
| <div class="h3"> |
| <h3>Comparisons, Arguments, and Questions</h3> |
| |
| <p>AS allows you to merge changes from a branch out of order, without |
| doing any bookkeeping. MRCA requires you to merge changes from a |
| branch in order.</p> |
| |
| <p>At first look, MRCA may be simpler to implement, since it results |
| in a single three-way merge. However, it likely does not handle all |
| edge cases. For instance, it may break down faster if the merging |
| topology is not hierarchical.</p> |
| |
| <p>MRCA may be easier for users to understand, even though AS is |
| probably simpler to a mathematician.</p> |
| |
| <p>Consistency with other modern version controls systems is |
| desirable.</p> |
| |
| <p>If a user asks to merge a directory, should we apply MRCA or AS to |
| each subdirectory and file to determine what ancestor(s) to use? Or |
| should we apply MRCA or AS just once, to the directory itself? The |
| latter approach seems simpler and more efficient, but will break down |
| quickly if the user wants to merge subdirectories of a branch in |
| advance of merging in the whole thing.</p> |
| |
| </div> <!-- h3 --> |
| |
| </div> <!-- repeated-merge --> |
| |
| |
| <div class="h2" id="related-documents"> |
| <h2>Related Documents and Discussion</h2> |
| |
| <ul> |
| <li><a href="http://www.codeville.org/">Codeville</a> is reputed to |
| excel both in its usefulness of storage of line-history in the |
| <em>weave</em> format, and a corresponding merge algorithm: |
| |
| <ul> |
| <li><a href="http://revctrl.org/PreciseCodevilleMerge">"Precise |
| Codeville merge"</a> algorithm and Python implementation. The |
| algorithm takes into account line history, history points where |
| they came from, the ability to retrieve ancestors' text as needed, |
| and a snapshot of the current file. It purports accuracy where |
| <a href="http://revctrl.org/CodevilleMerge">other algorithms fall |
| down</a>.</li> |
| |
| <li>Bram Cohen describes <a |
| href="http://thread.gmane.org/gmane.comp.version-control.revctrl/2" |
| >the merge algorithm</a> (May 2005)</li> |
| </ul> |
| </li> |
| |
| <li><a |
| href="http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_base/notes/structure">Structure |
| of the Subversion FS BerkeleyDB backend</a></li> |
| |
| </ul> |
| |
| </div> <!-- related-documents --> |
| |
| </div> <!-- h1 --> |
| </body> |
| </html> |