 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
 <head>
 <style type="text/css"> /* <![CDATA[ */
   @import "../branding/css/tigris.css";
   @import "../branding/css/inst.css";
   /* ]]> */</style>
 <link rel="stylesheet" type="text/css" media="print"
   href="../branding/css/print.css"/>
 <script type="text/javascript" src="../branding/scripts/tigris.js"></script>
 <title>Merge Tracking Functional Specification</title>
 </head>

 <body>
 <div class="h1">
 <h1>Merge Tracking Functional Specification</h1>

 <p style="color: red">*** UNDER CONSTRUCTION ***</p>

 <p><a href="index.html">Merge tracking</a> functional specification.
 Describes Subversion 1.5.0, except where noted as
 <i>unimplemented.</i></p>

 <p style="color: red">TODO: Describe how each <a
 href="requirements.html">requirement</a> will actually function for
 Subversion.  Remove redundancies.</p>

 <div class="h2" id="diff-status">
 <h2>Diff/Status operations</h2>

 <p>Output is shown the same as pre-Merge Tracking, except for:</p>

 <ul>
   <li>Diffs pretty-print changes to merge info in an easily
     human-readable form.</li>

   <li>Diffs sometimes report spurious property changes from merge info
     (bug?).</li>

   <li>Status represents changes to the merge info for the root of a
     tree as a property change.</li>
 </ul>

 </div>  <!-- diff-status -->


 <div class="h2" id="copy-move">
 <h2>Copy/Move operations</h2>

 <p>Copy and move operations handle two types of merge info:</p>

 <dl>
   <dt>Explicit</dt>
   <dd>The pre-existing value of the <code>svn:mergeinfo</code>
     property on the source path.</dd>

   <dt>Implicit</dt>
   <dd>All revisions represented by the object at the source path (from
     its "appeared in" revision to its current revision).</dd>
 </dl>

 <div class="h3" id="ra-copy-move">
 <h3>Repository Access operation</h3>

 <p>Copy/move operations which contact the repository include:</p>

 <ul>
   <li>WC to URL (<i>code in progress, tests complete</i>, copy test
     #11 still failing over ra_dav)</li>
   <li>URL to WC</li>
   <li>URL to URL</li>
 </ul>

 <p>These operations always propagate both explicit and implicit merge
 info.  Other than the inclusion of merge info, operation is
 effectively the same as pre-Merge Tracking.</p>
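
 <p>As a rough illustration of how the two kinds of merge info combine
 on a copy, here is a minimal Python sketch.  The function name
 <code>copy_mergeinfo</code> and the dictionary-of-sets representation
 are assumptions made for this example; they are not part of
 Subversion's API.</p>

 <pre>
```python
def copy_mergeinfo(explicit, source_path, appeared_in, current):
    """Return the merge info a copy destination would carry.

    explicit: dict mapping a merge source path to a set of revision
              numbers (the source's existing svn:mergeinfo value).
    source_path, appeared_in, current: describe the copy source itself;
              its own history is the implicit merge info.
    """
    # Start from a copy so the caller's explicit merge info is untouched.
    result = {path: set(revs) for path, revs in explicit.items()}
    # Implicit merge info: every revision the source object has lived through.
    implicit = set(range(appeared_in, current + 1))
    result.setdefault(source_path, set()).update(implicit)
    return result


trunk_explicit = {"/branches/feature": {5, 6, 7}}
info = copy_mergeinfo(trunk_explicit, "/trunk", 1, 9)
print(sorted(info["/trunk"]))             # the implicit part
print(sorted(info["/branches/feature"]))  # the explicit part
```
 </pre>

 <p>In this model the destination of a copy of <code>/trunk</code>
 carries the explicit record of what was merged from the branch, plus
 the implicit record of the source's own revisions.</p>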

 </div>  <!-- ra-copy-move -->

 <div class="h3" id="wc-wc-copy-move">
 <h3>Working Copy to Working Copy operation</h3>

 <p>Pre-Merge Tracking, WC to WC operations occurred offline (i.e. with
 no repository access).  This is a typical behavior of refactoring
 tools (e.g. IDEs like Eclipse), and is very useful when offline
 (e.g. on an airplane or subway, or at a cafe).</p>

 <p>However, to propagate merge info during copy/move operations,
 access to both a path's comprehensive merge info and its history is
 necessary.  To preserve offline operation, the Merge Tracking
 implementation supports two modes:</p>

 <ul>
   <li>A compatibility mode, which neither contacts the repository nor
     does any merge info propagation (unless a copy source's merge info
     has been locally modified, in which case its value is propagated
     the same as any other Subversion property).</li>

   <li>A mode which requires repository access (i.e. isn't offline),
     but which propagates all merge info from source path to
     destination (<i>unimplemented</i>, start with copy test #31).</li>
 </ul>

 <p>This behavior is comparable to the difference between <code>svn
 status</code> and <code>svn status -u</code>.</p>

 <p>While some state indicating delayed merge info retrieval and
 handling could instead be stored in the WC to preserve offline
 operation, there are complications with this when subsequent
 uncommitted revert operations should change the merge info (we'd have
 to store negative merge info in the WC).</p>

 </div>  <!-- wc-wc-copy-move -->

 </div>  <!-- copy-move -->

 <div class="h2" id="meta-data">
 <h2>Merge-related Meta Data</h2>

 <p>Merge Tracking meta data is stored in housekeeping properties
 (e.g. <code>svn:mergeinfo</code>).</p>

 <div class="h3" id="meta-data-mainpulation">
 <h3>Meta Data Manipulation</h3>

 <p>While direct manipulation of housekeeping properties can be used to
 change merge info, commands to manipulate this information have been
 provided.  Either style of operation supports adjustment of merge info
 when <a href="requirements.html#manual-merge">manual merges</a> occur,
 and can also be used to fulfill the <a
 href="requirements.html#revision-blocking">block changes undesired for
 merge</a> requirement (later, this might be better addressed by a
 separate housekeeping property).</p>

 <ul>
   <li><code>merge --record-only</code> adds (or subtracts, if a
     reversed revision range is supplied) merge info for a path
     <i>without performing the actual merge</i>.</li>

   <li><code>propedit</code>/<code>propset</code> changes merge info
     for a path.</li>

   <li><code>propdel</code> removes merge info for a path.</li>
 </ul>
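
 <p>The add-or-subtract behavior of <code>merge --record-only</code>
 can be modeled with plain set arithmetic.  The sketch below is only an
 illustration: the function <code>record_only</code> and its
 dictionary-of-sets representation are hypothetical, and it assumes
 that a range <code>START:END</code> covers revisions
 <code>START+1</code> through <code>END</code>.</p>

 <pre>
```python
def record_only(mergeinfo, source, start, end):
    """Model `merge --record-only -r START:END` on a merge info mapping.

    mergeinfo maps a merge source path to a set of merged revisions.
    A forward range records revisions START+1..END as merged; a reversed
    range removes the same revisions from the record.
    """
    low, high = min(start, end), max(start, end)
    revisions = set(range(low + 1, high + 1))
    updated = {path: set(revs) for path, revs in mergeinfo.items()}
    current = updated.get(source, set())
    if start == low:
        current = current | revisions      # forward range: add
    else:
        current = current - revisions      # reversed range: subtract
    if current:
        updated[source] = current
    else:
        updated.pop(source, None)          # nothing left to record
    return updated


recorded = record_only({}, "/branches/feature", 10, 12)
blocked_undone = record_only(recorded, "/branches/feature", 12, 11)
print(recorded)
print(blocked_undone)
```
 </pre>

 <p>No file content changes in either call; only the recorded merge
 info for the path is adjusted.</p>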

 </div>  <!-- meta-data-mainpulation -->

 <div class="h3" id="meta-data-audit">
 <h3>Meta Data Audit and Query</h3>

 <p>These features may or may not be completed for 1.5.0.</p>

 <ul>
   <li>Change Set Merge Availability (TODO)</li>
   <li>Find Change Set (TODO)</li>
   <li><a href="#commutative-author-and-rev">Commutative Author and Revision
   Auditing</a></li>
 </ul>

 <div class="h4" id="commutative-author-and-rev">
 <h4>Commutative Author and Revision Auditing</h4>

 <div class="h5" id="auditing-scope">
 <h5>Scope</h5>

 <p>Most commands which show username and revision information should
 also respect merge information and support <a
 href="requirements.html#commutative-author-and-rev">Commutative
 Auditing</a>.  These commands, collectively referred to as
 <em>auditing commands</em>, are:</p>

 <ul>
   <li><code>svn log</code></li>
   <li><code>svn blame</code></li>
   <li><code>svn status --show-updates</code></li>
 </ul>

 <p><code>svn info</code> is purposely not included in this list, on
 the grounds that one would typically need more information than it can
 reasonably provide.</p>

 <p>A new switch, <code>--merge-sensitive</code>, along with a corresponding
 single-character shortcut, will be introduced for the auditing commands.
 Using it will enable these commands to show the additional information gleaned
 from parsing and processing the merge info on the targets in question.  This
 switch will also work with <code>--xml</code> to include additional merge
 information.  The new functionality added by <code>--merge-sensitive</code> is
 as follows.</p>

 <dl>
   <dt><code>svn log</code></dt>
   <dd><p>The original log message, in the current format, with the
   addition of a list of revisions and merge source paths that have
   been merged into the target.  The output for <code>log</code> should
   be consistent with the <code>diff</code> output for the
   <code>svn:mergeinfo</code> property.</p>

   <p>The <code>--verbose</code> switch will output the log information
   for the merged revisions as well.  This output may be in the style
   of <code>svnmerge.py</code>: the primary log message, followed by
   each of the original log messages indented with separators between
   them.</p>
   </dd>

   <dt><code>svn blame</code></dt>
   <dd>Two additional columns for each line, with the original revision
   and author of that line.  Unlike other commands, we do not need to
   worry about multiple source revisions, because each line can have at
   most one author.</dd>

   <dt><code>svn status --show-updates</code></dt>
   <dd>Add additional columns, reflecting the last original authors and
   revisions.</dd>
 </dl>

 </div>  <!-- auditing-scope -->

 <div class="h5" id="auditing-questions">
 <h5>Pending Questions</h5>

 <ul>
   <li>How will <code>--merge-sensitive</code> behave for commits which remove
   merge info (e.g. reverts)?</li>

   <li>In the case of <code>svn log</code>, would the user be better served if we
   just included the original revision logs in line with the logs (i.e., no
   special indentation, etc.)?</li>

   <li>What about <code>svn ls --verbose</code>, which also shows revisions and
   usernames?</li>
 </ul>

 </div>  <!-- auditing-questions -->

 <div class="h5" id="auditing-extra-credit">
 <h5>Additional Features</h5>

 <p>Although not part of the initial implementation, additional features have
 been suggested:</p>

 <ul>
   <li>A configuration option to always enable <code>--merge-sensitive</code>.
   </li>
 </ul>

 </div>  <!-- auditing-extra-credit -->

 </div>  <!-- commutative-author-and-rev -->

 </div>  <!-- meta-data-audit -->

 </div>  <!-- meta-data -->

 <div class="h2" id="repeated-merge">
 <h2>Repeated Merge</h2>

 <p>There are two general schemes for solving the <a
 href="requirements.html#repeated-merge">repeated merge</a> problem.
 Subversion 1.5 uses the <a href="#mrca-merge">Most Recent Common
 Ancestor (MRCA)</a> approach.  If a later version of Subversion
 (e.g. 2.0) overhauls the Merge Tracking implementation, it'll likely
 use the <a href="#as-merge">Ancestry Set (AS)</a> approach.</p>

 <p>Either solution also supports the <a
 href="requirements.html#cherry-picking">cherry picking</a>, <a
 href="requirements.html#rollback-merge">rollback</a>, and <a
 href="requirements.html#properties">property merging</a> use cases.  A
 <a href="requirements.html#merge-previews">merge preview</a> which is
 lighter-weight than an uncommitted merge into a WC is not
 supported.</p>

 <div class="h3" id="mrca-merge">
 <h3>The Most Recent Common Ancestor approach</h3>

 <p>In this scheme, an optional set of merge sources is recorded in each
 node-revision.  When asked to do a merge with only one source (that
 is, just <code>svn merge URL</code>, with no second argument), you
 compute the most recent common ancestor and do a three-way merge
 between the common ancestor, the given URL, and the WC.</p>

 <p>To compute the most recent common ancestor, you chain off the immediate
 predecessors of each node-revision.  The immediate predecessors are
 the direct predecessor (the most recent node-revision within the node)
 and the merge sources.  An interleaved breadth-first search should
 find the most recent common ancestor.</p>
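
 <p>The interleaved breadth-first search can be sketched as follows.
 This is an illustration under stated assumptions, not the
 implementation: node-revisions are represented as strings, the
 <code>predecessors</code> mapping and the function name are
 hypothetical, and "most recent" is approximated by search depth.</p>

 <pre>
```python
from collections import deque


def most_recent_common_ancestor(predecessors, left, right):
    """Interleaved breadth-first search over node-revision predecessors.

    predecessors maps a node-revision to its immediate predecessors: the
    direct predecessor plus any merge sources.
    """
    if left == right:
        return left
    seen = {"left": {left}, "right": {right}}
    queues = {"left": deque([left]), "right": deque([right])}
    while queues["left"] or queues["right"]:
        # Alternate sides so both searches advance at the same pace.
        for side, other in (("left", "right"), ("right", "left")):
            if not queues[side]:
                continue
            node = queues[side].popleft()
            for parent in predecessors.get(node, ()):
                if parent in seen[other]:
                    return parent
                if parent not in seen[side]:
                    seen[side].add(parent)
                    queues[side].append(parent)
    return None


history = {
    "trunk@3": ["trunk@2"],
    "trunk@2": ["trunk@1"],
    "branch@2": ["trunk@1"],              # branch created from trunk@1
    "branch@4": ["branch@2", "trunk@2"],  # trunk@2 was merged into the branch
}
print(most_recent_common_ancestor(history, "trunk@3", "branch@4"))
```
 </pre>

 <p>Because <code>branch@4</code> lists <code>trunk@2</code> as a merge
 source, the search stops there rather than walking back to the
 original branch point.</p>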

 </div>  <!-- mrca-merge -->

 <div class="h3" id="as-merge">
 <h3>The Ancestry Set approach</h3>

 <p>In this scheme, you record the full ancestry set for each
 node-revision -- that is, the set of all changes which are accounted
 for in that node-revision.  (How you store this ancestry set is
 unimportant; the point is, you need a reasonably efficient way of
 determining it when asked.)  If you are asked to "svn merge URL", you
 apply the changes present in URL's ancestry but absent in WC's
 ancestry.  Note that this is not a single three-way merge; you may
 have to apply a large number of disjoint changes to the WC.</p>
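
 <p>Treating an ancestry set as a set of revision numbers (an
 assumption made purely for illustration, as is the function name
 <code>changes_to_apply</code>), the changes to merge are a set
 difference, which naturally falls apart into disjoint runs:</p>

 <pre>
```python
def changes_to_apply(source_ancestry, target_ancestry):
    """Return the changes in the source's ancestry that the target lacks,
    grouped into contiguous runs of revision numbers."""
    missing = sorted(set(source_ancestry) - set(target_ancestry))
    runs = []
    for rev in missing:
        if runs and rev == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], rev)   # extend the current run
        else:
            runs.append((rev, rev))         # start a new, disjoint run
    return runs


url_ancestry = set(range(1, 11))
wc_ancestry = {1, 2, 3, 5, 6, 9}
print(changes_to_apply(url_ancestry, wc_ancestry))
```
 </pre>

 <p>Each run in the result would be applied to the WC as a separate
 change, which is why this is not a single three-way merge.</p>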

 <p>For a longer description of this approach, see the <a
 href="/design.html#model.merging-and-ancestry">"Merging and Ancestry"
 section</a> of the original <a href="/design.html">design doc</a>.</p>

 <div class="h4" id="aslb-merge">
 <h4>Ancestry-Sensitive Line-Based Merge</h4>

 <p>Make 'hunks' of contextually-merged text sensitive to ancestry.</p>

 <p>A high-resolution version of <a
 href="requirements.html#repeated-merge">repeated merge</a>.  Rather
 than tracking whole changesets, we track the lineage of specific lines
 of code within a file.  The basic idea is that when re-merging a
 particular hunk of code, the contextual-merging process is aware that
 certain lines of code already represent the merging of particular
 lines of development.  Jack Repenning has a great example of this from
 ClearCase (see ASCII diagram below).</p>

 <p>See the <a href="../variance-adjusted-patching.html">variance
 adjusted patching</a> document for an extended discussion of how to
 implement this by composing diffs; see <a
 href="http://svn.collab.net/svn-doxygen/svn__diff_8h.html#a11"
 ><code>svn_diff_diff4()</code></a> for an implementation of same.  We
 may be closer to ancestry-sensitive merging than we think.</p>

 <p>Here's an example demonstrating how individual lines of code can be
 tracked.  In this diagram, we're drawing the lineage of a single file,
 with time flowing downwards.  The file begins life with three lines of
 text, "1\n2\n3\n".  The file then splits into two lines of
 development.</p>

 <pre>
                     1
                     2
                     3
                   /   \
                  /     \
                 /       \
             one           1
             two           2.5
             three         3
              |     \      |
              |      \     |
              |       \    |
              |        \   |
              |         \ one                ## This node is a human's
              |           two-point-five     ## merge of two sides.
              |           three
              |            |
              |            |
              |            |
             one          one
             Two          two-point-five
             three        newline
                \         three
                 \         |
                  \        |
                   \       |
                    \      |
                     \     |
                      \    |
                       \   |
                        \  |
                          one                ## This node is a human's
                          Two-point-five     ## merge of the changes
                          newline            ## since the last merge.
                          three
 </pre>

 <p>It's the second merge that's important here.</p>

 <p>In a system like Subversion, the second merge of the left branch to
 the right will fail miserably: the whole file's contents will be
 placed within conflict markers.  That's because it's trying to dumbly
 apply a patch that changes "1\n2\n3" to "one\nTwo\nthree", and the
 target file has no matching lines at all.</p>

 <p>A smarter system (like ClearCase) would remember that the previous
 merge had happened, and specifically notice that the lines "one" and
 "three" are the results of that previous merge.  Therefore, it would
 ask the human only to deal with the "Two" versus "two-point-five"
 conflict; the earlier changes ("1\n2\n3" to "one\ntwo\nthree") would
 already be accounted for.</p>
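
 <p>The difference between the two behaviors can be imitated with a
 deliberately crude model, in which a changed line applies cleanly only
 if the line it replaces is still present in the target.  The function
 <code>conflicting_changes</code> is hypothetical and is far simpler
 than a real contextual merge; it only counts which changes a human
 would be asked about.</p>

 <pre>
```python
def conflicting_changes(base, theirs, target):
    """Crude model: a change from base to theirs applies cleanly only if
    the base line it replaces is still present in the target."""
    changed = [(old, new) for old, new in zip(base, theirs) if old != new]
    return [(old, new) for old, new in changed if old not in target]


target = ["one", "two-point-five", "newline", "three"]  # right side
theirs = ["one", "Two", "three"]                        # left side

# Without ancestry: diff against the original branch point.
naive = conflicting_changes(["1", "2", "3"], theirs, target)
# With ancestry: diff against the left side as of the previous merge.
aware = conflicting_changes(["one", "two", "three"], theirs, target)
print(len(naive), len(aware))   # prints: 3 1
```
 </pre>

 <p>With the original branch point as the base, every line conflicts;
 with the previous merge as the base, only the "Two" versus
 "two-point-five" change remains.</p>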

 </div>  <!-- aslb-merge -->

 </div>  <!-- as-merge -->

 <div class="h3">
 <h3>Comparisons, Arguments, and Questions</h3>

 <p>AS allows you to merge changes from a branch out of order, without
 doing any bookkeeping.  MRCA requires you to merge changes from a
 branch in order.</p>

 <p>MRCA is simpler to implement, since it results in a three-way merge
 (which is well-understood by Subversion).  However, it may not handle
 all edge cases.  For instance, it may break down faster if the merging
 topology is not hierarchical.</p>

 <p>MRCA may be easier for users to understand, even though AS is
 probably simpler to a mathematician.</p>

 <p>Consistency with other modern version control systems is
 desirable.</p>

 <p>If a user asks to merge a directory, should we apply MRCA or AS to
 each subdirectory and file to determine what ancestor(s) to use?  Or
 should we apply MRCA or AS just once, to the directory itself?  The
 latter approach seems simpler and more efficient, but will break down
 quickly if the user wants to merge subdirectories of a branch in
 advance of merging in the whole thing.</p>

 </div>  <!-- h3 -->

 </div>  <!-- repeated-merge -->


 <div class="h2" id="conflict-resolution">
 <h2>Merge Conflict Resolution</h2>

 <p>Merging inevitably produces conflicts which cannot be resolved by
 an algorithm alone.  In such a case, human intervention is required to
 resolve the conflicts.  The merge algorithm used by Subversion's Merge
 Tracking implementation makes this problem worse, since it breaks a
 requested merge range into several merges to avoid <a
 href="requirements.html#repeated-merge">repeating merges</a> which
 have already been applied to a merge target or its children.</p>

 <p>To help alleviate the pain of conflict resolution, a merge conflict
 resolution callback can be employed by Subversion clients
 (<i>unimplemented</i>).  This callback is invoked whenever merge
 conflicts are encountered, and can take steps like launching a
 graphical merge tool (for interactive conflict resolution), or
 following a pre-specified directive like "always use the version from
 my merge source".  This last implementation can be used to support the
 <a href="requirements.html#automated-merge">SCM automated merge</a>
 use case.</p>
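
 <p>Since the callback is unimplemented, the following is only a sketch
 of the shape such a mechanism might take.  Every name here
 (<code>resolve_conflicts</code>, <code>always_use_source</code>, and
 the <code>path</code>/<code>mine</code>/<code>theirs</code> keys) is
 hypothetical and does not correspond to any Subversion API.</p>

 <pre>
```python
def always_use_source(conflict):
    """Pre-specified directive: take the merge source's version."""
    return conflict["theirs"]


def leave_for_human(conflict):
    """Decline to resolve, so the conflict stays marked for a human."""
    return None


def resolve_conflicts(conflicts, callback):
    """Invoke the callback once per conflict and collect the chosen text.

    Each conflict is a dict with hypothetical keys "path", "mine" and
    "theirs"; a callback may return None to leave a conflict unresolved.
    """
    resolved, unresolved = {}, []
    for conflict in conflicts:
        choice = callback(conflict)
        if choice is None:
            unresolved.append(conflict["path"])
        else:
            resolved[conflict["path"]] = choice
    return resolved, unresolved


conflicts = [{"path": "foo.c", "mine": "two-point-five", "theirs": "Two"}]
print(resolve_conflicts(conflicts, always_use_source))
print(resolve_conflicts(conflicts, leave_for_human))
```
 </pre>

 <p>An interactive client would supply a callback that launches a merge
 tool instead; an automated merge would supply a fixed directive like
 the first one.</p>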

 <p>In a future release, the command-line client may supply a
 merge conflict resolution callback which will behave much like
 <em>svk</em>: in interactive mode, displaying some context for
 each conflict and prompting for how to resolve it; in
 non-interactive mode, taking directives beforehand
 (<i>unimplemented</i>).</p>

 <p>Related discussion from the dev@ mailing list can be found
 here:</p>

 <ul>
   <li><a
   href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&amp;msgNo=121756">
   Feedback solicited from IDE developers</a></li>

   <li><a
   href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&amp;msgNo=121263"
   >Original API proposal</a> (likely requires changes)</li>
 </ul>

 <p><a href="http://subversion.tigris.org/issues/show_bug.cgi?id=2022"
 >Issue #2022</a> is loosely related.</p>

 <div class="h3" id="distributable-resolution">
 <h3>Distribution of Conflict Resolution</h3>

 <p>No explicit facility is provided for distribution of conflict
 resolution.  To support this use case, developers can co-ordinate with
 each other to resolve merge conflicts on portions of a tree, and trade
 patches.</p>

 </div>  <!-- distributable-resolution -->

 </div>  <!-- conflict-resolution -->


 <div class="h2" id="migration-and-interoperability">
 <h2>Migration and Interoperability</h2>

 <div class="h3" id="migration">
 <h3>Migration</h3>

 <p>No explicit steps are necessary to migrate the content of a
 pre-Merge Tracking repository.  Only an upgrade to Subversion 1.5.0 is
 necessary.</p>

 <p>TODO: Merge meta data from <code>svnmerge.py</code>.  Dan Berlin has
 written Python code to perform this migration; it needs to be made
 available in the <code>tools/server-side/</code> area of the
 distribution.</p>

 </div>  <!-- migration -->

 <div class="h3" id="interoperability">
 <h3>Interoperability</h3>

 <p>Executive summary for client/repository inter-op:</p>

 <ul>
   <li>Older Subversion clients may <a href="requirements.html#compatibility"
   >interact with a 1.5.x+ Subversion repository</a>, but will continue
   to lack Merge Tracking functionality for:

   <ul>
     <li>Recording meta data about any merges performed.</li>
     <li>Using merge meta data to avoid <a
     href="requirements.html#repeated-merge">repeated merging</a>.</li>
   </ul>
   </li>

   <li>1.5.x+ Subversion clients may interact with older Subversion
   repositories, with Merge Tracking functionality effectively
   neutralized.</li>
 </ul>

 <p>Gory detail for client/repository inter-op:</p>

 <ul>
   <li>A repository 1.4.x- doesn't provide any way to retrieve
   inherited merge info for a path (regardless of client version).  For
   a 1.5.x+ client which could theoretically make use of any merge info
   available to it, this will typically neutralize its Merge Tracking
   functionality.  The one case where merge info might come into play
   is when the merge info for a path is available locally (e.g. in the
   client's WC); in this case, repeated merges may be avoided.</li>

   <li>A 1.5.x client will record merge tracking meta data for merges
   performed, regardless of repository version.  However, a repository
   1.4.x- won't know to do anything special with this merge info.  When
   the repository is upgraded to 1.5.x+, we'll retain this merge info
   in the <code>svn:mergeinfo</code> property, but I'm not yet clear on
   what'll happen to the SQLite merge info index.  We may need some sort
   of upgrade path here, but don't have one yet, and aren't promising
   one.</li>
 </ul>

 <p>Subversion <a href="requirements.html#dump-load">dump files
 continue to be fully portable</a> between pre- and post-Merge Tracking
 versions of Subversion.</p>

 </div>  <!-- interoperability -->

 </div>  <!-- migration-and-interoperability -->


 <div class="h2" id="related-documents">
 <h2>Related Documents and Discussion</h2>

 <ul>
   <li><a href="http://subversion.tigris.org/merge-tracking/summit.html"
   >CollabNet customer Merge Tracking Summit</a></li>

   <li><a href="http://www.codeville.org/">Codeville</a> is reputed to
   excel both in its storage of line history in the <em>weave</em>
   format and in its corresponding merge algorithm:

   <ul>
     <li><a href="http://revctrl.org/PreciseCodevilleMerge">"Precise
     Codeville merge"</a> algorithm and Python implementation.  The
     algorithm takes into account line history, history points where
     they came from, the ability to retrieve ancestors' text as needed,
     and a snapshot of the current file.  It purports accuracy where
     <a href="http://revctrl.org/CodevilleMerge">other algorithms fall
     down</a>.</li>

     <li>Bram Cohen describes <a
     href="http://thread.gmane.org/gmane.comp.version-control.revctrl/2"
     >the merge algorithm</a> (May 2005)</li>
   </ul>
   </li>

   <li><a
   href="http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_base/notes/structure">Structure
   of the Subversion FS BerkeleyDB backend</a></li>

 </ul>

 </div>  <!-- related-documents -->

 <p>$Date$</p>

 </div>  <!-- h1 -->
 </body>
 </html>
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	<html xmlns="http://www.w3.org/1999/xhtml">
	<head>
	<style type="text/css"> /* <![CDATA[ */
	@import "../branding/css/tigris.css";
	@import "../branding/css/inst.css";
	/* ]]> */</style>
	<link rel="stylesheet" type="text/css" media="print"
	href="../branding/css/print.css"/>
	<script type="text/javascript" src="../branding/scripts/tigris.js"></script>
	<title>Merge Tracking Functional Specification</title>
	</head>

	<body>
	<div class="h1">
	<h1>Merge Tracking Functional Specification</h1>

	<p style="color: red">* UNDER CONSTRUCTION *</p>

	<p><a href="index.html">Merge tracking</a> functional specification.
	Describes Subversion 1.5.0, except where noted as
	<i>unimplemented.</i></p>

	<p style="color: red">TODO: Describe how each <a
	href="requirements.html">requirement</a> will actually function for
	Subversion. Remove redundancies.</p>

	<div class="h2" id="diff-status">
	<h2>Diff/Status operations</h2>

	<p>Output is shown the same as pre-Merge Tracking, except for:</p>

	<ul>
	<li>Diffs pretty-print changes to merge info in an easily
	human-readable form.</li>

	<li>Diffs sometimes report spurious property changes from merge info
	(bug?).</li>

	<li>Status represents changes to the merge info for the root of a
	tree as a property change.</li>
	</ul>

	</div> <!-- diff-status -->


	<div class="h2" id="copy-move">
	<h2>Copy/Move operations</h2>

	<p>Copy and move operations handle two types of merge info:</p>

	<dl>
	<dt>Explicit</dt>
	<dd>The pre-existing value of the <code>svn:mergeinfo</code>
	property on the source path.</dd>

	<dt>Implicit</dt>
	<dd>All revisions represented by the object at the source path (from
	its "appeared in" revision to its current revision).</dd>
	</dl>

	<div class="h3" id="ra-copy-move">
	<h3>Repository Access operation</h3>

	<p>Copy/move operations which contact the repository include:</p>

	<ul>
	<li>WC to URL (<i>code in progress, tests complete</i>, copy test
	#11 still failing over ra_dav)</li>
	<li>URL to WC</li>
	<li>URL to URL</li>
	</ul>

	<p>These operations always propogate both explicit and implicit merge
	info. Other than the inclusion of merge info, operation is
	effectively the same as pre-Merge Tracking.</p>

	</div> <!-- ra-copy-move -->

	<div class="h3" id="wc-wc-copy-move">
	<h3>Working Copy to Working Copy operation</h3>

	<p>Pre-Merge Tracking, WC to WC operations occurred offline (e.g. with
	no repository access). This is a typical behavior of refactoring
	tools (e.g. IDEs like Eclipse), and is very useful when offline
	(e.g. on an airplane or subway, or at a cafe).</p>

	<p>However, to propogate merge info during copy/move operations,
	access to both a path's comprehensive merge info and its history is
	necessary. To preserve offline operation, the Merge Tracking
	implementation supports two modes:</p>

	<ul>
	<li>A compatibility mode, which neither contacts the repository, nor
	does any merge info propogation (unless a copy source's merge info
	has been locally modified, in which its value is propogated the as
	any Subversion revision property).</li>

	<li>A mode which requires repository access (e.g. isn't offline),
	but which propogates all merge info from source path to
	destination (<i>unimplemented</i>, start with copy test #31).</li>
	</ul>

	<p>This behavior is comparable to the difference between <code>svn
	status</code> and <code>svn status -u</code>.</p>

	<p>While some state indicating delayed merge info retrieval and
	handling could instead be stored in WC to preserve offline operation,
	there are complications with this when subsequent uncommited revert
	operations should change the merge info (we'd have to store negative
	merge info in the WC).</p>

	</div> <!-- wc-wc-copy-move -->

	</div> <!-- copy-move -->

	<div class="h2" id="meta-data">
	<h2>Merge-related Meta Data</h2>

	<p>Merge Tracking meta data is stored in housekeeping properties
	(e.g. <code>svn:mergeinfo</code>).</p>

	<div class="h3" id="meta-data-mainpulation">
	<h3>Meta Data Manipulation</h3>

	<p>While direct manipulation of housekeeping properties can be used to
	change merge info, commands to manipulate this information have been
	provided. Either style of operation supports adjustment of merge info
	when <a href="requirements.html#manual-merge">manual merges</a> occur,
	and can also be used to fulfill <a
	href="requirements.html#revision-blocking">block changes undesired for
	merge</a> (later, this might be better-addressed by a separate
	housekeeping property).</p>

	<ul>
	<li><code>merge --record-only</code> adds (or subtracts, if a
	reversed revision range is supplied) merge info for a path
	<i>without performing the actual merge</i>.</li>

	<li><code>propedit</code>/<code>propset</code> changes merge info
	for a path.</li>

	<li><code>propdel</code> removes mere info for a path.</li>
	</ul>

	</div> <!-- meta-data-mainpulation -->

	<div class="h3" id="meta-data-audit">
	<h3>Meta Data Audit and Query</h3>

	<p>These features may or may not be completed for 1.5.0.</p>

	<ul>
	<li>Change Set Merge Availability (TODO)</li>
	<li>Find Change Set (TODO)</li>
	<li><a href="#commutative-author-and-rev">Commutative Author and Revision
	Reporting</a></li>
	</ul>

	<div class="h4" id="commutative-author-and-rev">
	<h4>Commutative Author and Revision Auditing</h4>

	<div class="h5" id="auditing-scope">
	<h5>Scope</h5>

	<p>Most commands which show username and merge information should also
	respect merge information and support <a
	href="requirements.html#commutative-author-and-rev">Commutative
	Auditing</a>. These commands, collectively referred to <em>auditing
	commands</em>, are:</p>

	<ul>
	<li><code>svn log</code></li>
	<li><code>svn blame</code></li>
	<li><code>svn status --show-updates</code></li>
	</ul>

	<p><code>svn info</code> is purposely not included in this list, on
	the grounds that one would typically need more information than it can
	reasonably provide.</p>

	<p>A new switch, <code>--merge-sensitive</code>, along with a corresponding
	single-character shortcut, will be introduced for the auditing commands.
	Using it will enable these commands to show the additional information gleaned
	from parsing and processing the merge info on the targets in question. This
	switch will also work with <code>--xml</code> to include additional merge
	information. The new functionality added by <code>--merge-sensitive</code> is
	as follows.</p>

	<dl>
	<dt><code>svn log</code></dt>
	<dd><p>The original log message, in the current format, with the
	addition of a list of revisions and merge source paths that have
	been merged into the target. The output for <code>log</code> should
	be consistent with the <code>diff</code> output for the
	<code>svn:mergeinfo</code> property.</p>

	<p>The <code>--verbose</code> switch will output the log information
	for the merged revisions as well. This output may be in the style
	of <code>svnmerge.py</code>: the primary log message, followed by
	each of the original log messages indented with separators between
	them.</p>
	</dd>

	<dt><code>svn blame</code></dt>
	<dd>Two additional columns for each line, with the original revision
	and author of that line. Unlike other commands, we do not need to
	worry about multiple source revisions, because each line can have at
	most one author.</dd>

	<dt><code>svn status --show-updates</code></dt>
	<dd>Add additional columns, reflecting the last original authors and
	revisions.</dd>
	</dl>

	</div> <!-- auditing-scope -->

	<div class="h5" id="auditing-questions">
	<h5>Pending Questions</h5>

	<ul>
	<li>How will <code>--merge-sensitive</code> behave for commits which remove
	merge info (e.g. reverts)?</li>

	<li>In the case of <code>svn log</code>, would the user be better served if we
	just included the original revision logs in line with the logs (i.e., no
	special indentation, etc.)?</li>

	<li>What about <code>svn ls --verbose</code>, which also shows revisions and
	usernames?</li>
	</ul>

	</div> <!-- auditing-questions -->

	<div class="h5" id="auditing-extra-credit">
	<h5>Additional Features</h5>

	<p>Although not part of the initial implementation, additional features have
	been suggested:</p>

	<ul>
	<li>A configuration option to always enable <code>--merge-sensitive</code>.
	</li>
	</ul>

	</div> <!-- auditing-extra-credit -->

	</div> <!-- commutative-author-and-rev -->

	</div> <!-- meta-data-audit -->

	</div> <!-- meta-data -->

	<div class="h2" id="repeated-merge">
	<h2>Repeated Merge</h2>

	<p>There are two general schemes for solving the <a
	href="requirements.html#repeated-merge">repeated merge</a> problem.
	Subversion 1.5 uses the <a href="#mrca-merge">Most Recent Common
	Ancestor (MRCA)</a> approach. If a later version of Subversion
	(e.g. 2.0) overhauls the Merge Tracking implementation, it'll likely
	use the <a href="#as-merge">Ancestry Set (AS)</a> approach.</p>

	<p>Either solution also supports the <a
	href="requirements.html#cherry-picking">cherry picking</a>, <a
	href="requirements.html#rollback-merge">rollback</a>, and <a
	href="requirements.html#properties">property merging</a> use cases. A
	<a href="requirements.html#merge-previews">merge preview</a> which is
	lighter-weight than an uncommitted merge into a WC is not
	supported.</p>

	<div class="h3" id="mrca-merge">
	<h3>The Most Recent Common Ancestor approach</h3>

	<p>In this scheme, An optional set of merge sources in each
	node-revision. When asked to do a merge with only one source (that
	is, just <code>svn merge URL</code>, with no second argument), you
	compute the most recent ancestor and do a three-way merge between the
	common ancestor, the given URL, and the WC.</p>

	<p>To compute the most recent ancestor, you chain off the immediate
	predecessors of each node-revision. The immediate predecessors are
	the direct predecessor (the most recent node-revision within the node)
	and the merge sources. An interleaved breadth-first search should
	find the most recent common ancestor.</p>

	</div> <!-- mrca-merge -->

	<div class="h3" id="as-merge">
	<h3>The Ancestry Set approach</h3>

	<p>In this scheme, you record the full ancestry set for each
	node-revision -- that is, the set of all changes which are accounted
	for in that node-revision. (How you store this ancestry set is
	unimportant; the point is, you need a reasonably efficient way of
	determining it when asked.) If you are asked to <code>svn merge URL</code>, you
	apply the changes present in URL's ancestry but absent in WC's
	ancestry. Note that this is not a single three-way merge; you may
	have to apply a large number of disjoint changes to the WC.</p>
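	<p>As a rough sketch of the idea, assume each ancestry set is simply a
	set of revision numbers (an assumption for illustration; the scheme
	itself does not prescribe a representation). The changes to apply are
	then a set difference, and they need not be contiguous. The function
	name below is hypothetical.</p>

```python
def changes_to_apply(source_ancestry, target_ancestry):
    """Changes in the source's ancestry but absent from the target's, oldest first."""
    return sorted(set(source_ancestry) - set(target_ancestry))


# Hypothetical ancestry sets for the merge source URL and for the WC.
url_ancestry = {3, 4, 6, 8, 9}
wc_ancestry = {3, 4, 8}

missing = changes_to_apply(url_ancestry, wc_ancestry)  # [6, 9]
```

	<p>Here revisions 6 and 9 are disjoint, so they would be applied as
	two separate changes rather than as one three-way merge.</p>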

	<p>For a longer description of this approach, see the <a
	href="/design.html#model.merging-and-ancestry">"Merging and Ancestry"
	section</a> of the original <a href="/design.html">design doc</a>.</p>

	<div class="h4" id="aslb-merge">
	<h4>Ancestry-Sensitive Line-Based Merge</h4>

	<p>Make 'hunks' of contextually-merged text sensitive to ancestry.</p>

	<p>A high-resolution version of <a
	href="requirements.html#repeated-merge">repeated merge</a>. Rather
	than tracking whole changesets, we track the lineage of specific lines
	of code within a file. The basic idea is that when re-merging a
	particular hunk of code, the contextual-merging process is aware that
	certain lines of code already represent the merging of particular
	lines of development. Jack Repenning has a great example of this from
	ClearCase (see ASCII diagram below).</p>

	<p>See the <a href="../variance-adjusted-patching.html">variance
	adjusted patching</a> document for an extended discussion of how to
	implement this by composing diffs; see <a
	href="http://svn.collab.net/svn-doxygen/svn__diff_8h.html#a11"
	><code>svn_diff_diff4()</code></a> for an implementation of same. We
	may be closer to ancestry-sensitive merging than we think.</p>

	<p>Here's an example demonstrating how individual lines of code can be
	tracked. In this diagram, we're drawing the lineage of a single file,
	with time flowing downwards. The file begins life with three lines of
	text, "1\n2\n3\n". The file then splits into two lines of
	development.</p>

	<pre>
	      1
	      2
	      3
	    /   \
	   /     \
	  /       \
	one         1
	two         2.5
	three       3
	 |    \     |
	 |     \    |
	 |      \   |
	 |       \  |
	 |        \ one               ## This node is a human's
	 |          two-point-five    ## merge of two sides.
	 |          three
	 |          |
	 |          |
	 |          |
	one         one
	Two         two-point-five
	three       newline
	  \         three
	   \        |
	    \       |
	     \      |
	      \     |
	       \    |
	        \   |
	         \  |
	          \ |
	            one               ## This node is a human's
	            Two-point-five    ## merge of the changes
	            newline           ## since the last merge.
	            three
	</pre>

	<p>It's the second merge that's important here.</p>

	<p>In a system like Subversion, the second merge of the left branch to
	the right will fail miserably: the whole file's contents will be
	placed within conflict markers. That's because it's trying to dumbly
	apply a patch that changes "1\n2\n3" to "one\nTwo\nthree", and the
	target file has no matching lines at all.</p>

	<p>A smarter system (like ClearCase) would remember that the previous
	merge had happened, and specifically notice that the lines "one" and
	"three" are the results of that previous merge. Therefore, it would
	ask the human only to deal with the "Two" versus "two-point-five"
	conflict; the earlier changes ("1\n2\n3" to "one\ntwo\nthree") would
	already be accounted for.</p>
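	<p>The lack of matching context can be checked with a plain line
	comparison. The sketch below uses Python's <code>difflib</code> purely
	as an illustration; it does not implement ancestry-sensitive merging,
	and <code>matching_lines</code> is a hypothetical helper. The line
	lists are taken from the diagram above.</p>

```python
import difflib


def matching_lines(a, b):
    """Return the lines that a plain sequence comparison finds in both a and b."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    common = []
    for block in matcher.get_matching_blocks():
        common.extend(a[block.a:block.a + block.size])
    return common


original = ["1", "2", "3"]              # the file before the branches split
left_tip = ["one", "Two", "three"]      # the left branch before the second merge
right_tip = ["one", "two-point-five", "newline", "three"]  # the merge target

no_context = matching_lines(original, right_tip)    # nothing in common
shared = matching_lines(left_tip, right_tip)        # "one" and "three"
```

	<p>A patch from the original text finds no context in the target,
	while the lines that the first merge already carried over ("one" and
	"three") are the ones an ancestry-aware system could treat as
	accounted for.</p>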

	</div> <!-- aslb-merge -->

	</div> <!-- as-merge -->

	<div class="h3">
	<h3>Comparisons, Arguments, and Questions</h3>

	<p>AS allows you to merge changes from a branch out of order, without
	doing any bookkeeping. MRCA requires you to merge changes from a
	branch in order.</p>

	<p>MRCA is simpler to implement, since it results in a three-way merge
	(which is well-understood by Subversion). However, it may not handle
	all edge cases. For instance, it may break down faster if the merging
	topology is not hierarchical.</p>

	<p>MRCA may be easier for users to understand, even though AS is
	probably simpler to a mathematician.</p>

	<p>Consistency with other modern version control systems is
	desirable.</p>

	<p>If a user asks to merge a directory, should we apply MRCA or AS to
	each subdirectory and file to determine what ancestor(s) to use? Or
	should we apply MRCA or AS just once, to the directory itself? The
	latter approach seems simpler and more efficient, but will break down
	quickly if the user wants to merge subdirectories of a branch in
	advance of merging in the whole thing.</p>

	</div> <!-- h3 -->

	</div> <!-- repeated-merge -->


	<div class="h2" id="conflict-resolution">
	<h2>Merge Conflict Resolution</h2>

	<p>Merging inevitably produces conflicts which cannot be resolved by
	an algorithm alone. In such a case, human intervention is required to
	resolve the conflicts. The merge algorithm used by Subversion's Merge
	Tracking implementation makes this problem worse, since it breaks a
	requested merge range into several merges to avoid <a
	href="requirements.html#repeated-merge">repeating merges</a> which
	have already been applied to a merge target or its children.</p>
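	<p>To see why one requested range can turn into several merges,
	consider the following sketch. It assumes inclusive revision ranges
	given as <code>(start, end)</code> pairs and a hypothetical
	<code>unmerged_ranges</code> helper; the real handling of
	<code>svn:mergeinfo</code> is per-path and more involved than this.</p>

```python
def unmerged_ranges(requested, already_merged):
    """Split an inclusive (start, end) range around revisions already merged."""
    start, end = requested
    merged = set()
    for first, last in already_merged:
        merged.update(range(first, last + 1))

    ranges = []
    run_start = None
    for rev in range(start, end + 1):
        if rev in merged:
            # Close the current run of unmerged revisions, if one is open.
            if run_start is not None:
                ranges.append((run_start, rev - 1))
                run_start = None
        elif run_start is None:
            run_start = rev
    if run_start is not None:
        ranges.append((run_start, end))
    return ranges


# Hypothetical request: merge r10 through r20 when r12-r14 and r18 are
# already recorded as merged.
pieces = unmerged_ranges((10, 20), [(12, 14), (18, 18)])
```

	<p>In this example the single request becomes three merges (r10-r11,
	r15-r17, and r19-r20), each of which may raise its own conflicts.</p>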

	<p>To help alleviate the pain of conflict resolution, a merge conflict
	resolution callback can be employed by Subversion clients
	(<i>unimplemented</i>). This callback is invoked whenever merge
	conflicts are encountered, and can take steps like launching a
	graphical merge tool (for interactive conflict resolution), or
	following a pre-specified directive like "always use the version from
	my merge source". This last implementation can be used to support the
	<a href="requirements.html#automated-merge">SCM automated merge</a>
	use case.</p>
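	<p>Since the callback is unimplemented, the following is only a
	sketch of the shape such a mechanism could take. Every name in it
	(<code>Conflict</code>, <code>always_theirs</code>,
	<code>always_mine</code>, <code>resolve_all</code>) is hypothetical
	and none of it is Subversion API.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Conflict:
    """Hypothetical description of one conflicted path."""
    path: str
    mine: str     # content from the merge target
    theirs: str   # content from the merge source


# A resolver takes a conflict and returns the content to keep.
Resolver = Callable[[Conflict], str]


def always_theirs(conflict: Conflict) -> str:
    """Pre-specified directive: always use the version from the merge source."""
    return conflict.theirs


def always_mine(conflict: Conflict) -> str:
    """Pre-specified directive: always keep the merge target's version."""
    return conflict.mine


def resolve_all(conflicts: List[Conflict], resolver: Resolver) -> Dict[str, str]:
    """Invoke the resolver once per conflict and collect the chosen content."""
    return {conflict.path: resolver(conflict) for conflict in conflicts}


conflicts = [
    Conflict("src/main.c", mine="int x = 1;", theirs="int x = 2;"),
    Conflict("README", mine="old text", theirs="new text"),
]
resolved = resolve_all(conflicts, always_theirs)
```

	<p>An interactive client could pass a resolver that launches a
	graphical merge tool instead of one of the fixed directives.</p>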

	<p>In a future release, the command-line client may supply a
	merge conflict resolution callback which behaves much like
	<em>svk</em>: in interactive mode, it displays some context for
	each conflict and prompts for how to resolve it; in
	non-interactive mode, it takes directives beforehand
	(<i>unimplemented</i>).</p>

	<p>Related discussion from the dev@ mailing list can be found
	here:</p>

	<ul>
	<li><a
	href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&amp;msgNo=121756">
	Feedback solicited from IDE developers</a></li>

	<li><a
	href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&amp;msgNo=121263"
	>Original API proposal</a> (likely requires changes)</li>
	</ul>

	<p><a href="http://subversion.tigris.org/issues/show_bug.cgi?id=2022"
	>Issue #2022</a> is loosely related.</p>

	<div class="h3" id="distributable-resolution">
	<h3>Distribution of Conflict Resolution</h3>

	<p>No explicit facility is provided for distribution of conflict
	resolution. To support this use case, developers can co-ordinate with
	each other to resolve merge conflicts on portions of a tree, and trade
	patches.</p>

	</div> <!-- distributable-resolution -->

	</div> <!-- conflict-resolution -->


	<div class="h2" id="migration-and-interoperability">
	<h2>Migration and Interoperability</h2>

	<div class="h3" id="migration">
	<h3>Migration</h3>

	<p>No explicit steps are necessary to migrate the content of a
	pre-Merge Tracking repository. Only an upgrade to Subversion 1.5.0 is
	necessary.</p>

	<p>TODO: Merge meta data from svnmerge.py. Dan Berlin has written
	Python code to perform this migration; it needs to be made available
	in the <code>tools/server-side/</code> area of the distribution.</p>

	</div> <!-- migration -->

	<div class="h3" id="interoperability">
	<h3>Interoperability</h3>

	<p>Executive summary for client/repository inter-op:</p>

	<ul>
	<li>Older Subversion clients may <a href="requirements.html#compatibility"
	>interact with a 1.5.x+ Subversion repository</a>, but will continue
	to lack Merge Tracking functionality for:

	<ul>
	<li>Recording meta data about any merges performed.</li>
	<li>Using merge meta data to avoid <a
	href="requirements.html#repeated-merge">repeated merging</a>.</li>
	</ul>
	</li>

	<li>1.5.x+ Subversion clients may interact with older Subversion
	repositories, with Merge Tracking functionality effectively
	neutralized.</li>
	</ul>

	<p>Gory detail for client/repository inter-op:</p>

	<ul>
	<li>A 1.4.x or older repository doesn't provide any way to retrieve
	inherited merge info for a path (regardless of client version). For
	a 1.5.x+ client which could theoretically make use of any merge info
	available to it, this will typically neutralize its Merge Tracking
	functionality. The one case where merge info might come into play
	is when the merge info for a path is available locally (e.g. in the
	client's WC); in this case, repeated merges may be avoided.</li>

	<li>A 1.5.x client will record merge tracking meta data for merges
	performed, regardless of repository version. However, a 1.4.x or
	older repository won't know to do anything special with this merge
	info. When the repository is upgraded to 1.5.x+, we'll retain this
	merge info in the <code>svn:mergeinfo</code> property, but it is not
	yet clear what will happen to the SQLite merge info index. We may
	need some sort of upgrade path here, but don't have one yet, and
	aren't promising one.</li>
	</ul>

	<p>Subversion <a href="requirements.html#dump-load">dump files
	continue to be fully portable</a> between pre- and post-Merge Tracking
	versions of Subversion.</p>

	</div> <!-- interoperability -->

	</div> <!-- migration-and-interoperability -->


	<div class="h2" id="related-documents">
	<h2>Related Documents and Discussion</h2>

	<ul>
	<li><a href="http://subversion.tigris.org/merge-tracking/summit.html"
	>CollabNet customer Merge Tracking Summit</a></li>

	<li><a href="http://www.codeville.org/">Codeville</a> is reputed to
	excel both in its storage of line history in the <em>weave</em>
	format and in its corresponding merge algorithm:

	<ul>
	<li><a href="http://revctrl.org/PreciseCodevilleMerge">"Precise
	Codeville merge"</a> algorithm and Python implementation. The
	algorithm takes into account line history, history points where
	they came from, the ability to retrieve ancestors' text as needed,
	and a snapshot of the current file. It purports accuracy where
	<a href="http://revctrl.org/CodevilleMerge">other algorithms fall
	down</a>.</li>

	<li>Bram Cohen describes <a
	href="http://thread.gmane.org/gmane.comp.version-control.revctrl/2"
	>the merge algorithm</a> (May 2005)</li>
	</ul>
	</li>

	<li><a
	href="http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_base/notes/structure">Structure
	of the Subversion FS BerkeleyDB backend</a></li>

	</ul>

	</div> <!-- related-documents -->

	<p>$Date$</p>

	</div> <!-- h1 -->
	</body>
	</html>