| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" |
| "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml"> |
| <head> |
| <title>Merge Tracking Design</title> |
| <style type="text/css"> |
| .question { color: grey; } |
| .answer { } |
| </style> |
| </head> |
| |
| <body> |
| <div class="h1"> |
| <h1>Merge Tracking Design</h1> |
| |
| <p style="color: red">*** UNDER CONSTRUCTION ***</p> |
| |
| <p>Subversion's <a href="index.html">merge tracking</a> uses a layered |
| design, with the user-visible operations based primarily on the |
| information from the <a href="#merge-history">merge history</a>.</p> |
| |
| <ul> |
| <li><a href="#merge-history">Merge History</a></li> |
| <li><a href="#data-structures">Data Structures</a></li> |
| <li>Merge Operations (TODO)</li> |
| <li><a href="#audit-operations">Audit Operations</a></li> |
| <li>Other Operations (TODO)</li> |
| </ul> |
| |
| <div class="h2" id="merge-history"> |
| <h2>Merge History</h2> |
| |
| <p>Or, <em>Tracking What Revisions Have Been Merged Where</em> |
| provides the information used by Subversion's merge tracking-related |
| capabilities (history sensitive merging, etc.). The design is based |
| on Dan Berlin's <a |
| href="http://svn.haxx.se/dev/archive-2006-04/0916.shtml">proposal</a> |
| <small>(Message-Id: |
| <1146242044.23718.1.camel@dberlin0-corp.corp.google.com>)</small> |
| and subsequent edits.</p> |
| |
| <div class="h3"> |
| <h3>Goals</h3> |
| |
| <p>The goal of the Merge History portion of the design is to track the |
| information needed by the operations outlined by the majority of the |
| <a href="requirements.html">use cases</a> (e.g. the revision numbers |
| being merged by a merge operation), and keeping this information in |
| the right places as various operations (<code>copy</code>, |
| <code>delete</code>, <code>add</code>, etc.) are performed. This |
| portion of the design does <em>not</em> encompass the operations |
| themselves.</p> |
| |
| <p>The goals:</p> |
| |
| <ul> |
| <li>To be able to track this down to what files in a working copy |
| and be able to determine what files have had what revisions merged |
| into them.</li> |
| |
| <li>To not need to contact the server more than we already do now to |
| determine which revisions have been merged in a file or directory |
| (e.g. some contact is acceptable, asking the server about each file is |
| not).</li> |
| |
| <li>To be able to edit merge information in a human-readable form.</li> |
| |
| <li>For the information to be stored in a space efficient manner, |
| and to be able to determine the revisions merged into a given |
| file/director in a time efficient manner.</li> |
| |
| <li>Still getting a conservatively correct answer (not worse than |
| what we have now) when no merge info is available.</li> |
| |
| <li>To be able to collect, transmit, and keep this information up to |
| date as much as possible on the client side.</li> |
| |
| <li>To be able to index this information in the future order to |
| answer queries.</li> |
| </ul> |
| |
| <p>Non-goals for <em>the 1.5 design</em> include:</p> |
| |
| <ul> |
| <li>Doing actual <a href="func-spec.html#as-merge">history sensitive |
| merging</a>. Subversion does not yet have sufficient support for |
| creation of fully accurate changeset graphs, which are necessary to |
| handle cyclic merging (e.g. of so-called "reflected" |
| revisions, also known as repeated bi-directional mergeing). For |
| details on this problem (caused by lack of tracking of multiple |
| parents for a change), see <a |
| href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=127537" |
| >this discussion thread</a>, and especially <a |
| href="http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=127570" |
| >this follow-up</a>.</li> |
| |
| <li>Curing cancer (aka being all things to all people).</li> |
| </ul> |
| |
| </div> <!-- goals --> |
| |
| <div class="h3" id="storage"> |
| <h3>Information Storage</h3> |
| |
| <p>The first question that many people ask is "where should we store |
| the merge information?" (what we store will be covered next). A merge |
| history property, named <code>SVN_PROP_MERGEINFO</code> |
| (e.g. <code>svn:mergeinfo</code>) stored in directory and file |
| properties. Each will store the <em>full</em>, <em>complete</em> list |
| of current merged-in changes, as far as it knows. This ensures that |
| the merge algorithm and other consumers do not have to walk back |
| through old revisions in order to compute a complete list of merge |
| information for a path.</p> |
| |
| <p>Directly merged into the item means changes from any merge that |
| have affected this path, which includes merges into parents, |
| grandparents, etc., that had some affect on the path.</p> |
| |
| <p>Doing this storage of complete information at each point makes |
| manual editing safe, because the changes to a path's merge info are |
| localized to that path.</p> |
| |
| <p>However, as a space optimization, if the information on a |
| sub-directory or file is exactly the same as the merge information for |
| its parent directory, it <em>may</em> be elided in favor of that |
| parent information. This eliding may is done on the fly, but could |
| also be done during a postpass (e.g. a <code>svnadmin |
| mergeinfo-optimize</code>). Eliding information means that an |
| implementation may have to walk parent directories in order to gather |
| information about merge info (however, this may have been necessary |
| anyways). It is expected that directory trees are not that deep, and |
| the lookup of merge info properties quick enough (due to indexing, |
| etc.), to make this eliding not affect performance too much.</p> |
| |
| <p>Eliding will never affect the semantics of merge information, as it |
| should only be performed in the case when it was exactly the same, and |
| if it was exactly the same, it could not have had an effect on the |
| merge results.</p> |
| |
| <p>Any path may have merge info attached to it.</p> |
| |
| <p>The way we choose which path's merge info to use in case of |
| conflicts involves a simple system of inheritance <a |
| href="#merge-history-footnotes">[1]</a>, where the "most specific" |
| place wins. This means that if the property is set on a file, that |
| completely overrides the directory level properties for the directory |
| containing the file. Non-inheritable merge info can be set on |
| directories, signifying that the merge info applies only to the |
| directory but not its children.</p> |
| |
| <p>The way we choose which to store to depends on how much and where |
| you merge, and will be covered in the semantics.</p> |
| |
| <p>The reasoning for this system is to avoid having to either copy |
| info everywhere, or crawl everywhere, in order to determine which |
| revisions have been applied. At the same time, we want to be space |
| and time efficient, so we can't just store the entire revision list |
| everywhere.</p> |
| |
| <p>As for what is stored:</p> |
| |
| <p>A survey of the community shows a slight preference for a human |
| editable storage format along the lines of how |
| <code>svnmerge.py</code> stores its merge info (e.g. path name and |
| list of revisions). Binary storage of such information would buy, on |
| average, a 2-3 byte decrease per revision/range in size over ASCII <a |
| href="#merge-history-footnotes">[2]</a>, while making it not directly |
| human-readable/editable.</p> |
| |
| <p>The revisions we have merged <em>into</em> something are |
| represented as a path, a colon, and then a comma separated revision |
| list, containing one or more revision or revision ranges. Revision |
| range end and beginning points are separated by "-".</p> |
| |
| <div class="h4" id="merge-info-grammar"> |
| <h4>Grammar</h4> |
| <table> |
| <tr> |
| <th align="left">Token</th> |
| <th align="left">Definition</th> |
| </tr> |
| <tr> |
| <td>revisionrange</td> |
| <td>REVISION "-" REVISION</td> |
| </tr> |
| <tr> |
| <td>revisioneelement</td> |
| <td>(revisionrange | REVISION)"*"?</td> |
| </tr> |
| <tr> |
| <td>rangelist</td> |
| <td>revisioneelement (COMMA revisioneelement)*</td> |
| </tr> |
| <tr> |
| <td>revisionline</td> |
| <td>PATHNAME COLON rangelist</td> |
| </tr> |
| <tr> |
| <td>top</td> |
| <td>revisionline (NEWLINE revisionline)*</td> |
| </tr> |
| </table> |
| </div> <!-- merge-info-grammar --> |
| |
| <p>This merge history ("top"), existing on a path specifies all the |
| changes that have ever been merged into this object (file, dir or |
| repo) within this repository. It specifies the sources of the merges, |
| (and thus two or more pathnames may be required to represent one |
| source object at different revisions due to renaming).</p> |
| |
| <p>The optional "*" following a revisionelement token signifies a |
| non-inheritable revision/revision range.</p> |
| |
| <p>This list will <em>not</em> be stored in a canonicalized minimal |
| form for a path (e.g. it may contain single revision numbers that |
| could fall into ranges). This is chiefly because the benefit of such |
| a canonical format -- which is slightly easier for visual |
| <em>comparison</em>, but not indexing -- is outweighed by the fact |
| that generating a canonical form may require groveling through a lot |
| of information to determine what that minimal canonical form is. In |
| particular, it may be that the revision list "5,7,9" is, in minimal |
| canonical form, "5-9", because 6 and 8 do not have any affect on the |
| path name that 5 and 9 are from. Canonicalization could be done as a |
| server side post pass because the information is stored in |
| properties.</p> |
| |
| <p>Note that this revision format will not scale on its own if you |
| have a list of million revisions. None will easily. However, because |
| it is stored in properties, one can change the WC and FS backends to |
| simply do something different with this single property if they wanted |
| to. Given the rates of change of various very active repositories, |
| this will not be a problem we need to solve for many many years.</p> |
| |
| <p>The payload of <code>SVN_PROP_MERGEINFO</code> will be duplicated |
| in an index separate from the FS which is created during |
| <code>svnadmin create</code>, or on-demand for a pre-existing |
| repository which has started using Subversion 1.5. This index will |
| support fast querying, be populated during a merge or <code>svnadmin |
| load</code>, and cough up its contents as needed during API calls. |
| The contents of <code>SVN_PROP_MERGEINFO</code> is stored redundantly |
| in index (to the FS). Dan Berlin has prototyped a simple index using |
| sqlite3. David James later proposed a more normalized schema design, |
| some of the features of which may become useful for implementing Merge |
| Tracking functionality in a more performant manner.</p> |
| |
| </div> <!-- storage --> |
| |
| |
| <div class="h3"> |
| <h3>Information Updating</h3> |
| |
| <p>Each operation you can perform may update or copy the merge info |
| associated with a path.</p> |
| |
| <p><code>svn add</code>: No change to merge info.</p> |
| |
| <p><code>svn delete</code>: No direct change to merge info |
| (indirectly, because the props go away, so does the merge info for the |
| file).</p> |
| |
| <p><code>svn copy</code>: Makes a full copy of any explicit merge info |
| from the source path to the destination path. Also adds "implied" |
| merge info from the source path.</p> |
| |
| <p><code>svn rename</code>: Same as <code>svn copy</code>.</p> |
| |
| <p><code>svn merge</code>: Adds or subtracts to the merge info, |
| according to the following:</p> |
| |
| <ul> |
| <li>Where to store the info: |
| <ol> |
| <li>If the merge target is a single file, the merge info goes to |
| the property <code>SVN_PROP_MERGEINFO</code> set on that file.</li> |
| |
| <li>If the merge target is a directory, the merge info goes to the |
| property <code>SVN_PROP_MERGEINFO</code> set on the shallowest |
| directory of the merge (e.g. the topmost directory affected by the |
| merge) that will require different info than the info already set |
| on other directories.</li> |
| </ol> |
| |
| The last clause of rule 2 is only meant to handle cherry picking and |
| multiple merges. In the case that people repeatedly merge the same |
| tree into the same tree, the information will end up only on the |
| shallowest directory of the merge. If changes are selectively |
| applied (e.g. all changes are applied to every directory but one), |
| the information will be on the shallowest common ancestor of all |
| those directories, <em>as well</em> as information being placed on |
| the directory where the changes are not applied, so that it will |
| override the information from that shallow directory. See cherry |
| picking example for more details. Besides selective application, |
| apply changes that affect some directory, and then applying |
| different changes to subdirectories of that directory, will also |
| produce merge info on multiple directories in a given path. |
| </li> |
| |
| <li>What info is stored: |
| <ol> |
| <li>If you are merging in reverse, revisions are subtracted from |
| the revision lines, but we never write out anti-revisions. Thus, |
| if you subtract all the merged revisions, you just get an empty |
| list, and if you do a reverse merge from there, you still get an |
| empty list.</li> |
| |
| <li>If you are merging forward, the revision(s) you are merging is |
| added to the range list in sorted order (such that all revisions |
| and revision ranges in the list are monotonically increasing from |
| left to right).</li> |
| |
| <li>The path (known as PATHNAME in the grammar) used as the key to |
| determine which revision line to change is the sub-directory path |
| being merged from, relative to the repo root, with the repo URL |
| stripped from it.</li> |
| </ol> |
| |
| In the case that we are merging changes that themselves contain |
| merge info, the merge info properties must be merged. The effect of |
| this is that indirect merge info becomes direct merge info as it is |
| integrated as part of the merge info now set on the property. The |
| way this merge is performed is to merge the revision lists for each |
| identical pathname, and to copy the rest. Blocking of merges and |
| how this affects this information is not covered in this design. |
| The indirect info merging is *in addition* to specifying the merge |
| that we are now doing. See the repeated merge with indirect info |
| example for an example. |
| </li> |
| </ul> |
| |
| <p>Thus a merge of revisions 1-9 from |
| http://example.com/repos-root/trunk would produce "/trunk:1-9"</p> |
| |
| <p>Cross-repo merging is a bridge we can cross if we ever get there :).</p> |
| |
| </div> <!-- h3 --> |
| |
| |
| <div class="h3"> |
| <h3>Examples</h3> |
| |
| <div class="h4"> |
| <h4>Repeated merge</h4> |
| |
| <p>(I have assumed no renames here, and that all directories were |
| added in rev 1, for simplicity. The pathname will never change, this |
| should not cause any issues that need examples.)</p> |
| |
| <p>Assume trunk has 9 revisions, 1-9.</p> |
| |
| <p>A merge of /trunk into /branches/release will produce the merge |
| info "/trunk: 1-9".</p> |
| |
| <p>Assume trunk now has 6 additional revisions, 14-18.</p> |
| |
| <p>A merge of /trunk into /branches/release should only merge 14-18 |
| and produce the merge info "/trunk: 1-9,14-18". This merge info will |
| be placed on /branches/release.</p> |
| |
| <p>(Note the canonical minimal form of the above would be 1-18, as |
| 9-14 do not affect that path. This is also an acceptable answer, as |
| is any variant that represents the same information.)</p> |
| |
| </div> <!-- h4 --> |
| |
| <div class="h4"> |
| <h4>Repeated merge with indirect info</h4> |
| |
| <p>Assume the repository is in the state it would be after the |
| "Repeated merge" example. Assume additionally, we now have a branch |
| /branches/next-release, with revisions 20-24 on it.</p> |
| |
| <p>We wish to merge /branches/release into /branches/next-release.</p> |
| |
| <p>A merge of /branches/release into /branches/next-release will produce |
| the merge info:</p> |
| <pre> |
| "/branches/release: 1-24 |
| /trunk:1-9,14-18" |
| </pre> |
| |
| <p>This merge info will be placed on /branches/next-release.</p> |
| |
| <p>Note that the merge information about merges *to* /branches/release |
| has been added to our merge info.</p> |
| |
| <p>A future merge of /trunk into /branches/next-release, assuming no |
| new revisions on /trunk, will merge nothing.</p> |
| |
| </div> <!-- h4 --> |
| |
| <div class="h4"> |
| <h4>Cherry picking a change to a file</h4> |
| |
| <p>Assume the repository is in the state it would be after the |
| "Repeated merge with indirect info" example.</p> |
| |
| <p>Assume we have revision 25 on /trunk, which affects /trunk/foo.c |
| and /trunk/foo/bar/bar.c</p> |
| |
| <p>We wish to merge the portion of change affecting /trunk/foo.c</p> |
| |
| <p>A merge of revision 25 of /trunk/foo.c into /branches/release/foo.c |
| will produce the merge info:</p> |
| <pre> |
| "/trunk/foo.c:1-9,14-18,25". |
| This merge information will be placed on /branches/release/foo.c |
| </pre> |
| |
| <p>All other merge information will still be intact on |
| /branches/release (ie there is information on /branches/release's |
| directory).</p> |
| |
| <p>(The cherry picking one directory case is the same as file, with |
| files replaced with directories, hence i have not gone through the |
| example).</p> |
| |
| </div> <!-- h4 --> |
| |
| <div class="h4"> |
| <h4>Merging changes into parents and then merging changes into |
| children</h4> |
| |
| <p>Assume the repository is in the state it would be after the |
| "Repeated merge with indirect info" example. Assume we have revision |
| 25 on /trunk, which affects /trunk/foo Assume we have revision 26 on |
| /trunk, which affects /trunk/foo/baz We wish to merge revision 25 into |
| /branches/release/foo, and merge revision 26 into |
| /branches/release/foo/baz.</p> |
| |
| <p>A merge of revision 25 of /trunk/foo into /branches/release/foo will |
| produce the merge info:</p> |
| <pre>"/trunk/foo:1-9,14-18,25" |
| </pre> |
| |
| <p>This merge information will be placed on /branches/release/foo</p> |
| |
| <p>A merge of revision 26 of /trunk/foo/baz into |
| /branches/release/foo/baz will produce the merge info:</p> |
| <pre>"/trunk/foo/baz:1-9,14-18,26". |
| </pre> |
| |
| <p>This merge information will be placed on |
| /branches/release/foo/baz.</p> |
| |
| <p>Note that if you instead merge revision 26 of /trunk/foo into |
| /branches/release/foo, you will get the same effect, but the merge |
| info will be:</p> |
| <pre>"/trunk/foo:1-9,14-18,25-26". |
| </pre> |
| |
| <p>This merge information will be placed on /branches/releases/foo</p> |
| |
| <p>Both are different "spellings" of the same merge information, and |
| future merges should produce the same result with either merge info |
| (one is of course, more space efficient, and transformation of one to |
| the other could be done on the fly or as a postpass, if desired).</p> |
| |
| <p>All other merge information will still be intact on |
| /branches/release (e.g. there is information on /branches/release's |
| directory).</p> |
| |
| </div> <!-- h4 --> |
| |
| </div> <!-- h3 --> |
| |
| |
| <div class="h3" id="merge-history-faq"> |
| <h3>FAQ</h3> |
| |
| <p class="question">What happens if someone commits a merge with a |
| non-merge tracking client?</p> |
| |
| <p class="answer">It simply means the next time you merge, you may |
| receive conflicts that you would have received if you were using a |
| non-history-sensitive client.</p> |
| |
| <p class="question">What happens if the merge history is not there?</p> |
| |
| <p class="answer">The same thing that happens if the merge history is |
| not there now.</p> |
| |
| <p class="question">Are there many different "spellings" of the same |
| merge info?</p> |
| |
| <p class="answer">Yes. Depending on the URLs and target you specify |
| for merges, I believe it is possible to end up with merge info in |
| different places, or with slightly different revision lines that have |
| the same semantic effect (e.g. info like /trunk:1-9 vs |
| /trunk:1-8\n/trunk/bar:9 when revision 9 on /trunk only |
| affected /trunk/bar), so you can end up with merge info in different |
| places, even though the semantic result will be the same in all |
| cases.</p> |
| |
| <p class="question">Can we do history sensitive WC-to-WC merges |
| without contacting the server?</p> |
| |
| <p class="answer">No. But you probably couldn't anyway without |
| <em>all</em> repo merge data stored locally.</p> |
| |
| <p class="question">What happens if a user edits merge history |
| incorrectly?</p> |
| |
| <p class="answer">They get the results specified by their merge |
| history.</p> |
| |
| <p class="question">What happens if a user manually edits a file and |
| unmerges a revision (e.g. not using a "reverse merge" command), but |
| doesn't update the merge info to match?</p> |
| |
| <p class="answer">The merge info will believe the change has still |
| been merged. This is a similar effect to performing a <a |
| href="requirements.html#manual-merge">manual merge</a>.</p> |
| |
| <p class="question">What happens if I <code>svn |
| move</code>/<code>rename</code> a directory, and then merge it |
| somewhere?</p> |
| |
| <p class="answer">This doesn't change history, only the future, thus |
| we will simply add the merge info for that directory as if it was a |
| new directory. We will not do something like attempt to modify all |
| merge info to specify the new directory, as that would be wrong.</p> |
| |
| <p class="question">I don't think only that copying info on <code>svn |
| copy</code> is correct. What if you copy a dir with merge info into a |
| dir where the dir has merge info -- won't it get the info wrong |
| now?</p> |
| |
| <div class="answer"> |
| <p>No. Let's say you have:</p> |
| |
| <pre> |
| a/foo (merge info: /trunk:5-9 |
| a/branches/bar (merge info: /trunk:1-4) |
| </pre> |
| |
| <p>If you copy a/foo into a/branches/bar, we now have:</p> |
| |
| <pre> |
| a/branches/bar (merge info: /trunk:1-4) |
| a/branches/bar/foo (merge info: /trunk:5-9) |
| </pre> |
| |
| <p>This is strictly correct. The only changes which have been merged |
| into a/branches/bar/foo, are still 5-9. The only changes which have |
| been merged into /branches/bar are 1-4. No merges have been performed |
| by your copy, only copies have been performed. If you perform a merge |
| of revisions 1-9 into bar, the results one would expect that the |
| history sensitive merge algorithm will skip revisions 5-9 for |
| a/branches/bar/foo, and skip revisions 1-4 for a/branches/bar. The |
| above information gives the algorithm the information necessary to do |
| this. |
| |
| So if you want to argue svn copy has the wrong merge info semantics, |
| it's not because of the above, AFAIK :)</p> |
| </div> |
| |
| </div> <!-- merge-history-faq --> |
| |
| <div class="h3" id="merge-history-footnotes"> |
| <h3>Footnotes</h3> |
| |
| <ol> |
| <li>This is not going to be a full blown design for property |
| inheritance, nor should this design depend on such a system being |
| implemented.</li> |
| |
| <li>Assuming 4 byte revision numbers, and repos with revisions |
| numbering in the hundreds of thousands. You could do slightly |
| better by variable length encoding of integers, but even that will |
| generally be 4 bytes for hundreds of thousands of revs. Thus, we |
| have strings like "102341" vs 4 byte numbers, meaning you save about |
| 2 bytes for a 4 byte integer. Range lists in binary would need a |
| distinguisher from single revisions, adding a single bit to both |
| (meaning you'd get 31 bit integers), and thus, would require 8 bytes |
| per range vs 12 bytes per range. While 30% is normally nothing to |
| sneeze at space wise, it's also not significantly more efficient in |
| time, as most of the time will not be spent parsing revision lists, |
| but doing something with them. The space efficiency therefore does |
| not seem to justify the cost you pay in not making them easily |
| editable.</li> |
| </ol> |
| |
| </div> <!-- merge-history-footnotes --> |
| |
| </div> <!-- merge-history --> |
| |
| <div class="h2" id="data-structures"> |
| <h2>Data Structures</h2> |
| |
| <p>Merge Tracking is implemented using a few simple data |
| structures.</p> |
| |
| <dl> |
| <dt>merge info</dt> |
| <dd>An <code>apr_hash_t</code> mapping merge source paths to a range |
| list.</dd> |
| |
| <dt>range list</dt> |
| <dd>An <code>apr_array_header_t</code> containing |
| <code>svn_merge_range_t *</code> elements.</dd> |
| |
| <dt><code>svn_merge_range_t</code></dt> |
| <dd>A revision range, modelled using a <code>start</code> and |
| <code>end</code> <code>svn_revnum_t</code>, which are identical if |
| the range consists of only a single revision.</dd> |
| </dl> |
| |
| </div> <!-- data-structures --> |
| |
| <div class="h2" id="conflict-resolution"> |
| <h2>Conflict Resolution</h2> |
| |
| <p>The <a href="func-spec.html#conflict-resolution">functional |
| specification</a> and the <a href="../svn_1.5_releasenotes.html">1.5 |
| release notes</a> detail the command-line client behavior and conflict |
| resolution API.</p> |
| |
| </div> <!-- conflict-resolution --> |
| |
| <div class="h2" id="audit-operations"> |
| <h2>Audit Operations</h2> |
| |
| <p>As outlined in the |
| <a href="requirements.html#auditing">requirements and use cases</a>, merge |
| tracking auditting is the ability to report information about merge |
| operations. It consists of three sections:</p> |
| |
| <ul> |
| <li>Changeset Merge Availability (TODO)</li> |
| |
| <li>Find Changeset (TODO)</li> |
| |
| <li><a href="#commutative-reporting">Commutative Author and Revision |
| Reporting</a></li> |
| </ul> |
| |
| <div class="h3" id="commutative-reporting"> |
| <h3>Commutative Author and Revision Reporting</h3> |
| |
| <p>Most of the logic for reporting will live in <code>libsvn_repos</code>, with |
| appropriate changes to the API and additional parameters exposed to the client |
| through the RA layer. We are looking at an API which takes one or more paths |
| and a revision (range?), and returns the merge info which was added or removed |
| in that revision. For the 'blame' case, we may also need to include some type |
| of line number parameter, the handling of which is going to get ugly, since our |
| FS isn't based on a weave.</p> |
| |
| <p>Existing functions of interest are <code>svn_repos_fs_get_mergeinfo()</code> |
| and <code>svn_fs_merge_info__get_mergeinfo()</code>.</p> |
| |
| <div class="h4" id="audit-log"> |
| <h4><code>svn log</code> Implementation</h4> |
| |
| <p>Prior to merge tracking, log messages had a linear relationship to one |
| another. That is, the only information gleaned from the order in which a |
| message was returned was where the revision number of that message fell in |
| relation to the revision numbers of messages preceding and succeeding it.</p> |
| |
| <p>The introduction of merge tracking changes that paradigm. Log messages |
| for independent revisions are still linearly related as before, but log |
| messages for merging revisions now have children. These children are log |
| messages for the revisions which have been merged, and they may in turn |
| also have children.</p> |
| |
| <p>The result is a tree structure which the repository layer builds as it |
| collects log message information. This tree structure then gets serialized |
| and marshaled back to the client, which can then rebuilt the tree if needed. |
| Additionally, less information needs to be explicitly given, as the tree |
| structure itself implies revision relationships. |
| </p> |
| |
| <p>We currently use the <code>svn_log_message_receiver_t</code> interface |
| to return log messages between application layers. To enable communication |
| of the tree structure, we add another parameter, <code>child_count</code>. |
| When <code>child_count</code> is zero, the node is a leaf node. When |
| <code>child_count</code> is greater than zero, the node is an interior node, |
| with the given number of children. These children may also have children and |
| indicate such by their own <code>child_count</code> parameters. Children |
| are returned in-band through the receiver interface immediately following their |
| parents. Consumers of this API can be aware of the number of children and |
| rebuild the tree, or pass the values farther up the application stack. In |
| effect, this method implements a preorder traversal of the log message tree.</p> |
| |
| <p>(For convenience, we may want to consolidate all the data parameters of |
| <code>svn_log_message_receiver_t</code> into a single structure.)</p> |
| |
| <p>A revision, R, is considered a <em>merging revision</em> if the mergeinfo |
| for any path for which the log was requested changed between R and R-1. The |
| difference in the mergeinfo, both added revisions and removed revisions, |
| between R and R-1 indicates the revisions which are children of R.</p> |
| |
| <p>The exception for this is the case of a copy which is the creation of a |
| branch. When a branch is created, the mergeinfo for R-1 is empty, and the |
| mergeinfo for R is 1:R-1. In this case, the revision should not be considered |
| a merging revision, and none of the revisions R:R-1 should be shown as R's |
| children.</p> |
| |
| </div> <!-- audit-log --> |
| |
| <div class="h4" id="audit-blame"> |
| <h4><code>svn blame</code> Implementation</h4> |
| |
| <p>Even though the command line client doesn't consume both the original |
| author and revision and the merging author and revision, the blame API should |
| provide both for use by other clients.</p> |
| |
| </div> <!-- audit-blame --> |
| |
| </div> <!-- commutative-reporting --> |
| |
| </div> <!-- audit-operations --> |
| |
| <p>$Date$</p> |
| |
| </div> <!-- h1 --> |
| </body> |
| </html> |