-*- text -*-
TREE CONFLICT DETECTION
Issue reference: http://subversion.tigris.org/issues/show_bug.cgi?id=2282
This file explains how the tree conflicts listed in use-cases.txt
can be detected. It documents how detection currently works in the
actual code, and also explains the limits of tree conflict detection
imposed by Subversion's current design.
Note that, at the time of writing, tree conflict detection has been
implemented only for use cases 1 to 3. The current implementation
detects tree conflicts imperfectly, but that is still better than not
handling tree conflicts at all. It provides a good safety net that
helps users avoid running into tree conflict use cases 1 to 3. Once
Subversion has been taught about true renames, tree conflict detection
can be changed to make use of them and become extremely precise. See
below for further explanation.
==========
USE CASE 1
==========
If 'svn update' modifies a file that has been scheduled for deletion
in the working copy, the file is a tree conflict victim.
==========
USE CASE 2
==========
If 'svn update' deletes a file that has local modifications, the file
is a tree conflict victim.
==========
USE CASE 3
==========
If 'svn update' deletes a file that has been scheduled for deletion in
the working copy, the file is a tree conflict victim.
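The three update rules above amount to a small decision table. The
following is a minimal sketch of that table in Python. The names
LocalState, UpdateAction and update_tree_conflict are hypothetical and
do not correspond to anything in the Subversion code; the sketch only
restates the rules and says nothing about how the working copy library
actually determines the local state.

```python
from enum import Enum
from typing import Optional


class LocalState(Enum):
    """Hypothetical summary of a file's state in the working copy."""
    UNCHANGED = "unchanged"
    MODIFIED = "modified"                  # has local modifications
    SCHEDULED_DELETE = "scheduled-delete"  # scheduled for deletion


class UpdateAction(Enum):
    """What 'svn update' wants to do to the file."""
    MODIFY = "modify"
    DELETE = "delete"


def update_tree_conflict(action: UpdateAction,
                         local: LocalState) -> Optional[int]:
    """Return the use case number (1 to 3) if the file is a tree
    conflict victim, or None if none of those use cases applies."""
    if action is UpdateAction.MODIFY and local is LocalState.SCHEDULED_DELETE:
        return 1
    if action is UpdateAction.DELETE and local is LocalState.MODIFIED:
        return 2
    if action is UpdateAction.DELETE and local is LocalState.SCHEDULED_DELETE:
        return 3
    return None
```

For example, an incoming delete on a locally modified file yields 2,
while an incoming delete on an unchanged file yields None.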
==========
USE CASE 4
==========
We skip tree conflict detection if the record_only field of the
merge-command baton is TRUE. A record-only merge operation updates
mergeinfo without touching files.
If 'svn merge' tries to modify a file that does not exist in the
target working copy, then the target file is a tree conflict victim.
Notes on Resolution
-------------------
A likely cause of this case is that the source diff doesn't cover as
many revisions as it should. Either the file should be brought in by
adding the revision that created it to the list of revisions to be
merged, or the changes made to it on the source branch should be
omitted from the merge range entirely.
If the user does not wish to choose a source diff that avoids this
conflict, then the user must resolve the conflict manually.
If a modification to the nonexistent file is part of a larger diff
with changes to other files that should be merged, the user will
need to be able to manually resolve the tree-conflict while keeping
the desired changes.
Users must be able to run a second merge command to resolve the
tree-conflict, or repeat a previous merge operation, but with
additional revisions, without harm.
However, the current plan is to disallow merges into tree-conflicted
directories. This means that users will first have to mark the
tree-conflict around the missing victim as resolved before attempting
to merge the file again, this time including the revision that created
the file. This workflow may be a bit awkward, but it is required to
solve the problem this use case has in the current implementation,
namely that missing files may accidentally be overlooked during merging.
==========
USE CASE 5
==========
We skip tree conflict detection if the record_only field of the
merge-command baton is TRUE. A record-only merge operation updates
mergeinfo without touching files.
If 'svn merge' deletes an existing file, the file is a tree conflict
victim if its text is different from the corresponding file on the left
side of the merge source.
To account for uncommitted text modifications in the working copy,
we should do any text comparisons against the WORKING revision.
Rationale:
We don't want to flag every file deletion as a tree conflict. We
want to warn the user if the file to be deleted locally is different
from the file deleted in the merge source. The user then has a chance
to merge these unique changes.
Implementation:
Call svn_client_diff_summarize2() to compare the target file to the
file at the left side of the merge source.
==========
USE CASE 6
==========
We skip tree conflict detection if the record_only field of the
merge-command baton is TRUE. A record-only merge operation updates
mergeinfo without touching files.
If 'svn merge' tries to delete a file that does not exist in the
target working copy, then the target file is a tree conflict victim.
This is similar to UC4.
Rationale:
Semantically, a tree conflict occurs if 'svn merge' either tries to apply
the "delete" half of a "move" onto a file that was simply deleted in the
target branch's history, or tries to apply a simple "delete" onto a file
that has been moved in the target branch, or tries to move a file that
has already been moved to a different name in the target branch.
Notes on Resolution
-------------------
Some users may want to skip the tree conflict and have the result automatically
resolved if two rename operations have the same destination, or if a file is
simply deleted on both branches. But we have to mark these as tree conflicts
due to the current lack of "true rename" support. It does not appear to be
feasible to detect more than the double-delete aspect of the move operation.
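The merge rules for use cases 4 to 6, including the record-only
shortcut, can be summarised in a short Python sketch. The function
merge_tree_conflict and its parameters are hypothetical names chosen
for illustration. The plain string comparison stands in for the
comparison that the notes above assign to svn_client_diff_summarize2();
it is an assumption, not a description of that function.

```python
from typing import Optional


def merge_tree_conflict(action: str,
                        target_text: Optional[str],
                        left_text: Optional[str] = None,
                        record_only: bool = False) -> Optional[int]:
    """Return the use case number (4 to 6) if the target file is a
    tree conflict victim, or None otherwise.

    action      -- "modify" or "delete", as requested by the merge
    target_text -- WORKING text of the target file, or None if the
                   file does not exist in the target working copy
    left_text   -- text of the file at the left side of the merge
                   source (only consulted for deletes)
    record_only -- True for a record-only merge
    """
    if action not in ("modify", "delete"):
        raise ValueError("unknown merge action: %r" % (action,))

    # A record-only merge updates mergeinfo without touching files,
    # so detection is skipped.
    if record_only:
        return None

    if target_text is None:
        # The merge wants to change a file that is not there.
        return 4 if action == "modify" else 6

    # Deleting an existing file is only a conflict if its text differs
    # from the file on the left side of the merge source.
    if action == "delete" and target_text != left_text:
        return 5

    return None
```

Comparing against the WORKING text means that uncommitted local
modifications to the target file are enough to trigger use case 5.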
=========================
OBSTRUCTIONS DURING MERGE
=========================
If 'svn merge' fails to apply an operation to a file because the
file is obstructed (i.e. an unversioned item of the same name is
in the file's place), the obstructed file is a tree conflict victim.
Rationale:
We want to make sure that a merge either completes successfully
or any problems found during a merge are flagged as conflicts.
Skipping obstructed items during merge is no longer acceptable
behaviour, since users might not be aware of obstructions that were
skipped when they commit the result of a merge.
=========================================
TREE CONFLICT DETECTION WITH TRUE RENAMES
=========================================
To properly detect the situations described in the "diagram of current
behaviour" for use cases 2 and 3, we need to have access to a list of
all files the update will add with history.
For use cases 1 and 3, we need a list of all files added locally with
history.
We need access to this list during the whole update editor drive.
Then we could do something like this in the editor callbacks:

  edit_file(file):
    if file is locally deleted:
      for each added_file in files_locally_added_with_history:
        if file has common ancestor with added_file:
          /* user ran "svn move file added_file" */
          use case 1 has happened!

  delete_file(file):
    if file is locally modified:
      for each added_file in files_added_with_history_by_update:
        if file has common ancestor with added_file:
          use case 2 has happened!
    else if file is locally deleted:
      for each added_file in files_added_with_history_by_update:
        if file has common ancestor with added_file:
          use case 3 has happened!

Since the update editor drive crawls through the working copy and the
callbacks consider only a single file, we need to generate the list
before checking for tree conflicts. Two ideas for this are:

  1) Wrap the update editor with another editor that passes
     all calls through but takes note of which files the
     update adds with history. Once the wrapped editor is
     done, run a second pass over the working copy to populate
     it with tree conflict info.

  2) Wrap the update editor with another editor that does
     not actually execute any edits but remembers them all.
     It only applies the edits once the wrapped editor has
     been fully driven. Tree conflicts could now be detected
     precisely because the list of files we need would be
     present before the actual edit is carried out.

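The wrapping idea behind approach 1 can be illustrated with a small
Python sketch. The editor interface used here (an add_file method
taking a path and an optional copyfrom_path) is a simplified,
hypothetical stand-in for the real delta editor, and RecordingEditor is
an invented name. The sketch keeps its list in memory only, so it does
nothing about the persistence problem discussed below.

```python
class RecordingEditor:
    """Pass all editor calls through to a wrapped editor while noting
    which files are added with history (hypothetical interface)."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        # (path, copyfrom_path) pairs, in the order they were seen.
        self.added_with_history = []

    def add_file(self, path, copyfrom_path=None):
        # A non-empty copyfrom_path means "added with history".
        if copyfrom_path is not None:
            self.added_with_history.append((path, copyfrom_path))
        return self._wrapped.add_file(path, copyfrom_path)

    def __getattr__(self, name):
        # Only reached for attributes not defined on this class, so
        # every other editor call is forwarded unchanged.
        return getattr(self._wrapped, name)
```

After the wrapped editor has been fully driven, added_with_history
holds the list that a second pass over the working copy would consult.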
Approach 1 has the problem that there is no reliable way of storing
the file list in the face of an abort.
Approach 2 is obviously insane. ;-)
Keeping the list in RAM is dangerous, because the list would be lost
if the user aborts, leaving behind an inconsistent working copy that
potentially lacks tree conflict info for some conflicts.
The usual place to store persistent information inside the working
copy is the entries file in the administrative area. Loggy writes to
this file ensure consistency even if the update is aborted. But
keeping the list in entries files also has problems: Which entries
file do we keep it in? Scattering the list across lots of entries
files isn't an option because the list needs to be global. Crawling
the whole working copy at the start of an update to gather lost file
lists would be too much of a performance penalty.
Storing it in the entries file of the anchor of the update operation
(i.e. the current working directory of the "svn update" process) is a
bad idea as well because when the interrupted update is continued the
anchor might have changed. The user may change the working directory
before running "svn update" again.
Either way, interrupted updates would leave scattered partial lists of
files in entries throughout the working copy. And interrupted updates
may not correctly mark all tree conflicts.
So how can, for example, use case 3 be detected properly?
The answer could be "true renames". All the above is due to the fact
that we have to try to catch use case 3 from a "delete this file"
callback. We are in fact trying to reconstruct whether a deletion
of a file was due to the file being moved with "svn move" or not.
But if we had a callback in the update editor like:

  move_file(source, dest);

detecting use case 3 would be extremely simple. Simply check whether
the source of the move is locally deleted. If it is, use case 3 has
happened, and the source of the move is a tree conflict victim.
Use case 2 could be caught by checking whether the source of the move
has local modifications.
Use case 1 could be detected by checking whether the target for a file
modification by update matches the source of a rename operation in the
working copy. This would require storing rename information inside the
administrative areas of both the source and target directories of file
move operations to avoid having to maintain a global list of rename
operations in the working copy for reference by the update editor.
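To make the three checks concrete, here is a minimal Python sketch of
how detection might look if the update editor had a move_file callback.
Everything in it is hypothetical: WorkingCopyState, on_move_file and
on_edit_file are invented names, and the in-memory sets and dictionary
stand in for whatever the administrative areas would actually record.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class WorkingCopyState:
    """Hypothetical snapshot of local changes in the working copy."""
    locally_deleted: Set[str] = field(default_factory=set)
    locally_modified: Set[str] = field(default_factory=set)
    # Local renames, keyed by source path: {source: destination}.
    local_renames: Dict[str, str] = field(default_factory=dict)


def on_move_file(wc: WorkingCopyState, source: str,
                 dest: str) -> Optional[int]:
    """The update moves 'source' to 'dest'. Return 3 or 2 if 'source'
    is a tree conflict victim, else None."""
    if source in wc.locally_deleted:
        return 3
    if source in wc.locally_modified:
        return 2
    return None


def on_edit_file(wc: WorkingCopyState, path: str) -> Optional[int]:
    """The update modifies 'path'. Return 1 if 'path' is the source
    of a local rename, else None."""
    if path in wc.local_renames:
        return 1
    return None
```

No global list of files added with history is needed here, because
each callback receives or looks up exactly the paths it has to check.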