| This file documents the 'svnpatch' format that's used with both the diff and |
| patch subcommands. |
| |
| I HISTORY |
| ------- |
| |
| By default, Subversion's diff facility generates output in unidiff format. The |
| unidiff format is well known from tools like diff(1) and has been used for |
| decades to produce contextual differences between files. It is often paired |
| with patch(1) to apply those contextual differences. When it comes to |
| non-contextual changes like moving a directory, adding a property to a file, or |
| modifying an image, unidiff is helpless. Enter the svnpatch format. It |
| enables capturing all non-contextual changes into a WC-portable output so that |
| it is possible to create rich patches to apply across working copies. Another |
| way to look at it is as an "offline merge", in which one dumps the diffs, |
| passes them as a patch on to her peer who then applies it, without ever |
| interacting with the repository. |
| |
| The svnpatch format is in fact a simplified version of the Subversion protocol |
| -- see subversion/libsvn_ra_svn/protocol -- that meets our needs. The |
| advantage here is that changes are serialized into a language that Subversion |
| already speaks, and only a few minor tweaks were needed to adapt it. For |
| example, revisions have been stripped from the protocol to allow fuzzing. |
| |
| The implementation with the command line client uses `svn diff --svnpatch' to |
| generate the rich diffs and `svn patch' to apply the diffs against a working |
| copy. Other frontends can also take advantage of svnpatch in the same way |
| through the usual API functions svn_client_diff5 and svn_client_patch, which |
| use files to communicate. |
| |
| |
| II SVNPATCH FORMAT IN A NUTSHELL |
| ----------------------------- |
| |
| First off, let's define it. The svnpatch format is made of two ordered parts: |
| * (a) human-readable: made of unidiff bytes |
| * (b) computer-readable: made of svn protocol bytes (ra_svn), gzip'ed, |
| base64-encoded |
| |
| But, as we're not in a client/server configuration: |
| - (b) only uses the svn protocol's Editor Command Set, there's no need for |
| the Main Command Set nor the Report Command Set |
| - a client reads Editor Commands from the patch, i.e. the patch silently |
| drives the client's editor |
| - the only direction the information takes is from the patch to the client |
| - svndiff1 is always used; there is no choice between svndiff1 and svndiff0 |
| (e.g. a binary-change needs svndiff) |
| |
| Such a format can be seen as a subset of the svn protocol in which: |
| - Capabilities and Edit Pipelining play no part, since nothing can be |
| negotiated or adjusted once the patch has been written to a file |
| - commands are restricted to the Editor Command Set |
| - revision numbers and checksums are omitted, except checksums for binary |
| files (see VI FUZZING) |
| |
| For more about Command Sets, consult libsvn_ra_svn/protocol. |
| |
| |
| III BOUNDARIES BETWEEN THE TWO PARTS |
| -------------------------------- |
| |
| Now since the svn protocol would be happy to handle just any change that a |
| working copy comes with, rules have to be set up so that we meet our goals (see |
| I HISTORY). |
| |
| Concretely, what's in each part? |
| |
| In (a): |
| - contextual differences |
| - property-changes (in a similar way to 'svn diff') |
| - new non-binary-file content |
| |
| In (b): |
| - tree-changes ({add,del,move,copy}-directory, {add,del,move,copy}-file) |
| - property-changes |
| - binary-changes |
| |
| As a consequence, the representation of a single change can be split across |
| the two parts of the patch. E.g. a modified-file move: the move is represented |
| within (b) while the contextual differences are within (a); a file add: an |
| add-file Editor Command in (b) plus its content in (a). |
| |
| Furthermore, we never end up with redundant information, except for |
| property-changes, which appear in both parts. A file copy with modifications |
| generates (a) a contextual diff and (b) an add-file w/ copy-path. |
| |
| The only things left unreadable to a human are tree-changes as defined above. |
| However, a higher level layer (e.g. GUIs) would be perfectly able to |
| base64-decode, uncompress and read operations to visually render the changes. |
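| |
| As a rough illustration of that decoding step, the Python sketch below |
| reverses the encoding described in II (gzip, then base64). The helper names |
| encode_block and decode_block and the 76-column wrap width are illustrative |
| assumptions, not part of Subversion, and the sample is a made-up command |
| stream rather than a real patch. |
| |
```python
import base64
import gzip


def encode_block(protocol_bytes, width=76):
    """Gzip the bytes, base64-encode them, and wrap the text to `width` columns.

    The wrap width is an assumption chosen for illustration only.
    """
    encoded = base64.b64encode(gzip.compress(protocol_bytes)).decode("ascii")
    return "\n".join(encoded[i:i + width] for i in range(0, len(encoded), width))


def decode_block(block_text):
    """Reverse encode_block: drop line breaks, base64-decode, then gunzip."""
    joined = "".join(block_text.split())
    return gzip.decompress(base64.b64decode(joined))


# Made-up Editor Command stream, used only to exercise the round trip.
sample = b"( open-root ( ( ) 2:d0 ) ) ( close-dir ( 2:d0 ) ) ( close-edit ( ) ) "
block = encode_block(sample)
decoded = decode_block(block)
```
| |
| A frontend would call something like decode_block on the text that follows |
| the (b) header and hand the resulting bytes to a protocol reader. |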
| |
| The (b) block starts with a header line that includes its version number. |
| |
| Here's what a directory add, a file add and a propset would look like: |
| |
| [[[ |
| Index: bar |
| =================================================================== |
| --- bar |
| +++ bar |
| @@ -0,0 +1,2 @@ |
| +This is bar content. |
| + |
| |
| Property changes on: bar |
| ___________________________________________________________________ |
| Name: newprop |
| + propval |
| |
| ======================== SVNPATCH BLOCK 1 ========================= |
| H4sICOz0mEYAA291dABtjsEKwyAMhu97Co/tQejcoZC3cU26CWLElu31l0ZXulE8GL//8yed4UzJ |
| FubVdHJ64wAHuXp5eEQ7h0gy3uDuS80cTFc1qzQ9fXqQejYXzoLUGCHRu4ERtuHl4/5rq8ZQtHlm |
| /jajOzZHXqhZGp3i4Qe3df931IwwrBVePlyTX//3AAAA |
| ]]] |
| |
| Let's base64-decode and uncompress the above block (lines are wrapped): |
| |
| ( open-root ( ( ) 2:d0 ) ) ( add-file ( 3:bar 2:d0 2:c1 ( ) ) ) ( |
| change-file-prop ( 2:c1 7:newprop ( 7:propval ) ) ) ( add-dir ( 3:foo 2:d0 2:d2 |
| ( ) ) ) ( close-dir ( 2:d2 ) ) ( close-dir ( 2:d0 ) ) ( close-file ( 2:c1 ( ) ) |
| ) ( close-edit ( ) ) |
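| |
| The decoded text can be read with a very small parser. The Python sketch |
| below is only an illustration of the syntax visible above (parenthesized |
| lists, length-prefixed strings such as 3:bar, bare words, and numbers); the |
| function name parse_items is hypothetical, it is not Subversion's reader, and |
| it assumes every item is separated by whitespace. |
| |
```python
def parse_items(data):
    """Parse whitespace-separated items into nested Python lists.

    Words become str, length-prefixed strings become bytes, numbers become int.
    """
    pos, n = 0, len(data)
    stack = [[]]                      # stack[0] collects the top-level items
    while pos < n:
        ch = data[pos:pos + 1]
        if ch.isspace():
            pos += 1
        elif ch == b"(":
            stack.append([])
            pos += 1
        elif ch == b")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')' at offset %d" % pos)
            finished = stack.pop()
            stack[-1].append(finished)
            pos += 1
        elif ch.isdigit():
            end = pos
            while end < n and data[end:end + 1].isdigit():
                end += 1
            number = int(data[pos:end])
            if data[end:end + 1] == b":":
                # Length-prefixed string: take exactly `number` bytes.
                start = end + 1
                if start + number > n:
                    raise ValueError("truncated string at offset %d" % pos)
                stack[-1].append(data[start:start + number])
                pos = start + number
            else:
                stack[-1].append(number)
                pos = end
        else:
            # Bare word: runs up to the next whitespace.
            end = pos
            while end < n and not data[end:end + 1].isspace():
                end += 1
            stack[-1].append(data[pos:end].decode("ascii"))
            pos = end
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


# The decoded example from this section, line-wrapped as shown above.
text = (b"( open-root ( ( ) 2:d0 ) ) ( add-file ( 3:bar 2:d0 2:c1 ( ) ) ) (\n"
        b"change-file-prop ( 2:c1 7:newprop ( 7:propval ) ) ) ( add-dir ( 3:foo 2:d0 2:d2\n"
        b"( ) ) ) ( close-dir ( 2:d2 ) ) ( close-dir ( 2:d0 ) ) ( close-file ( 2:c1 ( ) )\n"
        b") ( close-edit ( ) )")
commands = parse_items(text)
names = [command[0] for command in commands]
```
| |
| Each entry of commands is a two-element list: the command word followed by |
| its parameter list. |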
| |
| Further examples can be found in the subversion/tests/cmdline/diff_tests.py |
| test suite. |
| |
| |
| IV SVNPATCH EDIT-ABILITY |
| --------------------- |
| |
| Because it is encoded and compressed, the computer-readable chunk (b) is not |
| directly editable. Even if it were in cleartext, the user would still have to |
| write svn protocol by hand -- calculating checksums and string lengths, and |
| placing tokens -- which is hardly friendly for the end-user. However, there's |
| a much easier workaround: apply the patch, and then start editing the working |
| copy with regular svn subcommands. |
| |
| |
| V PATCHING |
| -------- |
| |
| When it comes to applying an svnpatch patch (RAS syndrome), the 'svn patch' |
| subcommand is a good friend. We support applying (a) Unidiffs internally, and |
| (b) is handled with routines that read editor commands from the patch file |
| and drive the editor functions, much like what libsvn_ra_svn does with a |
| network stream. |
| |
| Now some words about the order in which (a) and (b) are processed. There might |
| be cases when operations on a single file live in the two parts of the patch |
| (see above). Since Unidiff indexes are made against the most up-to-date file |
| name, it makes sense that 'svn patch' first deals with the svnpatch block and |
| then the Unidiff block. E.g. consider a WC with a file copy from foo to bar |
| and then contextual modifications to bar. The patch that represents these WC |
| changes would show diffs against the 'bar' file. So 'svn patch' first has to |
| schedule-add-with-history bar from foo and then apply contextual diffs, which |
| would not work the other way around. |
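| |
| The ordering can be sketched in Python as follows. This is a minimal |
| illustration, not the 'svn patch' implementation: the header pattern is |
| inferred from the single example in III (a run of '=', the words SVNPATCH |
| BLOCK, a number taken to be the version, another run of '='), and the names |
| split_patch and plan_application are hypothetical. |
| |
```python
import re

# Assumed header shape, modelled on the one example block in section III.
HEADER_RE = re.compile(r"^=+ SVNPATCH BLOCK (\d+) =+[ \t]*$", re.MULTILINE)


def split_patch(patch_text):
    """Return (unidiff_text, version, block_text).

    version and block_text are None when the patch has no (b) part.
    """
    match = HEADER_RE.search(patch_text)
    if match is None:
        return patch_text, None, None
    unidiff = patch_text[:match.start()]
    block = patch_text[match.end():].strip()
    return unidiff, int(match.group(1)), block


def plan_application(patch_text):
    """List the parts in the order they should be applied: (b) first, then (a)."""
    unidiff, _version, block = split_patch(patch_text)
    steps = []
    if block:
        steps.append(("svnpatch", block))   # tree, property and binary changes
    if unidiff.strip():
        steps.append(("unidiff", unidiff))  # contextual diffs against final names
    return steps


# Tiny made-up patch; the base64 payload is a placeholder, not real data.
patch = (
    "Index: bar\n"
    "===================================================================\n"
    "--- bar\n"
    "+++ bar\n"
    "@@ -0,0 +1 @@\n"
    "+This is bar content.\n"
    "\n"
    "======================== SVNPATCH BLOCK 1 =========================\n"
    "H4sIAAAA\n"
)
steps = plan_application(patch)
```
| |
| Note that the plain '=' separator under the Index line does not match the |
| header pattern, so only the SVNPATCH BLOCK line splits the patch. |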
| |
| When the Editor Command Set is extended, 'svn patch' will face unexpected |
| commands and/or syntax. As in libsvn_ra_svn, we warn the user with |
| 'unsupported command' messages and skip the command. |
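| |
| A minimal Python sketch of that warn-and-skip behaviour is shown below. The |
| drive_editor function, the handler table and the frobnicate-node command are |
| all made up for illustration; the real logic lives in the C client code. |
| |
```python
import warnings


def drive_editor(commands, handlers):
    """Call handlers[name](params) for each (name, params) pair.

    Commands without a handler trigger an 'unsupported command' warning and
    are skipped. Returns the list of skipped command names.
    """
    skipped = []
    for name, params in commands:
        handler = handlers.get(name)
        if handler is None:
            warnings.warn("unsupported command: %s" % name)
            skipped.append(name)
            continue
        handler(params)
    return skipped


applied = []
handlers = {
    "open-root": lambda params: applied.append("open-root"),
    "close-edit": lambda params: applied.append("close-edit"),
}
commands = [
    ("open-root", [[], b"d0"]),
    ("frobnicate-node", [b"d0"]),   # hypothetical command from a newer format
    ("close-edit", []),
]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    skipped = drive_editor(commands, handlers)
```
| |
| The known commands still run in order, so an older client can apply as much |
| of a newer patch as it understands. |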
| |
| |
| VI FUZZING a.k.a. DYSTOPIA |
| ----------------------- |
| |
| The svn protocol is not very amenable to fuzzing since most operations include |
| a revision number. However, sticking with this policy would greatly narrow the |
| patch-application scope we're expecting. For instance, 'svn patch' would fail |
| at deleting dir@REV when REV is different from the one that comes with the |
| delete-entry Editor Command. Obviously we need to be looser here, and the |
| solution is to free the svn protocol from revision numbers and checksums in our |
| implementation for every change but binary-changes (which keep their |
| checksums). (It would be insane to associate binary stuff with fuzzing in this |
| world.) Dealing with (b) patching then becomes similar in many ways to GNU |
| Patch: we try by all methods to drive the editor through the dark jungle, |
| possibly failing in a few cases and issuing 'hunk failed' warnings. |
| |
| |
| VII PATCH AND MERGE IN SUBVERSION |
| ----------------------------- |
| |
| 'svn patch' is similar in many ways to 'svn merge'. Basically, we have a |
| tree-delta in hand that we want to apply to a working-tree. Thus it's not |
| surprising to see they have a lot in common when comparing both implementations. |
| 'patch' uses a mix of revamped merge_callbacks (see libsvn_client/merge.c) and |
| repos-repos editor functions (see libsvn_client/repos_diff.c). Why not merge |
| those two together then, for the sake of code sharing? Well, although they |
| share similar logic, joining the two implies having one single file |
| (repos_diff.c) handle at least three burdens: repos-repos diff, merge, and |
| patch. Such a design can't be achieved without a myriad of tests/conditions |
| and a large amount of blurry mess from mixing three different tools in one |
| place. In the end, what was supposed to enhance software maintainability |
| turned out to cause a lot of damage by tying different things together. |