| Implementing Incomplete Directory Support in SVN |
| |
| ######################################################################### |
| ### ### |
| ### Note: Although this feature was called "incomplete directories" ### |
| ### while under development, we might want to call it something ### |
| ### else when it goes live. "Incomplete" makes it sound like ### |
| ### there's something wrong with the directory, something missing. ### |
| ### Perhaps "sparse directories" or "partial directories" would be ### |
| ### less user-frightening. ### |
| ### ### |
| ######################################################################### |
| |
| Contents |
| ======== |
| |
| 1. Design |
| 2. User Interface |
| 3. Examples |
| 4. Implementation Strategy |
| 5. Current Status |
| |
| 1. Design |
| ========= |
| |
| This design document started out as a post by Eric Gillespie: |
| |
| http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=117053 |
| From: Eric Gillespie <epg@pretzelnet.org> |
| To: dev@subversion.tigris.org |
| Subject: [PROPOSAL] Incomplete working copies (issue #695) |
| Date: Thu, 22 Jun 2006 22:35:06 -0700 |
| Message-ID: <25668.1151040906@gould.diplodocus.org> |
| |
| [The design has evolved since then; the text below is not exactly |
| the same as what Eric posted, but has the same general ideas.] |
| |
| I'd like to propose a new solution to this issue, and hopefully get |
| it into 1.5. What I'm really looking for is the kind of |
| flexibility Perforce's client specs give you in choosing which |
| parts of a tree you check out. |
| |
| I don't think Ben Reser's proposal |
| (http://svn.haxx.se/dev/archive-2005-07/0398.shtml) covers this. |
| Using his first example, there is no way to avoid pulling in |
| trunk/foo/images/another-big-dir when it is added. |
| |
| This is based on an idea from Karl Fogel. |
| |
| Implementing Incomplete Directory Support in SVN |
| ================================================== |
| |
| Many users have very large trees of which they only want to |
| check out certain parts. Today, checkout -N is not up to this task. |
| This proposal introduces the --depth option to the checkout, |
| switch, and update subcommands as a replacement for -N. It |
| allows working copies to have very specific contents, leaving out |
| everything the user does not want. |
| |
| This is similar to Perforce's client specs, but without the ability |
| to give a repository entry a different name in the working copy. |
| (We actually already have that renaming capability in switch.) |
| |
| Depth: |
| |
| We have a new "depth" field in .svn/entries, which has (currently) |
| four possible values: depth-empty, depth-files, depth-immediates, |
| and depth-infinity. Only this_dir entries may have depths other |
| than depth-infinity. |
| |
| depth-empty ------> Updates will not pull in any files or |
| subdirectories not already present. |
| |
| depth-files ------> Updates will pull in any files not already |
| present, but not subdirectories. |
| |
| depth-immediates -> Updates will pull in any files or |
| subdirectories not already present; those |
| subdirectories' this_dir entries will |
| have depth-empty. |
| |
| depth-infinity ---> Updates will pull in any files or |
| subdirectories not already present; those |
| subdirectories' this_dir entries will |
| have depth-infinity. Equivalent to |
| today's default update behavior. |
| |
| The --depth option sets depth values as it updates the working |
| copy, setting any new subdirectories' this_dir depth values as |
| described above. |
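
A minimal sketch of the rules above, written in Python purely for
illustration. The names Depth, pulls_in, and new_subdir_depth are
hypothetical and are not part of any Subversion API; the sketch only
restates the list of depth values.

```python
from enum import Enum
from typing import Optional


class Depth(Enum):
    """The four depth values named in the design above."""
    EMPTY = "depth-empty"
    FILES = "depth-files"
    IMMEDIATES = "depth-immediates"
    INFINITY = "depth-infinity"


def pulls_in(depth: Depth, kind: str) -> bool:
    """Would an update of a directory at `depth` pull in a missing child?

    `kind` is "file" or "dir".
    """
    if depth is Depth.EMPTY:
        return False
    if depth is Depth.FILES:
        return kind == "file"
    return True  # IMMEDIATES and INFINITY pull in files and subdirectories


def new_subdir_depth(depth: Depth) -> Optional[Depth]:
    """Depth recorded in a newly added subdirectory's this_dir entry.

    Returns None when the parent's depth does not pull in subdirectories.
    """
    if depth is Depth.IMMEDIATES:
        return Depth.EMPTY
    if depth is Depth.INFINITY:
        return Depth.INFINITY
    return None
```

For example, pulls_in(Depth.FILES, "dir") is False, and
new_subdir_depth(Depth.IMMEDIATES) is Depth.EMPTY.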
| |
| 2. User Interface |
| ================= |
| |
| Affected commands: |
| |
| * checkout |
| * switch |
| * update |
| * status |
| * info |
| |
| The -N option becomes a synonym for --depth=files for these commands. |
| This changes the existing -N behavior for these commands, but in a |
| trivial way (see below). |
| |
| checkout without --depth or -N behaves the same as it does today. |
| switch and update without --depth or -N behave the same way as |
| today IFF the working copy is fully depth-infinity. switch and |
| update without --depth or -N will NOT change depth values |
| (exception: a missing directory specified on the command line will |
| be pulled in). |
| |
| Thus, 'checkout' is identical to 'checkout --depth=infinity', but |
| 'switch' and 'update' are not the same as 'switch --depth=infinity' and |
| 'update --depth=infinity'. The former update entries according to |
| existing depth values, while the latter pull in everything. |
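
The defaulting rules above can be summarized as a small decision
function. This is a sketch under stated assumptions, not the client's
actual logic: the function and parameter names are hypothetical, and
letting --depth win when both --depth and -N are given is an
assumption that the text does not settle.

```python
from typing import Optional


def effective_depth(command: str,
                    depth_option: Optional[str] = None,
                    non_recursive: bool = False,
                    wc_depth: Optional[str] = None) -> str:
    """Pick the depth an operation should use.

    command       -- "checkout", "switch", or "update"
    depth_option  -- value of --depth, if one was given
    non_recursive -- True if -N was given
    wc_depth      -- depth already recorded for the target, or None if
                     the target is not in the working copy yet
    """
    if depth_option is not None:
        return depth_option          # assumption: --depth beats -N
    if non_recursive:
        return "files"               # -N is a synonym for --depth=files
    if command == "checkout" or wc_depth is None:
        return "infinity"            # plain checkout, or a missing target
    return wc_depth                  # plain switch/update keeps the depth


assert effective_depth("checkout") == "infinity"
assert effective_depth("update", wc_depth="empty") == "empty"
assert effective_depth("update", depth_option="infinity",
                       wc_depth="empty") == "infinity"
```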
| |
| To get started, run checkout with --depth=empty or --depth=files. |
| If additional files or directories are desired, pull them in with |
| update commands using appropriate --depth options. |
| |
| The 'svn status' command should list the depth of each directory, |
| in addition to whatever status information it currently lists. |
| |
| The 'svn info' command should list the depth, IFF invoked on a directory. |
| [I believe it already does, on this branch. -kfogel] |
| |
| 3. Examples |
| =========== |
| |
| svn co http://.../A |
| |
| Same as today; everything has depth-infinity. |
| |
| svn co -N http://.../A |
| |
| Today, this creates a wc containing only mu. Now, this will be |
| identical to 'svn co --depth=files http://.../A'. |
| |
| svn co --depth=empty http://.../A Awc |
| |
| Creates wc Awc, but *empty*. |
| |
| Awc/.svn/entries this_dir depth-empty |
| |
| svn co --depth=files http://.../A Awc1 |
| |
| Creates wc Awc1 with all files (i.e., Awc1/mu) but no |
| subdirectories. |
| |
| Awc1/.svn/entries this_dir depth-files |
| ... |
| |
| svn co --depth=immediates http://.../A Awc2 |
| |
| Creates wc Awc2 with all files and all subdirectories, but |
| subdirectories are *empty*. |
| |
| Awc2/.svn/entries this_dir depth-immediates |
| B |
| C |
| Awc2/B/.svn/entries this_dir depth-empty |
| Awc2/C/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up Awc/B |
| |
| Since B is not yet checked out, add it at depth infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| Awc/B/.svn/entries this_dir depth-infinity |
| ... |
| Awc/B/E/.svn/entries this_dir depth-infinity |
| ... |
| ... |
| |
| svn up Awc |
| |
| Since A is already checked out, don't change its depth, just |
| update it. B and everything under it are at depth-infinity, |
| so they will be updated just as today. |
| |
| svn up --depth=immediates Awc/D |
| |
| Since D is not yet checked out, add it at depth-immediates. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| D |
| Awc/D/.svn/entries this_dir depth-immediates |
| ... |
| Awc/D/G/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up --depth=empty Awc/B/E |
| |
| Remove everything under E, but leave E as an empty directory |
| since B is depth-infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| D |
| Awc/B/.svn/entries this_dir depth-infinity |
| ... |
| Awc/B/E/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up --depth=empty Awc/D |
| |
| Remove everything under D, and D itself since A is depth-empty. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| |
| svn up Awc/D |
| |
| Bring D back at depth-infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| ... |
| Awc/D/.svn/entries this_dir depth-infinity |
| ... |
| ... |
| |
| svn up --depth=immediates Awc |
| |
| Bring in everything that's missing (C/ and mu) and empty all |
| subdirectories (and set their this_dir to depth-empty). |
| |
| Awc/.svn/entries this_dir depth-immediates |
| B |
| C |
| D |
| Awc/B/.svn/entries this_dir depth-empty |
| Awc/C/.svn/entries this_dir depth-empty |
| Awc/D/.svn/entries this_dir depth-empty |
| ... |
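
The Awc sequence above can be replayed with a toy model. This is only
a sketch under several assumptions: it tracks directories and their
this_dir depths but not files (so mu is not modeled), it assumes the
repository is the unchanging Greek tree from the test suite, it names
the working copy root "A" rather than "Awc", and the helpers remove,
populate, and update are hypothetical names, not Subversion functions.

```python
# Directory -> names of its subdirectories (Greek tree, files omitted).
REPO = {
    "A": ["B", "C", "D"],
    "A/B": ["E", "F"],
    "A/B/E": [],
    "A/B/F": [],
    "A/C": [],
    "A/D": ["G", "H"],
    "A/D/G": [],
    "A/D/H": [],
}


def remove(wc, path):
    """Drop `path` and everything beneath it from the working copy."""
    for p in [p for p in wc if p == path or p.startswith(path + "/")]:
        del wc[p]


def populate(wc, path, depth):
    """Record `path` at `depth`, then add or remove subdirs to match."""
    wc[path] = depth
    for name in REPO[path]:
        child = f"{path}/{name}"
        if depth == "infinity":
            populate(wc, child, "infinity")
        elif depth == "immediates":
            populate(wc, child, "empty")
        else:  # "empty" or "files": no subdirectories are kept
            remove(wc, child)


def update(wc, path, depth=None):
    """Model 'svn up [--depth=DEPTH] PATH' on the directory skeleton."""
    parent = path.rpartition("/")[0]
    if path not in wc:
        # A missing target is pulled in, at depth-infinity by default.
        populate(wc, path, depth or "infinity")
    elif depth is not None:
        populate(wc, path, depth)
        # An emptied directory under a depth-empty parent goes away.
        if depth == "empty" and wc.get(parent) == "empty":
            remove(wc, path)
    # An existing path with no --depth keeps its recorded depths.


wc = {"A": "empty"}               # svn co --depth=empty http://.../A Awc
update(wc, "A/B")                 # svn up Awc/B
update(wc, "A")                   # svn up Awc
update(wc, "A/D", "immediates")   # svn up --depth=immediates Awc/D
update(wc, "A/B/E", "empty")      # svn up --depth=empty Awc/B/E
update(wc, "A/D", "empty")        # svn up --depth=empty Awc/D
update(wc, "A/D")                 # svn up Awc/D
update(wc, "A", "immediates")     # svn up --depth=immediates Awc
```

After the last call, wc records the root at "immediates" and each of
its subdirectories at "empty".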
| |
| 4. Implementation Strategy |
| ========================== |
| |
| It would be nice if all this could be accomplished with just simple |
| tweaks to how we drive the update reporter (svn_ra_reporter2_t). |
| However, it looks like it's not going to be that easy. |
| |
| Handling 'checkout --depth=empty' would be easy. It should get us |
| an empty directory at depth-empty, with no files and no subdirs, |
| and if we just report it as at HEAD every time, the server will |
| never send updates down (hmmm, this could be a problem for getting |
| dir property updates, though). Then any files or subdirs we have |
| explicitly included we can just report at their respective |
| revisions, and get proper updates; at least that'll work for the |
| depth infinity ones. |
| |
| But consider 'checkout --depth=immediates'. The desired state is a |
| depth-immediates directory D, with all files up-to-date, and with |
| skeleton subdirs at depth-empty. Plain updates should preserve this |
| state of affairs. |
| |
| If we report D as at its BASE revision, files at their BASE |
| revisions, and subdirs at HEAD, then: |
| |
| - When new files appear in the repos, they'll get sent down (good) |
| - When new subdirs appear, they'll get sent down in full (bad) |
| |
| But if we don't report subdirs as at HEAD, then the server will try to |
| update them (bad). And if we report D at HEAD, then the working copy |
| won't receive new files that have appeared in the repository since D's |
| BASE revision (note that we *can* get updates for files we already |
| have, though, by continuing to report them at their respective BASEs). |
| |
| The same logic applies to subdirectories at depth-files or |
| depth-immediates. |
| |
| So, I think this means that for efficient depth handling, we'll |
| need to have the client directly reporting the desired depth to the |
| server; i.e., extending the RA protocol. |
| |
| Meanwhile, legacy servers will send back a bunch of information the |
| client doesn't want, and the client will just ignore it, and |
| everything will be slower than it needs to be, and people will |
| complain on the users@ list, and we'll tell them to upgrade their |
| servers, and they'll say they can't because they don't have control |
| over the server, and we'll say "So? This ain't no Grand Hotel!" |
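
One way a new client might discard unwanted items sent by a
depth-unaware server is a per-item filter like the one below. This is
a hypothetical sketch, not code from the branch: the function name
wanted and the (relpath, kind) representation of incoming items are
assumptions made for illustration, and it only considers items that
are new to the working copy.

```python
def wanted(relpath: str, kind: str, depth: str) -> bool:
    """Should the client keep a new item the server sent?

    relpath -- path relative to the directory being updated, using "/"
    kind    -- "file" or "dir"
    depth   -- depth of the directory being updated
    """
    if depth not in ("empty", "files", "immediates", "infinity"):
        raise ValueError(f"unknown depth: {depth!r}")
    if depth == "infinity":
        return True
    if depth == "empty":
        return False
    if "/" in relpath:
        return False  # files and immediates never reach past the children
    # Immediate child: files always qualify, directories only at immediates.
    return kind == "file" or depth == "immediates"


incoming = [("mu", "file"), ("B", "dir"), ("B/lambda", "file"), ("B/E", "dir")]
kept = [path for path, kind in incoming if wanted(path, kind, "immediates")]
# kept == ["mu", "B"]
```

Note that filtering like this saves disk space but not bandwidth,
since the server has already sent everything.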
| |
| 5. Current Status |
| ================= |
| |
| http://svn.collab.net/repos/svn/branches/incomplete-directories/ |
| contains the latest code. |
| |
| *** The most important thing to know is that the branch code *** |
| *** implements an earlier three-depth scheme (0, 1, infinity) *** |
| *** and does not yet reflect the new four-depth universe. *** |
| |
| A new enum type, 'svn_depth_t', is defined in svn_types.h. |
| Both client and server side now understand the concept of depth, |
| and the basic update use cases handle depth. See depth_tests.py |
| for what is known to be working. (Many edge cases are not yet |
| handled correctly.) |
| |
| On the client side, most of the svn_client.h interfaces that |
| formerly took 'svn_boolean_t recurse' now take 'svn_depth_t depth'. |
| Some of this recurse-becomes-depth change has propagated down into |
| libsvn_wc, which now stores a depth field in svn_wc_entry_t (and |
| therefore in .svn/entries). The update reporter knows to report |
| differing depths to the server, in the same way it already reports |
| differing revisions. In other words, take the concept of "mixed |
| revision" working copies and extend it to "mixed depth" working |
| copies. |
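
The "mixed depth" analogy can be sketched as follows. This is an
illustration only: the real reporter drives svn_ra_reporter2_t and is
considerably more involved. Here the working copy is a flat
dictionary, the names parent_of and build_report are hypothetical, and
"differs from its parent" is taken literally as a simplification.

```python
from typing import Dict, List, Tuple

# path -> (revision, depth); "" is the root of the working copy.
WorkingCopy = Dict[str, Tuple[int, str]]


def parent_of(path: str) -> str:
    """Return the parent path, or "" for a top-level entry."""
    return path.rpartition("/")[0]


def build_report(wc: WorkingCopy) -> List[Tuple[str, int, str]]:
    """List the root plus every path whose revision or depth differs
    from its parent's. Every path's parent must also be in `wc`."""
    report = [("",) + wc[""]]
    for path in sorted(p for p in wc if p):
        if wc[path] != wc[parent_of(path)]:
            report.append((path,) + wc[path])
    return report


wc = {
    "": (10, "empty"),
    "B": (10, "infinity"),     # differs from the root in depth
    "B/E": (10, "infinity"),   # same as B, so not reported
    "B/F": (8, "infinity"),    # differs from B in revision
    "D": (10, "immediates"),   # differs from the root in depth
}
report = build_report(wc)
# report == [("", 10, "empty"), ("B", 10, "infinity"),
#            ("B/F", 8, "infinity"), ("D", 10, "immediates")]
```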
| |
| On the server side, most of the significant changes are in |
| libsvn_repos/reporter.c. The code that receives update reports now |
| receives notice of paths that have different depths from their |
| parent, and of course the overall update operation has a global |
| depth, which applies whenever not shadowed by some local depth for |
| a given path. |
| |
| The RA code on both sides knows how to send and receive depths; the |
| relevant svn_ra_* APIs now take depth arguments, which sometimes |
| supersede older 'recurse' booleans. In these cases, the RA layer |
| does the usual compatibility dance: receiving "recurse=FALSE" from |
| an older client causes the server to behave as if "depth=files" |
| had been transmitted. |
| |
| Work remaining, in no particular order: |
| |
| * There is still no compatibility code for new clients dealing |
| with old servers. This is the "legacy servers will send back |
| a bunch of information the client doesn't want" scenario |
| described in the 'Implementation Strategy' section above. The |
| client doesn't yet know how to ignore information it doesn't |
| want; it'll just do whatever the server tells it. I'm not sure |
| how useful the compatibility mode is, since it wouldn't really |
| shorten the wall clock time of the operations by much, |
| although it would still save the disk space. |
| |
| * There's no interface for getting rid of stuff once you've |
| brought it into your working copy -- no "exclusion" interface, |
| in other words. So you can do this: |
| |
| $ svn co --depth=empty http://.../repos/greek-tree/ |
| $ cd greek-tree |
| $ svn up A |
| |
| ...and that will get you A/ at depth-infinity. But once you |
| no longer need A/, there's no way to do: |
| |
| $ svn exclude A ## or whatever the command is named |
| |
| In fact, you can't yet even do: |
| |
| $ svn up --depth=empty A |
| |
| ...to at least "fold up" A/ and save the disk space. |
| |
| * I've put a lot of "### TODO" comments on the branch. Do a |
| branch diff to see them. |
| |
| * Certain APIs need to behave specially when passed |
| svn_depth_unknown: they need to treat it as meaning "go get |
| the depth from the working copy, and either use it directly or |
| calculate the appropriate depth based on it". |
| |
| Right now, only svn_client_checkout3 really does this |
| properly (see r21910 and r21829). Other APIs that probably |
| should do it are: |
| |
| svn_client_diff4() |
| svn_client_diff_summarize2() |
| svn_client_diff_summarize_peg2() |
| svn_client_diff_peg3() |
| svn_client_merge3() |
| svn_client_merge_peg3() |
| svn_client_update3() |
| |
| * Some APIs still take recurse booleans. It's not clear to me |
| that all of these should be switched to depth, but the |
| question needs some more consideration: |
| |
| svn_client_import() # Note: takes 'nonrecursive' right now |
| svn_client_revert() # Any point taking depth? |
| svn_client_commit4() # Is manual control of depth needed here? |
| svn_client_propset2() # Same question here. |
| svn_client_propget2() # Same question here. |
| svn_client_proplist2() # Same question here. |
| svn_client_resolved() # Same question here. |
| |
| * Small bug in how depth is stored in entries file format: |
| |
| Suppose we have A at depth-empty and A/B at depth-files. |
| A/.svn/entries will have a short "dir" entry for B. |
| Naturally, that entry does not mention B's revision or other |
| details, because that stuff should live in A/B/.svn/entries. |
| But for some reason, A/.svn/entries *does* list B's depth. |
| That's bad. It shouldn't talk about B's depth; one should go |
| look in A/B/.svn/entries for B's depth. I'm sure this is a |
| simple fix in the entry reading/writing code; I just haven't |
| had a chance to chase it down yet. |
| |
| * I haven't done anything with 'svn status' yet, and don't know |
| whether it would behave correctly w.r.t. depths out of the box. |
| Clearly, this needs investigation. |
| |
| * I haven't done anything with either 'svn switch --depth' or |
| 'svn switch' handling mixed-depth working copies automatically. |
| Probably some bits work right now, and other bits don't. |
| |
| * All of my testing has been over svn:// and sometimes file://. |
| All the necessary changes are in for http:// as well, but they |
| are still untested. |