| Implementing Sparse Directory Support in SVN |
| |
| ######################################################################### |
| ### ### |
| ### Note: This feature used to be called "incomplete directories"; ### |
| ### It is now called "sparse directories", because "incomplete" ### |
| ### made it sound like something was wrong with your directories. ### |
| ### ### |
| ######################################################################### |
| |
| Contents |
| ======== |
| |
| 1. Design |
| 2. User Interface |
| 3. Examples |
| 4. Implementation Strategy |
| 5. Compatibility Matters |
| 6. Current Status |
| |
| 1. Design |
| ========= |
| |
| This design document started out as a post by Eric Gillespie: |
| |
| http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=117053 |
| From: Eric Gillespie <epg@pretzelnet.org> |
| To: dev@subversion.tigris.org |
| Subject: [PROPOSAL] Incomplete working copies (issue #695) |
| Date: Thu, 22 Jun 2006 22:35:06 -0700 |
| Message-ID: <25668.1151040906@gould.diplodocus.org> |
| |
| [The design has evolved since then; the text below is not exactly |
| the same as what Eric posted, but has the same general ideas.] |
| |
| I'd like to propose a new solution to this issue, and hopefully get |
| it into 1.5. What I'm really looking for is the kind of |
| flexibility Perforce has with its client specs in which parts of a |
| tree you check out. |
| |
| I don't think Ben Reser's proposal |
| (http://svn.haxx.se/dev/archive-2005-07/0398.shtml) covers this. |
| Using his first example, there is no way to avoid pulling in |
| trunk/foo/images/another-big-dir when it is added. |
| |
| This is based on an idea from Karl Fogel. |
| |
| Implementing Incomplete Directory Support in SVN |
| ================================================== |
| |
| Many users have very large trees of which they want to check |
| out only certain parts. Today, checkout -N is not up to this task. |
| This proposal introduces the --depth option to the checkout, |
| switch, and update subcommands as a replacement for -N, which |
| allows working copies to have very specific contents, leaving out |
| everything the user does not want. |
| |
| This is similar to Perforce's client specs, but without the ability |
| to have a repository entry have a different name in the working |
| copy. We actually already have this capability in switch. |
| |
| Depth: |
| |
| We have a new "depth" field in .svn/entries, which has (currently) |
| four possible values: depth-empty, depth-files, depth-immediates, |
| and depth-infinity. Only this_dir entries may have depths other |
| than depth-infinity. |
| |
| depth-empty ------> Updates will not pull in any files or |
| subdirectories not already present. |
| |
| depth-files ------> Updates will pull in any files not already |
| present, but not subdirectories. |
| |
| depth-immediates -> Updates will pull in any files or |
| subdirectories not already present; those |
| subdirectories' this_dir entries will |
| have depth-empty. |
| |
| depth-infinity ---> Updates will pull in any files or |
| subdirectories not already present; those |
| subdirectories' this_dir entries will |
| have depth-infinity. Equivalent to |
| today's default update behavior. |
| |
| The --depth option sets depth values as it updates the working |
| copy, setting any new subdirectories' this_dir depth values as |
| described above. |
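| |
| The four depth values above can be modelled as a small enum. The |
| sketch below is illustrative only: the names Depth, pulls_in_files |
| and new_subdir_depth are hypothetical and are not part of any |
| Subversion API. It just restates the table above as code. |
| |

```python
from enum import Enum


class Depth(Enum):
    """The four depth values described above (illustrative names)."""
    EMPTY = "empty"
    FILES = "files"
    IMMEDIATES = "immediates"
    INFINITY = "infinity"


def pulls_in_files(depth):
    """True if an update adds files that are not already present."""
    return depth is not Depth.EMPTY


def new_subdir_depth(depth):
    """Depth given to a newly added subdirectory's this_dir entry.

    Returns None when the parent's depth does not pull in
    subdirectories at all (depth-empty and depth-files).
    """
    if depth is Depth.IMMEDIATES:
        return Depth.EMPTY
    if depth is Depth.INFINITY:
        return Depth.INFINITY
    return None
```

| |
| For instance, new_subdir_depth(Depth.IMMEDIATES) is Depth.EMPTY, |
| which is why a depth-immediates checkout yields empty skeleton |
| subdirectories. |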
| |
| 2. User Interface |
| ================= |
| |
| Affected commands: |
| |
| * checkout |
| * switch |
| * update |
| * status |
| * info |
| |
| The -N option becomes a synonym for --depth=files for these commands. |
| This changes the existing -N behavior for these commands, but in a |
| trivial way (see below). |
| |
| checkout without --depth or -N behaves the same as it does today. |
| switch and update without --depth or -N behave the same way as |
| today IFF the working copy is fully depth-infinity. switch and |
| update without --depth or -N will NOT change depth values |
| (exception: a missing directory specified on the command line will |
| be pulled in). |
| |
| Thus, 'checkout' is identical to 'checkout --depth=infinity', but |
| 'switch' and 'update' are not the same as 'switch --depth=infinity' and |
| 'update --depth=infinity'. The former update entries according to |
| existing depth values, while the latter pull in everything. |
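| |
| The option rules above could be sketched as follows. This is a |
| rough model, not the actual client code: requested_depth is a |
| hypothetical helper, and letting an explicit --depth win over -N |
| when both are given is an assumption, since the text above does |
| not say what happens in that case. |
| |

```python
def requested_depth(command, depth_opt=None, non_recursive=False):
    """Return the depth a command should request.

    None means "keep whatever depths the working copy already
    records", which is what plain switch and update do.
    """
    if depth_opt is not None:
        return depth_opt         # explicit --depth (assumed to win over -N)
    if non_recursive:
        return "files"           # -N is a synonym for --depth=files
    if command == "checkout":
        return "infinity"        # plain checkout == --depth=infinity
    return None                  # switch/update: preserve existing depths
```

| |
| Note the asymmetry: requested_depth("checkout") gives "infinity", |
| while requested_depth("update") gives None, so a plain update of a |
| sparse working copy leaves it sparse. |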
| |
| To get started, run checkout with --depth=empty or --depth=files. |
| If additional files or directories are desired, pull them in with |
| update commands using appropriate --depth options. |
| |
| The 'svn status' command should list the depth of each directory, in |
| addition to whatever statuses are currently listed. |
| |
| The 'svn info' command should list the depth, iff invoked on a |
| directory whose depth is not the default (depth infinity). |
| |
| 3. Examples |
| =========== |
| |
| svn co http://.../A |
| |
| Same as today; everything has depth-infinity. |
| |
| svn co -N http://.../A |
| |
| Today, this creates wc containing only mu. Now, this will be |
| identical to 'svn co --depth=files http://.../A'. |
| |
| svn co --depth=empty http://.../A Awc |
| |
| Creates wc Awc, but *empty*. |
| |
| Awc/.svn/entries this_dir depth-empty |
| |
| svn co --depth=files http://.../A Awc1 |
| |
| Creates wc Awc1 with all files (i.e., Awc1/mu) but no |
| subdirectories. |
| |
| Awc1/.svn/entries this_dir depth-files |
| ... |
| |
| svn co --depth=immediates http://.../A Awc2 |
| |
| Creates wc Awc2 with all files and all subdirectories, but |
| subdirectories are *empty*. |
| |
| Awc2/.svn/entries this_dir depth-immediates |
| B |
| C |
| Awc2/B/.svn/entries this_dir depth-empty |
| Awc2/C/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up Awc/B |
| |
| Since B is not yet checked out, add it at depth infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| Awc/B/.svn/entries this_dir depth-infinity |
| ... |
| Awc/B/E/.svn/entries this_dir depth-infinity |
| ... |
| ... |
| |
| svn up Awc |
| |
| Since A is already checked out, don't change its depth, just |
| update it. B and everything under it is at depth-infinity, |
| so it will be updated just as today. |
| |
| svn up --depth=immediates Awc/D |
| |
| Since D is not yet checked out, add it at depth-immediates. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| D |
| Awc/D/.svn/entries this_dir depth-immediates |
| ... |
| Awc/D/G/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up --depth=empty Awc/B/E |
| |
| Remove everything under E, but leave E as an empty directory |
| since B is depth-infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| D |
| Awc/B/.svn/entries this_dir depth-infinity |
| ... |
| Awc/B/E/.svn/entries this_dir depth-empty |
| ... |
| |
| svn up --depth=empty Awc/D |
| |
| Remove everything under D, and D itself since A is depth-empty. |
| |
| Awc/.svn/entries this_dir depth-empty |
| B |
| |
| svn up Awc/D |
| |
| Bring D back at depth-infinity. |
| |
| Awc/.svn/entries this_dir depth-empty |
| ... |
| Awc/D/.svn/entries this_dir depth-infinity |
| ... |
| ... |
| |
| svn up --depth=immediates Awc |
| |
| Bring in everything that's missing (C/ and mu) and empty all |
| subdirectories (and set their this_dir to depth-empty). |
| |
| Awc/.svn/entries this_dir depth-immediates |
| B |
| C |
| Awc/B/.svn/entries this_dir depth-empty |
| Awc/C/.svn/entries this_dir depth-empty |
| ... |
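| |
| Two rules recur in the examples above: what depth the update |
| target ends up with, and whether a directory emptied with |
| --depth=empty survives. A minimal sketch of both follows. The |
| function names are hypothetical, and the depth-files parent case |
| is an extrapolation, since the examples only show depth-empty, |
| depth-immediates and depth-infinity parents. |
| |

```python
def target_depth(already_present, current_depth=None, depth_opt=None):
    """Depth recorded for DIR after 'svn up [--depth=X] DIR'."""
    if depth_opt is not None:
        return depth_opt         # an explicit --depth always applies
    if not already_present:
        return "infinity"        # a missing directory is added in full
    return current_depth         # a plain update never changes depth


def stays_after_emptying(parent_depth):
    """After 'svn up --depth=empty DIR', is DIR itself kept?

    DIR survives as an empty directory only if its parent's depth
    would pull subdirectories in anyway. Treating a depth-files
    parent like a depth-empty one is an assumption.
    """
    return parent_depth in ("immediates", "infinity")
```

| |
| With these, 'svn up --depth=empty Awc/B/E' keeps E (B is |
| depth-infinity), while 'svn up --depth=empty Awc/D' removes D |
| (Awc is depth-empty). |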
| |
| 4. Implementation Strategy |
| ========================== |
| |
| It would be nice if all this could be accomplished with just simple |
| tweaks to how we drive the update reporter (svn_ra_reporter2_t). |
| However, it looks like it's not going to be that easy. |
| |
| Handling 'checkout --depth=empty' would be easy. It should get us |
| an empty directory at depth-empty, with no files and no subdirs, |
| and if we just report it as at HEAD every time, the server will |
| never send updates down (hmmm, this could be a problem for getting |
| dir property updates, though). Then any files or subdirs we have |
| explicitly included we can just report at their respective |
| revisions, and get proper updates; at least that'll work for the |
| depth infinity ones. |
| |
| But consider 'checkout --depth=immediates'. The desired state is a |
| depth-immediates directory D, with all files up-to-date, and with |
| skeleton subdirs at depth empty. Plain updates should preserve this |
| state of affairs. |
| |
| If we report D as at its BASE revision, files at their BASE |
| revisions, and subdirs at HEAD, then: |
| |
| - When new files appear in the repos, they'll get sent down (good) |
| - When new subdirs appear, they'll get sent down in full (bad) |
| |
| But if we don't report subdirs as at HEAD, then the server will try to |
| update them (bad). And if we report D at HEAD, then the working copy |
| won't receive new files that have appeared in the repository since D's |
| BASE revision (note that we *can* get updates for files we already |
| have, though, by continuing to report them at their respective BASEs). |
| |
| The same logic applies to subdirectories at depth-files or |
| depth-immediates. |
| |
| So, I think this means that for efficient depth handling, we'll |
| need to have the client directly reporting the desired depth to the |
| server; i.e., extending the RA protocol. |
| |
| Meanwhile, legacy servers will send back a bunch of information the |
| client doesn't want, and the client will just ignore it, and |
| everything will be slower than it needs to be, and people will |
| complain on the users@ list, and we'll tell them to upgrade their |
| servers, and they'll say they can't because they don't have control |
| over the server, and we'll say "So? This ain't no Grand Hotel!" |
| |
| 5. Compatibility Matters |
| ======================== |
| |
| This feature introduces two new concepts into the RA protocol which |
| will not be understood by older servers: |
| |
| * Reported Depths -- the depths associated with individual paths |
| included by the client in the description (via the |
| svn_ra_reporter_t) of its working copy state. |
| |
| * Requested Depth -- the single depth value used to limit the |
| scope of the server's response to the client. |
| |
| As such, it's useful to understand how these concepts will be |
| handled across the compatibility matrix of depth-aware and |
| non-depth-aware clients and servers. |
| |
| NOTE: in the sections below, it is not necessarily the case that a |
| value or state which is said to be "transmitted" literally has a |
| presence in the RA protocol. Some such bits of state have default |
| values in the protocol and can therefore be effectively transmitted |
| while not literally identifiable in a network trace of the |
| client-server traffic. |
| |
| Depth-aware Clients (DACs) |
| |
| DACs will transmit reported depths (with "infinity" as the |
| default) and will transmit a requested depth (with "unknown" as |
| the default). They will also -- for the sake of older, |
| non-depth-aware servers (NDASs) -- transmit a requested recurse |
| value derived from the requested depth: |
| |
| depth       recurse |
| -----       ------- |
| empty       no |
| files       no |
| unknown     yes |
| immediates  yes |
| infinity    yes |
| |
| When speaking to an NDAS, the requested recurse value is the |
| only thing the server understands, but is obviously more |
| "grainy" than the requested depth concept. The DAC, therefore, |
| must filter out any additional, unwanted data that the server |
| transmits in its response. (This filtering will happen in the |
| RA implementation itself so the RA APIs behave as expected |
| regardless of the server's pedigree.) |
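| |
| The depth-to-recurse table and the client-side filtering could |
| look roughly like this. It is a sketch under assumptions: the |
| (relpath, is_dir) entry format and the names RECURSE_FOR_DEPTH, |
| wanted and filter_response are invented for illustration, and |
| "unknown" is simply passed through unfiltered here, whereas a |
| real client would consult the depths stored in the working copy. |
| |

```python
# Requested recurse value sent for the benefit of older servers.
RECURSE_FOR_DEPTH = {
    "empty": False,
    "files": False,
    "unknown": True,
    "immediates": True,
    "infinity": True,
}


def wanted(relpath, is_dir, depth):
    """True if relpath (relative to the update target, '/'-separated)
    falls within the requested depth."""
    levels = relpath.count("/") + 1   # 1 == immediate child of the target
    if depth in ("infinity", "unknown"):
        return True                   # no filtering done in this sketch
    if depth == "immediates":
        return levels == 1            # files and (empty) subdirs only
    if depth == "files":
        return levels == 1 and not is_dir
    return False                      # "empty": nothing below the target


def filter_response(entries, depth):
    """Drop (relpath, is_dir) entries the client did not ask for."""
    return [(p, d) for (p, d) in entries if wanted(p, d, depth)]
```

| |
| For depth "immediates" an older server recurses fully, so the |
| client discards everything more than one level below the target. |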
| |
| When speaking to a depth-aware server (DAS), the requested |
| recurse value is ignored. A requested depth of "unknown" means |
| "only send information about the stuff in my report, |
| depth-aware-ily". Other requested depth values are honored by |
| the server properly, and the DAC must handle the transformation |
| of any working copy depths from their pre-update to their |
| post-update depths and content as described in `3. Examples'. |
| |
| Non-depth-aware Clients (NDACs) |
| |
| NDACs will never transmit reported depths and never transmit a |
| requested depth. But they will transmit a requested recurse |
| value (either "yes" or "no", with "yes" being the default). (A |
| DAS uses the presence of a requested depth in the actual protocol |
| to distinguish DACs from NDACs, and knows to ignore the |
| requested recurse value transmitted by a DAC.) |
| |
| When speaking to an NDAS, what happens happens. It's the past, |
| man -- you don't get to define the interaction this late in the |
| game! |
| |
| When speaking to a DAS, the not-reported depths are treated like |
| reported depths of "infinity", and the requested recurse values |
| "yes" and "no" map to depths of "infinity" and "files", |
| respectively. |
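| |
| On a depth-aware server, the two defaults described above could |
| be sketched as below. The function names are hypothetical. Using |
| None to stand for "not present in the protocol" is an assumption |
| of this sketch, not a statement about the wire format. |
| |

```python
def depth_from_request(requested_depth=None, recurse=True):
    """Depth a depth-aware server acts on for one request.

    requested_depth is None when the client sent no depth at all,
    which is how the server recognizes a non-depth-aware client.
    """
    if requested_depth is not None:
        return requested_depth      # DAC: the recurse value is ignored
    return "infinity" if recurse else "files"   # NDAC: map recurse


def reported_depth_or_default(reported_depth=None):
    """Paths reported without a depth are treated as depth infinity."""
    return "infinity" if reported_depth is None else reported_depth
```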
| |
| 6. Current Status |
| ================= |
| |
| The sparse-directories code was merged to trunk in r23994. |
| |
| A new enum type, 'svn_depth_t', is defined in svn_types.h. |
| Both client and server side now understand the concept of depth, |
| and the basic update use cases handle depth. See depth_tests.py |
| for what is known to be working. (Many cases are not yet tested, |
| and almost certainly some of them will fail right now.) |
| |
| On the client side, most of the svn_client.h interfaces that |
| formerly took 'svn_boolean_t recurse' now take 'svn_depth_t depth'. |
| (The -N option is deprecated, but it still works: it simply maps to |
| --depth=files, which results in the same behavior as -N used to.) |
| |
| Some of this recurse-becomes-depth change has propagated down into |
| libsvn_wc, which now stores a depth field in svn_wc_entry_t (and |
| therefore in .svn/entries). The update reporter knows to report |
| differing depths to the server, in the same way it already reports |
| differing revisions. In other words, take the concept of "mixed |
| revision" working copies and extend it to "mixed depth" working |
| copies. |
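| |
| The "mixed depth" idea can be illustrated with a toy reporter |
| that walks a tree of directories and emits an entry only where |
| the revision or depth differs from the parent's. This is not the |
| libsvn_wc code: the nested-dict tree format and build_report are |
| invented for this sketch, and files are left out for brevity. |
| |

```python
def build_report(node, path="", parent=None):
    """Return (path, revision, depth) tuples for the root and for
    every directory whose revision or depth differs from its parent."""
    entries = []
    differs = (
        parent is None
        or node["rev"] != parent["rev"]
        or node["depth"] != parent["depth"]
    )
    if differs:
        entries.append((path, node["rev"], node["depth"]))
    for name, child in sorted(node.get("children", {}).items()):
        child_path = f"{path}/{name}" if path else name
        entries.extend(build_report(child, child_path, node))
    return entries


# A working copy shaped like Awc in the examples (revisions made up).
awc = {
    "rev": 10, "depth": "empty",
    "children": {
        "B": {"rev": 10, "depth": "infinity",
              "children": {"E": {"rev": 10, "depth": "infinity"}}},
        "D": {"rev": 12, "depth": "immediates"},
    },
}
report = build_report(awc)
```

| |
| Here B is reported because its depth differs from Awc's, D because |
| both its revision and depth differ, and B/E is not reported |
| because it matches B in both respects. |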
| |
| On the server side, most of the significant changes are in |
| libsvn_repos/reporter.c. The code that receives update reports now |
| receives notice of paths that have different depths from their |
| parent, and of course the overall update operation has a global |
| depth, which applies whenever not shadowed by some local depth for |
| a given path. |
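| |
| The shadowing rule can be stated in a few lines. This is only a |
| restatement of the sentence above, not the logic in reporter.c: |
| effective_depth and the dict of reported depths are hypothetical, |
| and the lookup is by exact path, with no inheritance from |
| ancestors. |
| |

```python
def effective_depth(path, reported_depths, global_depth):
    """Depth the server applies to path: the locally reported depth
    if the client sent one for that path, else the global depth."""
    return reported_depths.get(path, global_depth)


reported = {"B": "infinity", "D": "immediates"}
```

| |
| With a global depth of "empty", effective_depth("D", reported, |
| "empty") is "immediates", while an unreported path such as "C" |
| falls back to "empty". |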
| |
| The RA code on both sides knows how to send and receive depths; the |
| relevant svn_ra_* APIs now take depth arguments, which sometimes |
| supersede older 'recurse' booleans. In these cases, the RA layer |
| does the usual compatibility dance: receiving "recurse=FALSE" from |
| an older client causes the server to behave as if "depth=files" |
| had been transmitted. |
| |
| Work remaining: |
| |
| The list of outstanding issues is shown by this issue tracker query |
| of Summary prefixed with [sparse-directories]: |
| |
| <http://subversion.tigris.org/issues/buglist.cgi?component=subversion&issue_status=NEW&issue_status=STARTED&issue_status=REOPENED&short_desc=%5Bsparse-directories%5D&short_desc_type=casesubstring> |
| |