| A Streamlined HTTP Protocol for Subversion |
| |
| GOAL |
| ==== |
| |
| Write a new HTTP protocol for svn -- one which is entirely proprietary |
| and designed for speed and comprehensibility. |
| |
| |
| PURPOSE / HISTORY |
| ================= |
| |
| Subversion standardized on Apache and the WebDAV/DeltaV protocol as a |
| back in the earliest days of development, based on some very strong |
| value propositions: |
| |
| A. Able to go through corporate firewalls |
| B. Zillions of authn/authz options via Apache |
| C. Standardized encryption (SSL) |
| D. Excellent logging |
| E. Built-in repository browsing |
| F. Caching within intermediate proxies |
| G. Interoperability with other WebDAV clients |
| |
| Unfortunately, DeltaV is an insanely complex and inefficient protocol, |
| and doesn't fit Subversion's model well at all. The result is that |
| Subversion speaks a "limited portion" of DeltaV, and pays a huge |
| performance price for this complexity. |
| |
| |
| REQUIREMENTS |
| ============ |
| |
| Write a new HTTP protocol for svn ("HTTP v2"). Map RA requests |
| directly to HTTP requests. |
| |
| * svn over HTTP should be much faster (eliminate extra turnarounds) |
| |
| * svn over HTTP should be almost as easy to extend as svnserve. |
| |
| * svn over HTTP should be comprehensible to devs and users both |
| (require no knowledge of DeltaV concepts). |
| |
| * svn over HTTP should be designed for optimum cacheability by web |
| proxies. |
| |
| * svn over HTTP should make use of pipelined and parallel requests |
| when possible. |
| |
| |
| |
| Our Plans, in a Nutshell |
| ======================== |
| |
| * Phase 1: Remove all DeltaV mechanics & formalities |
| |
| - get rid of all the PROPFIND 'discovery' turnarounds. |
| - stop doing CHECKOUT requests before each PUT |
| - publish a public URI syntax for browsing historical objects |
| |
| * Phase 2: Speed up commits |
| |
| - Make PUT requests pipelined, the way ra_svn does. |
| |
| * Phase 3: (maybe) get rid of XML in request/response bodies |
| |
| - if there's a worthwhile speed gain, us serialzed Thrift objects. |
| |
| |
| |
| Phase 1 in Detail |
| ================= |
| |
| At the moment, ra_serf has to 'discover' and manipulate the following |
| DeltaV objects: |
| |
| - Version Controlled Resource (VCC) : !svn/vcc |
| - Baseline resource: !svn/bln |
| - Working baseline resource: !svn/wbl |
| - Baseline collection resource: !svn/bc/REV/ |
| - Activity collection: !svn/act/activityUUID/ |
| - Versioned resource: !svn/ver/REV/path |
| - Working resource: !svn/wrk/activityUUID/path |
| |
| All of these objects will be deprecated and no longer used. |
| mod_dav_svn will still support older clients, of course, but new |
| clients will be able to automatically construct all of the URIs they |
| need. |
| |
| |
| * Opening an RA session: |
| |
| ra_serf will send an OPTIONS request when creating a new |
| ra_session. mod_dav_svn will send back what it already sends now, |
| but will also return new information: |
| |
| youngest revision: number |
| "root stub": !svn/me |
| "pegrev stub": !svn/bc |
| "revision stub": !svn/rev |
| |
| The presence of these new stubs tells ra_serf that this is a new |
| server, and that the new streamlined HTTP protocol can be used. |
| ra_serf then caches them in the ra_session object. If these new |
| OPTIONS responses are not returned, ra_serf falls back to 'classic' |
| DeltaV protocol. |
| |
| |
| * What the new stubs are used for: |
| |
| - root stub: represents the "repository itself". This is the URI |
| that custom REPORTS are sent against. |
| |
| Note: this eliminates our need for the VCC resource. |
| |
| - pegrev stub: an opaque string to append to, whenever the client |
| wants to refer to a (pegrev, path) in the repository. |
| Specifically, /REV/PATH are appended, e.g. |
| |
| GET !svn/bc/2398/trunk/foo.c |
| |
| Note: that this syntax is already the one mod_dav_svn understands; |
| what's changing here is that we no longer need to do a bunch of |
| PROPFINDs to discover it -- we get the stub right up front when |
| the session is opened. |
| |
| - revision stub: represents an opaque string to append to, whenever |
| the client wants to access a revision's revprops (either reading |
| or writing). Specifically, /REV is appended, e.g. |
| |
| PROPFIND !svn/rev/2398 |
| |
| Standard PROPFIND and PROPATCH requests can be used against the |
| constructed URI, with the understanding that the name/value pairs |
| being accessed are unversioned revision props, rather than file |
| or directory props. |
| |
| Note: this eliminates our need for baseline (bln) or working |
| baseline (wbl) resources. |
| |
| |
| * Simple read requests |
| |
| These RA functions each send single request/response, either GET or |
| PROPFIND. |
| |
| The only changes here is that we no longer need to "discover" |
| pegrev or revision URIs with extra turnarounds; instead we construct |
| them directly. |
| |
| get-latest-rev -> already present in ra_session (via OPTIONS) |
| |
| get-file -> GET (against a pegrev URI) |
| |
| get-dir -> PROPFIND depth 1 (against a pegrev URI) |
| |
| rev-prop -> PROPFIND (against a revision URI) |
| |
| rev-proplist -> PROPFIND (against a revision URI, but recursive) |
| |
| check-path -> PROPFIND (against a pegrev URI) |
| |
| stat -> PROPFIND (against a pegrev URI) |
| |
| get-lock -> PROPFIND (against a public HEAD URI) |
| |
| |
| * Complex read requests |
| |
| These RA functions are each accomplished in a single REPORT |
| request/response. |
| |
| These REPORTs are not changing, except that they'll be sent against |
| the root stub URI (!svn/me) rather than a VCC URI. Again, we're |
| eliminating all "discovery" turnarounds which used to preceed these |
| requests. |
| |
| log -> REPORT (against root stub) |
| |
| get-dated-rev -> REPORT (against root stub) |
| |
| get-locations -> REPORT (against root stub) |
| |
| get-locations-segments -> REPORT (against root stub) |
| |
| get-file-revs -> REPORT (against root stub) |
| |
| get-locks -> REPORT (against root stub) |
| |
| get-mergeinfo -> REPORT (against root stub) |
| |
| replay -> REPORT (against root stub) |
| |
| replay-range -> pipelined REPORT requests (against root stub) |
| on each revision in the range |
| |
| |
| * The "update" family of requests |
| |
| update |
| switch |
| status |
| diff |
| |
| For these RA functions, the existing ra_serf strategy stays the same: |
| |
| 1. Client sends custom REPORT describing state of working copy; |
| it does *not* request text-deltas in response (the way ra_neon does). |
| |
| 2. Server responds with a 'skeletal' editor-drive. |
| |
| 3. Client pipelines bunches of GET and PROPFIND requests. |
| |
| |
| The only changes we plan to make: |
| |
| - the REPORT happens against the new 'root stub', rather than a |
| discovered VCC URI. |
| |
| - no need to cache the !svn/ver "wcprops" in the working copy |
| anymore, since our commit process has changed (see below). |
| |
| - no need to do any PROPFIND discovery of pegrev objects to fetch; |
| client can construct them at will using the 'pegrev stub' it |
| received when the ra_session began. |
| |
| |
| * Simple write requests |
| |
| change-rev-prop -> PROPPATCH (against a revision URI) |
| |
| lock -> LOCK (against a public HEAD URI) |
| |
| unlock -> UNLOCK (against a public HEAD URI) |
| |
| |
| * Commit process |
| |
| This will change significantly. The current methodology looks like: |
| |
| OPTIONS to start ra_session |
| PROPFINDs to discover various opaque URIs |
| MKACTIVITY to create a transaction |
| for each changed object: |
| CHECKOUT object to get working resource |
| {PUT, PROPPATCH, DELETE, COPY} working resource |
| MKCOL to create new directories |
| MERGE to commit the transaction |
| |
| The new sequence looks like: |
| |
| OPTIONS to start ra_session |
| POST against root stub, to create a transaction |
| for each changed object: |
| {PUT, PROPPATCH, DELETE, COPY, MKCOL} against transaction resources |
| MERGE to commit the transaction |
| |
| Specific new changes: |
| |
| - The new POST request replaces the MKACTIVITY request. |
| |
| - no more need to "discover" the activity URI; !svn/act/ is gone. |
| - client no longer creates an activity UUID itself. |
| - instead, POST returns two new stubs: |
| |
| "transaction stub": !svn/txn/TXN_UUID |
| "transaction prop stub": !svn/txp/TXN_UUID |
| |
| - transaction stub: an opaque URI which contains the svn |
| transaction's actual UUID (generated by libsvn_fs). Client |
| can then append paths to the stub to refer to any file or |
| directory within the transaction, e.g. |
| |
| PUT !svn/txn/TXN_UUID/trunk/foo.c |
| |
| - transaction prop stub: a opaque URI representing unversioned |
| props on a transaction Client can use this URI to read or |
| modify unversioned transaction properties (such as |
| 'svn:log'), e.g. |
| |
| PROPPATCH !svn/txp/TXN_UUID |
| |
| - Once the commit transaction is created, the client is free to |
| send write requests against transaction resources it constructs itself. |
| |
| Note: this eliminates the CHECKOUT requests, and also removes |
| our need to use versioned resources (!svn/ver) or working |
| resources (!svn/wrk). |
| |
| - When modifying transaction resources, clients should send |
| 'If-match:' headers to facilitate server-side out-of-dateness |
| checks. (TODO: value of header is probably an etag?) |
| |