 eZ component: Webdav, RFC overview, 1.1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 :Author: Tobias Schlitt
 :Revision: $Rev$
 :Date: $Date$
 :Status: Draft

 .. contents::

 Scope
 ~~~~~

 This document summarizes the major points of `RFC 2518`_ (WebDAV) and the
 associated `RFC 2616`_ (HTTP/1.1) with respect to distributed editing. The
 main points here are the support of entity tags and locking. This document
 only summarizes the RFCs and contains small design-related hints here and
 there. The main design for the functionality analyzed here is contained in
 the design-1.1.txt file.

 .. _`RFC 2518`: http://www.ietf.org/rfc/rfc2518.txt
 .. _`RFC 2616`: http://www.ietf.org/rfc/rfc2616.txt

 Entity Tags
 ~~~~~~~~~~~

 Entity tags are generally used in the HTTP/1.1 protocol to provide a mechanism
 for validating that a resource is still in the same state. Whenever the state
 of a resource changes, its entity tag needs to change. The following sections
 describe the definition of an entity tag, the headers that carry entity tags
 (including the ETag header) and the HTTP/1.1 validation mechanisms in
 general. In addition, the usage of entity tags in the WebDAV RFC is described.

 Section 3.11 of the HTTP/1.1 RFC describes entity tags. These strings identify
 (tag) the state of a resource, named "entities" in the RFC. The entity tag
 consists of a quoted string and an optional weakness modifier. The quoted
 string must be unique for each state. ::

       entity-tag = [ weak ] opaque-tag
       weak       = "W/"
       opaque-tag = quoted-string

 A non-weak entity tag must identify a certain state uniquely. With the added W/
 prefix, one and the same tag may identify different states of a resource that
 are semantically equivalent.
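
 As a minimal sketch of this grammar, the following PHP function splits an
 entity tag into its weakness flag and its opaque value. The function name
 parseEntityTag() is hypothetical and not part of the Webdav component, and the
 pattern assumes that the quoted string contains no escaped quotes. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // Returns null if $value is not a syntactically valid entity tag.
     function parseEntityTag( $value )
     {
         // Simplification: escaped quotes inside the string are rejected.
         if ( !preg_match( '~^(W/)?"([^"]*)"$~', trim( $value ), $matches ) )
         {
             return null;
         }
         return array(
             'weak'   => ( $matches[1] === 'W/' ),
             'opaque' => $matches[2],
         );
     }

     var_dump( parseEntityTag( 'W/"xyzzy"' ) ); // weak tag, opaque value xyzzy
     var_dump( parseEntityTag( 'xyzzy' ) );     // NULL, the quotes are missing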

 Entity tags are used in HTTP/1.1 in combination with the following headers:

 - Request

   - If-Match
   - If-None-Match
   - If-Range

 - Response

   - ETag

 Since the Webdav component will generate the entity tags, we should ensure
 that it only generates strong entity tags.

 Headers
 =======

 The following section describes the headers that are affected by entity tags
 and how the server should respect them.

 If-Match
 --------

 The If-Match header is generally used to make the method it is sent with
 conditional. The server should perform the action associated with the method
 only if the conditions defined by the header are met.

 The header format is defined as follows (14.24): ::

        If-Match = "If-Match" ":" ( "*" | 1#entity-tag )

 The If-Match header generally assumes that the affected resource exists. If it
 does not, the request must fail, since there is no entity to compare the given
 criteria to. The header either specifies "*", to indicate that an entity must
 exist, whichever that is, or a list of one or more entity tags, separated by
 ",". If one of the given tags matches the current state of the resource, the
 method is performed as if no If-Match header was given. Otherwise the method
 must fail with 412 (Precondition Failed).

 In case the request would have failed anyway (that is, it would result in
 neither a 2xx nor a 412 status), the If-Match condition is not even checked;
 the error response generated by the request is returned instead. The reaction
 of a server to a combination of multiple If-* headers is undefined.

 .. Note::
    We should just throw all If-* headers away if a combination of them occurs,
    so the back-end does not need to deal with it.

 Examples: ::

        If-Match: "xyzzy"
        If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
        If-Match: *
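
 The If-Match rules described above can be sketched as follows. This is only an
 illustration under stated assumptions: the function name ifMatchPasses() is
 hypothetical, the server is assumed to generate strong entity tags only, and
 the header is split naively at commas. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $currentETag is null if the resource does not exist.
     // Returns true if the method may be performed, false if the server
     // should answer with 412 (Precondition Failed).
     function ifMatchPasses( $headerValue, $currentETag )
     {
         if ( $currentETag === null )
         {
             return false; // No entity exists, so nothing can match.
         }
         if ( trim( $headerValue ) === '*' )
         {
             return true; // Any existing entity satisfies "*".
         }
         // Naive split: assumes that the tags themselves contain no commas.
         $tags = array_map( 'trim', explode( ',', $headerValue ) );
         // Strict string comparison, so a W/ prefixed tag never matches.
         return in_array( $currentETag, $tags, true );
     }

     var_dump( ifMatchPasses( '"xyzzy", "r2d2xxxx"', '"r2d2xxxx"' ) ); // bool(true)
     var_dump( ifMatchPasses( '*', null ) );                           // bool(false)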

 If-None-Match
 -------------

 This header behaves similarly to the If-Match header, except that the
 operation is only performed if none of the submitted entity tags matches. If
 the operation is a GET or a HEAD operation and one of the tags matches, the
 server should return a 304 (Not Modified) code. In all other cases a 412
 (Precondition Failed) must be returned.

 The header is defined like this: ::

        If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )

 The '*' again matches any existing entity (as for If-Match). For the
 If-None-Match header this means that the operation is executed only if no
 entity of the resource exists, that is, if the resource itself does not exist.

 Examples: ::

        If-None-Match: "xyzzy"
        If-None-Match: W/"xyzzy"
        If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
        If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz"
        If-None-Match: *
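
 A minimal sketch of the If-None-Match handling described above, again under
 assumptions: ifNoneMatchStatus() is a hypothetical name, the current entity
 tag is assumed to be strong, and weak tags sent by the client are only
 compared for GET and HEAD requests. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // Returns null if the method may be performed, otherwise the status
     // code to respond with. $currentETag is null if no resource exists.
     function ifNoneMatchStatus( $headerValue, $currentETag, $method )
     {
         if ( $currentETag === null )
         {
             return null; // No entity exists, so nothing can match.
         }
         $isRead  = in_array( $method, array( 'GET', 'HEAD' ), true );
         $matched = ( trim( $headerValue ) === '*' );
         // Naive split: assumes that the tags themselves contain no commas.
         foreach ( explode( ',', $headerValue ) as $tag )
         {
             $tag = trim( $tag );
             // Weak tags are only compared for GET and HEAD requests.
             if ( $isRead && strpos( $tag, 'W/' ) === 0 )
             {
                 $tag = substr( $tag, 2 );
             }
             if ( $tag === $currentETag )
             {
                 $matched = true;
             }
         }
         if ( !$matched )
         {
             return null;
         }
         return $isRead ? 304 : 412;
     }

     var_dump( ifNoneMatchStatus( 'W/"xyzzy"', '"xyzzy"', 'GET' ) ); // int(304)
     var_dump( ifNoneMatchStatus( '*', '"xyzzy"', 'PUT' ) );         // int(412)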

 If-Range
 --------

 Respecting the If-Range header only makes sense if the server supports partial
 GET requests and resuming them. Since the Webdav component does not support
 this yet, the header will be ignored.

 .. Note::
    If we support partial GET at some point, support for this header must be
    considered, too.

 ETag
 ----

 The ETag header is sent by a server to give the client an entity tag that
 identifies the current state of a certain resource. This tag can later be used
 by the client with any of the headers described above. The ETag header is
 therefore a response header, while the headers described above are request
 headers.

 The ETag header is built up like this: ::

       ETag = "ETag" ":" entity-tag

 Examples::

       ETag: "xyzzy"
       ETag: W/"xyzzy"
       ETag: ""

 .. Attention::
    The following is an assumption that should be verified somehow. It seems that
    the ETag header is only defined for a single resource, which indicates that
    responses that affect multiple resources should not contain it.  For Webdav
    this includes several methods like COPY and MOVE, but also PROPFIND with a
    Depth header other than 0.

 Validation
 ==========

 The purpose of entity tags is to ensure that a certain operation is only
 performed if a resource is in a certain state. This state is defined by an
 entity tag. For pure HTTP/1.1, this most likely applies to the GET operation
 in combination with caching. However, in combination with the WebDAV
 extension, validation of entity tags might also be necessary for operations
 like PUT and others.

 The HTTP/1.1 RFC defines 2 different validator schemes: strong and weak
 validators. Entity tags are generally considered strong validators, since they
 should change as soon as the affected resource changes its state. However, the
 protocol provides a way to declare entity tags to be weak validators, using
 the W/ prefix shown in the ETag header examples above.

 .. Note::
    We should not make use of this way of weakening entity tags, but always
    provide strong entity tags.

 Caches (like proxy servers and browser caches) use additional methods to
 validate their content. The most common way here is to use the Last-Modified
 header in addition, which indicates the last modification time of the resource.


 .. Note::
    We need to decide if we should support this validator in addition. This
    would also involve more headers to react on, like If-Modified-Since. This is
    not mandatory.


 Locking
 ~~~~~~~

 This section summarizes the important facts about locking that RFC 2518
 mentions in various places, supplemented by first considerations of the design
 and implementation issues associated with them.

 Lock types
 ==========

 The WebDAV RFC distinguishes between read and write locks, but only write
 locks are defined in detail. The RFC explains that "the syntax is extensible,
 and permits the eventual specification of locking for other access types".

 A write lock determines that only the locking principal may write to the
 affected resources. Reading is still possible for every other principal. If a
 principal that does not hold a specific lock tries to perform a write
 operation on a resource which is locked by another principal, this operation
 must fail.

 Lock scopes
 ===========

 The WebDAV RFC specifies 2 different scopes for locking:

 - Exclusive
 - Shared

 For an exclusive lock exactly 1 principal may hold a lock on a specific
 resource and only this principal is allowed to perform the affected operations
 on the locked resource. For a shared lock it is possible that multiple
 principals take part in the lock (group editing). Every principal that takes
 part in a shared lock may perform the affected operations on the locked
 resource.

 WebDAV does not provide any channel to allow communication between the
 principals involved in a shared lock. The communication between these
 principals must be handled externally.

 The following table shows lock compatibility:

 +----------------------+-----------------+--------------+
 | Current lock state/  |   Shared Lock   |   Exclusive  |
 | Lock request         |                 |   Lock       |
 +----------------------+-----------------+--------------+
 | None                 |   True          |   True       |
 +----------------------+-----------------+--------------+
 | Shared Lock          |   True          |   False      |
 +----------------------+-----------------+--------------+
 | Exclusive Lock       |   False         |   False*     |
 +----------------------+-----------------+--------------+

 *Legend*: True = lock may be granted.  False = lock MUST NOT be granted. \*=It is
 illegal for a principal to request the same lock twice.
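
 The compatibility table can be expressed as a small decision function. This is
 a sketch only: the function name lockMayBeGranted() and the scope strings
 'shared' and 'exclusive' are assumptions made for illustration. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $existingScopes lists the scopes of all locks currently held on the
     // resource, for example array( 'shared', 'shared' ).
     function lockMayBeGranted( array $existingScopes, $requestedScope )
     {
         if ( count( $existingScopes ) === 0 )
         {
             return true; // Row "None": both scopes may be granted.
         }
         if ( in_array( 'exclusive', $existingScopes, true ) )
         {
             return false; // Row "Exclusive Lock": nothing may be granted.
         }
         // Row "Shared Lock": only further shared locks may be granted.
         return $requestedScope === 'shared';
     }

     var_dump( lockMayBeGranted( array( 'shared' ), 'shared' ) );    // bool(true)
     var_dump( lockMayBeGranted( array( 'shared' ), 'exclusive' ) ); // bool(false)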

 Lock tokens
 ===========

 A lock token identifies a specific lock uniquely across all resources for all
 time. Whenever a LOCK request is processed successfully, the response contains
 the lock token for this lock. The lock token associates the locking principal
 with the locked resource. Therefore multiple lock tokens might be assigned to a
 single resource, if the lock is a shared lock.

 Because of this uniqueness requirement, the WebDAV RFC defines a lock token
 scheme, which can optionally be used by the server. The so-called
 "opaquelocktoken" scheme makes use of UUID_, as defined in `ISO-11578`_. A
 `PECL package for UUIDs`_ is available.

 .. _`UUID`: http://en.wikipedia.org/wiki/UUID
 .. _`ISO-11578`: http://www.iso.ch/cate/d2229.html
 .. _`PECL package for UUIDs`: http://pecl.php.net/package/uuid

 Since the opaquelocktoken scheme is not mandatory, the code snippet ::

     $token = md5( uniqid( rand(), true ) );

 could be used as an alternative to provide the necessary amount of uniqueness.
 To create a lock token, an ID generated this way could be appended to the URI
 of the affected resource, which makes the source of a lock token transparent.
 For example: http://webdav/foo/bar.txt#<id>.
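
 If the opaquelocktoken scheme is preferred and the PECL package is not
 available, a token could be built in plain PHP. The following is a sketch
 under assumptions: generateOpaqueLockToken() is a hypothetical name, the code
 needs random_bytes() (PHP 7 or later), and it produces a random (version 4)
 UUID. Whether a random UUID satisfies the uniqueness guarantees the RFC asks
 for would still need to be checked. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     function generateOpaqueLockToken()
     {
         $bytes = random_bytes( 16 );
         // Mark the value as a random (version 4) UUID, variant 10xx.
         $bytes[6] = chr( ( ord( $bytes[6] ) & 0x0f ) | 0x40 );
         $bytes[8] = chr( ( ord( $bytes[8] ) & 0x3f ) | 0x80 );
         $hex = bin2hex( $bytes );
         return sprintf(
             'opaquelocktoken:%s-%s-%s-%s-%s',
             substr( $hex, 0, 8 ),
             substr( $hex, 8, 4 ),
             substr( $hex, 12, 4 ),
             substr( $hex, 16, 4 ),
             substr( $hex, 20, 12 )
         );
     }

     echo generateOpaqueLockToken(), "\n";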

 Every principal that can access the WebDAV server has access to lock tokens
 through the lockdiscovery property (retrieved via PROPFIND), so the lock must
 be bound to a separate authentication mechanism. An owner string is submitted
 with the LOCK request, which might help here.

 Affected requests
 =================

 Locks affect several requests besides the explicitly lock-related ones. The
 following sections summarize the affected request methods and give a short
 overview of how they are affected.

 LOCK
 ----

 The LOCK method is a new method which needs to be supported. The request body
 of the LOCK method contains a dedicated XML element. Both the request
 abstraction object and the objects for the content already exist in the Webdav
 component.

 The method supports the Depth header, but only with the values 0 and INFINITY.
 The value 1 is not supported. 0 means that only the resource itself is
 affected, while INFINITY includes all descendant resources. For non-collection
 resources both mean the same. For collection resources, 0 means that only the
 collection is locked, while INFINITY recursively locks all descendants of the
 collection in addition to the collection itself. If no Depth header is given,
 INFINITY is assumed.
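
 The Depth rules for LOCK can be sketched as follows. The function name
 parseLockDepth() and its return values (0, the string 'infinity', or null for
 an unsupported value) are assumptions made for illustration. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $headerValue is null if the request carries no Depth header.
     function parseLockDepth( $headerValue )
     {
         if ( $headerValue === null )
         {
             return 'infinity'; // No Depth header: INFINITY is assumed.
         }
         $value = strtolower( trim( $headerValue ) );
         if ( $value === '0' )
         {
             return 0;
         }
         if ( $value === 'infinity' )
         {
             return 'infinity';
         }
         return null; // For example "1", which LOCK does not support.
     }

     var_dump( parseLockDepth( null ) ); // string(8) "infinity"
     var_dump( parseLockDepth( '1' ) );  // NULL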

 A LOCK method must only return a single lock token for all resources locked
 with this request. If an UNLOCK method is successfully executed with this lock
 token, all affected resources must be unlocked.

 If a LOCK operation fails because there is a conflict with one of the
 resources to lock, the complete operation needs to fail (no partial success).
 The response code 409 (Conflict) must be returned, and the body must be a
 multistatus XML element that contains the resource that is responsible for
 the conflict.

 Status codes returned by the LOCK method are:

 200 (OK) - The lock request succeeded and the value of the lockdiscovery
 property is included in the body.

 412 (Precondition Failed) - The included lock token was not enforceable on this
 resource or the server could not satisfy the request in the lockinfo XML
 element.

 423 (Locked) - The resource is locked, so the method has been rejected.

 Example - Simple LOCK request: ::

    >>Request

    LOCK /workspace/webdav/proposal.doc HTTP/1.1
    Host: webdav.sb.aol.com
    Timeout: Infinite, Second-4100000000
    Content-Type: text/xml; charset="utf-8"
    Content-Length: xxxx
    Authorization: Digest username="ejw",
       realm="ejw@webdav.sb.aol.com", nonce="...",
       uri="/workspace/webdav/proposal.doc",
       response="...", opaque="..."

    <?xml version="1.0" encoding="utf-8" ?>
    <D:lockinfo xmlns:D='DAV:'>
      <D:lockscope><D:exclusive/></D:lockscope>
      <D:locktype><D:write/></D:locktype>
      <D:owner>
           <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href>
      </D:owner>
    </D:lockinfo>

    >>Response

    HTTP/1.1 200 OK
    Content-Type: text/xml; charset="utf-8"
    Content-Length: xxxx

    <?xml version="1.0" encoding="utf-8" ?>
    <D:prop xmlns:D="DAV:">
      <D:lockdiscovery>
           <D:activelock>
                <D:locktype><D:write/></D:locktype>
                <D:lockscope><D:exclusive/></D:lockscope>
                <D:depth>Infinity</D:depth>
                <D:owner>
                     <D:href>
                          http://www.ics.uci.edu/~ejw/contact.html
                     </D:href>
                </D:owner>
                <D:timeout>Second-604800</D:timeout>
                <D:locktoken>
                     <D:href>
                opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4
                     </D:href>
                </D:locktoken>
           </D:activelock>
      </D:lockdiscovery>
    </D:prop>

 Example - Refreshing LOCK request: ::

    >>Request

    LOCK /workspace/webdav/proposal.doc HTTP/1.1
    Host: webdav.sb.aol.com
    Timeout: Infinite, Second-4100000000
    If: (<opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4>)
    Authorization: Digest username="ejw",
       realm="ejw@webdav.sb.aol.com", nonce="...",
       uri="/workspace/webdav/proposal.doc",
       response="...", opaque="..."

    >>Response

    HTTP/1.1 200 OK
    Content-Type: text/xml; charset="utf-8"
    Content-Length: xxxx

    <?xml version="1.0" encoding="utf-8" ?>
    <D:prop xmlns:D="DAV:">
      <D:lockdiscovery>
           <D:activelock>
                <D:locktype><D:write/></D:locktype>
                <D:lockscope><D:exclusive/></D:lockscope>
                <D:depth>Infinity</D:depth>
                <D:owner>
                     <D:href>
                     http://www.ics.uci.edu/~ejw/contact.html
                     </D:href>
                </D:owner>
                <D:timeout>Second-604800</D:timeout>
                <D:locktoken>
                     <D:href>
                opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4
                     </D:href>
                </D:locktoken>
           </D:activelock>
      </D:lockdiscovery>
    </D:prop>

 .. Note::
    The server is not required to honor the principal's Timeout header!

 Example - Multi resource LOCK Request: ::

    >>Request

    LOCK /webdav/ HTTP/1.1
    Host: webdav.sb.aol.com
    Timeout: Infinite, Second-4100000000
    Depth: infinity
    Content-Type: text/xml; charset="utf-8"
    Content-Length: xxxx
    Authorization: Digest username="ejw",
       realm="ejw@webdav.sb.aol.com", nonce="...",
       uri="/workspace/webdav/proposal.doc",
       response="...", opaque="..."

    <?xml version="1.0" encoding="utf-8" ?>
    <D:lockinfo xmlns:D="DAV:">
      <D:locktype><D:write/></D:locktype>
      <D:lockscope><D:exclusive/></D:lockscope>
      <D:owner>
           <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href>
      </D:owner>
    </D:lockinfo>

    >>Response

    HTTP/1.1 207 Multi-Status
    Content-Type: text/xml; charset="utf-8"
    Content-Length: xxxx

    <?xml version="1.0" encoding="utf-8" ?>
    <D:multistatus xmlns:D="DAV:">
      <D:response>
           <D:href>http://webdav.sb.aol.com/webdav/secret</D:href>
           <D:status>HTTP/1.1 403 Forbidden</D:status>
      </D:response>
      <D:response>
           <D:href>http://webdav.sb.aol.com/webdav/</D:href>
           <D:propstat>
                <D:prop><D:lockdiscovery/></D:prop>
                <D:status>HTTP/1.1 424 Failed Dependency</D:status>
           </D:propstat>
      </D:response>
    </D:multistatus>

 UNLOCK
 ------

 The UNLOCK method handles the removal of a lock established via the LOCK
 method. A lock may also disappear by itself, for example when a timeout is
 reached. If the principal holding a lock has finished its operation on the
 locked resources, it should use the UNLOCK method to release the lock.

 The UNLOCK method only receives a lock token via the Lock-Token header. The
 lock identified by this token is to be released.

 .. Note::
    Not only the resource identified by the resource URI must be unlocked, but
    all other resources that are locked by the given lock token!
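
 A minimal sketch of this rule, assuming a purely illustrative in-memory
 structure that maps resource paths to the lock tokens active on them. Neither
 the structure nor the function name releaseLock() is taken from the Webdav
 component. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $locks maps resource paths to the lock tokens active on them.
     // Returns the lock map with $token removed from every resource.
     function releaseLock( array $locks, $token )
     {
         foreach ( $locks as $path => $tokens )
         {
             $remaining = array_values( array_diff( $tokens, array( $token ) ) );
             if ( count( $remaining ) === 0 )
             {
                 unset( $locks[$path] ); // No lock is left on this resource.
             }
             else
             {
                 $locks[$path] = $remaining;
             }
         }
         return $locks;
     }

     $locks = array(
         '/webdav/'    => array( 'opaquelocktoken:t1' ),
         '/webdav/foo' => array( 'opaquelocktoken:t1', 'opaquelocktoken:t2' ),
     );
     // Only /webdav/foo stays locked, by the token t2.
     var_dump( releaseLock( $locks, 'opaquelocktoken:t1' ) );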

 Example - UNLOCK request: ::

    >>Request

    UNLOCK /workspace/webdav/info.doc HTTP/1.1
    Host: webdav.sb.aol.com
    Lock-Token: <opaquelocktoken:a515cfa4-5da4-22e1-f5b5-00a0451e6bf7>
    Authorization: Digest username="ejw",
       realm="ejw@webdav.sb.aol.com", nonce="...",
       uri="/workspace/webdav/proposal.doc",
       response="...", opaque="..."

    >>Response

    HTTP/1.1 204 No Content

 Affected base methods
 ---------------------

 The following methods may only be performed on a locked resource if the
 performing principal owns the specific lock.

 - PUT
 - POST
 - PROPPATCH
 - MOVE
 - COPY
 - DELETE
 - MKCOL

 In addition, the PROPFIND request is affected by lock support, since lock
 information is visible to every principal through the LOCKDISCOVERY and
 SUPPORTEDLOCK properties.

 Locking resources
 =================

 Both types of resources (non-collection and collection resources) can be
 locked. This section describes differences for both types and other points
 directly related to locking resources.

 Non-collection resources
 ------------------------

 A non-collection resource can be affected directly or indirectly by a lock. In
 the first case a principal has issued a LOCK request explicitly for this
 resource, only locking this single resource. The second case occurs if a
 principal locked a collection resource and the non-collection resource is a
 direct or indirect descendant of it. For detailed information on this topic
 see the next section about the locking of collection resources.

 Collections
 -----------

 The LOCK request allows the 'Depth' header to be set to specify the depth of
 the created lock. A depth value of ZERO means that only the affected
 collection itself is locked. This might be sensible for adding new resources
 to this collection. The depth value INFINITY means that the created lock
 recursively affects all descendants of the collection. This way it is possible
 to lock a complete sub-tree of the WebDAV repository.

 Any lock (no matter which depth) on a collection prevents the addition and
 removal of direct members of this collection by non-lock-owners. This affects
 the following methods:

 - PUT
 - POST
 - MKCOL
 - DELETE

 If a collection is to be locked and any of its members is already locked, this
 conflicts with the lock to be set and must result in a 423 (Locked) error.
 Members that are newly created inside a locked collection or copied/moved into
 it are automatically included in the lock. This applies to infinity-depth
 locks as well as to zero-depth ones for direct children of the locked
 collection.
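
 The conflict check could be sketched as a simple path prefix test. This
 assumes, for illustration only, that the back-end can provide a flat list of
 the currently locked paths; the function name findLockedMembers() is
 hypothetical. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // Returns all locked paths that are members of $collection. A non-empty
     // result means that the LOCK request has to be rejected.
     function findLockedMembers( array $lockedPaths, $collection )
     {
         $prefix    = rtrim( $collection, '/' ) . '/';
         $conflicts = array();
         foreach ( $lockedPaths as $path )
         {
             // Members share the collection path as a prefix.
             if ( $path !== $prefix && strpos( $path, $prefix ) === 0 )
             {
                 $conflicts[] = $path;
             }
         }
         return $conflicts;
     }

     $locked = array( '/webdav/secret', '/other/file.txt' );
     var_dump( findLockedMembers( $locked, '/webdav/' ) ); // only /webdav/secret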

 Lock null resources
 -------------------

 A write lock might be acquired on a resource that does not (yet) exist. This
 is called a "lock null resource". A lock null resource only supports the
 methods:

 - PUT
 - MKCOL
 - OPTIONS
 - PROPFIND
 - LOCK
 - UNLOCK

 All other methods must return 404 (Not Found) or 405 (Method Not Allowed).
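
 A sketch of this method check follows. The RFC leaves the choice between 404
 and 405 open, so the mapping used here (404 for GET and HEAD, 405 for all
 other methods) is an assumption, as is the function name
 lockNullResourceStatus(). ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // Returns null if a lock null resource supports $method, otherwise the
     // status code to respond with.
     function lockNullResourceStatus( $method )
     {
         $supported = array(
             'PUT', 'MKCOL', 'OPTIONS', 'PROPFIND', 'LOCK', 'UNLOCK',
         );
         if ( in_array( $method, $supported, true ) )
         {
             return null;
         }
         // Assumption: retrieval methods behave as if nothing exists.
         return in_array( $method, array( 'GET', 'HEAD' ), true ) ? 404 : 405;
     }

     var_dump( lockNullResourceStatus( 'PUT' ) ); // NULL
     var_dump( lockNullResourceStatus( 'GET' ) ); // int(404)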

 The properties of a null resource are mostly empty, except there must be
 LOCKDISCOVERY and SUPPORTEDLOCK properties. If PUT or MKCOL are issued, a null
 resource becomes a normal one. The RFC does not state if the lock stays on this
 newly created (real) resource or if it is removed.

 COPY/MOVE
 ---------

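 Neither method carries locks over to the destination: a COPY does not
 duplicate the locks of the source and a MOVE does not move them. If a
 resource is moved into a locked collection, it is automatically added to that
 lock (same principal assumed). Neither method works if the destination is
 locked and the principal does not own that lock.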

 Refresh
 =======

 The same lock must not be requested twice. To refresh a lock, principals send
 a LOCK request with an empty body and an If header that specifies the lock
 tokens of the locks to refresh. If this occurs, the timers of these locks must
 be reset. A Timeout header might be sent by the principal, but the server may
 safely ignore it and simply perform the refresh as it sees fit.
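
 A minimal sketch of the refresh handling, under assumptions: the function
 names, the array layout of a lock and the default timeout of 604800 seconds
 (the value used in the LOCK response examples above) are chosen for
 illustration only. ::

     <?php
     // Hypothetical helpers, not part of the Webdav component.

     // A LOCK request is a refresh if it has no body but an If header.
     function isLockRefresh( $body, $ifHeader )
     {
         return trim( (string) $body ) === '' && $ifHeader !== null;
     }

     // Resets the timer of a lock. The timeout is chosen by the server, a
     // Timeout header sent by the principal is deliberately ignored here.
     function refreshLock( array $lock, $now, $serverTimeout = 604800 )
     {
         $lock['expires'] = $now + $serverTimeout;
         return $lock;
     }

     $lock = array( 'token' => 'opaquelocktoken:t1', 'expires' => 0 );
     if ( isLockRefresh( '', '(<opaquelocktoken:t1>)' ) )
     {
         $lock = refreshLock( $lock, time() );
     }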

 If header
 ~~~~~~~~~

 In addition to the headers If-Match and If-None-Match, which are described in
 the HTTP/1.1 RFC, RFC 2518 (WebDAV) describes the If header. The If header is
 used by the client to define conditional actions, similar to the 2 headers
 named before. However, it is constructed in a much more complex and weird way.

 The If header is described with the following pseudo EBNF: ::

    If = "If" ":" ( 1*No-tag-list | 1*Tagged-list)
    No-tag-list = List
    Tagged-list = Resource 1*List
    Resource = Coded-URL
    List = "(" 1*(["Not"](State-token | "[" entity-tag "]")) ")"
    State-token = Coded-URL
    Coded-URL = "<" absoluteURI ">"

 It may contain entity tags (see `Entity Tags`_), lock tokens (see `Lock
 tokens`_) and combinations of both. In addition, the header may contain
 resource URIs, so that conditions can apply to resources other than the
 request URI. Luckily, the If header contains either tagged lists (including
 affected resource URIs) or no-tag lists (without resource URIs). It cannot
 contain a combination of both.
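
 To make the grammar more tangible, the following sketch parses the value of an
 If header into a nested array. It is not a validating parser: the function
 name parseIfHeader(), the result layout and the use of an empty string as key
 for no-tag lists are assumptions, and entity tags containing "]" are not
 handled. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // Returns array( resource URI => lists ), where each list is an array
     // of conditions with the keys 'not', 'type' and 'value'. No-tag lists
     // are stored under the key '' and refer to the request URI.
     function parseIfHeader( $value )
     {
         // Tokens: <coded URL>, [entity tag], "(", ")" and the keyword Not.
         preg_match_all( '~<[^>]*>|\[[^\]]*\]|\(|\)|Not~', $value, $matches );
         $result   = array();
         $resource = '';
         $list     = null;  // null while outside of "(" ... ")"
         $not      = false;
         foreach ( $matches[0] as $token )
         {
             if ( $token === '(' )
             {
                 $list = array();
             }
             elseif ( $token === ')' )
             {
                 $result[$resource][] = $list;
                 $list = null;
             }
             elseif ( $token === 'Not' )
             {
                 $not = true;
             }
             elseif ( $list === null )
             {
                 // A coded URL outside of a list tags the following lists.
                 $resource = substr( $token, 1, -1 );
             }
             else
             {
                 $list[] = array(
                     'not'   => $not,
                     'type'  => ( $token[0] === '<' ) ? 'token' : 'etag',
                     'value' => substr( $token, 1, -1 ),
                 );
                 $not = false; // "Not" only covers the next condition.
             }
         }
         return $result;
     }

     $parsed = parseIfHeader( '(Not <locktoken:write1> <locktoken:write2>)' );
     var_dump( $parsed[''][0][0]['not'] ); // bool(true)
     var_dump( $parsed[''][0][1]['not'] ); // bool(false)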

 Both kinds of lists (tagged and no-tag) contain an unlimited number of lock
 tokens and/or entity tags, each of which may be prefixed by the keyword "Not".
 The keyword negates the condition it prefixes: the condition is only met if
 the given lock token or entity tag does not match the current state of the
 resource. This works similarly to the If-None-Match header specified by the
 HTTP/1.1 RFC.

 To illustrate this complex definition further, some examples are presented and
 explained in the following sections.

 No-tag list
 ------------

 ::

    If: (<locktoken:a-write-lock-token> ["I am an ETag"]) (["I am another ETag"])

 This If header consists of 2 no-tag lists (it does not contain any resource
 URIs). The first list consists of a lock token and an entity tag, the second
 only contains an entity tag. The semantics of this example are that the method
 containing this If header may only be executed if

 - either the first combination of lock token and entity tag is matched
 - or if the second entity tag is matched.

 Note that the conditions inside a single list are combined with logical AND,
 while the lists themselves are combined with logical OR.
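
 The evaluation of this example can be sketched as follows. The function name
 ifListsMatch() is hypothetical, and the current state of the resource is
 assumed to be available as a flat array of its valid lock tokens and its
 entity tag. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $lists: the lists of the If header, each an array of lock tokens
     //         and entity tags. $state: the lock tokens and the entity tag
     //         that are currently valid for the resource.
     function ifListsMatch( array $lists, array $state )
     {
         foreach ( $lists as $list )
         {
             // AND: every condition of the list must be part of the state.
             if ( count( array_diff( $list, $state ) ) === 0 )
             {
                 return true; // OR: a single matching list is sufficient.
             }
         }
         return false;
     }

     $lists = array(
         array( 'locktoken:a-write-lock-token', '"I am an ETag"' ),
         array( '"I am another ETag"' ),
     );
     // The second list matches.
     var_dump( ifListsMatch( $lists, array( '"I am another ETag"' ) ) );
     // The lock token alone satisfies neither list.
     var_dump( ifListsMatch( $lists, array( 'locktoken:a-write-lock-token' ) ) );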

 Tagged list
 -----------

 ::

    COPY /resource1 HTTP/1.1
    Host: www.foo.bar
    Destination: http://www.foo.bar/resource2
    If: <http://www.foo.bar/resource1> (<locktoken:a-write-lock-token>
    [W/"A weak ETag"]) (["strong ETag"])
    <http://www.bar.bar/random>(["another strong ETag"])

 This example not only shows an If header with a tagged list, but also a
 context where this could make some sense: the COPY method affects several
 resources at once. It works at least on a source (the request URI) and a
 destination (see the Destination header). Additionally, it can affect whole
 sub-trees, using the Depth header.

 The If header in this case names 2 resources, but only the conditions for one
 of them are checked. The first tag is the request URI, followed by 2 lists
 that are combined with logical OR. The first list is an AND-combination of a
 lock token and an entity tag, the second contains only an entity tag. This is
 similar to the no-tag list example shown above, except that it contains the
 "tagging" URI. For the second tagged resource only a single entity tag is
 listed. However, this condition is not checked but simply ignored, since the
 tagged resource is not affected by the request.

 Not
 ---

 ::

    If: (Not <locktoken:write1> <locktoken:write2>)

 This simple If header shows a list that uses the "Not" keyword. As the grammar
 above shows, the keyword prefixes a single state token or entity tag and only
 negates the condition that immediately follows it. The list therefore contains
 2 conditions combined with logical AND: the resource must not be locked with
 the lock token write1, and it must be locked with the lock token write2. Only
 if both conditions hold is the request carrying this If header executed.
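
 Following the grammar, where "Not" prefixes a single state token or entity
 tag, the evaluation of one list with negated conditions can be sketched like
 this. The function name ifListMatches() and the array layout of a condition
 are assumptions made for illustration. ::

     <?php
     // Hypothetical helper, not part of the Webdav component.
     // $list:  conditions of one list, each an array with the keys 'not'
     //         (boolean) and 'value' (lock token or entity tag).
     // $state: the lock tokens and the entity tag that are currently valid
     //         for the resource.
     function ifListMatches( array $list, array $state )
     {
         foreach ( $list as $condition )
         {
             $isCurrent = in_array( $condition['value'], $state, true );
             // "Not" reverses the result of the condition it prefixes.
             if ( $isCurrent === $condition['not'] )
             {
                 return false; // AND: a single failed condition is enough.
             }
         }
         return true;
     }

     $list = array(
         array( 'not' => true,  'value' => 'locktoken:write1' ),
         array( 'not' => false, 'value' => 'locktoken:write2' ),
     );
     var_dump( ifListMatches( $list, array( 'locktoken:write2' ) ) ); // bool(true)
     var_dump( ifListMatches( $list, array( 'locktoken:write1' ) ) ); // bool(false)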


 ..
    Local Variables:
    mode: rst
    fill-column: 79
    End:
    vim: et syn=rst tw=79