
 @node Appendices
 @chapter Appendices

 A number of other useful documents relevant to Subversion.

 @menu
 * SVN for CVS users::
 * Directory versioning::
 * Compiling and installing::
 * Quick reference sheet::
 * FAQ::
 * Contributing::
 * License::
 @end menu

 @c ------------------------------------------------------------------
 @node SVN for CVS users
 @section SVN for CVS users

 This document is meant to be a quick-start guide for CVS users new to
 Subversion.  It's not a substitute for real documentation and manuals,
 but it should give you a quick conceptual ``diff'' when switching over.

 The goal of Subversion is to take over the current and future CVS user
 base.  Subversion not only includes new features, but attempts to fix
 certain ``broken'' behaviors that CVS had.  This means that you may be
 encouraged to break certain habits -- ones that you forgot were odd to
 begin with.

 @menu
 * Revision numbers are different now::
 * More disconnected operations::
 * Distinction between status and update::
 * Meta-data properties::
 * Directory versions::
 * Conflicts::
 * Binary files::
 * Authorization::
 * Versioned Modules::
 * Branches and tags::
 @end menu


 @node Revision numbers are different now
 @subsection Revision numbers are different now

 In CVS, revision numbers are per-file.  This is because CVS uses RCS
 as a backend; each file has a corresponding RCS file in the
 repository, and the repository is roughly laid out according to the
 structure of your project tree.

 In Subversion, the repository looks like a single filesystem.  Each
 commit results in an entirely new filesystem tree; in essence, the
 repository is an array of trees.  Each of these trees is labeled with
 a single revision number.  When someone talks about ``revision 54,''
 they're talking about a particular tree (and indirectly, the way the
 filesystem looked after the 54th commit).

 Technically, it's not valid to talk about ``revision 5 of @file{foo.c}''.
 Instead, one would say ``@file{foo.c} as it appears in revision 5.''
 Also, be careful when making assumptions about the evolution of a file.
 In CVS, revisions 5 and 6 of @file{foo.c} are always different.  In
 Subversion, it's most likely that @file{foo.c} did @emph{not} change between
 revisions 5 and 6.
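
 The ``array of trees'' idea can be illustrated with a small Python
 sketch.  This is only a toy model, not how Subversion stores data: the
 names @samp{commit} and @samp{file_in_revision} are hypothetical, and
 treating revision 0 as an empty tree is a modeling assumption.

```python
# Hypothetical model: a repository is a list of trees, one per commit.
# Each tree maps a path to file contents; the list index is the revision.

def commit(repository, changes):
    """Append a new tree: the previous tree plus the given changes."""
    new_tree = dict(repository[-1])   # start from a copy of the latest tree
    new_tree.update(changes)
    repository.append(new_tree)
    return len(repository) - 1        # the new revision number


def file_in_revision(repository, path, revision):
    """Return the file at path as it appears in the given revision (tree)."""
    return repository[revision][path]


repository = [dict()]                 # assumption: revision 0 is an empty tree
commit(repository, dict([("foo.c", "int x;"), ("bar.c", "v1")]))  # revision 1
for n in range(2, 7):                 # revisions 2 through 6 only touch bar.c
    commit(repository, dict([("bar.c", "v%d" % n)]))

same = (file_in_revision(repository, "foo.c", 5)
        == file_in_revision(repository, "foo.c", 6))
print("foo.c unchanged between revisions 5 and 6:", same)
```

 In this model every commit bumps the revision number of the whole
 tree, so @file{foo.c} ``as it appears in revision 6'' is identical to
 @file{foo.c} ``as it appears in revision 5'' even though the numbers differ.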

 @node More disconnected operations
 @subsection More disconnected operations

 In recent years, disk space has become outrageously cheap and
 abundant, but network bandwidth has not.  Therefore, the Subversion
 working copy has been optimized around the scarcer resource.

 The @file{.svn} administrative directory serves the same purpose as the
 @file{CVS} directory, except that it also stores ``pristine'' copies of files.
 This allows you to do many things off-line:

 @itemize @bullet
 @item @samp{svn status}
 shows you local modifications (see below)
 @item @samp{svn diff}
 shows you the details of your modifications
 @item @samp{svn ci}
 sends differences to the repository (CVS only sends fulltexts!)
 @item @samp{svn revert}
 removes your modifications
 @end itemize

 This last subcommand is new; it will not only remove local mods, but
 it will un-schedule operations such as adds and deletes.  It's the
 preferred way to revert a file; running @samp{rm file; svn up} will
 still work, but it blurs the purpose of updating.  And, while we're on
 this subject@dots{}


 @node Distinction between status and update
 @subsection Distinction between status and update

 In Subversion, we've tried to erase a lot of the confusion between the
 @samp{status} and @samp{update} subcommands.

 The @samp{status} command has two purposes: (1) to show the user any local
 modifications in the working copy, and (2) to show the user which
 files are out-of-date.  Unfortunately, because of CVS's hard-to-read
 output, many CVS users don't take advantage of this command at all.
 Instead, they've developed a habit of running @samp{cvs up} to quickly see
 their mods.  Of course, this has the side effect of merging repository
 changes that you may not be ready to deal with!

 With Subversion, we've tried to remove this muddle by making the
 output of @samp{svn status} easy to read for humans and parsers.  Also,
 @samp{svn update} only prints information about files that are updated,
 @emph{not} local modifications.

 Here's a quick guide to @samp{svn status}.  We encourage all new
 Subversion users to use it early and often:

 @itemize @bullet
 @item @samp{svn status}
 prints all files that have local modifications; the network is not
 accessed by default.
 @itemize @bullet
 @item @option{-u} switch
 add out-of-dateness information from repository
 @item @option{-v} switch
 show @emph{all} entries under version control
 @item @option{-n} switch
 nonrecursive
 @end itemize
 @end itemize

 The status command has two output formats.  In the default ``short''
 format, local modifications look like this:

 @example
     % svn status
     M     ./foo.c
     M     ./bar/baz.c
 @end example

 If you specify either the @option{-u} or @option{-v} switch, a ``long''
 format is used:

 @example
     % svn status -u
     M             1047    ./foo.c
     _      *      1045    ./faces.html
     _      *         -    ./bloo.png
     M             1050    ./bar/baz.c
     Head revision:   1066
 @end example

 In this case, two new columns appear.  The second column
 contains an asterisk if the file or directory is
 out-of-date.  The third column shows the working-copy's revision
 number of the item.  In the example above, the asterisk indicates that
 @file{faces.html} would be patched if we updated, and that
 @file{bloo.png} is a newly added file in the repository.  (The @samp{-} next
 to @file{bloo.png} means that it doesn't yet exist in the working copy.)
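
 Because the long format is meant to be easy for parsers, a script can
 pick the columns apart with very little code.  The following Python
 sketch is hedged: it assumes the column layout is exactly as in the
 sample above, that columns are separated by whitespace, and that paths
 contain no spaces.  The function name @samp{parse_long_status} is
 hypothetical.

```python
def parse_long_status(line):
    """Split one long-format status line into its columns.

    Assumes whitespace-separated columns and paths without spaces.
    """
    fields = line.split()
    code = fields[0]                      # first column: status code, or "_"
    out_of_date = fields[1] == "*"        # optional second column
    rest = fields[2:] if out_of_date else fields[1:]
    revision = None if rest[0] == "-" else int(rest[0])
    return dict(code=code, out_of_date=out_of_date,
                revision=revision, path=rest[1])


sample = [
    "M             1047    ./foo.c",
    "_      *      1045    ./faces.html",
    "_      *         -    ./bloo.png",
    "M             1050    ./bar/baz.c",
]
entries = [parse_long_status(line) for line in sample]
stale = [e["path"] for e in entries if e["out_of_date"]]
print("would change on update:", stale)
```

 A revision of @samp{None} marks an item that does not yet exist in the
 working copy.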

 Lastly, here's a quick summary of status codes that you may see:

 @example
    A    Add
    D    Delete
    R    Replace  (delete, then re-add)
    M    local Modification
    U    Updated
    G    merGed
    C    Conflict
 @end example

 Subversion has combined the CVS @samp{P} and @samp{U} codes into just
 @samp{U}.  When a merge or conflict occurs, Subversion simply prints
 @samp{G} or @samp{C}, rather than a whole sentence about it.


 @node Meta-data properties
 @subsection Meta-data properties

 A new feature of Subversion is that you can attach arbitrary metadata to
 files and directories.  We refer to this data as @dfn{properties}, and
 they can be thought of as collections of name/value pairs (hashtables)
 attached to each item in your working copy.

 To set or get a property, use the @samp{svn propset} and @samp{svn
 propget} subcommands.  To list all properties on an object, use
 @samp{svn proplist}.

 For more information, see @ref{Properties}.


 @node Directory versions
 @subsection Directory versions

 Subversion tracks tree structures, not just file contents.  It's one
 of the biggest reasons Subversion was written to replace CVS.

 Here's what this means to you:

 @itemize @bullet
 @item
 The @samp{svn add} and @samp{svn rm} commands work on directories now, just as
 they work on files.  So do @samp{svn cp} and @samp{svn mv}.  However, these
 commands do @emph{not} cause any kind of immediate change in the
 repository.  Instead, the working directory is recursively ``scheduled''
 for addition or deletion.  No repository changes happen until you
 commit.
 @item
 Directories aren't dumb containers anymore; they have revision
 numbers like files.  (Or more properly, it's correct to talk
 about ``directory @file{foo/} in revision 5''.)
 @end itemize

 Let's talk more about that last point.  Directory versioning is a Hard
 Problem.  Because we want to allow mixed-revision working copies,
 there are some limitations on how far we can abuse this model.

 From a theoretical point of view, we define ``revision 5 of directory
 @file{foo}'' to mean a specific collection of directory-entries and
 properties.  Now suppose we start adding and removing files from @file{foo},
 and then commit.  It would be a lie to say that we still have revision
 5 of @file{foo}.  However, if we bumped @file{foo}'s revision number after the
 commit, that would be a lie too; there may be other changes to @file{foo} we
 haven't yet received, because we haven't updated yet.

 Subversion deals with this problem by quietly tracking committed adds
 and deletes in the @file{.svn} area.  When you eventually run @samp{svn
 update}, all accounts are settled with the repository, and the directory's new
 revision number is set correctly.  @b{Therefore, only after an update is
 it truly safe to say that you have a ``perfect'' revision of a directory.}
 Most of the time, your working copy will contain ``imperfect'' directory
 revisions.

 Similarly, a problem arises if you attempt to commit property changes on
 a directory.  Normally, the commit would bump the working directory's
 local revision number.  But again, that would be a lie, because there
 may be adds or deletes that the directory doesn't yet have, because no
 update has happened.  @b{Therefore, you are not allowed to commit
 property-changes on a directory unless the directory is up-to-date.}

 For more specific examples and discussion: @xref{Directory versioning}.


 @node Conflicts
 @subsection Conflicts

 CVS marks conflicts with in-line ``conflict markers'', and prints a @samp{C}
 during an update.  Historically, this has caused problems.  Many users
 forget about (or don't see) the @samp{C} after it whizzes by on their
 terminal.  They often forget that the conflict-markers are even
 present, and then accidentally commit garbled files.

 Subversion solves this problem by making conflicts more tangible.
 Read about it: @xref{Basic Work Cycle}.  In particular, read the
 section about ``Merging others' changes''.


 @node Binary files
 @subsection Binary files

 CVS users have to mark binary files with @option{-kb} flags, to prevent data
 from being munged (due to keyword expansion and line-ending
 translations).  They sometimes forget to do this.

 Subversion examines the @samp{svn:mime-type} property to decide if a file
 is text or binary.  If the file has no @samp{svn:mime-type} property,
 Subversion assumes it is text.  If the file has the @samp{svn:mime-type}
 property set to anything other than @samp{text/*}, it assumes the file is
 binary.
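
 The rule above is small enough to state as code.  This Python sketch
 models a file's properties as a plain dictionary; the function name
 @samp{is_binary} is hypothetical and the code is an illustration of the
 rule, not Subversion's implementation.

```python
def is_binary(properties):
    """Apply the svn:mime-type rule described in the text.

    properties is a dictionary of property names to values.
    """
    mime_type = properties.get("svn:mime-type")
    if mime_type is None:
        return False                      # no property: assume text
    return not mime_type.startswith("text/")


print(is_binary(dict()))                                   # no property set
print(is_binary(dict([("svn:mime-type", "text/html")])))
print(is_binary(dict([("svn:mime-type", "image/png")])))
```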

 Subversion also helps users by running a binary-detection algorithm in
 the @samp{svn import} and @samp{svn add} subcommands.  These subcommands will
 make a good guess and then (possibly) set a binary @samp{svn:mime-type}
 property on the file being added.  (If Subversion guesses wrong, you
 can always remove or hand-edit the property.)

 As in CVS, binary files are not subject to keyword expansion or
 line-ending conversions.  Also, when a binary file is ``merged'' during
 update, no real merge occurs.  Instead, Subversion creates two files
 side-by-side in your working copy; the one containing your local
 modifications is renamed with an @file{.orig} extension.


 @node Authorization
 @subsection Authorization

 Unlike CVS, SVN can handle anonymous and authorized users in the same
 repository.  There is no need for an anonymous user or a separate
 repository.  If the SVN server requests authorization when committing,
 the client should prompt you for your credentials (typically a password).


 @node Versioned Modules
 @subsection Versioned Modules

 Unlike CVS, a Subversion working copy is aware that it has checked out
 a module.  That means that if somebody changes the definition of a
 module, then a call to @samp{svn up} will update the working copy
 appropriately.

 Subversion defines modules as a list of directories within a directory
 property.  @xref{Modules}.


 @node Branches and tags
 @subsection Branches and tags

 Subversion doesn't distinguish between filesystem space and ``branch''
 space; branches and tags are ordinary directories within the
 filesystem.  This is probably the single biggest mental hurdle a CVS
 user will need to climb.  Read all about it: @xref{Branches and Tags}.


 @c ------------------------------------------------------------------
 @node Directory versioning
 @section Directory versioning

 @quotation
 @emph{``The three cardinal virtues of a master technologist are:
 laziness, impatience, and hubris.'' -- Larry Wall}
 @end quotation

 This appendix describes some of the theoretical pitfalls around the
 (possibly arrogant) notion that one can simply version directories
 just as one versions files.

 @subsection Directory Revisions

 To begin, recall that the Subversion repository is an array of trees.
 Each tree represents the application of a new atomic commit, and is
 called a @dfn{revision}.  This is very different from a CVS repository,
 which stores file histories in a collection of RCS files (and doesn't
 track tree structure).

 So when we refer to ``revision 4 of @file{foo.c}'' (written @dfn{foo.c:4}) in
 CVS, this means the fourth distinct version of @file{foo.c} -- but in
 Subversion this means ``the version of @file{foo.c} in the fourth revision
 (tree)''.  It's quite possible that @file{foo.c} has never changed at all
 since revision 1!  In other words, in Subversion, different revision
 numbers of the same versioned item do @emph{not} imply different
 contents.

 Nevertheless, the contents of @file{foo.c:4} are still well-defined.  The
 file @file{foo.c} in revision 4 has a specific text and properties.

 Suppose, now, that we extend this concept to directories.  If we have a
 directory @file{DIR}, define @dfn{DIR:N} to be ``the directory @file{DIR} in
 the @var{N}th revision.''  The contents are defined to be a particular set of
 directory entries (@dfn{dirents}) and properties.

 So far, so good.  The concept of versioning directories seems fine in
 the repository -- the repository is very theoretically pure anyway.
 However, because working copies allow mixed revisions, it's easy to
 create problematic use-cases.


 @subsection The Lagging Directory

 @subsubsection Problem

 @c This is the first part of the ``Greg Hudson'' problem, so named
 @c because he was the first one to bring it up and define it well.  :-)

 Suppose our working copy has directory @samp{DIR:1} containing file
 @samp{foo:1}, along with some other files.  We remove @file{foo} and
 commit.

 Already, we have a problem: our working copy still claims to have
 @samp{DIR:1}.  But on the repository, revision 1 of @file{DIR} is
 @emph{defined} to contain @samp{foo} -- and our working copy @file{DIR} clearly
 does not have it anymore.  How can we truthfully say that we still have
 @samp{DIR:1}?

 One answer is to force @file{DIR} to be updated when we commit
 @file{foo}'s deletion.  Assuming that our commit created revision 2, we
 would immediately update our working copy to @samp{DIR:2}.  Then the
 client and server would both agree that @samp{DIR:2} does not contain
 @file{foo}, and that @samp{DIR:2} is indeed exactly what is in the working
 copy.

 This solution has nasty, un-user-friendly side effects, though.  It's
 likely that other people may have committed before us, possibly adding
 new properties to @file{DIR}, or adding a new file @file{bar}.  Now pretend our
 committed deletion creates revision 5 in the repository.  If we
 instantly update our local @file{DIR} to 5, that means unexpectedly receiving a
 copy of @file{bar} and some new propchanges.  This clearly violates a UI
 principle: ``the client will never change your working copy until you ask
 it to.''  Committing changes to the repository is a server-write
 operation only; it should @emph{not} modify your working data!

 Another solution is to do the naive thing: after committing the
 deletion of @file{foo}, simply stop tracking the file in the @file{.svn}
 administrative directory.  The client then loses all knowledge of the
 file.

 But this doesn't work either: if we now update our working copy, the
 communication between client and server is incorrect.  The client still
 believes that it has @samp{DIR:1} -- which is false, since a ``true''
 @samp{DIR:1} contains @file{foo}.  The client gives this incorrect
 report to the repository, and the repository decides that in order to
 update to revision 2, @file{foo} must be deleted.  Thus the repository
 sends a bogus (or at least unnecessary) deletion command.

 @subsubsection Solution

 After deleting @file{foo} and committing, the file is @emph{not}
 totally forgotten by the @file{.svn} directory.  While the file is no
 longer considered to be under revision control, it is still secretly
 remembered as having been `deleted'.

 When the user updates the working copy, the client correctly informs the
 server that the file is already missing from its local @samp{DIR:1};
 therefore the repository doesn't try to re-delete it when patching the
 client up to revision 2.
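
 The difference between the naive approach and the `deleted' flag can
 be shown with a toy model in Python.  Everything here is an assumption
 made for illustration: the entry states, the shape of the report, and
 the function names @samp{report_state} and @samp{deletions_to_send} are
 hypothetical and do not describe the real client/server protocol.

```python
def report_state(dir_revision, entries):
    """Client side: describe the working directory to the server.

    entries maps each name to a state; "deleted" marks a committed deletion.
    """
    missing = sorted(name for name, state in entries.items()
                     if state == "deleted")
    return dir_revision, missing


def deletions_to_send(trees, report, target_revision):
    """Server side: names that must be deleted to reach target_revision."""
    base_revision, missing = report
    client_names = set(trees[base_revision]) - set(missing)
    return sorted(client_names - set(trees[target_revision]))


# DIR:1 contains foo; the deletion of foo was committed as revision 2.
trees = dict([(1, ["foo", "other"]), (2, ["other"])])

# Naive client: it forgot foo entirely, so it claims a "true" DIR:1.
naive = report_state(1, dict(other="normal"))
# Tracking client: it still remembers foo, flagged as deleted.
tracking = report_state(1, dict(other="normal", foo="deleted"))

print("naive client is told to delete:", deletions_to_send(trees, naive, 2))
print("tracking client is told to delete:", deletions_to_send(trees, tracking, 2))
```

 In this model the naive client receives an unnecessary deletion of
 @file{foo}, while the tracking client receives none.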

 @c Notes, for coders, about how the `deleted' flag works under the hood:

 @c   * the @samp{svn status} command won't display a deleted item, unless
 @c     you make the deleted item the specific target of status.
 @c
 @c   * when a deleted item's parent is updated, one of two things will happen:
 @c
 @c       (1) the repository will re-add the item, thereby overwriting
 @c           the entire entry.  (no more `deleted' flag)
 @c
 @c       (2) the repository will say nothing about the item, which means
 @c           that it's fully aware that your item is gone, and this is
 @c           the correct state to be in.  In this case, the entire entry
 @c           is removed.  (no more `deleted' flag)
 @c
 @c   * if a user schedules an item for addition that has the same name
 @c     as a `deleted' entry, then entry will have both flags
 @c     simultaneously.  This is perfectly fine:
 @c
 @c         * the commit-crawler will notice both flags and do a delete()
 @c           and then an add().  This ensures that the transaction is
 @c           built correctly. (without the delete(), the add() would be
 @c           on top of an already-existing  item.)
 @c
 @c         * when the commit completes, the client rewrites the entry as
 @c           normal.  (no more `deleted' flag)


 @subsection The Overeager Directory

 @c This is the 2nd part of the ``Greg Hudson'' problem.

 @subsubsection Problem

 Again, suppose our working copy has directory @samp{DIR:1} containing
 file @samp{foo:1}, along with some other files.

 Now, unbeknownst to us, somebody else adds a new file @file{bar} to this
 directory, creating revision 2 (and @samp{DIR:2}).

 Now we add a property to @file{DIR} and commit, which creates revision
 3.  Our working-copy @file{DIR} is now marked as being at revision 3.

 Of course, this is false; our working copy does @emph{not} have
 @samp{DIR:3}, because the ``true'' @samp{DIR:3} on the repository contains
 the new file @file{bar}.  Our working copy has no knowledge of
 @file{bar} at all.

 Again, we can't follow our commit of @file{DIR} with an automatic update
 (and addition of @file{bar}).  As mentioned previously, commits are a
 one-way write operation; they must not change working copy data.


 @subsubsection Solution

 Let's enumerate exactly those times when a directory's local revision
 number changes:

 @itemize @bullet
 @item
 @b{when a directory is updated}: if the directory is either the direct
 target of an update command, or is a child of an updated directory, it
 will be bumped (along with many other siblings and children) to a
 uniform revision number.
 @item
 @b{when a directory is committed}: a directory can only be considered a
 ``committed object'' if it has a new property change.  (Otherwise, to
 ``commit a directory'' really implies that its modified children are being
 committed, and only such children will have local revisions bumped.)
 @end itemize

 In this light, it's clear that our ``overeager directory'' problem only
 happens in the second situation -- those times when we're committing
 directory propchanges.

 Thus the answer is simply not to allow property-commits on directories
 that are out-of-date.  It sounds a bit restrictive, but there's no other
 way to keep directory revisions accurate.
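
 As a rough illustration of this restriction, the following Python
 sketch refuses a directory property commit when the directory has
 changed in the repository since the working copy last saw it.  The
 function and exception names are hypothetical, and modeling
 ``out-of-date'' as a comparison of two revision numbers is a
 simplifying assumption, not a description of how the check is
 actually implemented.

```python
class OutOfDateError(Exception):
    """Raised when a directory propchange commit must be refused."""


def commit_directory_props(working_revision, last_changed_revision,
                           head_revision):
    """Return the new revision number, or refuse if the directory is stale.

    last_changed_revision is the newest revision in which the directory
    itself changed in the repository.
    """
    if last_changed_revision > working_revision:
        raise OutOfDateError("directory is out of date; run an update first")
    return head_revision + 1


# DIR is at revision 1 locally, but someone added bar in revision 2.
try:
    commit_directory_props(1, 2, 2)
    refused = False
except OutOfDateError:
    refused = True
print("commit refused:", refused)

# After an update, DIR is at revision 2 and the commit creates revision 3.
new_revision = commit_directory_props(2, 2, 2)
print("new revision:", new_revision)
```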

 @c  Note to developers: this restriction is enforced by the filesystem
 @c  merge() routine.

 @c  Once merge() has established that {ancestor, source, target} are all
 @c  different node-rev-ids, it examines the property-keys of ancestor
 @c  and target.  If they're *different*, it returns a conflict error.


 @subsection User impact

 Really, the Subversion client seems to have two difficult---almost
 contradictory---goals.

 First, it needs to make the user experience friendly, which generally
 means being a bit ``sloppy'' about deciding what a user can or cannot do.
 This is why it allows mixed-revision working copies, and why it tries to
 let users execute local tree-changing operations (delete, add, move,
 copy) in situations that aren't always perfectly, theoretically ``safe''
 or pure.

 Second, the client tries to keep the working copy correctly in sync
 with the repository using as little communication as possible.  Of
 course, this is made much harder by the first goal!

 So in the end, there's a tension here, and the resolutions to problems
 can vary.  In one case (the ``lagging directory''), the problem can be
 solved through a bit of clever entry tracking in the client.  In the
 other case (the ``overeager directory''), the only solution is to
 restrict some of the theoretical laxness allowed by the client.


 @c ------------------------------------------------------------------
 @node Compiling and installing
 @section Compiling and installing

 The latest instructions for compiling and installing Subversion (and
 httpd-2.0) are maintained in the @file{INSTALL} file at the top of the
 Subversion source tree.

 In general, you should also be able to find the latest version of this
 file by grabbing it directly from Subversion's own repository:
 @uref{http://svn.collab.net/repos/svn/trunk/INSTALL}

 @c ------------------------------------------------------------------
 @node Quick reference sheet
 @section Quick reference sheet

 A LaTeX quick-reference sheet is available for download on Subversion's
 website; it is compiled from the source file
 @file{doc/user/svn-ref.tex}.  Any volunteers to rewrite it here in
 Texinfo?


 @c ------------------------------------------------------------------
 @node FAQ
 @section FAQ

 The main FAQ for the project can be viewed directly in Subversion's
 repository:

 @uref{http://svn.collab.net/repos/svn/trunk/www/project_faq.html}


 @c ------------------------------------------------------------------
 @node Contributing
 @section Contributing

 For a full description of how to contribute to Subversion, read the
 @file{HACKING} file at the top of Subversion's source tree.  It's also
 available at @uref{http://svn.collab.net/repos/svn/trunk/HACKING}.

 In a nutshell: Subversion behaves like many open-source projects.  One
 begins by participating in discussion on mailing lists, then by
 submitting patches for review.  Eventually, contributors are granted
 direct commit access to the repository.


 @c ------------------------------------------------------------------
 @node License
 @section License

 Copyright @copyright{} 2002 Collab.Net.  All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are
 met:

 @enumerate
 @item
 Redistributions of source code must retain the above copyright notice,
 this list of conditions and the following disclaimer.

 @item
 Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.

 @item
 The end-user documentation included with the redistribution, if
 any, must include the following acknowledgment: ``This product includes
 software developed by CollabNet (@uref{http://www.Collab.Net/}).''
 Alternately, this acknowledgment may appear in the software itself, if
 and wherever such third-party acknowledgments normally appear.

 @item
 The hosted project names must not be used to endorse or promote
 products derived from this software without prior written
 permission. For written permission, please contact info@@collab.net.

 @item
 Products derived from this software may not use the ``Tigris'' name
 nor may ``Tigris'' appear in their names without prior written
 permission of CollabNet.

 @item
 THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY
 DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
 GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
 IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

 @end enumerate

 This software consists of voluntary contributions made by many
 individuals on behalf of CollabNet.