@node Appendices
@chapter Appendices
This chapter collects a number of other useful documents relevant to
Subversion.
@menu
* SVN for CVS users::
* Directory versioning::
* Compiling and installing::
* Quick reference sheet::
* FAQ::
* Contributing::
* License::
@end menu
@c ------------------------------------------------------------------
@node SVN for CVS users
@section SVN for CVS users
This document is meant to be a quick-start guide for CVS users new to
Subversion. It's not a substitute for real documentation and manuals;
but it should give you a quick conceptual ``diff'' when switching over.
The goal of Subversion is to take over the current and future CVS user
base. Subversion not only includes new features, but attempts to fix
certain ``broken'' behaviors that CVS had. This means that you may be
encouraged to break certain habits -- ones that you forgot were odd to
begin with.
@menu
* Revision numbers are different now::
* More disconnected operations::
* Distinction between status and update::
* Meta-data properties::
* Directory versions::
* Conflicts::
* Binary files::
* Authorization::
* Versioned Modules::
* Branches and tags::
@end menu
@node Revision numbers are different now
@subsection Revision numbers are different now
In CVS, revision numbers are per-file. This is because CVS uses RCS
as a backend; each file has a corresponding RCS file in the
repository, and the repository is roughly laid out according to
the structure of your project tree.
In Subversion, the repository looks like a single filesystem. Each
commit results in an entirely new filesystem tree; in essence, the
repository is an array of trees. Each of these trees is labeled with
a single revision number. When someone talks about ``revision 54,''
they're talking about a particular tree (and indirectly, the way the
filesystem looked after the 54th commit).
Technically, it's not valid to talk about ``revision 5 of @file{foo.c}''.
Instead, one would say ``@file{foo.c} as it appears in revision 5.''
Also, be careful when making assumptions about the evolution of a file.
In CVS, revisions 5 and 6 of @file{foo.c} are always different. In
Subversion, it's most likely that @file{foo.c} did @emph{not} change between
revisions 5 and 6.
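The ``array of trees'' idea can be sketched in a few lines of Python. This is only a toy model under our own assumptions: the @samp{commit} helper, the file names and contents, and the empty tree at revision 0 are all made up for illustration, and a real repository does not store full copies of every tree.

```python
# Toy model: the repository is a list of trees, indexed by revision number.
# Each tree maps a path to file contents. All names here are illustrative.

def commit(repository, changes):
    """Append a new tree built from the youngest tree plus the given changes."""
    new_tree = dict(repository[-1])   # start from a copy of the youngest tree
    new_tree.update(changes)          # apply this commit's modifications
    repository.append(new_tree)
    return len(repository) - 1        # the new revision number


repository = [dict()]                 # assumption: revision 0 is an empty tree
rev1 = commit(repository, dict([("foo.c", "int main;")]))
rev2 = commit(repository, dict([("bar.c", "void bar;")]))

# foo.c "as it appears in" revisions 1 and 2 is the same text,
# because the second commit only touched bar.c.
same = repository[rev1]["foo.c"] == repository[rev2]["foo.c"]
print(rev1, rev2, same)
```

In this model a revision number names a whole tree, so asking for @file{foo.c} in a later revision may well return exactly the same text as in an earlier one.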
@node More disconnected operations
@subsection More disconnected operations
In recent years, disk space has become outrageously cheap and
abundant, but network bandwidth has not. Therefore, the Subversion
working copy has been optimized around the scarcer resource.
The @file{.svn} administrative directory serves the same purpose as the
@file{CVS} directory, except that it also stores ``pristine'' copies of files.
This allows you to do many things off-line:
@itemize @bullet
@item @samp{svn status}
shows you local modifications (see below)
@item @samp{svn diff}
shows you the details of your modifications
@item @samp{svn ci}
sends differences to the repository (CVS only sends fulltexts!)
@item @samp{svn revert}
removes your modifications
@end itemize
This last subcommand is new; it will not only remove local mods, but
it will un-schedule operations such as adds and deletes. It's the
preferred way to revert a file; running @samp{rm file; svn up} will
still work, but it blurs the purpose of updating. And, while we're on
this subject@dots{}
@node Distinction between status and update
@subsection Distinction between status and update
In Subversion, we've tried to erase a lot of the confusion between the
@samp{status} and @samp{update} subcommands.
The @samp{status} command has two purposes: (1) to show the user any local
modifications in the working copy, and (2) to show the user which
files are out-of-date. Unfortunately, because of CVS's hard-to-read
output, many CVS users don't take advantage of this command at all.
Instead, they've developed a habit of running @samp{cvs up} to quickly see
their mods. Of course, this has the side effect of merging repository
changes that you may not be ready to deal with!
With Subversion, we've tried to remove this muddle by making the
output of @samp{svn status} easy to read for humans and parsers. Also,
@samp{svn update} only prints information about files that are updated,
@emph{not} local modifications.
Here's a quick guide to @samp{svn status}. We encourage all new
Subversion users to use it early and often:
@itemize @bullet
@item @samp{svn status}
prints all files that have local modifications; the network is not
accessed by default.
@itemize @bullet
@item @option{-u} switch
add out-of-dateness information from repository
@item @option{-v} switch
show @emph{all} entries under version control
@item @option{-n} switch
nonrecursive
@end itemize
@end itemize
The status command has two output formats. In the default ``short''
format, local modifications look like this:
@example
% svn status
M ./foo.c
M ./bar/baz.c
@end example
If you specify either the @option{-u} or @option{-v} switch, a ``long''
format is used:
@example
% svn status -u
M             1047    ./foo.c
_      *      1045    ./faces.html
_      *         -    ./bloo.png
M             1050    ./bar/baz.c
Head revision:   1066
@end example
In this case, two new columns appear. The second column
contains an asterisk if the file or directory is
out-of-date. The third column shows the working-copy's revision
number of the item. In the example above, the asterisk indicates that
@file{faces.html} would be patched if we updated, and that
@file{bloo.png} is a newly added file in the repository. (The @samp{-} next
to @file{bloo.png} means that it doesn't yet exist in the working copy.)
Lastly, here's a quick summary of status codes that you may see:
@example
A     Add
D     Delete
R     Replace  (delete, then re-add)
M     local Modification
U     Updated
G     merGed
C     Conflict
@end example
Subversion has combined the CVS @samp{P} and @samp{U} codes into just
@samp{U}. When a merge or conflict occurs, Subversion simply prints
@samp{G} or @samp{C}, rather than a whole sentence about it.
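Because the short format is meant to be easy for parsers, a script might read it roughly as follows. This is a sketch only: it assumes each line is a single status code, some whitespace, and then a path, as in the short-format example above, and the function name is our own invention.

```python
# Status codes as listed in the summary above.
STATUS_CODES = dict([
    ("A", "Add"),
    ("D", "Delete"),
    ("R", "Replace"),
    ("M", "local Modification"),
    ("U", "Updated"),
    ("G", "merGed"),
    ("C", "Conflict"),
])


def parse_short_status(text):
    """Return (code, meaning, path) tuples from short-format status lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split(None, 1)      # status code, then the path
        if len(fields) != 2:
            continue                      # skip blank or unrecognized lines
        code, path = fields
        meaning = STATUS_CODES.get(code, "unknown")
        entries.append((code, meaning, path.strip()))
    return entries


sample = "M ./foo.c\nM ./bar/baz.c\n"
for entry in parse_short_status(sample):
    print(entry)
```

The long format carries extra columns and a trailing summary line, so it would need a different parser.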
@node Meta-data properties
@subsection Meta-data properties
A new feature of Subversion is that you can attach arbitrary metadata to
files and directories. We refer to this data as @dfn{properties}, and
they can be thought of as collections of name/value pairs (hashtables)
attached to each item in your working copy.
To set or get a property's value, use the @samp{svn propset} and
@samp{svn propget} subcommands. To list all properties on an object, use
@samp{svn proplist}.
For more information, see @ref{Properties}.
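To make the ``hashtable'' picture concrete, here is a small Python model in which every path carries its own table of name/value pairs. The three helpers only mimic the roles of the subcommands named above; they are not Subversion's API, and the property names and values are invented for the example.

```python
# Toy model: each versioned path carries its own table of name/value pairs.
properties = dict()


def propset(name, value, path):
    """Attach (or overwrite) one property on a path."""
    properties.setdefault(path, dict())[name] = value


def propget(name, path):
    """Return the property's value, or None if it is not set."""
    return properties.get(path, dict()).get(name)


def proplist(path):
    """Return the sorted names of all properties set on a path."""
    return sorted(properties.get(path, dict()))


propset("owner", "sally", "foo.c")
propset("reviewed", "yes", "foo.c")
print(propget("owner", "foo.c"))
print(proplist("foo.c"))
```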
@node Directory versions
@subsection Directory versions
Subversion tracks tree structures, not just file contents. It's one
of the biggest reasons Subversion was written to replace CVS.
Here's what this means to you:
@itemize @bullet
@item
The @samp{svn add} and @samp{svn rm} commands work on directories now, just as
they work on files. So do @samp{svn cp} and @samp{svn mv}. However, these
commands do @emph{not} cause any kind of immediate change in the
repository. Instead, the working directory is recursively ``scheduled''
for addition or deletion. No repository changes happen until you
commit.
@item
Directories aren't dumb containers anymore; they have revision
numbers like files. (Or more properly, it's correct to talk
about ``directory @file{foo/} in revision 5''.)
@end itemize
Let's talk more about that last point. Directory versioning is a Hard
Problem. Because we want to allow mixed-revision working copies,
there are some limitations on how far we can abuse this model.
From a theoretical point of view, we define ``revision 5 of directory
@file{foo}'' to mean a specific collection of directory-entries and
properties. Now suppose we start adding and removing files from @file{foo},
and then commit. It would be a lie to say that we still have revision
5 of @file{foo}. However, if we bumped @file{foo}'s revision number after the
commit, that would be a lie too; there may be other changes to @file{foo} we
haven't yet received, because we haven't updated yet.
Subversion deals with this problem by quietly tracking committed adds
and deletes in the @file{.svn} area. When you eventually run @samp{svn
update}, all accounts are settled with the repository, and the directory's new
revision number is set correctly. @b{Therefore, only after an update is
it truly safe to say that you have a ``perfect'' revision of a directory.}
Most of the time, your working copy will contain ``imperfect'' directory
revisions.
Similarly, a problem arises if you attempt to commit property changes on
a directory. Normally, the commit would bump the working directory's
local revision number. But again, that would be a lie, because there
may be adds or deletes that the directory doesn't yet have, because no
update has happened. @b{Therefore, you are not allowed to commit
property-changes on a directory unless the directory is up-to-date.}
For more specific examples and discussion, see @ref{Directory versioning}.
@node Conflicts
@subsection Conflicts
CVS marks conflicts with in-line ``conflict markers'', and prints a @samp{C}
during an update. Historically, this has caused problems. Many users
forget about (or don't see) the @samp{C} after it whizzes by on their
terminal. They often forget that the conflict-markers are even
present, and then accidentally commit garbled files.
Subversion solves this problem by making conflicts more tangible.
Read about it in @ref{Basic Work Cycle}. In particular, read the
section about ``Merging others' changes''.
@node Binary files
@subsection Binary files
CVS users have to mark binary files with @option{-kb} flags, to prevent data
from being munged (due to keyword expansion and line-ending
translations). They sometimes forget to do this.
Subversion examines the @samp{svn:mime-type} property to decide if a file
is text or binary. If the file has no @samp{svn:mime-type} property,
Subversion assumes it is text. If the file has the @samp{svn:mime-type}
property set to anything other than @samp{text/*}, it assumes the file is
binary.
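The rule just described fits in a few lines of Python. This is a sketch of the rule as stated here, not Subversion's actual code; the function name and the use of a plain dictionary for the file's properties are our own assumptions.

```python
def is_binary(properties):
    """Apply the rule above to a dict of property names and values."""
    mime_type = properties.get("svn:mime-type")
    if mime_type is None:
        return False                          # no property: assume text
    return not mime_type.startswith("text/")  # anything but text/* is binary


print(is_binary(dict()))
print(is_binary(dict([("svn:mime-type", "text/plain")])))
print(is_binary(dict([("svn:mime-type", "image/png")])))
```

Only the last call reports a binary file; the first two fall under the ``no property'' and @samp{text/*} cases.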
Subversion also helps users by running a binary-detection algorithm in
the @samp{svn import} and @samp{svn add} subcommands. These subcommands will
make a good guess and then (possibly) set a binary @samp{svn:mime-type}
property on the file being added. (If Subversion guesses wrong, you
can always remove or hand-edit the property.)
As in CVS, binary files are not subject to keyword expansion or
line-ending conversions. Also, when a binary file is ``merged'' during
update, no real merge occurs. Instead, Subversion creates two files
side-by-side in your working copy; the one containing your local
modifications is renamed with an @file{.orig} extension.
@node Authorization
@subsection Authorization
Unlike CVS, SVN can handle anonymous and authorized users in the same
repository. There is no need for an anonymous user or a separate
repository. If the SVN server requests authorization when committing,
the client should prompt you for your authorization (password).
@node Versioned Modules
@subsection Versioned Modules
Unlike CVS, a Subversion working copy is aware that it has checked out
a module. That means that if somebody changes the definition of a
module, then a call to @samp{svn up} will update the working copy
appropriately.
Subversion defines modules as a list of directories within a directory
property. @xref{Modules}.
@node Branches and tags
@subsection Branches and tags
Subversion doesn't distinguish between filesystem space and ``branch''
space; branches and tags are ordinary directories within the
filesystem. This is probably the single biggest mental hurdle a CVS
user will need to climb. Read all about it in @ref{Branches and Tags}.
@c ------------------------------------------------------------------
@node Directory versioning
@section Directory versioning
@quotation
@emph{"The three cardinal virtues of a master technologist are:
laziness, impatience, and hubris." -- Larry Wall}
@end quotation
This appendix describes some of the theoretical pitfalls around the
(possibly arrogant) notion that one can simply version directories
just as one versions files.
@subsection Directory Revisions
To begin, recall that the Subversion repository is an array of trees.
Each tree represents the application of a new atomic commit, and is
called a @dfn{revision}. This is very different from a CVS repository,
which stores file histories in a collection of RCS files (and doesn't
track tree structure).
So when we refer to ``revision 4 of @file{foo.c}'' (written @dfn{foo.c:4}) in
CVS, this means the fourth distinct version of @file{foo.c} -- but in
Subversion this means ``the version of @file{foo.c} in the fourth revision
(tree)''. It's quite possible that @file{foo.c} has never changed at all
since revision 1! In other words, in Subversion, different revision
numbers of the same versioned item do @emph{not} imply different
contents.
Nevertheless, the contents of @file{foo.c:4} are still well-defined. The
file @file{foo.c} in revision 4 has a specific text and properties.
Suppose, now, that we extend this concept to directories. If we have a
directory @file{DIR}, define @dfn{DIR:N} to be ``the directory @file{DIR} in
revision @var{N}.'' The contents are defined to be a particular set of
directory entries (@dfn{dirents}) and properties.
So far, so good. The concept of versioning directories seems fine in
the repository -- the repository is very theoretically pure anyway.
However, because working copies allow mixed revisions, it's easy to
create problematic use-cases.
@subsection The Lagging Directory
@subsubsection Problem
@c This is the first part of the ``Greg Hudson'' problem, so named
@c because he was the first one to bring it up and define it well. :-)
Suppose our working copy has directory @samp{DIR:1} containing file
@samp{foo:1}, along with some other files. We remove @file{foo} and
commit.
Already, we have a problem: our working copy still claims to have
@samp{DIR:1}. But on the repository, revision 1 of @file{DIR} is
@emph{defined} to contain @samp{foo} -- and our working copy @file{DIR} clearly
does not have it anymore. How can we truthfully say that we still have
@samp{DIR:1}?
One answer is to force @file{DIR} to be updated when we commit
@file{foo}'s deletion. Assuming that our commit created revision 2, we
would immediately update our working copy to @samp{DIR:2}. Then the
client and server would both agree that @samp{DIR:2} does not contain
@file{foo}, and that @samp{DIR:2} is indeed exactly what is in the working
copy.
This solution has nasty, un-user-friendly side effects, though. It's
likely that other people may have committed before us, possibly adding
new properties to @file{DIR}, or adding a new file @file{bar}. Now pretend our
committed deletion creates revision 5 in the repository. If we
instantly update our local @file{DIR} to 5, that means unexpectedly receiving a
copy of @file{bar} and some new propchanges. This clearly violates a UI
principle: ``the client will never change your working copy until you ask
it to.'' Committing changes to the repository is a server-write
operation only; it should @emph{not} modify your working data!
Another solution is to do the naive thing: after committing the
deletion of @file{foo}, simply stop tracking the file in the @file{.svn}
administrative directory. The client then loses all knowledge of the
file.
But this doesn't work either: if we now update our working copy, the
communication between client and server is incorrect. The client still
believes that it has @samp{DIR:1} -- which is false, since a ``true''
@samp{DIR:1} contains @file{foo}. The client gives this incorrect
report to the repository, and the repository decides that in order to
update to revision 2, @file{foo} must be deleted. Thus the repository
sends a bogus (or at least unnecessary) deletion command.
@subsubsection Solution
After deleting @file{foo} and committing, the file is @emph{not}
totally forgotten by the @file{.svn} directory. While the file is no
longer considered to be under revision control, it is still secretly
remembered as having been `deleted'.
When the user updates the working copy, the client correctly informs the
server that the file is already missing from its local @samp{DIR:1};
therefore the repository doesn't try to re-delete it when patching the
client up to revision 2.
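The difference between the naive approach and the remembered deletion can be modeled in Python. Everything in this sketch is hypothetical: the entry states, the helper names, and the idea of the server computing a simple set difference are simplifications for illustration, and the real bookkeeping in @file{.svn} is more involved.

```python
# Toy model of the "deleted" flag, using sets of entry names per directory.

def commit_deletion(entries, name):
    """After committing a delete, keep the entry but flag it as deleted."""
    entries[name] = "deleted"


def report_state(entries):
    """Names the client tells the server it actually has."""
    return set(name for name, state in entries.items() if state != "deleted")


def update_commands(client_names, target_names):
    """Commands the server sends to turn the client's tree into the target."""
    deletes = [("delete", name) for name in sorted(client_names - target_names)]
    adds = [("add", name) for name in sorted(target_names - client_names)]
    return deletes + adds


dir_rev1 = set(["foo", "other.c"])     # DIR:1 as defined in the repository
dir_rev2 = set(["other.c"])            # DIR:2, after foo's deletion

entries = dict((name, "normal") for name in dir_rev1)
commit_deletion(entries, "foo")

# Naive client: claims a "true" DIR:1, so the server re-sends the deletion.
print(update_commands(dir_rev1, dir_rev2))
# Flag-aware client: reports foo as already missing, so nothing is sent.
print(update_commands(report_state(entries), dir_rev2))
```

The first call yields a single deletion command for @file{foo}; the second yields an empty list, which is the behavior the solution is after.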
@c Notes, for coders, about how the `deleted' flag works under the hood:
@c * the @samp{svn status} command won't display a deleted item, unless
@c you make the deleted item the specific target of status.
@c
@c * when a deleted item's parent is updated, one of two things will happen:
@c
@c (1) the repository will re-add the item, thereby overwriting
@c the entire entry. (no more `deleted' flag)
@c
@c (2) the repository will say nothing about the item, which means
@c that it's fully aware that your item is gone, and this is
@c the correct state to be in. In this case, the entire entry
@c is removed. (no more `deleted' flag)
@c
@c * if a user schedules an item for addition that has the same name
@c as a `deleted' entry, then entry will have both flags
@c simultaneously. This is perfectly fine:
@c
@c * the commit-crawler will notice both flags and do a delete()
@c and then an add(). This ensures that the transaction is
@c built correctly. (without the delete(), the add() would be
@c on top of an already-existing item.)
@c
@c * when the commit completes, the client rewrites the entry as
@c normal. (no more `deleted' flag)
@subsection The Overeager Directory
@c This is the 2nd part of the ``Greg Hudson'' problem.
@subsubsection Problem
Again, suppose our working copy has directory @samp{DIR:1} containing
file @samp{foo:1}, along with some other files.
Now, unbeknownst to us, somebody else adds a new file @file{bar} to this
directory, creating revision 2 (and @samp{DIR:2}).
Now we add a property to @file{DIR} and commit, which creates revision
3. Our working-copy @file{DIR} is now marked as being at revision 3.
Of course, this is false; our working copy does @emph{not} have
@samp{DIR:3}, because the ``true'' @samp{DIR:3} on the repository contains
the new file @file{bar}. Our working copy has no knowledge of
@file{bar} at all.
Again, we can't follow our commit of @file{DIR} with an automatic update
(and addition of @file{bar}). As mentioned previously, commits are a
one-way write operation; they must not change working copy data.
@subsubsection Solution
Let's enumerate exactly those times when a directory's local revision
number changes:
@itemize @bullet
@item
@b{when a directory is updated}: if the directory is either the direct
target of an update command, or is a child of an updated directory, it
will be bumped (along with many other siblings and children) to a
uniform revision number.
@item
@b{when a directory is committed}: a directory can only be considered a
``committed object'' if it has a new property change. (Otherwise, to
``commit a directory'' really implies that its modified children are being
committed, and only such children will have local revisions bumped.)
@end itemize
In this light, it's clear that our ``overeager directory'' problem only
happens in the second situation -- those times when we're committing
directory propchanges.
Thus the answer is simply not to allow property-commits on directories
that are out-of-date. It sounds a bit restrictive, but there's no other
way to keep directory revisions accurate.
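As a rough illustration of the restriction, the check might look like the Python below. The function names and the @samp{last_changed_rev} parameter (the youngest revision in which the repository changed the directory) are assumptions made for this sketch; the real enforcement happens on the repository side and does not compare plain revision numbers like this.

```python
def is_out_of_date(working_rev, last_changed_rev):
    """True if the repository changed the directory after our revision."""
    return last_changed_rev > working_rev


def check_dir_propchange_commit(working_rev, last_changed_rev):
    """Raise if a directory property change must be refused."""
    if is_out_of_date(working_rev, last_changed_rev):
        raise ValueError("directory is out of date; update before committing")
    return True


# The scenario above: we hold DIR:1, someone else's commit made DIR:2.
try:
    check_dir_propchange_commit(working_rev=1, last_changed_rev=2)
    allowed = True
except ValueError as error:
    allowed = False
    print(error)

# After updating to revision 2 the same commit is accepted.
print(check_dir_propchange_commit(working_rev=2, last_changed_rev=2))
```

Refusing the commit up front means the working copy never gets to claim a directory revision whose entries it has not actually received.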
@c Note to developers: this restriction is enforced by the filesystem
@c merge() routine.
@c Once merge() has established that {ancestor, source, target} are all
@c different node-rev-ids, it examines the property-keys of ancestor
@c and target. If they're *different*, it returns a conflict error.
@subsection User impact
Really, the Subversion client seems to have two difficult---almost
contradictory---goals.
First, it needs to make the user experience friendly, which generally
means being a bit ``sloppy'' about deciding what a user can or cannot do.
This is why it allows mixed-revision working copies, and why it tries to
let users execute local tree-changing operations (delete, add, move,
copy) in situations that aren't always perfectly, theoretically ``safe''
or pure.
Second, the client tries to keep the working copy correctly in sync
with the repository using as little communication as possible. Of
course, this is made much harder by the first goal!
So in the end, there's a tension here, and the resolutions to problems
can vary. In one case (the ``lagging directory''), the problem can be
solved through a bit of clever entry tracking in the client. In the
other case (the ``overeager directory''), the only solution is to
restrict some of the theoretical laxness allowed by the client.
@c ------------------------------------------------------------------
@node Compiling and installing
@section Compiling and installing
The latest instructions for compiling and installing Subversion (and
httpd-2.0) are maintained in the @file{INSTALL} file at the top of the
Subversion source tree.
In general, you should also be able to find the latest version of this
file by grabbing it directly from Subversion's own repository:
@uref{http://svn.collab.net/repos/svn/trunk/INSTALL}
@c ------------------------------------------------------------------
@node Quick reference sheet
@section Quick reference sheet
A LaTeX quick-reference sheet is available for download from Subversion's
website; it is compiled from the source file
@file{doc/user/svn-ref.tex}. Any volunteers to rewrite it here
in Texinfo?
@c ------------------------------------------------------------------
@node FAQ
@section FAQ
The main FAQ for the project can be viewed directly in Subversion's
repository:
@uref{http://svn.collab.net/repos/svn/trunk/www/project_faq.html}
@c ------------------------------------------------------------------
@node Contributing
@section Contributing
For a full description of how to contribute to Subversion, read the
@file{HACKING} file at the top of Subversion's source tree. It's also
available at @uref{http://svn.collab.net/repos/svn/trunk/HACKING}.
In a nutshell: Subversion behaves like many open-source projects. One
begins by participating in discussion on mailing lists, then by
submitting patches for review. Eventually, one is granted direct
commit access to the repository.
@c ------------------------------------------------------------------
@node License
@section License
Copyright @copyright{} 2002 Collab.Net. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
@enumerate
@item
Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
@item
Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
@item
The end-user documentation included with the redistribution, if
any, must include the following acknowledgment: ``This product includes
software developed by CollabNet (@uref{http://www.Collab.Net/}).''
Alternately, this acknowledgment may appear in the software itself, if
and wherever such third-party acknowledgments normally appear.
@item
The hosted project names must not be used to endorse or promote
products derived from this software without prior written
permission. For written permission, please contact info@@collab.net.
@item
Products derived from this software may not use the ``Tigris'' name
nor may ``Tigris'' appear in their names without prior written
permission of CollabNet.
@item
THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@end enumerate
This software consists of voluntary contributions made by many
individuals on behalf of CollabNet.