@node Model
@chapter Model
This chapter explains the user's view of Subversion --- what ``objects''
you interact with, how they behave, and how they relate to each other.
@menu
* Working Directories and Repositories::
* Transactions and Version Numbers::
* How Working Directories Track the Repository::
* Subversion Does Not Lock Files::
* Properties::
* Merging and Ancestry::
@end menu
@c -----------------------------------------------------------------------
@node Working Directories and Repositories
@section Working Directories and Repositories
Suppose you are using Subversion to manage a software project. There
are two things you will interact with: your working directory, and the
repository.
Your @dfn{working directory} is an ordinary directory tree, on your
local system, containing your project's sources. You can edit these
files and compile your program from them in the usual way. Your working
directory is your own private work area: Subversion never changes the
files in your working directory, or publishes the changes you make
there, until you explicitly tell it to do so.
After you've made some changes to the files in your working directory,
and verified that they work properly, Subversion provides commands to
publish your changes to the other people working with you on your
project. If they publish their own changes, Subversion provides
commands to incorporate those changes into your working directory.
A working directory contains some extra files, created and maintained by
Subversion, to help it carry out these commands. In particular, these
files help Subversion recognize which files contain unpublished changes,
and which files are out-of-date with respect to others' work.
While your working directory is for your use alone, the @dfn{repository}
is the common public record you share with everyone else working on the
project. To publish your changes, you use Subversion to put them in the
repository. (What this means, exactly, we explain below.) Once your
changes are in the repository, others can tell Subversion to incorporate
your changes into their working directories. In a collaborative
environment like this, each user will typically have their own working
directory (or perhaps more than one), and all the working directories
will be backed by a single repository, shared amongst all the users.
A Subversion repository holds a single directory tree, and records the
history of changes to that tree. The repository retains enough
information to recreate any prior state of the tree, compute the
differences between any two prior trees, and report the relations
between files in the tree --- which files are derived from which other
files.
A Subversion repository can hold the source code for several projects;
usually, each project is a subdirectory in the tree. In this
arrangement, a working directory will usually correspond to a particular
subtree of the repository.
For example, suppose you have a repository laid out like this:
@example
/trunk/paint/Makefile
             canvas.c
             brush.c
       write/Makefile
             document.c
             search.c
@end example
In other words, the repository's root directory has a single
subdirectory named @file{trunk}, which itself contains two
subdirectories: @file{paint} and @file{write}.
To get a working directory, you must @dfn{check out} some subtree of the
repository. If you check out @file{/trunk/write}, you will get a working
directory like this:
@example
write/Makefile
      document.c
      search.c
      SVN/
@end example
This working directory is a copy of the repository's @file{/trunk/write}
directory, with one additional entry --- @file{SVN} --- which holds the
extra information needed by Subversion, as mentioned above.
Suppose you make changes to @file{search.c}. Since the @file{SVN}
directory remembers the file's modification date and original contents,
Subversion can tell that you've changed the file. However, Subversion
does not make your changes public until you explicitly tell it to.
To publish your changes, you can use Subversion's @samp{commit} command:
@example
$ pwd
/home/jimb/write
$ ls
Makefile SVN/ document.c search.c
$ svn commit search.c
$
@end example
Now your changes to @file{search.c} have been committed to the
repository; if another user checks out a working copy of
@file{/trunk/write}, they will see your text.
Suppose you have a collaborator, Felix, who checked out a working
directory of @file{/trunk/write} at the same time you did. When you
commit your change to @file{search.c}, Felix's working copy is left
unchanged; Subversion only modifies working directories at the user's
request.
To bring his working directory up to date, Felix can use the Subversion
@samp{update} command. This will incorporate your changes into his
working directory, as well as any others that have been committed since
he checked it out.
@example
$ pwd
/home/felix/write
$ ls
Makefile SVN/ document.c search.c
$ svn update
U search.c
$
@end example
The output from the @samp{svn update} command indicates that Subversion
updated the contents of @file{search.c}. Note that Felix didn't need to
specify which files to update; Subversion uses the information in the
@file{SVN} directory, and further information in the repository, to
decide which files need to be brought up to date.
We explain below what happens when both you and Felix make changes to
the same file.
@c -----------------------------------------------------------------------
@node Transactions and Version Numbers
@section Transactions and Version Numbers
A Subversion @samp{commit} operation can publish changes to any number
of files and directories as a single atomic transaction. In your
working directory, you can change files' contents, create, delete,
rename and copy files and directories, and then commit the completed set
of changes as a unit.
In the repository, each commit is treated as an atomic transaction:
either all the commit's changes take place, or none of them take place.
Subversion tries to retain this atomicity in the face of program
crashes, system crashes, network problems, and other users' actions. We
may call a commit a @dfn{transaction} when we want to emphasize its
indivisible nature.
Each time the repository accepts a transaction, this creates a new state
of the tree, called a @dfn{version}. Each version is assigned a unique
natural number, one greater than the number of the previous version.
The initial version of a freshly created repository is numbered zero,
and consists of an empty root directory.
Since each transaction creates a new version, with its own number, we
can also use these numbers to refer to transactions; transaction @var{n}
is the transaction which created version @var{n}. There is no
transaction numbered zero.
Unlike those of many other systems, Subversion's version numbers apply
to an entire tree, not individual files. Each version number selects an
entire tree.
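As a rough illustration of this numbering scheme, the following Python
sketch models a repository as a list of whole-tree snapshots. It is not
Subversion's implementation; the names @code{Repository}, @code{commit},
and @code{latest} are invented for the example, and a tree is reduced to
a mapping from paths to file contents.

@verbatim
```python
class Repository:
    """Toy in-memory repository: one tree snapshot per version number."""

    def __init__(self):
        # Version 0 of a fresh repository is an empty root directory.
        self.versions = [{}]

    @property
    def latest(self):
        return len(self.versions) - 1

    def commit(self, changes):
        """Apply all changes as one transaction; return the new version number.

        `changes` maps a path to new contents, or to None to delete the path.
        """
        new_tree = dict(self.versions[-1])   # work on a copy of the latest tree
        for path, contents in changes.items():
            if contents is None:
                if path not in new_tree:
                    # Abort: the stored versions are untouched, so none of
                    # the transaction's changes take place.
                    raise KeyError("cannot delete missing path: " + path)
                del new_tree[path]
            else:
                new_tree[path] = contents
        self.versions.append(new_tree)       # publish the whole tree at once
        return self.latest


repo = Repository()
v1 = repo.commit({"write/Makefile": "all:", "write/search.c": "/* v1 */"})
v2 = repo.commit({"write/search.c": "/* v2 */"})
try:
    repo.commit({"write/document.c": "text", "write/missing.c": None})
except KeyError:
    pass  # the failed transaction created no version
```
@end verbatim

Each successful commit yields a number one greater than the last, every
number selects a complete tree, and the failed third commit leaves no
trace of its partial changes.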
It's important to note that working directories do not always correspond
to any single version in the repository; they may contain files from
several different versions. For example, suppose you check out a
working directory from a repository whose most recent version is 4:
@example
write/Makefile:4
      document.c:4
      search.c:4
@end example
At the moment, this working directory corresponds exactly to version 4
in the repository. However, suppose you make a change to
@file{search.c}, and commit that change. Assuming no other commits have
taken place, your commit will create version 5 of the repository, and
your working directory will look like this:
@example
write/Makefile:4
      document.c:4
      search.c:5
@end example
Suppose that, at this point, Felix commits a change to
@file{document.c}, creating version 6. If you use @samp{svn update} to
bring your working directory up to date, then it will look like this:
@example
write/Makefile:6
      document.c:6
      search.c:6
@end example
Felix's changes to @file{document.c} will appear in your working copy of
that file, and your change will still be present in @file{search.c}. In
this example, the text of @file{Makefile} is identical in versions 4, 5,
and 6, but Subversion will mark your working copy with version 6 to
indicate that it is still current. So, after you do a clean update at
the root of your working directory, your working directory will
generally correspond exactly to some version in the repository.
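The walk-through above can be replayed with a small Python sketch that
tracks only the version recorded for each file. The helper names
(@code{checkout}, @code{after_commit}, @code{after_update},
@code{is_single_version}) are hypothetical and exist only for this
illustration; file contents are ignored.

@verbatim
```python
def checkout(paths, version):
    """Return a working directory: each path mapped to its recorded version."""
    return {path: version for path in paths}


def after_commit(workdir, committed_paths, new_version):
    """Only the committed files move to the newly created version."""
    updated = dict(workdir)
    for path in committed_paths:
        updated[path] = new_version
    return updated


def after_update(workdir, latest_version):
    """A clean update at the root marks every file with the latest version."""
    return {path: latest_version for path in workdir}


def is_single_version(workdir):
    """True when the working directory matches exactly one repository version."""
    return len(set(workdir.values())) <= 1


wd = checkout(["Makefile", "document.c", "search.c"], 4)
mixed = after_commit(wd, ["search.c"], 5)   # your commit creates version 5
current = after_update(mixed, 6)            # Felix's commit created version 6
```
@end verbatim

Here @code{mixed} holds files from versions 4 and 5 at once, while
@code{current} corresponds to version 6 alone.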
@c -----------------------------------------------------------------------
@node How Working Directories Track the Repository
@section How Working Directories Track the Repository
For each file in a working directory, Subversion records two essential
pieces of information:
@itemize @bullet
@item
which version of which repository file your working copy is based on (this is called the file's @dfn{base version}), and
@item
a timestamp recording when the local copy was last updated.
@end itemize
Given this information, by talking to the repository, Subversion can
tell which of the following four states a file is in:
@itemize @bullet
@item
@b{Unchanged, and current}. The file is unchanged in the working
directory, and no changes to that file have been committed to the
repository since its base version.
@item
@b{Locally changed, and current}. The file has been changed in the
working directory, and no changes to that file have been committed to
the repository since its base version. There are local changes that
have not been committed to the repository.
@item
@b{Unchanged, and out-of-date}. The file has not been changed in the
working directory, but it has been changed in the repository. The file
should eventually be updated, to make it current with the public
version.
@item
@b{Locally changed, and out-of-date}. The file has been changed both
in the working directory, and in the repository. The file should be
updated; Subversion will attempt to merge the public changes with the
local changes. If it can't complete the merge in a plausible way
automatically, Subversion leaves it to the user to resolve the conflict.
@end itemize
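The four states are the combinations of two independent yes/no
questions, which the following Python sketch makes explicit. It assumes
the caller already knows whether the working file differs from its base
text and in which version the repository file last changed; the function
name @code{file_state} and its parameters are invented for this example.

@verbatim
```python
def file_state(locally_changed, base_version, last_changed_version):
    """Classify a working file into one of the four states.

    locally_changed: True if the working text differs from the base text.
    base_version: the version the working copy of the file is based on.
    last_changed_version: the latest version in which the repository
        file was changed.
    """
    out_of_date = last_changed_version > base_version
    local = "locally changed" if locally_changed else "unchanged"
    currency = "out-of-date" if out_of_date else "current"
    return local + ", and " + currency


states = [
    file_state(False, 4, 4),
    file_state(True, 4, 4),
    file_state(False, 4, 6),
    file_state(True, 4, 6),
]
```
@end verbatim

Only the last two states call for an update, and only the last one
involves a merge.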
@c -----------------------------------------------------------------------
@node Subversion Does Not Lock Files
@section Subversion Does Not Lock Files
Subversion does not prevent two users from making changes to the same
file at the same time. For example, if both you and Felix have checked
out working directories of @file{/trunk/write}, Subversion will allow
both of you to change @file{write/search.c} in your working directories.
Then, the following sequence of events will occur:
@itemize @bullet
@item
Suppose Felix tries to commit his changes to @file{search.c} first. His
commit will succeed, and his text will appear in the latest version in
the repository.
@item
When you attempt to commit your changes to @file{search.c}, Subversion
will reject your commit, and tell you that you must update
@file{search.c} before you can commit it.
@item
When you update @file{search.c}, Subversion will try to merge Felix's
changes from the repository with your local changes. By default,
Subversion merges as if it were applying a patch: if your local changes
do not overlap textually with Felix's, then all is well; otherwise,
Subversion leaves it to you to resolve the overlapping
changes. In either case,
Subversion carefully preserves a copy of the original pre-merge text.
@item
Once you have verified that Felix's changes and your changes have been
merged correctly, you can commit the new version of @file{search.c},
which now contains everyone's changes.
@end itemize
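The sequence above is a form of optimistic concurrency: nobody is
blocked from editing, and the check happens at commit time. A minimal
Python sketch of that check follows. It is an illustration only: the
names @code{FileHistory} and @code{OutOfDate} are invented, a per-file
counter stands in for Subversion's tree-wide version numbers, and the
merge is performed by hand rather than by a merge tool.

@verbatim
```python
class OutOfDate(Exception):
    """Raised when a commit is based on a text that is no longer the latest."""


class FileHistory:
    """Committed texts of one file; a per-file counter stands in for versions."""

    def __init__(self, text):
        self.texts = [text]

    @property
    def latest(self):
        return len(self.texts) - 1

    def commit(self, base, text):
        # Reject the commit unless it was prepared against the latest text.
        if base != self.latest:
            raise OutOfDate("update before committing")
        self.texts.append(text)
        return self.latest


search_c = FileHistory("find()\n")
base = search_c.latest                  # you and Felix both start from here

felix = search_c.commit(base, "find()\nreplace()\n")   # Felix commits first

try:
    search_c.commit(base, "count()\nfind()\n")         # your commit is refused
    rejected = False
except OutOfDate:
    rejected = True

# After updating, your working text holds both sets of changes (merged by
# hand in this sketch), and a commit against the latest text is accepted.
merged = "count()\nfind()\nreplace()\n"
yours = search_c.commit(search_c.latest, merged)
```
@end verbatim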
Some version control systems provide ``locks'', which prevent others
from changing a file once one person has begun working on it. In our
experience, merging is preferable to locks, because:
@itemize @bullet
@item
changes usually do not conflict, so Subversion's behavior does the right
thing by default, while locking can interfere with legitimate work;
@item
locking can prevent conflicts within a file, but not conflicts between
files (say, between a C header file and another file that includes it),
so it doesn't really solve the problem; and finally,
@item
people often forget that they are holding locks, resulting in
unnecessary delays and friction.
@end itemize
Of course, the merge process needs to be under the users' control.
Patch-style merging is not appropriate for files with rigid formats, like images or
executables. Subversion allows users to customize its merging behavior
on a per-file basis. You can direct Subversion to refuse to merge
changes to certain files, and simply present you with the two original
texts to choose from. Or, you can direct Subversion to merge using a
tool which respects the semantics of the file format.
@c -----------------------------------------------------------------------
@node Properties
@section Properties
Files generally have interesting attributes beyond their contents:
owners and groups, access permissions, creation and modification times,
and so on. Subversion attempts to preserve these attributes, or at
least record them, when doing so would be meaningful. However,
different operating systems support very different sets of file
attributes: Windows NT supports access control lists, while Linux
provides only the simpler traditional Unix permission bits.
In order to interoperate well with clients on many different operating
systems, Subversion supports @dfn{property lists}, a simple,
general-purpose mechanism which clients can use to store arbitrary
out-of-band information about files.
A property list is a set of name / value pairs. A property name is an
arbitrary text string, expressed as a Unicode UTF-8 string, canonically
decomposed and ordered. A property value is an arbitrary string of
bytes. Property values may be of any size, but Subversion may not
handle very large property values efficiently. No two properties in a
given property list may have the same name. Although the word `list'
usually denotes an ordered sequence, there is no fixed order to the
properties in a property list; the term `property list' is historical.
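To make these rules concrete, here is a small Python sketch of a
property list. It is only an illustration: the class name
@code{PropertyList} and its methods are invented, and it reads
``canonically decomposed and ordered'' as Unicode Normalization Form D,
which is an assumption rather than something this document specifies.

@verbatim
```python
import unicodedata


class PropertyList:
    """Unordered set of name/value pairs with unique, normalized names."""

    def __init__(self):
        self._props = {}

    @staticmethod
    def _canonical(name):
        # Assumption: canonical decomposition and ordering means NFD.
        return unicodedata.normalize("NFD", name)

    def set(self, name, value):
        if not isinstance(value, bytes):
            raise TypeError("property values are byte strings")
        # Storing under the canonical name keeps names unique.
        self._props[self._canonical(name)] = value

    def get(self, name):
        return self._props[self._canonical(name)]

    def names(self):
        return set(self._props)


props = PropertyList()
props.set("caf\u00e9", b"first")     # precomposed e-acute
props.set("cafe\u0301", b"second")   # e followed by a combining acute accent
```
@end verbatim

Both spellings normalize to the same name, so the second call replaces
the value stored by the first instead of creating a duplicate property.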
Each version number, file, directory, and directory entry in the
Subversion repository has its own property list. Subversion puts these
property lists to several uses:
@itemize @bullet
@item
Clients use properties to store file attributes, as described above.
For example, the Unix Subversion client records the files' permission
bits as the value of a property called
@samp{svn-posix-access-permission}. Operating systems which allow files
to have more than one name, like Windows 95, can use directory entry
property lists to record files' alternative names.
@item
The Subversion server uses properties to hold attributes of its own, and
allow clients to read and modify them. For example, the @samp{svn-acl}
property holds an access control list which the Subversion server uses
to regulate access to repository files.
@item
Users can invent properties of their own, to store arbitrary information
for use by scripts, build environments, and so on. Names of user
properties should be URIs, to avoid conflicts between organizations.
@end itemize
Property lists are versioned, just like file contents. You can change
properties in your working directory, but those changes are not visible
in the repository until you commit your local changes. If you do commit
a change to a property value, other users will see your change when they
update their working directories.
@c -----------------------------------------------------------------------
@node Merging and Ancestry
@section Merging and Ancestry
Subversion defines merges the same way CVS does: to merge means to take
a set of previously committed changes and apply them, as a patch, to a
working copy. This change can then be committed, like any other change.
(In Subversion's case, the patch may include changes to directory trees,
not just file contents.)
As defined thus far, merging is equivalent to hand-editing the working
copy into the same state as would result from the patch application. In
fact, in CVS there @emph{is} no difference -- it is equivalent to just
editing the files, and there is no record of which ancestors these
particular changes came from. Unfortunately, this leads to conflicts
when users unintentionally merge the same changes again. (Experienced
CVS users avoid this problem by using branch- and merge-point tags, but
that involves a lot of unwieldy bookkeeping.)
In Subversion, merges are remembered by recording @dfn{ancestry sets}.
A version's ancestry set is the set of all changes ``accounted for'' in
that version. By maintaining ancestry sets, and consulting them when
doing merges, Subversion can detect when it would apply the same patch
twice, and spare users much bookkeeping. Ancestry sets are stored as
properties.
In the examples below, bear in mind that version numbers usually refer
to changes, rather than the full contents of that version. For example,
``the change A:4'' means ``the delta that resulted in A:4'', not ``the full
contents of A:4''.
The simplest ancestry sets are associated with linear histories. For
example, here's the history of a file A:
@example
@group
 _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:3 |----->| A:4 |----->| A:5 |
|_____|      |_____|      |_____|      |_____|      |_____|
@end group
@end example
The ancestry set of A:5 is:
@example
@group
@{ A:1, A:2, A:3, A:4, A:5 @}
@end group
@end example
That is, it includes the change that brought A from nothing to A:1, the
change from A:1 to A:2, and so on to A:5. From now on, ranges like this
will be represented with a more compact notation:
@example
@group
@{ A:1-5 @}
@end group
@end example
Now assume there's a branch B based, or ``rooted'', at A:2. (This
postulates an entirely different version history, of course, and the
global version numbers in the diagrams will change to reflect it.)
Here's what the project looks like with the branch:
@example
@group
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |----->| A:9 |
|_____|      |_____|      |_____|      |_____|      |_____|      |_____|
                 \
                  \
                   \  _____        _____        _____
                    \|     |      |     |      |     |
                     | B:3 |----->| B:5 |----->| B:7 |
                     |_____|      |_____|      |_____|
@end group
@end example
If we produce A:9 by merging the B branch back into the trunk
@example
@group
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |---.->| A:9 |
|_____|      |_____|      |_____|      |_____|      |_____|  /   |_____|
                 \                                           |
                  \                                          |
                   \  _____        _____        _____        /
                    \|     |      |     |      |     |      /
                     | B:3 |----->| B:5 |----->| B:7 |--->-'
                     |_____|      |_____|      |_____|
@end group
@end example
then what will A:9's ancestry set be?
@example
@group
@{ A:1, A:2, A:4, A:6, A:8, A:9, B:3, B:5, B:7 @}
@end group
@end example
or more compactly:
@example
@group
@{ A:1-9, B:3-7 @}
@end group
@end example
(It's all right that each file's ranges seem to include non-changes;
this is just a notational convenience, and you can think of the
non-changes as either not being included, or being included but being
null deltas as far as that file is concerned.)
All changes along the B line are accounted for (changes B:3-7), and so
are all changes along the A line, including both the merge and any
non-merge-related edits made before the commit.
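The compact notation can be generated mechanically once the version
numbers at which a line of development actually changed are known. The
Python sketch below is one way to do so; the function @code{compact} and
its parameters are invented for this illustration and are not part of
Subversion. A range is allowed to span version numbers that belong to
another line, matching the convention described above.

@verbatim
```python
def compact(line, line_changes, accounted):
    """Render the accounted-for changes of one line in range notation.

    line: the line's name, such as "A" or "B".
    line_changes: every version number at which this line changed.
    accounted: the subset of those numbers present in the ancestry set.
    """
    runs, current = [], None
    for n in sorted(line_changes):
        if n in accounted:
            # Extend the open run, or start a new one at n.
            current = (current[0], n) if current else (n, n)
        elif current:
            # A change of this line is missing: close the open run.
            runs.append(current)
            current = None
    if current:
        runs.append(current)
    return ", ".join(
        "%s:%d" % (line, lo) if lo == hi else "%s:%d-%d" % (line, lo, hi)
        for lo, hi in runs
    )


trunk = compact("A", [1, 2, 4, 6, 8, 9], {1, 2, 4, 6, 8, 9})
branch = compact("B", [3, 5, 7], {3, 5, 7})
partial = compact("B", [3, 5, 7, 9], {5, 9})
```
@end verbatim

With these inputs, @code{trunk} and @code{branch} reproduce the two
ranges shown for A:9, and @code{partial} shows that a set skipping some
of a line's changes cannot be collapsed into a single range.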
Although this merge happened to include all the branch changes, that
needn't be the case. For example, the next time we merge the B line
@example
@group
 _____     _____     _____     _____     _____      _____      _____
|     |   |     |   |     |   |     |   |     |    |     |    |     |
| A:1 |-->| A:2 |-->| A:4 |-->| A:6 |-->| A:8 |-.->| A:9 |-.->|A:11 |
|_____|   |_____|   |_____|   |_____|   |_____| |  |_____| |  |_____|
              \                                 /          |
               \                               /           |
                \  _____     _____     _____  /  _____     |
                 \|     |   |     |   |     |/  |     |    /
                  | B:3 |-->| B:5 |-->| B:7 |-->|B:10 |->-'
                  |_____|   |_____|   |_____|   |_____|
@end group
@end example
Subversion will know that A's ancestry set already contains B:3-7, so
only the difference between B:7 and B:10 will be applied. A's new
ancestry will be
@example
@group
@{ A:1-11, B:3-10 @}
@end group
@end example
But why limit ourselves to contiguous ranges? An ancestry set is truly
a set -- it can be any subset of the changes available:
@example
@group
 _____        _____        _____        _____        _____        _____
|     |      |     |      |     |      |     |      |     |      |     |
| A:1 |----->| A:2 |----->| A:4 |----->| A:6 |----->| A:8 |--.-->|A:10 |
|_____|      |_____|      |_____|      |_____|      |_____| /    |_____|
                |                                          /
                |                ______________________.__/
                |               /                      |
                |              /                       |
                \           __/_                      _|__
                 \         @{    @}                    @{    @}
                  \  _____        _____        _____        _____
                   \|     |      |     |      |     |      |     |
                    | B:3 |----->| B:5 |----->| B:7 |----->| B:9 |----->
                    |_____|      |_____|      |_____|      |_____|
@end group
@end example
In this diagram, the change from B:3-5 and the change from B:7-9 are
merged into a working copy whose ancestry set (so far) is @w{@{ A:1-8
@}} plus any local changes. After committing, A:10's ancestry set is
@example
@group
@{ A:1-10, B:5, B:9 @}
@end group
@end example
Clearly, saying ``Let's merge branch B into A'' is a little ambiguous. It
usually means ``Merge all the changes accounted for in B's tip into A'',
but it @emph{might} mean ``Merge the single change that resulted in B's
tip into A''.
Any merge, when viewed in detail, is an application of a particular set
of changes -- not necessarily adjacent ones -- to a working copy. The
user-level interface may allow some of these changes to be specified
implicitly. For example, many merges involve a single, contiguous range
of changes, with one or both ends of the range easily deducible from
context (i.e., branch root to branch tip). These inference rules are
not specified here, but it should be clear in most contexts how they
work.
Because each node knows its ancestors, Subversion never merges the same
change twice (unless you force it to). For example, if after the above
merge, you tell Subversion to merge all B changes into A, Subversion
will notice that two of them have already been merged, and so merge only
the other two changes, resulting in a final ancestry set of:
@example
@group
@{ A:1-10, B:3-9 @}
@end group
@end example
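In set terms, the changes a merge applies are the requested changes
minus those already in the target's ancestry set. The following Python
sketch replays the example above under that reading; the helpers
@code{merge} and @code{line} are invented for the illustration, and each
change is represented as a (line, number) pair.

@verbatim
```python
def merge(ancestry, requested):
    """Return (changes to apply, resulting ancestry set)."""
    to_apply = requested - ancestry      # skip changes already accounted for
    return to_apply, ancestry | to_apply


def line(name, numbers):
    """Build a set of changes such as ("B", 5) for one line of development."""
    return {(name, n) for n in numbers}


all_b = line("B", [3, 5, 7, 9])

# Ancestry of A:10 after the selective merge: { A:1-10, B:5, B:9 }.
a10 = line("A", [1, 2, 4, 6, 8, 10]) | line("B", [5, 9])

applied, result = merge(a10, all_b)
repeated, _ = merge(result, all_b)
```
@end verbatim

Only B:3 and B:7 are applied, the resulting set accounts for every B
change, and repeating the same merge applies nothing.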
@c Heh, what about this:
@c
@c B:3 adds line 3, with the text "foo".
@c B:5 deletes line 3.
@c B:7 adds line 3, with the text "foo".
@c B:9 deletes line 3.
@c
@c The user first merges B:5 and B:9 into A. If A had that line, it
@c goes away now, nothing more.
@c
@c Next, user merges B:3 and B:7 into A. The second merge must
@c conflict.
@c
@c I'm not sure we need to care about this, I just thought I'd note how
@c even merges that seem like they ought to be easily composable can
@c still suck. :-)
This description of merging and ancestry applies to both intra- and
inter-repository merges. However, inter-repository merging will
probably not be implemented until a future release of Subversion
(@pxref{Future}).