@node Repository Administration
@chapter Repository Administration
How to administer a Subversion repository.
In this section, we'll mainly focus on how to use the
@command{svnadmin} and @command{svnlook} programs to work with repositories.
@menu
* Creating a repository::
* Examining a repository::
* Repository hooks::
* Repository maintenance::
* Networking a repository::
* Migrating a repository::
* WebDAV::
@end menu
@c ------------------------------------------------------------------
@node Creating a repository
@section Creating a repository
Creating a repository is incredibly simple:
@example
$ svnadmin create path/to/myrepos
@end example
This creates a new repository in a subdirectory @file{myrepos}.
(Note that the @command{svnadmin} and @command{svnlook} programs
operate @emph{directly} on a repository, by linking to @file{libsvn_fs.so}.
So these tools expect ordinary, local paths to the repositories. This
is in contrast with the @command{svn} client program, which always
accesses a repository via some URL, whether that URL uses the
@url{http://} or the @url{file:///} scheme.)
A new repository always begins life at revision 0, which is defined to
be nothing but the root (@file{/}) directory.
As mentioned earlier, repository revisions can have unversioned
properties attached to them. In particular, every revision is created
with a @samp{svn:date} timestamp property. (Other common properties
include @samp{svn:author} and @samp{svn:log}.)
For a newly created repository, revision 0 has nothing but a
@samp{svn:date} property attached.
Here is a quick run-down of the anatomy of a repository:
@example
$ ls myrepos
conf/
dav/
db/
hooks/
locks/
@end example
@table @samp
@item conf
Currently unused; repository-side config files will go in here someday.
@item dav
If the repository is being accessed by Apache and mod_dav_svn, some
private housekeeping databases are stored here.
@item db
The main Berkeley DB environment, full of DB tables that comprise the
data store for libsvn_fs. This is where all of your data is! In
particular, most of your files' contents end up in the ``strings'' table.
Logfiles accumulate here as well, so transactions can be recovered.
@item hooks
Where pre-commit and post-commit hook scripts live. (And someday, read-hooks.)
@item locks
A single file lives here; repository readers and writers take out
shared locks on this file. Do not remove this file.
@end table
Once the repository has been created, it's very likely that you'll
want to use the svn client to import an initial tree. (Try
@samp{svn help import}, or see @ref{Other Commands}.)
You may want to give your repository an initial directory structure
that reflects the trunk, branches, and tags of your project(s)
(@pxref{Branches and Tags}). You can do this via @samp{svn mkdir}:
@example
$ svnadmin create /path/to/repos
$ svn mkdir file:///path/to/repos/projectA -m 'Base dir for A'
Committed revision 1.
$ svn mkdir file:///path/to/repos/projectA/trunk -m 'Main dir for A'
Committed revision 2.
$ svn mkdir file:///path/to/repos/projectA/branches -m 'Branches for A'
Committed revision 3.
$ svn mkdir file:///path/to/repos/projectA/tags -m 'Tags for A'
Committed revision 4.
$ svn co file:///path/to/repos/projectA/trunk projectA
Checked out revision 4.
# ... now work on projectA ...
@end example
With @samp{svn import}, you can create the structure with a single
commit:
@example
$ svnadmin create /path/to/repos
$ mkdir projectA
$ mkdir projectA/trunk
$ mkdir projectA/branches
$ mkdir projectA/tags
$ svn import file:///path/to/repos projectA projectA -m 'Dir layout for A'
Adding projectA/trunk
Adding projectA/branches
Adding projectA/tags
Committed revision 1.
$ rm -rf projectA/
$ svn co file:///path/to/repos/projectA/trunk projectA
Checked out revision 1.
# ... now work on projectA ...
@end example
@c ------------------------------------------------------------------
@node Examining a repository
@section Examining a repository
@subsection Transactions and Revisions
A Subversion repository is essentially a sequence of trees; each tree
is called a @dfn{revision}. (If this is news to you, it might be good
for you to read @ref{Transactions and Revision Numbers}.)
Every revision begins life as a @dfn{transaction} tree. When doing a
commit, a client builds a transaction that mirrors their local changes,
and when the commit succeeds, the transaction is effectively ``promoted''
into a new revision tree, and is assigned a new revision number.
At the moment, updates work in a similar way: the client builds a
transaction tree that is a ``mirror'' of their working copy. The
repository then compares the transaction tree with some revision tree,
and sends back a tree-delta. After the update completes, the
transaction is deleted.
Transaction trees are the only way to ``write'' to the repository's
versioned filesystem; all users of libsvn_fs will do this. However,
it's important to understand that the lifetime of a transaction is
completely flexible. In the case of updates, transactions are temporary
trees that are immediately destroyed. In the case of commits,
transactions are transformed into permanent revisions (or aborted if the
commit fails.) In the case of an error or bug, it's possible that a
transaction can be accidentally left lying around -- the libsvn_fs
caller might die before deleting it. And in theory, someday whole
workflow applications might revolve around the creation of transactions;
they might be examined in turn by different managers before being
deleted or promoted to revisions.
The point is: if you're administering a Subversion repository, you're
going to have to examine revisions and transactions. It's part of
monitoring the health of the repository.
@subsection @command{svnlook}
@command{svnlook} is a read-only@footnote{Why read-only? Because if a
pre-commit hook script changed the transaction before commit, the
working copy would have no way of knowing what happened, and would
therefore be out of sync and not know it. Subversion currently has no
way to handle this situation, and maybe never will.} tool that can be
used to examine the revision and transaction trees within a repository.
It's useful for system administrators, and can be used by the
@file{pre-commit} and @file{post-commit} hook scripts as well.
The simplest usage is:
@example
$ svnlook repos
@end example
This will print information about the HEAD revision in the repository
``repos.'' In particular, it will show the log message, author, date, and
a diagram of the tree.
To look at a particular revision or transaction:
@example
$ svnlook repos rev 522
$ svnlook repos txn 340
@end example
Or, if you only want to see certain types of information,
@command{svnlook} accepts a number of subcommands. For example,
@example
$ svnlook repos rev 522 log
$ svnlook repos rev 559 diff
@end example
Available subcommands are:
@table @samp
@item @samp{log}
Print the tree's log message.
@item @samp{author}
Print the tree's author.
@item @samp{date}
Print the tree's datestamp.
@item @samp{dirs-changed}
List the directories that changed in the tree.
@item @samp{changed}
List all files and directories that changed in the tree.
@item @samp{diff}
Print unified diffs of changed files.
@end table
@subsection The shell
The @command{svnadmin} tool has a toy ``shell'' mode as well. It doesn't
do much, but it allows you to poke around the repository as if it were
an imaginary mounted filesystem. The basic commands @samp{cd},
@samp{ls}, @samp{exit}, and @samp{help} are available, as well
as the very special command @samp{cr} -- ``change revision.'' The last
command allows you to move @emph{between} revision trees.
@example
$ svnadmin shell repos
<609: />$
<609: />$ ls
< 1.0.2i7> [ 601] 1 0 trunk/
<nh.0.2i9> [ 588] 0 0 branches/
<jz.0.18c> [ 596] 0 0 tags/
<609: />$ cd trunk
<609: /trunk>$ cr 500
<500: /trunk>$ ls
< 2.0.1> [ 1] 0 3462 svn_config.dsp
< 4.0.dj> [ 487] 0 3856 PORTING
< 3.0.cr> [ 459] 0 7886 Makefile.in
< d.0.ds> [ 496] 0 9736 build.conf
< 5.0.d9> [ 477] 1 0 ac-helpers/
< y.0.1> [ 1] 0 1805 subversion.dsp
@dots{}
<500: />$ exit
@end example
The output of @samp{ls} has only a few columns:
@example
NODE-ID CREATED-REV HAS_PROPS? SIZE NAME
< 1.0.2i7> [ 601] 1 0 trunk/
<nh.0.2i9> [ 588] 0 0 branches/
<jz.0.18c> [ 596] 0 0 tags/
@end example
@c ------------------------------------------------------------------
@node Repository hooks
@section Repository hooks
A @dfn{hook} is a program triggered by a repository read or write
access. The hook is handed enough information to tell what the action
is, what target(s) it's operating on, and who is doing it. Depending on
the hook's output or return status, the hook program may continue the
action, stop it, or suspend it in some way.
Subversion's hooks are programs that live in the repository's @file{hooks}
directory:
@example
$ ls repos/hooks/
post-commit.tmpl* read-sentinels.tmpl write-sentinels.tmpl
pre-commit.tmpl* start-commit.tmpl*
@end example
This is how the @file{hooks} directory appears after a repository is first
created. It doesn't contain any hook programs -- just templates.
The actual hooks need to be named @file{start-commit}, @file{pre-commit} and
@file{post-commit}. The template (.tmpl) files are example shell scripts to
get you started; read them for details about how each hook works. To
make your own hook, just copy @file{foo.tmpl} to @file{foo} and edit.
(The @file{read-sentinels} and @file{write-sentinels} are not yet implemented.
They are intended to be more like daemons than hooks. A sentinel is
started up at the beginning of a user operation. The Subversion
server communicates with the sentinel using a protocol yet to be
defined. Depending on the sentinel's responses, Subversion may stop
or otherwise modify the operation.)
Here is a description of the hook programs:
@table @samp
@item @file{start-commit}
This is run before the committer's transaction is even created. It is
typically used to decide if the user has commit privileges at all. The
repository passes two arguments to this program: the path to the
repository, and the name of the user attempting to commit. If the program
returns a non-zero exit value, the commit is stopped before the
transaction is even created.
@item @file{pre-commit}
This is run when the transaction is complete, but before it is
committed. Typically, this hook is used to protect against commits that
are disallowed due to content or location (for example, your site might
require that all commits to a certain branch include a ticket number
from the bug tracker, or that the incoming log message is
non-empty.)@footnote{At this time, this is the only method by which
users can implement finer-grained access control beyond what
@file{httpd.conf} offers. In a future version of Subversion, we plan to
implement ACLs directly in the filesystem.} The repository passes two
arguments to this program: the path to the repository, and the name of
the transaction being committed. If the program returns a non-zero exit
value, the commit is aborted and the transaction is removed.
The Subversion distribution includes a
@file{tools/hook-scripts/commit-access-control.pl} script that can be
called from @file{pre-commit} to implement fine-grained access control.
@item @file{post-commit}
This is run after the transaction is committed, and we have a new
revision. Most people use this hook to send out descriptive
commit-emails or to make a hot-backup of the repository. The repository
passes two arguments to this program: the path to the repository, and
the new revision number that was created. The exit code of the program
is ignored.
The Subversion distribution includes a
@file{tools/hook-scripts/commit-email.pl} script that can be used to
send out the differences applied in the commit to any number of email
addresses. Also included is @file{tools/backup/hot-backup.py}, which is
a script that performs hot backups of your Subversion repository after
every commit.
@end table
Note that the hooks must be executable by the user who will invoke them
(commonly the user httpd runs as), and that same user needs to be able
to access the repository.
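As an illustration of the @file{start-commit} interface described
above, here is a minimal sketch of a hook that admits only the users
named in a list file. The file @file{conf/committers} (one username
per line) is purely an assumption made for this example; Subversion
itself defines no such file. Run the commands from inside the
repository's @file{hooks} directory:
@example
cat > start-commit <<'EOF'
#!/bin/sh
# Hypothetical start-commit hook: admit only listed users.
# Arguments: $1 = path to the repository, $2 = committing username.
REPOS="$1"
COMMITTER="$2"
# Assumed list of committers, one username per line (not a Subversion file).
ALLOWED="$REPOS/conf/committers"
# -x matches whole lines only, -F treats the name as a literal string.
if grep -q -x -F -e "$COMMITTER" "$ALLOWED"; then
  exit 0
fi
echo "User $COMMITTER may not commit to this repository." >&2
exit 1
EOF
chmod +x start-commit
@end example
With this sketch, a missing or unreadable list file denies every
commit, which is the cautious failure mode for an access check.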
The @file{pre-commit} and @file{post-commit} hooks need to know things
about the change about to be committed (or that has just been
committed). The solution is a standalone program, @command{svnlook}
(@pxref{Examining a repository}), which is installed in the same place
as the @command{svn} binary. Have the script use @command{svnlook} to
examine a transaction or revision tree. It produces output that is both
human- and machine-readable, so hook scripts can easily parse it. Note
that @command{svnlook} is read-only -- it can only inspect, not change
the repository.
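For example, the non-empty log message policy mentioned earlier might
be sketched as follows. This is only an illustration: it assumes that
@samp{svnlook @var{repos} txn @var{txn} log} prints nothing but the
log message, and that @command{svnlook} is on the hook's search path.
Run the commands from inside the repository's @file{hooks} directory:
@example
cat > pre-commit <<'EOF'
#!/bin/sh
# Hypothetical pre-commit hook: refuse commits whose log message is empty.
# Arguments: $1 = path to the repository, $2 = name of the transaction.
REPOS="$1"
TXN="$2"
# Fetch the transaction's log message and delete all whitespace from it.
LOG=`svnlook "$REPOS" txn "$TXN" log | tr -d '[:space:]'`
# A non-zero exit status aborts the commit.
if [ -z "$LOG" ]; then
  echo "Commit rejected: the log message is empty." >&2
  exit 1
fi
exit 0
EOF
chmod +x pre-commit
@end example
Because the hook only reads the transaction through
@command{svnlook}, it cannot alter what is being committed; it can
only accept or refuse it.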
@c ------------------------------------------------------------------
@node Repository maintenance
@section Repository maintenance
@subsection Berkeley DB management
At the time of writing, the Subversion repository has only one
database back-end: Berkeley DB. All of your filesystem's structure
and data live in a set of tables within @file{repos/db/}.
Berkeley DB comes with a number of tools for managing these files, and
they have their own excellent documentation. (See
@uref{http://www.sleepycat.com/}, or just read man pages.) We won't
cover all of these tools here; rather, we'll mention just a few of the
more common procedures that repository administrators might need.
First, remember that Berkeley DB has genuine transactions. Every
attempt to change the DB is first logged. If anything ever goes
wrong, the DB can back itself up to a previous `checkpoint' and
replay transactions to get the data back into a sane state.
In our experience, we have seen situations where a bug in Subversion
(which causes a crash) can sometimes have a side-effect of leaving the
DB environment in a `locked' state. Any further attempts to read or
write to the repository just sit there, waiting on the lock.
To `unwedge' the repository:
@enumerate
@item
Shut down the Subversion server, to make sure nobody is accessing the
repository's Berkeley DB files.
@item
Switch to the user who owns and manages the database.
@item
Run the command @command{db_recover -v -h @var{repos}/db}, where
@var{repos} is the repository's directory name. You should see
output like this:
@example
db_recover: Finding last valid log LSN: file: 40 offset 4080873
db_recover: Checkpoint at: [40][4080333]
db_recover: Checkpoint LSN: [40][4080333]
db_recover: Previous checkpoint: [40][4079793]
db_recover: Checkpoint at: [40][4079793]
db_recover: Checkpoint LSN: [40][4079793]
db_recover: Previous checkpoint: [40][4078761]
db_recover: Recovery complete at Sun Jul 14 07:15:42 2002
db_recover: Maximum transaction id 80000000 Recovery checkpoint [40][4080333]
@end example
Make sure that the @command{db_recover} program you invoke is the one
distributed with the same version of Berkeley DB you're using in your
Subversion server.
@item
Restart the Subversion server.
@end enumerate
Make sure you run this command as the user that owns and manages the
database --- typically your Apache process --- and @emph{not} as root.
Running @command{db_recover} as root leaves files owned by root in the
@file{db} directory, which the non-root user that manages the database
cannot open. If you do this, you'll get ``permission denied'' error
messages when you try to access the repository.
Second, a repository administrator may need to manage the growth of
logfiles. At any given time, the DB environment is using at least one
logfile to log transactions; when the `current' logfile grows to 10
megabytes, a new logfile is started, and the old one continues to
exist.
Thus, after a while, you may see a whole group of 10MB logfiles lying
around the environment. At this point, you can make a choice: if you
leave every single logfile behind, it's guaranteed that
@command{db_recover} will always be able to replay every single DB
transaction, all the way back to the first commit. (This is the
`safe', or perhaps paranoid, route.) On the other hand, you can ask
Berkeley DB to tell you which logfiles are no longer being actively
written to:
@example
$ db_archive -a -h repos/db
log.0000000023
log.0000000024
log.0000000029
@end example
Subversion's own repository uses a @file{post-commit} hook script, which,
after performing a `hot-backup' of the repository, removes these
excess logfiles. (In the Subversion source tree, see
@file{tools/backup/hot-backup.py})
This script also illustrates the safe way to perform a backup of the
repository while it's still up and running: recursively copy the
entire repository directory, then re-copy the logfiles listed by
@samp{db_archive -l}.
To start using a repository backup that you've restored, be sure to
run the @samp{db_recover -v} command in the @file{db} area first.
This guarantees that any unfinished log transactions are fully played
before the repository goes live again. (The @file{hot-backup.py}
script does that for you during backup, so you can skip this step
if you decide to use it.)
Finally, note that Berkeley DB has a whole locking subsystem; in
extremely intensive svn operations, we have seen situations where the
DB environment runs out of locks. The maximum number of locks can be
adjusted by changing the values in the @file{repos/db/DB_CONFIG}
file. Don't change the default values unless you know what you're
doing; be sure to read
@uref{http://www.sleepycat.com/docs/ref/lock/max.html} first.
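As a rough illustration, the lock-related lines in @file{DB_CONFIG}
look something like the fragment below. The three directive names are
standard Berkeley DB configuration parameters; the values are
placeholders chosen for this example, not recommendations, and your
file may differ:
@example
set_lk_max_locks   2000
set_lk_max_lockers 2000
set_lk_max_objects 2000
@end example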
@subsection Tweaking with svnadmin
The @command{svnadmin} tool has some subcommands that are specifically
useful to repository administrators. Be careful with
@command{svnadmin}! Unlike @command{svnlook}, which is read-only,
@command{svnadmin} has the ability to modify the repository.
The most-used feature is probably @samp{svnadmin setlog}. A
commit's log message is an unversioned property directly attached to
the revision object; there's only one log message per revision.
Sometimes a user screws up the message, and it needs to be replaced:
@example
$ echo "Here is the new, correct log message" > newlog.txt
$ svnadmin setlog myrepos 388 newlog.txt
@end example
There's a nice CGI script in @file{tools/cgi/} that allows people
(with commit-access passwords) to tweak existing log messages via web
browser.
Another common use of @command{svnadmin} is to inspect and clean up
old, dead transactions. Commits and updates both create transaction
trees, but occasionally a bug or crash can leave them lying around.
By inspecting the datestamp on a transaction, an administrator can
make a judgment call and remove it:
@example
$ svnadmin lstxns myrepos
319
321
$ svnadmin lstxns --long myrepos
Transaction 319
Created: 2002-07-14T12:57:22.748388Z
@dots{}
$ svnadmin rmtxns myrepos 319 321
@end example
@c ### Hey guys, are going to continue to support @samp{svnadmin undeltify}??
Another useful subcommand: @samp{svnadmin undeltify}. Remember
that the latest version of each file is stored as fulltext in the
repository, but that earlier revisions of files are stored as ``deltas''
against the next-most-recent revision. When a user attempts to
access an earlier revision, the repository must apply a sequence of
backwards-deltas to the newest fulltexts in order to derive the older
data.
If a particular revision tree is extremely popular, the administrator
can speed up the access time to this tree by ``undeltifying'' any path
within the revision -- that is, by converting every file to fulltext:
@example
$ svnadmin undeltify myrepos 230 /project/tags/release-1.3
Undeltifying `/project/tags/release-1.3' in revision 230...done.
@end example
@c ------------------------------------------------------------------
@node Networking a repository
@section Networking a repository
Okay, so now you have a repository, and you want to make it available
over a network.
Subversion's primary network server is Apache httpd speaking the
WebDAV/DeltaV protocol, which is a set of extension methods to HTTP.
(For more information on DAV, see @uref{http://www.webdav.org/}.)
To network your repository, you'll need to
@itemize @bullet
@item
get Apache httpd 2.0 up and running with the mod_dav module
@item
install the mod_dav_svn plugin to mod_dav, which uses
Subversion's libraries to access the repository
@item
configure your @file{httpd.conf} file to export the repository
@end itemize
You can accomplish the first two items by either building httpd and
Subversion from source code, or by installing binary packages on
your system. The second appendix of this document contains more
detailed instructions on doing this. (@xref{Compiling and
installing}.) Instructions are also available in the @file{INSTALL}
file in Subversion's source tree.
In this section, we focus on configuring your @file{httpd.conf}.
Somewhere near the bottom of your configuration file, define a new
@samp{Location} block:
@example
<Location /repos/myrepo>
DAV svn
SVNPath /absolute/path/to/myrepo
</Location>
@end example
This now makes your @file{myrepo} repository available at the URL
@url{http://hostname/repos/myrepo}.
Alternately, you can use the @samp{SVNParentPath} directive to
indicate a ``parent'' directory whose immediate subdirectories are
assumed to be independent repositories:
@example
<Location /repos>
DAV svn
SVNParentPath /absolute/path/to/parent/dir
</Location>
@end example
If you were to run @samp{svnadmin create foorepo} within this parent
directory, then the URL @url{http://hostname/repos/foorepo} would
automatically be accessible without having to change @file{httpd.conf}
or restart httpd.
Note that this simple @samp{<Location>} setup starts life with no
access restrictions at all:
@itemize @bullet
@item
Anyone can use their svn client to check out a working copy of either a
repository URL or any URL that corresponds to a subdirectory of a
repository.
@item
By pointing an ordinary web browser at a repository URL, anyone can
interactively browse the repository's latest revision.
@item
Anyone can commit to a repository.
@end itemize
If you want to restrict either read or write access to a repository as
a whole, you can use Apache's built-in access control features.
First, create an empty file that will hold httpd usernames and
passwords. Place names and crypted passwords into this file like so:
@example
joe:Msr3lKOsYMkpc
frank:Ety6rZX6P.Cqo
mary:kV4/mQbu0iq82
@end example
You can generate the crypted passwords by using the standard
@samp{crypt(3)} library function, or by using the @command{htpasswd} tool
supplied in Apache's @file{bin} directory:
@example
$ /usr/local/apache2/bin/htpasswd -n sussman
New password:
Re-type new password:
sussman:kUqncD/TBbdC6
@end example
Next, add lines within your @samp{<Location>} block that point to the
user file:
@example
AuthType Basic
AuthName "Subversion repository"
AuthUserFile /path/to/users/file
@end example
If you want to restrict @emph{all} access to the repository, add one
more line:
@example
Require valid-user
@end example
This line makes Apache require user authentication for every single
type of http request to your repository.
To restrict write-access only, you need to require a valid user for
all request methods @emph{except} those that are read-only:
@example
<LimitExcept GET PROPFIND OPTIONS REPORT>
Require valid-user
</LimitExcept>
@end example
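Putting the pieces together, a complete block that leaves reads open
but requires authentication for writes might look like the following
sketch. It merely combines the fragments shown so far; the paths are
the same placeholders used above:
@example
<Location /repos/myrepo>
DAV svn
SVNPath /absolute/path/to/myrepo
AuthType Basic
AuthName "Subversion repository"
AuthUserFile /path/to/users/file
<LimitExcept GET PROPFIND OPTIONS REPORT>
Require valid-user
</LimitExcept>
</Location>
@end example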
Or, if you want to get fancy, you can define two separate groups in a
group file, one for readers and one for writers, and refer to them by name:
@example
AuthGroupFile /my/svn/group/file
<LimitExcept GET PROPFIND OPTIONS REPORT>
Require group svn_committers
</LimitExcept>
<Limit GET PROPFIND OPTIONS REPORT>
Require group svn_committers
Require group svn_readers
</Limit>
@end example
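The file named by @samp{AuthGroupFile} lists each group on a line of
its own, followed by the members of that group. A minimal sketch,
reusing the usernames from the password file above (the group
assignments are made up for this example):
@example
svn_committers: joe frank
svn_readers: mary
@end example
Every name listed here also needs an entry in the file named by
@samp{AuthUserFile}, since that is where the passwords are checked.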
These are only a few simple examples. For a complete tutorial on
Apache access control, please consider taking a look at the
``Security'' tutorials found at
@uref{http://httpd.apache.org/docs-2.0/misc/tutorials.html}.
Another note: in order for @samp{svn cp} to work (which is actually
implemented as a DAV COPY request), mod_dav needs to be able to
determine the hostname of the server. A standard way of doing
this is to use Apache's ServerName directive to set the server's
hostname. Edit your @file{httpd.conf} to include:
@example
ServerName svn.myserver.org
@end example
If you are using virtual hosting through Apache's @samp{NameVirtualHost}
directive, you may need to use the @samp{ServerAlias} directive to specify
additional names that your server is known by.
(If you are unfamiliar with an Apache directive, or not exactly sure
about what it does, don't hesitate to look it up in the documentation:
@uref{http://httpd.apache.org/docs-2.0/mod/directives.html}.)
You can test your exported repository by firing up httpd:
@example
$ /usr/local/apache2/bin/apachectl stop
$ /usr/local/apache2/bin/apachectl start
@end example
Check @file{/usr/local/apache2/logs/error_log} to make sure it started up
okay. Try doing a network checkout from the repository:
@example
$ svn co http://localhost/repos/myrepo wc
@end example
The most common reason this might fail is permission problems reading
the repository db files. Make sure that the user ``nobody'' (or
whatever UID the httpd process runs as) has permission to read and
write the Berkeley DB files! This is a very common problem.
You can see all of mod_dav_svn's complaints in the Apache error
logfile, @file{/usr/local/apache2/logs/error_log}, or wherever you
installed Apache. For more information about tracing problems, see
``Debugging the server'' in the @file{HACKING} file.
@c ------------------------------------------------------------------
@node Migrating a repository
@section Migrating a repository
Sometimes special situations arise where you need to move all of your
filesystem data from one repository to another. Perhaps the internal
fs database schema has changed in some way in a new release of
Subversion, or perhaps you'd like to start using a different database
``back end''.
Either way, your data needs to be migrated to a new repository. To do
this, we have the @samp{svnadmin dump} and @samp{svnadmin load}
commands.
@samp{svnadmin dump} writes a stream of your repository's data to
stdout:
@example
$ svnadmin dump myrepos > dumpfile
* Dumped revision 0.
* Dumped revision 1.
* Dumped revision 2.
@dots{}
@end example
This stream describes every revision in your repository as a list of
changes to nodes. It's mostly human-readable text; but when a file's
contents change, the entire fulltext is dumped into the stream. If
you have binary files or binary property-values in your repository,
those parts of the stream may be unfriendly to human readers.
After dumping your data, you would then move the file to a different
system (or somehow alter the environment to use a different version of
@command{svnadmin} and/or @file{libsvn_fs.so}), and create a
``new''-style repository that has a new schema or DB back-end:
@example
$ svnadmin create newrepos
@end example
The @samp{svnadmin load} command attempts to read a dumpstream from
stdin, and effectively replays each commit:
@example
$ svnadmin load newrepos < dumpfile
<<< Started new txn, based on original revision 1
* adding path : A ... done.
* adding path : A/B ... done.
@dots{}
------- Committed new rev 1 (loaded from original rev 1) >>>
<<< Started new txn, based on original revision 2
* editing path : A/mu ... done.
* editing path : A/D/G/rho ... done.
------- Committed new rev 2 (loaded from original rev 2) >>>
@end example
Voila, your revisions have been recommitted into the new repository.
And because @command{svnadmin} uses standard input and output streams for
the repository dump and load process, people who are feeling saucy with
Unix can try things like this:
@example
$ svnadmin create newrepos
$ svnadmin dump myrepos | svnadmin load newrepos
@end example
@subsection Partial dump/load
You can also create a dumpfile that represents a specific range of
revisions. @command{svnadmin dump} takes optional starting and ending
revisions to accomplish just that task.
@example
$ svnadmin dump myrepos 23 > rev-23.dumpfile
$ svnadmin dump myrepos 100 200 > revs-100-200.dumpfile
@end example
Now, regardless of the range of revisions used when dumping the
repository, the default behavior is for the first revision dumped to
always be compared against revision 0, which is just the empty root
directory @file{/}. This means that the first revision in any dumpfile
will always look like a gigantic list of ``added'' nodes. We do this so
that a file like @file{revs-100-200.dumpfile} can be directly loaded
into an empty repository.
However, if you add the @option{--incremental} option when you dump your
repository, this tells @command{svnadmin} to compare the first dumped
revision against the previous revision in the repository, the same way
it treats every other revision that gets dumped. The benefit of this is
that you can create several small dumpfiles that can be loaded in
succession, instead of one large one, like so:
@example
$ svnadmin dump myrepos 0 1000 > dumpfile1
$ svnadmin dump myrepos 1001 2000 --incremental > dumpfile2
$ svnadmin dump myrepos 2001 3000 --incremental > dumpfile3
@end example
These dumpfiles could be loaded into a new repository with the following
command sequence:
@example
$ svnadmin load newrepos < dumpfile1
$ svnadmin load newrepos < dumpfile2
$ svnadmin load newrepos < dumpfile3
@end example
Another neat trick you can perform with this @option{--incremental}
option involves appending to an existing dumpfile a new range of
revisions. For example, you might have a post-commit hook that simply
appends the repository dump of the single revision that triggered the
hook. Or you might have a script like the following that runs nightly
to append dumpfile data for all the revisions that were added to the
repository since the last time the script ran.
@example
#!/usr/bin/perl
$repos_path = '/path/to/repos';
$dumpfile = '/usr/backup/svn-dumpfile';
$last_dumped = '/var/log/svn-last-dumped';
# Figure out the starting revision (0 if we cannot read the last-dumped file,
# else use the revision in that file incremented by 1).
if (open LASTDUMPED, "$last_dumped")
@{
$new_start = <LASTDUMPED>;
chomp $new_start;
$new_start++;
close LASTDUMPED;
@}
else
@{
$new_start = 0;
@}
# Query the youngest revision in the repos.
$youngest = `svnadmin youngest $repos_path`;
chomp $youngest;
# Nothing to do if no revisions were added since the last run.
exit 0 if $new_start > $youngest;
# Do the backup.
`svnadmin dump $repos_path $new_start $youngest --incremental >> $dumpfile`;
# Store a new last-dumped revision
open LASTDUMPED, "> $last_dumped" or die;
print LASTDUMPED "$youngest\n";
close LASTDUMPED;
# All done!
@end example
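The @file{post-commit} variant mentioned above can be sketched in a
similar way. This is a minimal illustration, not a tested backup
scheme: the dumpfile location is an assumption borrowed from the
script above, and the hook relies on the single-revision form of
@samp{svnadmin dump} combined with @option{--incremental}. Run the
commands from inside the repository's @file{hooks} directory:
@example
cat > post-commit <<'EOF'
#!/bin/sh
# Hypothetical post-commit hook: append each new revision to a dumpfile.
# Arguments: $1 = path to the repository, $2 = new revision number.
REPOS="$1"
REV="$2"
# Assumed dumpfile location; the hook's user must be able to write to it.
DUMPFILE=/usr/backup/svn-dumpfile
# Dump only the revision just committed, as a delta against its predecessor.
svnadmin dump "$REPOS" "$REV" --incremental >> "$DUMPFILE"
EOF
chmod +x post-commit
@end example
Use either this hook or the nightly script on a given dumpfile, not
both; otherwise the same revisions would be appended twice.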
As you can see, the Subversion repository dumpfile format, and specifically
@command{svnadmin}'s use of that format, can be a valuable means by
which to back up changes to your repository over time in case of a system
crash or some other catastrophic event.
@c ------------------------------------------------------------------
@node WebDAV
@section WebDAV
Subversion uses WebDAV (Distributed Authoring and Versioning) as its
primary network protocol, and here we discuss what this means to you,
both present and future.
WebDAV was designed to make the web into a read/write medium, instead
of a read-only medium (as it mainly exists today.) The theory is that
directories and files can be shared over the web, using standardized
extensions to HTTP. RFC 2518 describes the WebDAV extensions to HTTP,
and is available (along with a lot of other useful information) at
@uref{http://www.webdav.org/}.
Already, a number of operating system file-browsers are able to mount
networked directories using WebDAV. On Win32, the Windows Explorer
can browse what it calls ``WebFolders'', just like any other share.
Mac OS X also has this capability, as does the Nautilus browser for
GNOME.
However, RFC 2518 doesn't fully implement the ``versioning'' aspect of
WebDAV. A separate committee has created RFC 3253, known as the
@dfn{DeltaV} extensions to WebDAV, available at
@uref{http://www.webdav.org/deltav/}. These extensions add
version-control concepts to HTTP, and this is what Subversion uses.
It's important to understand that while Subversion uses DeltaV for
communication, the Subversion client is @emph{not} a general-purpose
DeltaV client. In fact, it expects some custom features from the
server. Further, the Subversion server is not a general-purpose DeltaV
server. It implements a strict subset of the DeltaV specification. A
WebDAV or DeltaV client may very well be able to interoperate with it,
but only if that client operates within the narrow confines of those
features the server has implemented. Future versions of Subversion
will address more complete WebDAV interoperability.
At the moment, most DAV browsers and clients do not yet support
DeltaV; this means that a Subversion repository can be viewed or mounted
only as a read-only resource. (An HTTP ``PUT'' request is valid when
sent to a WebDAV-only server, but a DeltaV server such as mod_dav_svn
will not allow it. The client must use special version-control
methods to write to the server.) And on the flip side, a Subversion
client cannot check out a working copy from a generic WebDAV server; it
expects a specific subset of DeltaV features.
For a detailed description of Subversion's WebDAV implementation, see
@uref{http://svn.collab.net/repos/svn-repos/trunk/www/webdav-usage.html}.