The Problem We're Solving
-------------------------
Subversion users typically edit sets of files in their working copy.
If a working copy contains a set of edited files which represents a
single logical change, then commands like 'svn diff', 'svn status',
'svn revert' and 'svn commit' automatically discover the edited files
and act on them.
A common problem, however, is that users often work on more than one
set of logical changes at a time. The user is required to remember
which edited file belongs to which set, and carefully run 'diff',
'revert', or 'commit' commands only on lists of files which belong
together.
One workaround for this problem is to checkout multiple working
copies, and have one task per working copy. Of course, this uses a
lot of disk space, and it's sometimes inconvenient to move around
between working copies.
The simple solution we're proposing here is to teach the svn client
(and working copy) to do some simple management of local, human-named
sets of files, known as 'changesets'. The goal is to allow users to
create, view, and manipulate sets of files in a working copy by
referring to them by name.
Doesn't Perforce Do This?
-------------------------
Perforce performs changelist management, and it's a large motivation
for this new feature. But there's no way to emulate Perforce's
feature exactly; it has a different network model than Subversion. So
instead, we'll examine the use-cases that Perforce enables, and
discuss how to solve those same use-cases in Subversion.
Non-Problems
------------
Here are problems/features that are NOT in our list of goals:
* Server management of changesets
Subversion prides itself on being disconnected; that's why it
scales so well. A changeset is an ephemeral thing created by a
single user in a single working copy, whose only purpose is to
make it easier to manipulate a change-in-progress. It's not a
"named revision", or a long-lived object in the repository.
That's what global revision numbers are for.
Some people aren't happy with the way tags work in Subversion, and
have asked for the ability to identify repository revisions by
human name. While everybody wants to see the ability to search
over revprops (for many reasons!), that whole issue is out of
scope for this changelist feature. It's been suggested that when
a changelist gets committed, it become a searchable revprop;
sounds fine, but let's get changelists and searchable revprops
implemented independently first!
* Enforcement of groupings
Changesets don't exist as a prescriptive SCM process. Some have
suggested that the client not allow people to commit individual
files in a changelist, or to do some sort of server-side process
enforcement revolving around changelists. This is definitely not
in the Subversion spirit, which allows teams to create whatever
policies they wish. The only purpose of changelists is to
provide some convenient bookkeeping to the user.
* Overlapping changelists
A number of people ask "but what if two different changes within a
single file belong to different logical changes?" My reply is:
either "tough luck" or "don't do that" or "checkout a separate
working copy". My feeling is that trying to create a UI to
manipulate individual diff-hunks within a file is a HUGE can of
worms, probably best suited for a GUI. While I wouldn't rule it
out as a future *enhancement* to a changelist feature, it's
certainly not worth the initial effort in the first draft of
changelist management. Overlapping changelists do occasionally
happen, but they're rare enough that it's not worth spending 90%
of our time on a 10% case -- at least not in the beginning.
* "Shelving" of changes
Distributed version control systems don't have this sort of
problem; one could just do a 'local commit' of each
changeset-in-progress, create local branches, and magically swap
patches in and out as needed. To that end, many have talked about
making subversion working copies into "deep" objects containing
some degree of history, or to write a nice 'svn patch' command to
read custom 'svn diff' output. My response is: nice ideas, and
those sort of really advanced designs are certainly things that
simple changelist management can grow to take advantage of, but
aren't prerequisites for tackling this problem.
Use-Cases
---------
A. Define a changelist by explicitly adding/removing paths to it.
B. See all existing changelist names (and their member paths)
C. Destroy a changelist definition all at once.
D. Examine all edits within a changelist (svn diff)
E. Revert all edits within a changelist (svn revert)
F. Commit all edits within a changelist (svn commit)
G. Receive server changes only on paths within a changelist (svn update)
H. See the history of all paths within a changelist (svn log)
I. Fetch or set props on every path within a changelist (svn pl/ps/pe/pg/pd)
How Perforce Tackles the Use-Cases
----------------------------------
A. Defining changelists
The Perforce server tracks each and every working copy, as well as
every changelist within every working copy. All working copy files
are read-only until the user declares the intent to edit ('p4 edit')
one. The server then makes the file read-write and places it into a
changelist with the name 'default'.
Users aren't allowed to invent their own names for changelists, as
this might lead to namespace overlaps. (This is a side effect of
having the server track all changelists.) 'p4 change' creates a new
changelist by prompting the user for a log message, at which point the
server yanks the 'next' global revision number and assigns it
as a name for the changelist. The server not only tracks the
changelist via some number, but also tracks the
log-message-in-progress for the list. ('p4 describe' can show the log
message attached to a changelist.)
B. Viewing changelists
At any time, the 'p4 opened' command shows all files that are being
edited, and which changelists they belong to. It's quite similar to
the 'svn status' command, except that the output is somewhat harder to
read, due to non-aligned columns.
The response time is also quite fast, since p4 doesn't need to crawl
the working copy to discover edited files. On the other hand, p4
doesn't scale so well when the server tries to track thousands of
users.
C. Destroying changelists
'p4 change -d' will delete a changelist, but only if the edited files
within the changelist have been reverted.
D. Viewing edits in a changelist
'p4 diff' shows contextual diffs for all edited files. This is
actually a bit weak, as it shows diffs for *all* changelists in a
working copy. Subversion should improve on this by allowing one to
'diff' just a single changelist.
E. Reverting a changelist
'p4 revert -c NNN' reverts all edited files within changelist #NNN.
Note that it's also possible to revert single files ('p4 revert
foo.c'). If a single file within a changelist is reverted, its path
is removed from the changelist.
F. Committing a changelist
'p4 submit -c NNN' atomically commits changelist #NNN to the
repository. If the commit succeeds, a *new* global revision number is
assigned to the final commit, and the old 'NNN' number is discarded.
(This means that p4 actually burns through global revnums at twice the
speed of subversion!) After the commit, the working copy no longer
has any record of the changelist.
G. Updating a changelist
'p4 sync' is equivalent to 'svn up'. Like subversion, 'p4 sync' can
be restricted to specific path targets, but amazingly not restricted
to a set of paths that make up a changelist. This may be something
subversion can improve upon.
H. Examining the history of changelist members
'p4 changes' is the closest thing to 'svn log'. With no arguments, it
shows all changelists ever submitted. With specific path arguments,
it limits the response to showing only changelists that affected those
paths. Again, a changelist number cannot be supplied, which is
surprising.
I. Propgets/sets on a changelist
Perforce has no versioned metadata.
Proposal for Subversion's Tackling of Use-Cases
-----------------------------------------------
A. Defining changelists
Subversion's changelist feature will be entirely client-side
bookkeeping. The purpose is to allow users to 'talk about' a set of
local paths via a convenient name, often restricting subcommands to
operate only on those paths.
The 'svn changelist' command allows a user to define a changelist with
an arbitrary UTF-8 name, as well as add member paths. (At the moment,
a --remove flag is used to remove member paths.) Unversioned items may
not be added to changelists.
$ svn changelist mychange foo.c bar.c
Path 'foo.c' is now part of changelist 'mychange'.
Path 'bar.c' is now part of changelist 'mychange'.
$ svn changelist bar.c --remove
Path 'bar.c' is no longer associated with a changelist.
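
The bookkeeping described above can be pictured as a small map from path
to changelist name. The following is only a sketch under assumptions: the
class name and its methods are hypothetical, and it assumes each path
carries at most one changelist name, so that naming a new changelist for a
path moves it. It is not the working-copy implementation.

```python
# Hypothetical in-memory model of client-side changelist bookkeeping.
class ChangelistBook:
    def __init__(self, versioned_paths):
        self.versioned = set(versioned_paths)
        self.membership = {}  # path -> changelist name (at most one per path)

    def add(self, name, *paths):
        for path in paths:
            # Unversioned items may not be added to changelists.
            if path not in self.versioned:
                raise ValueError(f"'{path}' is not under version control")
            self.membership[path] = name  # reassigning moves the path

    def remove(self, *paths):
        for path in paths:
            self.membership.pop(path, None)

    def members(self, name):
        return sorted(p for p, cl in self.membership.items() if cl == name)

    def names(self):
        # A changelist exists only while it still has members.
        return sorted(set(self.membership.values()))


book = ChangelistBook(["foo.c", "bar.c", "baz.c"])
book.add("mychange", "foo.c", "bar.c")
book.remove("bar.c")
print(book.members("mychange"))
```

Because a changelist is nothing more than the set of paths that carry its
name, there is no separate "create" or "delete" step in this model.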
### Open question: should we add a UI which allows the working copy to
manage a log-message-in-progress for each changelist, the way p4
does? This could be something stored in ~/.subversion/ area.
B. Viewing changelists
'svn status' currently shows changelist definitions by crawling the
working copy. Output is much more readable than perforce's, because
we're still preserving column alignment.
$ svn st
? 1.2-backports.txt
M notes/wc-improvements
--- Changelist 'status-cleanup':
M subversion/svn/main.c
subversion/svn/revert-cmd.c
M subversion/svn/info-cmd.c
--- Changelist 'status-printing':
M subversion/svn/status-cmd.c
Note that unlike perforce, changelist membership is orthogonal to
whether or not the file has local modifications. So it's possible for
'svn status' to show a changelist containing unmodified files.
Conversely, it's possible for a file to be modified, but unassociated
with any changelist.
'svn status' considers changelist membership to be inherently
"interesting enough" to justify displaying a path, regardless of
whether it's modified.
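
As a rough illustration of that display rule, the sketch below groups
entries the way the sample output does. The function name, the tuple
layout, and the simplified single status column are assumptions made for
the example, and 'untouched.c' is a made-up path.

```python
from collections import defaultdict


def status_lines(entries):
    """entries: iterable of (path, status_char, changelist_or_None)."""
    ungrouped = []
    grouped = defaultdict(list)
    for path, status, changelist in entries:
        if changelist is not None:
            grouped[changelist].append((status, path))  # always displayed
        elif status != " ":
            ungrouped.append((status, path))  # displayed only if interesting
    lines = [f"{status}      {path}" for status, path in ungrouped]
    for name in sorted(grouped):
        lines.append(f"--- Changelist '{name}':")
        lines.extend(f"{status}      {path}" for status, path in grouped[name])
    return lines


entries = [
    ("1.2-backports.txt", "?", None),
    ("notes/wc-improvements", "M", None),
    ("untouched.c", " ", None),  # unmodified, no changelist: hidden
    ("subversion/svn/main.c", "M", "status-cleanup"),
    ("subversion/svn/revert-cmd.c", " ", "status-cleanup"),  # unmodified member
    ("subversion/svn/status-cmd.c", "M", "status-printing"),
]
print("\n".join(status_lines(entries)))
```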
Note that merely upgrading subversion won't break scripts that parse
'svn status' output. Such scripts might break *only* if users begin
to use the new changelist feature. This is a good balance between
allowing subversion's development to progress, while not automatically
punishing users for upgrading. (Either way, the "---" characters
should prevent scripts from accidentally detecting conflicts with "^C"
regular expressions.)
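
To make the point about "^C" concrete, here is a small check of the kind a
status-parsing script might run. The sample lines are illustrative, and
'conflicted.c' is a made-up path.

```python
import re

# A typical script-side test for conflicted paths in 'svn status' output.
conflict = re.compile(r"^C")

sample = [
    "--- Changelist 'status-cleanup':",  # group header from the proposal
    "M      subversion/svn/main.c",
    "C      conflicted.c",               # hypothetical conflicted path
]
flagged = [line for line in sample if conflict.search(line)]
print(flagged)

# Without the leading dashes, the header itself would look like a conflict.
print(bool(conflict.search("Changelist 'status-cleanup':")))
```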
### Open question: at the moment, changelists are implemented by
simply storing a new attribute in the .svn/entries file. Rather
than having the svn client crawl and 'discover' changelists,
should we take a hint from p4 and have them centrally managed in
the ~/.subversion/ area?
Pros:
- much faster than crawling
- whole changelist definition available, regardless of CWD
Cons:
- breaks the 'portable WC' ideal. (If WC moves to another box,
changelist definition is lost.)
### Open question: should 'svn status' be able to restrict its output
to a single changelist, a la 'svn status --changelist mychange'?
C. Destroying changelists
Commands can be restricted to operate only on changelist members by
specifying the "--changelist NAME" flag. (Perhaps it can be shortened
to '--cl' also?)
To destroy a changelist, one would need to remove all member-paths
from it. There's no good UI for this yet, other than to use 'svn
changelist --remove path1 path2 path3 ...'. ### Improve this?
D. Viewing edits in a changelist
Improve on perforce by allowing 'svn diff' to restrict its output to
only members of a certain changelist:
$ svn diff --changelist mychange
[...]
E. Reverting a changelist
Allow 'svn revert' to restrict its effect just to members of a
changelist:
$ svn revert --changelist mychange
[...]
Again, note that this won't destroy the changelist. The changelist
would now contain just a set of unmodified paths, and 'svn status'
would continue to display them. (This differs from perforce, whereby
local-edits are intimately tied to changelist membership.)
F. Committing a changelist
'svn commit' should be able to commit only changelist members, just as
if the paths had been typed on the commandline individually:
$ svn commit --changelist mychange
Modifying foo.c
Adding bar.c
[...]
Committed revision YYY.
After the commit succeeds, the committed files are NO LONGER
associated with the changelist, and so the changelist definition
ceases to exist. (Note: we probably want to have a switch to
'preserve the changelist' after a commit, similar to the way in which
the '--no-unlock' switch preserves locks after a commit.)
If the user chooses to commit just a single member of a changelist,
that member is removed from the changelist after the commit.
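
The post-commit rule can be modelled as a pure function over the
path-to-changelist map. This is a sketch only: the function name is
hypothetical, and the 'keep_changelists' parameter stands in for the
proposed "preserve the changelist" switch.

```python
def membership_after_commit(membership, committed, keep_changelists=False):
    """Return the path -> changelist map as it looks after a commit."""
    if keep_changelists:
        return dict(membership)  # the switch preserves every association
    done = set(committed)
    # Committed paths lose their association; other members keep theirs.
    return {path: cl for path, cl in membership.items() if path not in done}


before = {"foo.c": "mychange", "bar.c": "mychange"}
whole = membership_after_commit(before, ["foo.c", "bar.c"])
partial = membership_after_commit(before, ["foo.c"])
kept = membership_after_commit(before, ["foo.c", "bar.c"], keep_changelists=True)
print(whole, partial, kept)
```

Committing every member leaves no path carrying the name, which is what
"the changelist definition ceases to exist" amounts to in this model.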
G. Updating a changelist
### Open question: is this a useful use-case? Perforce doesn't have
it, and I've never missed it. I always want to update the entire
working copy, not just some small set of files.
H. Examining the history of changelist members
'svn log' should be able to restrict its history retrieval to only
revisions which affected members of the changelist. So running
$ svn log --changelist mychange
...should produce output equivalent to
$ svn log member1 member2 member3 ...
'svn log' already knows not to print log messages more than once
(i.e. it prints the union of all revisions).
Note that this feature would be an improvement over perforce, which
allows multiple targets on the commandline, but no changelist
shorthand for them.
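
The "union of all revisions" behaviour is easy to state in code. In this
sketch the per-path history is a plain dictionary invented for the
example; the real client obtains it from the repository.

```python
def changelist_log(history, members):
    """history: dict mapping path -> revision numbers that touched it."""
    revisions = set()
    for path in members:
        revisions.update(history.get(path, []))
    # Each revision appears once, newest first as 'svn log' prints by default.
    return sorted(revisions, reverse=True)


history = {
    "member1": [3, 7, 12],
    "member2": [7, 9],
    "member3": [12],
}
print(changelist_log(history, ["member1", "member2", "member3"]))
```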
I. Propgets/sets on a changelist
'svn proplist', 'svn propget', 'svn propset', 'svn propdel' should all
work with the --changelist switch as well, so that a user can quickly
perform metadata operations on a whole set of files.
### Open question: should we also allow 'svn lock/unlock' to operate
on changelists? It might be just as convenient in certain
scenarios.
----------------
### Open UI question:
If one's CWD is deep within a working copy, how should
$ svn subcommand --changelist mychange
...behave? Should it operate on *all* members of the changelist,
or only those members within the CWD (and recursively "below")?
--> malcolmr and dlr believe that it's perfectly fine to use only
parts of changelists 'below' the target path.
--------------------
==> Finished items:
* svn changelist [--remove]
* svn status shows grouped changelists
- 'svn status --changelist' works too
* 'svn info' shows changelists
* svn commit --changelist
* svn revert --changelist
* svn log --changelist
* svn diff --changelist (wc-wc and wc-repos cases)
* svn update --changelist
* svn lock/unlock --changelist
* svn propget/propset --changelist
### * svn proplist/propdel --changelist
==> TO-DO:
* make --cl the same as --changelist, for convenience?
* questions about commits:
- how does 'svn ci --changelist' interact with nonrecursive commits?
- how does it interact with a list of specific targets?
- how does it deal with a schedule-delete folder?
----------------------------
Commandline UI use-cases:
1. add path(s) to a CL:
svn cl CLNAME foo.c bar.c baz.c
2. remove path(s) from whatever CLs they each belong to.
svn cl --remove foo.c bar.c baz.c
3. move path(s) from CL1 to CL2.
svn cl CL2 foo.c
4. undefine a CL all at once (by removing all members)
svn cl --remove --changelist CLNAME
5. rename a CL
svn cl NEWNAME --changelist OLDNAME
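
Use-cases 4 and 5 reduce to simple rewrites of a path-to-changelist map.
The helper names below are hypothetical and the map is an illustrative
stand-in for the working-copy metadata.

```python
def undefine(membership, name):
    """Use-case 4: drop every member of changelist `name`."""
    return {path: cl for path, cl in membership.items() if cl != name}


def rename(membership, old, new):
    """Use-case 5: move every member of `old` into `new`."""
    return {path: (new if cl == old else cl) for path, cl in membership.items()}


membership = {"foo.c": "CL1", "bar.c": "CL1", "baz.c": "CL2"}
print(undefine(membership, "CL1"))
print(rename(membership, "CL1", "NEWNAME"))
```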
==================================================================
Feature Revamp: sussman and cmpilato.
Goal: changelists should be treated as 'filters' everywhere, not as a
way to just add targets to a commandline.
The basic syntax of commands will be:
svn subcommand target1 target2 ... targetN \
--changelist foo1 --changelist foo2 ... --changelist fooM
The CLI parses the targets as usual: possibly inserting an implicit
'.' target, canonicalizing the list, etc.
The CLI now passes a list of changelist-names down into each
svn_client_subcommand() routine as a "bunch of filters" to apply while
working. If svn_client_subcommand() decides to process a target --
either one it got explicitly, or one it discovered through recursion
-- it first checks that the target is a member of one of the
changelists. If not, it skips the target and keeps going.
(This is the way 'svn commit' currently works: harvest_committables()
only harvests things that are committable *and* a member of the
passed-in changelist.)
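
A minimal sketch of the filter idea follows. The function name and the
dictionary-based working-copy model are assumptions for illustration, as
is the choice to treat an empty changelist list as "no filtering".

```python
def process(targets, children, membership, changelists, recurse=False):
    """Yield the paths a subcommand would act on.

    children:   dict mapping a directory to its child paths
    membership: dict mapping a path to its changelist name
    """
    wanted = set(changelists)
    stack = list(reversed(targets))
    while stack:
        path = stack.pop()
        # The filter: skip anything that is not in one of the changelists.
        if not wanted or membership.get(path) in wanted:
            yield path
        if recurse:
            stack.extend(reversed(children.get(path, [])))


children = {".": ["foo.c", "sub"], "sub": ["sub/bar.c"]}
membership = {"foo.c": "foo1", "sub/bar.c": "foo2"}

print(list(process(["."], children, membership, ["foo1", "foo2"], recurse=True)))
print(list(process(["."], children, membership, ["foo1"])))
```

In the second call nothing is reported: without recursion only '.' itself
is considered, and '.' is not a member of 'foo1'.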
This means that the UI use-cases listed above change slightly:
4. undefine a CL all at once (by removing all members)
svn cl TARGET --remove --changelist CLNAME
(TARGET might be implicit '.' or not, and depth is empty by
default; use --depth to override.)
5. rename a CL
svn cl NEWNAME TARGET --changelist OLDNAME
(TARGET might be implicit '.' or not, and depth is empty by
default; use --depth to override.)
TO-DO list:
[X] allow multiple --changelist args
[X] svn status should display grouped changelists
[X] 'svn info' should display a target's changelist field
[X] rename --keep-changelist option to --keep-changelists
[X] fix --changelist and allow multiple changelists in subcommands:
No-problem subcommands:
[X] svn changelist --changelist
[X] svn commit --changelist
[X] svn diff --changelist (only wc-wc and wc-repos cases)
[X] svn info --changelist
[X] svn propget --changelist
[X] svn proplist --changelist
[X] svn propset --changelist
[X] svn propdel --changelist
[X] svn revert --changelist
[X] svn status --changelist
Problem subcommands (see below):
[X] svn update --changelist
[X] svn lock --changelist ### removed changelist support
[X] svn log --changelist ### removed changelist support
[X] svn unlock --changelist ### removed changelist support
[ ] ensure that the bindings implementations of these APIs are up to snuff
[X] write tests!
Problem Subcommands:
Using a definition of --changelist as a filter means that
subcommands which are, by default, non-recursive in nature, have a
somewhat odd interface. For example, 'svn info --changelist FOO'
(which ultimately translates to 'svn info . --depth empty
--changelist FOO') will either return exactly one info result, or
exactly none, depending on whether or not the current working
directory is in changelist FOO. This is trivially worked around by
deepening the invocation: 'svn info -R --changelist FOO'. But what
about subcommands for which there is no --depth support, such as
'lock', 'log', 'unlock'? Do we lose the changelist support, or grow
some sort of depth-crawling ability for these things? [RESOLUTION:
We've removed changelist support from 'lock', 'unlock', and 'log'.]
'svn update' presents an interesting challenge, too. The public
svn_client_update3() API takes a list of paths, and returns a list
of revision numbers to which those paths were updated. Each path is
treated as, effectively, a separate update -- complete with output
line that notes the updated-to revision. So, if we do changelist
expansion outside the API, we might turn a single-target operation
into a multi-target one, and the user sees N full update processes
happen. If we push 'changelists' down into the API, we can fake a
single update with notification tricks. But that starts to get
nasty when we look at non-file changelist support later and the
interactions with externals and such. And if we push 'changelists'
all the way down into the update editor, then we've got a mess of a
whole 'nuther type, downloading tons of server data we won't use,
and so on. [RESOLUTION: Let the command-line client do the
changelist path expansion.]
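
The resolution for 'svn update' can be sketched as a pre-processing step
that turns targets plus changelist names into an explicit path list before
anything is handed to the update API. The helper names are hypothetical,
and the example assumes relative paths with '/' separators.

```python
def is_under(path, target):
    """True if `path` is `target` itself or lies below it."""
    if target == ".":
        return True
    return path == target or path.startswith(target.rstrip("/") + "/")


def expand_for_update(targets, wc_paths, membership, changelists):
    """Return the explicit paths the command-line client would update."""
    wanted = set(changelists)
    if not wanted:
        return list(targets)  # no filter: update the targets as given
    return [
        path
        for path in wc_paths
        if membership.get(path) in wanted
        and any(is_under(path, target) for target in targets)
    ]


wc_paths = ["foo.c", "sub/bar.c", "sub/baz.c"]
membership = {"foo.c": "mychange", "sub/bar.c": "mychange"}

print(expand_for_update(["."], wc_paths, membership, ["mychange"]))
print(expand_for_update(["sub"], wc_paths, membership, ["mychange"]))
```

A single '.' target becomes two explicit targets here, which is the
single-target-to-multi-target effect described above.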