| ****************************************************************************** |
| REQUIREMENTS SPECIFICATION |
| FOR |
| ISSUE #516: OBLITERATE |
| ****************************************************************************** |
| |
| |
| TABLE OF CONTENTS |
| |
| OPEN ISSUES |
| |
| 1. INTRODUCTION |
| 1.1 Sources of Requirements |
| |
| 2. USER STORIES |
| 2.1 Added secrets in a new file |
| 2.2 Added secrets into an existing file |
| 2.3 Added a single huge file by accident |
| 2.4 Repeated modification of a huge file |
| |
| 3. REQUIREMENTS |
| 3.1 Levels of Obliteration |
| 3.2 Content of the Modified Repository |
| 3.3 Working Copies |
| 3.4 Access to the Modified Repository |
| 3.5 Audit Trail |
| 3.6 Svnsync Mirrors |
| 3.7 Permissions |
| 3.8 Time Taken |
| |
| |
| OPEN ISSUES |
| |
| (none) |
| |
| |
| 1. INTRODUCTION |
| |
| This document captures the requirements for the Subversion feature commonly |
| known as "Obliterate". It is intended to include all of the requirements |
| that could be deemed to fall within the scope of an Obliterate feature. The |
| set of requirements to be satisfied by a proposed development of such a |
| feature may be a specified sub-set of those listed here. |
| |
| The purpose of this document is to enable a design to be evaluated and an |
| implementation to be tested against specific criteria that are all written |
| down in one place. |
| |
| Section 2 lists requirements from a user's point of view. |
| |
| Section 3 lists requirements from a software design point of view. |
| |
| 1.1 Sources of Requirements |
| |
| The requirements are sourced from: |
| * Comments in issue #516. |
| * Comments on the Subversion developers' mailing list. |
| * Personal experience of the authors. |
| |
| |
| 2. USER STORIES |
| |
| The "user stories" are examples, described from a user's point of view, of |
| scenarios in which the Obliterate feature should or might be used. Their |
| purpose is to indicate the range and diversity of requirements, without |
| being an exhaustive list of combinations. They loosely define the high-level |
| requirements which the specific requirements in section 3 must satisfy. |
| |
| The following user stories are gathered from the sources in section 1 and |
| include both typical and unusual use cases. |
| |
| 2.1 Added secrets in a new file |
| |
| User U1 has just accidentally committed the addition of a new file F1 that |
| contains confidential data (let's say people's addresses). F1 is visible |
| to other users of the repository. The probability of anyone committing |
| another change before the administrator can intervene is low. The |
| probability of anyone updating their WC to this revision is low. |
| |
| U1 wants to restrict the visibility and propagation of the confidential |
| data as soon as possible. |
| |
| Possible solutions: |
| * hide the existence of F1 |
| * replace the content of F1 with empty content |
| * replace the content of F1 with its "previous" content (definition |
| required) |
| * replace the content of F1 with arbitrary other content |
| * roll back the entire head revision (definition required) |
| * something else. |
| |
| 2.2 Added secrets into an existing file |
| |
| User U1 has just accidentally committed a change that adds confidential |
| data (let's say people's addresses) into an existing file F1. F1 is |
| visible to other users of the repository. The existence and other content |
| of F1 are important to other users. |
| |
| U1 wants to restrict the visibility and propagation of the confidential |
| data as soon as possible. |
| |
| 2.3 Added a single huge file by accident |
| |
| User U1 has just accidentally committed the addition of a new file F1 that |
| is huge and unwanted, with no other changes included in the commit. |
| |
| U1 wants to get rid of the file in order to save space and time on |
| colleagues' WC updates. |
| |
| 2.4 Repeated modification of a huge file |
| |
| User U1 keeps checking in the latest version of a huge file F1, in order |
| to have recent versions handy for testing. Nobody needs versions of F1 older than 2 |
| weeks; they can be re-generated from source if required. F1 is usually |
| checked in alongside some modifications to source files. |
| |
| U1 wants to prune old versions of F1 regularly in order to limit server |
| disk space usage. |
| |
| This use case is not directly what most people consider to be |
| "obliterate". It is really a separate feature that could use the |
| functionality of "obliterate" in its implementation, but could also be |
| implemented in other ways. |
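| |
| As a rough illustration only, the selection step of such a pruning feature |
| might be sketched as below. The function name, the (revision, date) input |
| format and the rule of always keeping the newest version are assumptions |
| made for this sketch; they are not part of any Subversion API or of this |
| specification. |
| |

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def revisions_to_prune(
    changes: List[Tuple[int, datetime]],
    now: datetime,
    max_age: timedelta = timedelta(weeks=2),
) -> List[int]:
    """Pick the revisions of F1 whose content could be pruned.

    `changes` is a hypothetical list of (revision number, commit date)
    pairs, one per revision in which F1 was modified.  The newest
    version is always kept, however old it is (an assumption of this
    sketch), so that F1 never disappears entirely.
    """
    if not changes:
        return []
    ordered = sorted(changes)          # ascending by revision number
    newest_rev = ordered[-1][0]
    cutoff = now - max_age
    return [
        rev
        for rev, committed in ordered
        if committed < cutoff and rev != newest_rev
    ]
```

| The returned revision numbers would then be handed to whatever mechanism |
| actually removes the old content, whether that is "obliterate" or |
| something else. |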
| |
| |
| 3. REQUIREMENTS |
| |
| The requirements listed here are a set of design requirements that together |
| would satisfy all of the user-level requirements. A successful design will |
| satisfy most of these requirements to a large extent, but need not satisfy |
| all of them completely. A functional design document should specify which |
| of these requirements it satisfies, and to what extent. |
| |
| Each requirement can be designated for convenience as "functional" or |
| "non-functional". A functional requirement specifies what output is produced |
| from what input, where input and output include such things as repositories, |
| working copies and audit trails. A non-functional requirement is a |
| constraint on how the functional operation is performed, such as speed of |
| operation or memory usage. |
| |
| 3.1 Levels of Obliteration |
| |
| The requirements involve the following "levels" of obliteration: |
| |
| L1: hiding data from clients |
| (a) avoiding sending the data in any new communications |
| (b) removing data from repository mirrors that already have it |
| (c) removing data from clients that already have it |
| |
| L2: hiding data from people with direct access to the server disk |
| |
| L3: recovering space on the server disk |
| |
| NOTES: |
| |
| L1 and L3 are directly relevant to the common use cases. Requirements |
| for L2 are conceivable but appear not to be common. |
| |
| 3.2 Content of the Modified Repository |
| |
| * At revisions older than the obliteration, the repository should yield |
| exactly the same data that it used to. |
| |
| RATIONALE: A Subversion repository has no forward-looking metadata, so |
| there is no reason for old revisions to change, and they should |
| therefore not be changed. |
| |
| EXCEPTIONS: Any manual adjustments to revision properties, such as to |
| forward-looking comments in log messages or to third-party data in |
| revision-0 properties. |
| |
| * At the revision of the obliterated data, the stored tree should be |
| modified in a way to be specified in a Functional Spec. Briefly, two |
| likely schemes are: |
| (scheme "dd") each node to be obliterated is deleted; or |
| (scheme "cc") each node to be obliterated becomes exactly like it was |
| in the previous revision. |
| |
| * At each revision younger than the obliteration, the repository file |
| system tree structure and content should look exactly as it used to. |
| However, any node with a "copied from" pointer that pointed to a node |
| which has been removed by obliteration should have this pointer adjusted |
| or removed, as defined by the Functional Spec. |
| |
| NOTES: |
| |
| This description assumes per-revision granularity of obliteration. |
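| |
| The three rules above can be sketched on a toy model in which the |
| repository is a list of per-revision trees. This is only an illustration |
| under stated assumptions: the Node type and the obliterate() function are |
| hypothetical, and the choice to remove (rather than adjust) a dangling |
| "copied from" pointer is one possible reading; the Functional Spec would |
| decide the real behaviour. |
| |

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a versioned node (hypothetical, not a Subversion type)."""
    content: str
    copyfrom: Optional[Tuple[str, int]] = None  # (source path, source revision)


Tree = Dict[str, Node]


def obliterate(revs: List[Tree], rev: int, paths: List[str], scheme: str) -> List[Tree]:
    """Return a modified copy of `revs` with `paths` obliterated at `rev`.

    scheme "dd": each obliterated node is deleted at `rev`.
    scheme "cc": each obliterated node becomes what it was at `rev - 1`
                 (and is absent if it did not exist there).
    """
    if scheme not in ("dd", "cc"):
        raise ValueError("scheme must be 'dd' or 'cc'")
    if not 0 < rev < len(revs):
        raise ValueError("rev must name an existing revision after r0")

    # Nodes are immutable, so copying each tree's dict is enough.
    result = [dict(tree) for tree in revs]
    target, previous = result[rev], result[rev - 1]

    # Rule 2: modify the tree at the obliterated revision only.
    for path in paths:
        if scheme == "cc" and path in previous:
            target[path] = previous[path]
        else:
            target.pop(path, None)

    # Rule 3: younger trees keep their structure and content, but a
    # "copied from" pointer to a node that no longer exists is dropped
    # (assumption: removal rather than adjustment).
    for younger in result[rev + 1:]:
        for path, node in list(younger.items()):
            if node.copyfrom is None:
                continue
            src_path, src_rev = node.copyfrom
            if src_rev == rev and src_path in paths and src_path not in target:
                younger[path] = replace(node, copyfrom=None)

    # Rule 1: revisions older than `rev` were never touched.
    return result
```

| Note that, because of the per-revision granularity, a node that still |
| exists in a younger revision keeps its content there; hiding it in those |
| revisions would need further obliterations. |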
| |
| 3.3 Working Copies |
| |
| * A WC managed by an obliterate-aware Subversion client and logically |
| unaffected should show no sign that anything has happened. |
| |
| * A WC managed by an obliterate-aware Subversion client and logically |
| affected by the change should behave in a friendly manner ... |
| |
| * A WC managed by an old (pre-obliterate) Subversion client and logically |
| unaffected should show little or no sign that anything has happened, and |
| should require no user intervention to continue working. |
| |
| * A WC managed by an old (pre-obliterate) Subversion client and logically |
| affected by the change should ... |
| |
| 3.4 Access to the Modified Repository |
| |
| * The modified repository should keep the same URL and UUID, and client |
| access should continue without manual intervention, after any required |
| down-time, for all working copies that are not logically affected by the |
| obliteration. |
| |
| RATIONALE: Obliteration is often required in large repositories having |
| large numbers of users, most of whom are not working near the |
| obliterated data. If all users were impacted each time, then |
| obliteration could become impractical. |
| |
| 3.5 Audit Trail |
| |
| * On the client side, no trace of the obliteration need be visible other |
| than the intended changes to versioned data and to revision properties. |
| |
| * On the server side, the administrator should be able to choose whether a |
| record of obliterations is stored. The form and storage location of this |
| record are not specified here. |
| |
| NOTES: |
| |
| Some customers are concerned about auditability and may want an audit |
| trail to be stored with the repository so that it is included in backups |
| and perpetually available for later examination. |
| |
| 3.6 Svnsync Mirrors |
| |
| * A read-only mirror of the repository maintained by an old |
| (pre-obliterate) version of "svnsync" should either keep all of its |
| already-copied revisions exactly as they were, and continue to copy new |
| revisions from the modified repository without any hiccup, or it should |
| stop working so that its administrator has to intervene. |
| |
| RATIONALE: An old svnsync has no way to re-synchronize old revisions. If |
| it behaves just like a regular client that had been taking snap-shots of |
| the master repository, that would be logical and self-consistent but |
| would not propagate the obliteration, which is a problem for the secrecy |
| use cases. If it requires human intervention, that would disrupt its users |
| but would force a human to consider whether the mirrored data should be |
| kept or modified. Ideally the administrator of the master repository |
| would control which of these scenarios will occur. |
| |
| * A read-only mirror of the repository maintained by an obliterate-aware |
| version of "svnsync" should re-synchronize its old revisions to match |
| the modified master repository. |
| |
| 3.7 Permissions |
| |
| * The data-hiding part of an obliterate should be available to a user |
| with suitable permissions, from the client side, using a standard |
| Subversion client installation. |
| |
| * The space-saving part of an obliterate should be available to an |
| administrator, from the server side, using a standard Subversion server |
| installation. This may also be available in the same way as the |
| data-hiding part. |
| |
| 3.8 Time Taken |
| |
| * The time from when an administrator discovers an accidental secrecy |
| problem to when the data in question is unavailable to ordinary clients |
| (that don't already have it) should be within minutes, or at most hours, |
| on a large repository. |
| |
| * The time from when an administrator discovers an accidental large |
| check-in until the data can be removed from the repository should be at |
| most hours, on a large repository. (The intent here is that an |
| administrator should be able to avoid the data getting into a nightly |
| back-up, if desired.) |
| |