<html>
<head>
<title>SVN Test</title>
</head>

<body bgcolor="white">

<h1>Design goals for the SVN test suite</h1>

<ul>
<li>
<A HREF="#WHY">Why Test?</A>
</li>
<li>
<A HREF="#AUDIENCE">Audience</A>
</li>
<li>
<A HREF="#REQUIREMENTS">Requirements</A>
</li>
<li>
<A HREF="#EASEOFUSE">Ease of Use</A>
</li>
<li>
<A HREF="#LOCATION">Location</A>
</li>
<li>
<A HREF="#EXTERNAL">External dependencies</A>
</li>
</ul>



<A NAME="WHY"><h2>Why Test?</h2></A>

<p>
Regression testing is an essential element of high quality software.
Unfortunately, some developers have not had first-hand exposure to a
high quality testing framework. Lack of familiarity with the positive
effects of testing can be blamed for statements like:
<br>
</p>
<blockquote>
"I don't need to test my code, I know it works."
</blockquote>
<p>
It is safe to say that the idea that developers do not introduce
bugs has been disproved.
</p>


<A NAME="AUDIENCE"><h2>Audience</h2></A>

<p>
The test suite will be used by both developers and end users.
</p>

<p>
<b>Developers</b> need a test suite to help with:
</p>

<p>
<b><i>Fixing Bugs:</i></b>
<br>
Each time a bug is fixed, a test case should be added to the test
suite. Creating a test case that reproduces a bug is a seemingly
obvious requirement. If a bug cannot be reproduced, there is no way to
be sure a given change will actually fix the problem. Once a test case
has been created, it can be used to validate the correctness of a
given patch. Adding a new test case for each bug also ensures that
if the same bug is ever reintroduced, it will be caught immediately.
</p>

<p>
<b><i>Impact Analysis:</i></b>
<br>
A developer fixing a bug or adding a new feature needs to know if a
given change breaks other parts of the code. It may seem obvious, but
keeping a developer from introducing new bugs is one of the primary
benefits of using a regression test system.
</p>

<p>
<b><i>Regression Analysis:</i></b>
<br>
When a test regression occurs, a developer will need to manually
determine what has caused the failure. The test system is not able to
determine why a test case failed. The test system should simply report
exactly which test results changed and when the last results were
generated.
</p>

<p>
<b>Users</b> need a test suite to help with:
</p>

<p>
<b><i>Building:</i></b>
<br>
Building software can be a scary process. Users that have never built
software may be unwilling to try. Others may have tried to build a
piece of software in the past, only to be thwarted by a difficult
build process. Even if the build completed without an error, how can a
user be confident that the generated executable actually works? The
only workable solution to this problem is to provide an easily
accessible set of tests that the user can run after building.
</p>

<p>
<b><i>Porting:</i></b>
<br>
Often, users become porters when the need to run on a previously
unsupported system arises. This porting process typically requires some
minor tweaking of include files. It is absolutely critical that
testing be available when porting since the primary developers may not
have any way to test changes submitted by someone doing a port.
</p>


<p>
<b><i>Testing:</i></b>
<br>
Different installations of the exact same OS can contain subtle
differences that cause software to operate incorrectly. Only testing
on different systems will expose problems of this nature. A test suite
can help identify these sorts of problems before a program is actually
put to use.
</p>




<A NAME="REQUIREMENTS"><h2>Requirements</h2></A>

<p>
Functional requirements of an acceptable test suite include:
</p>

<p>
<b><i>Unique Test Identifiers:</i></b>
<br>
Each test case must have a globally unique test identifier; this
identifier is just a string. A globally unique string is
required so that test cases can be individually identified by
name, sorted, and even looked up on the web. It seems simple,
perhaps even blatantly obvious, but some other test packages
have failed to maintain uniqueness in test identifiers and
developers have suffered because of it. It is even desirable for
the system to actively enforce this uniqueness requirement.
</p>
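
<p>
To make the enforcement idea concrete, here is a minimal sketch in
Python of a registry that refuses duplicate identifiers. The function
and variable names are purely illustrative, not part of any existing
svn tool.
</p>

<pre><code>
# Hypothetical sketch: a registry that rejects duplicate identifiers.
tests = {}

def register_test(test_id, test_func):
    # Fail loudly at registration time rather than silently
    # shadowing an earlier test that has the same name.
    if test_id in tests:
        raise ValueError("duplicate test identifier: " + test_id)
    tests[test_id] = test_func

register_test("client-1", lambda: 1)
register_test("client-2", lambda: 0)
# Registering "client-1" a second time would raise ValueError.
</code></pre>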

<p>
<b><i>Exact Results:</i></b>
<br>
A test case must have exactly one expected result. If the result of
running the test does not exactly match the expected result,
the test must fail.
</p>
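
<p>
Under this rule a test runner reduces to a strict equality check;
anything short of an exact match is a failure. A sketch, again with
illustrative names only:
</p>

<pre><code>
def run_test(test_id, test_func, expected):
    # An exact match is required; there is no "close enough".
    actual = test_func()
    status = "PASSED" if actual == expected else "FAILED"
    print(test_id, status)
    return status

run_test("client-1", lambda: 0, expected=1)   # prints: client-1 FAILED
</code></pre>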

<p>
<b><i>Reproducible Results:</i></b>
<br>
Test results should be reproducible. If a test result matches
the expected result, it should do so every time the test is
run. External factors like time stamps must not affect the
results of a test.
</p>
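
<p>
One common way to keep factors like time stamps out of the comparison
is to normalize them away before matching. A sketch, assuming output
lines that embed an ISO-style date:
</p>

<pre><code>
import re

# Replace anything that looks like a time stamp with a fixed token,
# so two otherwise identical runs compare equal.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def normalize(output):
    return TIMESTAMP_RE.sub("TIMESTAMP", output)

print(normalize("committed at 2001-03-14 09:26:53"))
# prints: committed at TIMESTAMP
</code></pre>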

<p>
<b><i>Self-Contained Tests:</i></b>
<br>
Each test should be self-contained. Results for one test should
not depend on side effects of previous tests. This is obviously
a good practice, since one is able to understand everything a
test is doing without having to look at other tests. The test
system should also support random access so that a single test
or set of tests can be run. If a test is not self-contained, it
cannot be run in isolation.
</p>
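
<p>
In practice this usually means each test creates and destroys its own
scratch area rather than inheriting state from an earlier test. A
minimal sketch, with an invented directory prefix:
</p>

<pre><code>
import shutil, tempfile

def run_in_sandbox(test_func):
    # Give the test a private, empty directory and remove it
    # afterwards, so no state leaks into the next test.
    sandbox = tempfile.mkdtemp(prefix="svntest-")
    try:
        return test_func(sandbox)
    finally:
        shutil.rmtree(sandbox, ignore_errors=True)

run_in_sandbox(lambda d: print("working in", d))
</code></pre>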

<p>
<b><i>Selective Execution:</i></b>
<br>
It may not be possible to run a given set of tests on certain
systems. The suite must provide a means of selectively running
test cases based on the environment. The test system must also
provide a way to selectively run a given test case or set of
test cases on a per-invocation basis. It would be incredibly
tedious to run the entire suite to see the results for a single
test.
</p>
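
<p>
The environment side of this might look like the following sketch,
where a test declares the platforms it can run on and is skipped
everywhere else. The names and platform strings are illustrative:
</p>

<pre><code>
import sys

def maybe_run(test_id, test_func, platforms=None):
    # Skip, rather than fail, when the host platform is unsupported.
    if platforms is not None and sys.platform not in platforms:
        print(test_id, "SKIPPED")
        return
    print(test_id, "PASSED" if test_func() else "FAILED")

maybe_run("client-exec-1", lambda: True, platforms=("linux", "sunos5"))
</code></pre>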

<p>
<b><i>No Monitoring:</i></b>
<br>
The tests must run from start to end without operator
intervention. Test results must be generated automatically. It
is critical that an operator not need to manually compare test
results to figure out which tests failed and which ones passed.
</p>


<p>
<b><i>Automatic Logging of Results:</i></b>
<br>
The system must store test results so that they can be compared
later. This applies to machine readable results as well as human
readable results. For example, assume we have a test named
<code>client-1</code> that expects a result of 1, but 0 is
returned by the test case instead. We should expect the system
to store two distinct pieces of information. First, that the
test failed. Second, how the test failed, meaning how the
expected result differed from the actual result.
</p>

<p>
The following example shows the kind of results we might record
in a results log file.
</p>

<pre><code>
client-1 FAILED
client-2 PASSED
client-3 PASSED
</code></pre>
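
<p>
A sketch of recording both pieces of information: a machine readable
summary line per test, plus a human readable detail file on failure.
The file layout shown is purely illustrative:
</p>

<pre><code>
def log_result(test_id, expected, actual, summary_log):
    status = "PASSED" if actual == expected else "FAILED"
    # Machine readable summary, one line per test.
    summary_log.write(f"{test_id} {status}\n")
    # Human readable detail, written only on failure.
    if status == "FAILED":
        with open(test_id + ".diff", "w") as detail:
            detail.write(f"expected: {expected!r}\n")
            detail.write(f"actual:   {actual!r}\n")

with open("results.log", "w") as log:
    log_result("client-1", expected=1, actual=0, summary_log=log)
</code></pre>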

<p>
<b><i>Automatic Recovery:</i></b>
<br>
The test system must be able to recover from crashes and
unexpected delays. For example, a child process might go into an
infinite loop and would need to be killed. The test shell itself
might also crash or go into an infinite loop. In these cases,
the test run must automatically recover and continue with the
tests directly after the one that crashed.
</p>

<p>
This is critical for a couple of reasons. Nasty crashes and
infinite loops most often appear on users' (not developers')
systems. Users are not well equipped to deal with these sorts of
exceptional situations. It is unrealistic to expect that users
will be able to manually recover from disaster and restart
crashed test cases. It is an accomplishment just to get them to
run the tests in the first place!
</p>

<p>
Ensuring that the test system actually runs each and every test
is critical, since a failing test near the end of the suite
might never be noticed if a crash halfway through kept all the
tests from being run. This process must be completely
automated; no operator intervention should be required.
</p>
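
<p>
A minimal sketch of crash and hang recovery: run each test in a child
process with a timeout, record the failure, and carry on. The command
names here are placeholders, not real test binaries:
</p>

<pre><code>
import subprocess

def run_isolated(test_id, command, timeout=60):
    # Each test runs in its own process, so a crash or hang
    # cannot take the rest of the suite down with it.
    try:
        proc = subprocess.run(command, capture_output=True,
                              timeout=timeout)
        return "PASSED" if proc.returncode == 0 else "FAILED"
    except subprocess.TimeoutExpired:
        # The child was killed after the timeout; move on.
        return "FAILED (timed out)"
    except OSError:
        # The test program could not even be started.
        return "FAILED (could not run)"

for test_id, command in [("client-1", ["./client-test", "1"])]:
    print(test_id, run_isolated(test_id, command))
</code></pre>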


<p>
<b><i>Report Results Only:</i></b>
<br>
When a regression is found, a developer will need to manually
determine the reason for the regression. The system should tell
the developer exactly what tests have failed, when the last set
of results was generated, and what the previous results
actually were. Any additional functionality is outside the
scope of the test system.
</p>

<p>
<b><i>Platform Specific Results:</i></b>
<br>
Each supported platform should have an associated set of test
results. The naive approach would be to maintain a single set of
results and compare the output for any platform to the known
results. The problem with this approach is that it does not
provide a way to keep track of how results differ from one
platform to another. The following example attempts to clarify
this.
</p>

<p>
Assume you have the following test results generated on a
reference platform before and after a set of changes were
committed.
</p>

<table BORDER=1 COLS=2>

<tr>
<td><b>Before</b> (Reference Platform)</td>

<td><b>After</b> (Reference Platform)</td>
</tr>

<tr>
<td><code>client-1 PASSED</code></td>
<td><code>client-1 PASSED</code></td>
</tr>

<tr>
<td><code>client-2 PASSED</code></td>
<td><code>client-2 FAILED</code></td>
</tr>

</table>

<p>
It is clear that the change you made introduced a regression in
the <code>client-2</code> test. The problem shows up when you
try to compare results generated from this modified code on some
other platform. For example, assume you got the following
results:
</p>

<table BORDER=1 COLS=2>

<tr>
<td><b>Before</b> (Reference Platform)</td>

<td><b>After</b> (Other Platform)</td>
</tr>

<tr>
<td><code>client-1 PASSED</code></td>
<td><code>client-1 FAILED</code></td>
</tr>

<tr>
<td><code>client-2 PASSED</code></td>
<td><code>client-2 PASSED</code></td>
</tr>

</table>

<p>
Now things are not at all clear. We know that
<code>client-1</code> is failing but we don't know if it is
related to the change we just made. We don't know if this test
failed the last time we ran the tests on this platform since we
only have results for the reference platform to compare to. We
might have fixed a bug in <code>client-2</code>, or we might
have done nothing to affect it.
</p>

<p>
If we instead keep track of test results on a platform-by-platform
basis, we can avoid much of this pain. It is easy to
imagine how this problem could get considerably worse if there
were 50 or 100 tests that behaved differently from one platform
to the next.
</p>
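
<p>
A sketch of the platform-by-platform bookkeeping, keying the stored
results on a platform identifier. The file naming scheme is
illustrative only:
</p>

<pre><code>
import sys

def results_filename(suite):
    # One results file per platform, e.g. results-client-linux.log,
    # so each platform is compared against its own history.
    return f"results-{suite}-{sys.platform}.log"

def changed_since_last_run(suite, new_results):
    # Compare against the last results recorded for this platform.
    try:
        with open(results_filename(suite)) as f:
            old_results = f.read()
    except FileNotFoundError:
        old_results = ""
    return new_results != old_results

print(changed_since_last_run("client", "client-1 PASSED\n"))
</code></pre>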

<p>
<b><i>Test Types:</i></b>
<br>
The test suite should support two types of tests. The first
makes use of an external program like the svn client. These
kinds of tests will need to exec an external program and check
the output and exit status of the child process. Note that it
will not be possible to run this sort of test on Mac OS. The
second type of test will load subversion shared libraries and
invoke methods in-process.
</p>

<p>
This provides the ability to do extensive testing of the various
subversion APIs without using the svn client. This also has the
nice benefit that it will work on Mac OS, as well as Windows and
Unix.
</p>
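
<p>
The first kind of test amounts to spawning the client and comparing
both its output and its exit status against expected values. A sketch;
the demonstration command is self-contained rather than a real svn
invocation:
</p>

<pre><code>
import subprocess, sys

def exec_test(command, expected_output, expected_status=0):
    # Run the external program and compare both the exit status
    # and the exact output, per the requirements above.
    proc = subprocess.run(command, capture_output=True, text=True)
    ok = (proc.returncode == expected_status and
          proc.stdout == expected_output)
    return "PASSED" if ok else "FAILED"

# Self-contained demonstration; a real test would exec the svn client
# and compare against its expected output instead.
print(exec_test([sys.executable, "-c", "print('hi')"], "hi\n"))
</code></pre>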

<A NAME="EASEOFUSE"><h2>Ease of Use</h2></A>

<p>
Developers will tend to avoid using a test suite if it is not
easy to add new tests and maintain old ones. If developers are
uninterested in using the test suite, it will quickly fall into
disrepair and become a burden instead of an aid.
</p>

<p>
Users will simply avoid running the test suite if it is not
extremely simple to use. A user should be able to build the
software and then run:
</p>

<blockquote>
<code>
% make check
</code>
</blockquote>

<p>
This should run the test suite and provide a very high level set
of results that includes how many test results have changed
since the last run.
</p>

<p>
While this high level report is useful to developers, they will
often need to examine results in more detail. The system should
provide a means to manually examine results, compare output,
invoke a debugger, and perform other sorts of low level operations.
</p>

<p>
The next example shows how a developer might run a specific
subset of tests from the command line. The pattern given would
be used to do a glob-style match on the test case identifiers,
and run any that matched.
</p>

<blockquote>
<code>
% svntest "client-*"
</code>
</blockquote>
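
<p>
Internally, such a pattern could be matched against the registered
identifiers with ordinary glob rules, for example with Python's
fnmatch module. A sketch:
</p>

<pre><code>
import fnmatch

test_ids = ["client-1", "client-2", "server-1"]

def select(pattern):
    # Glob-style match on test case identifiers.
    return [t for t in test_ids if fnmatch.fnmatch(t, pattern)]

print(select("client-*"))   # prints: ['client-1', 'client-2']
</code></pre>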

<A NAME="LOCATION"><h2>Location</h2></A>

<p>
The test suite should be packaged along with the source code
instead of being made available as a separate download. This
significantly simplifies the process of running tests since they
are already incorporated into the build tree.
</p>

<p>
The test suite must support building and running inside and
outside of the source directory. For example, a developer might
want to run tests on both Solaris and Linux. The developer
should be able to run the tests concurrently in two different
build directories without having the tests interfere with each
other.
</p>


<A NAME="EXTERNAL"><h2>External program dependencies</h2></A>

<p>
As much as possible, the test suite should avoid depending on
external programs or libraries.
</p>

<p>
Of course, there is a nasty bootstrap problem with a test suite
implemented in a scripting language. A wide variety of systems
provide no support for modern scripting languages. We will avoid
this issue for now and assume that the scripting language of
choice is supported by the system.
</p>

<p>
For example, the test suite should not depend on CVS to generate
test results. Many users will not have access to CVS on the
system they want to test subversion on.
</p>

</body>
</html>