
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
 # The ASF licenses this file to You under the Apache License, Version 2.0
 # (the "License"); you may not use this file except in compliance with
 # the License.  You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
 #------------------------------------------------------------------------------

 INTRODUCTION

 The purpose of the R programs included in this directory is to validate
 the target values used in Apache Commons Math unit tests. Success running the
 R and commons-math tests on a platform (OS and R version) means that R and
 commons-math give results for the test cases that are close in value.  The
 tests include configurable tolerance levels, but care must be taken in changing
 these, since in most cases the pre-set tolerance is close to the number of
 decimal digits used in expressing the expected values (both here and in the
 corresponding commons-math unit tests).

 Of course it is always possible that both R and commons-math give incorrect
 values for test cases, so these tests should not be interpreted as definitive
 in any absolute sense. The value of developing and running the tests is really
 to generate questions (and answers!) when the two systems give different
 results.

 Contributions of additional test cases (both R and JUnit code) or just
 R programs to validate commons-math tests that are not covered here would be
 greatly appreciated.

 SETUP

 0) Download and install R.  You can get R here
 http://www.r-project.org/
 Follow the install instructions and make sure that you can launch R from this
 directory (i.e., either explicitly add R to your OS path or let the install
 package do it for you).

 1) Launch R from this directory and type
 > source("testAll")
 at the R prompt.  This should produce output to the console similar to this:

 Binomial test cases
 Density test n = 10, p = 0.7...........................................SUCCEEDED
 Distribution test n = 10, p = 0.7......................................SUCCEEDED
 Inverse Distribution test n = 10, p = 0.7..............................SUCCEEDED
 Density test n = 5, p = 0..............................................SUCCEEDED
 Distribution test n = 5, p = 0.........................................SUCCEEDED
 Density test n = 5, p = 1..............................................SUCCEEDED
 Distribution test n = 5, p = 1.........................................SUCCEEDED
 --------------------------------------------------------------------------------
 Normal test cases
 Distribution test mu = 2.1, sigma = 1.4................................SUCCEEDED
 Distribution test mu = 2.1, sigma = 1.4................................SUCCEEDED
 Distribution test mu = 0, sigma = 1....................................SUCCEEDED
 Distribution test mu = 0, sigma = 0.1..................................SUCCEEDED
 --------------------------------------------------------------------------------
 ...
 <more test reports>


 WORKING WITH THE TESTS

 The R distribution comes with online manuals that you can view by launching
 a browser instance and then entering

 > help.start()

 at an R prompt. Poking about in the test case files and the online docs should
 bring you up to speed fairly quickly.  Here are some basic things to get
 you started. I should note at this point that I am by no means an expert R
 programmer, so some things may not be implemented in the nicest way.
 Comments / suggestions for improvement are welcome!

 All of the test cases use some basic functions and global constants (screen
 width and success / failure strings) defined in "testFunctions". The
 R "source" function is used to "import" these functions into each of the test
 programs.  The "testAll" program pulls together and executes all of the
 individual test programs.  You can execute any one of them by just entering

 > source("<program-name>")

 The "assertEquals" function in the testFunctions file mimics the similarly
 named function used by JUnit:

 assertEquals <- function(expected, observed, tol, message) {
     if(any(abs(expected - observed) > tol)) {
         cat("FAILURE: ",message,"\n")
         cat("EXPECTED: ",expected,"\n")
         cat("OBSERVED: ",observed,"\n")
         return(0)
     } else {
         return(1)
     }
 }

 The <expected> and <observed> arguments can be scalar values, vectors or
 matrices. If the arguments are vectors or matrices, corresponding entries
 are compared.
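
 As a quick illustration of the entry-by-entry comparison, the sketch below
 repeats the assertEquals definition shown above (so that it runs on its own)
 and calls it with a scalar, a vector and a matrix. The sample values and the
 result variable names are made up for this example and are not taken from the
 test programs.

```r
# Copy of assertEquals as listed above, so this block is self-contained.
assertEquals <- function(expected, observed, tol, message) {
    if (any(abs(expected - observed) > tol)) {
        cat("FAILURE: ", message, "\n")
        cat("EXPECTED: ", expected, "\n")
        cat("OBSERVED: ", observed, "\n")
        return(0)
    } else {
        return(1)
    }
}

# Scalars: the difference (5e-5) is within the tolerance (1e-4).
scalarResult <- assertEquals(0.5, 0.50005, 1e-4, "scalar")

# Vectors: the second entry is off by 0.1, so the comparison fails.
vectorResult <- assertEquals(c(1, 2, 3), c(1, 2.1, 3), 1e-4, "vector")

# Matrices: entries are compared position by position.
m <- matrix(c(1, 2, 3, 4), nrow = 2)
matrixResult <- assertEquals(m, m + 1e-6, 1e-4, "matrix")
```

 Only the vector call prints a FAILURE report, since a single entry outside the
 tolerance is enough to fail the whole comparison.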

 The standard pattern used throughout the tests looks like this (from
 binomialTestCases):

 Start by defining a "verification function" -- in this example a function to
 verify computation of binomial probabilities. The <points> argument is a vector
 of integer values to feed into the density function, <expected> is a vector of
 the computed probabilities from the commons-math JUnit tests, <n> and <p> are
 parameters of the distribution and <tol> is the error tolerance of the test.
 The function computes the probabilities using R and compares the values that
 R produces with those in the <expected> vector.

 verifyDensity <- function(points, expected, n, p, tol) {
     rDensityValues <- rep(0, length(points))
     i <- 0
     for (point in points) {
         i <- i + 1
         rDensityValues[i] <- dbinom(point, n, p, log = FALSE)
     }
     output <- c("Density test n = ", n, ", p = ", p)
     if (assertEquals(expected,rDensityValues,tol,"Density Values")) {
         displayPadded(output, SUCCEEDED, WIDTH)
     } else {
         displayPadded(output, FAILED, WIDTH)
     }
 }

 The displayPadded function just displays its first and second arguments with
 enough dots in between to make the whole string WIDTH characters long. It is
 defined in testFunctions.
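
 The definition of displayPadded is not reproduced in this file. The following
 is only a sketch of how such a function could be written, based on the
 description above; the actual code in testFunctions may differ. The helper
 padLine and the values assigned to WIDTH, SUCCEEDED and FAILED are assumptions
 made for this example.

```r
# Assumed stand-ins for the global constants defined in testFunctions.
WIDTH <- 80
SUCCEEDED <- "SUCCEEDED"
FAILED <- "FAILED"

# Hypothetical helper: build the padded line as a single string.
padLine <- function(label, status, width) {
    # Join the pieces of the label, e.g. c("Density test n = ", 10, ...).
    text <- paste(label, collapse = "")
    # Number of dots needed to reach the requested total width.
    nDots <- max(0, width - nchar(text) - nchar(status))
    paste0(text, strrep(".", nDots), status)
}

# Sketch of displayPadded: print the padded line followed by a newline.
displayPadded <- function(label, status, width) {
    cat(padLine(label, status, width), "\n", sep = "")
}

output <- c("Density test n = ", 10, ", p = ", 0.7)
line <- padLine(output, SUCCEEDED, WIDTH)
displayPadded(output, SUCCEEDED, WIDTH)
```

 Splitting the string construction from the printing is a choice made here so
 that the padded line can be inspected without capturing console output.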

 Then call this function with different parameters corresponding to the different
 JUnit test cases:

 size <- 10.0
 probability <- 0.70

 densityPoints <- c(-1,0,1,2,3,4,5,6,7,8,9,10,11)
 densityValues <- c(0, 0.0000, 0.0001, 0.0014, 0.0090, 0.0368, 0.1029,
                 0.2001, 0.2668, 0.2335, 0.1211, 0.0282, 0)
 ...
 verifyDensity(densityPoints, densityValues, size, probability, tol)

 If the values computed by R match the target values in densityValues, this will
 produce one line of output to the console:

 Density test n = 10, p = 0.7...........................................SUCCEEDED

 If you modify the value of tol set at the top of binomialTestCases to make the
 test more sensitive than the number of digits specified in the densityValues
 vector, it will fail, producing the following output, showing the failure and
 the expected and observed values:

 FAILURE:  Density Values
 EXPECTED:  0 0 1e-04 0.0014 0.009 0.0368 0.1029 0.2001 0.2668 0.2335 0.1211 /
  0.0282 0
 OBSERVED:  0 5.9049e-06 0.000137781 0.0014467005 0.009001692 0.036756909 /
 0.1029193452 0.200120949 0.266827932 0.2334744405 0.121060821 0.0282475249 0
 Density test n = 10, p = 0.7..............................................FAILED
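
 To see why tightening tol makes the test fail, the sketch below compares the
 four-digit target values with what R computes. It relies on dbinom accepting a
 vector of points (so no loop is needed); the two tolerances 1e-4 and 1e-6 are
 illustrative and are not necessarily the value set in binomialTestCases.

```r
size <- 10
probability <- 0.70
densityPoints <- c(-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
densityValues <- c(0, 0.0000, 0.0001, 0.0014, 0.0090, 0.0368, 0.1029,
                   0.2001, 0.2668, 0.2335, 0.1211, 0.0282, 0)

# dbinom is vectorized, so all points can be evaluated in one call.
rDensityValues <- dbinom(densityPoints, size, probability)

# Largest absolute difference between the targets and R's values.
maxError <- max(abs(densityValues - rDensityValues))
cat("Largest difference: ", maxError, "\n")

# Same check that assertEquals performs, at two illustrative tolerances.
passesAtLooseTol <- !any(abs(densityValues - rDensityValues) > 1e-4)
passesAtTightTol <- !any(abs(densityValues - rDensityValues) > 1e-6)
```

 The largest difference is just under 5e-5, which is what rounding the targets
 to four decimal places would lead one to expect, so a tolerance of 1e-4 passes
 while 1e-6 does not.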