IMPALA-5310: Part 2: Add SAMPLED_NDV() function.
Adds a new SAMPLED_NDV() aggregate function that is
intended to be used in COMPUTE STATS TABLESAMPLE.
This patch only adds the function itself. Integration
with COMPUTE STATS will come in a separate patch.
SAMPLED_NDV() estimates the number of distinct values (NDV)
based on a sample of data and the corresponding sampling rate.
The main idea is to collect several x/y data points where x is
the number of rows and y is the corresponding NDV estimate.
These data points are used to fit an objective function to the
data such that the true NDV can be extrapolated.
The aggregate function maintains a fixed number of HyperLogLog
intermediates to compute the x/y points.
Several objective functions are fit and the best-fit one is
used for extrapolation.
Adds the MPFIT C library to perform curve fitting:
https://www.physics.wisc.edu/~craigm/idl/cmpfit.html
The library is a C port from Fortran. Scipy uses the
Fortran version of the library for curve fitting.
Testing:
- added functional tests
- core/hdfs run passed
Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820
Reviewed-on: http://gerrit.cloudera.org:8080/8569
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
diff --git a/LICENSE.txt b/LICENSE.txt
index 419a726..aa211ad 100644
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -946,3 +946,84 @@
OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
+
+be/src/thirdparty/mpfit
+
+MPFIT: A MINPACK-1 Least Squares Fitting Library in C
+
+Original public domain version by B. Garbow, K. Hillstrom, J. More'
+ (Argonne National Laboratory, MINPACK project, March 1980)
+ Copyright (1999) University of Chicago
+ (see below)
+
+Tranlation to C Language by S. Moshier (moshier.net)
+ (no restrictions placed on distribution)
+
+Enhancements and packaging by C. Markwardt
+ (comparable to IDL fitting routine MPFIT
+ see http://cow.physics.wisc.edu/~craigm/idl/idl.html)
+ Copyright (C) 2003, 2004, 2006, 2007 Craig B. Markwardt
+
+ This software is provided as is without any warranty whatsoever.
+ Permission to use, copy, modify, and distribute modified or
+ unmodified copies is granted, provided this copyright and disclaimer
+ are included unchanged.
+
+
+Source code derived from MINPACK must have the following disclaimer
+text provided.
+
+===========================================================================
+Minpack Copyright Notice (1999) University of Chicago. All rights reserved
+
+Redistribution and use in source and binary forms, with or
+without modification, are permitted provided that the
+following conditions are met:
+
+1. Redistributions of source code must retain the above
+copyright notice, this list of conditions and the following
+disclaimer.
+
+2. Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following
+disclaimer in the documentation and/or other materials
+provided with the distribution.
+
+3. The end-user documentation included with the
+redistribution, if any, must include the following
+acknowledgment:
+
+ "This product includes software developed by the
+ University of Chicago, as Operator of Argonne National
+ Laboratory.
+
+Alternately, this acknowledgment may appear in the software
+itself, if and wherever such third-party acknowledgments
+normally appear.
+
+4. WARRANTY DISCLAIMER. THE SOFTWARE IS SUPPLIED "AS IS"
+WITHOUT WARRANTY OF ANY KIND. THE COPYRIGHT HOLDER, THE
+UNITED STATES, THE UNITED STATES DEPARTMENT OF ENERGY, AND
+THEIR EMPLOYEES: (1) DISCLAIM ANY WARRANTIES, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTIES
+OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE
+OR NON-INFRINGEMENT, (2) DO NOT ASSUME ANY LEGAL LIABILITY
+OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR
+USEFULNESS OF THE SOFTWARE, (3) DO NOT REPRESENT THAT USE OF
+THE SOFTWARE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS, (4)
+DO NOT WARRANT THAT THE SOFTWARE WILL FUNCTION
+UNINTERRUPTED, THAT IT IS ERROR-FREE OR THAT ANY ERRORS WILL
+BE CORRECTED.
+
+5. LIMITATION OF LIABILITY. IN NO EVENT WILL THE COPYRIGHT
+HOLDER, THE UNITED STATES, THE UNITED STATES DEPARTMENT OF
+ENERGY, OR THEIR EMPLOYEES: BE LIABLE FOR ANY INDIRECT,
+INCIDENTAL, CONSEQUENTIAL, SPECIAL OR PUNITIVE DAMAGES OF
+ANY KIND OR NATURE, INCLUDING BUT NOT LIMITED TO LOSS OF
+PROFITS OR LOSS OF DATA, FOR ANY REASON WHATSOEVER, WHETHER
+SUCH LIABILITY IS ASSERTED ON THE BASIS OF CONTRACT, TORT
+(INCLUDING NEGLIGENCE OR STRICT LIABILITY), OR OTHERWISE,
+EVEN IF ANY OF SAID PARTIES HAS BEEN WARNED OF THE
+POSSIBILITY OF SUCH LOSS OR DAMAGES.
+
+--------------------------------------------------------------------------------
diff --git a/NOTICE.txt b/NOTICE.txt
index 3760589..c2875a2 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -13,3 +13,7 @@
This product includes cryptographic software written by Eric Young
(eay@cryptsoft.com). This product includes software written by Tim
Hudson (tjh@cryptsoft.com).
+
+This product includes software developed by the University of Chicago,
+as Operator of Argonne National Laboratory.
+Copyright (C) 1999 University of Chicago. All rights reserved.
\ No newline at end of file
diff --git a/be/src/exprs/aggregate-functions-ir.cc b/be/src/exprs/aggregate-functions-ir.cc
index d5f946a..499188c 100644
--- a/be/src/exprs/aggregate-functions-ir.cc
+++ b/be/src/exprs/aggregate-functions-ir.cc
@@ -17,24 +17,25 @@
#include "exprs/aggregate-functions.h"
-#include <math.h>
#include <algorithm>
#include <map>
#include <sstream>
#include <utility>
+#include <cmath>
#include <boost/random/ranlux.hpp>
#include <boost/random/uniform_int.hpp>
#include "codegen/impala-ir.h"
#include "common/logging.h"
+#include "exprs/anyval-util.h"
+#include "exprs/hll-bias.h"
#include "runtime/decimal-value.inline.h"
#include "runtime/runtime-state.h"
#include "runtime/string-value.inline.h"
#include "runtime/timestamp-value.h"
#include "runtime/timestamp-value.inline.h"
-#include "exprs/anyval-util.h"
-#include "exprs/hll-bias.h"
+#include "util/mpfit-util.h"
#include "common/names.h"
@@ -42,6 +43,7 @@
using boost::ranlux64_3;
using std::make_pair;
using std::map;
+using std::min_element;
using std::nth_element;
using std::pop_heap;
using std::push_heap;
@@ -1480,6 +1482,186 @@
return estimate;
}
+/// Intermediate aggregation state for the SampledNdv() function.
+/// Stores NUM_HLL_BUCKETS of the form <row_count, hll_state>.
+/// The 'row_count' keeps track of how many input rows were aggregated into that
+/// bucket, and the 'hll_state' is an intermediate aggregation state of HyperLogLog.
+/// See the header comments on the SampledNdv() function for more details.
+class SampledNdvState {
+ public:
+ /// Empirically determined number of HLL buckets. Power of two for fast modulo.
+ static const uint32_t NUM_HLL_BUCKETS = 32;
+
+ /// A bucket contains an update count and an HLL intermediate state.
+ static constexpr int64_t BUCKET_SIZE = sizeof(int64_t) + AggregateFunctions::HLL_LEN;
+
+ /// Sampling percent which was given as the second argument to SampledNdv().
+ /// Stored here to avoid existing issues with passing constant arguments to all
+ /// aggregation phases and because we convert the sampling percent argument from
+ /// decimal to double. See IMPALA-6179.
+ double sample_perc;
+
+ /// Counts the number of Update() calls. Used for determining which bucket to update.
+ int64_t total_row_count;
+
+ /// Array of buckets.
+ struct {
+ int64_t row_count;
+ uint8_t hll[AggregateFunctions::HLL_LEN];
+ } buckets[NUM_HLL_BUCKETS];
+};
+
+void AggregateFunctions::SampledNdvInit(FunctionContext* ctx, StringVal* dst) {
+ // Uses a preallocated FIXED_UDA_INTERMEDIATE intermediate value.
+ DCHECK_EQ(dst->len, sizeof(SampledNdvState));
+ memset(dst->ptr, 0, sizeof(SampledNdvState));
+
+ DoubleVal* sample_perc = reinterpret_cast<DoubleVal*>(ctx->GetConstantArg(1));
+ if (sample_perc == nullptr) return;
+ // Guaranteed by the FE.
+ DCHECK(!sample_perc->is_null);
+ DCHECK_GE(sample_perc->val, 0.0);
+ DCHECK_LE(sample_perc->val, 1.0);
+ SampledNdvState* state = reinterpret_cast<SampledNdvState*>(dst->ptr);
+ state->sample_perc = sample_perc->val;
+}
+
+/// Incorporate the 'src' into one of the intermediate HLLs, which will be used by
+/// Finalize() to generate a set of the (x,y) data points.
+template <typename T>
+void AggregateFunctions::SampledNdvUpdate(FunctionContext* ctx, const T& src,
+ const DoubleVal& sample_perc, StringVal* dst) {
+ SampledNdvState* state = reinterpret_cast<SampledNdvState*>(dst->ptr);
+ int64_t bucket_idx = state->total_row_count % SampledNdvState::NUM_HLL_BUCKETS;
+ StringVal hll_dst = StringVal(state->buckets[bucket_idx].hll, HLL_LEN);
+ HllUpdate(ctx, src, &hll_dst);
+ ++state->buckets[bucket_idx].row_count;
+ ++state->total_row_count;
+}
+
+void AggregateFunctions::SampledNdvMerge(FunctionContext* ctx, const StringVal& src,
+ StringVal* dst) {
+ SampledNdvState* src_state = reinterpret_cast<SampledNdvState*>(src.ptr);
+ SampledNdvState* dst_state = reinterpret_cast<SampledNdvState*>(dst->ptr);
+ for (int i = 0; i < SampledNdvState::NUM_HLL_BUCKETS; ++i) {
+ StringVal src_hll = StringVal(src_state->buckets[i].hll, HLL_LEN);
+ StringVal dst_hll = StringVal(dst_state->buckets[i].hll, HLL_LEN);
+ HllMerge(ctx, src_hll, &dst_hll);
+ dst_state->buckets[i].row_count += src_state->buckets[i].row_count;
+ }
+ // Total count. Not really needed after Update() but kept for sanity checking.
+ dst_state->total_row_count += src_state->total_row_count;
+ // Propagate sampling percent to Finalize().
+ dst_state->sample_perc = src_state->sample_perc;
+}
+
+BigIntVal AggregateFunctions::SampledNdvFinalize(FunctionContext* ctx,
+ const StringVal& src) {
+ SampledNdvState* state = reinterpret_cast<SampledNdvState*>(src.ptr);
+
+ // Generate 'num_points' data points with x=row_count and y=ndv_estimate. These points
+ // are used to fit a function for the NDV growth and estimate the real NDV.
+ constexpr int num_points =
+ SampledNdvState::NUM_HLL_BUCKETS * SampledNdvState::NUM_HLL_BUCKETS;
+ int64_t counts[num_points] = { 0 };
+ int64_t ndvs[num_points] = { 0 };
+
+ int64_t min_ndv = numeric_limits<int64_t>::max();
+ int64_t min_count = numeric_limits<int64_t>::max();
+ // We have a fixed number of HLL intermediates to generate data points. Any unique
+ // subset of intermediates can be combined to create a new data point. It was
+ // empirically determined that 'num_data' points is typically sufficient and there are
+ // diminishing returns from generating additional data points.
+ // The generation method below was chosen for its simplicity. It successively merges
+ // buckets in a rolling window of size NUM_HLL_BUCKETS. Repeating the last data point
+ // where all buckets are merged biases the curve fitting to hit that data point which
+ // makes sense because that's likely the most accurate one. The number of data points
+ // are sufficient for reasonable accuracy.
+ int pidx = 0;
+ for (int i = 0; i < SampledNdvState::NUM_HLL_BUCKETS; ++i) {
+ uint8_t merged_hll_data[HLL_LEN];
+ memset(merged_hll_data, 0, HLL_LEN);
+ StringVal merged_hll(merged_hll_data, HLL_LEN);
+ int64_t merged_count = 0;
+ for (int j = 0; j < SampledNdvState::NUM_HLL_BUCKETS; ++j) {
+ int bucket_idx = (i + j) % SampledNdvState::NUM_HLL_BUCKETS;
+ merged_count += state->buckets[bucket_idx].row_count;
+ counts[pidx] = merged_count;
+ StringVal hll = StringVal(state->buckets[bucket_idx].hll, HLL_LEN);
+ HllMerge(ctx, hll, &merged_hll);
+ ndvs[pidx] = HllFinalEstimate(merged_hll.ptr, HLL_LEN);
+ ++pidx;
+ }
+ min_count = std::min(min_count, state->buckets[i].row_count);
+ min_ndv = std::min(min_ndv, ndvs[i * SampledNdvState::NUM_HLL_BUCKETS]);
+ }
+ // Based on the point-generation method above the last elements represent the data
+ // point where all buckets are merged.
+ int64_t max_count = counts[num_points - 1];
+ int64_t max_ndv = ndvs[num_points - 1];
+
+ // Scale all values to [0,1] since some objective functions require it (e.g., Sigmoid).
+ double count_scale = max_count - min_count;
+ double ndv_scale = max_ndv - min_ndv;
+ if (count_scale == 0) count_scale = 1.0;
+ if (ndv_scale == 0) ndv_scale = 1.0;
+ double scaled_counts[num_points];
+ double scaled_ndvs[num_points];
+ for (int i = 0; i < num_points; ++i) {
+ scaled_counts[i] = counts[i] / count_scale;
+ scaled_ndvs[i] = ndvs[i] / ndv_scale;
+ }
+
+ // List of objective functions. Curve fitting will select the best values for the
+ // parameters a, b, c, d.
+ vector<ObjectiveFunction> ndv_fns;
+ // Linear function: f(x) = a + b * x
+ ndv_fns.push_back(ObjectiveFunction("LIN", 2,
+ [](double x, const double* params) -> double {
+ return params[0] + params[1] * x;
+ }
+ ));
+ // Logarithmic function: f(x) = a + b * log(x)
+ ndv_fns.push_back(ObjectiveFunction("LOG", 2,
+ [](double x, const double* params) -> double {
+ return params[0] + params[1] * log(x);
+ }
+ ));
+ // Power function: f(x) = a + b * pow(x, c)
+ ndv_fns.push_back(ObjectiveFunction("POW", 3,
+ [](double x, const double* params) -> double {
+ return params[0] + params[1] * pow(x, params[2]);
+ }
+ ));
+ // Sigmoid function: f(x) = a + b * (c / (c + pow(d, -x)))
+ ndv_fns.push_back(ObjectiveFunction("SIG", 4,
+ [](double x, const double* params) -> double {
+ return params[0] + params[1] * (params[2] / (params[2] + pow(params[3], -x)));
+ }
+ ));
+
+ // Perform least mean squares fitting on all objective functions.
+ vector<ObjectiveFunction> valid_ndv_fns;
+ for (ObjectiveFunction& f: ndv_fns) {
+ if(f.LmsFit(scaled_counts, scaled_ndvs, num_points)) {
+ valid_ndv_fns.push_back(std::move(f));
+ }
+ }
+
+ // Select the best-fit function for estimating the NDV.
+ auto best_fit_fn = min_element(valid_ndv_fns.begin(), valid_ndv_fns.end(),
+ [](const ObjectiveFunction& a, const ObjectiveFunction& b) -> bool {
+ return a.GetError() < b.GetError();
+ }
+ );
+
+ // Compute the extrapolated NDV based on the extrapolated row count.
+ double extrap_count = max_count / state->sample_perc;
+ double scaled_extrap_count = extrap_count / count_scale;
+ double scaled_extrap_ndv = best_fit_fn->GetY(scaled_extrap_count);
+ return round(scaled_extrap_ndv * ndv_scale);
+}
+
// An implementation of a simple single pass variance algorithm. A standard UDA must
// be single pass (i.e. does not scan the table more than once), so the most canonical
// two pass approach is not practical.
@@ -2140,6 +2322,27 @@
template void AggregateFunctions::HllUpdate(
FunctionContext*, const DecimalVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const BooleanVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const TinyIntVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const SmallIntVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const IntVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const BigIntVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const FloatVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const DoubleVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const StringVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const TimestampVal&, const DoubleVal&, StringVal*);
+template void AggregateFunctions::SampledNdvUpdate(
+ FunctionContext*, const DecimalVal&, const DoubleVal&, StringVal*);
+
template void AggregateFunctions::KnuthVarUpdate(
FunctionContext*, const TinyIntVal&, StringVal*);
template void AggregateFunctions::KnuthVarUpdate(
diff --git a/be/src/exprs/aggregate-functions.h b/be/src/exprs/aggregate-functions.h
index 5ebcb97..81884c1 100644
--- a/be/src/exprs/aggregate-functions.h
+++ b/be/src/exprs/aggregate-functions.h
@@ -204,6 +204,23 @@
/// estimates.
static uint64_t HllFinalEstimate(const uint8_t* buckets, int32_t num_buckets);
+ /// Estimates the number of distinct values (NDV) based on a sample of data and the
+ /// corresponding sampling rate. The main idea of this function is to collect several
+ /// (x,y) data points where x is the number of rows and y is the corresponding NDV
+ /// estimate. These data points are used to fit an objective function to the data such
+ /// that the true NDV can be extrapolated.
+ /// This aggregate function maintains a fixed number of HyperLogLog intermediates.
+ /// The Update() phase updates the intermediates in a round-robin fashion.
+ /// The Merge() phase combines the corresponding intermediates.
+ /// The Finalize() phase generates (x,y) data points, performs curve fitting, and
+ /// computes the estimated true NDV.
+ static void SampledNdvInit(FunctionContext*, StringVal* dst);
+ template <typename T>
+ static void SampledNdvUpdate(FunctionContext*, const T& src,
+ const DoubleVal& sample_perc, StringVal* dst);
+ static void SampledNdvMerge(FunctionContext*, const StringVal& src, StringVal* dst);
+ static BigIntVal SampledNdvFinalize(FunctionContext*, const StringVal& src);
+
/// Knuth's variance algorithm, more numerically stable than canonical stddev
/// algorithms; reference implementation:
/// http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
diff --git a/be/src/thirdparty/mpfit/DISCLAIMER b/be/src/thirdparty/mpfit/DISCLAIMER
new file mode 100644
index 0000000..3e1b76f
--- /dev/null
+++ b/be/src/thirdparty/mpfit/DISCLAIMER
@@ -0,0 +1,77 @@
+
+MPFIT: A MINPACK-1 Least Squares Fitting Library in C
+
+Original public domain version by B. Garbow, K. Hillstrom, J. More'
+ (Argonne National Laboratory, MINPACK project, March 1980)
+ Copyright (1999) University of Chicago
+ (see below)
+
+Tranlation to C Language by S. Moshier (moshier.net)
+ (no restrictions placed on distribution)
+
+Enhancements and packaging by C. Markwardt
+ (comparable to IDL fitting routine MPFIT
+ see http://cow.physics.wisc.edu/~craigm/idl/idl.html)
+ Copyright (C) 2003, 2004, 2006, 2007 Craig B. Markwardt
+
+ This software is provided as is without any warranty whatsoever.
+ Permission to use, copy, modify, and distribute modified or
+ unmodified copies is granted, provided this copyright and disclaimer
+ are included unchanged.
+
+
+Source code derived from MINPACK must have the following disclaimer
+text provided.
+
+===========================================================================
+Minpack Copyright Notice (1999) University of Chicago. All rights reserved
+
+Redistribution and use in source and binary forms, with or
+without modification, are permitted provided that the
+following conditions are met:
+
+1. Redistributions of source code must retain the above
+copyright notice, this list of conditions and the following
+disclaimer.
+
+2. Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following
+disclaimer in the documentation and/or other materials
+provided with the distribution.
+
+3. The end-user documentation included with the
+redistribution, if any, must include the following
+acknowledgment:
+
+ "This product includes software developed by the
+ University of Chicago, as Operator of Argonne National
+ Laboratory.
+
+Alternately, this acknowledgment may appear in the software
+itself, if and wherever such third-party acknowledgments
+normally appear.
+
+4. WARRANTY DISCLAIMER. THE SOFTWARE IS SUPPLIED "AS IS"
+WITHOUT WARRANTY OF ANY KIND. THE COPYRIGHT HOLDER, THE
+UNITED STATES, THE UNITED STATES DEPARTMENT OF ENERGY, AND
+THEIR EMPLOYEES: (1) DISCLAIM ANY WARRANTIES, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTIES
+OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE
+OR NON-INFRINGEMENT, (2) DO NOT ASSUME ANY LEGAL LIABILITY
+OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR
+USEFULNESS OF THE SOFTWARE, (3) DO NOT REPRESENT THAT USE OF
+THE SOFTWARE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS, (4)
+DO NOT WARRANT THAT THE SOFTWARE WILL FUNCTION
+UNINTERRUPTED, THAT IT IS ERROR-FREE OR THAT ANY ERRORS WILL
+BE CORRECTED.
+
+5. LIMITATION OF LIABILITY. IN NO EVENT WILL THE COPYRIGHT
+HOLDER, THE UNITED STATES, THE UNITED STATES DEPARTMENT OF
+ENERGY, OR THEIR EMPLOYEES: BE LIABLE FOR ANY INDIRECT,
+INCIDENTAL, CONSEQUENTIAL, SPECIAL OR PUNITIVE DAMAGES OF
+ANY KIND OR NATURE, INCLUDING BUT NOT LIMITED TO LOSS OF
+PROFITS OR LOSS OF DATA, FOR ANY REASON WHATSOEVER, WHETHER
+SUCH LIABILITY IS ASSERTED ON THE BASIS OF CONTRACT, TORT
+(INCLUDING NEGLIGENCE OR STRICT LIABILITY), OR OTHERWISE,
+EVEN IF ANY OF SAID PARTIES HAS BEEN WARNED OF THE
+POSSIBILITY OF SUCH LOSS OR DAMAGES.
diff --git a/be/src/thirdparty/mpfit/README b/be/src/thirdparty/mpfit/README
new file mode 100644
index 0000000..29a9cef
--- /dev/null
+++ b/be/src/thirdparty/mpfit/README
@@ -0,0 +1,705 @@
+
+MPFIT: A MINPACK-1 Least Squares Fitting Library in C
+
+Original public domain version by B. Garbow, K. Hillstrom, J. More'
+ (Argonne National Laboratory, MINPACK project, March 1980)
+
+Tranlation to C Language by S. Moshier (moshier.net)
+
+Enhancements, documentation and packaging by C. Markwardt
+ (comparable to IDL fitting routine MPFIT
+ see http://cow.physics.wisc.edu/~craigm/idl/idl.html)
+ Copyright (C) 2003, 2004, 2006, 2007, 2009, 2010, 2013 Craig B. Markwardt
+
+
+SUMMARY of CHANGES
+ 16 Feb 2009 - version 1.0 - initial release
+ 18 Feb 2009 - version 1.1 - add 'version' field to 'results' struct
+ 22 Nov 2009 - - allow to compile with C++ compiler
+ - change to memset() instead of nonstandard bzero()
+ - for Microsoft, proprietary equivalent of finite()
+ 04 Oct 2010 - - add static declarations, remove some compiler warnings
+ (reported by Lars Kr. Lundin)
+ 13 Nov 2010 - version 1.2 - additional documentation, cleanup of mpfit.h
+ 23 Apr 2013 - version 1.3 - add MP_NO_ITER; bug fix mpside=2 when debugging
+ (thanks to M. Wojdyr)
+
+$Id: README,v 1.18 2016/06/02 19:14:16 craigm Exp $
+
+INTRODUCTION
+
+MPFIT uses the Levenberg-Marquardt technique to solve the
+least-squares problem. In its typical use, MPFIT will be used to
+fit a user-supplied function (the "model") to user-supplied data
+points (the "data") by adjusting a set of parameters. MPFIT is
+based upon MINPACK-1 (LMDIF.F) by More' and collaborators.
+
+For example, a researcher may think that a set of observed data
+points is best modelled with a Gaussian curve. A Gaussian curve is
+parameterized by its mean, standard deviation and normalization.
+MPFIT will, within certain constraints, find the set of parameters
+which best fits the data. The fit is "best" in the least-squares
+sense; that is, the sum of the weighted squared differences between
+the model and data is minimized.
+
+The Levenberg-Marquardt technique is a particular strategy for
+iteratively searching for the best fit. This particular
+implementation is drawn from a robust routine called MINPACK-1 (see
+NETLIB). This version allows upper and lower bounding constraints
+to be placed on each parameter, or the parameter can be held fixed.
+
+The user-supplied function should compute an array of weighted
+deviations between model and data. In a typical scientific problem
+the residuals should be weighted so that each deviate has a
+gaussian sigma of 1.0. If X represents values of the independent
+variable, Y represents a measurement for each value of X, and ERR
+represents the error in the measurements, then the deviates could
+be calculated as follows:
+
+ for (i=0; i<m; i++) {
+ deviates[i] = (y[i] - f(x[i])) / err[i];
+ }
+
+where m is the number of data points, and where F is the
+function representing the model evaluated at X. If ERR are the
+1-sigma uncertainties in Y, then the sum of deviates[i] squared will
+be the total chi-squared value, which MPFIT will seek to minimize.
+
+Simple constraints can be placed on parameter values by using the
+"pars" parameter to MPFIT, and other parameter-specific options can be
+set. See below for a description of this optional parameter
+descriptor.
+
+MPFIT does not perform more general optimization tasks. MPFIT is
+customized, based on MINPACK-1, to the least-squares minimization
+problem.
+
+BASIC CALLING INTERFACE of MPFIT
+
+At the very least, the user must supply a 'user function,' which
+calculates the residuals as described above, and one or more
+parameters. The calling interface to mpfit is:
+
+ int mpfit(mp_func funct, int m, int npar,
+ double *xall, mp_par *pars, mp_config *config,
+ void *private_data,
+ mp_result *result);
+
+The user function <b>funct</b> is a C function which computes the
+residuals using any user data that is required. The number of
+residuals is specified by the integer <b>m</b>. The nature and
+properties of user function are described in greater detail below.
+
+The user function parameters are passed to mpfit as the array
+ double xall[npar];
+where <b>npar</b> is the number of parameters of the user function.
+It must be the case that m > npar, i.e. there are more data points
+than free parameters. The user must pass an initial "best guess" of
+the user parameters to mpfit(), and upon successful return, the xall[]
+array will contain the "best fit" parameters of the fit (and the original
+values are overwritten).
+
+The user function is responsible to compute an array of generic
+residuals. Usually these residuals will depend on user-dependent
+quantities such as measured data or independent variables such "x"
+when fitting a function of the form y(x). The user should pass these
+quantities using the optional <b>private_data</b> pointer. If
+private_data is not used then a null pointer should be passed.
+Otherwise, it can be any C pointer, typically a pointer to a structure
+such as this:
+ struct example_private_data { /* EXAMPLE: fitting y(x) */
+ double *x; /* x - independent variable of model */
+ double *y; /* y - measured "y" values */
+ double *y_error; /* y_error - measurement uncertainty in y */
+ };
+The user would be responsible to declare such a structure, and to fill
+it with pointers to the relevant data arrays before calling mpfit().
+mpfit() itself does not inspect or modify the private_data contents,
+but merely passes it along to the user function
+
+The structure <b>pars</b> is optional, and allows the user to specify
+additional information and constraints about each user function
+parameter. For example, the user can specify simple bounding
+constraints. If passed, then pars[] must be dimensioned as,
+ mp_par pars[npar];
+where mp_par is a structure defined in mpfit.h. If no special
+parameter information is necessary, then the user can pass
+pars == 0.
+
+The optional structure <b>config</b> configures how mpfit() behaves,
+and is described in greater detail below. By passing a null pointer,
+the user obtains the default behavior.
+
+The structure <b>result</b> contains results of the fit, returned by
+mpfit(). The user should pass a pointer to a structure of type
+'mp_result' (which should be zeroed), and upon return, the structure
+is filled with various diagnostic quantities about the fitting run.
+These quantities are described in greater detail below. If these
+diagnostics are not required, then the user may pass a null pointer.
+
+
+USER FUNCTION
+
+The user must define a function which computes the appropriate values
+as specified above. The function must compute the weighted deviations
+between the model and the data. The user function may also optionally
+compute explicit derivatives (see below). The user function should
+be of the following form:
+
+ int myfunct(int m, int n, double *p, double *deviates,
+ double **derivs, void *private)
+ {
+ int i;
+ /* Compute function deviates */
+ for (i=0; i<m; i++) {
+ deviates[i] = {function of p[] and private data};
+ }
+
+ return 0;
+ }
+
+The user function parameters are defined as follows:
+ int m - number of data points
+ int n - number of parameters
+ double *p - array of n parameters
+ double *deviates - array of m deviates to be returned by myfunct()
+ double **derivs - used for user-computed derivatives (see below)
+ (= 0 when automatic finite differences are computed)
+
+User functions may also indicate a fatal error condition by returning
+a negative error code. Error codes from -15 to -1 are reserved for
+the user function.
+
+
+EXAMPLE 1 - USER FUNCTION y(x)
+
+Here is a sample user function which computes the residuals for a simple
+model fit of the form y = f(x). The residuals are defined as
+(y[i] - f(x[i]))/y_error[i]. We will use the example_private_data
+structure defined above. The user function would appear like this:
+
+ int myfunct_y_x(int m, int n, double *p, double *deviates,
+ double **derivs, struct example_private_data *private)
+ {
+ int i;
+ double *x, *y, *y_error;
+
+ /* Retrieve values of x, y and y_error from private structure */
+ x = private.x;
+ y = private.y;
+ y_error = private.y_error;
+
+ /* Compute function deviates */
+ for (i=0; i<m; i++) {
+ deviates[i] = (y[i] - f(x[i], p)) / y_error[i];
+ }
+
+ return 0;
+ }
+
+In this example, the user would be responsible to define the function
+f(x,p) where x is the independent variable and p is the array of user
+parameters. Although this example shows y = f(x,p), it is trivial to
+extend it to additional independent variables, such as u = f(x,y,p),
+by adding additional elements to the private data structure.
+
+The setup and call to mpfit() could look something like this:
+ int m = 10; /* 10 data points */
+ double x[10] = {1,2,3,4,5,6,7,8,9,10}; /* Independent variable */
+ double y[10] = {-0.42,1.59,0.8,3.2,2.09,6.39,6.29,9.01,7.47,7.58};/* Y measurements*/
+ double y_error[10] = {1,1,1,1,1,1,1,1,1,1}; /* Y uncertainties */
+ struct example_private_data private;
+
+ double xall[2] = { 1.0, 1.0 }; /* User parameters - initial guess */
+ /* Fill private data structure to pointers with user data */
+ private.x = x;
+ private.y = y;
+ private.y_error = y_error;
+
+ /* Call mpfit() */
+ status = mpfit(&myfunct_y_x, m, npar, xall, 0, 0, &private, 0);
+
+USER-COMPUTED DERIVATIVES
+
+The user function can also compute function derivatives, which are
+used in the minimization process. This can be useful to save time, or
+when the derivative is tricky to evaluate numerically.
+
+Users should pass the "pars" parameter (see below), and for the 'side'
+structure field, the value of 3 should be passed. This indicates to
+mpfit() that it should request the user function will compute the
+derivative. NOTE: this setting is specified on a parameter by
+parameter basis. This means that users are free to choose which
+parameters' derivatives are computed explicitly and which
+numerically. A combination of both is allowed. However, since the
+default value of 'side' is 0, derivative estimates will be numerical
+unless explicitly requested using the 'side' structure field.
+
+The user function should only compute derivatives when the 'derivs'
+parameter is non-null. Note that even when user-computed derivatives
+are enabled, mpfit() may call the user function with derivs set to
+null, so the user function must check before accessing this pointer.
+
+The 'derivs' parameter is an array of pointers, one pointer for each
+parameter. If derivs[i] is non-null, then derivatives are requested
+for the ith parameter. Note that if some parameters are fixed, or
+pars[i].side is not 3 for some parameters, then derivs[i] will be null
+and the derivatives do not need to be computed for those parameters.
+
+derivs[j][i] is the derivative of the ith deviate with respect to the
+jth parameter (for 0<=i<m, 0<=j<n). Storage has already been
+allocated for this array, and the user is not responsible for freeing
+it. The user function may assume that derivs[j][i] are initialized to
+zero.
+
+The function prototype for user-computed derivatives is:
+
+ int myfunct_with_derivs(int m, int n, double *p, double *deviates,
+ double **derivs, void *private)
+ {
+ int i;
+ /* Compute function deviates */
+ for (i=0; i<m; i++) {
+ deviates[i] = {function of x[i], p and private data};
+ }
+
+ /* If derivs is non-zero then user-computed derivatives are
+ requested */
+ if (derivs) {
+ int j;
+ for (j=0; j<n; j++) if (derivs[j]) {
+ /* It is only required to compute derivatives when
+ derivs[ipar] is non-null */
+ for (i=0; i<m; i++) {
+ derivs[j][i] = {derivative of the ith deviate with respect to
+ the jth parameter = d(deviates[i])/d(par[j])}
+ }
+ }
+ }
+
+ return 0;
+ }
+
+TESTING EXPLICIT DERIVATIVES
+
+In principle, the process of computing explicit derivatives should be
+straightforward. In practice, the computation can be error prone,
+often being wrong by a sign or a scale factor.
+
+In order to be sure that the explicit derivatives are correct, the
+user can set pars[i].deriv_debug = 1 for parameter i (see below for a
+description of the "pars" structure). This will cause mpfit() to
+print *both* explicit derivatives and numerical derivatives to the
+console so that the user can compare the results. This would
+typically be used during development and debugging to be sure the
+calculated derivatives are correct, than then deriv_debug would be set
+to zero for production use.
+
+If you want debugging derivatives, it is important to set pars[i].side
+to the kind of numerical derivative you want to compare with.
+pars[i].side should be set to 0, 1, -1, or 2, and *not* set to 3.
+When pars[i].deriv_debug is set, then mpfit() automatically
+understands to request user-computed derivatives.
+
+The console output will be sent to the standard output, and will
+appear as a block of ASCII text like this:
+ FJAC DEBUG BEGIN
+ # IPNT FUNC DERIV_U DERIV_N DIFF_ABS DIFF_REL
+ FJAC PARM 1
+ .... derivative data for parameter 1 ....
+ FJAC PARM 2
+ .... derivative data for parameter 2 ....
+ .... and so on ....
+ FJAC DEBUG END
+
+which is to say, debugging data will be bracketed by pairs of "FJAC
+DEBUG" BEGIN/END phrases. Derivative data for individual parameter i
+will be labeled by "FJAC PARM i". The columns are, in order,
+
+ IPNT - data point number j
+ FUNC - user function evaluated at X[j]
+ DERIV_U - user-calculated derivative d(FUNC(X[j]))/d(P[i])
+ DERIV_N - numerically calculated derivative according to pars[i].side value
+ DIFF_ABS - difference between DERIV_U and DERIV_N = fabs(DERIV_U-DERIV_N)
+ DIFF_REL - relative difference = fabs(DERIV_U-DERIV_N)/DERIV_U
+
+Since individual numerical derivative values may contain significant
+round-off errors, it is up to the user to critically compare DERIV_U
+and DERIV_N, using DIFF_ABS and DIFF_REL as a guide.
+
+
+CONSTRAINING PARAMETER VALUES WITH THE PARS PARAMETER
+
+The behavior of MPFIT can be modified with respect to each
+parameter to be fitted. A parameter value can be fixed; simple
+boundary constraints can be imposed; and limitations on the
+parameter changes can be imposed.
+
+If fitting constraints are to be supplied, then the user should pass
+an array of mp_par structures to mpfit() in the pars parameter. If
+pars is set to 0, then the fitting parameters are asssumed to be
+unconstrained.
+
+pars should be an array of structures, one for each parameter.
+Each parameter is associated with one element of the array, in
+numerical order. The structure is declared to have the following
+fields:
+
+ .fixed - a boolean value, whether the parameter is to be held
+ fixed or not. Fixed parameters are not varied by
+ MPFIT, but are passed on to MYFUNCT for evaluation.
+
+ .limited - a two-element boolean array. If the first/second
+ element is set, then the parameter is bounded on the
+ lower/upper side. A parameter can be bounded on both
+ sides.
+
+
+ .limits - a two-element float or double array. Gives the
+ parameter limits on the lower and upper sides,
+ respectively. Zero, one or two of these values can be
+ set, depending on the values of LIMITED.
+
+ .parname - a string, giving the name of the parameter. The
+ fitting code of MPFIT does not use this tag in any
+ way. However, it may be used for output purposes.
+
+ .step - the step size to be used in calculating the numerical
+ derivatives. If set to zero, then the step size is
+ computed automatically.
+ This value is superceded by the RELSTEP value.
+
+ .relstep - the *relative* step size to be used in calculating
+ the numerical derivatives. This number is the
+ fractional size of the step, compared to the
+ parameter value. This value supercedes the STEP
+ setting. If the parameter is zero, then a default
+ step size is chosen.
+
+ .side - the sidedness of the finite difference when computing
+ numerical derivatives. This field can take four
+ values:
+
+ 0 - one-sided derivative computed automatically
+ 1 - one-sided derivative (f(x+h) - f(x) )/h
+ -1 - one-sided derivative (f(x) - f(x-h))/h
+ 2 - two-sided derivative (f(x+h) - f(x-h))/(2*h)
+ 3 - user-computed explicit derivatives
+
+ Where H is the STEP parameter described above. The
+ "automatic" one-sided derivative method will chose a
+ direction for the finite difference which does not
+ violate any constraints. The other methods do not
+ perform this check. The two-sided method is in
+ principle more precise, but requires twice as many
+ function evaluations. Default: 0.
+
+ .deriv_debug - flag to enable/disable console debug logging of
+ user-computed derivatives, as described above. 1=enable
+ debugging; 0=disable debugging. Note that when
+ pars[i].deriv_debug is set, then pars[i].side should be
+ set to 0, 1, -1 or 2, depending on which numerical
+ derivative you wish to compare to.
+ Default: 0.
+
+RETURN VALUE of MPFIT()
+
+mpfit() returns an integer status code. In principle, any positive return
+value from mpfit() indicates success or partial success.
+
+The possible error codes are integer constants:
+MP_ERR_INPUT /* General input parameter error */
+MP_ERR_NAN /* User function produced non-finite values */
+MP_ERR_FUNC /* No user function was supplied */
+MP_ERR_NPOINTS /* No user data points were supplied */
+MP_ERR_NFREE /* No free parameters */
+MP_ERR_MEMORY /* Memory allocation error */
+MP_ERR_INITBOUNDS /* Initial values inconsistent w constraints*/
+MP_ERR_BOUNDS /* Initial constraints inconsistent */
+MP_ERR_PARAM /* General input parameter error */
+MP_ERR_DOF /* Not enough degrees of freedom */
+
+The possible success status codes are:
+MP_OK_CHI /* Convergence in chi-square value */
+MP_OK_PAR /* Convergence in parameter value */
+MP_OK_BOTH /* Both MP_OK_PAR and MP_OK_CHI hold */
+MP_OK_DIR /* Convergence in orthogonality */
+MP_MAXITER /* Maximum number of iterations reached */
+MP_FTOL /* ftol is too small; no further improvement*/
+MP_XTOL /* xtol is too small; no further improvement*/
+MP_GTOL /* gtol is too small; no further improvement*/
+
+
+CONFIGURING MPFIT()
+
+mpfit() is primarily configured using the config parameter, which in
+turn is defined as a pointer to the following structure:
+
+ struct mp_config_struct {
+ /* NOTE: the user may set the value explicitly; OR, if the passed
+ value is zero, then the "Default" value will be substituted by
+ mpfit(). */
+ double ftol; /* Relative chi-square convergence criterium Default: 1e-10 */
+ double xtol; /* Relative parameter convergence criterium Default: 1e-10 */
+ double gtol; /* Orthogonality convergence criterium Default: 1e-10 */
+ double epsfcn; /* Finite derivative step size Default: MP_MACHEP0 */
+ double stepfactor; /* Initial step bound Default: 100.0 */
+ double covtol; /* Range tolerance for covariance calculation Default: 1e-14 */
+ int maxiter; /* Maximum number of iterations. If maxiter == MP_NO_ITER,
+ then basic error checking is done, and parameter
+ errors/covariances are estimated based on input
+ parameter values, but no fitting iterations are done.
+ Default: 200
+ */
+ int maxfev; /* Maximum number of function evaluations, or 0 for no limit
+ Default: 0 (no limit) */
+ int nprint; /* Default: 1 */
+ int douserscale;/* Scale variables by user values?
+ 1 = yes, user scale values in diag;
+ 0 = no, variables scaled internally (Default) */
+ int nofinitecheck; /* Disable check for infinite quantities from user?
+ 0 = do not perform check (Default)
+ 1 = perform check
+ */
+ mp_iterproc iterproc; /* Placeholder pointer - must set to 0 */
+ };
+
+Generally speaking, a value of 0 for a field in the structure above
+indicates that a default value should be used, indicated in
+parentheses. Therefore, a user should zero this structure before
+passing it.
+
+
+EXTRACTING RESULTS FROM MPFIT()
+
+The basic result of mpfit() is the set of updated parameters, xall.
+However, there are other auxiliary quantities that the user can
+extract by using the results parameter. This is a structure defined
+like this:
+
+ /* Definition of results structure, for when fit completes */
+ struct mp_result_struct {
+ double bestnorm; /* Final chi^2 */
+ double orignorm; /* Starting value of chi^2 */
+ int niter; /* Number of iterations */
+ int nfev; /* Number of function evaluations */
+ int status; /* Fitting status code */
+
+ int npar; /* Total number of parameters */
+ int nfree; /* Number of free parameters */
+ int npegged; /* Number of pegged parameters */
+ int nfunc; /* Number of residuals (= num. of data points) */
+
+ double *resid; /* Final residuals
+ nfunc-vector, or 0 if not desired */
+ double *xerror; /* Final parameter uncertainties (1-sigma)
+ npar-vector, or 0 if not desired */
+ double *covar; /* Final parameter covariance matrix
+ npar x npar array, or 0 if not desired */
+ char version[20]; /* MPFIT version string */
+ };
+
+All of the scalar numeric quantities are filled when mpfit() returns,
+and any incoming value will be overwritten.
+
+If the user would like the final residuals, parameter errors, or the
+covariance matrix, then they should allocate the required storage, and
+pass a pointer to it in the corresponding structure field. A pointer
+value of zero indicates that the array should not be returned. Thus,
+by default, the user should zero this structure before passing it.
+The user is responsible for allocating and freeing resid, xerror and
+covar storage.
+
+The 'version' string can be used to interpret the behavior of MPFIT,
+in case the behavior changes over time. Version numbers will be of
+the form "i.j" or "i.j.k" where i, j and k are integers.
+
+
+EXAMPLE 2 - Basic call
+
+This example does the basics, which is to perform the optimization
+with no constraints, and returns the scalar results in the result
+structure.
+
+ mp_result result;
+
+ memset(&result, 0, sizeof(result));
+ status = mpfit(myfunct, m, n, p, 0, 0, 0, &result);
+
+
+EXAMPLE 3 - Requesting Parameter Errors
+
+The only modification needed to retrieve parameter uncertainties is to
+allocate storage for it, and place a pointer to it in result.
+
+ mp_result result;
+ double perror[n];
+
+ memset(&result, 0, sizeof(result));
+ result.xerror = perror;
+ status = mpfit(myfunct, m, n, p, 0, 0, 0, &result);
+
+
+EXAMPLE 4 - Fixing a parameter
+
+This example fixes the third parameter (i.e. p[2]) at its starting
+value
+
+ mp_result result;
+ mp_par pars[n];
+
+ memset(&pars[0], 0, sizeof(pars));
+ pars[2].fixed = 1;
+
+ memset(&result, 0, sizeof(result));
+ status = mpfit(myfunct, m, n, p, pars, 0, 0, &result);
+
+EXAMPLE 5 - Applying parameter constraints
+
+This example ensures that the third parameter (i.e. p[2]) is always
+greater or equal to ten.
+
+ mp_result result;
+ mp_par pars[n];
+
+ memset(&pars[0], 0, sizeof(pars));
+ pars[2].limited[0] = 1; /* limited[0] indicates lower bound */
+ pars[2].limits[0] = 10.0; /* Actual constraint p[2] >= 10 */
+
+ memset(&result, 0, sizeof(result));
+ status = mpfit(myfunct, m, n, p, pars, 0, 0, &result);
+
+
+EXAMPLE 6 - Increasing maximum number of iterations
+
+This example changes the maximum number of iterations from its default
+to 1000.
+
+ mp_config config;
+ mp_result result;
+
+ memset(&config, 0, sizeof(config));
+ config.maxiter = 1000;
+ memset(&result, 0, sizeof(result));
+ status = mpfit(myfunct, m, n, p, 0, &config, 0, &result);
+
+
+EXAMPLE 7 - Passing private data to user function
+
+This example passes "private" data to its user function using the
+private parameter. It assumes that three variables, x, y, and ey,
+already exist, and that the user function will know what to do with
+them.
+
+ mp_result result;
+ struct mystruct {
+ double *x;
+ double *y;
+ double *ey;
+ };
+
+ struct mystruct mydata;
+
+ mydata.x = x;
+ mydata.y = y;
+ mydata.ey = ey;
+
+ memset(&result, 0, sizeof(result));
+ status = mpfit(myfunct, m, n, p, 0, 0, (void *)&mydata, &result);
+
+
+EXAMPLE 8 - Complete example
+
+
+This example shows how to fit sample data to a linear function
+y = f(x) = a - b*x, where a and b are fitted parameters. There is sample
+data included within the program. The parameters are initialized to
+a = 1.0, b = 1.0, and then the fitting occurs. The function
+residuals; within linfunc(), the value of f(x) is computed. The main
+routine, <b>main()</b> initializes the variables and calls mpfit().
+
+
+#include "mpfit.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <math.h>
+
+/* This is the private data structure which contains the data points
+ and their uncertainties */
+struct vars_struct {
+ double *x;
+ double *y;
+ double *ey;
+};
+
+/*
+ * linear fit function
+ *
+ * m - number of data points
+ * n - number of parameters (2)
+ * p - array of fit parameters
+ * dy - array of residuals to be returned
+ * vars - private data (struct vars_struct *)
+ *
+ * RETURNS: error code (0 = success)
+ */
+int linfunc(int m, int n, double *p, double *dy, double **dvec, void *vars)
+{
+ int i;
+ struct vars_struct *v = (struct vars_struct *) vars;
+ double *x, *y, *ey, f;
+
+ x = v->x;
+ y = v->y;
+ ey = v->ey;
+
+ for (i=0; i<m; i++) {
+ f = p[0] - p[1]*x[i]; /* Linear fit function; note f = a - b*x */
+ dy[i] = (y[i] - f)/ey[i];
+ }
+
+ return 0;
+}
+
+/* Test harness routine, which contains test data, invokes mpfit() */
+int main(int argc, char *argv[])
+{
+ /* X - independent variable */
+ double x[] = {-1.7237128E+00,1.8712276E+00,-9.6608055E-01,
+ -2.8394297E-01,1.3416969E+00,1.3757038E+00,
+ -1.3703436E+00,4.2581975E-02,-1.4970151E-01,
+ 8.2065094E-01};
+ /* Y - measured value of dependent quantity */
+ double y[] = {1.9000429E-01,6.5807428E+00,1.4582725E+00,
+ 2.7270851E+00,5.5969253E+00,5.6249280E+00,
+ 0.787615,3.2599759E+00,2.9771762E+00,
+ 4.5936475E+00};
+ double ey[10]; /* Measurement uncertainty - initialized below */
+
+ double p[2] = {1.0, 1.0}; /* Initial conditions */
+ double pactual[2] = {3.20, 1.78}; /* Actual values used to make data */
+ double perror[2]; /* Returned parameter errors */
+ int i;
+ struct vars_struct v; /* Private data structure */
+ int status;
+ mp_result result;
+
+ memset(&result,0,sizeof(result)); /* Zero results structure */
+ result.xerror = perror;
+ for (i=0; i<10; i++) ey[i] = 0.07; /* Data errors */
+
+ /* Fill private data structure */
+ v.x = x;
+ v.y = y;
+ v.ey = ey;
+
+ /* Call fitting function for 10 data points and 2 parameters */
+ status = mpfit(linfunc, 10, 2, p, 0, 0, (void *) &v, &result);
+
+ printf("*** testlinfit status = %d\n", status);
+ /* ... print or use the results of the fitted parametres p[] here! ... */
+
+ return 0;
+}
+
+
diff --git a/be/src/thirdparty/mpfit/mpfit.c b/be/src/thirdparty/mpfit/mpfit.c
new file mode 100644
index 0000000..1ab9abb
--- /dev/null
+++ b/be/src/thirdparty/mpfit/mpfit.c
@@ -0,0 +1,2305 @@
+/*
+ * MINPACK-1 Least Squares Fitting Library
+ *
+ * Original public domain version by B. Garbow, K. Hillstrom, J. More'
+ * (Argonne National Laboratory, MINPACK project, March 1980)
+ * See the file DISCLAIMER for copyright information.
+ *
+ * Tranlation to C Language by S. Moshier (moshier.net)
+ *
+ * Enhancements and packaging by C. Markwardt
+ * (comparable to IDL fitting routine MPFIT
+ * see http://cow.physics.wisc.edu/~craigm/idl/idl.html)
+ */
+
+/* Main mpfit library routines (double precision)
+ $Id: mpfit.c,v 1.24 2013/04/23 18:37:38 craigm Exp $
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <math.h>
+#include <string.h>
+#include "mpfit.h"
+
+/* Forward declarations of functions in this module */
+static int mp_fdjac2(mp_func funct,
+ int m, int n, int *ifree, int npar, double *x, double *fvec,
+ double *fjac, int ldfjac, double epsfcn,
+ double *wa, void *priv, int *nfev,
+ double *step, double *dstep, int *dside,
+ int *qulimited, double *ulimit,
+ int *ddebug, double *ddrtol, double *ddatol,
+ double *wa2, double **dvecptr);
+static void mp_qrfac(int m, int n, double *a, int lda,
+ int pivot, int *ipvt, int lipvt,
+ double *rdiag, double *acnorm, double *wa);
+static void mp_qrsolv(int n, double *r, int ldr, int *ipvt, double *diag,
+ double *qtb, double *x, double *sdiag, double *wa);
+static void mp_lmpar(int n, double *r, int ldr, int *ipvt, int *ifree, double *diag,
+ double *qtb, double delta, double *par, double *x,
+ double *sdiag, double *wa1, double *wa2);
+static double mp_enorm(int n, double *x);
+static double mp_dmax1(double a, double b);
+static double mp_dmin1(double a, double b);
+static int mp_min0(int a, int b);
+static int mp_covar(int n, double *r, int ldr, int *ipvt, double tol, double *wa);
+
+/* Macro to call user function */
+#define mp_call(funct, m, n, x, fvec, dvec, priv) (*(funct))(m,n,x,fvec,dvec,priv)
+
+/* Macro to safely allocate memory */
+#define mp_malloc(dest,type,size) \
+ dest = (type *) malloc( sizeof(type)*size ); \
+ if (dest == 0) { \
+ info = MP_ERR_MEMORY; \
+ goto CLEANUP; \
+ } else { \
+ int _k; \
+ for (_k=0; _k<(size); _k++) dest[_k] = 0; \
+ }
+
+/*
+* **********
+*
+* subroutine mpfit
+*
+* the purpose of mpfit is to minimize the sum of the squares of
+* m nonlinear functions in n variables by a modification of
+* the levenberg-marquardt algorithm. the user must provide a
+* subroutine which calculates the functions. the jacobian is
+* then calculated by a finite-difference approximation.
+*
+* mp_funct funct - function to be minimized
+* int m - number of data points
+* int npar - number of fit parameters
+* double *xall - array of n initial parameter values
+* upon return, contains adjusted parameter values
+* mp_par *pars - array of npar structures specifying constraints;
+* or 0 (null pointer) for unconstrained fitting
+* [ see README and mpfit.h for definition & use of mp_par]
+* mp_config *config - pointer to structure which specifies the
+* configuration of mpfit(); or 0 (null pointer)
+* if the default configuration is to be used.
+* See README and mpfit.h for definition and use
+* of config.
+* void *private - any private user data which is to be passed directly
+* to funct without modification by mpfit().
+* mp_result *result - pointer to structure, which upon return, contains
+* the results of the fit. The user should zero this
+* structure. If any of the array values are to be
+* returned, the user should allocate storage for them
+* and assign the corresponding pointer in *result.
+* Upon return, *result will be updated, and
+* any of the non-null arrays will be filled.
+*
+*
+* FORTRAN DOCUMENTATION BELOW
+*
+*
+* the subroutine statement is
+*
+* subroutine lmdif(fcn,m,n,x,fvec,ftol,xtol,gtol,maxfev,epsfcn,
+* diag,mode,factor,nprint,info,nfev,fjac,
+* ldfjac,ipvt,qtf,wa1,wa2,wa3,wa4)
+*
+* where
+*
+* fcn is the name of the user-supplied subroutine which
+* calculates the functions. fcn must be declared
+* in an external statement in the user calling
+* program, and should be written as follows.
+*
+* subroutine fcn(m,n,x,fvec,iflag)
+* integer m,n,iflag
+* double precision x(n),fvec(m)
+* ----------
+* calculate the functions at x and
+* return this vector in fvec.
+* ----------
+* return
+* end
+*
+* the value of iflag should not be changed by fcn unless
+* the user wants to terminate execution of lmdif.
+* in this case set iflag to a negative integer.
+*
+* m is a positive integer input variable set to the number
+* of functions.
+*
+* n is a positive integer input variable set to the number
+* of variables. n must not exceed m.
+*
+* x is an array of length n. on input x must contain
+* an initial estimate of the solution vector. on output x
+* contains the final estimate of the solution vector.
+*
+* fvec is an output array of length m which contains
+* the functions evaluated at the output x.
+*
+* ftol is a nonnegative input variable. termination
+* occurs when both the actual and predicted relative
+* reductions in the sum of squares are at most ftol.
+* therefore, ftol measures the relative error desired
+* in the sum of squares.
+*
+* xtol is a nonnegative input variable. termination
+* occurs when the relative error between two consecutive
+* iterates is at most xtol. therefore, xtol measures the
+* relative error desired in the approximate solution.
+*
+* gtol is a nonnegative input variable. termination
+* occurs when the cosine of the angle between fvec and
+* any column of the jacobian is at most gtol in absolute
+* value. therefore, gtol measures the orthogonality
+* desired between the function vector and the columns
+* of the jacobian.
+*
+* maxfev is a positive integer input variable. termination
+* occurs when the number of calls to fcn is at least
+* maxfev by the end of an iteration.
+*
+* epsfcn is an input variable used in determining a suitable
+* step length for the forward-difference approximation. this
+* approximation assumes that the relative errors in the
+* functions are of the order of epsfcn. if epsfcn is less
+* than the machine precision, it is assumed that the relative
+* errors in the functions are of the order of the machine
+* precision.
+*
+* diag is an array of length n. if mode = 1 (see
+* below), diag is internally set. if mode = 2, diag
+* must contain positive entries that serve as
+* multiplicative scale factors for the variables.
+*
+* mode is an integer input variable. if mode = 1, the
+* variables will be scaled internally. if mode = 2,
+* the scaling is specified by the input diag. other
+* values of mode are equivalent to mode = 1.
+*
+* factor is a positive input variable used in determining the
+* initial step bound. this bound is set to the product of
+* factor and the euclidean norm of diag*x if nonzero, or else
+* to factor itself. in most cases factor should lie in the
+* interval (.1,100.). 100. is a generally recommended value.
+*
+* nprint is an integer input variable that enables controlled
+* printing of iterates if it is positive. in this case,
+* fcn is called with iflag = 0 at the beginning of the first
+* iteration and every nprint iterations thereafter and
+* immediately prior to return, with x and fvec available
+* for printing. if nprint is not positive, no special calls
+* of fcn with iflag = 0 are made.
+*
+* info is an integer output variable. if the user has
+* terminated execution, info is set to the (negative)
+* value of iflag. see description of fcn. otherwise,
+* info is set as follows.
+*
+* info = 0 improper input parameters.
+*
+* info = 1 both actual and predicted relative reductions
+* in the sum of squares are at most ftol.
+*
+* info = 2 relative error between two consecutive iterates
+* is at most xtol.
+*
+* info = 3 conditions for info = 1 and info = 2 both hold.
+*
+* info = 4 the cosine of the angle between fvec and any
+* column of the jacobian is at most gtol in
+* absolute value.
+*
+* info = 5 number of calls to fcn has reached or
+* exceeded maxfev.
+*
+* info = 6 ftol is too small. no further reduction in
+* the sum of squares is possible.
+*
+* info = 7 xtol is too small. no further improvement in
+* the approximate solution x is possible.
+*
+* info = 8 gtol is too small. fvec is orthogonal to the
+* columns of the jacobian to machine precision.
+*
+* nfev is an integer output variable set to the number of
+* calls to fcn.
+*
+* fjac is an output m by n array. the upper n by n submatrix
+* of fjac contains an upper triangular matrix r with
+* diagonal elements of nonincreasing magnitude such that
+*
+* t t t
+* p *(jac *jac)*p = r *r,
+*
+* where p is a permutation matrix and jac is the final
+* calculated jacobian. column j of p is column ipvt(j)
+* (see below) of the identity matrix. the lower trapezoidal
+* part of fjac contains information generated during
+* the computation of r.
+*
+* ldfjac is a positive integer input variable not less than m
+* which specifies the leading dimension of the array fjac.
+*
+* ipvt is an integer output array of length n. ipvt
+* defines a permutation matrix p such that jac*p = q*r,
+* where jac is the final calculated jacobian, q is
+* orthogonal (not stored), and r is upper triangular
+* with diagonal elements of nonincreasing magnitude.
+* column j of p is column ipvt(j) of the identity matrix.
+*
+* qtf is an output array of length n which contains
+* the first n elements of the vector (q transpose)*fvec.
+*
+* wa1, wa2, and wa3 are work arrays of length n.
+*
+* wa4 is a work array of length m.
+*
+* subprograms called
+*
+* user-supplied ...... fcn
+*
+* minpack-supplied ... dpmpar,enorm,fdjac2,lmpar,qrfac
+*
+* fortran-supplied ... dabs,dmax1,dmin1,dsqrt,mod
+*
+* argonne national laboratory. minpack project. march 1980.
+* burton s. garbow, kenneth e. hillstrom, jorge j. more
+*
+* ********** */
+
+
+int mpfit(mp_func funct, int m, int npar,
+ double *xall, mp_par *pars, mp_config *config, void *private_data,
+ mp_result *result)
+{
+ mp_config conf;
+ int i, j, info, iflag, nfree, npegged, iter;
+ int qanylim = 0;
+
+ int ij,jj,l;
+ double actred,delta,dirder,fnorm,fnorm1,gnorm, orignorm;
+ double par,pnorm,prered,ratio;
+ double sum,temp,temp1,temp2,temp3,xnorm, alpha;
+ static double one = 1.0;
+ static double p1 = 0.1;
+ static double p5 = 0.5;
+ static double p25 = 0.25;
+ static double p75 = 0.75;
+ static double p0001 = 1.0e-4;
+ static double zero = 0.0;
+ int nfev = 0;
+
+ double *step = 0, *dstep = 0, *llim = 0, *ulim = 0;
+ int *pfixed = 0, *mpside = 0, *ifree = 0, *qllim = 0, *qulim = 0;
+ int *ddebug = 0;
+ double *ddrtol = 0, *ddatol = 0;
+
+ double *fvec = 0, *qtf = 0;
+ double *x = 0, *xnew = 0, *fjac = 0, *diag = 0;
+ double *wa1 = 0, *wa2 = 0, *wa3 = 0, *wa4 = 0;
+ double **dvecptr = 0;
+ int *ipvt = 0;
+
+ int ldfjac;
+
+ /* Default configuration */
+ conf.ftol = 1e-10;
+ conf.xtol = 1e-10;
+ conf.gtol = 1e-10;
+ conf.stepfactor = 100.0;
+ conf.nprint = 1;
+ conf.epsfcn = MP_MACHEP0;
+ conf.maxiter = 200;
+ conf.douserscale = 0;
+ conf.maxfev = 0;
+ conf.covtol = 1e-14;
+ conf.nofinitecheck = 0;
+
+ if (config) {
+ /* Transfer any user-specified configurations */
+ if (config->ftol > 0) conf.ftol = config->ftol;
+ if (config->xtol > 0) conf.xtol = config->xtol;
+ if (config->gtol > 0) conf.gtol = config->gtol;
+ if (config->stepfactor > 0) conf.stepfactor = config->stepfactor;
+ if (config->nprint >= 0) conf.nprint = config->nprint;
+ if (config->epsfcn > 0) conf.epsfcn = config->epsfcn;
+ if (config->maxiter > 0) conf.maxiter = config->maxiter;
+ if (config->maxiter == MP_NO_ITER) conf.maxiter = 0;
+ if (config->douserscale != 0) conf.douserscale = config->douserscale;
+ if (config->covtol > 0) conf.covtol = config->covtol;
+ if (config->nofinitecheck > 0) conf.nofinitecheck = config->nofinitecheck;
+ conf.maxfev = config->maxfev;
+ }
+
+ info = MP_ERR_INPUT; /* = 0 */
+ iflag = 0;
+ nfree = 0;
+ npegged = 0;
+
+ /* Basic error checking */
+ if (funct == 0) {
+ return MP_ERR_FUNC;
+ }
+
+ if ((m <= 0) || (xall == 0)) {
+ return MP_ERR_NPOINTS;
+ }
+
+ if (npar <= 0) {
+ return MP_ERR_NFREE;
+ }
+
+ fnorm = -1.0;
+ fnorm1 = -1.0;
+ xnorm = -1.0;
+ delta = 0.0;
+
+ /* FIXED parameters? */
+ mp_malloc(pfixed, int, npar);
+ if (pars) for (i=0; i<npar; i++) {
+ pfixed[i] = (pars[i].fixed)?1:0;
+ }
+
+ /* Finite differencing step, absolute and relative, and sidedness of deriv */
+ mp_malloc(step, double, npar);
+ mp_malloc(dstep, double, npar);
+ mp_malloc(mpside, int, npar);
+ mp_malloc(ddebug, int, npar);
+ mp_malloc(ddrtol, double, npar);
+ mp_malloc(ddatol, double, npar);
+ if (pars) for (i=0; i<npar; i++) {
+ step[i] = pars[i].step;
+ dstep[i] = pars[i].relstep;
+ mpside[i] = pars[i].side;
+ ddebug[i] = pars[i].deriv_debug;
+ ddrtol[i] = pars[i].deriv_reltol;
+ ddatol[i] = pars[i].deriv_abstol;
+ }
+
+ /* Finish up the free parameters */
+ nfree = 0;
+ mp_malloc(ifree, int, npar);
+ for (i=0, j=0; i<npar; i++) {
+ if (pfixed[i] == 0) {
+ nfree++;
+ ifree[j++] = i;
+ }
+ }
+ if (nfree == 0) {
+ info = MP_ERR_NFREE;
+ goto CLEANUP;
+ }
+
+ if (pars) {
+ for (i=0; i<npar; i++) {
+ if ( (pars[i].limited[0] && (xall[i] < pars[i].limits[0])) ||
+ (pars[i].limited[1] && (xall[i] > pars[i].limits[1])) ) {
+ info = MP_ERR_INITBOUNDS;
+ goto CLEANUP;
+ }
+ if ( (pars[i].fixed == 0) && pars[i].limited[0] && pars[i].limited[1] &&
+ (pars[i].limits[0] >= pars[i].limits[1])) {
+ info = MP_ERR_BOUNDS;
+ goto CLEANUP;
+ }
+ }
+
+ mp_malloc(qulim, int, nfree);
+ mp_malloc(qllim, int, nfree);
+ mp_malloc(ulim, double, nfree);
+ mp_malloc(llim, double, nfree);
+
+ for (i=0; i<nfree; i++) {
+ qllim[i] = pars[ifree[i]].limited[0];
+ qulim[i] = pars[ifree[i]].limited[1];
+ llim[i] = pars[ifree[i]].limits[0];
+ ulim[i] = pars[ifree[i]].limits[1];
+ if (qllim[i] || qulim[i]) qanylim = 1;
+ }
+ }
+
+ /* Sanity checking on input configuration */
+ if ((npar <= 0) || (conf.ftol <= 0) || (conf.xtol <= 0) ||
+ (conf.gtol <= 0) || (conf.maxiter < 0) ||
+ (conf.stepfactor <= 0)) {
+ info = MP_ERR_PARAM;
+ goto CLEANUP;
+ }
+
+ /* Ensure there are some degrees of freedom */
+ if (m < nfree) {
+ info = MP_ERR_DOF;
+ goto CLEANUP;
+ }
+
+ /* Allocate temporary storage */
+ mp_malloc(fvec, double, m);
+ mp_malloc(qtf, double, nfree);
+ mp_malloc(x, double, nfree);
+ mp_malloc(xnew, double, npar);
+ mp_malloc(fjac, double, m*nfree);
+ ldfjac = m;
+ mp_malloc(diag, double, npar);
+ mp_malloc(wa1, double, npar);
+ mp_malloc(wa2, double, npar);
+ mp_malloc(wa3, double, npar);
+ mp_malloc(wa4, double, m);
+ mp_malloc(ipvt, int, npar);
+ mp_malloc(dvecptr, double *, npar);
+
+ /* Evaluate user function with initial parameter values */
+ iflag = mp_call(funct, m, npar, xall, fvec, 0, private_data);
+ nfev += 1;
+ if (iflag < 0) {
+ goto CLEANUP;
+ }
+
+ fnorm = mp_enorm(m, fvec);
+ orignorm = fnorm*fnorm;
+
+ /* Make a new copy */
+ for (i=0; i<npar; i++) {
+ xnew[i] = xall[i];
+ }
+
+ /* Transfer free parameters to 'x' */
+ for (i=0; i<nfree; i++) {
+ x[i] = xall[ifree[i]];
+ }
+
+ /* Initialize Levelberg-Marquardt parameter and iteration counter */
+
+ par = 0.0;
+ iter = 1;
+ for (i=0; i<nfree; i++) {
+ qtf[i] = 0;
+ }
+
+ /* Beginning of the outer loop */
+ OUTER_LOOP:
+ for (i=0; i<nfree; i++) {
+ xnew[ifree[i]] = x[i];
+ }
+
+ /* XXX call iterproc */
+
+ /* Calculate the jacobian matrix */
+ iflag = mp_fdjac2(funct, m, nfree, ifree, npar, xnew, fvec, fjac, ldfjac,
+ conf.epsfcn, wa4, private_data, &nfev,
+ step, dstep, mpside, qulim, ulim,
+ ddebug, ddrtol, ddatol, wa2, dvecptr);
+ if (iflag < 0) {
+ goto CLEANUP;
+ }
+
+ /* Determine if any of the parameters are pegged at the limits */
+ if (qanylim) {
+ for (j=0; j<nfree; j++) {
+ int lpegged = (qllim[j] && (x[j] == llim[j]));
+ int upegged = (qulim[j] && (x[j] == ulim[j]));
+ sum = 0;
+
+ /* If the parameter is pegged at a limit, compute the gradient
+ direction */
+ if (lpegged || upegged) {
+ ij = j*ldfjac;
+ for (i=0; i<m; i++, ij++) {
+ sum += fvec[i] * fjac[ij];
+ }
+ }
+ /* If pegged at lower limit and gradient is toward negative then
+ reset gradient to zero */
+ if (lpegged && (sum > 0)) {
+ ij = j*ldfjac;
+ for (i=0; i<m; i++, ij++) fjac[ij] = 0;
+ }
+ /* If pegged at upper limit and gradient is toward positive then
+ reset gradient to zero */
+ if (upegged && (sum < 0)) {
+ ij = j*ldfjac;
+ for (i=0; i<m; i++, ij++) fjac[ij] = 0;
+ }
+ }
+ }
+
+ /* Compute the QR factorization of the jacobian */
+ mp_qrfac(m,nfree,fjac,ldfjac,1,ipvt,nfree,wa1,wa2,wa3);
+
+ /*
+ * on the first iteration and if mode is 1, scale according
+ * to the norms of the columns of the initial jacobian.
+ */
+ if (iter == 1) {
+ if (conf.douserscale == 0) {
+ for (j=0; j<nfree; j++) {
+ diag[ifree[j]] = wa2[j];
+ if (wa2[j] == zero ) {
+ diag[ifree[j]] = one;
+ }
+ }
+ }
+
+ /*
+ * on the first iteration, calculate the norm of the scaled x
+ * and initialize the step bound delta.
+ */
+ for (j=0; j<nfree; j++ ) {
+ wa3[j] = diag[ifree[j]] * x[j];
+ }
+
+ xnorm = mp_enorm(nfree, wa3);
+ delta = conf.stepfactor*xnorm;
+ if (delta == zero) delta = conf.stepfactor;
+ }
+
+ /*
+ * form (q transpose)*fvec and store the first n components in
+ * qtf.
+ */
+ for (i=0; i<m; i++ ) {
+ wa4[i] = fvec[i];
+ }
+
+ jj = 0;
+ for (j=0; j<nfree; j++ ) {
+ temp3 = fjac[jj];
+ if (temp3 != zero) {
+ sum = zero;
+ ij = jj;
+ for (i=j; i<m; i++ ) {
+ sum += fjac[ij] * wa4[i];
+ ij += 1; /* fjac[i+m*j] */
+ }
+ temp = -sum / temp3;
+ ij = jj;
+ for (i=j; i<m; i++ ) {
+ wa4[i] += fjac[ij] * temp;
+ ij += 1; /* fjac[i+m*j] */
+ }
+ }
+ fjac[jj] = wa1[j];
+ jj += m+1; /* fjac[j+m*j] */
+ qtf[j] = wa4[j];
+ }
+
+ /* ( From this point on, only the square matrix, consisting of the
+ triangle of R, is needed.) */
+
+
+ if (conf.nofinitecheck) {
+ /* Check for overflow. This should be a cheap test here since FJAC
+ has been reduced to a (small) square matrix, and the test is
+ O(N^2). */
+ int off = 0, nonfinite = 0;
+
+ for (j=0; j<nfree; j++) {
+ for (i=0; i<nfree; i++) {
+ if (mpfinite(fjac[off+i]) == 0) nonfinite = 1;
+ }
+ off += ldfjac;
+ }
+
+ if (nonfinite) {
+ info = MP_ERR_NAN;
+ goto CLEANUP;
+ }
+ }
+
+
+ /*
+ * compute the norm of the scaled gradient.
+ */
+ gnorm = zero;
+ if (fnorm != zero) {
+ jj = 0;
+ for (j=0; j<nfree; j++ ) {
+ l = ipvt[j];
+ if (wa2[l] != zero) {
+ sum = zero;
+ ij = jj;
+ for (i=0; i<=j; i++ ) {
+ sum += fjac[ij]*(qtf[i]/fnorm);
+ ij += 1; /* fjac[i+m*j] */
+ }
+ gnorm = mp_dmax1(gnorm,fabs(sum/wa2[l]));
+ }
+ jj += m;
+ }
+ }
+
+ /*
+ * test for convergence of the gradient norm.
+ */
+ if (gnorm <= conf.gtol) info = MP_OK_DIR;
+ if (info != 0) goto L300;
+ if (conf.maxiter == 0) {
+ info = MP_MAXITER;
+ goto L300;
+ }
+
+ /*
+ * rescale if necessary.
+ */
+ if (conf.douserscale == 0) {
+ for (j=0; j<nfree; j++ ) {
+ diag[ifree[j]] = mp_dmax1(diag[ifree[j]],wa2[j]);
+ }
+ }
+
+ /*
+ * beginning of the inner loop.
+ */
+ L200:
+ /*
+ * determine the levenberg-marquardt parameter.
+ */
+ mp_lmpar(nfree,fjac,ldfjac,ipvt,ifree,diag,qtf,delta,&par,wa1,wa2,wa3,wa4);
+ /*
+ * store the direction p and x + p. calculate the norm of p.
+ */
+ for (j=0; j<nfree; j++ ) {
+ wa1[j] = -wa1[j];
+ }
+
+ alpha = 1.0;
+ if (qanylim == 0) {
+ /* No parameter limits, so just move to new position WA2 */
+ for (j=0; j<nfree; j++ ) {
+ wa2[j] = x[j] + wa1[j];
+ }
+
+ } else {
+ /* Respect the limits. If a step were to go out of bounds, then
+ * we should take a step in the same direction but shorter distance.
+ * The step should take us right to the limit in that case.
+ */
+ for (j=0; j<nfree; j++) {
+ int lpegged = (qllim[j] && (x[j] <= llim[j]));
+ int upegged = (qulim[j] && (x[j] >= ulim[j]));
+ int dwa1 = fabs(wa1[j]) > MP_MACHEP0;
+
+ if (lpegged && (wa1[j] < 0)) wa1[j] = 0;
+ if (upegged && (wa1[j] > 0)) wa1[j] = 0;
+
+ if (dwa1 && qllim[j] && ((x[j] + wa1[j]) < llim[j])) {
+ alpha = mp_dmin1(alpha, (llim[j]-x[j])/wa1[j]);
+ }
+ if (dwa1 && qulim[j] && ((x[j] + wa1[j]) > ulim[j])) {
+ alpha = mp_dmin1(alpha, (ulim[j]-x[j])/wa1[j]);
+ }
+ }
+
+ /* Scale the resulting vector, advance to the next position */
+ for (j=0; j<nfree; j++) {
+ double sgnu, sgnl;
+ double ulim1, llim1;
+
+ wa1[j] = wa1[j] * alpha;
+ wa2[j] = x[j] + wa1[j];
+
+ /* Adjust the output values. If the step put us exactly
+ * on a boundary, make sure it is exact.
+ */
+ sgnu = (ulim[j] >= 0) ? (+1) : (-1);
+ sgnl = (llim[j] >= 0) ? (+1) : (-1);
+ ulim1 = ulim[j]*(1-sgnu*MP_MACHEP0) - ((ulim[j] == 0)?(MP_MACHEP0):0);
+ llim1 = llim[j]*(1+sgnl*MP_MACHEP0) + ((llim[j] == 0)?(MP_MACHEP0):0);
+
+ if (qulim[j] && (wa2[j] >= ulim1)) {
+ wa2[j] = ulim[j];
+ }
+ if (qllim[j] && (wa2[j] <= llim1)) {
+ wa2[j] = llim[j];
+ }
+ }
+
+ }
+
+ for (j=0; j<nfree; j++ ) {
+ wa3[j] = diag[ifree[j]]*wa1[j];
+ }
+
+ pnorm = mp_enorm(nfree,wa3);
+
+ /*
+ * on the first iteration, adjust the initial step bound.
+ */
+ if (iter == 1) {
+ delta = mp_dmin1(delta,pnorm);
+ }
+
+ /*
+ * evaluate the function at x + p and calculate its norm.
+ */
+ for (i=0; i<nfree; i++) {
+ xnew[ifree[i]] = wa2[i];
+ }
+
+ iflag = mp_call(funct, m, npar, xnew, wa4, 0, private_data);
+ nfev += 1;
+ if (iflag < 0) goto L300;
+
+ fnorm1 = mp_enorm(m,wa4);
+
+ /*
+ * compute the scaled actual reduction.
+ */
+ actred = -one;
+ if ((p1*fnorm1) < fnorm) {
+ temp = fnorm1/fnorm;
+ actred = one - temp * temp;
+ }
+
+ /*
+ * compute the scaled predicted reduction and
+ * the scaled directional derivative.
+ */
+ jj = 0;
+ for (j=0; j<nfree; j++ ) {
+ wa3[j] = zero;
+ l = ipvt[j];
+ temp = wa1[l];
+ ij = jj;
+ for (i=0; i<=j; i++ ) {
+ wa3[i] += fjac[ij]*temp;
+ ij += 1; /* fjac[i+m*j] */
+ }
+ jj += m;
+ }
+
+ /* Remember, alpha is the fraction of the full LM step actually
+ * taken
+ */
+
+ temp1 = mp_enorm(nfree,wa3)*alpha/fnorm;
+ temp2 = (sqrt(alpha*par)*pnorm)/fnorm;
+ prered = temp1*temp1 + (temp2*temp2)/p5;
+ dirder = -(temp1*temp1 + temp2*temp2);
+
+ /*
+ * compute the ratio of the actual to the predicted
+ * reduction.
+ */
+ ratio = zero;
+ if (prered != zero) {
+ ratio = actred/prered;
+ }
+
+ /*
+ * update the step bound.
+ */
+
+ if (ratio <= p25) {
+ if (actred >= zero) {
+ temp = p5;
+ } else {
+ temp = p5*dirder/(dirder + p5*actred);
+ }
+ if (((p1*fnorm1) >= fnorm)
+ || (temp < p1) ) {
+ temp = p1;
+ }
+ delta = temp*mp_dmin1(delta,pnorm/p1);
+ par = par/temp;
+ } else {
+ if ((par == zero) || (ratio >= p75) ) {
+ delta = pnorm/p5;
+ par = p5*par;
+ }
+ }
+
+ /*
+ * test for successful iteration.
+ */
+ if (ratio >= p0001) {
+
+ /*
+ * successful iteration. update x, fvec, and their norms.
+ */
+ for (j=0; j<nfree; j++ ) {
+ x[j] = wa2[j];
+ wa2[j] = diag[ifree[j]]*x[j];
+ }
+ for (i=0; i<m; i++ ) {
+ fvec[i] = wa4[i];
+ }
+ xnorm = mp_enorm(nfree,wa2);
+ fnorm = fnorm1;
+ iter += 1;
+ }
+
+ /*
+ * tests for convergence.
+ */
+ if ((fabs(actred) <= conf.ftol) && (prered <= conf.ftol) &&
+ (p5*ratio <= one) ) {
+ info = MP_OK_CHI;
+ }
+ if (delta <= conf.xtol*xnorm) {
+ info = MP_OK_PAR;
+ }
+ if ((fabs(actred) <= conf.ftol) && (prered <= conf.ftol) && (p5*ratio <= one)
+ && ( info == 2) ) {
+ info = MP_OK_BOTH;
+ }
+ if (info != 0) {
+ goto L300;
+ }
+
+ /*
+ * tests for termination and stringent tolerances.
+ */
+ if ((conf.maxfev > 0) && (nfev >= conf.maxfev)) {
+ /* Too many function evaluations */
+ info = MP_MAXITER;
+ }
+ if (iter >= conf.maxiter) {
+ /* Too many iterations */
+ info = MP_MAXITER;
+ }
+ if ((fabs(actred) <= MP_MACHEP0) && (prered <= MP_MACHEP0) && (p5*ratio <= one) ) {
+ info = MP_FTOL;
+ }
+ if (delta <= MP_MACHEP0*xnorm) {
+ info = MP_XTOL;
+ }
+ if (gnorm <= MP_MACHEP0) {
+ info = MP_GTOL;
+ }
+ if (info != 0) {
+ goto L300;
+ }
+
+ /*
+ * end of the inner loop. repeat if iteration unsuccessful.
+ */
+ if (ratio < p0001) goto L200;
+ /*
+ * end of the outer loop.
+ */
+ goto OUTER_LOOP;
+
+ L300:
+ /*
+ * termination, either normal or user imposed.
+ */
+ if (iflag < 0) {
+ info = iflag;
+ }
+ iflag = 0;
+
+ for (i=0; i<nfree; i++) {
+ xall[ifree[i]] = x[i];
+ }
+
+ if ((conf.nprint > 0) && (info > 0)) {
+ iflag = mp_call(funct, m, npar, xall, fvec, 0, private_data);
+ nfev += 1;
+ }
+
+ /* Compute number of pegged parameters */
+ npegged = 0;
+ if (pars) for (i=0; i<npar; i++) {
+ if ((pars[i].limited[0] && (pars[i].limits[0] == xall[i])) ||
+ (pars[i].limited[1] && (pars[i].limits[1] == xall[i]))) {
+ npegged ++;
+ }
+ }
+
+ /* Compute and return the covariance matrix and/or parameter errors */
+ if (result && (result->covar || result->xerror)) {
+ mp_covar(nfree, fjac, ldfjac, ipvt, conf.covtol, wa2);
+
+ if (result->covar) {
+ /* Zero the destination covariance array */
+ for (j=0; j<(npar*npar); j++) result->covar[j] = 0;
+
+ /* Transfer the covariance array */
+ for (j=0; j<nfree; j++) {
+ for (i=0; i<nfree; i++) {
+ result->covar[ifree[j]*npar+ifree[i]] = fjac[j*ldfjac+i];
+ }
+ }
+ }
+
+ if (result->xerror) {
+ for (j=0; j<npar; j++) result->xerror[j] = 0;
+
+ for (j=0; j<nfree; j++) {
+ double cc = fjac[j*ldfjac+j];
+ if (cc > 0) result->xerror[ifree[j]] = sqrt(cc);
+ }
+ }
+ }
+
+ if (result) {
+ strcpy(result->version, MPFIT_VERSION);
+ result->bestnorm = mp_dmax1(fnorm,fnorm1);
+ result->bestnorm *= result->bestnorm;
+ result->orignorm = orignorm;
+ result->status = info;
+ result->niter = iter;
+ result->nfev = nfev;
+ result->npar = npar;
+ result->nfree = nfree;
+ result->npegged = npegged;
+ result->nfunc = m;
+
+ /* Copy residuals if requested */
+ if (result->resid) {
+ for (j=0; j<m; j++) result->resid[j] = fvec[j];
+ }
+ }
+
+
+ CLEANUP:
+ if (fvec) free(fvec);
+ if (qtf) free(qtf);
+ if (x) free(x);
+ if (xnew) free(xnew);
+ if (fjac) free(fjac);
+ if (diag) free(diag);
+ if (wa1) free(wa1);
+ if (wa2) free(wa2);
+ if (wa3) free(wa3);
+ if (wa4) free(wa4);
+ if (ipvt) free(ipvt);
+ if (pfixed) free(pfixed);
+ if (step) free(step);
+ if (dstep) free(dstep);
+ if (mpside) free(mpside);
+ if (ddebug) free(ddebug);
+ if (ddrtol) free(ddrtol);
+ if (ddatol) free(ddatol);
+ if (ifree) free(ifree);
+ if (qllim) free(qllim);
+ if (qulim) free(qulim);
+ if (llim) free(llim);
+ if (ulim) free(ulim);
+ if (dvecptr) free(dvecptr);
+
+ return info;
+}
+
+
+/************************fdjac2.c*************************/
+
+static
+int mp_fdjac2(mp_func funct,
+ int m, int n, int *ifree, int npar, double *x, double *fvec,
+ double *fjac, int ldfjac, double epsfcn,
+ double *wa, void *priv, int *nfev,
+ double *step, double *dstep, int *dside,
+ int *qulimited, double *ulimit,
+ int *ddebug, double *ddrtol, double *ddatol,
+ double *wa2, double **dvec)
+{
+/*
+* **********
+*
+* subroutine fdjac2
+*
+* this subroutine computes a forward-difference approximation
+* to the m by n jacobian matrix associated with a specified
+* problem of m functions in n variables.
+*
+* the subroutine statement is
+*
+* subroutine fdjac2(fcn,m,n,x,fvec,fjac,ldfjac,iflag,epsfcn,wa)
+*
+* where
+*
+* fcn is the name of the user-supplied subroutine which
+* calculates the functions. fcn must be declared
+* in an external statement in the user calling
+* program, and should be written as follows.
+*
+* subroutine fcn(m,n,x,fvec,iflag)
+* integer m,n,iflag
+* double precision x(n),fvec(m)
+* ----------
+* calculate the functions at x and
+* return this vector in fvec.
+* ----------
+* return
+* end
+*
+* the value of iflag should not be changed by fcn unless
+* the user wants to terminate execution of fdjac2.
+* in this case set iflag to a negative integer.
+*
+* m is a positive integer input variable set to the number
+* of functions.
+*
+* n is a positive integer input variable set to the number
+* of variables. n must not exceed m.
+*
+* x is an input array of length n.
+*
+* fvec is an input array of length m which must contain the
+* functions evaluated at x.
+*
+* fjac is an output m by n array which contains the
+* approximation to the jacobian matrix evaluated at x.
+*
+* ldfjac is a positive integer input variable not less than m
+* which specifies the leading dimension of the array fjac.
+*
+* iflag is an integer variable which can be used to terminate
+* the execution of fdjac2. see description of fcn.
+*
+* epsfcn is an input variable used in determining a suitable
+* step length for the forward-difference approximation. this
+* approximation assumes that the relative errors in the
+* functions are of the order of epsfcn. if epsfcn is less
+* than the machine precision, it is assumed that the relative
+* errors in the functions are of the order of the machine
+* precision.
+*
+* wa is a work array of length m.
+*
+* subprograms called
+*
+* user-supplied ...... fcn
+*
+* minpack-supplied ... dpmpar
+*
+* fortran-supplied ... dabs,dmax1,dsqrt
+*
+* argonne national laboratory. minpack project. march 1980.
+* burton s. garbow, kenneth e. hillstrom, jorge j. more
+*
+ **********
+*/
+ int i,j,ij;
+ int iflag = 0;
+ double eps,h,temp;
+ static double zero = 0.0;
+ int has_analytical_deriv = 0, has_numerical_deriv = 0;
+ int has_debug_deriv = 0;
+
+ temp = mp_dmax1(epsfcn,MP_MACHEP0);
+ eps = sqrt(temp);
+ ij = 0;
+ ldfjac = 0; /* Prevent compiler warning */
+ if (ldfjac){} /* Prevent compiler warning */
+
+ for (j=0; j<npar; j++) dvec[j] = 0;
+
+ /* Initialize the Jacobian derivative matrix */
+ for (j=0; j<(n*m); j++) fjac[j] = 0;
+
+ /* Check for which parameters need analytical derivatives and which
+ need numerical ones */
+ for (j=0; j<n; j++) { /* Loop through free parameters only */
+ if (dside && dside[ifree[j]] == 3 && ddebug[ifree[j]] == 0) {
+ /* Purely analytical derivatives */
+ dvec[ifree[j]] = fjac + j*m;
+ has_analytical_deriv = 1;
+ } else if (dside && ddebug[ifree[j]] == 1) {
+ /* Numerical and analytical derivatives as a debug cross-check */
+ dvec[ifree[j]] = fjac + j*m;
+ has_analytical_deriv = 1;
+ has_numerical_deriv = 1;
+ has_debug_deriv = 1;
+ } else {
+ has_numerical_deriv = 1;
+ }
+ }
+
+ /* If there are any parameters requiring analytical derivatives,
+ then compute them first. */
+ if (has_analytical_deriv) {
+ iflag = mp_call(funct, m, npar, x, wa, dvec, priv);
+ if (nfev) *nfev = *nfev + 1;
+ if (iflag < 0 ) goto DONE;
+ }
+
+ if (has_debug_deriv) {
+ printf("FJAC DEBUG BEGIN\n");
+ printf("# %10s %10s %10s %10s %10s %10s\n",
+ "IPNT", "FUNC", "DERIV_U", "DERIV_N", "DIFF_ABS", "DIFF_REL");
+ }
+
+ /* Any parameters requiring numerical derivatives */
+ if (has_numerical_deriv) for (j=0; j<n; j++) { /* Loop thru free parms */
+ int dsidei = (dside)?(dside[ifree[j]]):(0);
+ int debug = ddebug[ifree[j]];
+ double dr = ddrtol[ifree[j]], da = ddatol[ifree[j]];
+
+ /* Check for debugging */
+ if (debug) {
+ printf("FJAC PARM %d\n", ifree[j]);
+ }
+
+ /* Skip parameters already done by user-computed partials */
+ if (dside && dsidei == 3) continue;
+
+ temp = x[ifree[j]];
+ h = eps * fabs(temp);
+ if (step && step[ifree[j]] > 0) h = step[ifree[j]];
+ if (dstep && dstep[ifree[j]] > 0) h = fabs(dstep[ifree[j]]*temp);
+ if (h == zero) h = eps;
+
+ /* If negative step requested, or we are against the upper limit */
+ if ((dside && dsidei == -1) ||
+ (dside && dsidei == 0 &&
+ qulimited && ulimit && qulimited[j] &&
+ (temp > (ulimit[j]-h)))) {
+ h = -h;
+ }
+
+ x[ifree[j]] = temp + h;
+ iflag = mp_call(funct, m, npar, x, wa, 0, priv);
+ if (nfev) *nfev = *nfev + 1;
+ if (iflag < 0 ) goto DONE;
+ x[ifree[j]] = temp;
+
+ if (dsidei <= 1) {
+ /* COMPUTE THE ONE-SIDED DERIVATIVE */
+ if (! debug) {
+ /* Non-debug path for speed */
+ for (i=0; i<m; i++, ij++) {
+ fjac[ij] = (wa[i] - fvec[i])/h; /* fjac[i+m*j] */
+ }
+ } else {
+ /* Debug path for correctness */
+ for (i=0; i<m; i++, ij++) {
+ double fjold = fjac[ij];
+ fjac[ij] = (wa[i] - fvec[i])/h; /* fjac[i+m*j] */
+ if ((da == 0 && dr == 0 && (fjold != 0 || fjac[ij] != 0)) ||
+ ((da != 0 || dr != 0) && (fabs(fjold-fjac[ij]) > da + fabs(fjold)*dr))) {
+ printf(" %10d %10.4g %10.4g %10.4g %10.4g %10.4g\n",
+ i, fvec[i], fjold, fjac[ij], fjold-fjac[ij],
+ (fjold == 0)?(0):((fjold-fjac[ij])/fjold));
+ }
+ }
+ } /* end debugging */
+
+ } else { /* dside > 2 */
+ /* COMPUTE THE TWO-SIDED DERIVATIVE */
+ for (i=0; i<m; i++) {
+ wa2[i] = wa[i];
+ }
+
+ /* Evaluate at x - h */
+ x[ifree[j]] = temp - h;
+ iflag = mp_call(funct, m, npar, x, wa, 0, priv);
+ if (nfev) *nfev = *nfev + 1;
+ if (iflag < 0 ) goto DONE;
+ x[ifree[j]] = temp;
+
+ /* Now compute derivative as (f(x+h) - f(x-h))/(2h) */
+ if (! debug ) {
+ /* Non-debug path for speed */
+ for (i=0; i<m; i++, ij++) {
+ fjac[ij] = (fjac[ij] - wa[i])/(2*h); /* fjac[i+m*j] */
+ }
+ } else {
+ /* Debug path for correctness */
+ for (i=0; i<m; i++, ij++) {
+ double fjold = fjac[ij];
+ fjac[ij] = (wa2[i] - wa[i])/(2*h); /* fjac[i+m*j] */
+ if ((da == 0 && dr == 0 && (fjold != 0 || fjac[ij] != 0)) ||
+ ((da != 0 || dr != 0) && (fabs(fjold-fjac[ij]) > da + fabs(fjold)*dr))) {
+ printf(" %10d %10.4g %10.4g %10.4g %10.4g %10.4g\n",
+ i, fvec[i], fjold, fjac[ij], fjold-fjac[ij],
+ (fjold == 0)?(0):((fjold-fjac[ij])/fjold));
+ }
+ }
+ } /* end debugging */
+
+ } /* if (dside > 2) */
+ } /* if (has_numerical_derivative) */
+
+ if (has_debug_deriv) {
+ printf("FJAC DEBUG END\n");
+ }
+
+ DONE:
+ if (iflag < 0) return iflag;
+ return 0;
+ /*
+ * last card of subroutine fdjac2.
+ */
+}
+
+
+/************************qrfac.c*************************/
+
+static
+void mp_qrfac(int m, int n, double *a, int lda,
+ int pivot, int *ipvt, int lipvt,
+ double *rdiag, double *acnorm, double *wa)
+{
+/*
+* **********
+*
+* subroutine qrfac
+*
+* this subroutine uses householder transformations with column
+* pivoting (optional) to compute a qr factorization of the
+* m by n matrix a. that is, qrfac determines an orthogonal
+* matrix q, a permutation matrix p, and an upper trapezoidal
+* matrix r with diagonal elements of nonincreasing magnitude,
+* such that a*p = q*r. the householder transformation for
+* column k, k = 1,2,...,min(m,n), is of the form
+*
+* t
+* i - (1/u(k))*u*u
+*
+* where u has zeros in the first k-1 positions. the form of
+* this transformation and the method of pivoting first
+* appeared in the corresponding linpack subroutine.
+*
+* the subroutine statement is
+*
+* subroutine qrfac(m,n,a,lda,pivot,ipvt,lipvt,rdiag,acnorm,wa)
+*
+* where
+*
+* m is a positive integer input variable set to the number
+* of rows of a.
+*
+* n is a positive integer input variable set to the number
+* of columns of a.
+*
+* a is an m by n array. on input a contains the matrix for
+* which the qr factorization is to be computed. on output
+* the strict upper trapezoidal part of a contains the strict
+* upper trapezoidal part of r, and the lower trapezoidal
+* part of a contains a factored form of q (the non-trivial
+* elements of the u vectors described above).
+*
+* lda is a positive integer input variable not less than m
+* which specifies the leading dimension of the array a.
+*
+* pivot is a logical input variable. if pivot is set true,
+* then column pivoting is enforced. if pivot is set false,
+* then no column pivoting is done.
+*
+* ipvt is an integer output array of length lipvt. ipvt
+* defines the permutation matrix p such that a*p = q*r.
+* column j of p is column ipvt(j) of the identity matrix.
+* if pivot is false, ipvt is not referenced.
+*
+* lipvt is a positive integer input variable. if pivot is false,
+* then lipvt may be as small as 1. if pivot is true, then
+* lipvt must be at least n.
+*
+* rdiag is an output array of length n which contains the
+* diagonal elements of r.
+*
+* acnorm is an output array of length n which contains the
+* norms of the corresponding columns of the input matrix a.
+* if this information is not needed, then acnorm can coincide
+* with rdiag.
+*
+* wa is a work array of length n. if pivot is false, then wa
+* can coincide with rdiag.
+*
+* subprograms called
+*
+* minpack-supplied ... dpmpar,enorm
+*
+* fortran-supplied ... dmax1,dsqrt,min0
+*
+* argonne national laboratory. minpack project. march 1980.
+* burton s. garbow, kenneth e. hillstrom, jorge j. more
+*
+* **********
+*/
+ int i,ij,jj,j,jp1,k,kmax,minmn;
+ double ajnorm,sum,temp;
+ static double zero = 0.0;
+ static double one = 1.0;
+ static double p05 = 0.05;
+
+ lda = 0; /* Prevent compiler warning */
+ lipvt = 0; /* Prevent compiler warning */
+ if (lda) {} /* Prevent compiler warning */
+ if (lipvt) {} /* Prevent compiler warning */
+
+ /*
+ * compute the initial column norms and initialize several arrays.
+ */
+ ij = 0;
+ for (j=0; j<n; j++) {
+ acnorm[j] = mp_enorm(m,&a[ij]);
+ rdiag[j] = acnorm[j];
+ wa[j] = rdiag[j];
+ if (pivot != 0)
+ ipvt[j] = j;
+ ij += m; /* m*j */
+ }
+ /*
+ * reduce a to r with householder transformations.
+ */
+ minmn = mp_min0(m,n);
+ for (j=0; j<minmn; j++) {
+ if (pivot == 0)
+ goto L40;
+ /*
+ * bring the column of largest norm into the pivot position.
+ */
+ kmax = j;
+ for (k=j; k<n; k++)
+ {
+ if (rdiag[k] > rdiag[kmax])
+ kmax = k;
+ }
+ if (kmax == j)
+ goto L40;
+
+ ij = m * j;
+ jj = m * kmax;
+ for (i=0; i<m; i++)
+ {
+ temp = a[ij]; /* [i+m*j] */
+ a[ij] = a[jj]; /* [i+m*kmax] */
+ a[jj] = temp;
+ ij += 1;
+ jj += 1;
+ }
+ rdiag[kmax] = rdiag[j];
+ wa[kmax] = wa[j];
+ k = ipvt[j];
+ ipvt[j] = ipvt[kmax];
+ ipvt[kmax] = k;
+
+ L40:
+ /*
+ * compute the householder transformation to reduce the
+ * j-th column of a to a multiple of the j-th unit vector.
+ */
+ jj = j + m*j;
+ ajnorm = mp_enorm(m-j,&a[jj]);
+ if (ajnorm == zero)
+ goto L100;
+ if (a[jj] < zero)
+ ajnorm = -ajnorm;
+ ij = jj;
+ for (i=j; i<m; i++)
+ {
+ a[ij] /= ajnorm;
+ ij += 1; /* [i+m*j] */
+ }
+ a[jj] += one;
+ /*
+ * apply the transformation to the remaining columns
+ * and update the norms.
+ */
+ jp1 = j + 1;
+ if (jp1 < n)
+ {
+ for (k=jp1; k<n; k++)
+ {
+ sum = zero;
+ ij = j + m*k;
+ jj = j + m*j;
+ for (i=j; i<m; i++)
+ {
+ sum += a[jj]*a[ij];
+ ij += 1; /* [i+m*k] */
+ jj += 1; /* [i+m*j] */
+ }
+ temp = sum/a[j+m*j];
+ ij = j + m*k;
+ jj = j + m*j;
+ for (i=j; i<m; i++)
+ {
+ a[ij] -= temp*a[jj];
+ ij += 1; /* [i+m*k] */
+ jj += 1; /* [i+m*j] */
+ }
+ if ((pivot != 0) && (rdiag[k] != zero))
+ {
+ temp = a[j+m*k]/rdiag[k];
+ temp = mp_dmax1( zero, one-temp*temp );
+ rdiag[k] *= sqrt(temp);
+ temp = rdiag[k]/wa[k];
+ if ((p05*temp*temp) <= MP_MACHEP0)
+ {
+ rdiag[k] = mp_enorm(m-j-1,&a[jp1+m*k]);
+ wa[k] = rdiag[k];
+ }
+ }
+ }
+ }
+
+ L100:
+ rdiag[j] = -ajnorm;
+ }
+ /*
+ * last card of subroutine qrfac.
+ */
+}
+
+/************************qrsolv.c*************************/
+
+static
+void mp_qrsolv(int n, double *r, int ldr, int *ipvt, double *diag,
+ double *qtb, double *x, double *sdiag, double *wa)
+{
+/*
+* **********
+*
+* subroutine qrsolv
+*
+* given an m by n matrix a, an n by n diagonal matrix d,
+* and an m-vector b, the problem is to determine an x which
+* solves the system
+*
+* a*x = b , d*x = 0 ,
+*
+* in the least squares sense.
+*
+* this subroutine completes the solution of the problem
+* if it is provided with the necessary information from the
+* qr factorization, with column pivoting, of a. that is, if
+* a*p = q*r, where p is a permutation matrix, q has orthogonal
+* columns, and r is an upper triangular matrix with diagonal
+* elements of nonincreasing magnitude, then qrsolv expects
+* the full upper triangle of r, the permutation matrix p,
+* and the first n components of (q transpose)*b. the system
+* a*x = b, d*x = 0, is then equivalent to
+*
+* t t
+* r*z = q *b , p *d*p*z = 0 ,
+*
+* where x = p*z. if this system does not have full rank,
+* then a least squares solution is obtained. on output qrsolv
+* also provides an upper triangular matrix s such that
+*
+* t t t
+* p *(a *a + d*d)*p = s *s .
+*
+* s is computed within qrsolv and may be of separate interest.
+*
+* the subroutine statement is
+*
+* subroutine qrsolv(n,r,ldr,ipvt,diag,qtb,x,sdiag,wa)
+*
+* where
+*
+* n is a positive integer input variable set to the order of r.
+*
+* r is an n by n array. on input the full upper triangle
+* must contain the full upper triangle of the matrix r.
+* on output the full upper triangle is unaltered, and the
+* strict lower triangle contains the strict upper triangle
+* (transposed) of the upper triangular matrix s.
+*
+* ldr is a positive integer input variable not less than n
+* which specifies the leading dimension of the array r.
+*
+* ipvt is an integer input array of length n which defines the
+* permutation matrix p such that a*p = q*r. column j of p
+* is column ipvt(j) of the identity matrix.
+*
+* diag is an input array of length n which must contain the
+* diagonal elements of the matrix d.
+*
+* qtb is an input array of length n which must contain the first
+* n elements of the vector (q transpose)*b.
+*
+* x is an output array of length n which contains the least
+* squares solution of the system a*x = b, d*x = 0.
+*
+* sdiag is an output array of length n which contains the
+* diagonal elements of the upper triangular matrix s.
+*
+* wa is a work array of length n.
+*
+* subprograms called
+*
+* fortran-supplied ... dabs,dsqrt
+*
+* argonne national laboratory. minpack project. march 1980.
+* burton s. garbow, kenneth e. hillstrom, jorge j. more
+*
+* **********
+*/
+ int i,ij,ik,kk,j,jp1,k,kp1,l,nsing;
+ double cosx,cotan,qtbpj,sinx,sum,tanx,temp;
+ static double zero = 0.0;
+ static double p25 = 0.25;
+ static double p5 = 0.5;
+
+ /*
+ * copy r and (q transpose)*b to preserve input and initialize s.
+ * in particular, save the diagonal elements of r in x.
+ */
+ kk = 0;
+ for (j=0; j<n; j++) {
+ ij = kk;
+ ik = kk;
+ for (i=j; i<n; i++)
+ {
+ r[ij] = r[ik];
+ ij += 1; /* [i+ldr*j] */
+ ik += ldr; /* [j+ldr*i] */
+ }
+ x[j] = r[kk];
+ wa[j] = qtb[j];
+ kk += ldr+1; /* j+ldr*j */
+ }
+
+ /*
+ * eliminate the diagonal matrix d using a givens rotation.
+ */
+ for (j=0; j<n; j++) {
+ /*
+ * prepare the row of d to be eliminated, locating the
+ * diagonal element using p from the qr factorization.
+ */
+ l = ipvt[j];
+ if (diag[l] == zero)
+ goto L90;
+ for (k=j; k<n; k++)
+ sdiag[k] = zero;
+ sdiag[j] = diag[l];
+ /*
+ * the transformations to eliminate the row of d
+ * modify only a single element of (q transpose)*b
+ * beyond the first n, which is initially zero.
+ */
+ qtbpj = zero;
+ for (k=j; k<n; k++)
+ {
+ /*
+ * determine a givens rotation which eliminates the
+ * appropriate element in the current row of d.
+ */
+ if (sdiag[k] == zero)
+ continue;
+ kk = k + ldr * k;
+ if (fabs(r[kk]) < fabs(sdiag[k]))
+ {
+ cotan = r[kk]/sdiag[k];
+ sinx = p5/sqrt(p25+p25*cotan*cotan);
+ cosx = sinx*cotan;
+ }
+ else
+ {
+ tanx = sdiag[k]/r[kk];
+ cosx = p5/sqrt(p25+p25*tanx*tanx);
+ sinx = cosx*tanx;
+ }
+ /*
+ * compute the modified diagonal element of r and
+ * the modified element of ((q transpose)*b,0).
+ */
+ r[kk] = cosx*r[kk] + sinx*sdiag[k];
+ temp = cosx*wa[k] + sinx*qtbpj;
+ qtbpj = -sinx*wa[k] + cosx*qtbpj;
+ wa[k] = temp;
+ /*
+ * accumulate the tranformation in the row of s.
+ */
+ kp1 = k + 1;
+ if (n > kp1)
+ {
+ ik = kk + 1;
+ for (i=kp1; i<n; i++)
+ {
+ temp = cosx*r[ik] + sinx*sdiag[i];
+ sdiag[i] = -sinx*r[ik] + cosx*sdiag[i];
+ r[ik] = temp;
+ ik += 1; /* [i+ldr*k] */
+ }
+ }
+ }
+ L90:
+ /*
+ * store the diagonal element of s and restore
+ * the corresponding diagonal element of r.
+ */
+ kk = j + ldr*j;
+ sdiag[j] = r[kk];
+ r[kk] = x[j];
+ }
+ /*
+ * solve the triangular system for z. if the system is
+ * singular, then obtain a least squares solution.
+ */
+ nsing = n;
+ for (j=0; j<n; j++) {
+ if ((sdiag[j] == zero) && (nsing == n))
+ nsing = j;
+ if (nsing < n)
+ wa[j] = zero;
+ }
+ if (nsing < 1)
+ goto L150;
+
+ for (k=0; k<nsing; k++) {
+ j = nsing - k - 1;
+ sum = zero;
+ jp1 = j + 1;
+ if (nsing > jp1)
+ {
+ ij = jp1 + ldr * j;
+ for (i=jp1; i<nsing; i++)
+ {
+ sum += r[ij]*wa[i];
+ ij += 1; /* [i+ldr*j] */
+ }
+ }
+ wa[j] = (wa[j] - sum)/sdiag[j];
+ }
+ L150:
+ /*
+ * permute the components of z back to components of x.
+ */
+ for (j=0; j<n; j++) {
+ l = ipvt[j];
+ x[l] = wa[j];
+ }
+ /*
+ * last card of subroutine qrsolv.
+ */
+}
+
+/************************lmpar.c*************************/
+
+static
+void mp_lmpar(int n, double *r, int ldr, int *ipvt, int *ifree, double *diag,
+ double *qtb, double delta, double *par, double *x,
+ double *sdiag, double *wa1, double *wa2)
+{
+ /* **********
+ *
+ * subroutine lmpar
+ *
+ * given an m by n matrix a, an n by n nonsingular diagonal
+ * matrix d, an m-vector b, and a positive number delta,
+ * the problem is to determine a value for the parameter
+ * par such that if x solves the system
+ *
+ * a*x = b , sqrt(par)*d*x = 0 ,
+ *
+ * in the least squares sense, and dxnorm is the euclidean
+ * norm of d*x, then either par is zero and
+ *
+ * (dxnorm-delta) .le. 0.1*delta ,
+ *
+ * or par is positive and
+ *
+ * abs(dxnorm-delta) .le. 0.1*delta .
+ *
+ * this subroutine completes the solution of the problem
+ * if it is provided with the necessary information from the
+ * qr factorization, with column pivoting, of a. that is, if
+ * a*p = q*r, where p is a permutation matrix, q has orthogonal
+ * columns, and r is an upper triangular matrix with diagonal
+ * elements of nonincreasing magnitude, then lmpar expects
+ * the full upper triangle of r, the permutation matrix p,
+ * and the first n components of (q transpose)*b. on output
+ * lmpar also provides an upper triangular matrix s such that
+ *
+ * t t t
+ * p *(a *a + par*d*d)*p = s *s .
+ *
+ * s is employed within lmpar and may be of separate interest.
+ *
+ * only a few iterations are generally needed for convergence
+ * of the algorithm. if, however, the limit of 10 iterations
+ * is reached, then the output par will contain the best
+ * value obtained so far.
+ *
+ * the subroutine statement is
+ *
+ * subroutine lmpar(n,r,ldr,ipvt,diag,qtb,delta,par,x,sdiag,
+ * wa1,wa2)
+ *
+ * where
+ *
+ * n is a positive integer input variable set to the order of r.
+ *
+ * r is an n by n array. on input the full upper triangle
+ * must contain the full upper triangle of the matrix r.
+ * on output the full upper triangle is unaltered, and the
+ * strict lower triangle contains the strict upper triangle
+ * (transposed) of the upper triangular matrix s.
+ *
+ * ldr is a positive integer input variable not less than n
+ * which specifies the leading dimension of the array r.
+ *
+ * ipvt is an integer input array of length n which defines the
+ * permutation matrix p such that a*p = q*r. column j of p
+ * is column ipvt(j) of the identity matrix.
+ *
+ * diag is an input array of length n which must contain the
+ * diagonal elements of the matrix d.
+ *
+ * qtb is an input array of length n which must contain the first
+ * n elements of the vector (q transpose)*b.
+ *
+ * delta is a positive input variable which specifies an upper
+ * bound on the euclidean norm of d*x.
+ *
+ * par is a nonnegative variable. on input par contains an
+ * initial estimate of the levenberg-marquardt parameter.
+ * on output par contains the final estimate.
+ *
+ * x is an output array of length n which contains the least
+ * squares solution of the system a*x = b, sqrt(par)*d*x = 0,
+ * for the output par.
+ *
+ * sdiag is an output array of length n which contains the
+ * diagonal elements of the upper triangular matrix s.
+ *
+ * wa1 and wa2 are work arrays of length n.
+ *
+ * subprograms called
+ *
+ * minpack-supplied ... dpmpar,mp_enorm,qrsolv
+ *
+ * fortran-supplied ... dabs,mp_dmax1,dmin1,dsqrt
+ *
+ * argonne national laboratory. minpack project. march 1980.
+ * burton s. garbow, kenneth e. hillstrom, jorge j. more
+ *
+ * **********
+ */
+ int i,iter,ij,jj,j,jm1,jp1,k,l,nsing;
+ double dxnorm,fp,gnorm,parc,parl,paru;
+ double sum,temp;
+ static double zero = 0.0;
+ /* static double one = 1.0; */
+ static double p1 = 0.1;
+ static double p001 = 0.001;
+
+ /*
+ * compute and store in x the gauss-newton direction. if the
+ * jacobian is rank-deficient, obtain a least squares solution.
+ */
+ nsing = n;
+ jj = 0;
+ for (j=0; j<n; j++) {
+ wa1[j] = qtb[j];
+ if ((r[jj] == zero) && (nsing == n))
+ nsing = j;
+ if (nsing < n)
+ wa1[j] = zero;
+ jj += ldr+1; /* [j+ldr*j] */
+ }
+
+ if (nsing >= 1) {
+ for (k=0; k<nsing; k++)
+ {
+ j = nsing - k - 1;
+ wa1[j] = wa1[j]/r[j+ldr*j];
+ temp = wa1[j];
+ jm1 = j - 1;
+ if (jm1 >= 0)
+ {
+ ij = ldr * j;
+ for (i=0; i<=jm1; i++)
+ {
+ wa1[i] -= r[ij]*temp;
+ ij += 1;
+ }
+ }
+ }
+ }
+
+ for (j=0; j<n; j++) {
+ l = ipvt[j];
+ x[l] = wa1[j];
+ }
+ /*
+ * initialize the iteration counter.
+ * evaluate the function at the origin, and test
+ * for acceptance of the gauss-newton direction.
+ */
+ iter = 0;
+ for (j=0; j<n; j++)
+ wa2[j] = diag[ifree[j]]*x[j];
+ dxnorm = mp_enorm(n,wa2);
+ fp = dxnorm - delta;
+ if (fp <= p1*delta) {
+ goto L220;
+ }
+ /*
+ * if the jacobian is not rank deficient, the newton
+ * step provides a lower bound, parl, for the zero of
+ * the function. otherwise set this bound to zero.
+ */
+ parl = zero;
+ if (nsing >= n) {
+ for (j=0; j<n; j++)
+ {
+ l = ipvt[j];
+ wa1[j] = diag[ifree[l]]*(wa2[l]/dxnorm);
+ }
+ jj = 0;
+ for (j=0; j<n; j++)
+ {
+ sum = zero;
+ jm1 = j - 1;
+ if (jm1 >= 0)
+ {
+ ij = jj;
+ for (i=0; i<=jm1; i++)
+ {
+ sum += r[ij]*wa1[i];
+ ij += 1;
+ }
+ }
+ wa1[j] = (wa1[j] - sum)/r[j+ldr*j];
+ jj += ldr; /* [i+ldr*j] */
+ }
+ temp = mp_enorm(n,wa1);
+ parl = ((fp/delta)/temp)/temp;
+ }
+ /*
+ * calculate an upper bound, paru, for the zero of the function.
+ */
+ jj = 0;
+ for (j=0; j<n; j++) {
+ sum = zero;
+ ij = jj;
+ for (i=0; i<=j; i++)
+ {
+ sum += r[ij]*qtb[i];
+ ij += 1;
+ }
+ l = ipvt[j];
+ wa1[j] = sum/diag[ifree[l]];
+ jj += ldr; /* [i+ldr*j] */
+ }
+ gnorm = mp_enorm(n,wa1);
+ paru = gnorm/delta;
+ if (paru == zero)
+ paru = MP_DWARF/mp_dmin1(delta,p1);
+ /*
+ * if the input par lies outside of the interval (parl,paru),
+ * set par to the closer endpoint.
+ */
+ *par = mp_dmax1( *par,parl);
+ *par = mp_dmin1( *par,paru);
+ if (*par == zero)
+ *par = gnorm/dxnorm;
+
+ /*
+ * beginning of an iteration.
+ */
+ L150:
+ iter += 1;
+ /*
+ * evaluate the function at the current value of par.
+ */
+ if (*par == zero)
+ *par = mp_dmax1(MP_DWARF,p001*paru);
+ temp = sqrt( *par );
+ for (j=0; j<n; j++)
+ wa1[j] = temp*diag[ifree[j]];
+ mp_qrsolv(n,r,ldr,ipvt,wa1,qtb,x,sdiag,wa2);
+ for (j=0; j<n; j++)
+ wa2[j] = diag[ifree[j]]*x[j];
+ dxnorm = mp_enorm(n,wa2);
+ temp = fp;
+ fp = dxnorm - delta;
+ /*
+ * if the function is small enough, accept the current value
+ * of par. also test for the exceptional cases where parl
+ * is zero or the number of iterations has reached 10.
+ */
+ if ((fabs(fp) <= p1*delta)
+ || ((parl == zero) && (fp <= temp) && (temp < zero))
+ || (iter == 10))
+ goto L220;
+ /*
+ * compute the newton correction.
+ */
+ for (j=0; j<n; j++) {
+ l = ipvt[j];
+ wa1[j] = diag[ifree[l]]*(wa2[l]/dxnorm);
+ }
+ jj = 0;
+ for (j=0; j<n; j++) {
+ wa1[j] = wa1[j]/sdiag[j];
+ temp = wa1[j];
+ jp1 = j + 1;
+ if (jp1 < n)
+ {
+ ij = jp1 + jj;
+ for (i=jp1; i<n; i++)
+ {
+ wa1[i] -= r[ij]*temp;
+ ij += 1; /* [i+ldr*j] */
+ }
+ }
+ jj += ldr; /* ldr*j */
+ }
+ temp = mp_enorm(n,wa1);
+ parc = ((fp/delta)/temp)/temp;
+ /*
+ * depending on the sign of the function, update parl or paru.
+ */
+ if (fp > zero)
+ parl = mp_dmax1(parl, *par);
+ if (fp < zero)
+ paru = mp_dmin1(paru, *par);
+ /*
+ * compute an improved estimate for par.
+ */
+ *par = mp_dmax1(parl, *par + parc);
+ /*
+ * end of an iteration.
+ */
+ goto L150;
+
+ L220:
+ /*
+ * termination.
+ */
+ if (iter == 0)
+ *par = zero;
+ /*
+ * last card of subroutine lmpar.
+ */
+}
+
+
+/************************enorm.c*************************/
+
+static
+double mp_enorm(int n, double *x)
+{
+ /*
+ * **********
+ *
+ * function enorm
+ *
+ * given an n-vector x, this function calculates the
+ * euclidean norm of x.
+ *
+ * the euclidean norm is computed by accumulating the sum of
+ * squares in three different sums. the sums of squares for the
+ * small and large components are scaled so that no overflows
+ * occur. non-destructive underflows are permitted. underflows
+ * and overflows do not occur in the computation of the unscaled
+ * sum of squares for the intermediate components.
+ * the definitions of small, intermediate and large components
+ * depend on two constants, rdwarf and rgiant. the main
+ * restrictions on these constants are that rdwarf**2 not
+ * underflow and rgiant**2 not overflow. the constants
+ * given here are suitable for every known computer.
+ *
+ * the function statement is
+ *
+ * double precision function enorm(n,x)
+ *
+ * where
+ *
+ * n is a positive integer input variable.
+ *
+ * x is an input array of length n.
+ *
+ * subprograms called
+ *
+ * fortran-supplied ... dabs,dsqrt
+ *
+ * argonne national laboratory. minpack project. march 1980.
+ * burton s. garbow, kenneth e. hillstrom, jorge j. more
+ *
+ * **********
+ */
+ int i;
+ double agiant,floatn,s1,s2,s3,xabs,x1max,x3max;
+ double ans, temp;
+ double rdwarf = MP_RDWARF;
+ double rgiant = MP_RGIANT;
+ static double zero = 0.0;
+ static double one = 1.0;
+
+ s1 = zero;
+ s2 = zero;
+ s3 = zero;
+ x1max = zero;
+ x3max = zero;
+ floatn = n;
+ agiant = rgiant/floatn;
+
+ for (i=0; i<n; i++) {
+ xabs = fabs(x[i]);
+ if ((xabs > rdwarf) && (xabs < agiant))
+ {
+ /*
+ * sum for intermediate components.
+ */
+ s2 += xabs*xabs;
+ continue;
+ }
+
+ if (xabs > rdwarf)
+ {
+ /*
+ * sum for large components.
+ */
+ if (xabs > x1max)
+ {
+ temp = x1max/xabs;
+ s1 = one + s1*temp*temp;
+ x1max = xabs;
+ }
+ else
+ {
+ temp = xabs/x1max;
+ s1 += temp*temp;
+ }
+ continue;
+ }
+ /*
+ * sum for small components.
+ */
+ if (xabs > x3max)
+ {
+ temp = x3max/xabs;
+ s3 = one + s3*temp*temp;
+ x3max = xabs;
+ }
+ else
+ {
+ if (xabs != zero)
+ {
+ temp = xabs/x3max;
+ s3 += temp*temp;
+ }
+ }
+ }
+ /*
+ * calculation of norm.
+ */
+ if (s1 != zero) {
+ temp = s1 + (s2/x1max)/x1max;
+ ans = x1max*sqrt(temp);
+ return(ans);
+ }
+ if (s2 != zero) {
+ if (s2 >= x3max)
+ temp = s2*(one+(x3max/s2)*(x3max*s3));
+ else
+ temp = x3max*((s2/x3max)+(x3max*s3));
+ ans = sqrt(temp);
+ }
+ else
+ {
+ ans = x3max*sqrt(s3);
+ }
+ return(ans);
+ /*
+ * last card of function enorm.
+ */
+}
+
+/************************lmmisc.c*************************/
+
+static
+double mp_dmax1(double a, double b)
+{
+ if (a >= b)
+ return(a);
+ else
+ return(b);
+}
+
+static
+double mp_dmin1(double a, double b)
+{
+ if (a <= b)
+ return(a);
+ else
+ return(b);
+}
+
+static
+int mp_min0(int a, int b)
+{
+ if (a <= b)
+ return(a);
+ else
+ return(b);
+}
+
+/************************covar.c*************************/
+/*
+c **********
+c
+c subroutine covar
+c
+c given an m by n matrix a, the problem is to determine
+c the covariance matrix corresponding to a, defined as
+c
+c t
+c inverse(a *a) .
+c
+c this subroutine completes the solution of the problem
+c if it is provided with the necessary information from the
+c qr factorization, with column pivoting, of a. that is, if
+c a*p = q*r, where p is a permutation matrix, q has orthogonal
+c columns, and r is an upper triangular matrix with diagonal
+c elements of nonincreasing magnitude, then covar expects
+c the full upper triangle of r and the permutation matrix p.
+c the covariance matrix is then computed as
+c
+c t t
+c p*inverse(r *r)*p .
+c
+c if a is nearly rank deficient, it may be desirable to compute
+c the covariance matrix corresponding to the linearly independent
+c columns of a. to define the numerical rank of a, covar uses
+c the tolerance tol. if l is the largest integer such that
+c
+c abs(r(l,l)) .gt. tol*abs(r(1,1)) ,
+c
+c then covar computes the covariance matrix corresponding to
+c the first l columns of r. for k greater than l, column
+c and row ipvt(k) of the covariance matrix are set to zero.
+c
+c the subroutine statement is
+c
+c subroutine covar(n,r,ldr,ipvt,tol,wa)
+c
+c where
+c
+c n is a positive integer input variable set to the order of r.
+c
+c r is an n by n array. on input the full upper triangle must
+c contain the full upper triangle of the matrix r. on output
+c r contains the square symmetric covariance matrix.
+c
+c ldr is a positive integer input variable not less than n
+c which specifies the leading dimension of the array r.
+c
+c ipvt is an integer input array of length n which defines the
+c permutation matrix p such that a*p = q*r. column j of p
+c is column ipvt(j) of the identity matrix.
+c
+c tol is a nonnegative input variable used to define the
+c numerical rank of a in the manner described above.
+c
+c wa is a work array of length n.
+c
+c subprograms called
+c
+c fortran-supplied ... dabs
+c
+c argonne national laboratory. minpack project. august 1980.
+c burton s. garbow, kenneth e. hillstrom, jorge j. more
+c
+c **********
+*/
+
+static
+int mp_covar(int n, double *r, int ldr, int *ipvt, double tol, double *wa)
+{
+ int i, ii, j, jj, k, l;
+ int kk, kj, ji, j0, k0, jj0;
+ int sing;
+ double one = 1.0, temp, tolr, zero = 0.0;
+
+ /*
+ * form the inverse of r in the full upper triangle of r.
+ */
+
+#if 0
+ for (j=0; j<n; j++) {
+ for (i=0; i<n; i++) {
+ printf("%f ", r[j*ldr+i]);
+ }
+ printf("\n");
+ }
+#endif
+
+ tolr = tol*fabs(r[0]);
+ l = -1;
+ for (k=0; k<n; k++) {
+ kk = k*ldr + k;
+ if (fabs(r[kk]) <= tolr) break;
+
+ r[kk] = one/r[kk];
+ for (j=0; j<k; j++) {
+ kj = k*ldr + j;
+ temp = r[kk] * r[kj];
+ r[kj] = zero;
+
+ k0 = k*ldr; j0 = j*ldr;
+ for (i=0; i<=j; i++) {
+ r[k0+i] += (-temp*r[j0+i]);
+ }
+ }
+ l = k;
+ }
+
+ /*
+ * Form the full upper triangle of the inverse of (r transpose)*r
+ * in the full upper triangle of r
+ */
+
+ if (l >= 0) {
+ for (k=0; k <= l; k++) {
+ k0 = k*ldr;
+
+ for (j=0; j<k; j++) {
+ temp = r[k*ldr+j];
+
+ j0 = j*ldr;
+ for (i=0; i<=j; i++) {
+ r[j0+i] += temp*r[k0+i];
+ }
+ }
+
+ temp = r[k0+k];
+ for (i=0; i<=k; i++) {
+ r[k0+i] *= temp;
+ }
+ }
+ }
+
+ /*
+ * For the full lower triangle of the covariance matrix
+ * in the strict lower triangle or and in wa
+ */
+ for (j=0; j<n; j++) {
+ jj = ipvt[j];
+ sing = (j > l);
+ j0 = j*ldr;
+ jj0 = jj*ldr;
+ for (i=0; i<=j; i++) {
+ ji = j0+i;
+
+ if (sing) r[ji] = zero;
+ ii = ipvt[i];
+ if (ii > jj) r[jj0+ii] = r[ji];
+ if (ii < jj) r[ii*ldr+jj] = r[ji];
+ }
+ wa[jj] = r[j0+j];
+ }
+
+ /*
+ * Symmetrize the covariance matrix in r
+ */
+ for (j=0; j<n; j++) {
+ j0 = j*ldr;
+ for (i=0; i<j; i++) {
+ r[j0+i] = r[i*ldr+j];
+ }
+ r[j0+j] = wa[j];
+ }
+
+#if 0
+ for (j=0; j<n; j++) {
+ for (i=0; i<n; i++) {
+ printf("%f ", r[j*ldr+i]);
+ }
+ printf("\n");
+ }
+#endif
+
+ return 0;
+}
diff --git a/be/src/thirdparty/mpfit/mpfit.h b/be/src/thirdparty/mpfit/mpfit.h
new file mode 100644
index 0000000..47469c8
--- /dev/null
+++ b/be/src/thirdparty/mpfit/mpfit.h
@@ -0,0 +1,198 @@
+/*
+ * MINPACK-1 Least Squares Fitting Library
+ *
+ * Original public domain version by B. Garbow, K. Hillstrom, J. More'
+ * (Argonne National Laboratory, MINPACK project, March 1980)
+ *
+ * Tranlation to C Language by S. Moshier (moshier.net)
+ *
+ * Enhancements and packaging by C. Markwardt
+ * (comparable to IDL fitting routine MPFIT
+ * see http://cow.physics.wisc.edu/~craigm/idl/idl.html)
+ */
+
+/* Header file defining constants, data structures and functions of
+ mpfit library
+ $Id: mpfit.h,v 1.16 2016/06/02 19:14:16 craigm Exp $
+*/
+
+#ifndef MPFIT_H
+#define MPFIT_H
+
+/* This is a C library. Allow compilation with a C++ compiler */
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* MPFIT version string */
+#define MPFIT_VERSION "1.3"
+
+/* Definition of a parameter constraint structure */
+struct mp_par_struct {
+ int fixed; /* 1 = fixed; 0 = free */
+ int limited[2]; /* 1 = low/upper limit; 0 = no limit */
+ double limits[2]; /* lower/upper limit boundary value */
+
+ char *parname; /* Name of parameter, or 0 for none */
+ double step; /* Step size for finite difference */
+ double relstep; /* Relative step size for finite difference */
+ int side; /* Sidedness of finite difference derivative
+ 0 - one-sided derivative computed automatically
+ 1 - one-sided derivative (f(x+h) - f(x) )/h
+ -1 - one-sided derivative (f(x) - f(x-h))/h
+ 2 - two-sided derivative (f(x+h) - f(x-h))/(2*h)
+ 3 - user-computed analytical derivatives
+ */
+ int deriv_debug; /* Derivative debug mode: 1 = Yes; 0 = No;
+
+ If yes, compute both analytical and numerical
+ derivatives and print them to the console for
+ comparison.
+
+ NOTE: when debugging, do *not* set side = 3,
+ but rather to the kind of numerical derivative
+ you want to compare the user-analytical one to
+ (0, 1, -1, or 2).
+ */
+ double deriv_reltol; /* Relative tolerance for derivative debug
+ printout */
+ double deriv_abstol; /* Absolute tolerance for derivative debug
+ printout */
+};
+
+/* Just a placeholder - do not use!! */
+typedef void (*mp_iterproc)(void);
+
+/* Definition of MPFIT configuration structure */
+struct mp_config_struct {
+ /* NOTE: the user may set the value explicitly; OR, if the passed
+ value is zero, then the "Default" value will be substituted by
+ mpfit(). */
+ double ftol; /* Relative chi-square convergence criterium Default: 1e-10 */
+ double xtol; /* Relative parameter convergence criterium Default: 1e-10 */
+ double gtol; /* Orthogonality convergence criterium Default: 1e-10 */
+ double epsfcn; /* Finite derivative step size Default: MP_MACHEP0 */
+ double stepfactor; /* Initial step bound Default: 100.0 */
+ double covtol; /* Range tolerance for covariance calculation Default: 1e-14 */
+ int maxiter; /* Maximum number of iterations. If maxiter == MP_NO_ITER,
+ then basic error checking is done, and parameter
+ errors/covariances are estimated based on input
+ parameter values, but no fitting iterations are done.
+ Default: 200
+ */
+#define MP_NO_ITER (-1) /* No iterations, just checking */
+ int maxfev; /* Maximum number of function evaluations, or 0 for no limit
+ Default: 0 (no limit) */
+ int nprint; /* Default: 1 */
+ int douserscale;/* Scale variables by user values?
+ 1 = yes, user scale values in diag;
+ 0 = no, variables scaled internally (Default) */
+ int nofinitecheck; /* Disable check for infinite quantities from user?
+ 0 = do not perform check (Default)
+ 1 = perform check
+ */
+ mp_iterproc iterproc; /* Placeholder pointer - must set to 0 */
+
+};
+
+/* Definition of results structure, for when fit completes */
+struct mp_result_struct {
+ double bestnorm; /* Final chi^2 */
+ double orignorm; /* Starting value of chi^2 */
+ int niter; /* Number of iterations */
+ int nfev; /* Number of function evaluations */
+ int status; /* Fitting status code */
+
+ int npar; /* Total number of parameters */
+ int nfree; /* Number of free parameters */
+ int npegged; /* Number of pegged parameters */
+ int nfunc; /* Number of residuals (= num. of data points) */
+
+ double *resid; /* Final residuals
+ nfunc-vector, or 0 if not desired */
+ double *xerror; /* Final parameter uncertainties (1-sigma)
+ npar-vector, or 0 if not desired */
+ double *covar; /* Final parameter covariance matrix
+ npar x npar array, or 0 if not desired */
+ char version[20]; /* MPFIT version string */
+};
+
+/* Convenience typedefs */
+typedef struct mp_par_struct mp_par;
+typedef struct mp_config_struct mp_config;
+typedef struct mp_result_struct mp_result;
+
+/* Enforce type of fitting function */
+typedef int (*mp_func)(int m, /* Number of functions (elts of fvec) */
+ int n, /* Number of variables (elts of x) */
+ double *x, /* I - Parameters */
+ double *fvec, /* O - function values */
+ double **dvec, /* O - function derivatives (optional)*/
+ void *private_data); /* I/O - function private data*/
+
+/* Error codes */
+#define MP_ERR_INPUT (0) /* General input parameter error */
+#define MP_ERR_NAN (-16) /* User function produced non-finite values */
+#define MP_ERR_FUNC (-17) /* No user function was supplied */
+#define MP_ERR_NPOINTS (-18) /* No user data points were supplied */
+#define MP_ERR_NFREE (-19) /* No free parameters */
+#define MP_ERR_MEMORY (-20) /* Memory allocation error */
+#define MP_ERR_INITBOUNDS (-21) /* Initial values inconsistent w constraints*/
+#define MP_ERR_BOUNDS (-22) /* Initial constraints inconsistent */
+#define MP_ERR_PARAM (-23) /* General input parameter error */
+#define MP_ERR_DOF (-24) /* Not enough degrees of freedom */
+
+/* Potential success status codes */
+#define MP_OK_CHI (1) /* Convergence in chi-square value */
+#define MP_OK_PAR (2) /* Convergence in parameter value */
+#define MP_OK_BOTH (3) /* Both MP_OK_PAR and MP_OK_CHI hold */
+#define MP_OK_DIR (4) /* Convergence in orthogonality */
+#define MP_MAXITER (5) /* Maximum number of iterations reached */
+#define MP_FTOL (6) /* ftol is too small; no further improvement*/
+#define MP_XTOL (7) /* xtol is too small; no further improvement*/
+#define MP_GTOL (8) /* gtol is too small; no further improvement*/
+
+/* Double precision numeric constants */
+#define MP_MACHEP0 2.2204460e-16
+#define MP_DWARF 2.2250739e-308
+#define MP_GIANT 1.7976931e+308
+
+#if 0
+/* Float precision */
+#define MP_MACHEP0 1.19209e-07
+#define MP_DWARF 1.17549e-38
+#define MP_GIANT 3.40282e+38
+#endif
+
+#define MP_RDWARF (sqrt(MP_DWARF*1.5)*10)
+#define MP_RGIANT (sqrt(MP_GIANT)*0.1)
+
+
+/* External function prototype declarations */
+extern int mpfit(mp_func funct, int m, int npar,
+ double *xall, mp_par *pars, mp_config *config,
+ void *private_data,
+ mp_result *result);
+
+
+
+/* C99 uses isfinite() instead of finite() */
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+#define mpfinite(x) isfinite(x)
+
+/* Microsoft C uses _finite(x) instead of finite(x) */
+#elif defined(_MSC_VER) && _MSC_VER
+#include <float.h>
+#define mpfinite(x) _finite(x)
+
+/* Default is to assume that compiler/library has finite() function */
+#else
+#define mpfinite(x) finite(x)
+
+#endif
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif /* MPFIT_H */
diff --git a/be/src/util/CMakeLists.txt b/be/src/util/CMakeLists.txt
index 08002ed..c9e0f3c 100644
--- a/be/src/util/CMakeLists.txt
+++ b/be/src/util/CMakeLists.txt
@@ -17,6 +17,7 @@
set(SQUEASEL_SRC_DIR "${CMAKE_SOURCE_DIR}/be/src/thirdparty/squeasel")
set(MUSTACHE_SRC_DIR "${CMAKE_SOURCE_DIR}/be/src/thirdparty/mustache")
+set(MPFIT_SRC_DIR "${CMAKE_SOURCE_DIR}/be/src/thirdparty/mpfit")
# where to put generated libraries
set(LIBRARY_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}/util")
@@ -58,6 +59,7 @@
min-max-filter.cc
min-max-filter-ir.cc
minidump.cc
+ mpfit-util.cc
network-util.cc
openssl-util.cc
os-info.cc
@@ -86,6 +88,7 @@
${SQUEASEL_SRC_DIR}/squeasel.c
webserver.cc
${MUSTACHE_SRC_DIR}/mustache.cc
+ ${MPFIT_SRC_DIR}/mpfit.c
)
add_dependencies(Util gen-deps gen_ir_descriptions)
diff --git a/be/src/util/mpfit-util.cc b/be/src/util/mpfit-util.cc
new file mode 100644
index 0000000..0f001ad
--- /dev/null
+++ b/be/src/util/mpfit-util.cc
@@ -0,0 +1,82 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "util/mpfit-util.h"
+
+#include <cstring>
+
+using namespace impala;
+using std::string;
+using std::function;
+
+ObjectiveFunction::ObjectiveFunction(string name, int num_params,
+ function<double (double, const double*)> fn)
+ : name_(name), num_params_(num_params), fn_(fn), num_points_(0), xs_(nullptr),
+ ys_(nullptr) {
+ DCHECK_GT(num_params, 0);
+}
+
+/// User-supplied function called by MpFit during curve fitting. MpFit requires this
+/// user-supplied function to compute "dy = y - fn(x)" for every known data point given
+/// the objective function 'fn' and the current list of parameter values 'params'.
+/// Declared here so it can be used in ObjectiveFunction below.
+/// 'num_points' is the number of data points for fitting
+/// 'num_params' is the number of parameters in the objective function
+/// 'params' is an input array with 'num_params' parameter values
+/// 'dy' is an output parameter allocated by MpFit. It contains the "delta y" for each
+/// data point. This function sets dy[i] = y[i] - fn(x[i]) for each data point i and
+/// for the objective function 'fn'.
+/// 'obj_func' points to an ObjectiveFunction which contains the objective function
+/// and the data points
+/// The 'dvec' parameter is not important for our purposes. See the MpFit documentation.
+/// MpFit allows these user-supplied functions to indicate errors by returning a non-zero
+/// value. Our objective functions are simple and we always return 0.
+static int ComputeDeltaY(int num_points, int num_params, double* params, double* dy,
+ double** dvec, void* obj_func);
+
+bool ObjectiveFunction::LmsFit(const double* xs, const double* ys, int num_points) {
+ DCHECK(xs != nullptr);
+ DCHECK(ys != nullptr);
+ DCHECK_GE(num_points, 0);
+ num_points_ = num_points;
+ xs_ = xs;
+ ys_ = ys;
+
+ // Initialize function parameters.
+ params_.reset(new (std::nothrow) double[num_params_]);
+ if (params_ == nullptr) return false;
+ for (int i = 0; i < num_params_; ++i) params_.get()[i] = 1.0;
+
+ struct mp_config_struct config;
+ memset(&config, 0, sizeof(config));
+ // Maximum number of calls to SampledNdvMpFit(). Value determined empirically.
+ config.maxfev = 1000;
+ // Maximum number of fitting iterations. Value determined empirically.
+ config.maxiter = 200;
+ memset(&result_, 0, sizeof(result_));
+ int ret = mpfit(ComputeDeltaY, num_points, num_params_, params_.get(), nullptr,
+ &config, this, &result_);
+ // Any positive integer indicates success.
+ return ret > 0;
+}
+
+int ComputeDeltaY(int num_points, int num_params, double* params, double* dy,
+ double** dvec, void* obj_func) {
+ ObjectiveFunction* fn = reinterpret_cast<ObjectiveFunction*>(obj_func);
+ for (int i = 0; i < num_points; ++i) dy[i] = fn->GetDeltaY(i, params);
+ return 0;
+}
diff --git a/be/src/util/mpfit-util.h b/be/src/util/mpfit-util.h
new file mode 100644
index 0000000..9eac3c2
--- /dev/null
+++ b/be/src/util/mpfit-util.h
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+#ifndef IMPALA_MPFIT_UTIL_H
+#define IMPALA_MPFIT_UTIL_H
+
+#include <functional>
+#include <memory>
+#include <string>
+
+#include "common/logging.h"
+#include "common/status.h"
+#include "thirdparty/mpfit/mpfit.h"
+
+namespace impala {
+
+/// Objective function to be fit using the MpFit library. An objective function has
+/// exactly one 'x' variable and any number of additional non-variable parameters whose
+/// values are unknown. The purpose of curve fitting is to find the best values for those
+/// parameters given an objective function and a series of x/y data points.
+/// This class contains the objective function as well as the data points because that's
+/// the most natural setup for the MpFit API.
+/// Calling LmsFit() determines the parameters for the objective function.
+/// Calling GetY() computes the 'y' value for a given 'x' using the function parameters.
+/// The objective function is of type function<double (double, const double*)>
+/// By convention, the first argument is the value of the 'x' variable and the second
+/// argument is an array of function parameters which are determined during fitting.
+class ObjectiveFunction {
+ public:
+ ObjectiveFunction(std::string name, int num_params,
+ std::function<double (double, const double*)> fn);
+
+ /// Performs least mean squares (LMS) curve fitting using the MpFit library
+ /// against the provided x/y data points.
+ /// Returns true if fitting was successful, false otherwise.
+ bool LmsFit(const double* xs, const double* ys, int num_points) WARN_UNUSED_RESULT;
+
+ /// Evaluates the objective function over the given 'x' value.
+ double GetY(int64_t x) const {
+ DCHECK(params_ != nullptr);
+ return fn_(x, params_.get());
+ }
+
+ /// Returns the difference between the y value of data point 'pidx' and the
+ /// y value of the objective function with the given parameters over the x value
+ /// of the same point.
+ double GetDeltaY(int pidx, const double* params) const {
+ DCHECK_LT(pidx, num_points_);
+ return ys_[pidx] - fn_(xs_[pidx], params);
+ }
+
+ /// Returns the Chi-Square of fitting. This is an indication of how well the function
+ /// fits. Lower is better. Valid to call after LmsFit().
+ double GetError() const {
+ DCHECK(params_ != nullptr);
+ return result_.bestnorm;
+ }
+
+ private:
+ /// Human-readable name of this function. Used for debugging.
+ std::string name_;
+
+ /// Function parameters to be determined by fitting.
+ const int num_params_;
+ std::unique_ptr<double> params_;
+
+ /// MPFit result structure. Populated by in LmsFit(). All pointers in this structure
+ /// are optional and must be allocated and owned by the caller of mpfit(). Passing
+ /// nullptr indicates to MPFit that those fields should not be populated.
+ mp_result result_;
+
+ /// Objective function whose parameters should be fit to the data points.
+ std::function<double (double, const double*)> fn_;
+
+ /// Known x/y data points. Memory not owned.
+ int num_points_;
+ const double* xs_;
+ const double* ys_;
+};
+
+}
+
+#endif
diff --git a/bin/rat_exclude_files.txt b/bin/rat_exclude_files.txt
index 73cd073..fdb1926 100644
--- a/bin/rat_exclude_files.txt
+++ b/bin/rat_exclude_files.txt
@@ -19,8 +19,9 @@
tests/__init__.py
www/index.html
-# See LICENSE.txt
+# See $IMPALA_HOME/LICENSE.txt
be/src/gutil/*
+be/src/thirdparty/mpfit/*
be/src/kudu/gutil
www/highlight/*
www/DataTables*/*
diff --git a/bin/run_clang_tidy.sh b/bin/run_clang_tidy.sh
index ae389a9..e879b35 100755
--- a/bin/run_clang_tidy.sh
+++ b/bin/run_clang_tidy.sh
@@ -35,7 +35,10 @@
echo "WARNING: compile failed" >&2
fi
-DIRS=$(ls -d "${IMPALA_HOME}/be/src/"*/ | grep -v gutil | grep -v kudu | tr '\n' ' ')
+DIRS=$(ls -d "${IMPALA_HOME}/be/src/"*/ | grep -v gutil | grep -v kudu |\
+ grep -v thirdparty | tr '\n' ' ')
+# Include/exclude select thirdparty dirs.
+DIRS=$DIRS$(ls -d "${IMPALA_HOME}/be/src/thirdparty/"*/ | grep -v mpfit | tr '\n' ' ')
PIPE_DIRS=$(echo "${DIRS}" | tr ' ' '|')
# Reduce the concurrency to one less than the number of cores in the system. Note than
diff --git a/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java b/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
index 9574a02..12a34f7 100644
--- a/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
+++ b/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
@@ -469,8 +469,6 @@
return;
}
- Type[] argTypes = collectChildReturnTypes();
-
// User needs DB access.
Db db = analyzer.getDb(fnName_.getDb(), Privilege.VIEW_METADATA, true);
if (!db.containsFunction(fnName_.getFunction())) {
@@ -482,8 +480,7 @@
// There is no version of COUNT() that takes more than 1 argument but after
// the rewrite, we only need count(*).
// TODO: fix how we rewrite count distinct.
- argTypes = new Type[0];
- Function searchDesc = new Function(fnName_, argTypes, Type.INVALID, false);
+ Function searchDesc = new Function(fnName_, new Type[0], Type.INVALID, false);
fn_ = db.getFunction(searchDesc, Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
type_ = fn_.getReturnType();
// Make sure BE doesn't see any TYPE_NULL exprs
@@ -506,6 +503,28 @@
"AVG requires a numeric or timestamp parameter: " + toSql());
}
+ // SAMPLED_NDV() is only valid with two children. Invocations with an invalid number
+ // of children are gracefully handled when resolving the function signature.
+ if (fnName_.getFunction().equalsIgnoreCase("sampled_ndv")
+ && children_.size() == 2) {
+ if (!(children_.get(1) instanceof NumericLiteral)) {
+ throw new AnalysisException(
+ "Second parameter of SAMPLED_NDV() must be a numeric literal in [0,1]: " +
+ children_.get(1).toSql());
+ }
+ NumericLiteral samplePerc = (NumericLiteral) children_.get(1);
+ if (samplePerc.getDoubleValue() < 0 || samplePerc.getDoubleValue() > 1.0) {
+ throw new AnalysisException(
+ "Second parameter of SAMPLED_NDV() must be a numeric literal in [0,1]: " +
+ samplePerc.toSql());
+ }
+ // Numeric literals with a decimal point are analyzed as decimals. Without this
+ // cast we might resolve to the wrong function because there is no exactly
+ // matching signature with decimal as the second argument.
+ children_.set(1, samplePerc.uncheckedCastTo(Type.DOUBLE));
+ }
+
+ Type[] argTypes = collectChildReturnTypes();
Function searchDesc = new Function(fnName_, argTypes, Type.INVALID, false);
fn_ = db.getFunction(searchDesc, Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
if (fn_ == null || (!isInternalFnCall_ && !fn_.userVisible())) {
diff --git a/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java b/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
index 07699d3..f316410 100644
--- a/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
+++ b/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
@@ -22,7 +22,6 @@
import java.util.Map;
import org.apache.hadoop.hive.metastore.api.Database;
-
import org.apache.impala.analysis.ArithmeticExpr;
import org.apache.impala.analysis.BinaryPredicate;
import org.apache.impala.analysis.CaseExpr;
@@ -32,6 +31,7 @@
import org.apache.impala.analysis.IsNullPredicate;
import org.apache.impala.analysis.LikePredicate;
import org.apache.impala.builtins.ScalarBuiltins;
+
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Lists;
@@ -304,6 +304,30 @@
"9HllUpdateIN10impala_udf10DecimalValEEEvPNS2_15FunctionContextERKT_PNS2_9StringValE")
.build();
+ private static final Map<Type, String> SAMPLED_NDV_UPDATE_SYMBOL =
+ ImmutableMap.<Type, String>builder()
+ .put(Type.BOOLEAN,
+ "16SampledNdvUpdateIN10impala_udf10BooleanValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.TINYINT,
+ "16SampledNdvUpdateIN10impala_udf10TinyIntValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.SMALLINT,
+ "16SampledNdvUpdateIN10impala_udf11SmallIntValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.INT,
+ "16SampledNdvUpdateIN10impala_udf6IntValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.BIGINT,
+ "16SampledNdvUpdateIN10impala_udf9BigIntValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.FLOAT,
+ "16SampledNdvUpdateIN10impala_udf8FloatValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.DOUBLE,
+ "16SampledNdvUpdateIN10impala_udf9DoubleValEEEvPNS2_15FunctionContextERKT_RKS3_PNS2_9StringValE")
+ .put(Type.STRING,
+ "16SampledNdvUpdateIN10impala_udf9StringValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPS3_")
+ .put(Type.TIMESTAMP,
+ "16SampledNdvUpdateIN10impala_udf12TimestampValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .put(Type.DECIMAL,
+ "16SampledNdvUpdateIN10impala_udf10DecimalValEEEvPNS2_15FunctionContextERKT_RKNS2_9DoubleValEPNS2_9StringValE")
+ .build();
+
private static final Map<Type, String> PC_UPDATE_SYMBOL =
ImmutableMap.<Type, String>builder()
.put(Type.BOOLEAN,
@@ -788,6 +812,19 @@
"_Z20IncrementNdvFinalizePN10impala_udf15FunctionContextERKNS_9StringValE",
true, false, true));
+ // SAMPLED_NDV.
+ // Size needs to be kept in sync with SampledNdvState in the BE.
+ int NUM_HLL_BUCKETS = 32;
+ int size = 16 + NUM_HLL_BUCKETS * (8 + HLL_INTERMEDIATE_SIZE);
+ Type sampledIntermediateType = ScalarType.createFixedUdaIntermediateType(size);
+ db.addBuiltin(AggregateFunction.createBuiltin(db, "sampled_ndv",
+ Lists.newArrayList(t, Type.DOUBLE), Type.BIGINT, sampledIntermediateType,
+ prefix + "14SampledNdvInitEPN10impala_udf15FunctionContextEPNS1_9StringValE",
+ prefix + SAMPLED_NDV_UPDATE_SYMBOL.get(t),
+ prefix + "15SampledNdvMergeEPN10impala_udf15FunctionContextERKNS1_9StringValEPS4_",
+ null,
+ prefix + "18SampledNdvFinalizeEPN10impala_udf15FunctionContextERKNS1_9StringValE",
+ true, false, true));
Type pcIntermediateType =
ScalarType.createFixedUdaIntermediateType(PC_INTERMEDIATE_SIZE);
diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
index c2417f6..4de25fe 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
@@ -2132,7 +2132,6 @@
parts[selectedIdx] = parts[numFilesRemaining - 1];
--numFilesRemaining;
}
-
return result;
}
}
diff --git a/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java b/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
index 54aa098..133b6e2 100644
--- a/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
+++ b/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
@@ -18,19 +18,23 @@
package org.apache.impala.analysis;
import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;
import java.lang.reflect.Field;
import java.util.List;
+import org.apache.impala.catalog.Column;
import org.apache.impala.catalog.PrimitiveType;
import org.apache.impala.catalog.ScalarType;
+import org.apache.impala.catalog.Table;
import org.apache.impala.catalog.Type;
import org.apache.impala.common.AnalysisException;
import org.apache.impala.common.RuntimeEnv;
import org.junit.Assert;
import org.junit.Test;
+import com.google.common.base.Joiner;
import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;
@@ -2051,6 +2055,55 @@
}
@Test
+ public void TestSampledNdv() throws AnalysisException {
+ Table allScalarTypes = addAllScalarTypesTestTable();
+ String tblName = allScalarTypes.getFullName();
+
+ // Positive tests: Test all scalar types and valid sampling percents.
+ double validSamplePercs[] = new double[] { 0.0, 0.1, 0.2, 0.5, 0.8, 1.0 };
+ for (double perc: validSamplePercs) {
+ List<String> allAggFnCalls = Lists.newArrayList();
+ for (Column col: allScalarTypes.getColumns()) {
+ String aggFnCall = String.format("sampled_ndv(%s, %s)", col.getName(), perc);
+ allAggFnCalls.add(aggFnCall);
+ String stmtSql = String.format("select %s from %s", aggFnCall, tblName);
+ SelectStmt stmt = (SelectStmt) AnalyzesOk(stmtSql);
+ // Verify that the resolved function signature matches as expected.
+ Type[] args = stmt.getAggInfo().getAggregateExprs().get(0).getFn().getArgs();
+ assertEquals(args.length, 2);
+ assertTrue(col.getType().matchesType(args[0]) ||
+ col.getType().isStringType() && args[0].equals(Type.STRING));
+ assertEquals(Type.DOUBLE, args[1]);
+ }
+ // Test several calls in the same query block.
+ AnalyzesOk(String.format(
+ "select %s from %s", Joiner.on(",").join(allAggFnCalls), tblName));
+ }
+
+ // Negative tests: Incorrect number of args.
+ AnalysisError(
+ String.format("select sampled_ndv() from %s", tblName),
+ "No matching function with signature: sampled_ndv().");
+ AnalysisError(
+ String.format("select sampled_ndv(int_col) from %s", tblName),
+ "No matching function with signature: sampled_ndv(INT).");
+ AnalysisError(
+ String.format("select sampled_ndv(int_col, 0.1, 10) from %s", tblName),
+ "No matching function with signature: sampled_ndv(INT, DECIMAL(1,1), TINYINT).");
+
+ // Negative tests: Invalid sampling percent.
+ String invalidSamplePercs[] = new String[] {
+ "int_col", "double_col", "100 / 10", "-0.1", "1.1", "100", "50", "-50", "NULL"
+ };
+ for (String invalidPerc: invalidSamplePercs) {
+ AnalysisError(
+ String.format("select sampled_ndv(int_col, %s) from %s", invalidPerc, tblName),
+ "Second parameter of SAMPLED_NDV() must be a numeric literal in [0,1]: " +
+ invalidPerc);
+ }
+ }
+
+ @Test
public void TestGroupConcat() throws AnalysisException {
// Test valid and invalid parameters
AnalyzesOk("select group_concat(distinct name) from functional.testtbl");
diff --git a/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java b/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
index c014dff..6718cb4 100644
--- a/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
+++ b/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
@@ -58,8 +58,8 @@
import org.apache.impala.thrift.TQueryCtx;
import org.apache.impala.thrift.TQueryOptions;
import org.junit.After;
-import org.junit.Assert;
import org.junit.AfterClass;
+import org.junit.Assert;
import org.junit.BeforeClass;
import com.google.common.base.Joiner;
@@ -233,6 +233,16 @@
return dummyView;
}
+ protected Table addAllScalarTypesTestTable() {
+ addTestDb("allscalartypesdb", "");
+ return addTestTable("create table allscalartypes (" +
+ "bool_col boolean, tinyint_col tinyint, smallint_col smallint, int_col int, " +
+ "bigint_col bigint, float_col float, double_col double, dec1 decimal(9,0), " +
+ "d2 decimal(10, 0), d3 decimal(20, 10), d4 decimal(38, 38), d5 decimal(10, 5), " +
+ "timestamp_col timestamp, string_col string, varchar_col varchar(50), " +
+ "char_col char (30))");
+ }
+
protected void clearTestTables() {
for (Table testTable: testTables_) {
testTable.getDb().removeTable(testTable.getName());
diff --git a/tests/query_test/test_aggregation.py b/tests/query_test/test_aggregation.py
index 9e0be6d..233c33a 100644
--- a/tests/query_test/test_aggregation.py
+++ b/tests/query_test/test_aggregation.py
@@ -275,6 +275,75 @@
vector.get_value('exec_option')['batch_size'] = 1
self.run_test_case('QueryTest/parquet-stats-agg', vector, unique_database)
+ def test_sampled_ndv(self, vector, unique_database):
+ """The SAMPLED_NDV() function is inherently non-deterministic and cannot be
+ reasonably made deterministic with existing options so we test it separately.
+ The goal of this test is to ensure that SAMPLED_NDV() works on all data types
+ and returns approximately sensible estimates. It is not the goal of this test
+ to ensure tight error bounds on the NDV estimates. SAMPLED_NDV() is expected
+ be inaccurate on small data sets like the ones we use in this test."""
+ if (vector.get_value('table_format').file_format != 'text' or
+ vector.get_value('table_format').compression_codec != 'none'):
+ # No need to run this test on all file formats
+ pytest.skip()
+
+ # NDV() is used a baseline to compare SAMPLED_NDV(). Both NDV() and SAMPLED_NDV()
+ # are based on HyperLogLog so NDV() is roughly the best that SAMPLED_NDV() can do.
+ # Expectations: All columns except 'id' and 'timestmap_col' have low NDVs and are
+ # expected to be reasonably accurate with SAMPLED_NDV(). For the two high-NDV columns
+ # SAMPLED_NDV() is expected to have high variance and error.
+ ndv_stmt = """
+ select ndv(bool_col), ndv(tinyint_col),
+ ndv(smallint_col), ndv(int_col),
+ ndv(bigint_col), ndv(float_col),
+ ndv(double_col), ndv(string_col),
+ ndv(cast(double_col as decimal(3, 0))),
+ ndv(cast(double_col as decimal(10, 5))),
+ ndv(cast(double_col as decimal(20, 10))),
+ ndv(cast(double_col as decimal(38, 35))),
+ ndv(cast(string_col as varchar(20))),
+ ndv(cast(string_col as char(10))),
+ ndv(timestamp_col), ndv(id)
+ from functional_parquet.alltypesagg"""
+ ndv_result = self.execute_query(ndv_stmt)
+ ndv_vals = ndv_result.data[0].split('\t')
+
+ for sample_perc in [0.1, 0.2, 0.5, 1.0]:
+ sampled_ndv_stmt = """
+ select sampled_ndv(bool_col, {0}), sampled_ndv(tinyint_col, {0}),
+ sampled_ndv(smallint_col, {0}), sampled_ndv(int_col, {0}),
+ sampled_ndv(bigint_col, {0}), sampled_ndv(float_col, {0}),
+ sampled_ndv(double_col, {0}), sampled_ndv(string_col, {0}),
+ sampled_ndv(cast(double_col as decimal(3, 0)), {0}),
+ sampled_ndv(cast(double_col as decimal(10, 5)), {0}),
+ sampled_ndv(cast(double_col as decimal(20, 10)), {0}),
+ sampled_ndv(cast(double_col as decimal(38, 35)), {0}),
+ sampled_ndv(cast(string_col as varchar(20)), {0}),
+ sampled_ndv(cast(string_col as char(10)), {0}),
+ sampled_ndv(timestamp_col, {0}), sampled_ndv(id, {0})
+ from functional_parquet.alltypesagg""".format(sample_perc)
+ sampled_ndv_result = self.execute_query(sampled_ndv_stmt)
+ sampled_ndv_vals = sampled_ndv_result.data[0].split('\t')
+
+ assert len(sampled_ndv_vals) == len(ndv_vals)
+ # Low NDV columns. We expect a reasonaby accurate estimate regardless of the
+ # sampling percent.
+ for i in xrange(0, 14):
+ self.__appx_equals(int(sampled_ndv_vals[i]), int(ndv_vals[i]), 0.1)
+ # High NDV columns. We expect the estimate to have high variance and error.
+ # Since we give NDV() and SAMPLED_NDV() the same input data, i.e., we are not
+ # actually sampling for SAMPLED_NDV(), we expect the result of SAMPLED_NDV() to
+ # be bigger than NDV() proportional to the sampling percent.
+ # For example, the column 'id' is a PK so we expect the result of SAMPLED_NDV()
+ # with a sampling percent of 0.1 to be approximately 10x of the NDV().
+ for i in xrange(14, 16):
+ self.__appx_equals(int(sampled_ndv_vals[i]) * sample_perc, int(ndv_vals[i]), 2.0)
+
+ def __appx_equals(self, a, b, diff_perc):
+ """Returns True if 'a' and 'b' are within 'diff_perc' percent of each other,
+ False otherwise. 'diff_perc' must be a float in [0,1]."""
+ assert abs(a - b) / float(max(a, b)) <= diff_perc
+
class TestWideAggregationQueries(ImpalaTestSuite):
"""Test that aggregations with many grouping columns work"""
@classmethod