Regression Classes
Here is a simple example of how to perform a multiple linear regression using class RWLinearRegression:
 
int main()
{
    RWGenMat<double> predictorMatrix;     // Values of the predictor
                                          // variables.
    RWMathVec<double> observationVector;  // Values of the dependent
                                          // variable.

    // A routine that reads in the regression data from a file.
    if ( !getDataFromFile("lnrexam.dat",
                          predictorMatrix, observationVector) )
    {
        return 0;
    }

    RWLinearRegression lr( predictorMatrix, observationVector );
    // Make sure parameter calculation succeeded.
    if ( lr.fail() )
    {
        return 0;
    }

    // Print out model parameters.
    cout << "Model Parameters: " << lr.parameters();
    return 0;
}
Performing a logistic regression using class RWLogisticRegression is almost identical:
 
int main()
{
    RWGenMat<double> predictorMatrix;   // Values of the predictor
                                        // variables.
    RWMathVec<bool> observationVector;  // Values of the
                                        // dependent variable.

    // A routine that reads in the regression data from a file.
    if ( !getDataFromFile("lnrlog.dat", predictorMatrix,
                          observationVector) )
    {
        return 0;
    }

    RWLogisticRegression lr( predictorMatrix, observationVector );
    // Make sure parameter calculation succeeded.
    if ( lr.fail() )
    {
        return 0;
    }

    // Print out model parameters.
    cout << "Model Parameters: " << lr.parameters();
    return 0;
}
Updating Parameter Estimates
Parameter calculations are performed automatically when you construct a regression object, and when you modify the data using one of the class methods listed below. The methods below are member functions of the base class RWRegression and are inherited by both RWLinearRegression and RWLogisticRegression. Note that S is the type double for linear regression and the type bool for logistic regression.
 
void addInterceptParameter();
void removeInterceptParameter();
 
void addObservations(const RWGenMat<double>&, const RWMathVec<S>&);
void addObservation(const RWMathVec<double>&, S);
void removeObservations(size_t startAt, size_t numToRemove);
 
void addPredictors(const RWGenMat<double>&);
void addPredictor(const RWMathVec<double>&);
void removePredictors(size_t startAt, size_t numToRemove);
In addition to the methods listed above, the regression classes have methods that provide handles to the underlying data. Although these handle methods may be used to modify that data, they do not perform the parameter calculations automatically. For example, you can change all the values of the third predictor variable in a model as follows:
 
RWLinearRegression lr;
RWMathVec<double> newPredictorValues( lr.numObservations(),
                                      rwUninitialized );
.
.
.
// Fill in the predictor values.
.
.
.
lr.predictorMatrix().col(2) = newPredictorValues;
// Must recompute the parameters explicitly...
lr.reCalculateParameters();
.
.
.
Similarly, you may change the value of the observation vector:
 
RWLogisticRegression lr;
RWMathVec<bool> newObservations( lr.numObservations(),
rwUninitialized );
.
.
.
// Fill in the observation vector.
.
.
.
lr.observationVector() = newObservations;
// Must recompute the parameters explicitly...
lr.reCalculateParameters();
.
.
.
When you change data using the handle functions, it is your responsibility to update the parameters with a call to the reCalculateParameters() method.
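The difference between the updating methods and the handle methods can be illustrated with a small stand-alone sketch. ToyRegression below is a hypothetical class written only for this illustration; it is not part of the Business Analysis Module and makes no claim about how the library is implemented. It fits a single-predictor model y = b0 + b1*x and borrows the names addObservation(), observationVector(), reCalculateParameters(), and fail() from the text above to show the same contract: mutators recompute, handles do not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy class for illustration only; it is NOT part of the
// Business Analysis Module. It fits y = b0 + b1*x by least squares.
class ToyRegression {
public:
    ToyRegression(const std::vector<double>& x, const std::vector<double>& y)
        : x_(x), y_(y), slope_(0.0), intercept_(0.0), fail_(true)
    {
        reCalculateParameters();   // automatic at construction
    }

    // Mutator: changes the data and recomputes the parameters itself.
    void addObservation(double x, double y)
    {
        x_.push_back(x);
        y_.push_back(y);
        reCalculateParameters();
    }

    // Handle: gives write access to the data but does NOT recompute.
    std::vector<double>& observationVector() { return y_; }

    // Explicit recalculation, needed after a change made through a handle.
    void reCalculateParameters()
    {
        fail_ = true;
        if (x_.size() != y_.size() || x_.size() < 2) return;
        const double n = static_cast<double>(x_.size());
        double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
        for (std::size_t i = 0; i < x_.size(); ++i) {
            sx += x_[i];
            sy += y_[i];
            sxx += x_[i] * x_[i];
            sxy += x_[i] * y_[i];
        }
        const double denom = n * sxx - sx * sx;
        if (denom == 0.0) return;  // all x values equal: no unique fit
        slope_ = (n * sxy - sx * sy) / denom;
        intercept_ = (sy - slope_ * sx) / n;
        fail_ = false;
    }

    bool fail() const { return fail_; }
    double slope() const { assert(!fail_); return slope_; }
    double intercept() const { assert(!fail_); return intercept_; }

private:
    std::vector<double> x_, y_;
    double slope_, intercept_;
    bool fail_;
};
```

In this sketch, assigning a new vector through observationVector() leaves slope() and intercept() at their old values until reCalculateParameters() is called, whereas addObservation() refreshes them immediately.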
Intercept Option
When constructing a regression object, you can specify an intercept option for the model along with the predictor and observation data. The intercept option is an enumeration defined in the base class RWRegression. The enumeration has three possible values:
noIntercept means the model does not have an intercept. In this case, the input data provided at construction time is the regression matrix. Once the regression object is constructed, calls to the RWLinearRegression and RWLogisticRegression methods predictorMatrix() and regressionMatrix() yield the same matrices.
addIntercept means the model has an intercept parameter, but it is not represented in the input data matrix as a leading column of 1s. When you specify this option, a column of 1s is prepended to the input data matrix to obtain the regression matrix. In this case, calls to the RWLinearRegression and RWLogisticRegression methods predictorMatrix() and regressionMatrix() will yield different results. Column 0 of the regression matrix will contain all 1s for the intercept parameter, and columns 1 through n, where n is the number of columns in the predictor matrix, will contain the predictor matrix.
intercept means the model has an intercept parameter, which is represented in the input data matrix as a column of 1s. In this case, the column of 1s representing the intercept parameter must be the first column. If the intercept option is specified and the first column is not 1s, an error will occur.
If the intercept option isn’t specified at construction time, it defaults to addIntercept. The intercept parameter may be added or removed from the regression model at any time using the RWLinearRegression and RWLogisticRegression methods addInterceptParameter() and removeInterceptParameter().
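The relationship between the input data matrix and the regression matrix under each option can be sketched in a few lines of standard C++. Everything in the block is an assumption made for illustration: Matrix is a plain stand-in for RWGenMat<double>, buildRegressionMatrix() is a hypothetical helper rather than a library function, and throwing std::invalid_argument is just one way to model the error mentioned above, since the text does not say how the library reports it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major matrix; a hypothetical stand-in for RWGenMat<double>.
using Matrix = std::vector<std::vector<double>>;

// Mirrors the three options described above (names reused for clarity).
enum class InterceptOption { noIntercept, addIntercept, intercept };

// Hypothetical helper: derives the regression matrix from the input data.
Matrix buildRegressionMatrix(
    const Matrix& data,
    InterceptOption option = InterceptOption::addIntercept)
{
    if (option == InterceptOption::noIntercept) {
        return data;  // input data is used unchanged
    }
    if (option == InterceptOption::intercept) {
        // The caller promises a leading column of 1s; verify it.
        for (const std::vector<double>& row : data) {
            if (row.empty() || row[0] != 1.0) {
                throw std::invalid_argument("first column must be all 1s");
            }
        }
        return data;
    }
    // addIntercept: prepend a column of 1s to every row.
    Matrix result;
    result.reserve(data.size());
    for (const std::vector<double>& row : data) {
        std::vector<double> extended(1, 1.0);
        extended.insert(extended.end(), row.begin(), row.end());
        result.push_back(extended);
    }
    assert(result.size() == data.size());
    return result;
}
```

With a one-column input, the default option produces a two-column result whose column 0 is all 1s, while noIntercept and intercept return the input as given.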
The following examples demonstrate each of the intercept options, using linear regression:
 
// m1: the regression model has one predictor variable and an
// intercept parameter:
// y = B0 + B1x1
// The leading column of 1s is included in the data.
// m1 = [ 1 2
//        1 5
//        1 6 ]
RWGenMat<double> m1( "3x2 [1 2 1 5 1 6]" );
RWMathVec<double> y1( "[ 1 4 9]" );
 
// m2: the regression model has one predictor variable and an
// intercept parameter:
// y = B0 + B1x1
// The leading column of 1s is NOT included in the data.
// m2 = [ 2
//        5
//        6 ]
RWGenMat<double> m2( "3x1 [2 5 6]" );
RWMathVec<double> y2( "[ 1 4 9]" );
 
// m3: the regression model has two predictor variables and NO
// intercept parameter:
// y = B0x0 + B1x1
// There is no leading column of 1s in the data.
// m3 = [ 3 2
//        9 5
//        7 6 ]
RWGenMat<double> m3( "3x2 [3 2 9 5 7 6]" );
RWMathVec<double> y3( "[ 1 4 7]" );
 
// Model 1 contains an intercept parameter that is represented by a
// leading column of 1s in the input data matrix m1.
RWLinearRegression model1( m1, y1, RWLinearRegression::intercept );
cout << "Number of predictors: " << model1.numPredictors() << endl;
cout << "Number of parameters: " << model1.numParameters() << endl;

// Model 2 contains an intercept parameter, but the input data
// matrix m2 contains only the data for the predictor variable.
// The class should add a leading column of 1s when
// forming the regression matrix.
RWLinearRegression model2( m2, y2 );
// This is equivalent to:
// RWLinearRegression model2( m2, y2, RWLinearRegression::addIntercept);
// since the default argument for the third parameter in the
// constructor is RWLinearRegression::addIntercept.
cout << "Number of predictors: " << model2.numPredictors() << endl;
cout << "Number of parameters: " << model2.numParameters() << endl;
 
// Model 3 does not contain an intercept parameter. The input data
// matrix m3 should be used as the regression matrix.
RWLinearRegression model3( m3, y3, RWLinearRegression::noIntercept );
cout << "Number of predictors: " << model3.numPredictors() << endl;
cout << "Number of parameters: " << model3.numParameters() << endl;
.
.
.
Note that in the example above, model1 and model2 describe the same regression model: m1 is m2 with a leading column of 1s, and y1 and y2 hold the same values.
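The equivalence of the first two models can be checked numerically without the library. The sketch below is an assumption-laden illustration, not the method used by RWLinearRegression: leastSquares() and prependOnes() are hypothetical helpers, and the fit is obtained by solving the normal equations (X^T X) b = X^T y with Gaussian elimination, which is adequate for tiny, well-conditioned examples such as m1 and m2.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // row-major

// Hypothetical helper: least-squares parameters via the normal equations.
Vector leastSquares(const Matrix& X, const Vector& y)
{
    assert(!X.empty() && X.size() == y.size());
    const std::size_t n = X.size();
    const std::size_t p = X[0].size();

    // Augmented p x (p+1) system [ X^T X | X^T y ].
    Matrix A(p, Vector(p + 1, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t r = 0; r < p; ++r) {
            for (std::size_t c = 0; c < p; ++c) A[r][c] += X[i][r] * X[i][c];
            A[r][p] += X[i][r] * y[i];
        }
    }

    // Forward elimination with partial pivoting.
    for (std::size_t col = 0; col < p; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < p; ++r) {
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        }
        assert(A[pivot][col] != 0.0);  // singular system: no unique fit
        std::swap(A[col], A[pivot]);
        for (std::size_t r = col + 1; r < p; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c <= p; ++c) A[r][c] -= f * A[col][c];
        }
    }

    // Back substitution.
    Vector b(p, 0.0);
    for (std::size_t k = p; k-- > 0;) {
        double s = A[k][p];
        for (std::size_t c = k + 1; c < p; ++c) s -= A[k][c] * b[c];
        b[k] = s / A[k][k];
    }
    return b;
}

// Hypothetical helper: prepend a column of 1s, as addIntercept describes.
Matrix prependOnes(const Matrix& data)
{
    Matrix result;
    for (const Vector& row : data) {
        Vector extended(1, 1.0);
        extended.insert(extended.end(), row.begin(), row.end());
        result.push_back(extended);
    }
    return result;
}
```

Calling leastSquares(m1, y1) on the data above and leastSquares(prependOnes(m2), y2) yields the same two parameters, B0 = -3 and B1 = 23/13 (about 1.769), because both calls solve the same system.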