imsl.regression.LogisticRegression.aggregate

LogisticRegression.aggregate(*models)

Combine separate fits of the logistic regression model.

Parameters:

models : tuple

LogisticRegression instances, passed as separate positional arguments. Every instance must describe the same logistic regression model as the current instance.

Notes

Let a, b, and c be LogisticRegression instances with the same model structure. The call c.aggregate(a, b) stores its result in c. If c already has a fit, the result combines the fits described by a, b, and c. If c has no existing fit, the result combines only the fits described by a and b.
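
The two cases above can be illustrated with a small stand-in class. This is only a sketch of the calling convention: ToyMeanModel, its attributes n and total, and its mean property are hypothetical names invented for illustration. The class "fits" a sample mean by storing a count and a sum, and it does not reproduce how LogisticRegression combines fits internally.

```python
class ToyMeanModel:
    """Hypothetical stand-in (not part of imsl): 'fits' a sample mean."""

    def __init__(self):
        self.n = 0        # number of observations behind the current fit
        self.total = 0.0  # sum of those observations

    def fit(self, data):
        # Replace any existing fit with one based on `data`.
        data = list(data)
        self.n = len(data)
        self.total = float(sum(data))

    def aggregate(self, *models):
        # Fold the other fits into this instance, in place.
        # If this instance has no fit (n == 0), only `models` contribute.
        for m in models:
            self.n += m.n
            self.total += m.total

    @property
    def mean(self):
        return self.total / self.n


a, b, c, d = ToyMeanModel(), ToyMeanModel(), ToyMeanModel(), ToyMeanModel()
a.fit([1, 2, 3])
b.fit([4, 5])
c.fit([6])

c.aggregate(a, b)      # c has a fit: result covers a, b and c
d.aggregate(*[a, b])   # d has no fit: result covers a and b only

print(c.mean, d.mean)
```

Because the signature is aggregate(*models), a list of instances is unpacked with a leading asterisk, as in the call on d.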

Examples

A logistic regression model consisting of three predictor variables, an intercept and four response classes is fit to two different data sets. The two model fits are then aggregated. Regression coefficients and coefficient standard errors are printed for the individual fits and the aggregated model.

>>> import numpy as np
>>> import imsl.regression as reg
>>> # Array 1 of predictors
>>> x1 = np.array([[3, 25.92869, 1], [2, 51.63245, 2],
...                [2, 25.78432, 1], [1, 39.37948, 1],
...                [3, 24.65058, 1], [3, 45.20084, 1],
...                [3, 52.6796, 2], [2, 44.28342, 2],
...                [3, 40.63523, 2], [3, 51.76094, 1],
...                [3, 26.30368, 1], [3, 20.70230, 2],
...                [3, 38.74273, 2], [3, 19.47333, 1],
...                [2, 26.42211, 1], [3, 37.05986, 2],
...                [2, 51.67043, 2], [1, 42.40156, 1],
...                [3, 33.90027, 2], [2, 35.43282, 1],
...                [2, 44.30369, 1], [1, 46.72387, 1],
...                [2, 46.99262, 1], [1, 36.05923, 1],
...                [3, 36.83197, 2], [2, 61.66257, 2],
...                [1, 25.67714, 1], [2, 39.08567, 2],
...                [1, 48.84341, 2], [2, 39.34391, 1],
...                [3, 24.73522, 1], [2, 50.55251, 2],
...                [1, 31.34263, 2], [2, 27.15795, 2],
...                [1, 31.72685, 1], [1, 25.00408, 1],
...                [2, 26.35457, 2], [3, 38.12343, 1],
...                [1, 49.9403, 1], [2, 42.45779, 2],
...                [1, 38.80948, 2], [1, 43.22799, 2],
...                [1, 41.87624, 1], [3, 48.0782, 1],
...                [1, 43.23673, 2], [3, 39.41294, 1],
...                [2, 23.93346, 1], [3, 42.8413, 2],
...                [3, 30.40669, 1], [1, 37.77389, 1]])
>>> # Array 2 of predictors
>>> x2 = np.array([[1, 35.66064, 1], [1, 26.68771, 1],
...                [3, 23.11251, 2], [3, 58.14765, 1],
...                [2, 44.95038, 1], [3, 42.45634, 1],
...                [3, 34.97379, 2], [3, 53.54269, 2],
...                [2, 32.57257, 2], [1, 46.91201, 1],
...                [1, 30.93306, 1], [1, 51.63743, 2],
...                [1, 34.67712, 2], [3, 53.84584, 1],
...                [3, 14.97474, 1], [2, 44.4485, 2],
...                [2, 47.10448, 1], [3, 43.96467, 1],
...                [3, 55.55741, 2], [2, 36.63123, 2],
...                [3, 32.35164, 2], [2, 55.75668, 1],
...                [1, 36.83637, 2], [3, 46.7913, 1],
...                [3, 44.24153, 2], [2, 49.94011, 1],
...                [2, 41.91916, 1], [3, 24.78584, 2],
...                [3, 50.79019, 2], [2, 39.97886, 2],
...                [1, 34.42149, 2], [2, 41.93271, 2],
...                [1, 28.59433, 2], [2, 38.47255, 2],
...                [3, 32.11676, 2], [3, 37.19347, 1],
...                [1, 52.89337, 1], [1, 34.64874, 1],
...                [2, 48.61935, 2], [2, 33.99104, 1],
...                [3, 38.32489, 2], [1, 35.53967, 2],
...                [1, 29.59645, 1], [2, 21.14665, 1],
...                [2, 51.11257, 2], [1, 34.20155, 1],
...                [1, 44.40374, 1], [2, 49.67626, 2],
...                [3, 58.35377, 1], [1, 28.03744, 1]])
>>> # Array 1 of response IDs
>>> y1 = np.array([1, 2, 3, 4, 3, 3, 4, 4, 4, 4, 2, 1, 4, 1, 1, 1, 4,
...                4, 3, 1, 2, 3, 3, 4, 2, 3, 4, 1, 2, 4, 3, 4, 4, 1,
...                3, 4, 4, 2, 3, 4, 2, 2, 4, 3, 1, 4, 3, 4, 2, 3])
>>> # Array 2 of response IDs
>>> y2 = np.array([1, 4, 1, 4, 1, 1, 3, 1, 2, 4, 3, 1, 3, 2, 4, 4, 4,
...                2, 3, 2, 1, 4, 4, 4, 4, 3, 1, 1, 3, 1, 4, 2, 4, 2,
...                1, 2, 3, 1, 1, 4, 1, 2, 4, 3, 4, 2, 4, 3, 2, 4])
>>> n_predictors = 3
>>> n_classes = 4
>>> resp1 = reg.ResponseFormatID(y1, n_classes)
>>> # Fit first model to x1, resp1
>>> model1 = reg.LogisticRegression(n_predictors, n_classes)
>>> model1.fit(x1, resp1)
>>> np.set_printoptions(formatter={'float': '{: 0.3f}'.format})
>>> print("First Model Coefficients:\n" +
...       str(model1.coefficients))
First Model Coefficients:
[[ 1.691  0.350 -0.137  1.057]
 [-1.254  0.242 -0.004  0.115]
 [ 1.032  0.278  0.016 -1.954]
 [ 0.000  0.000  0.000  0.000]]
>>> print("First Model Standard Errors:\n" +
...       str(model1.stderrs))
First Model Standard Errors:
[[ 2.389  0.565  0.061  1.025]
 [ 2.197  0.509  0.047  0.885]
 [ 2.007  0.461  0.043  0.958]]
>>> # Fit second model to x2, resp2
>>> resp2 = reg.ResponseFormatID(y2, n_classes)
>>> model2 = reg.LogisticRegression(n_predictors, n_classes)
>>> model2.fit(x2, resp2)
>>> print("Second Model Coefficients:\n" +
...       str(model2.coefficients))
Second Model Coefficients:
[[-2.668  0.758 -0.016  1.050]
 [-2.719  0.611  0.006  0.511]
 [-3.281  0.229  0.025  0.812]
 [ 0.000  0.000  0.000  0.000]]
>>> print("Second Model Standard Errors:\n" +
...       str(model2.stderrs))
Second Model Standard Errors:
[[ 2.042  0.485  0.038  0.777]
 [ 2.187  0.522  0.041  0.829]
 [ 2.334  0.545  0.045  0.853]]
>>> # Aggregate models
>>> model1.aggregate(model2)
>>> print("Aggregated Model Coefficients:\n" +
...       str(model1.coefficients))
Aggregated Model Coefficients:
[[-1.169  0.649 -0.038  0.608]
 [-1.935  0.435  0.002  0.215]
 [-0.193  0.282  0.002 -0.630]
 [ 0.000  0.000  0.000  0.000]]
>>> print("Aggregated Model Standard Errors:\n" +
...       str(model1.stderrs))
Aggregated Model Standard Errors:
[[ 1.489  0.359  0.029  0.588]
 [ 1.523  0.358  0.030  0.584]
 [ 1.461  0.344  0.030  0.596]]
>>> # Put back the default options
>>> np.set_printoptions()
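
As a reading aid for the output above, the following sketch copies the printed standard errors into plain NumPy arrays and checks, element by element, whether each aggregated standard error is smaller than the corresponding value from either individual fit. The names se1, se2, and se_agg are introduced here for illustration. The values are transcribed from the printed output (three decimals) rather than recomputed, and the snippet does not call the imsl library.

```python
import numpy as np

# Standard errors transcribed from the printed output above.
se1 = np.array([[2.389, 0.565, 0.061, 1.025],
                [2.197, 0.509, 0.047, 0.885],
                [2.007, 0.461, 0.043, 0.958]])
se2 = np.array([[2.042, 0.485, 0.038, 0.777],
                [2.187, 0.522, 0.041, 0.829],
                [2.334, 0.545, 0.045, 0.853]])
se_agg = np.array([[1.489, 0.359, 0.029, 0.588],
                   [1.523, 0.358, 0.030, 0.584],
                   [1.461, 0.344, 0.030, 0.596]])

# Smaller of the two individual standard errors for each coefficient.
best_single = np.minimum(se1, se2)

# True where the aggregated fit has the smallest standard error.
smaller = se_agg < best_single
print(smaller.all())

# Ratio of the aggregated error to the better individual error.
print(np.round(se_agg / best_single, 2))
```

For the values shown, the check holds for all twelve coefficients, which is consistent with the aggregated fit drawing on both data sets.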