normal_two_sample

IMSLS_INTERMEDIATE_RESULTS, float stats[] (Input/Output)
Array of length 25 containing intermediate results. On input, stats contains intermediate statistics from a previous function invocation. When invoking the function the first time, set all stats elements to 0.0. On output, stats contains the results for the current data sets combined with the intermediate statistics supplied on input.

This option would typically be used in conjunction with the IMSLS_INTERMEDIATE_RESULTS option to process a large data set using separate threads or compute nodes. For example, a data set could be split into two subsets, where each subset of data is passed into a separate thread or compute node and processed through imsls_f_normal_two_sample with the IMSLS_INTERMEDIATE_RESULTS option. The output from each thread is then saved and input to a final call of imsls_f_normal_two_sample using the IMSLS_UNION and IMSLS_FINAL_RESULTS options.
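To illustrate why summaries of subsets are enough to recover the statistics of the full sample, the following minimal sketch merges the observation count, mean, and variance of two subsets using the standard pairwise update formula. The type partial_stats and the function merge_partial_stats are hypothetical names introduced for this illustration; they are not part of the IMSL API, and the sketch makes no claim about how imsls_f_normal_two_sample stores or combines its intermediate results internally.

```c
/* Summary of one subset: observation count, mean, and unbiased
   sample variance. Hypothetical type, not part of the IMSL API. */
typedef struct {
    double n;
    double mean;
    double var;
} partial_stats;

/* Merge two subset summaries into the summary of the combined data,
   using the pairwise update for means and sums of squared deviations. */
partial_stats merge_partial_stats(partial_stats a, partial_stats b)
{
    partial_stats out = {0.0, 0.0, 0.0};
    double n = a.n + b.n;
    if (n == 0.0) {
        return out;                 /* both subsets empty */
    }
    double delta = b.mean - a.mean;
    /* Sums of squared deviations about each subset's own mean. */
    double m2a = (a.n > 1.0) ? a.var * (a.n - 1.0) : 0.0;
    double m2b = (b.n > 1.0) ? b.var * (b.n - 1.0) : 0.0;
    /* The cross term accounts for the distance between subset means. */
    double m2 = m2a + m2b + delta * delta * a.n * b.n / n;

    out.n = n;
    out.mean = a.mean + delta * b.n / n;
    out.var = (n > 1.0) ? m2 / (n - 1.0) : 0.0;
    return out;
}
```

Because the merge needs only three numbers per subset, each thread or compute node can process its share of the data independently and pass a small summary to the final combining step.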

index	final_stats[i]
0	Mean of the first sample.
1	Mean of the second sample.
2	Variance of the first sample.
3	Variance of the second sample.
4	Number of observations in the first sample.
5	Number of observations in the second sample.
Note: final_stats[6] through final_stats[13] depend on the assumption of equal variances.
6	Pooled variance.
7	t value, assuming equal variances.
8	Probability of a larger t in absolute value, assuming normality, equal means, and equal variances.
9	Degrees of freedom assuming equal variances.
10	Lower confidence limit for the mean of the first population minus the mean of the second, assuming equal variances.
11	Upper confidence limit for the mean of the first population minus the mean of the second, assuming equal variances.
12	Lower confidence limit for the common variance.
13	Upper confidence limit for the common variance.
Note: final_stats[14] through final_stats[18] use approximations that do not depend on an assumption of equal variances.
14	t value, assuming unequal variances.
15	Approximate probability of a larger t in absolute value, assuming normality, equal means, and unequal variances.
16	Degrees of freedom assuming unequal variances, for Satterthwaite's approximation.
17	Approximate lower confidence limit for the mean of the first population minus the mean of the second, assuming unequal variances.
18	Approximate upper confidence limit for the mean of the first population minus the mean of the second, assuming unequal variances.
19	F value (greater than or equal to 1.0).
20	Probability of a larger F in absolute value, assuming normality and equal variances.
21	Lower confidence limit for the ratio of the variance of the first population to the second.
22	Upper confidence limit for the ratio of the variance of the first population to the second.
23	Number of missing values of first sample.
24	Number of missing values of second sample.

IMSLS_T_TEST_FOR_EQUAL_VARS, int *df, float *t, float *p_value (Output)
A t test for μ1 − μ2 = c, where c is the null hypothesis value. (See the description of IMSLS_T_TEST_NULL.) Argument df contains the degrees of freedom, argument t contains the t value, and argument p_value contains the probability of a larger t in absolute value, assuming equal means. This test assumes equal variances.

IMSLS_T_TEST_FOR_UNEQUAL_VARS, float *df, float *t, float *p_value (Output)
A t test for μ1 − μ2 = c, where c is the null hypothesis value. (See the description of IMSLS_T_TEST_NULL.) Argument df contains the degrees of freedom for Satterthwaite’s approximation, argument t contains the t value, and argument p_value contains the approximate probability of a larger t in absolute value, assuming equal means. This test does not assume equal variances.

IMSLS_CONFIDENCE_VARIANCE, float confidence_variance (Input)
Confidence level for inference on variances. Under the assumption of equal variances, the pooled variance is used to obtain a two-sided confidence_variance percent confidence interval for the common variance if IMSLS_CI_COMMON_VARIANCE is specified. Without making the assumption of equal variances, the ratio of the variances is of interest. A two-sided confidence_variance percent confidence interval for the ratio of the variance of the first sample to that of the second sample is computed and is returned if IMSLS_CI_RATIO_VARIANCES is specified. The confidence intervals are symmetric in probability.

Let μ1 and $\sigma_1^2$ be the mean and variance of the first population, and let μ2 and $\sigma_2^2$ be the corresponding quantities of the second population. The function provides tests and confidence intervals for the difference in means, equality of variances, and the pooled variance.

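The test that the difference in means equals a certain value, for example, μ0, depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and mean_hypothesis_value equals 0, the test is the two-sample t test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

where $s_1^2$ and $s_2^2$ are the sample variances and n1 and n2 are the numbers of observations in the first and second samples.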

If the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the earliest tests devised for this situation is the Fisher-Behrens test, based on Fisher’s concept of fiducial probability. The procedure used when IMSLS_T_TEST_FOR_UNEQUAL_VARS and/or IMSLS_CI_DIFF_FOR_UNEQUAL_VARS is specified is Satterthwaite’s procedure, as suggested by H.F. Smith and modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83).

The F statistic for testing the equality of variances is given by $F = s_{\max}^2 / s_{\min}^2$, where $s_{\max}^2$ is the larger and $s_{\min}^2$ the smaller of $s_1^2$ and $s_2^2$. If the variances are equal, this quantity has an F distribution with n1 − 1 and n2 − 1 degrees of freedom.
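The variance-ratio statistic can be sketched in a few lines. The struct f_result and the function variance_ratio_f are hypothetical names for this illustration. Tracking which sample supplied the numerator, so that the numerator degrees of freedom follow the larger variance, is an assumption of the sketch and not something the text above specifies.

```c
/* F statistic with its numerator and denominator degrees of freedom.
   Hypothetical type, not part of the IMSL API. */
typedef struct {
    double f;
    double df_num;
    double df_den;
} f_result;

/* var1, n1: sample variance and size of the first sample.
   var2, n2: the same for the second sample.
   Both variances are assumed to be strictly positive. */
f_result variance_ratio_f(double var1, double n1, double var2, double n2)
{
    f_result r;
    if (var1 >= var2) {
        /* The first sample has the larger variance. */
        r.f = var1 / var2;
        r.df_num = n1 - 1.0;
        r.df_den = n2 - 1.0;
    } else {
        /* The second sample has the larger variance. */
        r.f = var2 / var1;
        r.df_num = n2 - 1.0;
        r.df_den = n1 - 1.0;
    }
    return r;
}
```

Placing the larger variance in the numerator guarantees a value of at least 1.0, which agrees with the description of final_stats[19].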

Scores for Standard Group	Scores for Experimental Group
72	111
75	118
77	128
80	138
104	140
110	150
125	163
	164
	169

The same data is used for this example as for the initial example. Here, the results of the t test are output. The variances of the two populations are assumed to be equal. It is seen from the output that there is strong reason to believe that the two means are different (t value of −4.804). Since the 95-percent confidence interval for μ1 − μ2 does not include 0, the null hypothesis that μ1 = μ2 would be rejected at the 0.05 significance level. (The closeness of the values of the sample variances provides some qualitative substantiation of the assumption of equal variances.)

This example demonstrates how the analysis can be applied to subsets of the original data sets and then later combined for final results. These techniques may be useful when analyzing data sets too large to fit into memory, and also allow subsets of the data to be analyzed in separate threads (though this example does not show the use of separate threads) and later combined for final results.