
Pooled Variance Calculator

Instructions: This calculator computes the pooled variance and standard deviation for two given sample standard deviations \(s_1\) and \(s_2\), with corresponding sample sizes \(n_1\) and \(n_2\).

How to Compute Pooled Variances

A pooled variance is an estimate of the population variance obtained from two sample variances when it is assumed that the two samples come from populations with the same population standard deviation.

In that situation, neither sample variance is a better estimate than the other, so the two sample variances are "pooled" together, as a kind of weighted average, to compute the pooled variance.

How do you calculate pooled variance?

The formula for calculating the pooled variance given two sample variances is:

\[ s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} \]

Taking the square root of the pooled variance formula, we can derive that the pooled standard deviation is:

\[s_p = \sqrt{ \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\]
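The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function names `pooled_variance` and `pooled_sd` and the input values are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """Pooled variance from two sample standard deviations and sample sizes."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 observations")
    if s1 < 0 or s2 < 0:
        raise ValueError("standard deviations must be non-negative")
    # Weight each sample variance by its degrees of freedom (n - 1).
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation: square root of the pooled variance."""
    return math.sqrt(pooled_variance(s1, n1, s2, n2))


# Hypothetical inputs: s1 = 2 with n1 = 7, and s2 = 4 with n2 = 4.
# By hand: (6 * 4 + 3 * 16) / 9 = 72 / 9 = 8.
print(pooled_variance(2, 7, 4, 4))
print(pooled_sd(2, 7, 4, 4))
```

Note that the inputs are standard deviations, so they are squared inside the function; if you already have sample variances, pass their square roots or adapt the function accordingly.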

Relationship Between Pooled Variance and Sum of Squares

One cool way of expressing the above formulas is based on the idea of the Sum of Squares (\(SS\)). In the social sciences, the sum of squares of a sample is defined as

\[SS = \sum_{i=1}^n \left( X_i - \bar X\right)^2 \]

But using the definition of sample variance, it is straightforward to see that

\[SS = \sum_{i=1}^n \left( X_i - \bar X \right)^2 = (n-1) s^2\]

So then, we multiply the sample variance by \(n-1\) and we get the sum of squares \(SS\). Also, we know that for the one-sample case, we have that \(df = n-1\). Therefore, the pooled variance can be written very simply as:

\[ s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{ SS_1 + SS_2}{df_1+df_2}\]
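The identity above can be checked numerically when the raw data are available. The following sketch uses Python's standard `statistics` module; the helper names `sum_of_squares` and `pooled_variance_from_data` and the two small samples are made up for illustration.

```python
from statistics import mean, variance


def sum_of_squares(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_variance_from_data(a, b):
    """Pooled variance as (SS_1 + SS_2) / (df_1 + df_2)."""
    return (sum_of_squares(a) + sum_of_squares(b)) / ((len(a) - 1) + (len(b) - 1))


# Hypothetical samples, small enough to verify by hand.
sample_1 = [4.0, 6.0, 8.0]        # mean 6, SS = 4 + 0 + 4 = 8
sample_2 = [1.0, 2.0, 3.0, 6.0]   # mean 3, SS = 4 + 1 + 0 + 9 = 14

# SS should equal (n - 1) * s^2, where s^2 is the sample variance.
print(sum_of_squares(sample_1), (len(sample_1) - 1) * variance(sample_1))

# Pooled variance via sums of squares: (8 + 14) / (2 + 3) = 4.4.
print(pooled_variance_from_data(sample_1, sample_2))
```

Both routes (sample variances weighted by \(n-1\), or sums of squares divided by the total degrees of freedom) give the same value up to floating-point rounding.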

When to Use Pooled Variances

The idea of pooled variances requires the assumption that the population variances are equal. For the case of unequal population variances, you should use an unpooled variances calculator instead.

One context in which the idea of pooled variances is used is the t-test for two independent samples. A t-test calculator for that case relies on the pooled variance.
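For reference, this is how the pooled standard deviation \(s_p\) typically enters the two-sample t-statistic under the equal-variance assumption. The symbols \(\bar X_1\) and \(\bar X_2\) (the two sample means) are introduced here for illustration, and the statistic is written for the null hypothesis that the population means are equal:

\[ t = \frac{\bar X_1 - \bar X_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad df = n_1 + n_2 - 2 \]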

What is the pooled variance in a z-test?

The pooled variance does not apply in the case of a z-test, because in that case the population variances are assumed to be known and there is no need to pool them to make the best possible estimate.

The idea of a pooled variance is more relevant when the population variances are not known, and there is a need to come up with a good estimate, in which case the pooling of the variances does a good job at that.

What is the purpose of the pooled variance?

As explained above, the purpose of computing a pooled variance is to estimate the common population variance when the actual population variance is not known.

That is why it is relevant to know the pooled variance for the t-test formula, because that is precisely a case where the population variances are unknown.

So in a way, the pooled variance is a kind of weighted average of variances, used to get the best possible estimate based on sample information.
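The weighted-average reading can be made explicit by rearranging the pooled variance formula. The weights \(w_1\) and \(w_2\) below are notation introduced only for this illustration:

\[ s_p^2 = w_1 s_1^2 + w_2 s_2^2, \qquad w_1 = \frac{n_1-1}{n_1+n_2-2}, \quad w_2 = \frac{n_2-1}{n_1+n_2-2} \]

Since \(w_1 + w_2 = 1\), the larger sample gets the larger weight, and when \(n_1 = n_2\) the pooled variance reduces to the simple average of the two sample variances.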

Is pooled variance the same as MSE?

In the context of an ANOVA, it is. The MSE is the pooled variance of the samples. In that case, the pooling can include more than two samples.

The pooled variance formula for more than two samples is a simple extension of the formula for two samples.
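As a sketch of that extension, assume \(k\) samples with sizes \(n_1, \dots, n_k\), sample variances \(s_1^2, \dots, s_k^2\), and total sample size \(N = n_1 + \cdots + n_k\) (the index \(j\) and the symbols \(k\) and \(N\) are notation introduced here):

\[ s_p^2 = \frac{\sum_{j=1}^k (n_j-1)s_j^2}{\sum_{j=1}^k (n_j-1)} = \frac{SS_1 + SS_2 + \cdots + SS_k}{N-k} \]

With \(k = 2\) this reduces to the two-sample formula, and in a one-way ANOVA the right-hand side is the within-groups sum of squares divided by its degrees of freedom, which is the MSE.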