Variance and the chi-square distribution

When the population variance is unknown and there is a need either to form a confidence interval for its value or to test whether a sample variance is consistent with a hypothesized population variance, the chi-square distribution is a good choice for these analyses, because a scaled sample variance from a normally distributed population follows a chi-square distribution.

1. Know when it is appropriate to use the chi-square statistic / distribution for estimates and inference about the population variance.

Like the population mean, the variance, $\sigma^2$, is in most cases unknown and its value is estimated from sample data. The sample variance, $s^2$, is often used as a point estimate of $\sigma^2$. However, $s^2$ is also a random variable and is related to the chi-square distribution as follows:

$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$, where the number of degrees of freedom is n-1.
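
As a quick check of this relationship, the sketch below simulates many samples from a normal population and computes (n-1)s^2/sigma^2 for each one. It assumes normally distributed data. The sample size, population parameters, seed, and function name are illustrative choices, not values from these notes. Because a chi-square variable with n-1 degrees of freedom has mean n-1, the average of the simulated statistics should land close to n-1.

```python
import random
import statistics


def simulate_chi2_stats(n, mu, sigma, reps, seed=0):
    """Return `reps` values of (n-1)*s^2/sigma^2 from normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    stats = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance (divides by n-1)
        stats.append((n - 1) * s2 / sigma ** 2)
    return stats


# Illustrative (assumed) settings: n = 10, so df = 9.
values = simulate_chi2_stats(n=10, mu=50.0, sigma=4.0, reps=5000)
print("average statistic:", round(statistics.mean(values), 2))
print("degrees of freedom:", 10 - 1)
```

Changing `mu` or `sigma` should leave the average near n-1, since the statistic is scaled by the true variance.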

Situations often arise where there is a need to reduce variability even though doing so may not change the mean. Consider waiting in a service queue or line at a post office: even though there are many attendants, there is a single line, and customers move from the front of the line to the next available teller.

This single line served by many attendants is an example of reducing the variability of the time spent waiting to be served, even though the average waiting time remains unchanged. When variability is smaller, the wait is more predictable, and so customers are more willing to wait.

The chi-square distribution is well suited to evaluating whether such an effort to reduce variability has been effective.

2. Know how to estimate the population variance, $\sigma^2$, given a desired confidence level.

Lower and upper bounds for $\sigma^2$ are determined by constructing a confidence interval at a chosen significance level, $\alpha$.

Confidence Interval (CI) Estimate of $\sigma^2$

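Example: The 90% CI, or 0.90 probability ($1 - \alpha = 0.90$, so $\alpha = 0.10$), is determined as follows:

Since the interval is two-tailed (lower and upper limits), each tail has area $\alpha/2 = 0.05$, and the chi-square statistics for the variance estimate are as follows:

$\chi^2_{0.95} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{0.05}$, where $\sigma^2$ is the unknown population variance and the subscript on $\chi^2$ is the right-tail area.

Or, the lower and upper limits of a 90% CI for $\sigma^2$ (variance) are given by

$\frac{(n-1)s^2}{\chi^2_{0.05}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{0.95}}$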
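
The step from the chi-square inequality to limits on the variance can be spelled out as follows. This is a sketch in an assumed notation where the subscript on $\chi^2$ is the right-tail area and the population is taken to be normal. Taking reciprocals of positive quantities reverses the inequalities, which is why the larger table value produces the lower limit.

```latex
\begin{align*}
\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}
  &\;\Longrightarrow\;
  \frac{1}{\chi^2_{\alpha/2}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{1-\alpha/2}} \\
  &\;\Longrightarrow\;
  \frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}
\end{align*}
```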

Example: The estimated variability in rates of return for 25 clients of a financial firm showed

Mean = 14.5% and s = 11.2%

Using a 98% confidence interval estimate of the variance, $\sigma^2$, in rates of return, find the confidence interval for the population standard deviation, $\sigma$.

Since the CI is 98%, $1 - \alpha = 0.98$, so $\alpha = 0.02$ and each tail has area $\alpha/2 = 0.01$. The df = (n-1) = 25-1 = 24.

Using the chi-square lookup table (column headings are right-tail areas):

df    0.995  0.99   0.975  0.95   0.90   0.10   0.05   0.025  0.01   0.005
24    9.89   10.86  12.40  13.85  15.66  33.20  36.42  39.36  42.98  45.56

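$\chi^2_{0.99} = 10.86$ and $\chi^2_{0.01} = 42.98$ (subscripts are right-tail areas), so

$\frac{24(11.2)^2}{42.98} \le \sigma^2 \le \frac{24(11.2)^2}{10.86}$, that is, $70.05 \le \sigma^2 \le 277.22$.

So the standard deviation, $\sigma$, lies between $\sqrt{70.05} = 8.37\%$ and $\sqrt{277.22} = 16.65\%$.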
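
The interval calculation can be sketched in a few lines of Python. The function name and parameter names below are hypothetical. The two chi-square values are typed in from the lookup table for df = 24 rather than computed, so this is only a check of the arithmetic under that assumption.

```python
import math


def variance_ci(n, s, chi2_low, chi2_high):
    """Confidence interval for sigma^2 from a sample of size n with std dev s.

    chi2_low  : table value with right-tail area 1 - alpha/2 (the smaller value)
    chi2_high : table value with right-tail area alpha/2 (the larger value)
    """
    if not 0 < chi2_low < chi2_high:
        raise ValueError("expected 0 < chi2_low < chi2_high")
    scaled = (n - 1) * s ** 2
    # Dividing by the larger table value gives the lower limit, and vice versa.
    return scaled / chi2_high, scaled / chi2_low


# Rates-of-return example: n = 25, s = 11.2 (percent), 98% CI, df = 24.
var_lo, var_hi = variance_ci(n=25, s=11.2, chi2_low=10.86, chi2_high=42.98)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)
print(f"variance: {var_lo:.2f} to {var_hi:.2f}")
print(f"std dev:  {sd_lo:.2f} to {sd_hi:.2f}")
```

Taking square roots of the variance limits is valid because the square root is an increasing function, so the ordering of the limits is preserved.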

3. Know how to construct and evaluate hypothesis tests regarding $\sigma^2$.

Problem: A bank manager observed that the standard deviation of the time customers wait in line for service during the Christmas holiday season is about 10 minutes. Hoping to implement a new single-line service policy, the manager ran a pilot study in which the waiting times of 25 customers were observed under the new policy; their standard deviation was calculated to be 5 minutes. Should the manager adopt the new single-line policy based on this pilot study?

Step 1. Make a problem statement (this becomes the null hypothesis statement, Ho).

Assume that the variability in waiting times under the experimental single-line policy will be at least as great as under the current policy.

Critical values of tests (subscript = right-tail area, df = n-1)

Lower-Tailed Test                        Upper-Tailed Test
Critical value = $\chi^2_{1-\alpha}$     Critical value = $\chi^2_{\alpha}$

Hypothesis:

Given that the population variance is 100 ($\sigma_0^2 = 10^2 = 100$),

Ho: $\sigma^2 \ge 100$ (that is, the new variability is the same as or worse than the old, tested at significance level alpha)

Or

Ho: $\chi^2 \ge \chi^2_{1-\alpha}$ (one-tailed lower test, since the null hypothesis is not rejected if the chi-square statistic is greater than or equal to the lower critical value of the chi-square distribution)

Ha: Ho is not true (alternate hypothesis): $\sigma^2 < 100$, that is, s = 5 reflects a standard deviation truly lower than $\sigma_0 = 10$. (One-tailed lower test)

Step 2. Choose $\alpha$, the significance level of the test.

If you want to be 99% confident in the test result, then $\alpha$ = 0.01 = (100-99)/100

The df = n-1 = 25-1 = 24

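Step 3. Look up $\chi^2_{1-\alpha}$ from the chi-square table (column headings are right-tail areas):

df    0.995  0.99   0.975  0.95   0.90   0.10   0.05   0.025  0.01   0.005
24    9.89   10.86  12.40  13.85  15.66  33.20  36.42  39.36  42.98  45.56

For d.f. = 24, $\chi^2_{1-\alpha} = \chi^2_{0.99} = 10.86$ (the lower critical value for a one-tailed test of the chi-square statistic)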

Step 4. Determine or compute $\chi^2$:

$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{24(5)^2}{(10)^2} = 6.00$

Step 5. Perform the chi-square test:

Since $\chi^2 < \chi^2_{0.99}$, i.e. 6.00 < 10.86, we reject the null hypothesis that s = 5 comes from a population whose standard deviation is 10.

So indeed 5 is truly smaller than 10.

Make a conclusion or inference:

We conclude that a single waiting line policy reduces the standard deviation of waiting time from 10 minutes to 5 minutes. So adopt the single-line policy, since it reduces variability.

So Ha (alternate hypothesis) is favored by this test.
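
The five steps can be condensed into a short Python sketch. The function name is hypothetical, and the critical value 10.86 is typed in from the lookup table (df = 24, right-tail area 0.99) rather than computed. The sketch assumes the waiting times are approximately normally distributed.

```python
def lower_tail_variance_test(n, s, sigma0, chi2_critical):
    """Lower-tailed chi-square test of Ho: sigma^2 >= sigma0^2.

    chi2_critical is the table value with right-tail area 1 - alpha
    for n - 1 degrees of freedom. Returns (statistic, reject_ho).
    """
    chi2_stat = (n - 1) * s ** 2 / sigma0 ** 2
    # Reject Ho only when the statistic falls below the lower critical value.
    return chi2_stat, chi2_stat < chi2_critical


# Bank example: n = 25, s = 5, sigma0 = 10, alpha = 0.01, df = 24.
stat, reject = lower_tail_variance_test(n=25, s=5, sigma0=10, chi2_critical=10.86)
print(f"chi-square statistic: {stat:.2f}")
print("reject Ho:", reject)
```

With these inputs the statistic is 24 x 25 / 100 = 6.0, which is below the critical value 10.86, so Ho is rejected.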

Workshop Problem (Variance Tests)

The following sample data give the transportation costs (in dollars) for moving a pallet of raw materials 500 miles, in 1970.

Construct 90% confidence interval estimates of the following.

(a) The variance in cost per pallet.

(b) The standard deviation in cost per pallet.