Central Limit Theorem with Examples and Solutions


Central Limit Theorem [1]

If, from a population with any distribution that has a mean \( \mu \) and a finite standard deviation \( \sigma \), we take random samples of size \( n \ge 30 \) with replacement, then the distribution of the sample means is close to a normal distribution with mean \( \mu_{\bar X} \) and standard deviation \( \sigma_{\bar X} \) given by: \[ \mu_{\bar X} = \mu \] \[ \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} \]
It is important to note that the central limit theorem states that the distribution of the sample mean \( \bar X \) tends to a normal distribution as \( n \) increases, regardless of the distribution of the population from which the random samples are drawn. Therefore the central limit theorem allows us to apply all normal distribution computational techniques to the distribution of the sample mean as long as the sample size \( n \) is large \( ( n \ge 30 ) \) and \( \mu \) and \( \sigma \) are known.
Note
1) If the population has a normal distribution, the central limit theorem holds even for smaller sample size \( n \).
2) The central limit theorem also holds for populations with binomial distributions as long as \( np \ge 5 \) and \( n(1-p) \ge 5 \).
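
The statement above can be checked empirically. The following is a minimal simulation sketch, not part of the original text: it assumes an exponential population with rate 1 (so \( \mu = 1 \) and \( \sigma = 1 \)), which is strongly skewed, and the names `n`, `num_samples` and `sample_means` are chosen here for illustration only. It draws many samples of size 30 and compares the mean and standard deviation of the sample means with \( \mu \) and \( \sigma / \sqrt n \).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so that the run is reproducible

n = 30              # sample size (the rule-of-thumb threshold)
num_samples = 5000  # number of samples drawn from the population

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Mean of each sample of size n.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"mean of sample means: {mean_of_means:.4f} (theory: {mu})")
print(f"sd of sample means:   {sd_of_means:.4f} (theory: {sigma / math.sqrt(n):.4f})")
```

The two printed values should be close to \( 1 \) and \( 1/\sqrt{30} \approx 0.1826 \), although the exact figures depend on the seed and the number of samples.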



Sampling Distributions

We present an example with small samples (\( n = 2 \)) in order to explain the distribution of the sample means cited in the central limit theorem above.

Let us consider a population of integers uniformly distributed over the integers 1, 2, 3, 4, 5, 6 whose probability distribution is shown below.
[Figure: probability distribution of the population]
The mean \( \mu \) of this population is given by:
\( \mu = \dfrac{1+2+3+4+5+6}{6} = 3.5\)
The standard deviation \( \sigma \) of this population is given by:
\( \sigma = \sqrt {\dfrac{(1-3.5)^2+(2-3.5)^2+(3-3.5)^2+(4-3.5)^2+(5-3.5)^2+(6-3.5)^2}{6}} = 1.7078 \)
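
The two values above can be reproduced with a few lines of Python. This is a small sketch using the standard `statistics` module; `pstdev` is the population standard deviation (it divides by \( N = 6 \), as in the formula above).

```python
import statistics

population = [1, 2, 3, 4, 5, 6]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation (divides by N)

print(mu)               # 3.5
print(round(sigma, 4))  # 1.7078
```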

We now make samples from this population by drawing 2 integers (with replacement) at a time. The list of all possible samples is

1,1 1,2 1,3 1,4 1,5 1,6
2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6
4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6
6,1 6,2 6,3 6,4 6,5 6,6
The sample mean of each sample is given by: \( \displaystyle \bar X = \sum\limits_{i=1}^{2} x_i /2\)
For example, for the sample \( 1 , 1 \), the mean is: \( \displaystyle \bar X = \sum\limits_{i=1}^{2} x_i /2 = (1+1)/2 = 1\)
Similarly, for the sample \( 1,2 \), the mean is: \( \displaystyle \bar X = \sum\limits_{i=1}^{2} x_i /2 = (1+2)/2 = 1.5\)
and so on.
The list of the means of all the samples above is given below.
1 1.5 2 2.5 3 3.5
1.5 2 2.5 3 3.5 4
2 2.5 3 3.5 4 4.5
2.5 3 3.5 4 4.5 5
3 3.5 4 4.5 5 5.5
3.5 4 4.5 5 5.5 6

The sample mean probability distribution is given in the table below.
\( \bar X \) 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
P \( (\bar X) \) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
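
The enumeration of the samples and the table above can be sketched in code as follows. This is an illustrative sketch only; the variable names (`samples`, `counts`, `distribution`) are chosen here and do not come from the original text. Exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]

# All ordered samples of size 2 drawn with replacement (36 in total).
samples = list(product(population, repeat=2))

# Count how many samples give each value of the sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Convert the counts to probabilities.
distribution = {mean: Fraction(counts[mean], len(samples)) for mean in sorted(counts)}

for mean in distribution:
    print(f"{float(mean)}: {counts[mean]}/{len(samples)}")
```

Each printed line gives one value of \( \bar X \) followed by its probability written over 36, which can be compared with the table.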

The histogram of the mean distribution is shown below.
[Figure: probability distribution of the sample means]
The expected value \( E(\bar X) \) of the sample mean probability distribution in the table above is given by
\( E(\bar X) = \displaystyle \mu_{\bar X} = \sum\limits_{i=1}^{11} \bar X_i P(\bar X_i) = 1 \times \dfrac{1}{36} + 1.5 \times \dfrac{2}{36} + 2 \times \dfrac{3}{36} + 2.5 \times \dfrac{4}{36} + 3 \times \dfrac{5}{36} + 3.5 \times \dfrac{6}{36} \)
\( \qquad \qquad + \; 4 \times \dfrac{5}{36} + 4.5 \times \dfrac{4}{36} + 5 \times \dfrac{3}{36} + 5.5 \times \dfrac{2}{36} + 6 \times \dfrac{1}{36} = 3.5\)
The standard deviation \( \sigma_{\bar X} \) of the sample mean probability distribution in the table above is given by
\( \displaystyle \sigma_{\bar X} = \sqrt {\sum\limits_{i=1}^{11} (\bar X_i - E(\bar X))^2 P(\bar X_i )} = \sqrt { (1-3.5)^2 \times \dfrac{1}{36} + (1.5-3.5)^2 \times \dfrac{2}{36} +....+ (6-3.5)^2 \times \dfrac{1}{36} } = \sqrt{\dfrac{35}{24}} \approx 1.2076 \)
Note that
\( \displaystyle \mu_{\bar X} = \mu = 3.5\)
and
\( \displaystyle \sigma_{\bar X} \approx 1.2076 \) and \( \dfrac{\sigma}{\sqrt 2} = \dfrac{1.7078}{\sqrt 2} \approx 1.2076 \), which agree, as expected from \( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} \) with \( n = 2 \).
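
The relation between \( \sigma_{\bar X} \) and \( \sigma / \sqrt n \) in this example can be verified exactly. The sketch below is an illustration with names chosen here (`means`, `var_xbar`, `var`); it computes the variance of the 36 sample means with exact fractions and compares it with the population variance divided by \( n = 2 \).

```python
import math
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]
n = 2

# Sample mean of each of the 36 equally likely samples.
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

# Mean and variance of the sample mean.
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

# Mean and variance of the population.
mu = Fraction(sum(population), len(population))
var = sum((x - mu) ** 2 for x in population) / len(population)

print(mu_xbar, var_xbar)                        # 7/2 35/24
print(round(math.sqrt(var_xbar), 4))            # 1.2076
print(round(math.sqrt(var) / math.sqrt(n), 4))  # 1.2076
```

Because the fractions are exact, the check `var_xbar == var / n` holds without any rounding error.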



Examples Using the Central Limit Theorem with Detailed Solutions

Example 1
Let \( X \) be a random variable with mean \( \mu = 20 \) and standard deviation \( \sigma = 4\). A sample of size 64 is randomly selected from this population. What is the approximate probability that the sample mean \( \bar X \) of the selected sample is less than \( 19 \)?
Solution to Example 1
No information about the population distribution is given. However, the mean and the standard deviation of the population are given, the sample size \( n = 64 \) is greater than \( 30 \), and the question concerns the sample mean. We may therefore use the central limit theorem to answer the above question.
According to the central limit theorem, the distribution of the sample mean \( \bar X \) is close to a normal distribution with the mean \( \mu_{\bar X} \) and standard deviation \( \sigma_{\bar X} \) given by
\( \mu_{\bar X} = \mu = 20 \)
\( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} = \dfrac{4}{\sqrt {64}} = 0.5 \)
We are looking for the probability \( P ( \bar X \lt 19 ) \)
The Z-score \( Z \) corresponding to \( \bar X = 19 \) is given by
\( Z = \dfrac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} = \dfrac{19 - 20}{\dfrac{4}{\sqrt {64}}} = - 2 \)
Use a table or a normal probability calculator to obtain the probability that the mean of the sample is less than \( 19 \).
\( P ( \bar X \lt 19 ) = P ( Z \lt -2 ) \approx 0.0228\)
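
Instead of a table, the probability can be obtained with Python's standard `statistics.NormalDist` class. The following is a minimal sketch of the computation above; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 20, 4, 64

sigma_xbar = sigma / math.sqrt(n)           # standard deviation of the sample mean
sampling_dist = NormalDist(mu, sigma_xbar)  # normal approximation given by the CLT

z = (19 - mu) / sigma_xbar  # Z-score of the sample mean 19
p = sampling_dist.cdf(19)   # P(sample mean < 19)

print(z)            # -2.0
print(round(p, 4))  # 0.0228
```

Evaluating `NormalDist().cdf(z)` on the standard normal distribution gives the same probability, which mirrors the table lookup.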



Example 2
In the first semester of the year 2003, the average return for a group of 251 investing companies was \( 4.5\% \) and the standard deviation was \( 1.5\% \). If a sample of 40 companies is randomly selected from this group, what is the approximate probability that the average return of the companies in this sample was between \( 4\% \) and \( 5\% \) in the first semester of the year 2003?
Solution to Example 2
The population is made up of 251 companies with an average (mean) return equal to \( 4.5\% \) and a standard deviation equal to \( 1.5\% \).
The sample is large enough: \( n = 40 \; (\ge 30) \). Since we are looking for a probability concerning the average (mean) return, we may use the central limit theorem.
Let \( \bar X \) be the random variable representing the mean. According to the central limit theorem, the distribution of \( \bar X \) is close to a normal distribution with the mean and standard deviation given by
\( \mu_{\bar X} = \mu = 4.5\% \)
\( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} = \dfrac{1.5\%}{\sqrt {40}} \)
We are looking for the probability \( P ( 4\% \lt \bar X \lt 5\% ) \)
The Z-scores \( Z_1 \) and \( Z_2 \) corresponding to \( \bar X_1 = 4\% \) and \( \bar X_2 = 5\% \), respectively, are given by

\( Z_1 = \dfrac{\bar X_1 - \mu_{\bar X}}{\sigma_{\bar X}} = \dfrac{4\% - 4.5\%}{\dfrac{1.5\%}{\sqrt {40}}} \approx -2.10818 \)

\( Z_2 = \dfrac{\bar X_2 - \mu_{\bar X}}{\sigma_{\bar X}} = \dfrac{5\% - 4.5\%}{\dfrac{1.5\%}{\sqrt {40}}} \approx 2.10818\)
Use a table or a normal probability calculator to obtain the probability that the average return of the companies in the sample was between \( 4\% \) and \( 5\% \).
\( P ( 4\% \lt \bar X \lt 5\% ) = P ( -2.10818 \lt Z \lt 2.10818 ) \approx 0.965\)
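
A probability over an interval is the difference of two cumulative probabilities. The sketch below illustrates this for the numbers of this example, working directly with the approximate distribution of the sample mean (returns are expressed in percent; the variable names are illustrative).

```python
import math
from statistics import NormalDist

mu, sigma, n = 4.5, 1.5, 40  # returns in percent

sigma_xbar = sigma / math.sqrt(n)
sampling_dist = NormalDist(mu, sigma_xbar)  # normal approximation given by the CLT

# P(4 < sample mean < 5) = F(5) - F(4), where F is the cumulative distribution.
p = sampling_dist.cdf(5.0) - sampling_dist.cdf(4.0)

print(round(p, 3))  # 0.965
```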



Example 3
A pension fund company carries out a study of a large group of mutual funds and finds that their average return over a period of 5 years was \( 80\% \) with a standard deviation equal to \( 30\% \). If a sample of \( 50 \) mutual funds is randomly selected from the group, what is the approximate probability that the sample had an average return greater than \( 90\% \) over the 5-year period?
Solution to Example 3
The question is related to the average (mean) return and the sample size \( n = 50 \) is large enough \( ( \ge 30 ) \). We may therefore use the central limit theorem.
Let \( \bar X \) be the random variable representing the mean of the sample. According to the central limit theorem, the distribution of \( \bar X \) is close to a normal distribution with the mean and standard deviation given by
\( \mu_{\bar X} = \mu = 80\% \)
\( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} = \dfrac{30\%}{\sqrt {50}} \)
We are looking for the probability \( P ( \bar X \gt 90\% ) \)
The Z-score \( Z \) corresponding to \( \bar X = 90\% \) is given by
\( Z = \dfrac{90\% - 80\%}{\dfrac{30\%}{\sqrt {50}}} \approx 2.35702\)
Use a table or a normal probability calculator to obtain the probability that the average return of the mutual funds in the sample was greater than \( 90\% \).
\( P ( \bar X \gt 90\% ) = P ( Z \gt 2.35702 ) \approx 0.0092 \)
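
An upper-tail probability is one minus the cumulative probability. The short sketch below follows the table-lookup route: it computes the Z-score first and then uses the standard normal distribution from `statistics.NormalDist` (variable names are illustrative).

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 30, 50  # returns in percent

# Z-score of a sample mean of 90 percent.
z = (90 - mu) / (sigma / math.sqrt(n))

# P(Z > z) = 1 - P(Z <= z) for the standard normal distribution.
p = 1 - NormalDist().cdf(z)

print(round(z, 5))  # 2.35702
print(round(p, 4))  # 0.0092
```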



Example 4
The daily number of tools produced by a company is 2000. The average length of the tools is \( 10 \) centimeters with a standard deviation equal to \( 0.3 \) centimeters. If a sample of \( 200 \) tools is selected at random, what is the approximate probability that the average length of the tools in the sample is within \( 0.05 \) centimeter of the average length?
Solution to Example 4
The question is related to the average (mean) length of the tools and the sample size \( n = 200 \) is large enough. We may therefore use the central limit theorem.
Let \( \bar X \) be the random variable representing the average (mean) of the sample. According to the central limit theorem, the distribution of \( \bar X \) is close to a normal distribution with the mean and standard deviation given by
\( \mu_{\bar X} = \mu = 10 \)
\( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} = \dfrac{0.3}{\sqrt {200}} \)
The probability that \( \bar X \) is within \( 0.05 \) centimeter of the average length is the probability: \( P ( 10 - 0.05 \le \bar X \le 10 + 0.05) \)
The Z-score \( Z_1 \) corresponding to \( \bar X = 10 - 0.05 = 9.95 \) is given by
\( Z_1 = \dfrac{9.95 - 10}{\dfrac{0.3}{\sqrt {200}}} \approx -2.35702\)
The Z-score \( Z_2 \) corresponding to \( \bar X = 10 + 0.05 = 10.05 \) is given by
\( Z_2 = \dfrac{10.05 - 10}{\dfrac{0.3}{\sqrt {200}}} \approx 2.35702\)
Use a table or a normal probability calculator to obtain the probability that the average length of the tools in the sample is within \( 0.05 \) centimeter of the average length.
\( P ( 9.95 \le \bar X \le 10.05 ) = P ( -2.35702 \le Z \le 2.35702 ) \approx 0.9816 \)
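
Because the interval is symmetric about the mean, the table lookup can be reduced to a single value. As a worked step, writing \( \Phi \) for the cumulative distribution function of the standard normal distribution (a notation introduced here, not used elsewhere on this page) and using \( \Phi(-z) = 1 - \Phi(z) \):
\[ P ( -z \le Z \le z ) = \Phi(z) - \Phi(-z) = 2 \, \Phi(z) - 1 \]
With \( z = 2.35702 \) and \( \Phi(2.35702) \approx 0.9908 \):
\[ P ( -2.35702 \le Z \le 2.35702 ) \approx 2 \times 0.9908 - 1 = 0.9816 \]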



Example 5
An airplane has a capacity of 200 seats and a total baggage limit of 6000 kilograms. Assume the total weight \( X \) checked by each passenger is a random variable with a mean of 28 kilograms and standard deviation 15 kilograms. If 200 passengers board a flight, what is the approximate probability that the total weight of their baggage will not exceed the limit?
Solution to Example 5
For the luggage of the 200 passengers not to exceed 6000 kilograms, the average of the weight \( X \) checked by each passenger must not exceed \( \dfrac{6000}{200} = 30 \) kilograms. Therefore the problem is reduced to finding the probability \( P (\bar X \le 30) \), where \( \bar X \) is the sample mean of the weight \( X \).
Since the sample size is \( n = 200 \), the distribution of \( \bar X \) is close to a normal distribution with
mean: \( \mu_{\bar X} = \mu = 28 \)
standard deviation: \( \sigma_{\bar X} = \dfrac{\sigma}{\sqrt n} = \dfrac{15}{\sqrt {200}} \)
We are looking for the probability that \( \bar X \) does not exceed \( 30 \), written as \( P ( \bar X \le 30) \)
The Z-score \( Z \) corresponding to \( \bar X = 30 \) is given by
\( Z = \dfrac{30 - 28}{\dfrac{15}{\sqrt {200}}} \approx 1.88561\)
Use a table or a normal probability calculator to obtain the probability that the total weight of their baggage will not exceed the limit.
\( P ( \bar X \le 30) = P ( Z \le 1.88561 ) \approx 0.9703 \)
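
The same probability can be reached by working with the total weight instead of the mean. The sketch below rests on an assumption not stated in the original solution: the 200 baggage weights are treated as independent, so that their sum is approximately normal with mean \( n \mu \) and standard deviation \( \sigma \sqrt n \). It computes the probability both ways (variable names are illustrative).

```python
import math
from statistics import NormalDist

mu, sigma, n = 28, 15, 200  # kilograms per passenger, number of passengers
limit = 6000                # total baggage limit in kilograms

# Route 1: total weight, approximately normal with mean n*mu and sd sigma*sqrt(n).
total_dist = NormalDist(n * mu, sigma * math.sqrt(n))
p_total = total_dist.cdf(limit)

# Route 2: sample mean, approximately normal with mean mu and sd sigma/sqrt(n).
mean_dist = NormalDist(mu, sigma / math.sqrt(n))
p_mean = mean_dist.cdf(limit / n)

print(round(p_total, 4), round(p_mean, 4))  # 0.9703 0.9703
```

Both routes give the same Z-score, since \( \dfrac{6000 - 200 \times 28}{15 \sqrt{200}} = \dfrac{30 - 28}{15 / \sqrt{200}} \).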



More References and links