The normal distribution is a fundamental concept in statistics describing how data clusters around a mean. This guide explains the probability density function, standard normal distribution, z-scores, and practical applications with examples. Learn how mean (μ) and standard deviation (σ) shape the distribution, calculate probabilities using conversion to standard normal form, and apply the empirical rule for data analysis.
Probability histograms visualize data distribution. Figure 1 shows three distributions: A (left-concentrated), B (right-concentrated), and C (spread out).
Figure 1: Different Data Distribution Types
Figure 2 displays a symmetric histogram with data concentrated at the center. The mean (μ) and median are approximately equal: \[ \mu \approx 3.5 \quad \text{and} \quad \text{median} \approx 3.5 \]
Figure 2: Symmetric Distribution
Consider a dataset with calculated mean μ = 3.5 and standard deviation σ = 1 (Figure 3). The mean and median are nearly identical, indicating symmetry.
Figure 3: Dataset Statistics
Figure 4: Frequency and Probability Distribution
Figure 5 compares the probability histogram with the normal density function for μ = 3.5 and σ = 1: \[ f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-3.5)^2} \]
Figure 5: Normal Distribution Fit
Probability Comparison: For \( P(3 \leq X \leq 5) \):
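The normal-model side of this comparison can be approximated by integrating the fitted density (μ = 3.5, σ = 1) over the interval from 3 to 5. The sketch below uses a simple trapezoid rule; the helper names `density` and `trapezoid` and the number of subintervals are illustrative assumptions, not part of the original guide.

```python
import math

def density(x):
    """Normal density with mean 3.5 and standard deviation 1 (the fitted curve)."""
    return math.exp(-0.5 * (x - 3.5) ** 2) / math.sqrt(2 * math.pi)

def trapezoid(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with n equal-width trapezoids."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Area under the density between 3 and 5, i.e. P(3 <= X <= 5) under the normal model.
p_normal = trapezoid(density, 3.0, 5.0)
print(f"P(3 <= X <= 5) under the normal model: {p_normal:.4f}")
```

Under these assumptions the area comes out at about 0.6247, which is the value to set against the probability read from the histogram bars.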
The general normal distribution with mean μ and standard deviation σ has probability density function: \[ \boxed{ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} } \] where \( X \) is the normally distributed random variable.
Figure 6: Same standard deviation (σ = 2), different means.
Figure 7: Same mean (μ = 4), different standard deviations.
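The mean μ sets the horizontal position of the curve's peak, while the standard deviation σ controls its spread: a smaller σ gives a narrower, taller curve, with the data concentrated closer to the mean.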
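A short numerical check can make the roles of μ and σ concrete. The sketch below evaluates the general density at its peak; the function name `normal_pdf` and the specific means (0 and 4) and standard deviations (1, 2, 4) are assumptions chosen for illustration, since the exact curves drawn in Figures 6 and 7 are not listed here.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Same sigma, different means: the peak height is unchanged, only its location moves.
for mu in (0, 4):
    print(f"mu={mu}, sigma=2: peak f({mu}) = {normal_pdf(mu, mu, 2):.4f}")

# Same mean, different sigma: a smaller sigma gives a taller, narrower peak.
for sigma in (1, 2, 4):
    print(f"mu=4, sigma={sigma}: peak f(4) = {normal_pdf(4, 4, sigma):.4f}")
```

The peak height equals 1/(σ√(2π)), so halving σ doubles the height of the curve while the total area stays equal to 1.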
The standard normal distribution is the special case with μ = 0 and σ = 1. Its density is: \[ f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2} \] Its cumulative distribution function is: \[ F_Z(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{1}{2}t^2} dt \] This gives \( P(Z \leq a) = F_Z(a) \), represented by the shaded area in Figure 11.
Figure 11: Cumulative Probability Area
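Because this integral has no closed form in terms of elementary functions, values of the cumulative distribution function are computed using statistical software, calculators, or tables.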
Google Sheets function: =NORM.S.DIST(a, TRUE).
Figure 12: Google Sheets Implementation
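Outside a spreadsheet, the same cumulative probability can be sketched in Python using the error function from the standard library, via the identity F(a) = ½(1 + erf(a/√2)). The function name `std_normal_cdf` and the sample inputs are illustrative assumptions.

```python
import math

def std_normal_cdf(a):
    """Standard normal CDF, P(Z <= a), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

# A few sample points, including the familiar 1.96 cutoff.
for a in (-1.0, 0.0, 1.0, 1.96):
    print(f"F({a:+.2f}) = {std_normal_cdf(a):.4f}")
```

By symmetry of the density, F(-a) = 1 - F(a), which is why many printed tables list only non-negative values of a.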
Any normal distribution can be converted to the standard normal distribution using the z-score: \[ z = \frac{x - \mu}{\sigma} \] To see why, write the probability \( P(X \leq a) \) as an integral of the density: \[ P(X \leq a) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{a} e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2} dt \] Substituting \( z = \frac{t-\mu}{\sigma} \), so that \( dz = \frac{dt}{\sigma} \) and the upper limit \( t = a \) becomes \( z = \frac{a-\mu}{\sigma} \), gives: \[ P(X \leq a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{a-\mu}{\sigma}} e^{-\frac{1}{2}z^2} dz = F_Z\left(\frac{a-\mu}{\sigma}\right) \] Thus, every normal probability can be obtained from the standard normal distribution table alone.
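The equivalence can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and compares the probability obtained through the z-score with the one computed directly from the original distribution. The helper `cdf_via_z` and the parameter values are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def cdf_via_z(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma), computed from the standard normal CDF only."""
    z = (a - mu) / sigma          # z-score of the upper limit
    return NormalDist().cdf(z)    # NormalDist() defaults to mu=0, sigma=1

# Hypothetical parameters chosen only for illustration.
mu, sigma, a = 10.0, 3.0, 13.0
via_z = cdf_via_z(a, mu, sigma)
direct = NormalDist(mu, sigma).cdf(a)
print(f"via z-score: {via_z:.6f}, direct: {direct:.6f}")
```

Both routes should agree up to floating-point rounding, since they evaluate the same integral after the change of variable.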
Given \( X \sim N(\mu = 2.2, \sigma = 2.5) \), find \( P(X \leq 1.2) \).
Solution:
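A worked sketch using the z-score conversion, with the final value taken from a standard normal table or software and rounded to four decimal places: \[ z = \frac{1.2 - 2.2}{2.5} = -0.4 \] \[ P(X \leq 1.2) = F_Z(-0.4) \approx 0.3446 \]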
Given \( X \sim N(\mu = -2.5, \sigma = 2) \), find:
Solution:
Part (a):
Part (b):