Normal Distribution: Definition, Properties & Applications

The normal distribution is a fundamental concept in statistics that describes how data clusters around a mean. This guide covers the probability density function, the standard normal distribution, z-scores, and worked examples. It shows how the mean (μ) and standard deviation (σ) shape the distribution, how to calculate probabilities by converting to standard normal form, and how to apply the empirical rule in data analysis.

Probability Histograms & Data Distributions

Probability histograms visualize data distribution. Figure 1 shows three distributions: A (left-concentrated), B (right-concentrated), and C (spread out).

Probability Histogram of Various Data Distributions

Figure 1: Different Data Distribution Types

Figure 2 displays a symmetric histogram with data concentrated at the center. The mean (μ) and median are approximately equal: \[ \mu \approx 3.5 \quad \text{and} \quad \text{median} \approx 3.5 \]

Symmetric Probability Histogram

Figure 2: Symmetric Distribution

Normally Distributed Data Example

Consider a dataset with calculated mean μ = 3.5 and standard deviation σ = 1 (Figure 3). The mean and median are nearly identical, indicating symmetry.

Mean, Median, Standard Deviation Calculation

Figure 3: Dataset Statistics

Frequency and Probability Table

Figure 4: Frequency and Probability Distribution

Figure 5 compares the probability histogram with the normal density function for μ = 3.5 and σ = 1: \[ f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-3.5)^2} \]

Normal Distribution Overlay on Histogram

Figure 5: Normal Distribution Fit

Probability Comparison: For \( P(3 \leq X \leq 5) \):

  1. Histogram method: \( 0.377 + 0.229 = 0.606 \)
  2. Normal distribution integral: \[ P(3 \leq X \leq 5) = \frac{1}{\sqrt{2\pi}} \int_3^5 e^{-\frac{1}{2}(x-3.5)^2} \, dx \approx 0.62466 \]

The two results differ by less than 0.02, which supports using the normal density function to calculate probabilities for this dataset.
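The integral above has no elementary closed form, so it has to be evaluated numerically. The following is a minimal sketch, assuming a simple midpoint rule is accurate enough here; the helper names `density` and `midpoint_integral` are illustrative and do not come from the original article.

```python
import math

def density(x):
    """Normal density with mean 3.5 and standard deviation 1 (the fitted curve)."""
    return math.exp(-0.5 * (x - 3.5) ** 2) / math.sqrt(2 * math.pi)

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n  # width of each sub-interval
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

prob = midpoint_integral(density, 3, 5)
print(f"{prob:.4f}")  # 0.6247
```

The result agrees with the integral quoted above to about four decimal places.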

Probability Density Function of Normal Distribution

The general normal distribution with mean μ and standard deviation σ has probability density function: \[ \boxed{ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} } \] where \( X \) is the normally distributed random variable.
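As a sketch of how this formula translates into code, the function below evaluates the density directly and compares it with `statistics.NormalDist.pdf` from the Python standard library. The function name `normal_pdf` and the parameter values are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical parameters, used only for illustration.
mu, sigma = 4.0, 2.0
for x in (0.0, 4.0, 7.5):
    # The hand-written formula should match the standard library's implementation.
    assert math.isclose(normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))

peak = normal_pdf(mu, mu, sigma)  # the density is largest at x = mu
print(f"{peak:.4f}")  # 0.1995
```

At x = μ the exponent vanishes, so the peak height is 1/(σ√(2π)), which shrinks as σ grows.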

Graphs of Normal Distributions

Figure 6: Normal distributions with the same standard deviation (σ = 2) and different means
Figure 7: Normal distributions with the same mean (μ = 4) and different standard deviations

The mean μ sets the horizontal position of the peak, while σ controls the spread: a smaller σ gives a narrower, taller curve, with the data concentrated closer to the mean.

Properties of Normal Distributions

  1. Distribution is symmetric and centered at the mean μ.
  2. Mean, median, and mode are equal.
  3. Total area under the curve equals 1: \[ \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1 \]
  4. Empirical Rule (68-95-99.7 Rule):
    1. ≈68% of data within μ ± σ: \[ \frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-\sigma}^{\mu+\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx 0.68269 \] (Figure: area within one standard deviation.)
    2. ≈95% of data within μ ± 2σ: \[ \frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-2\sigma}^{\mu+2\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx 0.95450 \] (Figure: area within two standard deviations.)
    3. ≈99.7% of data within μ ± 3σ: \[ \frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-3\sigma}^{\mu+3\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx 0.99730 \] (Figure: area within three standard deviations.)
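The three empirical-rule areas can be checked without numerical integration. The sketch below relies on the standard identity that, for any normal distribution, the probability of falling within k standard deviations of the mean equals erf(k/√2). The function name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X; independent of mu and sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{prob_within(k):.4f}")
# prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3
```

Because the result depends only on k, the same percentages apply to every normal distribution, whatever its mean and standard deviation.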

Standard Normal Distribution

The standard normal distribution is the special case with μ = 0 and σ = 1. Writing \( Z \) for a standard normal random variable, its probability density function is \[ f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2} \] and its cumulative distribution function is \[ F_Z(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{1}{2}t^2} dt \] This gives \( P(Z \leq a) = F_Z(a) \), represented by the shaded area in Figure 11.

Cumulative Probability Visualization

Figure 11: Cumulative Probability Area

Values are computed using statistical software, calculators, or tables. Google Sheets function: =NORM.S.DIST(a, TRUE).

Google Sheets Calculation Example

Figure 12: Google Sheets Implementation
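Outside a spreadsheet, the same cumulative probability can be computed with a few lines of Python. This is a minimal sketch that expresses the standard normal cumulative distribution function through the error function and compares it with `statistics.NormalDist().cdf`. The function name `standard_normal_cdf` and the sample values of `a` are illustrative.

```python
import math
from statistics import NormalDist

def standard_normal_cdf(a):
    """P(Z <= a) for the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2)))

for a in (-1.0, 0.0, 1.96):
    # Second and third columns should agree: formula vs. standard library.
    print(a, round(standard_normal_cdf(a), 4), round(NormalDist().cdf(a), 4))
```

By symmetry, the value at a = 0 is exactly 0.5.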

Converting to Standard Normal Distribution

Any normal distribution can be transformed to the standard normal distribution using the z-score: \[ z = \frac{x - \mu}{\sigma} \]

For the probability \( P(X \leq a) \): \[ P(X \leq a) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{a} e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2} dt \]

Substitute \( z = \frac{t-\mu}{\sigma} \), so that \( dz = \frac{dt}{\sigma} \) and the upper limit \( t = a \) becomes \( z = \frac{a-\mu}{\sigma} \): \[ P(X \leq a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{a-\mu}{\sigma}} e^{-\frac{1}{2}z^2} dz = F_Z\left(\frac{a-\mu}{\sigma}\right) \] where \( F_Z \) is the cumulative distribution function of the standard normal distribution. Thus, only the standard normal distribution table is needed.
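The conversion can be checked numerically. The sketch below computes the same probability twice, once from the general distribution and once from the standard normal distribution evaluated at the z-score. The helper `cdf_via_z` and the values of `a`, `mu`, and `sigma` are hypothetical, chosen only to illustrate the identity.

```python
from statistics import NormalDist

def cdf_via_z(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma), computed only from the standard normal CDF."""
    z = (a - mu) / sigma          # z-score of the upper limit
    return NormalDist().cdf(z)    # standard normal: mu = 0, sigma = 1

# Hypothetical values, used only to illustrate the identity.
a, mu, sigma = 6.0, 4.0, 2.0
direct = NormalDist(mu, sigma).cdf(a)   # general normal distribution
converted = cdf_via_z(a, mu, sigma)     # standard normal at z = 1
print(f"{direct:.4f} {converted:.4f}")  # 0.8413 0.8413
```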

Probability Calculation Examples

Example 1

Given \( X \sim N(\mu = 2.2, \sigma = 2.5) \), find \( P(X \leq 1.2) \).

Solution:

  1. Calculate z-score: \[ z = \frac{1.2 - 2.2}{2.5} = -0.4 \]
  2. Using standard normal table or calculator: \[ P(X \leq 1.2) = P(Z \leq -0.4) \approx 0.3446 \]
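For readers without a table at hand, Example 1 can be reproduced with the Python standard library. This is only a cross-check of the two steps above, not part of the original solution.

```python
from statistics import NormalDist

X = NormalDist(mu=2.2, sigma=2.5)
z = (1.2 - X.mean) / X.stdev      # step 1: z-score
p = NormalDist().cdf(z)           # step 2: P(Z <= z) from the standard normal CDF
print(round(z, 2), round(p, 4))   # -0.4 0.3446
```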

Example 2

Given \( X \sim N(\mu = -2.5, \sigma = 2) \), find:

  (a) \( P(0.5 \leq X \leq 3.1) \)
  (b) \( P(X \geq 0.8) \)

Solution:

Part (a):

  1. Convert to z-scores: \[ z_1 = \frac{0.5 - (-2.5)}{2} = 1.5, \quad z_2 = \frac{3.1 - (-2.5)}{2} = 2.8 \]
  2. Use standard normal table: \[ P(Z \leq 2.8) = 0.99744, \quad P(Z \leq 1.5) = 0.93319 \]
  3. Calculate: \[ P(0.5 \leq X \leq 3.1) = 0.99744 - 0.93319 = 0.06425 \]

Part (b):

  1. z-score: \[ z_3 = \frac{0.8 - (-2.5)}{2} = 1.65 \]
  2. \( P(X \leq 0.8) = P(Z \leq 1.65) = 0.95053 \)
  3. \( P(X \geq 0.8) = 1 - 0.95053 = 0.04947 \)
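Both parts of Example 2 follow two general patterns: an interval probability is a difference of two cumulative values, and an upper-tail probability is one minus a cumulative value. The sketch below checks the results using the standard library. The helper names `prob_between` and `prob_at_least` are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=-2.5, sigma=2)

def prob_between(dist, a, b):
    """P(a <= X <= b) as a difference of cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

def prob_at_least(dist, a):
    """P(X >= a) as the complement of the cumulative probability."""
    return 1.0 - dist.cdf(a)

part_a = prob_between(X, 0.5, 3.1)
part_b = prob_at_least(X, 0.8)
print(f"{part_a:.5f} {part_b:.5f}")  # 0.06425 0.04947
```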
