Probability Density Function for Continuous Random Variables

This guide explains the probability density function (PDF), building from frequency histograms to probability histograms and then to the continuous curve they approach. It includes solved examples with detailed calculus-based solutions.

From Frequency Histograms to Probability Distributions

The data set consists of 2560 measurements of project completion time (in hours), a continuous random variable. The frequency histogram (class width = 1) shows how these measurements are distributed.

Figure 1: Frequency histogram (sum of frequencies = 2560)

Probability Histograms

Class Width = 1

Convert frequencies to probabilities by dividing each class frequency by 2560. Because the class width is 1, each rectangle's area equals the probability for its interval, so the total area is 1.

Figure 2: Probability histogram (class width = 1)
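
The conversion from frequencies to probabilities can be sketched in a few lines of Python. The class frequencies below are hypothetical (the article's raw data is not listed); they were chosen only so that they sum to 2560.

```python
# Hypothetical class frequencies (assumption: not the article's real data).
frequencies = [160, 480, 800, 640, 320, 160]
class_width = 1.0

total = sum(frequencies)  # 2560 measurements in all

# Probability of each class = frequency / total number of measurements.
probabilities = [f / total for f in frequencies]

# Rectangle height = probability / class width, so area = probability.
heights = [p / class_width for p in probabilities]
areas = [h * class_width for h in heights]
total_area = sum(areas)

print(probabilities)
print(total_area)
```

Dividing by the class width has no visible effect when the width is 1, but it is the step that keeps the total area at 1 for narrower classes.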

Class Width = 0.5

Smaller class widths produce smoother distributions. The red trendline approximates the underlying probability density function.

Figure 3: Narrower classes (width = 0.5) with trendline

Class Width = 0.1

As the class width approaches zero, the histogram (with heights scaled so that each rectangle's area still equals its class probability) approaches a continuous curve: the probability density function.

Figure 4: Very narrow classes (width = 0.1) showing smooth trend
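
As a rough illustration of the narrowing process, the sketch below builds histograms for class widths 1, 0.5, and 0.1 and checks that the total area stays at 1 each time. The sample is simulated from a gamma distribution purely as a stand-in, since the article's 2560 measurements are not given; `density_histogram` is a hypothetical helper name.

```python
import math
import random

def density_histogram(data, width):
    """Map class index k (class [k*width, (k+1)*width)) to rectangle height.

    Height = relative frequency / width, so each rectangle's area
    equals the probability of its class.
    """
    counts = {}
    for value in data:
        k = math.floor(value / width)  # index of the class containing value
        counts[k] = counts.get(k, 0) + 1
    n = len(data)
    return {k: c / (n * width) for k, c in counts.items()}

random.seed(0)
# Assumption: simulated stand-in data, not the article's measurements.
data = [random.gammavariate(4.0, 1.5) for _ in range(2560)]

for width in (1.0, 0.5, 0.1):
    heights = density_histogram(data, width)
    area = sum(h * width for h in heights.values())
    print(width, len(heights), round(area, 6))
```

The number of classes grows as the width shrinks, while the total area remains 1, which is why the limiting curve can be read as a density.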

Probability Density Function (PDF) Definition

For a continuous random variable \(X\) with PDF \(f_X(x)\):

\[ P(a \le X \le b) = \text{Area under curve between } x=a \text{ and } x=b \]
Figure 5: Probability as area under the PDF curve

PDF Properties

  1. Non-negativity: \(f_X(x) \ge 0\) for all \(x\)
  2. Total area = 1: \(\displaystyle \int_{-\infty}^{\infty} f_X(x) \, dx = 1\)
  3. Probability via integration: \(\displaystyle P(a \le X \le b) = \int_a^b f_X(x) \, dx\)
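
The three properties can be checked numerically for a simple density. The sketch below uses a hypothetical example, \(f_X(x) = 2x\) on \([0, 1]\) and zero elsewhere, together with a basic midpoint rule; neither comes from the article.

```python
def pdf(x):
    """Hypothetical example density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(f, lo, hi, n=30_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * step) for i in range(n)) * step

# Property 1: non-negativity, checked on a grid over [-1, 2].
non_negative = all(pdf(-1.0 + 3.0 * i / 1000) >= 0.0 for i in range(1001))

# Property 2: total area (the density is zero outside [0, 1]).
total_area = integrate(pdf, -1.0, 2.0)

# Property 3: P(0.2 <= X <= 0.5) as an integral; exact value is 0.5^2 - 0.2^2.
prob = integrate(pdf, 0.2, 0.5)

print(non_negative, round(total_area, 6), round(prob, 6))
```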

Solved Examples

Example 1: Uniform Distribution

A random variable \(X\) has PDF:

\[ f_X(x) = \begin{cases} \dfrac{k}{b-a}, & a \le x \le b \\[1em] 0, & x < a \text{ or } x > b \end{cases} \]

where \(k > 0\) is constant.

  a) Graph \(f_X(x)\)
  b) Find \(k\) using PDF properties
  c) Verify total area = 1 using integration
  d) For \(a=3, b=8\), calculate \(P(4 \le X \le 7)\)

Solution

a) The graph is a rectangle between \(x=a\) and \(x=b\) with height \(\frac{k}{b-a}\):

Figure 6: Uniform PDF

b) The area under the PDF must equal 1. The area of the rectangle is:

\[ \text{Area} = (b-a) \times \frac{k}{b-a} = k \]

Since this area must equal 1, \(k = 1\). Thus:

\[ f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\[1em] 0, & \text{otherwise} \end{cases} \]

c) Using integration:

\[ \begin{aligned} \int_{-\infty}^{\infty} f_X(x) \, dx &= \int_a^b \frac{1}{b-a} \, dx \\ &= \left[ \frac{x}{b-a} \right]_a^b \\ &= \frac{b}{b-a} - \frac{a}{b-a} = \frac{b-a}{b-a} = 1 \end{aligned} \]

d) For \(a=3, b=8\):

\[ \begin{aligned} P(4 \le X \le 7) &= \int_4^7 \frac{1}{8-3} \, dx \\ &= \int_4^7 \frac{1}{5} \, dx \\ &= \left[ \frac{x}{5} \right]_4^7 \\ &= \frac{7}{5} - \frac{4}{5} = \frac{3}{5} = 0.6 \end{aligned} \]
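
The result of part d) can be cross-checked in Python. This is a minimal sketch: `uniform_pdf` and `uniform_prob` are hypothetical helper names, and the midpoint sum is just one simple way to approximate the integral.

```python
def uniform_pdf(x, a, b):
    """Uniform density on [a, b]: 1 / (b - a) inside, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi): length of the overlap with [a, b] divided by (b - a)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

# Closed form for a = 3, b = 8.
p = uniform_prob(4, 7, a=3, b=8)

# Midpoint-rule approximation of the same integral over [4, 7].
n = 3000
step = (7 - 4) / n
numeric = sum(uniform_pdf(4 + (i + 0.5) * step, 3, 8) for i in range(n)) * step

print(p, round(numeric, 6))
```

Both values should agree with the 0.6 obtained by integration above.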

Example 2: Exponential Distribution

A random variable \(X\) has PDF:

\[ f_X(x) = \begin{cases} e^{-kx}, & x \ge 0 \\ 0, & x < 0 \end{cases} \]

where \(k > 0\).

  a) Determine \(k\)
  b) Find \(a\) such that \(P(0 \le X \le a) = 0.9\)

Solution

a) Total area must be 1:

\[ \begin{aligned} \int_{-\infty}^{\infty} f_X(x) \, dx &= \int_0^{\infty} e^{-kx} \, dx \\ &= \lim_{b \to \infty} \left[ -\frac{1}{k} e^{-kx} \right]_0^b \\ &= \lim_{b \to \infty} \left( -\frac{1}{k} e^{-kb} + \frac{1}{k} \right) \\ &= \frac{1}{k} \end{aligned} \]

Because \(k > 0\), \(e^{-kb} \to 0\) as \(b \to \infty\), which leaves \(\frac{1}{k}\). Setting \(\frac{1}{k} = 1\) gives \(k = 1\).
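
A quick numerical check of the area \(\frac{1}{k}\) for a few values of \(k\) shows that only \(k = 1\) gives a total area of 1. The sketch below truncates the infinite upper limit at \(50/k\), an assumption made here because the integrand is below \(e^{-50}\) beyond that point; `area_under_exp` is a hypothetical helper name.

```python
import math

def area_under_exp(k, n=100_000):
    """Midpoint-rule approximation of the integral of e^(-k x) over [0, 50/k]."""
    upper = 50.0 / k          # truncation point; the remaining tail is negligible
    step = upper / n
    return sum(math.exp(-k * (i + 0.5) * step) for i in range(n)) * step

for k in (0.5, 1.0, 2.0):
    # Compare the numerical area with the closed form 1/k.
    print(k, round(area_under_exp(k), 6), 1.0 / k)
```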

b) Find \(a\) for \(P(0 \le X \le a) = 0.9\):

\[ \begin{aligned} P(0 \le X \le a) &= \int_0^a e^{-x} \, dx \\ &= \left[ -e^{-x} \right]_0^a \\ &= -e^{-a} + 1 \end{aligned} \]

Set equal to 0.9:

\[ \begin{aligned} -e^{-a} + 1 &= 0.9 \\ e^{-a} &= 0.1 \\ -a &= \ln(0.1) \\ a &= -\ln(0.1) \approx 2.3026 \end{aligned} \]
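
The value of \(a\) can be verified by evaluating \(1 - e^{-a}\) directly. In this sketch, `exp_cdf` and `quantile` are hypothetical helper names for the expressions derived above with \(k = 1\).

```python
import math

def exp_cdf(a):
    """P(0 <= X <= a) = 1 - e^(-a) for the density e^(-x), x >= 0."""
    return 1.0 - math.exp(-a)

def quantile(p):
    """Solve 1 - e^(-a) = p for a, giving a = -ln(1 - p)."""
    return -math.log(1.0 - p)

a = quantile(0.9)
print(round(a, 4))           # value of a, rounded to four decimals
print(round(exp_cdf(a), 6))  # probability recovered from a
```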
