# Quartiles and box plots

Quartiles split a data set of real numbers x_{1}, x_{2}, x_{3}, ..., x_{N}, sorted in ascending order, into four groups. Each group includes approximately 25% (a quarter) of the data values in the data set.

Let Q1 be the lower quartile, Q2 be the median and Q3 be the upper quartile. The four groups of data values are defined by the intervals:

Group 1: From the minimum data value to Q1. Q1 is also called the 25th percentile because approximately 25% of the data values in the data set are below Q1.

Group 2: From Q1 to Q2. Q2 is also called the 50th percentile because approximately 50% of the data values in the data set are below Q2.

Group 3: From Q2 to Q3. Q3 is also called the 75th percentile because approximately 75% of the data values in the data set are below Q3.

Group 4: From Q3 to the maximum data value.
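
The quartiles that bound these four groups can be computed in a few lines. The sketch below is one possible approach, not necessarily the one used in this article: it takes Q2 as the median of the sorted data, then takes Q1 and Q3 as the medians of the lower and upper halves. The function name `quartiles`, the flag `include_median` and the sample data are assumptions made for illustration. The flag controls whether the overall median is counted in both halves, which only changes the result when the number of data values is odd.

```python
from statistics import median


def quartiles(values, include_median=False):
    """Return (Q1, Q2, Q3) using a median-of-halves approach.

    include_median is a hypothetical flag: when the number of values is odd,
    it decides whether the overall median is counted in both halves.
    """
    data = sorted(values)  # quartiles are defined on the ordered data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    q2 = median(data)
    if n % 2 == 0:
        # Even N: the data splits cleanly into two halves
        lower, upper = data[:half], data[half:]
    elif include_median:
        # Odd N, median included in both halves
        lower, upper = data[:half + 1], data[half:]
    else:
        # Odd N, median excluded from both halves
        lower, upper = data[:half], data[half + 1:]
    return median(lower), q2, median(upper)


scores = [2, 4, 5, 7, 8, 10, 11, 13, 16]  # hypothetical data, N = 9
print(quartiles(scores))                       # (4.5, 8, 12.0)
print(quartiles(scores, include_median=True))  # (5, 8, 11)
```

With an even number of data values, both settings of the flag return the same quartiles.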

## Methods in Calculating Quartiles

There are different methods to calculate the quartiles. Two methods, which differ only when the number of data values is odd, will be described and used.

## Examples on Computing Quartiles and Drawing Box Plots

Example 1

Example 2

## Examples on Reading Quartiles from Box Plots

Example 3

b) Class B has the highest score, 100.

c) Class B has the lowest score, 20.

d) The median splits the ordered scores into two halves, so half of each class scores above the median:

- class A: (1/2) total = (1/2) 12 = 6 students
- class B: (1/2) total = (1/2) 19 = 9.5, round to 10 students (the number of students must be an integer)
- class C: (1/2) total = (1/2) 22 = 11 students
- class D: (1/2) total = (1/2) 28 = 14 students

e) Quartiles split the data set (the scores in this example) into 4 groups, each with 1/4 of the values. Hence, for each class, one quarter of the scores are below the lower quartile:

- class A: (1/4) total = (1/4) 12 = 3 students
- class B: (1/4) total = (1/4) 19 = 4.75, round to 5 students (the number of students must be an integer)
- class C: (1/4) total = (1/4) 22 = 5.5, round to 6 students (the number of students must be an integer)
- class D: (1/4) total = (1/4) 28 = 7 students

f) Quartiles split the data set (the scores in this example) into 4 groups, each with 1/4 of the values. Hence, for each class, three quarters of the scores are between the lower quartile and the maximum (that is, above the lower quartile):

- class A: (3/4) total = (3/4) 12 = 9 students
- class B: (3/4) total = (3/4) 19 = 14.25, round to 14 students (the number of students must be an integer)
- class C: (3/4) total = (3/4) 22 = 16.5, round to 17 students (the number of students must be an integer)
- class D: (3/4) total = (3/4) 28 = 21 students

h) Class A has the smallest range and interquartile range: 44 and 26 respectively. Class B has the largest range and interquartile range: 80 and 34 respectively. Using the box plots together with the range and interquartile range, we may conclude that the scores in class A have the smallest dispersion and the scores in class B have the largest dispersion.

## More References and Links

- Quartile
- Mean, Median and Mode
- Standard deviation
- Mean and Standard deviation
- John W. Tukey (1977). Exploratory Data Analysis. Addison-Wesley.