Chapter 17 Normal Quantile Plot | Basic R Guide for NSC Statistics (2024)

Graphics such as the stemplot, boxplot, and histogram help us determine whether a distribution is approximately symmetric or not. We are now going to add another graphic, the normal quantile plot, to check for normality.

17.1 Symmetric Distribution

Let us look at the data frame, birthwt, found in the package MASS. The data frame consists of 10 columns and 189 rows. However, we will only focus on the variable, bwt, the baby’s birthweight which is measured in grams.

Let us see how the histogram of the baby’s birth weight looks.
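The code behind the figure is not reproduced here. A minimal sketch, assuming the base function hist( ) is used; the title and axis label are illustrative choices:

```r
# MASS ships with R and contains the birthwt data frame
library(MASS)

# Histogram of birth weight (grams); hist() also returns the bin counts
h <- hist(birthwt$bwt,
          main = "Birth Weight",
          xlab = "Birth weight (grams)")
```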

The histogram for the baby’s birth weight looks approximately normal.

Using Basic R

Let us draw the normal quantile plot using the function qqnorm( ). If a distribution is approximately normal, points on the normal quantile plot will lie close to a straight line.
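A minimal sketch of the call, assuming the birthwt data frame from MASS; the name qq is only an illustrative variable used to hold the plotted coordinates:

```r
library(MASS)

# Draw the normal quantile plot of birth weight.
# qqnorm() invisibly returns the coordinates it plots:
# x = theoretical normal quantiles, y = the sample values.
qq <- qqnorm(birthwt$bwt)
```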

Sometimes, a line is superimposed onto the normal quantile plot. This helps visualize whether the points lie close to a straight line or not. Use the function qqline( ) to draw the line.
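A minimal sketch, assuming the same birthwt data. Note that qqline( ) adds to an existing plot, so qqnorm( ) must be called first; by default the line passes through the first and third quartiles.

```r
library(MASS)

# Normal quantile plot first, then the reference line on top of it
qq <- qqnorm(birthwt$bwt)
qqline(birthwt$bwt)
```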

To further help with visualization, you can give the points and/or the line a color other than black.
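A minimal sketch, assuming the base graphics argument col is used for both the points and the line; the colors blue and red are arbitrary choices:

```r
library(MASS)

# col sets the point color in qqnorm() and the line color in qqline()
qq <- qqnorm(birthwt$bwt, col = "blue")
qqline(birthwt$bwt, col = "red")
```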

From the histogram and normal quantile plot, we can conclude that the distribution of the babies’ birth weights is approximately normal.

Using Ggplot2

To draw the normal quantile plot, use the geom called geom_qq( ). Note that the aesthetic mapping in the function ggplot( ) must use the argument sample, because geom_qq( ) computes the theoretical quantiles itself and places the sample values on the vertical axis.
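A minimal sketch, assuming ggplot2 is installed and birthwt comes from MASS; the object name p is illustrative:

```r
library(MASS)
library(ggplot2)

# Map the variable to the sample aesthetic; geom_qq() supplies
# the theoretical normal quantiles for the horizontal axis
p <- ggplot(birthwt, aes(sample = bwt)) +
  geom_qq()
print(p)
```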

To superimpose a line on the normal quantile plot, add the geom called geom_qq_line( ).
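A minimal sketch, assuming ggplot2 version 3.0.0 or later (where geom_qq_line( ) is available); both layers inherit the sample mapping from ggplot( ):

```r
library(MASS)
library(ggplot2)

# Points plus a reference line; both layers share aes(sample = bwt)
p <- ggplot(birthwt, aes(sample = bwt)) +
  geom_qq() +
  geom_qq_line()
print(p)
```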

If you want to use colors other than black to help with visualization, add the argument color to geom_qq( ) to change the point color and to geom_qq_line( ) to change the line color.
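A minimal sketch; the colors blue and red are arbitrary choices. Because color is set outside aes( ), it applies as a fixed value to the whole layer.

```r
library(MASS)
library(ggplot2)

# color is a fixed setting here, not an aesthetic mapping
p <- ggplot(birthwt, aes(sample = bwt)) +
  geom_qq(color = "blue") +
  geom_qq_line(color = "red")
print(p)
```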

We can conclude that the birth weight distribution is approximately normal since the points on the normal quantile plot lie approximately on a straight line.

17.2 Skewed Distribution

Right-Skewed Distribution

Let us look at a skewed distribution such as that of rivers, a built-in numeric vector giving the lengths (in miles) of 141 major North American rivers.
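A minimal sketch of the histogram, assuming the base function hist( ); the title and axis label are illustrative:

```r
# rivers is a plain numeric vector, so it is passed to hist() directly
h <- hist(rivers,
          main = "Lengths of Major North American Rivers",
          xlab = "Length (miles)")

# A mean well above the median is consistent with right skew
mean(rivers) > median(rivers)
```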

The histogram shows a right-skewed distribution.

Using Basic R

Let us take a look at how the normal quantile plot looks for a right-skewed distribution.
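A minimal sketch, assuming qqnorm( ) and qqline( ) are applied to the rivers vector directly:

```r
# Normal quantile plot of river lengths with a reference line
qq <- qqnorm(rivers)
qqline(rivers)
```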

Using Ggplot2

Remember that rivers is a vector, not a data frame, so call the function ggplot( ) with no arguments and put the aesthetic mapping inside the function geom_qq( ).
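A minimal sketch, assuming ggplot2 is installed. The reference line is an optional addition here, and it needs its own mapping because nothing is inherited from the empty ggplot( ) call.

```r
library(ggplot2)

# No data frame, so each layer carries its own aes(sample = rivers)
p <- ggplot() +
  geom_qq(aes(sample = rivers)) +
  geom_qq_line(aes(sample = rivers))
print(p)
```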

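The normal quantile plot shows that the distribution of rivers is skewed: instead of following a straight line, the points curve upward at the high end, which is the typical pattern for a right-skewed distribution.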

Left-Skewed Distribution

Let us look at the dataset called quakes which gives locations of seismic events near Fiji. We will focus on the variable, lat, which is the numeric latitude of a seismic event.
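A minimal sketch of the histogram, assuming the base function hist( ); the title and axis label are illustrative:

```r
# quakes is a built-in data frame; lat is the latitude of each event
h <- hist(quakes$lat,
          main = "Latitude of Seismic Events near Fiji",
          xlab = "Latitude (degrees)")
```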

The histogram shows a distribution that is slightly left-skewed.
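The normal quantile plot for this variable is not shown in the text. A minimal sketch, assuming the same base functions used for the other examples; for a left-skewed distribution the points typically fall below the line at the low end.

```r
# Normal quantile plot of latitude with a reference line
qq <- qqnorm(quakes$lat)
qqline(quakes$lat)

# A mean below the median is consistent with slight left skew
mean(quakes$lat) < median(quakes$lat)
```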

17.3 Other Distributions

What if we have a non-symmetric, non-skewed distribution? We will compare the histogram and normal quantile plot of the following. Each of the datasets is built into R.

Let us look at the variable, eruptions, in the dataset, faithful.
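A minimal sketch of the histogram, assuming the base function hist( ); the title and axis label are illustrative:

```r
# faithful is a built-in data frame; eruptions holds eruption durations
h <- hist(faithful$eruptions,
          main = "Eruption Durations",
          xlab = "Eruption time (minutes)")
```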

We see a bimodal distribution. Let us see how the normal quantile plot will look.
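A minimal sketch in base R, assuming qqnorm( ) and qqline( ) as in the earlier examples. With two clusters of values, the points tend to form two flatter stretches joined by a steep jump rather than one straight line.

```r
# Normal quantile plot of eruption durations with a reference line
qq <- qqnorm(faithful$eruptions)
qqline(faithful$eruptions)
```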

Using Ggplot2
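A minimal sketch of the ggplot2 version, assuming ggplot2 is installed; the reference line is an optional addition:

```r
library(ggplot2)

# faithful is a data frame, so the mapping can go in ggplot()
p <- ggplot(faithful, aes(sample = eruptions)) +
  geom_qq() +
  geom_qq_line()
print(p)
```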

Another interesting one to look at is the variable, conc (for the plant study’s ambient carbon dioxide concentrations in mL/L), found in the dataset called CO2.
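A minimal sketch, assuming base graphics. The variable conc takes only seven distinct values, so the normal quantile plot shows horizontal runs of tied points rather than a smooth curve.

```r
# CO2 is a built-in data frame; conc has only a few distinct values
table(CO2$conc)

h <- hist(CO2$conc,
          main = "Ambient CO2 Concentration",
          xlab = "Concentration (mL/L)")

# Ties in the data appear as horizontal steps in the plot
qq <- qqnorm(CO2$conc)
qqline(CO2$conc)
```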

FAQs

How to do a normal quantile-quantile plot in R?

In R, there are two functions to create QQ plots: qqnorm() and qqplot(). qqnorm() creates a normal QQ plot. You give it a vector of data, and R plots the data in sorted order versus quantiles from a standard normal distribution. For example, qqnorm() can be applied to a column of the trees data set that comes with R.

How do you know if a normal quantile plot is normal?

Points on the normal QQ plot provide an indication of univariate normality of the dataset. If the data are normally distributed, the points will fall close to the reference line, which is a 45-degree line only when both axes use the same scale (for example, when the data are standardized). If the data are not normally distributed, the points will deviate systematically from the reference line.

What is the Z score of the Q-Q plot?

When the option Q-Q plot is selected, the horizontal axis shows the z-scores of the observed values, z=(x−mean)/SD. A straight reference line represents the Normal distribution. If the sample data are near a Normal distribution, the data points will be near this straight line.

How do you find the normal quantile?

Because the cumulative distribution function (CDF) of the normal distribution is strictly increasing, the quantile function is equal to the inverse of the CDF: $Q_X(p) = F_X^{-1}(p)$.
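In R this inverse relationship can be checked with qnorm( ) (the quantile function) and pnorm( ) (the CDF). A minimal sketch; the probabilities chosen are arbitrary:

```r
# Probabilities at which to evaluate the standard normal quantile function
p <- c(0.025, 0.5, 0.975)

# qnorm() is the inverse of pnorm()
z <- qnorm(p)
z

# Applying the CDF to the quantiles recovers the probabilities
pnorm(z)
```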

How to interpret quantile-quantile plot?

Interpreting QQ plots is intuitive. When all the dots generally follow the straight line y = x, the sample distribution is similar to the theoretical one. The data points don't have to fall right on the line. Instead, they only need to follow a line generally, with random variability placing them above and below it.

How to interpret Q-Q plot in R?

On the horizontal axis, the plot shows the value expected at each quantile if the distribution were normal (“theoretical quantiles”). The QQ plot should follow more or less along a straight line if the data come from a normal distribution (with some tolerance for sampling variation).

What does a good Q-Q plot look like?

Normal Q-Q Plot: This is used to assess whether your residuals are normally distributed. Basically, what you are looking for is the data points closely following the straight line that rises at a 45-degree angle from left to right.

What is a normal quantile probability plot?

A normal probability plot, or more specifically a quantile-quantile (Q-Q) plot, shows the distribution of the data against the expected normal distribution.

How do you draw a quantile-quantile plot?

To draw a quantile-quantile (Q-Q) plot, you can follow these steps:
  1. Collect the data: Gather the dataset for which you want to create the Q-Q plot. Ensure that the data are numerical and represent a random sample from the population of interest.
  2. Sort the data: Arrange the data in either ascending or descending order.
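The steps can be sketched by hand in R. This is a minimal illustration, assuming the plotting positions given by ppoints( ), which is the convention qqnorm( ) uses; other conventions exist. The trees data set and the variable names are illustrative choices.

```r
# Step 1: a numeric sample (tree heights from the built-in trees data)
x <- trees$Height
n <- length(x)

# Step 2: sort the data in ascending order
sample_q <- sort(x)

# Pair each sorted value with the matching standard normal quantile
theo_q <- qnorm(ppoints(n))

# Plot sample quantiles against theoretical quantiles
plot(theo_q, sample_q,
     xlab = "Theoretical Quantiles",
     ylab = "Sample Quantiles")

# qqnorm() computes the same theoretical quantiles
qq <- qqnorm(x, plot.it = FALSE)
all.equal(sort(qq$x), theo_q)
```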

What if a Q-Q plot is not normal?

When we create a Normal Q-Q Plot against data that is not normally distributed, we end up with a plot that is not a straight line from one corner to the next. This is because the quantiles do not increase at the same rate.

How to make a normal quantile plot?

Here are steps for creating a normal quantile plot in Excel:
  1. Place or load your data values into the first column. ...
  2. Label the second column as Rank. ...
  3. Label the third column as Rank Proportion. ...
  4. Label the fourth column as Rank-based z-scores. ...
  5. Copy the first column to the fifth column. ...
  6. Select the fourth and fifth column.

What is the difference between a quantile plot and a Q-Q plot?

A q-q plot is a plot of the quantiles of the first data set against the quantiles of the second data set. By a quantile, we mean the value below which a given fraction (or percent) of the points fall. That is, the 0.3 (or 30%) quantile is the point at which 30 percent of the data fall below and 70 percent fall above that value.

How to make a normal probability plot in R?

To create a Normal Probability Plot in R using ggplot2, you can use the ggplot() function to create the plot and then add the data using the stat_qq() function. In this example, we first generate a random sample of 100 observations from a normal distribution using the rnorm() function.
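A minimal sketch of such an example, assuming ggplot2 is installed. The seed, the data frame name df, and the added reference line are illustrative choices, not part of the description above.

```r
library(ggplot2)

# Fix the seed so the random sample is reproducible (arbitrary value)
set.seed(1)

# 100 observations drawn from a standard normal distribution
df <- data.frame(x = rnorm(100))

# stat_qq() draws the points; stat_qq_line() adds a reference line
p <- ggplot(df, aes(sample = x)) +
  stat_qq() +
  stat_qq_line()
print(p)
```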

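How to plot qqline in R?

The qqline() function adds a reference line to an existing normal quantile plot.
  1. Syntax: qqline(y, ...)
  2. Parameters: y is the data sample; graphical parameters such as col are passed through the ... argument.
  3. Returns: Nothing of interest; qqline() is called for its side effect of drawing a line which, by default, passes through the first and third quartiles.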
