Another super-power of R is its capabilities to create great figures. We will use ggplot2 but R has more plotting systems of which base R (see this tutorial, or this and lattice are the most well described and often used. We choose ggplot2 because it works well with the tidyverse and is well described.
There are several good resources aimed at ggplot2:
After this session, you should be able to answer:
ggplot2
systemWhen using the ggplot2 system we can think of the resulting graph as containing data that has been mapped to different shapes, coordinates, colors and sizes. We are using geometric representations to display the data.
We map data using the aes()
function inside the ggplot()
function. Different geometric representations is added to the plot these are called geoms, geom_point()
is commonly used.
Differently from piping, we add objects together with the +
sign.
The following is a short introduction.
library(tidyverse) # loads ggplot2 and other packages
library(readxl) # needed for loading the cycling data set
We will use the cycling data set in this tutorial. Let’s first plot squat jump performance. We might be interested in the development over time. The ggplot function can be used in the end of a pipe.
# Assuming that the data is stored in the relative path /data/cyclingStudy.xlsx
cyclingStudy <- read_excel("./data/cyclingStudy.xlsx", na = "NA") # Remember to specify NA:s
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
ggplot(aes(x = timepoint, y = sj.max))
The above code gives a plot without geometric representations. The x- and y-axis has values corresponding to the range of values in the data. We can add a geometric representation by adding a geom. Let’s try geom_point()
.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
ggplot(aes(x = timepoint, y = sj.max)) + geom_point()
The addition of this geom maps the data to points on the graph. Each point represents an observation. There are several problems with this graph, these problems can be thought of as problems due to the fact that we have not formatted the data well and that we have not formatted the graph well.
Most changes can be made in ggplot, but some are more easily understood if we change the data first.
The time-point variable is interpreted as a factor variable (read more about factors in R for data science, Chapter 15). Factor variables are categorical variables that can be ordered and labeled. This is convenient when making a graph. By modifying the factor variable we can change the way ggplot2 creates the graph, and control what each level of the timepoint
variable should be called.
Let’s say that instead of meso1
, meso2
, meso3
and pre
we want the graph to read “Pre-training”, “Period 1”, “Period 2” and “Period 3”. We can change this by mutating the timepoint
variable before the graph.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Pre-training", "Period 1", "Period 2", "Period 3"))) %>%
ggplot(aes(x = timepoint, y = sj.max)) + geom_point()
In the above the mutate function is used, we modify timepoint
to be a factor using the factor()
function. The factor function first takes the variable we want to convert to a factor (or change), then we specify levels of the factor. Levels are the order of the factor, this will transfer to the graph where the first level will be plotted first. Second we use the labels =
argument which puts labels on each factor. The order of the labels are the same as specified in the levels =
argument.
Often the labels are a bit awkward to work with but nice to plot. This is because we might want to have spaces and special signs in there, something that makes it difficult to handle in e.g. filtering functions. So, if we need to change a factor before plotting, do it last in the pipe prior to the ggplot call.
We can look at the factor prior to plotting to see what we have done so far.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Pre-training", "Period 1", "Period 2", "Period 3"))) %>%
print()
The labels can have row-breaks. This is handy when you want to describe a lot in the label in the graph. Let’s say that the first level should read “Prior to training period”. This is a long label and it will probably over plot other aspects of the graph. We can use “\n” as a row break indicator.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
ggplot(aes(x = timepoint, y = sj.max)) + geom_point()
Notice that you also get a warnings message, Removed 4 rows containing missing values (geom_point)
. This is because four points has missing values. It is good practice to check that these values correspond to what youy expect. How many missing values are there in the data set?
Sometimes we want to plot summaries of the data instead of all data points. There some geoms that does statistical summaries of the data. geom_boxplot
and geom_violin
for example. These geoms do not need summary statistics, instead they convert the underlying data to a summary and plot it.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
ggplot(aes(timepoint, sj.max)) + geom_boxplot()
Look in the reference under Computed variables to see what statistics are calculated and plotted.
Another “borderline summary” plotting method is the geom_violin
. It adds a dimension to the plot where the width of the geom corresponds to the number of points.
cyclingStudy %>%
select(subject, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
ggplot(aes(timepoint, sj.max)) + geom_violin()
If we want to plot other summary statistics we usually have to compute them prior to plotting.
Let’s say that we want to plot the mean of each group (INCR, DECR, MIX) at each time-point. Then we would first calculate the summary followed by plotting.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max)) + geom_point()
I’ve added the group variable in the select call and made a group_by()
/summarise()
to calculate the mean. I called the new mean variable the same (sj.max
). Each point now represents the mean of each group. Let’s remove the plotting and examine the underlying data.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sj.max = mean(sj.max, na.rm = TRUE)) %>%
print()
The factor variable timepoint
and the group
variable exists in a new data frame, sj.max
represents the mean.
Since have not mapped the group
variable to data, it is impossible to see in the graph what group corresponds to what point. We need to add aesthetics to make this happen. Let’s add different colors to different groups.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group)) + geom_point()
The above code maps group to color. We get a legend in the graph telling us about the grouping of the data. Another nice thing to do is to connect the dots to be able to look at development over time. We can do this by adding geom_line()
.
However, we will need some extra information in the aes()
call. Let’s try
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group)) +
geom_point() +
geom_line()
The warning message should say that “each group consists of only one observation”, this is because we have not told ggplot the kind of grouping we want. Right now ggplot draws a line per time point per group resulting in no line. We can add group = group
in aes()
.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) + geom_point() + geom_line()
Let’s say that we want another statistic, the standard deviation represented in the graph. We can first calculate it in the pipe.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
print()
Notice that I add the sd()
function first as I call the mean the same thing as the variable we calculate the SD from. Placing them in the in the reversed order would mean that we calculate SD from on the mean (only one number would result in a NA).
The SD can now be mapped to the graph. Let’s try something crazy, by using size in aes()
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group, size = sd.sj.max)) + geom_point() + geom_line()
The standard deviation is now mapped to size which control the size of both the points and lines. This is not a common procedure. To remove an aesthetic from a geom we can specify aes inside that specific geom. Let’s say we want to keep size on the points but remove it on the lines. The we would set size = NULL
in the geom_line()
like this:
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group, size = sd.sj.max)) +
geom_point() +
geom_line(aes(size = NULL))
Now only the points has the SD mapped to them. As mentioned above, mapping the SD to size is not something you wold normally do as the scale do not correspond to the scale on the y axis. Instead we would like to use error bars.
Error bars can be added to the plot by using another geom, geom_errorbar()
. This geom need some extra information in aes()
, ymin
and ymax
which tells ggplot to map something to each error bar. Let’s map the standard deviation to \(\pm\) the mean.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max)) +
geom_point() +
geom_line()
The distance from the mean to the top and bottom of the error bars represents a standard deviation, nice! However the plot is “over plotted” a lot of data on top of each other.
We can do a lot to remedy the situation. Let’s start by moving the data just a little bit. We can control the position of geoms by adding information to the position =
argument inside each geom. In this situation we would use the position_dodge()
function with a with or let’s say 0.2. Notice that this argument has to be present in all geoms.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2)) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2))
Now each group gets “dodged” to the side. With the middle group (INCR) in the middle. This also changed the behavior of the error-bars. We can tweak the width of the error bars by adding a width =
argument.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2))
Labels describes axes and plots. See the help file ?labs
for the labs()
function. Using this function we can add labels on all aesthetics, this means that the x- and y-axis labels can be modified together with e.g. size and color.
Let’s modify the axis labels in the previous plot. We do this by adding labs()
to the plot.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)")
The aesthetic color can also be modified, the name in the above plot is “group” as the name of the variable we mapped onto color was group
. We can change the name of the legend by using labs()
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group")
We can also add a title, subtitle and caption to the graph.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" )
A ggplot has theme components, these are appearances of the graph that are not used to map data to. For example the background color of the plot or the size of labels. These components can be modified in the theme()
function. There are some pre-made themes, these are listed here. A nice one is theme_bw()
. We can add it to the plot.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" ) +
theme_bw()
If we still want to modify components of the theme we need to specify changes in the theme()
function. Parameters that are possible to change are listed here. All components of a theme are changed based on their theme_element
. A theme element can be text, as for example labels, these can be modified with element_text()
.
Let’s make the text bigger of the axis labels.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" ) +
theme_bw() +
theme(axis.title = element_text(size = 14))
The background of the plot is modified using another theme element as this is not text.
cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" ) +
theme_bw() +
theme(plot.background = element_rect(fill = "gray80"))
Basically all components can be modified with its specific theme component function. See the reference page for more information.
Saving a plot can be done from R Studio by clicking “export” in the plot tab. We can also use the ggsave()
function.
We first need to make the plot to an object and then specify arguments to decide what file type and size we want the saved image to be.
# Savining the figure as an object "figure1"
figure1 <- cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" ) +
theme_bw() +
theme(plot.background = element_rect(fill = "gray80"))
# Savining the figure to disk with ggsave
ggsave("./output/figure1.png", plot = figure1)
Saving a .png file as in the example above is not suitable for publication or your master thesis, there you should aim for vector graphics such as a pdf file. ggsave
can create a pdf file also.
ggsave("./output/figure1.pdf", plot = figure1)
Although this might need some tweeking of your software.
To create the figure in real life, I would do it in a new R-script and name the script figure1.R
. This script would be fully self contained, meaning that all code needed to create the graph is in the script. The script ends with the ggsave function. The script could look like this:
## Figure 1
# This scripts creates and saves Figure 1 in "./output/figure1.pdf"
# Load packages
library(tidyverse)
library(readxl)
# load data
cyclingStudy <- read_excel("./data/cyclingStudy.xlsx", na = "NA")
# Savining the figure as an object "figure1"
figure1 <- cyclingStudy %>%
select(subject, group, timepoint, group, sj.max) %>%
mutate(timepoint = factor(timepoint,
levels = c("pre", "meso1", "meso2", "meso3"),
labels = c("Prior to\ntrainining\nperiod", "Period 1", "Period 2", "Period 3"))) %>%
group_by(timepoint, group) %>%
summarise(sd.sj.max = sd(sj.max, na.rm = TRUE),
sj.max = mean(sj.max, na.rm = TRUE)) %>%
ggplot(aes(timepoint, sj.max, color = group, group = group)) +
geom_errorbar(aes(ymin = sj.max - sd.sj.max,
ymax = sj.max + sd.sj.max),
position = position_dodge(width = 0.2),
width = 0.1) +
geom_point(position = position_dodge(width = 0.2)) +
geom_line(position = position_dodge(width = 0.2)) +
labs(x = "Time-point", y = "Squat jump (cm)", color = "Group",
title = "Squat jump height per group",
subtitle = "Different patterns per group",
caption = "Values are mean and SD" ) +
theme_bw() +
theme(plot.background = element_rect(fill = "gray80"))
# Savining the figure to disk with ggsave
ggsave("./output/figure1.png", plot = figure1)
# End of script
An alternative is to add the figure to a R Markdown document. If you are creating a pdf from R Markdown you can add a figure caption in the code-chunk. Read more about this here.