ggplot2 is a data visualization package for the statistical programming language R. It implements Leland Wilkinson’s Grammar of Graphics, a scheme for data visualization that breaks graphs into semantic components such as scales and layers. ggplot2 is an alternative to the base graphics in R and ships with a set of plotting defaults.
Since Hadley Wickham created ggplot2 in 2005, it has grown to become one of the most popular R packages. ggplot2 builds graphics declaratively: you provide the data, tell ggplot2 how to map variables to aesthetics and which graphical primitives to use, and it takes care of the details. For example, ggplot2 can map values to the x-axis and y-axis, map different variables to different colors, and add best-fit lines with a single extra layer such as geom_smooth().
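As a minimal sketch of this declarative style, the snippet below maps two variables to the axes, maps a third to color, and adds one linear fit per color group. The mtcars dataset (bundled with R) and the choice of variables are assumptions made purely for illustration. Note that the fitted lines come from an explicit geom_smooth() layer.

```r
library(ggplot2)

# mtcars ships with R; it is used here only as a convenient example dataset.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +                                           # graphical primitive: points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)  # one linear fit per colour group

print(p)
```

The data, the aesthetic mappings, and the layers are each stated once; ggplot2 works out the scales, the legend, and the per-group fits from that description.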
In contrast to base R graphics, ggplot2 allows you to add, remove or alter components in a plot at a high level of abstraction. ggplot2’s popularity is likely due in large part to its defaults being aesthetically pleasing, at least relative to most other plotting software. To see an example of this, let’s say you have a dataset, with the columns Programmer and Downloads, of the R programmers whose programs were downloaded most during a given month.
The code to make a decent-looking column chart from it is relatively straightforward:
ggplot(dataset, aes(x = Programmer, y = Downloads)) + geom_col()
Example of ggplot2’s aesthetically pleasing default settings
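To try the column chart without the original data, here is a self-contained sketch. The programmer names and download counts are invented placeholders, not figures from the original dataset; only the column names Programmer and Downloads are taken from the code above. The last lines also show components being added to a stored plot at a high level.

```r
library(ggplot2)

# Hypothetical data: these names and counts are made up for illustration only.
dataset <- data.frame(
  Programmer = c("Programmer A", "Programmer B", "Programmer C", "Programmer D"),
  Downloads  = c(1200, 950, 700, 430)
)

# The same call as in the text: one bar per programmer, bar height = Downloads
p <- ggplot(dataset, aes(x = Programmer, y = Downloads)) +
  geom_col()

# Components can be layered onto the stored plot object afterwards
p2 <- p + labs(title = "Downloads by programmer") + theme_minimal()

print(p2)
```

Because the plot is an ordinary R object, the title and theme can be changed later without rebuilding the chart from scratch.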
Base plotting in R, on the other hand, is imperative. You set up your layout(), add points for a variable along with a title, then fit and plot a best-fit line for the first variable, then for the second variable, and so on. Then you move on to the next plot. After several repetitions, you finish with a legend.
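The imperative workflow can be sketched roughly as follows. This is an illustrative example rather than a canonical recipe: it assumes the built-in mtcars dataset, an arbitrary set of colors, and a single panel, so the layout() step is omitted.

```r
# Base R graphics only; mtcars ships with R and is used purely for illustration.
cyls <- sort(unique(mtcars$cyl))
cols <- c("red", "darkgreen", "blue")  # arbitrary colour choice, one per group

# Step 1: set up an empty plotting region with axis labels and a title
plot(mtcars$wt, mtcars$mpg, type = "n",
     xlab = "wt", ylab = "mpg", main = "mpg vs. wt by cylinder count")

# Step 2: add points, then fit and draw a line, one group at a time
for (i in seq_along(cyls)) {
  grp <- mtcars[mtcars$cyl == cyls[i], ]
  points(grp$wt, grp$mpg, col = cols[i], pch = 19)
  abline(lm(mpg ~ wt, data = grp), col = cols[i])
}

# Step 3: finish with a legend that has to be kept in sync by hand
legend("topright", legend = cyls, col = cols, pch = 19, title = "cyl")
```

Each step draws directly onto the current device, so the grouping, colors, and legend all have to be tracked manually instead of being derived from one mapping.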
The easiest way to get ggplot2 is to install the whole tidyverse:
install.packages("tidyverse")
Alternatively, you can install just ggplot2:
install.packages("ggplot2")