The ggplot2 System
The `ggplot2` system is one of the most popular and powerful plotting systems in R. It is based
on the Grammar of Graphics, a coherent system for describing and building graphics that was
created by Leland Wilkinson. `ggplot2` allows for the creation of complex, multi-layered plots
with a high degree of customization and flexibility, making it a favorite among data analysts
and data scientists.
Key Features of `ggplot2`
1. Grammar of Graphics: The core idea behind `ggplot2` is the Grammar of Graphics, which
provides a systematic way of thinking about and creating graphics. Plots are built up from a
combination of different components such as data, aesthetics, geometries, statistics,
coordinates, and facets.
2. Layered Approach: `ggplot2` uses a layered approach to build plots. You start with the base
data and aesthetics, and then add layers for different plot elements like points, lines, bars, and
more.
3. Data-Driven: Plots are tightly integrated with the data frame, and most aspects of the plot
are automatically determined from the data, including scales and labels.
4. Highly Customizable: `ggplot2` provides a high degree of customization. You can control
almost every aspect of a plot, including the colors, shapes, sizes, themes, and more.
5. Faceting: `ggplot2` has powerful tools for creating multi-panel plots (facets) that allow you
to split your data by one or more variables and create a grid of plots.
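The layered approach described in the list above can be sketched with a plot object that grows one layer at a time. This is a minimal illustration using the built-in `mtcars` data set; the object names `p`, `p_points`, and `p_fit` are arbitrary choices for this example, not part of `ggplot2`.

```r
library(ggplot2)

# Start with data and aesthetic mappings only; no layers exist yet.
p <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# Add a layer of points.
p_points <- p + geom_point()

# Add a second layer: a straight line fitted by a statistic.
p_fit <- p_points + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Each `+` returns a new plot object; the earlier objects are unchanged.
length(p$layers)
length(p_points$layers)
length(p_fit$layers)

# Printing the object draws the plot.
print(p_fit)
```

Because each intermediate object is an ordinary R value, a base plot can be stored once and reused with different layers.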
Basic Syntax
The basic structure of a `ggplot2` plot is as follows:
library(ggplot2)
ggplot(data = <DATA>, aes(x = <X>, y = <Y>)) +
  <GEOM_FUNCTION>() +
  <OTHER_LAYERS>()
`ggplot(data, aes(...))`: Initializes the plot with the data and defines the aesthetic mappings
(`aes`). The aesthetics typically include variables mapped to the x and y axes.
`geom_*()`: Functions that add layers of geometric objects, such as points (`geom_point()`),
lines (`geom_line()`), bars (`geom_bar()`), etc.
`theme()`: Settings that customize the overall, non-data appearance of the plot, such as fonts,
backgrounds, and gridlines. Complete themes such as `theme_minimal()` set many of these at once.
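As a rough sketch of how the pieces of the template fit together, the following fills in the placeholders with the built-in `mtcars` data set and finishes with a `theme()` call. The variables and the specific theme settings are illustrative choices for this example only.

```r
library(ggplot2)

p <- ggplot(data = mtcars, aes(x = hp, y = mpg)) +  # <DATA>, <X>, <Y>
  geom_point() +                                     # <GEOM_FUNCTION>
  labs(title = "Fuel Efficiency vs. Horsepower") +   # one of the <OTHER_LAYERS>
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),  # bold, centered title
    panel.grid.minor = element_blank()                      # hide minor gridlines
  )

print(p)
```

Note that every line except the last ends with `+`, which tells R that the expression continues on the next line.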
Example: Basic Scatter Plot
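library(ggplot2)
# Scatter plot with a fitted regression line
ggplot(data = airquality, aes(x = Temp, y = Ozone)) +
  geom_point(color = "blue", size = 3) +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Ozone vs Temperature", x = "Temperature (F)", y = "Ozone (ppb)") +
  theme_minimal()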
In this example:
- `geom_point(color = "blue", size = 3)`: Adds blue points to the plot.
- `geom_smooth(method = "lm", color = "red", se = FALSE)`: Adds a red linear regression line
without a confidence interval.
- `theme_minimal()`: Applies a minimal theme to the plot, which changes the background and
gridlines.
Faceting Example
ggplot(data = airquality, aes(x = Temp, y = Ozone)) +
  geom_point() +
  facet_wrap(~ Month) +
  labs(title = "Ozone vs Temperature by Month",
       x = "Temperature (F)", y = "Ozone (ppb)")
This example creates a series of scatter plots for `Ozone` vs. `Temp`, with each plot
corresponding to a different month.
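`facet_wrap()` splits by a single variable and wraps the panels into rows. When splitting by two variables, `facet_grid()` lays the panels out as a grid instead. The following is a minimal sketch on the built-in `mtcars` data set; the choice of `cyl` for rows and `gear` for columns is only an illustration.

```r
library(ggplot2)

# Rows are split by number of cylinders, columns by number of gears.
p_grid <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_grid(rows = vars(cyl), cols = vars(gear)) +
  labs(x = "Weight (1000 lbs)", y = "Miles per Gallon")

print(p_grid)
```

Combinations of `cyl` and `gear` that do not occur in the data appear as empty panels, which makes gaps in the data easy to spot.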
Example (Scatter plot)
library(ggplot2)
# The `+` must end each line; a line that starts with `+` is parsed as a new expression
ggplot(data = mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  facet_wrap(~ gear) +
  labs(title = "Fuel Efficiency vs. Weight by Cylinders",
       x = "Weight (1000 lbs)", y = "Miles per Gallon", color = "Cylinders") +
  theme_minimal()
Example (Line plot)
ggplot(data = economics, aes(x = date, y = unemploy)) +
  # linewidth replaces size for line geoms as of ggplot2 3.4.0
  geom_line(color = "blue", linewidth = 1) +
  labs(title = "Number of Unemployed in the U.S. Over Time", x = "Year",
       y = "Number of Unemployed (in thousands)") +
  theme_minimal()
Example (Bar plot)
ggplot(data = diamonds, aes(x = cut)) +
  geom_bar(fill = "steelblue") +
  labs(title = "Distribution of Diamond Cuts", x = "Cut Quality", y = "Count of Diamonds") +
  theme_minimal()
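The bar plot above needs no `y` aesthetic because `geom_bar()` counts the rows in each category itself. The sketch below illustrates this by computing the same counts by hand and drawing them with `geom_col()`, which plots values as given. The object names `p_bar`, `p_col`, and `cut_counts` are hypothetical names chosen for this example.

```r
library(ggplot2)

# geom_bar() counts the rows in each category for you.
p_bar <- ggplot(data = diamonds, aes(x = cut)) +
  geom_bar(fill = "steelblue")

# The same counts computed by hand, then drawn as-is with geom_col().
cut_counts <- as.data.frame(table(cut = diamonds$cut))
p_col <- ggplot(data = cut_counts, aes(x = cut, y = Freq)) +
  geom_col(fill = "steelblue")

# layer_data() exposes the values ggplot2 computed for the first layer.
layer_data(p_bar)$count
cut_counts$Freq
```

The two vectors printed at the end should agree, which is one way to check what a statistic computed before it was drawn.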
Example (Histogram)
ggplot(data = mtcars, aes(x = wt)) +
  geom_histogram(binwidth = 0.5, fill = "darkorange", color = "black") +
  labs(title = "Distribution of Car Weights", x = "Weight (1000 lbs)", y = "Frequency") +
  theme_minimal()
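Installation
The package documentation is at https://ggplot2.tidyverse.org/. `ggplot2` can be installed from
CRAN, either as part of the tidyverse or on its own:
> install.packages("tidyverse")
> install.packages("ggplot2")
or, to get the development version from GitHub:
> install.packages("pak")
> pak::pak("tidyverse/ggplot2")
and then load it into R via the library function.
> library(ggplot2)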
Superficially, the ggplot2 functions are similar to lattice, but the system is generally easier and
more intuitive to use. The defaults used in ggplot2 make many choices for you, but you can
still customize plots to your heart’s desire.
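A typical plot with the ggplot2 package looks as follows. The `qplot()` shortcut still works but
has been deprecated since ggplot2 3.4.0, so the equivalent `ggplot()` call on the last line is
preferable in new code.
> data(mpg)
> qplot(displ, hwy, data = mpg)
> ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()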