September 22, 2024
Let’s start with the first two, the data and the aesthetic, with a column chart example…
This gives us the axes without any visualization:
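As a minimal sketch of this first step, the code below passes only the data and the aesthetic mapping to ggplot(). The data frame dem_toy and its columns region and polyarchy are hypothetical stand-ins with made-up values, not the data used in these slides.

```r
library(ggplot2)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# Data + aesthetic mapping only: no geom has been added,
# so ggplot2 draws the axes and nothing else
p_axes <- ggplot(dem_toy, aes(x = region, y = polyarchy))
p_axes
```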
Now let’s add a geom. In this case we want a column chart, so we add geom_col().
That gets the idea across but looks a little depressing, so…
…let’s change the color of the columns by specifying fill = "steelblue".
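A short sketch of the column chart with a fixed fill color might look like this. The dem_toy data frame and its column names are hypothetical, with made-up values.

```r
library(ggplot2)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# fill is set outside aes(), so it is a constant color for every column
# rather than a mapping from a variable
p_col <- ggplot(dem_toy, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue")
p_col
```

Because fill sits outside aes(), it applies the same color to all columns instead of varying by a variable in the data.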
Tip
See here for more available ggplot2 colors.
Note how the color of the original columns is simply overwritten:
Now let’s add some labels with the labs() function:
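One way this could look, using hypothetical toy data and placeholder label text (the titles and axis names below are illustrative, not the ones from the original slides):

```r
library(ggplot2)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# labs() sets axis labels, title, and caption in a single call
p_labeled <- ggplot(dem_toy, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue") +
  labs(
    x = "Region",
    y = "Democracy score",
    title = "Democracy by region (toy data)",
    caption = "Source: made-up values for illustration"
  )
p_labeled
```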
And that gives us…
Next, we reorder the bars with fct_reorder() from the forcats package.
Note that we could also use the base R reorder() function here.
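The sketch below shows both options on hypothetical toy data. By default, fct_reorder() orders levels by the median of the second variable and reorder() by the mean, both in ascending order; with one value per region the two give the same result.

```r
library(ggplot2)
library(forcats)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# Inspect the level order each function produces (ascending by score)
ordered_fct <- fct_reorder(dem_toy$region, dem_toy$polyarchy)
ordered_base <- reorder(dem_toy$region, dem_toy$polyarchy)
levels(ordered_fct)

# Use the reordered factor directly inside aes()
p_ordered <- ggplot(dem_toy, aes(x = fct_reorder(region, polyarchy), y = polyarchy)) +
  geom_col(fill = "steelblue")
p_ordered
```

To sort from highest to lowest instead, fct_reorder() accepts .desc = TRUE.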
This way, we get a nice, visually appealing ordering of the bars according to levels of democracy…
Now let’s change the theme to theme_minimal().
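A minimal sketch of adding a theme as the final layer, again on hypothetical toy data:

```r
library(ggplot2)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# theme_minimal() replaces the default gray theme with a lighter one
p_minimal <- ggplot(dem_toy, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue") +
  theme_minimal()
p_minimal
```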
Tip
See here for available ggplot2 themes.
This gives us a clean, elegant look.
Note that you can also save your plot as an object to modify later.
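As a sketch of this workflow, the plot can be assigned to a name and then extended with + in later steps. The object names, toy data, and label text here are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data: region names and scores are made up for illustration
dem_toy <- data.frame(
  region = c("Region A", "Region B", "Region C", "Region D"),
  polyarchy = c(0.82, 0.35, 0.61, 0.48)
)

# Save the base plot as an object
dem_plot <- ggplot(dem_toy, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue")

# Add labels to the saved plot
dem_plot_labeled <- dem_plot +
  labs(x = "Region", y = "Democracy score", title = "Democracy by region (toy data)")

# Add a theme on top of the labeled plot
dem_plot_final <- dem_plot_labeled + theme_minimal()
dem_plot_final
```

Each + returns a new plot object, so dem_plot itself stays unchanged unless it is reassigned.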
Which gives us…
Now let’s add back our labels…
So now we have…
And now we’ll add back our theme…
Voila!
Change the theme. There are many themes to choose from.
glimpse() the data
# load packages
library(readr)    # read_csv()
library(dplyr)    # filter()
library(ggplot2)  # ggplot(), geom_histogram()
# load data
dem_women <- read_csv("data/dem_women.csv")
# filter to 2022
dem_women_2022 <- dem_women |>
  filter(year == 2022)
# create histogram
ggplot(dem_women_2022, aes(x = flfp)) +
  geom_histogram(fill = "steelblue") +
  labs(
    x = "Percentage of Working Aged Women in Labor Force",
    y = "Number of Countries",
    title = "Female labor force participation rates, 2022",
    caption = "Source: World Bank"
  ) +
  theme_minimal()
Note that you only need to specify the x-axis variable in the aes() function. ggplot2 will automatically compute the counts shown on the y-axis for a histogram.
Change the number of bins (bars) using the bins or binwidth arguments (default number of bins = 30):
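A small sketch of setting bins explicitly. The flfp_toy data frame holds simulated values standing in for the flfp variable, not the World Bank data.

```r
library(ggplot2)

# Simulated stand-in for the flfp variable (not real data)
set.seed(42)
flfp_toy <- data.frame(flfp = runif(200, min = 20, max = 80))

# bins fixes the number of bars; ggplot2 picks the matching bar width
p_bins <- ggplot(flfp_toy, aes(x = flfp)) +
  geom_histogram(bins = 50, fill = "steelblue")
p_bins
```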
At 50 bins…
At 100 bins…probably too many!
Using binwidth instead of bins…
Setting binwidth to 2…
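The same idea with binwidth, sketched on simulated values standing in for flfp (not the real data). Here each bar spans 2 units of the x variable, and the number of bars follows from the range of the data.

```r
library(ggplot2)

# Simulated stand-in for the flfp variable (not real data)
set.seed(42)
flfp_toy <- data.frame(flfp = runif(200, min = 20, max = 80))

# binwidth fixes the width of each bar in units of x (here, 2 percentage points)
p_width <- ggplot(flfp_toy, aes(x = flfp)) +
  geom_histogram(binwidth = 2, fill = "steelblue")
p_width
```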
For densities, the total area of the bars sums to 1. The area of a bar (its height times the bin width) represents the proportion of observations in that bin; the height itself is the density rather than the number of observations.
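One way to draw a density histogram is to map y to after_stat(density), as sketched below on simulated values standing in for flfp (not the real data). The last lines check that bar heights times bar widths add up to 1.

```r
library(ggplot2)

# Simulated stand-in for the flfp variable (not real data)
set.seed(42)
flfp_toy <- data.frame(flfp = runif(200, min = 20, max = 80))

# after_stat(density) rescales bar heights so the total bar area equals 1
p_density <- ggplot(flfp_toy, aes(x = flfp, y = after_stat(density))) +
  geom_histogram(binwidth = 2, fill = "steelblue")
p_density

# Total area = sum of (height * width) across bars
d_density <- layer_data(p_density)
total_area <- sum(d_density$density * (d_density$xmax - d_density$xmin))
total_area
```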
Which gives us…
x = in aes()
geom_histogram