Due to delays with my stock market payment, if this post is useful to you, I kindly request a minimal donation via Buy a coffee for me, which will be used to continue my open source efforts. If you need an R package or a Shiny dashboard for your team, you can e-mail me or contact me on Fiverr. The complete explanation is here: A personal message from an Open Source employee
You can send me questions for the blog using this form.
I received this question from a reader: How do you make a histogram with equal-sized dots or squares for each observation, and how do you color them by a different variable?
I will use the penguins dataset to answer this. It contains observations on the species and body mass of a sample of penguins:
library(palmerpenguins)
library(dplyr)

glimpse(penguins)
Rows: 344
Columns: 8
$ species           Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               male, female, female, NA, female, male, female, male…
$ year              2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
One way to use squares is to create a discrete body mass variable with intervals and then count the observations by species and interval:
library(tidyr)
library(ggplot2)
library(tintin)

# Create equal-width bins (a single number in breaks splits the range evenly)
n_bins <- 4 # number of bins

d <- penguins %>%
  drop_na(body_mass_g, species) %>%
  mutate(body_mass_d = cut(body_mass_g, breaks = n_bins, dig.lab = 6)) %>%
  group_by(species, body_mass_d) %>%
  count()

d
# A tibble: 9 × 3
# Groups:   species, body_mass_d [9]
  species   body_mass_d       n
1 Adelie    (2696.4,3600]    71
2 Adelie    (3600,4500]      73
3 Adelie    (4500,5400]       7
4 Chinstrap (2696.4,3600]    26
5 Chinstrap (3600,4500]      40
6 Chinstrap (4500,5400]       2
7 Gentoo    (3600,4500]      17
8 Gentoo    (4500,5400]      72
9 Gentoo    (5400,6303.6]    34
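The outer interval edges 2696.4 and 6303.6 are not values from the data. When breaks is a single number, cut() splits the range into equal-width intervals and then widens the two outer edges by 0.1% of the range, so that the minimum and maximum fall inside a bin. Here is a minimal base R sketch of that behaviour. The mass vector is made up for illustration, and only its minimum (2700) and maximum (6300) are chosen to reproduce the edges shown above:

```r
# Made-up body masses in grams (an assumption for illustration only)
mass <- c(2700, 3400, 3650, 4200, 4700, 5000, 5550, 6300)

# breaks = 4 asks for 4 intervals of equal width over the range of mass
bins <- cut(mass, breaks = 4, dig.lab = 6)

levels(bins)
table(bins)

# Rebuild the break points by hand to see where the outer edges come from
rng <- range(mass)
edges <- seq(rng[1], rng[2], length.out = 5) # 4 bins need 5 edges
pad <- diff(rng) / 1000                      # 0.1% of the range
edges[1] <- edges[1] - pad
edges[5] <- edges[5] + pad
edges
```

Because the bins have equal width and not equal counts, some species and interval combinations can end up with very few penguins, as the table above shows.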
Now I can make a Tetris-style column plot in which each full square represents 5 penguins:
square_size <- 5 # square = 5 observations

d_squares <- d %>%
  mutate(
    full_squares = n %/% square_size,        # number of full squares
    remainder = n %% square_size,            # remaining observations
    partial_height = remainder / square_size # height of partial square
  )
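
To see what these three columns contain, here is a small base R sketch that applies the same integer division and remainder to the counts from the table above. The vector n is typed in by hand for illustration instead of being taken from d:

```r
square_size <- 5

# Counts copied by hand from the table of species and intervals
n <- c(71, 73, 7, 26, 40, 2, 17, 72, 34)

full_squares <- n %/% square_size          # integer division
remainder <- n %% square_size              # what is left over
partial_height <- remainder / square_size  # fraction of one square

data.frame(n, full_squares, remainder, partial_height)

# The stacked height of each column is the count divided by the square size
column_height <- full_squares + partial_height
all.equal(column_height, n / square_size)
```

For example, the 71 Adelie penguins in the first interval give 14 full squares plus a partial square of height 0.2, which stands for the one remaining penguin.
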

# Create full squares
full_squares_df <- d_squares %>%
  filter(full_squares > 0) %>%
  uncount(full_squares) %>%
  group_by(species, body_mass_d) %>%
  mutate(
    square_id = row_number() - 1,
    y = square_id + 0.5,
    height = 1,
    square_type = "full"
  ) %>%
  ungroup()

# Create partial squares
partial_squares_df <- d_squares %>%
  filter(remainder > 0) %>%
  mutate(
    square_id = full_squares,
    y = full_squares + partial_height / 2,
    height = partial_height,
    square_type = "partial"
  )

# Combine both
d_squares <- bind_rows(full_squares_df, partial_squares_df)
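
geom_tile() draws each tile centred on y, which is why the full squares sit at 0.5, 1.5, 2.5 and so on, and the partial square sits half of its own height above the last full one. The sketch below rebuilds that layout for a single count in base R. The helper tile_layout() is hypothetical and written only for this illustration; it is not part of any package:

```r
# Hypothetical helper: tile centres and heights for one column of n observations
tile_layout <- function(n, square_size = 5) {
  full <- n %/% square_size
  rem <- n %% square_size

  # Full tiles have height 1 and centres at 0.5, 1.5, ...
  # seq_len(0) is empty, so a count below square_size gives no full tiles
  full_tiles <- data.frame(y = seq_len(full) - 0.5, height = rep(1, full))

  # At most one partial tile, stacked on top of the full ones
  partial_tile <- if (rem > 0) {
    data.frame(y = full + rem / square_size / 2, height = rem / square_size)
  } else {
    data.frame(y = numeric(0), height = numeric(0))
  }

  tiles <- rbind(full_tiles, partial_tile)
  tiles$bottom <- tiles$y - tiles$height / 2 # lower edge of each tile
  tiles$top <- tiles$y + tiles$height / 2    # upper edge of each tile
  tiles
}

# 17 Gentoo penguins in the (3600,4500] interval
tile_layout(17)
```

Each tile starts where the previous one ends, and the top of the last tile is n / square_size, so the column height can still be read as a count on the y axis.
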

# Create Tetris-style column plot with grouped squares
ggplot(d_squares, aes(x = body_mass_d, y = y, fill = species)) +
  geom_tile(aes(height = height), width = 0.9, color = "white", linewidth = 0.5) +
  scale_x_discrete(name = "Body mass intervals") +
  scale_y_continuous(
    name = paste0("Count (each full square = ", square_size, " penguins)"),
    expand = expansion(add = 0)
  ) +
  scale_fill_tintin_d(option = "the black island", direction = -1) +
  labs(title = "Column plot with grouped squares") +
  facet_wrap(~species) +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

I hope this is useful 🙂


