Do you want to share your content on R-bloggers? Click here if you have a blog, or here If you don’t.
If this message is useful for you, I kindly request a minimum donation Buy a coffee for me. It will be used to continue my open source efforts. The complete explanation is here: A personal message from an Open Source employee.
You can send me questions for the blog using This form And subscribe to receive an e -mail when there is a new message.
I worked on my own research and after three days of downloading and cleaning data from UN Comtrade Plus I was Ableto a tidy database in Postgres.
I had to go through 65 GB of data to organize this in a slightly transferable.
Connect to my local postgresql database (otherwise reading and filtering this would be insane):
# remember to load postgres first
library(RPostgres)
con <- dbConnect(
Postgres(),
dbname = Sys.getenv("COMTRADE_NAME"),
host = Sys.getenv("COMTRADE_HOST"),
user = Sys.getenv("COMTRADE_USER"),
password = Sys.getenv("COMTRADE_PASSWORD"),
port = Sys.getenv("COMTRADE_PORT")
)
dbListTables(con)[1] "classification_codes" "commodities" "commodities_short" [4] "commodity_codes" "commodity_correlations" "countries" [7] "country_codes" "customs_codes" "exports" [10] "flow_codes" "gdp_deflator" "imports" [13] "mos_codes" "mot_codes" "unit_codes" [16] "yc" "yr" "yrc" [19] "yrp" "yrpc"
As you can see in the previous part, it is a good habit to use environmental variables to store sensitive information, such as database data, even for local databases without external access (for example, this database is stored on my laptop). I manage these references with usethis::edit_r_environ().
Now I can use the Tidyverse for efficient filtering and analysis of my data:
library(dplyr) tbl(con, "commodities") %>% glimpse()
Rows: ?? Columns: 8 Database: postgres [pacha@localhost:5432/comtrade] $ commodity_code"010121", "010129", "010130", "010190", "010221",… $ commodity_code_short "0101", "0101", "0101", "0101", "0102", "0102", "… $ commodity_name "Horses; live, pure-bred breeding animals", "Hors… $ chapter_code "01", "01", "01", "01", "01", "01", "01", "01", "… $ chapter_name "Animals; live", "Animals; live", "Animals; live"… $ section_code "01", "01", "01", "01", "01", "01", "01", "01", "… $ section_name "Live Animals; Animal Products", "Live Animals; A… $ section_color "#74c0e2", "#74c0e2", "#74c0e2", "#74c0e2", "#74c…
In the harmonized system (HS) the international customs codes are used to classify traded products, and this dataset uses the HS Revision 2022. The HS codes are constantly being expanded to display changes in technology (for example, the HS Revision 2002 has fewer codes for “Cellphones” than the HS). These codes are hierarchical, the first two digits represent the chapter, and in total this data has 99 chapters, with the 99th chapter my addition to “unknown/non -specific” products (eg code “999999”).
The section code consists of 21 codes plus one “unknown/non -specific” addition I made. These are also hierarchical, but cannot be generalized in the same way as the chapter codes by cutting the product code. Section one contains, for example, chapters one to five. I had to add this with my own R scripts after finding a web page with the section codes and their chapter varies from the World Customs Organization (WCO). I had to get it from the internetway machine, because the original page was not available.
Because of how the sections are presented, I had to do this:
section_names <- str_to_title(c(
"LIVE ANIMALS; ANIMAL PRODUCTS",
"VEGETABLE PRODUCTS",
"ANIMAL, VEGETABLE OR MICROBIAL FATS AND OILS AND THEIR CLEAVAGE PRODUCTS; PREPARED EDIBLE FATS; ANIMAL OR VEGETABLE WAXES",
...
))
d <- d %>%
mutate(
section_name = case_when(
section_code == "01" ~ section_names[1],
section_code == "02" ~ section_names[2],
section_code == "03" ~ section_names[3],
...And something similar for the colors, what my addition and corresponds to 22 colors that I found visually attractive.
With this in mind it is easier to represent 22 sections than 99 chapters visually. The sections are:
tbl(con, "commodities") %>% distinct(section_code, section_name) %>% arrange(section_code) %>% collect() %>% print(n = 22)
# A tibble: 22 × 2 section_code section_name1 01 Live Animals; Animal Products 2 02 Vegetable Products 3 03 Animal, Vegetable Or Microbial Fats And Oils And Their Cleavage… 4 04 Prepared Foodstuffs; Beverages, Spirits And Vinegar; Tobacco An… 5 05 Mineral Products 6 06 Products Of The Chemical Or Allied Industries 7 07 Plastics And Articles Thereof; Rubber And Articles Thereof 8 08 Raw Hides And Skins, Leather, Furskins And Articles Thereof; Sa… 9 09 Wood And Articles Of Wood; Wood Charcoal; Cork And Articles Of … 10 10 Pulp Of Wood Or Of Other Fibrous Cellulosic Material; Recovered… 11 11 Textiles And Textile Articles 12 12 Footwear, Headgear, Umbrellas, Sun Umbrellas, Walking-Sticks, S… 13 13 Articles Of Stone, Plaster, Cement, Asbestos, Mica Or Similar M… 14 14 Natural Or Cultured Pearls, Precious Or Semi-Precious Stones, P… 15 15 Base Metals And Articles Of Base Metal 16 16 Machinery And Mechanical Appliances; Electrical Equipment; Part… 17 17 Vehicles, Aircraft, Vessels And Associated Transport Equipment 18 18 Optical, Photographic, Cinematographic, Measuring, Checking, Pr… 19 19 Arms And Ammunition; Parts And Accessories Thereof 20 20 Miscellaneous Manufactured Articles 21 21 Works Of Art, Collectors' Pieces And Antiques 22 99 Unknown Or Unspecified
According to Statistics Canada’s research into household editions, 2023: “Hiding place accounted for 32.1%of the total consumption of goods and services in 2023, followed by transport (15.8%) and food (15.7%), which remained the three largest spending categories.” As a result, let’s only focus on sections one to four.
How many do American products represent in 2023 of Canadian food (and food -related) products? We can go to the YRPC table (annual reporter partner-community). The HS presents products such as raw materials, but for simplicity I call these products. This table is not part of the official UN Comtrade data, it is my own creation with extra steps with ISO-3 alpha modes for countries and the section and chapter codes that I mentioned earlier.
can_foods <- tbl(con, "yrpc") %>%
filter(
year == 2023L,
partner_iso == "CAN",
section_code %in% c("01", "02", "03", "04")
) %>%
mutate(
exporter = case_when(
reporter_iso == "USA" ~ "United States",
TRUE ~ "Rest of the World"
)
) %>%
group_by(exporter) %>%
summarise(
imports = sum(trade_value_usd_exp, na.rm = TRUE)
) %>%
ungroup() %>%
mutate(
share = imports / sum(imports) * 100
) %>%
collect()
can_foods# A tibble: 2 × 3 exporter imports share1 Rest of the World 44360282786. 57.4 2 United States 32885888490 42.6
In the previous chunk I used the figures based on exporter to exclude the costs of freight and insurance from the importer -based values. This gives a clearer picture of the actual trade flows between the countries involved without taking into account the logistical costs.
We can also refine this a little more by also grouping on section code:
can_foods_refined <- tbl(con, "yrpc") %>%
filter(
year == 2023L,
partner_iso == "CAN",
section_code %in% c("01", "02", "03", "04")
) %>%
mutate(
exporter = case_when(
reporter_iso == "USA" ~ "United States",
TRUE ~ "Rest of the World"
)
) %>%
inner_join(
tbl(con, "commodities") %>%
select(commodity_code, section_code, section_name, section_color)
) %>%
group_by(exporter, section_name, section_color) %>%
summarise(
imports = sum(trade_value_usd_exp, na.rm = TRUE)
) %>%
ungroup() %>%
group_by(section_name, section_color) %>%
mutate(
share = imports / sum(imports) * 100
) %>%
ungroup() %>%
arrange(section_name, exporter) %>%
collect()
can_foods_refined# A tibble: 8 × 5 exporter section_name section_color imports share1 Rest of the World Animal, Vegetable Or Microbial … #549e95 1.85e 9 66.8 2 United States Animal, Vegetable Or Microbial … #549e95 9.19e 8 33.2 3 Rest of the World Live Animals; Animal Products #74c0e2 6.13e 9 60.7 4 United States Live Animals; Animal Products #74c0e2 3.97e 9 39.3 5 Rest of the World Prepared Foodstuffs; Beverages,… #8abdb6 2.57e10 60.0 6 United States Prepared Foodstuffs; Beverages,… #8abdb6 1.71e10 40.0 7 Rest of the World Vegetable Products #406662 1.07e10 49.7 8 United States Vegetable Products #406662 1.09e10 50.3
A plot of the previous table can show how much of the foods imported by Canada (and food -related) from the United States:
library(stringr)
library(ggplot2)
# trim section name to 40 characters
can_foods_refined <- can_foods_refined %>%
mutate(section_name = paste0(str_sub(section_name, 1, 40), "..."))
ggplot(can_foods_refined, aes(x = exporter, y = share, fill = section_color)) +
geom_col(width = 0.7) +
facet_wrap(~ section_name, ncol = 1) +
scale_fill_identity() +
labs(
title = "Share of Food Products Imported by Canada (2023)",
subtitle = "Source: Own creation based on UN Comtrade data",
x = "",
y = "Share (%)"
) +
theme_minimal(base_size = 13) +
theme(
axis.text.x = element_text(angle = 30, hjust = 1),
legend.position = "none"
)The mutual rates of 25% applied to some animal and vegetable products that are effective on 12 September 2025 (Source: Canada) have a direct influence on the bag of Canadian households with a rough size of an extra 25% x 15.7% x 60% ~ = 2.5% only on foods. Food is now 25% x 60% ~ = 15% more expensive for Canadian families. This is a significant impact on the costs of living for many families when we start adding other affected products to the mix and the extra impact on employment, because these rates also influence steel and other Canadian industries. This of course ignores general balance effects such as trade defects and changes in other exporters and consumer behavior.
If you read here, I have offered my international trade data for sale, it relates to 1988-2023 for all countries at raw materials (HS 6 figures) and has already been cleaned and organized in a PostgresQL database. You can get 65 GB of clean data from Buy a coffee for me With documentation and support.
Related
#President #Trumps #rates #influence #Canadian #households #RBloggers


