Country codes | R bloggers

[This article was first published on r.iresmi.net, and kindly contributed to R-bloggers].


Physicists of many countries – CC BY-NC-ND by Ann Fisher

Day 3 of the 30DayMapChallenge: « Polygons » (earlier).

An interesting challenge was posted a few weeks ago (https://en.osm.town/@opencage/115271196316302891):

List countries whose ISO 3166-1 alpha-2 code is included as a substring in the common English version of the country name.

For example: Italy has the ISO code “IT” in its name; “SE” is the ISO code for Sweden, but does not appear in the name.

Let’s map that out…

Setup

library(dplyr)
library(stringr)
library(purrr)
library(emoji)
library(rvest)
library(glue)
library(gt)
library(ggplot2)
library(giscoR)
library(janitor)
library(sf)
library(jsonlite)

Data

# ISO_3166-1_alpha-2 codes
iso_3166_a2 <- read_html("https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2") |> 
  html_table(na.strings = "") |> # take care not to interpret Namibia as NA!
  pluck(4) |> 
  rename(code = Code,
         name = `Country name (using title case)`) 

# Using data from {giscoR}, better suited than {rnaturalearth} for alpha-2 codes,
# but we still need some manual cleaning: GISCO follows the Eurostat convention
# (EL for Greece, UK for the United Kingdom) instead of ISO 3166-1
countries <- gisco_countries |> 
  clean_names() |> 
  mutate(cntr_id = case_match(cntr_id,
                              "EL" ~ "GR",
                              "UK" ~ "GB",
                              .default = cntr_id))

# I want an Equal Earth projection centered on the Pacific (EPSG:8859).
# The geometries must be split at the antimeridian, so we need the
# longitude of the projection origin
epsg <- "8859"
origin <- fromJSON(glue("https://epsg.io/{epsg}.json")) |> 
  pluck("conversion", "parameters") |>
  filter(name == "Longitude of natural origin") |> 
  pull(value) # should be 150

# For the map background
ocean <- st_bbox(countries) |> 
  st_as_sfc() |> 
  st_break_antimeridian(lon_0 = origin) |> 
  st_segmentize(units::set_units(100, km))
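
The `na.strings = ""` argument used when reading the Wikipedia table matters because Namibia's code is the literal string `NA`. Here is a minimal base R sketch of the same pitfall. It uses `read.csv()` on a hand-typed two-row sample instead of `html_table()`, so it only illustrates the behaviour and is not the code used in this post:

```r
# Hand-typed sample, not the Wikipedia table
csv <- "code,name\nNA,Namibia\nIT,Italy"

# Default settings: the string "NA" is read as a missing value
default_read <- read.csv(text = csv)

# With na.strings = "", only empty fields are treated as missing
safe_read <- read.csv(text = csv, na.strings = "")

is.na(default_read$code[1])        # Namibia's code is lost
as.character(safe_read$code[1])    # Namibia's code is kept as a string
```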

Solve the problem

Actually, this is the easiest part, once the data is clean!

results <- iso_3166_a2 |> 
  filter(str_detect(name, regex(code, ignore_case = TRUE)))
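
The filter works because the test is done pairwise: each name is checked against its own code, not against all codes. As a rough illustration, the same row-by-row test can be sketched in base R on a small hand-typed sample (the `toy` data frame is an assumption for this example, and `grepl()` replaces `str_detect()` here):

```r
# Hand-typed sample, not the full ISO table
toy <- data.frame(
  code = c("IT", "SE", "NA", "FR"),
  name = c("Italy", "Sweden", "Namibia", "France"),
  stringsAsFactors = FALSE
)

# Pairwise, case-insensitive test: is code[i] a substring of name[i]?
toy$match <- mapply(
  function(code, name) grepl(tolower(code), tolower(name), fixed = TRUE),
  toy$code, toy$name,
  USE.NAMES = FALSE
)

# Keep only the rows where the code appears in the name
toy[toy$match, ]
```

Italy, Namibia and France are kept; Sweden is dropped, as in the example from the challenge.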

Of the 249 countries with an ISO code, 59 satisfy the condition (Table 1). For all of them, the code is made of the first two letters of the name.
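
The claim about the first two letters is easy to check. A minimal sketch, using a hand-typed sample in which "Xanadu"/"AN" is a made-up entry added only to show a match that is not at the start of the name:

```r
# Hand-typed sample; "Xanadu" with code "AN" is hypothetical, not a real ISO entry
sample_names <- c("Italy", "Namibia", "Xanadu")
sample_codes <- c("IT", "NA", "AN")

# TRUE when the upper-cased name begins with its code
at_start <- startsWith(toupper(sample_names), sample_codes)
at_start
```

Applied to the `name` and `code` columns of `results`, wrapping the same expression in `all()` should tell whether every match sits at the start of the name.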

At least one is missing! New Caledonia's code is NC; it does not appear in the name, but the territory is part of FraNCe. With flexible rules, it counts…

# using emoji flags for display, we need some more data wrangling
results |> 
  mutate(name_flag = name |> 
           str_split_i(",", 1) |> 
           str_replace_all(
             c("Lao People's Democratic Republic" = "Laos",
               "Russian Federation" = "Russia",
               "Syrian Arab Republic" = "Syria",
               "Virgin Islands \\(U\\.S\\.\\)" = "U.S. Virgin Islands"))) |> 
  mutate(flag = map(name_flag, possibly(\(x) flag(x), otherwise = "")),
         display = glue("{flag} {name} ({code})")) |> 
  arrange(display) |> 
  select(display) |> 
  gt() |> 
  cols_label(display = "Country") |> 
  cols_align(align = "left")
🇦🇫 Afghanistan (AF)
🇦🇱 Albania (AL)
🇦🇷 Argentina (AR)
🇦🇺 Australia (AU)
🇦🇿 Azerbaijan (AZ)
🇧🇪 Belgium (BE)
🇧🇴 Bolivia, Plurinational State (BO)
🇧🇷 Brazil (BR)
🇨🇦 Canada (CA)
🇨🇴 Colombia (CO)
🇨🇺 Cuba (CU)
🇨🇾 Cyprus (CY)
🇨🇿 Czech Republic (CZ)
🇩🇯 Djibouti (DJ)
🇩🇴 Dominican Republic (DO)
🇪🇨 Ecuador (EC)
🇪🇬 Egypt (EG)
🇪🇷 Eritrea (ER)
🇪🇹 Ethiopia (ET)
🇫🇮 Finland (FI)
🇫🇷 France (FR)
🇬🇦 Gabon (GA)
🇬🇪 Georgia (GE)
🇬🇭 Ghana (GH)
🇬🇮 Gibraltar (GI)
🇬🇷 Greece (GR)
🇬🇺 Guam (GU)
🇭🇺 Hungary (HU)
🇮🇳 India (IN)
🇮🇷 Iran, Islamic Republic (IR)
🇮🇹 Italy (IT)
🇯🇪 Jersey (JE)
🇯🇴 Jordan (JO)
🇰🇪 Kenya (KE)
🇰🇮 Kiribati (KI)
🇱🇦 Lao People’s Democratic Republic (LA)
🇱🇮 Liechtenstein (LI)
🇱🇺 Luxembourg (LU)
🇳🇦 Namibia (NA)
🇳🇮 Nicaragua (NI)
🇳🇴 Norway (NO)
🇴🇲 Oman (OM)
🇵🇦 Panama (PA)
🇵🇪 Peru (PE)
🇵🇭 Philippines (PH)
🇶🇦 Qatar (QA)
🇷🇴 Romania (RO)
🇷🇺 Russian Federation (RU)
🇷🇼 Rwanda (RW)
🇸🇦 Saudi Arabia (SA)
🇸🇴 Somalia (SO)
🇸🇾 Syrian Arab Republic (SY)
🇹🇭 Thailand (TH)
🇹🇴 Tonga (TO)
🇺🇬 Uganda (UG)
🇺🇿 Uzbekistan (UZ)
🇻🇪 Venezuela, Bolivarian Republic (VE)
🇻🇮 Virgin Islands (U.S.) (VI)
🇾🇪 Yemen (YE)

Table 1: Countries that have the ISO 3166-1 alpha-2 code as a substring in their names
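
The table relies on `emoji::flag()` and some name wrangling to get the flags. As an alternative sketch (not the method used here), a flag emoji can also be built directly from the alpha-2 code, since each flag is a pair of Unicode regional indicator symbols starting at U+1F1E6 for "A". The helper `flag_from_code()` below is a hypothetical name introduced for this example:

```r
# Hypothetical helper: build a flag emoji from a two-letter code
flag_from_code <- function(code) {
  # Shift each capital letter from "A" to REGIONAL INDICATOR SYMBOL LETTER A (U+1F1E6)
  intToUtf8(utf8ToInt(toupper(code)) - utf8ToInt("A") + 0x1F1E6)
}

# One flag string per code
flags <- vapply(c("IT", "NA", "FR"), flag_from_code, character(1))
flags
```

Whether a given pair actually renders as a flag depends on the font and platform, so some codes may show up as two boxed letters.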

Map

Now, to fulfill the 30DayMapChallenge, a classic choropleth map.

countries |> 
  st_break_antimeridian(lon_0 = origin) |> 
  left_join(results,
            join_by(cntr_id == code)) |> 
  ggplot() +
  geom_sf(data = ocean, fill = "paleturquoise", color = NA, alpha = .4) +
  geom_sf(aes(fill = !is.na(name), color = !is.na(name))) +
  scale_fill_manual(values =  c("TRUE" = "darkolivegreen3",
                                "FALSE" = "snow2"),
                    labels = c("TRUE" = "yes",
                               "FALSE" = "no")) +
  scale_color_manual(values =  c("TRUE" = "darkolivegreen4",
                                 "FALSE" = "snow3"),
                     labels = c("TRUE" = "yes",
                                "FALSE" = "no")) +
  coord_sf(crs = glue("EPSG:{epsg}")) +
  guides(fill = guide_legend(reverse = TRUE),
         color = guide_legend(reverse = TRUE)) +
  labs(title = glue("Countries whose ISO 3166-1 alpha-2 code is contained as a \\
                    substring in their name"),
       fill = "name has ISO alpha-2 ?",
       color = "name has ISO alpha-2 ?",
       caption = glue("data : Gisco, Wikipedia
                      https://r.iresmi.net/ {Sys.Date()}")) +
  theme_minimal() +
  theme(plot.caption = element_text(size = 6),
        legend.position = "bottom",
        plot.background = element_rect(fill = "white", color = NA))
Figure 1: Countries that have the ISO 3166-1 alpha-2 code as a substring in their names

