Hello, everyone! I am Liam, and I am excited to announce that I have taken over as package maintainer for both {metacore} and {metatools}. Christina remains on as a mentor, and I want to thank both her and Ben Straub for their constant support.
What’s new in Metacore?
The purpose of version 0.2.0 was to clarify the distinction between an imported metacore specification, which contains information about multiple datasets, and a subsetted specification with information about just a single dataset (as obtained via select_dataset()).
We have received a number of questions and issues where users tried to use a metacore object containing metadata for multiple datasets in {metatools} functions.
A metacore object with multiple datasets and one with a single dataset have now been redesigned to be programmatically different: the single-dataset object is implemented as a subclass of metacore called "DatasetMeta".
From the perspective of users there is one important change: a metadata object subsetted to a single dataset is now required to work with {metatools}.
The print statements of both combined and subsetted metacore objects have been refined to better illustrate the differences between them and to offer more useful information to the user.
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
    filter, lag
The following objects are masked from 'package:base':
    intersect, setdiff, setequal, union

library(metacore)
library(metatools)
library(tibble)
library(haven)
load(metacore_example("pilot_ADaM.rda"))
metacore

── Metacore object contains metadata for 5 datasets ────────────────────────────
→ ADSL (Subject-Level Analysis Dataset)
→ ADADAS (ADAS-Cog Analysis)
→ ADLBC (Analysis Dataset Lab Blood Chemistry)
→ ADTTE (AE Time To 1st Derm. Event Analysis)
→ ADAE (Adverse Events Analysis Dataset)
ℹ To use the Metacore object with metatools package, first subset a dataset using `metacore::select_dataset()`
The select_dataset() function subsets the metadata to a single dataset:
adsl_spec <- select_dataset(metacore, "ADSL", quiet = TRUE)
✔ ADSL dataset successfully selected
ℹ Dataset metadata specification subsetted with suppressed warnings
Printing the subsetted object now offers more detailed information:
adsl_spec
── Dataset specification object for ADSL (Subject-Level Analysis Dataset) ──────
The dataset contains 51 variables
The structure of the specification object is:
→ codelist: character [16 x 4] code_id, name, type, codes
→ derivations: character [50 x 2] derivation_id derivation
→ ds_spec: character [1 x 3] dataset, structure, label
→ ds_vars: character [51 x 7] dataset, variable, key_seq, order, keep, core, supp_flag
→ supp: character [0 x 4] dataset, variable, idvar, qeval
→ value_spec: character [51 x 8] dataset, variable, code_id, derivation_id, type, origin, where, sig_dig
→ var_spec: character [51 x 6] variable, type, length, label, format, common
To inspect the specification object use `View()` in the console.
Functions that take a metacore object as input now emit an informative error message if a subsetted object is not supplied.
ds_list <- list(DM = read_xpt(metatools_example("dm.xpt")))
create_var_from_codelist(data.frame(), metacore)

Error in `verify_DatasetMeta()`:
! The object supplied to the argument `metacore` is not a subsetted Metacore object. Use `metacore::select_dataset()` to subset metadata for the required dataset.
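The subclass idea behind this check can be illustrated with a minimal base R sketch. This is not how {metacore} implements it internally; the S3 classes and the function names new_spec(), subset_spec() and verify_dataset_spec() are hypothetical and exist only for this example. The point is that a subsetted object still counts as a specification, while a guard function can reject anything that lacks the more specific class.

```r
# Hypothetical constructor for a multi-dataset specification (S3, base R only).
new_spec <- function(datasets) {
  structure(list(datasets = datasets), class = "spec")
}

# Subsetting returns an object that is still a "spec" but also carries the
# more specific "dataset_spec" class.
subset_spec <- function(spec, dataset) {
  stopifnot(inherits(spec, "spec"), dataset %in% spec$datasets)
  structure(list(datasets = dataset), class = c("dataset_spec", "spec"))
}

# Guard used by downstream functions: fail early with an actionable message.
verify_dataset_spec <- function(x) {
  if (!inherits(x, "dataset_spec")) {
    stop("The object supplied is not a subsetted specification. ",
         "Subset metadata for the required dataset first.", call. = FALSE)
  }
  invisible(x)
}

full <- new_spec(c("ADSL", "ADAE"))
adsl <- subset_spec(full, "ADSL")

verify_dataset_spec(adsl)  # passes silently

# The full specification is rejected; capture the message instead of stopping.
res <- tryCatch(verify_dataset_spec(full), error = function(e) conditionMessage(e))
```

Because the check relies on inherits(), any function written for the general class keeps working on the subsetted object, while functions that need a single dataset can insist on it.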
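create_var_from_codelist()
Previously, this function always used the codelist of out_var for the translation. This is a problem when, for example, deriving PARAM from PARAMCD but the codelist for the out_var (PARAM) does not contain the values of PARAMCD.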
| code_id | row | code | decode |
|---|---|---|---|
| PARAM | 1 | Alanine Aminotransferase | Alanine Aminotransferase |
| PARAM | 2 | Bilirubin | Bilirubin |
| PARAM | 3 | Creatine | Creatine |
Example of the default behavior, which does not give the desired result:
adlbc_spec <- suppressMessages(select_dataset(metacore, "ADLBC", quiet = TRUE))
data <- tibble(PARAMCD = c("ALB", "ALP", "ALT"))
create_var_from_codelist(data, adlbc_spec, input_var = PARAMCD, out_var = PARAM, strict = FALSE)

# A tibble: 3 × 2
  PARAMCD PARAM
1 ALB
2 ALP
3 ALT
By default, the codelist of out_var is used as input. The user can now override this default with a specific codelist (in this case the PARAMCD codelist below) to achieve the desired result.
| code_id | row | code | decode |
|---|---|---|---|
| PARAMCD | 1 | ALT | Alanine Aminotransferase |
| PARAMCD | 2 | BILI | Bilirubin |
| PARAMCD | 3 | CREAT | Creatine |
create_var_from_codelist(data, adlbc_spec, input_var = PARAMCD, out_var = PARAM, codelist = get_control_term(adlbc_spec, PARAMCD), decode_to_code = FALSE)

# A tibble: 3 × 2
  PARAMCD PARAM
1 ALB     Albumin (g/L)
2 ALP     Alkaline Phosphatase (U/L)
3 ALT     Alanine Aminotransferase (U/L)
This function also offers a new option, strict, which when set to TRUE (the default) gives a warning listing all values in the input column that do not appear in the codelist.
data <- tibble(PARAMCD = c("ALB", "ALP", "ALT", "DUMMY1", "DUMMY2"))
x <- create_var_from_codelist(data, adlbc_spec, input_var = PARAMCD, out_var = PARAM, codelist = get_control_term(adlbc_spec, PARAMCD), decode_to_code = FALSE, strict = TRUE)

Warning: In `create_var_from_codelist()`: The following values present in the input dataset are not present in the codelist: DUMMY1 and DUMMY2
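To make the lookup-plus-warning behavior concrete, here is a small base R sketch of a code-to-decode translation with a strict switch. It is only an illustration of the idea, not the {metatools} implementation: the decode_values() helper and the hand-written codelist data frame are assumptions made for this example.

```r
# Hypothetical codelist: codes paired with their decodes.
codelist <- data.frame(
  code   = c("ALB", "ALP", "ALT"),
  decode = c("Albumin (g/L)", "Alkaline Phosphatase (U/L)",
             "Alanine Aminotransferase (U/L)"),
  stringsAsFactors = FALSE
)

# Translate codes to decodes; with strict = TRUE, warn about input values
# that have no entry in the codelist. Unmatched values become NA either way.
decode_values <- function(x, codelist, strict = TRUE) {
  unmatched <- setdiff(unique(x), codelist$code)
  if (strict && length(unmatched) > 0) {
    warning("Values not present in the codelist: ",
            paste(unmatched, collapse = ", "), call. = FALSE)
  }
  codelist$decode[match(x, codelist$code)]
}

# strict = FALSE: unmatched values silently map to NA.
param <- decode_values(c("ALB", "ALT", "DUMMY1"), codelist, strict = FALSE)

# strict = TRUE: capture the warning text rather than printing it.
msg <- tryCatch(
  decode_values(c("ALB", "DUMMY1"), codelist),
  warning = function(w) conditionMessage(w)
)
```

Keeping the warning separate from the lookup means the result is the same in both modes; strict only controls whether unmatched values are reported.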
create_cat_var()
Categorical variables can now be created from either the code or the decode column of the controlled terminology. Previously, the variable was always created from the code column only, so that the "years" text was omitted from the new variable.
| code_id | name | type | row | code | decode |
|---|---|---|---|---|---|
| AGEGR2 | Pooled Age Group 2 | text | 1 | <35 | <35 years |
| AGEGR2 | Pooled Age Group 2 | text | 2 | 35-49 | 35-49 years |
| AGEGR2 | Pooled Age Group 2 | text | 3 | >=50 | >=50 years |
Now, specifying the option create_from_decode = TRUE allows you to create the variable based on the text in the decode column. If you use this option to also create a numeric coded variable (in this case AGEGR2N), make sure your controlled terminology is set up so that the decode columns match.
dm <- read_xpt(metatools_example("dm.xpt"))
create_cat_var(dm, adsl_spec, AGE, AGEGR2, AGEGR2N, create_from_decode = TRUE) %>%
  select(USUBJID, AGE, AGEGR2, AGEGR2N) %>%
  head(5)

# A tibble: 5 × 4
  USUBJID       AGE AGEGR2      AGEGR2N
1 01-701-1015    63 18-64 years       1
2 01-701-1023    64 18-64 years       1
3 01-701-1028    71 65-80 years       2
4 01-701-1033    74 65-80 years       2
5 01-701-1034    77 65-80 years       2
This function now also has a strict option, TRUE by default, which gives a warning message if there are values in the reference column that do not fit into the categories in the controlled terminology. This can be switched off with strict = FALSE.
dm2 <- dm |>
  tibble::add_row(AGE = 15) |>
  tibble::add_row(AGE = 16)
x <- create_cat_var(dm2, adsl_spec, AGE, AGEGR2, create_from_decode = TRUE)
Warning: There are 2 observations in AGE that do not fit into the provided categories for AGEGR2. Please check your controlled terminology.
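The general idea of binning a reference column and flagging values that fall outside every category can be sketched with base R's cut(). This is only an analogy under stated assumptions, not what create_cat_var() does internally: the breaks and labels are written by hand here, whereas {metatools} reads the categories from the controlled terminology, and ages are assumed to be whole numbers.

```r
# Ages to categorize; 15 lies outside both categories.
age <- c(15, 63, 71)

# Hand-written categories mirroring the labels in the output above:
# [18, 65) -> "18-64 years", [65, 81) -> "65-80 years".
agegr2 <- cut(
  age,
  breaks = c(18, 65, 81),
  right  = FALSE,
  labels = c("18-64 years", "65-80 years")
)

# Numeric coded companion variable, taken from the factor level order.
agegr2n <- as.integer(agegr2)

# Count non-missing ages that did not fit into any category.
n_unmatched <- sum(is.na(agegr2) & !is.na(age))
if (n_unmatched > 0) {
  message("There are ", n_unmatched,
          " observations in AGE that do not fit into the provided categories.")
}
```

Values outside the breaks become NA rather than being forced into the nearest group, which is why a count of unmatched observations is a useful diagnostic.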
Summary of other changes
- Fixed a bug in which the presence of variables with VLM in the value_spec table would prevent variables with the same name in different datasets from being entered in the value_spec table.
- metatools::build_from_derived() adds new options for the keep parameter, with which users can keep either all or only prerequisite columns from the source datasets. Thanks to Matt Bearham for this addition!
- metatools::combine_supp() now adds the label found in QLABEL to the QNAM columns that are derived from supplemental datasets. Thanks to Bill Deany for this addition!
- metatools::check_variables() now offers a strict option that gives a warning instead of throwing an error when strict = FALSE.
- {cli} output is now used in both packages, and the messaging for various functions has been improved.
What's next?
The next steps for both packages are to continue addressing issues and clearing out the backlog, to update the examples and vignettes, and to improve the user experience via more informative messages.
For
I hope to release the next update towards the end of the year, aiming for an approximately 6-month release schedule in the future. Until then, I encourage people to explore some of the new functions and to give feedback about the changes via GitHub at the links below:
Thank you for reading!
Last updated
2025-08-07 01:11:34.287895
Citation
BibTeX citation:
@online{hobby2025,
author = {Hobby, Liam},
title = {Metacore and {Metatools} 0.2.0},
date = {2025-08-04},
url = {https://pharmaverse.github.io/blog/posts/2025-08-04_metacore_0.2.0/metacore_0.2.0.html},
langid = {en}
}
For attribution, please cite this work as:
Hobby, Liam. 2025. “Metacore and Metatools 0.2.0.” August 4, 2025. https://pharmaverse.github.io/blog/posts/2025-08-04_metacore_0.2.0/metacore_0.2.0.html.


