Faster Clinical Trial Reporting: A Beginner’s Guide to Implementing CDISC SDTM and ADaM Standards with Open-Source R Packages | R-bloggers


[This article was first published on the pharmaverse blog and kindly contributed to R-bloggers.]




Clinical trials are a crucial part of the pharmaceutical industry: they provide the evidence needed to demonstrate the safety, efficacy, and overall benefit of new drugs or treatments.

Clinical trial reporting, a detailed process in itself, is governed by regulatory bodies such as the US Food and Drug Administration (FDA), the European Medicines Agency (EMA), and the International Council for Harmonisation (ICH).

The results of these trials must be reported carefully to protect public health and to foster trust in the pharmaceutical industry. Pharmaverse and R support this need directly by offering open-source, transparent, and standardized tools that enable consistent data processing and reporting.

Current challenges

Clinical trials in their current state face various challenges, which can be broadly grouped into three key areas:

Ethical and transparency problems

International ethical standards, including the Declaration of Helsinki, emphasize that the results of all clinical trials (positive, negative, or inconclusive) must be made publicly accessible. Nevertheless, empirical studies show that compliance remains inconsistent. In particular, early-phase or smaller studies, as well as trials with negative results, are less likely to be reported in a timely or visible way (SOCRA; BMJ/PMC analysis). Moreover, evidence suggests that industry sponsorship can introduce bias at different stages of trial design, analysis, and reporting, which reinforces the importance of transparent dissemination of all findings (Cochrane Review, 2017).

Cost and efficiency

Clinical trials are expensive, often costing hundreds of millions of dollars. This can hinder the development of drugs for rare diseases, where the return on investment is limited by the small patient population. Failures in clinical trials, especially in late-phase studies that often involve more than 1000 patients, can result in enormous losses, exacerbated by the complex and varying nature of the global regulations surrounding them.

Data and reproducibility problems

Reproducibility is key to the independent verification of clinical trial results. Inconsistent data collection, formats, and methods often hinder cross-study comparisons, especially when a study is not designed around consistent, reproducible international standards.

Pharmaceutical standards

The Clinical Data Interchange Standards Consortium (CDISC) provides global data standards: a set of guidelines and models used to standardize the collection, exchange, and analysis of clinical trial data. These standards aim to improve efficiency, data quality, and interoperability across clinical research.

In this article, the key CDISC standards we will focus on are the Clinical Data Acquisition Standards Harmonization (CDASH), the Study Data Tabulation Model (SDTM), and the Analysis Data Model (ADaM).

CDASH – Clinical Data Acquisition Standards Harmonization

CDASH provides guidelines for standardized data collection across clinical trials. It ensures that data collected at clinical sites are structured, consistent, and easily translated into regulatory submission formats such as SDTM.

SDTM – Study Data Tabulation Model

The SDTM Implementation Guide is a standard for structuring and organizing clinical trial datasets for regulatory submissions (e.g., to the FDA or EMA). SDTM data is generally treated by the FDA as the raw or source data, because it represents the collected study information in a standardized format.

Importantly, SDTM restructures the collected data for consistency and transparency across studies, but it does not derive or create new analysis variables. That role is covered by ADaM.

ADaM – Analysis Data Model

The ADaM Implementation Guide ensures traceability from SDTM data to analysis-ready datasets. This model defines a standardized way to create datasets for statistical analysis from SDTM-organized data, facilitating efficient analysis and reporting.
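
To make the division of labour between SDTM and ADaM concrete, here is a minimal base-R sketch on invented toy data. The variable names follow common CDISC conventions (USUBJID, RFXSTDTC, TRTSDT, TRTDURD), but the raw input, the study identifier, and the values are illustrative assumptions; this is not the output of {sdtm.oak} or {admiral}, only the idea behind the two steps.

```r
# Toy "collected" data (hypothetical values, not from a real study).
raw <- data.frame(
  subject_id = c("001", "002", "003"),
  sex        = c("Male", "Female", "Female"),
  first_dose = c("2024-01-10", "2024-01-12", "2024-02-01"),
  last_dose  = c("2024-03-10", "2024-02-20", "2024-04-30"),
  stringsAsFactors = FALSE
)

# SDTM-style step: rename and recode what was collected.
# Nothing new is derived here.
dm <- data.frame(
  STUDYID  = "STUDY01",
  USUBJID  = paste("STUDY01", raw$subject_id, sep = "-"),
  SEX      = ifelse(raw$sex == "Male", "M", "F"),
  RFXSTDTC = raw$first_dose,
  RFXENDTC = raw$last_dose,
  stringsAsFactors = FALSE
)

# ADaM-style step: derive analysis variables, keeping the source
# columns so each derived value can be traced back to SDTM.
adsl <- dm
adsl$TRTSDT  <- as.Date(adsl$RFXSTDTC)
adsl$TRTEDT  <- as.Date(adsl$RFXENDTC)
adsl$TRTDURD <- as.integer(adsl$TRTEDT - adsl$TRTSDT) + 1L  # duration in days

adsl[, c("USUBJID", "SEX", "TRTSDT", "TRTEDT", "TRTDURD")]
```

The treatment duration exists only in the ADaM-style dataset, while the character dates it was derived from remain available alongside it.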

The role of R in clinical trials

R is now widely used in clinical trial analysis because of its flexibility, powerful statistical capabilities, and strong support for regulatory compliance. R offers:

  • Robust statistical analysis, used for survival analysis, mixed models, and Bayesian statistics.
  • Data cleaning and transformation, facilitated by specialized packages and modeling tools.
  • Reproducibility and open-source solutions, so that clinical reports and submissions can be automated, reducing errors.
  • Integration with CDISC standards, through various R packages that can structure data according to SDTM and ADaM.

Pharmaverse

Pharmaverse is a collaborative open-source ecosystem of tools for clinical trial data processing and regulatory compliance. It currently consists mainly of R packages, but now also includes emerging Python software. It provides resources that streamline the implementation of CDISC standards such as SDTM, ADaM, and Define-XML.

  • Key Pharmaverse packages:
    • {sdtm.oak} – helps structure SDTM datasets.
    • {admiral} – facilitates the creation of ADaM datasets.
    • {teal} – builds interactive R/Shiny dashboards for exploratory clinical data analysis.
  • Related R packages:
    • {rtables} – creates regulatory tables for clinical reports.
    • {tern} – creates regulatory tables/graphs for clinical reports.

These are not the only packages in use, but they are the ones referenced in this blog post.

Pharmaverse does not have a dedicated CDASH package, but its metadata-driven approach means that you can still work with CDASH concepts to connect source data collection to SDTM mapping.

A more extensive list of packages can be found here: https://pharmaverse.org/e2eclinical/

Pipeline

In clinical trial reporting, the data pipeline typically follows this standard flow:

Raw patient data are collected according to CDASH guidelines and then processed with the SDTM and ADaM standards to produce the associated tables, listings, and graphs/figures covering all aspects of a specific clinical trial.

Starting with the CDASH data from a study, the preparation process begins with the {sdtm.oak} and {sdtmchecks} packages from the Pharmaverse, which are used to turn them into SDTM-formatted datasets.

ADaM standards are then applied to these data using the {admiral} package. The resulting datasets can then be analyzed and turned into tables, listings, and graphs/figures using non-Pharmaverse (but still related) packages such as {rtables} and {tern}.

The Pharmaverse contains some repositories with test data that can help visualize and understand this, namely pharmaversesdtm and pharmaverseadam.

There is also the {random.cdisc.data} package, which can be used to generate randomized CDISC data for exploratory purposes.

More information about these datasets can be found on their respective documentation sites.

Tables, listings, graphs/figures

Clinical trial reports contain three different types of outputs:

  1. Tables
  2. Listings
  3. Graphs/Figures

These outputs play an essential role in summarizing the results of a clinical trial in the report. They can summarize safety, efficacy, and exploratory results (such as biomarker analyses).

In the following sections we will use the datasets in the {random.cdisc.data} package to create example outputs used in clinical trial reports. Each example is split into two parts, namely “Data Setup” and “Standard Output Generation”.

Tables

In this example we will create the standard Adverse Events table (AET02) using the cadsl and cadae datasets from the {random.cdisc.data} package (see the TLG Catalog for reference).

> Adverse Events Table – AET02

Data Setup:

library(random.cdisc.data)
library(dplyr)
library(tern)

# Ensure character variables are converted to factors so that
# empty strings and NAs are explicit missing levels.

adsl <- df_explicit_na(cadsl)
adae <- df_explicit_na(cadae) %>%
  var_relabel(
    AEBODSYS = "MedDRA System Organ Class",
    AEDECOD = "MedDRA Preferred Term"
  ) %>%
  filter(ANL01FL == "Y")

# Define the split function
split_fun <- drop_split_levels

AET02 Output Generation:

# Standard Table for AET02 --------------------------------------------------------------
lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ACTARM") %>%
  add_overall_col(label = "All Patients") %>%
  analyze_num_patients(
    vars = "USUBJID",
    .stats = c("unique", "nonunique"),
    .labels = c(
      unique = "Total number of patients with at least one adverse event",
      nonunique = "Overall total number of events"
    )
  ) %>%
  split_rows_by(
    "AEBODSYS",
    child_labels = "visible",
    nested = FALSE,
    split_fun = split_fun,
    label_pos = "topleft",
    split_label = obj_label(adae$AEBODSYS)
  ) %>%
  summarize_num_patients(
    var = "USUBJID",
    .stats = c("unique", "nonunique"),
    .labels = c(
      unique = "Total number of patients with at least one adverse event",
      nonunique = "Total number of events"
    )
  ) %>%
  count_occurrences(
    vars = "AEDECOD",
    .indent_mods = -1L
  ) %>%
  append_varlabels(adae, "AEDECOD", indent = 1L)

result <- build_table(lyt, df = adae, alt_counts_df = adsl)

result
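
The two statistics requested through `.stats = c("unique", "nonunique")` answer different questions: how many distinct patients had at least one event, and how many event records there are in total. The base-R sketch below reproduces that distinction on a small invented data frame. It is not how {tern} builds the table internally, only an illustration of the counting logic under the assumption of one row per adverse event.

```r
# Toy adverse event records (hypothetical): one row per event.
ae <- data.frame(
  USUBJID = c("P1", "P1", "P2", "P3", "P3", "P3", "P4"),
  ACTARM  = c("A", "A", "A", "B", "B", "B", "B"),
  AEDECOD = c("Headache", "Nausea", "Headache",
              "Nausea", "Nausea", "Rash", "Headache"),
  stringsAsFactors = FALSE
)

# "unique": number of distinct patients with at least one event, per arm.
n_patients <- tapply(ae$USUBJID, ae$ACTARM, function(x) length(unique(x)))

# "nonunique": total number of event records, per arm.
n_events <- tapply(ae$USUBJID, ae$ACTARM, length)

# Patients with at least one occurrence of each preferred term, per arm.
# Dropping duplicate rows first means a patient with two "Nausea"
# records is counted once for that term.
ae_once   <- unique(ae[, c("USUBJID", "ACTARM", "AEDECOD")])
pt_by_arm <- table(ae_once[, c("AEDECOD", "ACTARM")])

n_patients
n_events
pt_by_arm
```

In this toy data, patient P3 contributes three events but counts as a single patient, which is why the two totals differ within the same arm.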

Listings

In this example we will create the standard Listing of Preferred Terms, Lowest Level Terms, and Investigator-Specified Adverse Event Terms (AEL01) using the cadae dataset from the {random.cdisc.data} package.

> Listing of Preferred Terms, Lowest Level Terms, and Investigator-Specified Adverse Event Terms – AEL01

Data Setup:

library(random.cdisc.data)
library(dplyr)
library(rlistings)

out <- cadae %>%
  select(AESOC, AEDECOD, AELLT, AETERM) %>%
  unique()

var_labels(out) <- c(
  AESOC = "MedDRA System Organ Class",
  AEDECOD = "MedDRA Preferred Term",
  AELLT = "MedDRA Lowest Level Term",
  AETERM = "Investigator-Specified\nAdverse Event Term"
)

AEL01 Output Generation:

# Standard Listing for AEL01 ------------------------------------------------------------
lsting <- as_listing(
  out,
  key_cols = c("AESOC", "AEDECOD", "AELLT"),
  disp_cols = names(out),
  main_title = "Listing of Preferred Terms, Lowest Level Terms, and Investigator-Specified Adverse Event Terms"
)

head(lsting, 20)

Graphs/Figures

In this example we will create the standard plot for Subgroup Analysis of Survival Duration (FSTG02) using the cadtte dataset from the {random.cdisc.data} package.

> Subgroup Analysis of Survival Duration – FSTG02

Data Setup:

library(random.cdisc.data)
library(tern)
library(dplyr)
library(forcats)
library(nestcolor)

preprocess_adtte <- function(adtte) {
  # Save variable labels before data processing steps.
  adtte_labels <- var_labels(adtte)

  adtte <- adtte %>%
    df_explicit_na() %>%
    dplyr::filter(
      PARAMCD == "OS",
      ARM %in% c("B: Placebo", "A: Drug X"),
      SEX %in% c("M", "F")
    ) %>%
    dplyr::mutate(
      # Reorder levels of ARM to display reference arm before treatment arm.
      ARM = droplevels(forcats::fct_relevel(ARM, "B: Placebo")),
      SEX = droplevels(SEX),
      is_event = CNSR == 0,
      # Convert time to MONTH
      AVAL = day2month(AVAL),
      AVALU = "Months"
    ) %>%
    var_relabel(
      ARM = adtte_labels["ARM"],
      SEX = adtte_labels["SEX"],
      is_event = "Event Flag",
      AVAL = adtte_labels["AVAL"],
      AVALU = adtte_labels["AVALU"]
    )

  adtte
}

anl <- cadtte %>%
  preprocess_adtte()
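
Two of the preprocessing steps above are worth seeing in isolation: the event flag derived from the censoring indicator, and the conversion of time from days to months. The sketch below uses invented values and a hand-written helper, `days_to_months()`, which is hypothetical and assumes an average month length of 30.4375 days (365.25 / 12). To our understanding `day2month()` applies the same factor, but treat that as an assumption and check the {tern} documentation.

```r
# Hypothetical helper: assumes an average month of 365.25 / 12 = 30.4375 days.
days_to_months <- function(days) days / 30.4375

# Toy time-to-event records (invented values).
tte <- data.frame(
  USUBJID = c("P1", "P2", "P3"),
  AVAL    = c(365.25, 60.875, 913.125),  # time to event or censoring, in days
  CNSR    = c(0, 1, 0)                   # 0 = event observed, 1 = censored
)

# An event occurred exactly when the record is not censored.
tte$is_event <- tte$CNSR == 0

# Convert the analysis value to months and record the new unit.
tte$AVAL  <- days_to_months(tte$AVAL)
tte$AVALU <- "Months"

tte
```

Keeping the unit in AVALU next to the converted AVAL is what allows the output step to label the time axis from the data rather than from a hard-coded string.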

FSTG02 Output Generation:

# Standard Plot for FSTG02 -------------------------------------------------------
anl1 <- anl

df <- extract_survival_subgroups(
  variables = list(tte = "AVAL", is_event = "is_event", arm = "ARM", subgroups = c("SEX", "BMRKR2")),
  data = anl1
)

result <- basic_table() %>%
  tabulate_survival_subgroups(
    df = df,
    vars = c("n_tot", "n", "median", "hr", "ci"),
    time_unit = anl1$AVALU[1]
  )

result

plot <- g_forest(tbl = result)

plot

Conclusion


Clinical trial reporting remains fundamental to public trust and to the integrity of drug development. Nevertheless, the field continues to face persistent transparency gaps, escalating costs, and reproducibility challenges.

Adopting open standards and collaborative platforms, such as the Pharmaverse ecosystem, together with broader open-source initiatives in R and related technologies, offers a practical path forward. These approaches enable greater transparency, improved efficiency, and rigorous, standards-based data practices that strengthen the scientific foundation of clinical research. Consequently, the level of trust can be further improved and public well-being can be safeguarded.

Join us for part 2 of this series, where we take a look at interactive clinical trial data analysis using packages that create tailored Shiny applications specifically for this space.



Last updated

2025-09-12 09:11:58.119603

Citation

BibTeX citation:

@online{hee2025,
  author = {Hee, Fabian and Kouretsis, Alexandros and {Appsilon}},
  title = {Faster {Clinical} {Trial} {Reporting:} {A} {Beginner’s}
    {Guide} to {Implementing} {CDISC} {SDTM} and {ADaM} {Standards} with
    {Open-Source} {R} {Packages}},
  date = {2025-09-12},
  url = {https://pharmaverse.github.io/blog/posts/2025-09-12_faster_clinical_trial_reporting/faster_clinical_trial_reporting.html},
  langid = {en}
}

For attribution, please cite this work as:

Hee, Fabian, Alexandros Kouretsis, and Appsilon. 2025. “Faster Clinical Trial Reporting: A Beginner’s Guide to Implementing CDISC SDTM and ADaM Standards with Open-Source R Packages.” September 12, 2025. https://pharmaverse.github.io/blog/posts/2025-09-12_faster_clinical_trial_reporting/faster_clinical_trial_reporting.html.

