Forecast Benchmark: Dynrmf (A New Serious Competitor in Town) vs. Theta Method for M-competitions and Tourism Competition | R bloggers


In the world of time series forecasting, benchmarking new methods against established competitors is critical. Today I’m sharing the results of an extensive comparison between two forecasting approaches: the classic Theta method (notoriously difficult to beat on these benchmark datasets) and the newer Dynrmf algorithm (from the R and Python package `ahead`), tested on three major forecasting competitions.

The setup

I evaluated both methods for:

  • M3 competition: 3,003 time series from various domains
  • M1 competition: the original M-competition dataset (1,001 time series)
  • Tourism competition: tourism-specific time series (1,311 series)

For each dataset, I trained both models on the historical portion and evaluated them against the held-out test sets using standard accuracy metrics (ME, RMSE, MAE, MPE, MAPE, MASE, ACF1). Each model was run with its default parameters.
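To make the metrics concrete, here is a hand-rolled version of a few of them on a toy actual/forecast pair (illustrative only; the code below uses `forecast::accuracy()`, which computes all seven at once):

```r
# Toy example: definitions of four of the error metrics reported below.
actual <- c(102, 110, 98, 107)   # observed values
fcst   <- c(100, 108, 101, 104)  # point forecasts
e <- actual - fcst               # forecast errors

ME   <- mean(e)                      # mean error (bias)
RMSE <- sqrt(mean(e^2))              # root mean squared error
MAE  <- mean(abs(e))                 # mean absolute error
MAPE <- mean(abs(e / actual)) * 100  # mean absolute percentage error

round(c(ME = ME, RMSE = RMSE, MAE = MAE, MAPE = MAPE), 2)
```

MASE additionally rescales the errors by the in-sample error of a naive forecast, and ACF1 is the lag-1 autocorrelation of the residuals.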

Implementation details

The analysis used two-core parallel processing to speed up the computation over thousands of series. Both methods received identical train/test splits for a fair comparison – the full code is at the end of the post.

# Example forecasting loop structure ('method' stands in for
# forecast::thetaf or ahead::dynrmf)
metrics <- foreach(i = 1:n, .combine = rbind) %dopar% {
  series <- dataset[[i]]
  train <- series$x    # training portion
  test <- series$xx    # held-out test portion
  fit <- method(y = train, h = length(test))
  forecast::accuracy(fit)
}

Results

It’s worth mentioning that Theta was the clear winner of the M3 competition (and it has since been improved in many ways), and of course it’s always easier to beat the winner in retrospect.

I performed paired t-tests on the per-series metric differences (Theta – Dynrmf) across all series of each competition. Here’s what the numbers reveal:
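The testing logic can be sketched in base R. For each metric (say RMSE) we have one value per series for each method; a one-sample t-test on the differences is equivalent to a paired t-test. The numbers below are simulated, purely for illustration:

```r
# Paired comparison sketch with simulated per-series RMSE values
# (hypothetical numbers, NOT the competition results).
set.seed(123)
rmse_theta  <- rnorm(50, mean = 10.5, sd = 2)  # per-series RMSE, method A
rmse_dynrmf <- rnorm(50, mean = 10.0, sd = 2)  # per-series RMSE, method B

# One-sample t-test on the differences = paired t-test
z <- t.test(rmse_theta - rmse_dynrmf)
c(statistic = unname(z$statistic), p.value = z$p.value)
```

A positive t-statistic means the first method (Theta, in the tables below) has the larger metric value on average.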

M3 competition results

| Metric | t-statistic | p-value | Interpretation |
|:-------|------------:|--------:|:---------------|
| ME     | 16.96 | <0.001 | Theta significantly more biased |
| RMSE   | 10.17 | <0.001 | Theta significantly worse |
| MAE    |  7.65 | <0.001 | Theta significantly worse |
| MPE    | -5.59 | <0.001 | Dynrmf more biased |
| MAPE   |  1.29 |  0.197 | No significant difference |
| MASE   | 14.09 | <0.001 | Theta significantly worse |
| ACF1   | -1.37 |  0.171 | No significant difference |

Verdict: Dynrmf demonstrates significantly better accuracy on the M3 data for most error metrics.

M1 competition results

| Metric | t-statistic | p-value | Interpretation |
|:-------|------------:|--------:|:---------------|
| ME     |  2.56 |  0.011 | Theta more biased |
| RMSE   |  2.94 |  0.003 | Theta significantly worse |
| MAE    |  2.28 |  0.023 | Theta significantly worse |
| MPE    | -0.61 |  0.540 | No significant difference |
| MAPE   |  0.38 |  0.703 | No significant difference |
| MASE   | -0.74 |  0.461 | No significant difference |
| ACF1   | -6.50 | <0.001 | Dynrmf residuals better |

Verdict: Dynrmf shows moderate but statistically significant improvements, especially in RMSE and MAE.

Tourism competition results

| Metric | t-statistic | p-value | Interpretation |
|:-------|------------:|--------:|:---------------|
| ME     |  1.62 |  0.105 | No significant difference |
| RMSE   |  1.88 |  0.060 | Marginal benefit for Dynrmf |
| MAE    |  1.16 |  0.245 | No significant difference |
| MPE    |  4.52 | <0.001 | Theta less biased |
| MAPE   | -5.03 | <0.001 | Dynrmf significantly better |
| MASE   | -5.17 | <0.001 | Dynrmf significantly better |
| ACF1   | -8.65 | <0.001 | Dynrmf residuals better |

Verdict: On tourism data, Dynrmf excels on the percentage and scaled metrics (MAPE, MASE) and produces better-behaved residuals.

Key Takeaways

  1. Dynrmf consistently matches or outperforms Theta on accuracy metrics such as RMSE, MAE and MASE across all three competitions
  2. The advantage is strongest on M3, which suggests that Dynrmf handles diverse series well
  3. Tourism data shows Dynrmf’s strength in percentage and scaled error statistics
  4. Residual autocorrelation (ACF1) is consistently better with Dynrmf, indicating that patterns are captured more completely
  5. Model-agnostic flexibility: unlike method-specific approaches, Dynrmf accepts any fitting function via its fit_func parameter (see the `dynrmf` documentation). The default is an automatic Ridge regression that minimizes the generalized cross-validation error, but Random Forests, XGBoost, Support Vector Machines, or any other model with a standard fit/predict interface can be plugged in
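As a sketch of that flexibility, here is how swapping in a Random Forest might look, assuming `ahead::dynrmf`'s fit_func / predict_func interface mentioned above (requires the `ahead` and `randomForest` packages; not run as part of the benchmark):

```r
library(ahead)
library(randomForest)

# Default: automatic Ridge regression minimizing GCV error
fit_ridge <- ahead::dynrmf(AirPassengers, h = 12)

# Same forecasting machinery, Random Forest as the underlying learner
fit_rf <- ahead::dynrmf(AirPassengers, h = 12,
                        fit_func = randomForest::randomForest,
                        predict_func = predict)

plot(fit_rf)
```

Any fitting function with a matching predict method should slot in the same way.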

Code availability

The full R implementation is available below. The analysis uses parallel processing for efficiency on the large competition datasets.

library(ahead)
library(Mcomp)
library(Tcomp)
library(foreach)
library(doParallel)
library(doSNOW)

setwd("~/Documents/Papers/to_submit/2026-01-01-dynrmf-vs-Theta-on-M1-M3-Tourism")

# 1. M3 comp. results --------------

data(M3)

n <- length(Mcomp::M3)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                 .combine = rbind, .verbose = TRUE)%dopar%{
  series <- M3[[i]]
  train <- series$x
  test <- series$xx
  fit <- forecast::thetaf(
    y = train,
    h = length(test))
  utils::setTxtProgressBar(pb, i)
  forecast::accuracy(fit)
}
close(pb)

print(metrics_theta)

# Dynrmf -----------


metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- M3[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- ahead::dynrmf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

M3_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(M3_results) <- c("statistic", "p-value")



# 2. M1 comp. results -----------------

data(M1)

n <- length(Mcomp::M1)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- M1[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- forecast::thetaf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_theta)

# Dynrmf -----------

metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                   .combine = rbind, .verbose = TRUE)%dopar%{
                                     series <- M1[[i]]
                                     train <- series$x
                                     test <- series$xx
                                     fit <- ahead::dynrmf(
                                       y = train,
                                       h = length(test))
                                     utils::setTxtProgressBar(pb, i)
                                     forecast::accuracy(fit)
                                   }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

M1_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(M1_results) <- c("statistic", "p-value")


# 3. Tourism comp. results -----------------

data(tourism)

n <- length(Tcomp::tourism)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- tourism[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- forecast::thetaf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_theta)

# Dynrmf -----------

metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                   .combine = rbind, .verbose = TRUE)%dopar%{
                                     series <- tourism[[i]]
                                     train <- series$x
                                     test <- series$xx
                                     fit <- ahead::dynrmf(
                                       y = train,
                                       h = length(test))
                                     utils::setTxtProgressBar(pb, i)
                                     forecast::accuracy(fit)
                                   }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

diff_metrics <- diff_metrics[is.finite(diff_metrics[, 4]), ]

Tourism_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(Tourism_results) <- c("statistic", "p-value")

# # Theta - dynrmf (default parameters): raw kable output
# 
# ```R
# kableExtra::kable(M3_results)
# kableExtra::kable(M1_results)
# kableExtra::kable(Tourism_results)
# ```
# 
# M3:
# |          |       ME|     RMSE|      MAE|       MPE|      MAPE|    MASE|       ACF1|
# |:---------|--------:|--------:|--------:|---------:|---------:|-------:|----------:|
# |statistic | 16.96265| 10.16816| 7.652445| -5.586289| 1.2900974| 14.0924| -1.3680305|
# |p-value   |  0.00000|  0.00000| 0.000000|  0.000000| 0.1971162|  0.0000|  0.1714049|
# 
# M1:
# |          |        ME|      RMSE|       MAE|        MPE|      MAPE|       MASE|      ACF1|
# |:---------|---------:|---------:|---------:|----------:|---------:|----------:|---------:|
# |statistic | 2.5575962| 2.9406984| 2.2815166| -0.6129693| 0.3808239| -0.7378457| -6.497644|
# |p-value   | 0.0106864| 0.0033502| 0.0227274|  0.5400360| 0.7034148|  0.4607813|  0.000000|
# 
# Tourism:
# |          |        ME|      RMSE|       MAE|       MPE|       MAPE|       MASE|      ACF1|
# |:---------|---------:|---------:|---------:|---------:|----------:|----------:|---------:|
# |statistic | 1.6239829| 1.8845405| 1.1628106| 4.5207962| -5.0276196| -5.1743478| -8.648504|
# |p-value   | 0.1046342| 0.0597262| 0.2451306| 0.0000068|  0.0000006|  0.0000003|  0.000000|

What forecasting methods have you found most reliable in your work? Share your experiences in the comments below.

Sign and share this petition https://www.change.org/stop_torturing_T_Moudiki – after seriously researching my background and contributions to the field.


