Make reports and tutorials with generative AI from R | R-Bloggers

[This article was first published on Bluecology blog, and kindly contributed to R-bloggers].


Various AI model providers have integrated web-search options into their large language models. I tried to use these features through the R ellmer package. However, changes in LLM APIs are so frequent that ellmer does not keep track of them all.

I got ellmer to call Perplexity's Sonar web-search model, but it did not give me the references, which are essential.

If you haven't seen AI-generated reports before, view the example at the end. It is a useful way to get a quick literature overview or to make customized R tutorials.

Here I just share a few simple R scripts that you can use to run AI web searches and generate reports with the OpenRouter service.

I have just made two functions (with the help of Copilot, of course). One calls the OpenRouter API to send a request to a model. The second function processes the output (which comes back as JSON) to make a nice QMD, with the references hyperlinked (make sure you check a URL before clicking on it; who knows what the AI will cite!). From there you can render the QMD to get a PDF/Word/HTML report.

The functions are easy to use. First, download or copy these functions from my GitHub.

NOTE: NEVER blindly trust code from someone else that sends requests to LLMs! It may contain harmful instructions. I recommend reading all code that sends instructions to LLMs, just to make sure you know what it is doing.

Use the code to perform a search and make a report

Once you have my two functions, you need to set your OpenRouter API key and save it somewhere (for example, you can use usethis::edit_r_environ() and save it as OPENROUTER_API_KEY = "my-key-here").
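
As a minimal sketch of that setup (the Sys.setenv call here just simulates what .Renviron does, so the example is self-contained):

```r
# One-time setup: open your user-level .Renviron for editing with
# usethis::edit_r_environ(), then add a line such as:
#   OPENROUTER_API_KEY=my-key-here

# Simulate the environment variable for this self-contained example
Sys.setenv(OPENROUTER_API_KEY = "my-key-here")

# In a fresh session, read the key back and fail early if it is missing
openrouter_api_key <- Sys.getenv("OPENROUTER_API_KEY")
if (!nzchar(openrouter_api_key)) {
  stop("OPENROUTER_API_KEY not set; see usethis::edit_r_environ()")
}
```

Keeping the key in .Renviron means it never appears in your scripts, which matters if you share them.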

Here is an example of using the function:

library(httr)
library(jsonlite)

source("perplexity-search-functions.R")

openrouter_api_key <- Sys.getenv("OPENROUTER_API_KEY")

user_message <- "I want to learn how to use the NIMBLE package to fit autoregressive time-series models"

system_message <- "You are a helpful AI agent who creates statistical analysis tutorials in R. 
        Rules: 
        1. Include text and examples of code in your responses. 
        2. Produce reports that are less than 10000 words."

# Send request to OpenRouter
response <- call_openrouter_api(
  openrouter_api_key,
  model = "perplexity/sonar-deep-research",
  system_message = system_message,
  user_message,
  search_context_size = "medium"
  # Options: "low", "medium", "high"
)

#Save the response as a qmd
save_response_as_qmd(response, "results/AR-models-in-NIMBLE.qmd")

Inputs for the LLM

The user message is your search prompt. The system message sets the scope for how the report is made. Note that everything relevant to the web search goes in the user message, not the system message. See Perplexity's guidelines for more information about prompting; it is different from regular LLMs.

Another idea for a system prompt could be, for example:

system_message <- "You are a helpful AI agent who creates summary reports of the scientific literature. 
        Rules: 
        1. Produce reports that are less than 2000 words.
        2. Include a Summary section that summarizes key research trends. "

user_message <- "What are the impacts of climate warming on fish physiology documented in the peer-reviewed academic literature"

search_context_size is intended to determine how much search effort is used. It is hard to say whether it affects the results or not; see OpenRouter's docs for more information.
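
For reference, my understanding (an assumption; check OpenRouter's docs for the authoritative request shape) is that search_context_size travels inside a web_search_options object in the request body, which in R is just a nested list before JSON conversion:

```r
# Sketch of a request body as a nested R list; jsonlite::toJSON(body,
# auto_unbox = TRUE) would turn this into JSON. Field names follow
# OpenRouter's docs as I understand them and should be verified there.
body <- list(
  model = "perplexity/sonar",
  messages = list(
    list(role = "system", content = "You are a helpful AI agent."),
    list(role = "user",   content = "Impacts of warming on fish physiology?")
  ),
  web_search_options = list(
    search_context_size = "medium"  # "low", "medium", or "high"
  )
)
```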

Model choices

Other models to try are:

  • perplexity/sonar for simpler, cheaper searches, including citations.

  • perplexity/sonar-deep-research for deeper, more expensive searches with citations and reasoning.

  • openai/o4-mini is another option, but it does not return any citations.

Explore the OpenRouter site for other web-search-enabled LLMs.

Customization

The call_openrouter_api.R function is a template for customizing, not a comprehensive framework for using the OpenRouter API. It is actually very easy to connect to LLMs from R (although most examples online are in Python or TypeScript). Here is a basic template:

library(httr)
library(jsonlite)
response <- POST(
    url = "https://openrouter.ai/api/v1/chat/completions",
    add_headers(
      "Content-Type" = "application/json",
      "Authorization" = paste("Bearer", openrouter_api_key)
    ),
    body = toJSON(list(
      model = model,
      messages = list(
        list(
          role = "system",
          content = system_message
        ),  
        list(
          role = "user",
          content = user_message
        )
      )
    ), auto_unbox = TRUE),
    encode = "raw"
  )

The trick is then formatting the output, which is what my other function does (it focuses on Perplexity models).
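
As a sketch of what that formatting involves: OpenRouter returns OpenAI-compatible JSON, and (as far as I can tell) Perplexity models attach their sources in a citations field. The mock response below is illustrative, not the exact payload:

```r
# Mock of a parsed OpenRouter response, i.e. what you would get from
# httr::content(response, "parsed"). The structure is an assumption
# based on OpenRouter's OpenAI-compatible schema.
parsed <- list(
  choices = list(
    list(message = list(content = "Biomass dynamics follow... [1]"))
  ),
  citations = list("https://r-nimble.org/")
)

# Pull out the report text and the source URLs
report_text <- parsed$choices[[1]]$message$content
sources     <- unlist(parsed$citations)

# A QMD writer can then append the sources as a numbered reference list
ref_lines <- sprintf("%d. <%s>", seq_along(sources), sources)
```

From there it is a matter of writing a YAML header, the report text, and the reference lines to a .qmd file with writeLines().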

Example report

Below is an example. I used the system prompt above and asked how to fit surplus production models (for fisheries science) with the nimble package. Note a small bug with my function that converts the QMD references to hyperlinks: it potentially also converts R subscripts. Something to work on later. Here is the AI report:

You can fit a surplus production model (SPM) with the nimble package in R by writing the model in the BUGS language, which NIMBLE extends, specifying your likelihood (for example, production dynamics with catch data) and priors for the parameters, and then using NIMBLE's MCMC options to estimate the posterior distributions of the parameters. Here is a step-by-step approach with sample code:

  1. Specify the surplus production model in BUGS code:
    This usually models biomass (B_t) over time according to a logistic growth or Gompertz model minus catches (C_t). For example, the Schaefer model can be expressed as:
    [ B_{t+1} = B_t + r B_t \left(1 - \frac{B_t}{K}\right) - C_t + \epsilon_t ]
    where (r) is the intrinsic growth rate, (K) is the carrying capacity, and (\epsilon_t) models process noise.
library(nimble)

# Define the model in BUGS syntax
spm_code <- nimbleCode({
  # Priors for parameters
  r ~ dunif(0, 2)           # growth rate
  K ~ dunif(maxCatch, 10 * maxCatch)  # carrying capacity (expand as appropriate)
  sigma ~ dunif(0, 5)       # process noise SD

  B[1] ~ dunif(0, K)       # initial biomass
  
  for(t in 1:(nYears - 1)) {
    mu[t] <- B[t] + r * B[t] * (1 - B[t]/K) - catch[t]
    B[t + 1] ~ dnorm(mu[t], sd = sigma)
  }
})
  2. Prepare the data and constants (e.g., the catch time series and number of years):
data <- list(catch = catch_vector)  # your observed catches
constants <- list(nYears = length(catch_vector), maxCatch = max(catch_vector))
inits <- list(r = 0.5, K = max(catch_vector)*5, sigma = 0.1, B = rep(NA, length(catch_vector)))
  3. Build, compile, and run the model with nimble:
# Create model object
spm_model <- nimbleModel(spm_code, data = data, inits = inits, constants = constants)

# Compile model
c_spm_model <- compileNimble(spm_model)

# Configure MCMC; default samplers are usually fine
mcmc_conf <- configureMCMC(spm_model)
mcmc <- buildMCMC(mcmc_conf)
c_mcmc <- compileNimble(mcmc, project = spm_model)

# Run MCMC sampling
samples <- runMCMC(c_mcmc, niter = 10000, nburnin = 2000, thin = 5)

# Check results
summary(samples)

This workflow follows NIMBLE’s general approach of specifying hierarchical Bayesian models in an extended BUGS language, compiling fast C++ versions of the model and the samplers, and running MCMC algorithms[1][2][4][7]. The key is coding the surplus production dynamics explicitly as state transition equations, with observation or process error as necessary.

To be clear, here is a minimal example with dummy catch data:

library(nimble)

catch_vector <- c(100, 120, 90, 80, 110) # example catches

spm_code <- nimbleCode({
  r ~ dunif(0, 2)
  K ~ dunif(maxCatch, 10 * maxCatch)
  sigma ~ dunif(0, 5)

  B[1] ~ dunif(0, K)
  for(t in 1:(nYears - 1)) {
    mu[t] <- B[t] + r * B[t] * (1 - B[t]/K) - catch[t]
    B[t+1] ~ dnorm(mu[t], sd = sigma)
  }
})

data <- list(catch = catch_vector)
constants <- list(nYears = length(catch_vector), maxCatch = max(catch_vector))
inits <- list(r = 0.5, K = max(catch_vector)*5, sigma = 0.1, B = rep(NA, length(catch_vector)))

spm_model <- nimbleModel(spm_code, data = data, inits = inits, constants = constants)
c_spm_model <- compileNimble(spm_model)

mcmc_conf <- configureMCMC(spm_model)
mcmc <- buildMCMC(mcmc_conf)
c_mcmc <- compileNimble(mcmc, project = spm_model)

samples <- runMCMC(c_mcmc, niter = 5000, nburnin = 1000, thin = 2)
print(summary(samples))
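
To summarize the posterior draws, you can work directly on the samples matrix that runMCMC returns using base R alone (illustrated here with simulated draws so the example stands on its own):

```r
# Simulated stand-in for the matrix returned by runMCMC: one column per
# monitored parameter, one row per retained MCMC iteration.
set.seed(1)
samples <- cbind(
  r     = rnorm(2000, 0.5, 0.05),
  K     = rnorm(2000, 600, 50),
  sigma = rnorm(2000, 0.1, 0.01)
)

# Posterior means and 95% credible intervals for each parameter
post_mean <- colMeans(samples)
post_ci   <- apply(samples, 2, quantile, probs = c(0.025, 0.975))
round(rbind(mean = post_mean, post_ci), 3)
```

The same two lines work unchanged on the real samples object from the NIMBLE run above.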

This example models biomass dynamics probabilistically and fits the parameters (r), (K), and the noise level (\sigma) to known catches using Bayesian inference.

If you want to implement maximum likelihood fitting instead, you can compile the model with NIMBLE and evaluate the likelihood for optimization, as demonstrated in tutorials[1].

In summary:

  • Write your surplus production model in NIMBLE’s BUGS language, defining the biomass dynamics, catches, and priors for parameters
  • Compile and run MCMC sampling in nimble to fit the model
  • Extract posterior samples for growth rate, carrying capacity, biomass, and uncertainty
  • Optionally, use likelihood-based methods by compiling the model and defining your own optimizer function in R[1][7].

This approach uses the power of NIMBLE for hierarchical Bayesian time-series stock assessment models, including surplus production models.

References

  1. oliviergimenez.github.io/banana book/intronimble.html
  2. www.rdocumentation.org/packages/nimble/versions/1.3.0/topics/nimble-package
  3. www.youtube.com/watch
  4. github.com/nimble-dev/nimble
  5. r-n-n-nocumentation-2
  6. cran.r-project.org/packages/nimblecarbon/vignettes/nimble_carbon_vignette.html
  7. r-nimble.org/
  8. www.rdocumentation.org/packages/nimble/versions/1.3.0

