Predict concentrations (and efficiencies, if applicable), absolute abundances, and relative abundances based on the posterior distributions from a previously fitted model resulting from a call to run_paramedic.

predict_paramedic(
  W,
  V,
  X = V[, 1, drop = FALSE],
  beta_0,
  beta_1 = array(0, dim = c(nrow(beta_0), ncol(X) - 1, ncol(W) - 1)),
  Sigma,
  sigma_e = matrix(0, nrow = nrow(Sigma), ncol = ncol(Sigma)),
  phi = matrix(1, nrow = nrow(Sigma), ncol = nrow(X)),
  k = 0,
  n_iter = 10500,
  n_burnin = 10000,
  n_chains = 4,
  stan_seed = 4747,
  alpha_sigma = 2,
  kappa_sigma = 1,
  alpha_phi = 0,
  beta_phi = 0,
  sigma_xi = 1,
  ...
)

Arguments

W

The new relative abundance data, e.g., from broad range 16S sequencing with "universal" primers. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W and V, and the column must have the same name in W and V.

V

The new absolute abundance data, e.g., from taxon-specific absolute primers. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W and V, and the column must have the same name in W and V.

X

The new covariate data. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W, V, and X, and the column must have the same name in W, V, and X. If X only consists of the subject identifiers, then no covariates are used.

beta_0

The posterior distribution on \(\beta_0\) (the intercept in the distribution on concentrations \(\mu\)), resulting from a call to run_paramedic.

beta_1

The posterior distribution on \(\beta_1\) (the slope in the distribution on concentrations \(\mu\); only used if X consists of more than just sample identifiers), resulting from a call to run_paramedic.

Sigma

The posterior distribution on \(\Sigma\) (the variance in the distribution on concentrations \(\mu\)).

sigma_e

The posterior distribution on \(\sigma_e\) (the variance of the efficiencies); only used if both alpha_sigma and kappa_sigma below are nonzero.

phi

The posterior distribution on \(\phi\) (the overdispersion parameter on V); only used if both alpha_phi and beta_phi below are nonzero.

k

the number of batches that the relative abundance data W were analyzed in. If k = 0 (the default), then batch effects are not considered. (currently not used)

n_iter

The total number of iterations per chain to be run by the Stan algorithm. Defaults to 10500.

n_burnin

The total number of warmups per chain to be run by the Stan algorithm. Defaults to 10000.

n_chains

The total number of chains to be run by the Stan algorithm. Defaults to 4.

stan_seed

The random number seed to initialize.

alpha_sigma

Hyperparameter specifying the shape parameter of the prior distribution on \(\sigma_e\). Defaults to 2.

kappa_sigma

Hyperparameter specifying the scale parameter of the prior distribution on \(\sigma_e\). Defaults to 1.

alpha_phi

Hyperparameter specifying the shape parameter of the prior distribution on \(\phi\). Defaults to 0; a negative binomial model can be specified if both alpha_phi and beta_phi are nonzero.

beta_phi

Hyperparameter specifying the rate parameter of the prior distribution on \(\phi\). Defaults to 0; a negative binomial model can be specified if both alpha_phi and beta_phi are nonzero.

sigma_xi

Hyperparameters specifying the variance of efficiencies over batches. Only used if k is greater than zero. Defaults to 1. (currently not used)

...

other arguments to pass to sampling (e.g., control).

Value

An object of class stanfit.

Details

Using the posterior distributions from a call to run_paramedic, generate predicted values.

See also

stan and sampling for specific usage of the stan and sampling functions; run_paramedic for usage of the run_paramedic function.

Examples

# load the package, read in example data
library("paramedic")
data(example_16S_data)
data(example_qPCR_data)
# generate a train/test split
set.seed(1234)
folds <- sample(1:2, size = nrow(example_16S_data), replace = TRUE)

# run paramedic (with an extremely small number of iterations, for illustration only)
# on only the first 10 taxa, and on a subset of the data
mod <- run_paramedic(W = example_16S_data[folds == 1, 1:10], 
V = example_qPCR_data[folds == 1, ], n_iter = 30, n_burnin = 25, 
n_chains = 1, stan_seed = 4747)
#> 
#> SAMPLING FOR MODEL 'variable_efficiency' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 5.3e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.53 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: WARNING: There aren't enough warmup iterations to fit the
#> Chain 1:          three stages of adaptation as currently configured.
#> Chain 1:          Reducing each adaptation stage to 15%/75%/10% of
#> Chain 1:          the given number of warmup iterations:
#> Chain 1:            init_buffer = 3
#> Chain 1:            adapt_window = 20
#> Chain 1:            term_buffer = 2
#> Chain 1: 
#> Chain 1: Iteration:  1 / 30 [  3%]  (Warmup)
#> Chain 1: Iteration:  3 / 30 [ 10%]  (Warmup)
#> Chain 1: Iteration:  6 / 30 [ 20%]  (Warmup)
#> Chain 1: Iteration:  9 / 30 [ 30%]  (Warmup)
#> Chain 1: Iteration: 12 / 30 [ 40%]  (Warmup)
#> Chain 1: Iteration: 15 / 30 [ 50%]  (Warmup)
#> Chain 1: Iteration: 18 / 30 [ 60%]  (Warmup)
#> Chain 1: Iteration: 21 / 30 [ 70%]  (Warmup)
#> Chain 1: Iteration: 24 / 30 [ 80%]  (Warmup)
#> Chain 1: Iteration: 26 / 30 [ 86%]  (Sampling)
#> Chain 1: Iteration: 28 / 30 [ 93%]  (Sampling)
#> Chain 1: Iteration: 30 / 30 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.009 seconds (Warm-up)
#> Chain 1:                0.002 seconds (Sampling)
#> Chain 1:                0.011 seconds (Total)
#> Chain 1: 
#> Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 1.9, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat

# generate predictions on test data
ext_mod <- rstan::extract(mod)
#> Error in (function (classes, fdef, mtable) {    methods <- .findInheritedMethods(classes, fdef, mtable)    if (length(methods) == 1L)         return(methods[[1L]])    else if (length(methods) == 0L) {        cnames <- paste0("\"", vapply(classes, as.character,             ""), "\"", collapse = ", ")        stop(gettextf("unable to find an inherited method for function %s for signature %s",             sQuote(fdef@generic), sQuote(cnames)), domain = NA)    }    else stop("Internal error in finding inherited methods; didn't return a unique method",         domain = NA)})(list("paramedic"), new("nonstandardGenericFunction", .Data = function (object,     ...) {    standardGeneric("extract")}, generic = structure("extract", package = "rstan"), package = "rstan",     group = list(), valueClass = character(0), signature = "object",     default = NULL, skeleton = (function (object, ...)     stop(gettextf("invalid call in method dispatch to '%s' (no default method)",         "extract"), domain = NA))(object, ...)), <environment>): unable to find an inherited method for function 'extract' for signature '"paramedic"'
beta0_post <- ext_mod$beta_0
#> Error in eval(expr, envir, enclos): object 'ext_mod' not found
sigma_post <- ext_mod$Sigma
#> Error in eval(expr, envir, enclos): object 'ext_mod' not found
sigmae_post <- ext_mod$sigma_e
#> Error in eval(expr, envir, enclos): object 'ext_mod' not found
pred_mod <- predict_paramedic(W = example_16S_data[folds == 2, 1:10], 
V = example_qPCR_data[folds == 2, ], n_iter = 75, n_burnin = 25,
 beta_0 = beta0_post, Sigma = sigma_post, sigma_e = sigmae_post, 
 n_chains = 1, stan_seed = 1234) 
#> Error in eval(expr, envir, enclos): object 'sigma_post' not found