Estimate concentrations (and efficiencies, if applicable) by combining absolute and relative abundance data.

run_paramedic(
  W,
  V,
  X = V[, 1, drop = FALSE],
  k = 0,
  n_iter = 10500,
  n_burnin = 10000,
  n_chains = 4,
  stan_seed = 4747,
  centered = FALSE,
  inits_lst = NULL,
  sigma_beta = sqrt(50),
  sigma_Sigma = sqrt(50),
  alpha_sigma = 2,
  kappa_sigma = 1,
  alpha_phi = 0,
  beta_phi = 0,
  sigma_xi = 1,
  ...
)

Arguments

W

The relative abundance data, e.g., from broad range 16S sequencing with "universal" primers. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W and V, and the column must have the same name in W and V.

V

The absolute abundance data, e.g., from taxon-specific absolute primers. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W and V, and the column must have the same name in W and V.

X

The covariate data. Expects data (e.g., matrix, data.frame, tibble) with sample identifiers in the first column. Sample identifiers must be the same between W, V, and X, and the column must have the same name in W, V, and X. If X only consists of the subject identifiers, then no covariates are used.

k

the number of batches that the relative abundance data W were analyzed in. If k = 0 (the default), then batch effects are not considered.

n_iter

The total number of iterations per chain to be run by the Stan algorithm. Defaults to 10500.

n_burnin

The total number of warmups per chain to be run by the Stan algorithm. Defaults to 10000.

n_chains

The total number of chains to be run by the Stan algorithm. Defaults to 4.

stan_seed

The random number seed to initialize.

centered

If TRUE, uses a centered parameterization on the true concentrations \(\mu\); if FALSE (the default), uses a non-centered parameterization.

inits_lst

An optional list of initial values of the parameters. Must be a named list; see stan.

sigma_beta

Hyperparameter specifying the prior variance on \(\beta_0\). Defaults to \(\sqrt{50}\).

sigma_Sigma

Hyperparameter specifying the prior variance on \(\Sigma\). Defaults to \(\sqrt{50}\).

alpha_sigma

Hyperparameter specifying the shape parameter of the prior distribution on \(\sigma_e\). Defaults to 2.

kappa_sigma

Hyperparameter specifying the scale parameter of the prior distribution on \(\sigma_e\). Defaults to 1.

alpha_phi

Hyperparameter specifying the shape parameter of the prior distribution on \(\phi\). Defaults to 0; a negative binomial model can be specified if both alpha_phi and beta_phi are nonzero.

beta_phi

Hyperparameter specifying the rate parameter of the prior distribution on \(\phi\). Defaults to 0; a negative binomial model can be specified if both alpha_phi and beta_phi are nonzero.

sigma_xi

Hyperparameters specifying the variance of efficiencies over batches. Only used if k is greater than zero. Defaults to 1.

...

other arguments to pass to sampling (e.g., control).

Value

An object of class stanfit.

Details

We fit a hierarchical model in Stan to the data, with goal to estimate true concentration for all taxa. There are two available hierarchical models. The two only differ in the way that the true concentration is parameterized. While these hierarchical models are mathematically identical, they are fit differently by Stan. The first hierarchical model is a "centered" model, where $$\beta_0 ~ N(0, \sqrt{50})$$ $$\log \Sigma ~ N(0, \sqrt{50})$$ $$\log \mu ~ N(\beta_0 + X\beta_1, \Sigma).$$ We call this model variable_efficiency_centered.stan. The second hierarchical model is a "noncentered" model, where $$\beta_0 ~ N(0, \sqrt{50})$$ $$\log \Sigma ~ N(0, \sqrt{50})$$ $$\log \gamma ~ N(0, I)$$ $$\log \mu = \sqrt{\Sigma} \log \gamma + \beta_0 + X\beta_1.$$ We call this model variable_efficiency.stan. In most cases, we suggest using the noncentered model. However, if the model does not converge in a reasonable amount of time using the noncentered model, consider trying the centered model.

See also

stan and sampling for specific usage of the stan and sampling functions.

Examples

# load the package, read in example data library("paramedic") data(example_16S_data) data(example_qPCR_data) # run paramedic (with an extremely small number of iterations, for illustration only) # on only the first 10 taxa mod <- run_paramedic(W = example_16S_data[, 1:10], V = example_qPCR_data, n_iter = 30, n_burnin = 25, n_chains = 1, stan_seed = 4747)
#> #> SAMPLING FOR MODEL 'variable_efficiency' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 9e-05 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.9 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: WARNING: There aren't enough warmup iterations to fit the #> Chain 1: three stages of adaptation as currently configured. #> Chain 1: Reducing each adaptation stage to 15%/75%/10% of #> Chain 1: the given number of warmup iterations: #> Chain 1: init_buffer = 3 #> Chain 1: adapt_window = 20 #> Chain 1: term_buffer = 2 #> Chain 1: #> Chain 1: Iteration: 1 / 30 [ 3%] (Warmup) #> Chain 1: Iteration: 3 / 30 [ 10%] (Warmup) #> Chain 1: Iteration: 6 / 30 [ 20%] (Warmup) #> Chain 1: Iteration: 9 / 30 [ 30%] (Warmup) #> Chain 1: Iteration: 12 / 30 [ 40%] (Warmup) #> Chain 1: Iteration: 15 / 30 [ 50%] (Warmup) #> Chain 1: Iteration: 18 / 30 [ 60%] (Warmup) #> Chain 1: Iteration: 21 / 30 [ 70%] (Warmup) #> Chain 1: Iteration: 24 / 30 [ 80%] (Warmup) #> Chain 1: Iteration: 26 / 30 [ 86%] (Sampling) #> Chain 1: Iteration: 28 / 30 [ 93%] (Sampling) #> Chain 1: Iteration: 30 / 30 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.008344 seconds (Warm-up) #> Chain 1: 0.001509 seconds (Sampling) #> Chain 1: 0.009853 seconds (Total) #> Chain 1:
#> Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See #> http://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 2.5, indicating chains have not mixed. #> Running the chains for more iterations may help. See #> http://mc-stan.org/misc/warnings.html#r-hat