fastEmuFit: a fast approximation to emuFit from radEmu.
fastEmuFit(
  reference_set = "data_driven",
  reference_set_size = 30,
  Y,
  X = NULL,
  formula = NULL,
  data = NULL,
  test_kj = NULL,
  cluster = NULL,
  penalize = TRUE,
  B = NULL,
  fitted_model = NULL,
  refit = TRUE,
  fastEmu_refit = FALSE,
  return_wald_p = FALSE,
  compute_cis = TRUE,
  run_score_tests = TRUE,
  verbose = FALSE,
  ...
)

reference_set: The reference set to use in the identifiability constraint.
The user can supply a reference set as a vector of column indices of the Y
matrix, or as a vector of names matching column names of the Y matrix. If no
reference set is provided, this defaults to "data_driven", and fastEmuFit
identifies a reference set of typical taxa of size reference_set_size. If set
to "data_driven_ss" or "data_driven_thin", a data-driven reference set is
determined using sample splitting or Poisson thinning, respectively. The
reference set can be either a single object or a list of length p containing
one reference set for each row of the beta matrix.
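As a rough illustration of the reference-set forms described above, the sketch below builds a user-chosen reference set by index, by name, and as a per-row list, then passes one of them to fastEmuFit. The simulated data, the object names (`ref_by_index`, `ref_by_name`, `ref_per_row`), and the choice of taxa are all assumptions made for the example; the model calls run only if the fastEmu package is installed.

```r
set.seed(1)
n <- 20   # samples
J <- 50   # taxa

# Simulated counts, for illustration only
Y <- matrix(rpois(n * J, lambda = 5), nrow = n,
            dimnames = list(paste0("sample", seq_len(n)),
                            paste0("taxon", seq_len(J))))
dat <- data.frame(group = rep(c(0, 1), each = n / 2),
                  row.names = rownames(Y))

ref_by_index <- c(1, 5, 10)                 # column indices of Y
ref_by_name  <- colnames(Y)[ref_by_index]   # the same taxa, by name
# One reference set per row of the beta matrix (here p = 2)
ref_per_row  <- list(ref_by_index, c(2, 4, 6, 8))

if (requireNamespace("fastEmu", quietly = TRUE)) {
  # User-supplied reference set, given by taxon name
  fit_user <- fastEmu::fastEmuFit(reference_set = ref_by_name,
                                  Y = Y, formula = ~ group, data = dat,
                                  run_score_tests = FALSE)
  # Data-driven reference set of 30 taxa (the documented default)
  fit_auto <- fastEmu::fastEmuFit(reference_set = "data_driven",
                                  reference_set_size = 30,
                                  Y = Y, formula = ~ group, data = dat,
                                  run_score_tests = FALSE)
}
```

Setting run_score_tests = FALSE here keeps the example to the estimation step only.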
reference_set_size: The size of the reference set when it is data-driven.
Default is 30. We recommend a reference set of size 30-100 for the best balance
of computational efficiency and estimation precision.
Y: an n x J matrix or data frame of nonnegative observations, or a phyloseq
or TreeSummarizedExperiment object containing an OTU table and sample data.
X: an n x p matrix or data frame of covariates (optional; provide either X
or both formula and data).
formula: a one-sided formula specifying the form of the mean model to be fit.
data: an n x p data frame containing the variables given in formula.
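The relationship between X and the formula/data pair can be sketched in base R. The example assumes the one-sided formula is expanded into a design matrix in the usual R way (via `model.matrix`); the variable names `group` and `age` are made up for illustration.

```r
# Hypothetical sample data: six samples, one factor and one numeric covariate
dat <- data.frame(
  group = factor(rep(c("control", "case"), each = 3),
                 levels = c("control", "case")),
  age   = c(31, 45, 52, 38, 60, 27)
)

# A one-sided formula: no response on the left-hand side
form <- ~ group + age

# The n x p design matrix implied by the formula and data
X <- model.matrix(form, data = dat)

dim(X)        # n = 6 rows, p = 3 columns
colnames(X)   # intercept, group indicator, age
```

Under that assumption, supplying `X = X` or supplying `formula = form, data = dat` describes the same mean model, which is why only one of the two needs to be given.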
test_kj: a data frame whose rows give coordinates (in category j and
covariate k) of elements of B to construct hypothesis tests for. If test_kj
is not provided, all elements of B except the intercept row will be tested.
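A test_kj data frame might be built as in the sketch below. The column names `k` and `j` follow the coordinates named in the description and the convention used in radEmu examples, which is an assumption here; the simulated data and the chosen taxa are illustrative only, and the model call runs only if fastEmu is installed.

```r
set.seed(2)
n <- 20
J <- 40

# Test the second row of B (the first non-intercept covariate) in three taxa
test_kj <- data.frame(k = 2, j = c(1, 3, 7))

# Or test that covariate in every category
test_all <- expand.grid(k = 2, j = seq_len(J))

if (requireNamespace("fastEmu", quietly = TRUE)) {
  # Simulated counts, for illustration only
  Y <- matrix(rpois(n * J, lambda = 5), nrow = n,
              dimnames = list(paste0("sample", seq_len(n)),
                              paste0("taxon", seq_len(J))))
  dat <- data.frame(group = rep(c(0, 1), each = n / 2),
                    row.names = rownames(Y))
  fit <- fastEmu::fastEmuFit(Y = Y, formula = ~ group, data = dat,
                             test_kj = test_kj)
}
```

Restricting test_kj to the taxa of interest keeps the number of score tests small.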
cluster: a numeric vector giving cluster membership for each row of Y, to be
used in computing GEE test statistics. Default is NULL, in which case rows of Y
are treated as independent.
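Since cluster must be numeric, sample-level identifiers stored as text need converting first. One way to do that in base R is sketched below; the `subject_id` values are hypothetical, standing in for repeated measurements on three subjects.

```r
# Hypothetical subject labels, one per row of Y (two samples per subject)
subject_id <- c("s01", "s01", "s02", "s02", "s03", "s03")

# Map each distinct label to an integer code, then to a numeric vector
cluster <- as.numeric(factor(subject_id))

cluster
```

The resulting vector has one entry per row of Y and could then be passed as `cluster = cluster`.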
penalize: logical: should the Firth penalty be used in fitting the model? Default is TRUE.
B: starting value of the coefficient matrix (p x J). If not provided, B will be initialized as a zero matrix.
fitted_model: a fitted model produced by a call to fastEmu::fastEmuFit or
radEmu::emuFit; to be provided if score tests are to be run without refitting the
full unrestricted model. Default is NULL.
refit: logical: if B or fitted_model is provided, should estimation be rerun
in the radEmu estimation step? Default is TRUE.
fastEmu_refit: logical: if a fitted_model produced by a call to
fastEmu::fastEmuFit is provided, should the estimation and reference set steps
be rerun (TRUE) or skipped (FALSE)? Skipping is useful, e.g., if score tests are
to be run on an already fitted fastEmuFit model. Default is FALSE.
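The interplay of fitted_model, refit, and fastEmu_refit suggests a two-stage workflow: estimate once, then run score tests on the stored fit. The sketch below is one plausible reading of the argument descriptions, not a confirmed recipe; the simulated data and the names `est`, `args_tests`, and `tests` are assumptions, and the calls run only if fastEmu is installed.

```r
set.seed(3)
n <- 20
J <- 40

# Simulated counts, for illustration only
Y <- matrix(rpois(n * J, lambda = 5), nrow = n,
            dimnames = list(paste0("sample", seq_len(n)),
                            paste0("taxon", seq_len(J))))
dat <- data.frame(group = rep(c(0, 1), each = n / 2),
                  row.names = rownames(Y))

# Arguments for the second stage: reuse the stored fit, skip re-estimation
args_tests <- list(Y = Y, formula = ~ group, data = dat,
                   refit = FALSE, fastEmu_refit = FALSE,
                   test_kj = data.frame(k = 2, j = 1:2),
                   run_score_tests = TRUE)

if (requireNamespace("fastEmu", quietly = TRUE)) {
  # Stage 1: estimation only
  est <- fastEmu::fastEmuFit(Y = Y, formula = ~ group, data = dat,
                             run_score_tests = FALSE)
  # Stage 2: score tests on the already fitted model
  tests <- do.call(fastEmu::fastEmuFit,
                   c(args_tests, list(fitted_model = est)))
}
```

Splitting the work this way would let the score tests be rerun for different test_kj choices without repeating the estimation step.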
return_wald_p: logical: return p-values from Wald tests? Default is FALSE. These can only be
returned if estimate_full_model is TRUE.
compute_cis: logical: compute and return Wald CIs? Default is TRUE. These can only be
returned if estimate_full_model is TRUE.
run_score_tests: logical: perform robust score testing? Default is TRUE.
verbose: logical: provide updates as the model is being fitted and score tests are run? Default is FALSE.
...: Additional arguments to radEmu::emuFit. See possible arguments with ?radEmu::emuFit.

A list that includes all elements of an emuFit object from radEmu::emuFit(), as
well as additional elements. See ?radEmu::emuFit for a full description of the
elements of an emuFit object. The emuFit object includes the matrix coef, which provides
estimates for all parameters, and score statistics and p-values for all parameters that were tested.

The returned object also includes reference_set and reference_set_names, which give the
categories (taxa) used as the reference set of "typical taxa" for the identifiability
constraint, as column indices of the Y matrix and as category names, respectively.

Other elements of the list correspond to score tests: included_categories gives the
set of categories used in the reduced model for each score test, and score_test_hyperparams provides
the hyperparameters related to estimation under the null hypothesis for each score test. If return_null_B
or return_score_components was set to TRUE, then null_B or score_components
is also returned; these give, respectively, the estimated B values under the null hypothesis and the
components of the robust score test, for each score test.
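To show how the returned list might be inspected, the sketch below defines a small helper that keeps only the rows of coef that were actually tested. The helper name `tested_rows` is hypothetical, and the assumption that coef carries a `pval` column (NA for untested parameters) follows the radEmu convention rather than anything stated above. The `mock_fit` object is a hand-made stand-in with placeholder values, not output from a real fit.

```r
# Keep only the parameters for which a score test p-value is available
tested_rows <- function(fit) {
  coefs <- as.data.frame(fit$coef)
  coefs[!is.na(coefs$pval), , drop = FALSE]
}

# Hand-made stand-in for a fitted object (placeholder values only)
mock_fit <- list(
  coef = data.frame(category = c("taxon1", "taxon2", "taxon3"),
                    estimate = c(0.4, -1.2, 0.1),
                    pval     = c(0.03, NA, 0.60)),
  reference_set       = 2L,
  reference_set_names = "taxon2"
)

tested_rows(mock_fit)            # rows with a p-value
mock_fit$reference_set_names     # taxa used for the identifiability constraint
```

With a real fit, the same helper would be called as `tested_rows(fit)`, and `fit$reference_set` and `fit$included_categories` could be read off in the same way.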