A fast approximation to emuFit from radEmu.
fastEmuFit(
reference_set = "data_driven",
reference_set_size = 30,
Y,
X = NULL,
formula = NULL,
data = NULL,
test_kj = NULL,
cluster = NULL,
penalize = TRUE,
B = NULL,
fitted_model = NULL,
refit = TRUE,
fastEmu_refit = FALSE,
return_wald_p = FALSE,
compute_cis = TRUE,
run_score_tests = TRUE,
verbose = FALSE,
...
)
reference_set
The reference set to use in the identifiability constraint. The user can supply a reference set as a vector of column indices of the Y matrix, or as a vector of names that match column names of the Y matrix. If no reference set is provided, this defaults to "data_driven", and fastEmuFit will identify a reference set of typical taxa of size reference_set_size. If set to "data_driven_ss" or "data_driven_thin", a data-driven reference set will be determined using sample splitting or Poisson thinning, respectively. The reference set can be either a single object or a list of length p, with one entry for each row of the beta matrix.
reference_set_size
The size of the reference set if it is data-driven. Default is 30. We recommend a reference set of size 30 to 100 for the best balance of computational efficiency and estimation precision.
Y
an n x J matrix or dataframe of nonnegative observations, or a phyloseq or TreeSummarizedExperiment object containing an otu table and sample data.
X
an n x p matrix or dataframe of covariates (optional; provide either X, or both formula and data)
formula
a one-sided formula specifying the form of the mean model to be fit
data
an n x p data frame containing variables given in formula
test_kj
a data frame whose rows give the coordinates (category j and covariate k) of the elements of B for which hypothesis tests should be constructed. If test_kj is not provided, all elements of B except the intercept row will be tested.
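A test_kj data frame for a handful of tests might be built as below. This sketch assumes the columns are named `k` and `j` and that the intercept is row 1 of B, so that `k = 2` refers to the first non-intercept covariate; both are assumptions based on the description above and should be checked against ?radEmu::emuFit.

```r
# One row per hypothesis test: k indexes the covariate (row of B),
# j indexes the category (column of Y). Column names are assumed.
test_kj <- data.frame(k = 2, j = c(3, 7, 12))

# Each row is one (k, j) coordinate of B to test
print(test_kj)
```

Restricting test_kj to the categories of interest avoids running a score test for every element of B.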
cluster
a numeric vector giving cluster membership for each row of Y, to be used in computing GEE test statistics. Default is NULL, in which case rows of Y are treated as independent.
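When samples are grouped, for example repeated measurements on the same subject, a numeric cluster vector can be derived from the grouping labels. The `subject_id` values here are hypothetical, and the vector must have one entry per row of Y, in the same order.

```r
# Hypothetical subject labels, one per row of Y
subject_id <- c("a", "a", "b", "b", "c", "c")

# Convert labels to numeric cluster membership: 1 1 2 2 3 3
cluster <- as.numeric(factor(subject_id))
```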
penalize
logical: should the Firth penalty be used in fitting the model? Default is TRUE.
B
starting value of the coefficient matrix (p x J). If not provided, B will be initialized as a zero matrix.
fitted_model
a fitted model produced by a call to fastEmu::fastEmuFit or radEmu::emuFit; to be provided if score tests are to be run without refitting the full unrestricted model. Default is NULL.
refit
logical: if B or fitted_model is provided, should estimation be rerun in the radEmu estimation step? Default is TRUE.
fastEmu_refit
logical: if fitted_model was produced by a call to fastEmu::fastEmuFit, should the estimation and reference set steps be rerun (TRUE) or skipped (FALSE)? Skipping is useful, for example, when score tests are to be run on an already fitted fastEmuFit model. Default is FALSE.
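The interplay of fitted_model, refit, and fastEmu_refit suggests a two-step workflow: estimate once, then run score tests on the stored fit. The sketch below follows that reading of the argument descriptions on simulated data; the `group` covariate and the `k`/`j` column names in test_kj are assumptions, and nothing is fitted unless fastEmu is installed.

```r
set.seed(2)
n <- 20
J <- 40

# Simulated counts and a single binary covariate (both hypothetical)
Y <- matrix(rpois(n * J, lambda = 5), nrow = n, ncol = J,
            dimnames = list(NULL, paste0("taxon", seq_len(J))))
covariates <- data.frame(group = rep(c(0, 1), each = n / 2))

# Elements of B to test in the second step (column names assumed)
test_kj <- data.frame(k = 2, j = 1:2)

if (requireNamespace("fastEmu", quietly = TRUE)) {
  # Step 1: estimation only, no score tests
  fit <- fastEmu::fastEmuFit(Y = Y, formula = ~ group, data = covariates,
                             run_score_tests = FALSE)

  # Step 2: reuse the stored fit and run only the requested score tests
  tests <- fastEmu::fastEmuFit(Y = Y, formula = ~ group, data = covariates,
                               fitted_model = fit,
                               refit = FALSE,
                               fastEmu_refit = FALSE,
                               test_kj = test_kj)
}
```

Separating the two steps means the estimation cost is paid once, even if different sets of tests are requested later.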
return_wald_p
logical: return p-values from Wald tests? Default is FALSE. These can only be returned if estimate_full_model is TRUE.
compute_cis
logical: compute and return Wald CIs? Default is TRUE. These can only be returned if estimate_full_model is TRUE.
run_score_tests
logical: perform robust score testing? Default is TRUE.
verbose
logical: provide updates as the model is being fitted and score tests are run? Default is FALSE.
...
Additional arguments to radEmu::emuFit. See possible arguments with ?radEmu::emuFit.
A list that includes all elements of an emuFit object from radEmu::emuFit(), as well as additional elements. See the documentation in ?radEmu::emuFit for a full description of the elements in an emuFit object. The emuFit object includes the matrix coef, which provides estimates for all parameters, along with score statistics and p-values for all parameters that were tested.

The returned object also includes reference_set and reference_set_names, which identify the categories (taxa) that were used as the reference set of "typical taxa" for the identifiability constraint, as column indices of the Y matrix and as category names respectively. Other elements of the list correspond to score tests: included_categories gives the set of categories used for the reduced model for each score test, and score_test_hyperparams provides the hyperparameters related to estimation under the null hypothesis for each score test. If return_null_B or return_score_components was set to TRUE, then null_B or score_components will also be returned; these give, respectively, the estimated B values under the null hypothesis and the components of the robust score test, for each score test.
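The elements named above can be pulled out of the returned list with ordinary list indexing. The helper `summarize_fit` below is hypothetical (it is not part of fastEmu or radEmu), and `mock_fit` is only a stand-in with made-up contents so the sketch runs without fitting a model; a real fit would be passed in the same way.

```r
# Hypothetical helper: keep only the documented elements that are present
summarize_fit <- function(fit) {
  wanted <- c("coef", "reference_set", "reference_set_names",
              "included_categories", "score_test_hyperparams",
              "null_B", "score_components")
  fit[intersect(wanted, names(fit))]
}

# Stand-in for a fitted object; contents are invented for illustration
mock_fit <- list(
  coef = data.frame(category = "taxon1", estimate = 0.5),
  reference_set = c(2, 4, 6),
  reference_set_names = c("taxon2", "taxon4", "taxon6"),
  unrelated = "ignored"
)

kept <- summarize_fit(mock_fit)
names(kept)
```

Elements such as null_B and score_components appear only when they were requested, which is why the helper filters on the names actually present.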