Produces point estimates, interval estimates, and p values for an arbitrary
functional (mean, geometric mean, proportion, odds, hazard) of a
variable of class `integer`

, or `numeric`

when
regressed on an arbitrary number of covariates. Multiple Partial F-tests can
be specified using the `U`

function.

## Usage

```
regress(
fnctl,
formula,
data,
intercept = TRUE,
weights = rep(1, nrow(data.frame(data))),
subset = rep(TRUE, nrow(data.frame(data))),
robustSE = TRUE,
conf.level = 0.95,
exponentiate = fnctl != "mean",
replaceZeroes,
useFdstn = TRUE,
suppress = FALSE,
na.action,
method = "qr",
qr = TRUE,
singular.ok = TRUE,
contrasts = NULL,
init = NULL,
ties = "efron",
offset,
control = list(...),
...
)
```

## Arguments

- fnctl
a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include

`"mean"`

,`"geometric mean"`

,`"odds"`

,`"rate"`

,`"hazard"`

.- formula
an object of class

`formula`

as might be passed to`lm`

,`glm`

, or`coxph`

. Functions of variables, specified using`dummy`

or`polynomial`

may also be included in`formula`

.- data
a data frame, matrix, or other data structure with matching names to those entered in

`formula`

.- intercept
a logical value indicating whether a intercept exists or not. Default value is

`TRUE`

for all functionals. Intercept may also be removed if a "-1" is present in`formula`

. If "-1" is present in`formula`

but`intercept = TRUE`

is specified, the model will fit without an intercept. Note that when`fnctl = "hazard"`

, the intercept is always set to`FALSE`

because Cox proportional hazards regression models do not explicitly estimate an intercept.- weights
vector indicating optional weights for weighted regression.

- subset
vector indicating a subset to be used for all inference.

- robustSE
a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.

- conf.level
a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.

- exponentiate
a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.

- replaceZeroes
if not

`FALSE`

, this indicates a value to be used in place of zeroes when computing a geometric mean. If`TRUE`

, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used. Defaults to`TRUE`

when`fnctl = "geometric mean"`

. This parameter is always`FALSE`

for all other values of`fnctl`

.- useFdstn
a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as it would be in a linear regression model.

- suppress
if

`TRUE`

, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned - one with the original unexponentiated coefficients, and one with the exponentiated coefficients.- na.action, qr, singular.ok, offset, contrasts, control
optional arguments that are passed to the functionality of

`lm`

or`glm`

.- method
the method to be used in fitting the model. The default value for

`fnctl = "mean"`

and`fnctl = "geometric mean"`

is`"qr"`

, and the default value for`fnctl = "odds"`

and`fnctl = "rate"`

is`"glm.fit"`

. This argument is passed into the lm() or glm() function, respectively. You may optionally specify`method = "model.frame"`

, which returns the model frame and does no fitting.- init
a numeric vector of initial values for the regression parameters for the hazard regression. Default initial value is zero for all variables.

- ties
a character string describing method for breaking ties in hazard regression. Only

`efron`

,`breslow`

, or`exact`

is accepted. See more details in the documentation for this argument in the survival::coxph function. Default to`efron`

.- ...
additional arguments to be passed to the

`lm`

function call

## Value

An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.

## Details

Regression models include linear regression (for the ``mean'' functional), logistic regression with logit link (for the ``odds'' functional), Poisson regression with log link (for the ``rate'' functional), linear regression of a log-transformed outcome (for the ``geometric mean'' functional), and Cox proportional hazards regression (for the hazard functional).

Currently, for the hazard functional, only `coxph` syntax is supported; in other words, using `dummy`, `polynomial`,
and `U`

functions will result in an error when `fnctl = hazard`.

Note that the only possible link function in `regress` with `fnctl = odds"` is the logit link. Similarly, the only possible link function in `regress` with `fnctl = "rate"` is the log link.

Objects created using the
`U`

function can also be passed in. If the
`U`

call involves a partial formula of the form
`~ var1 + var2`

, then `regress`

will return a multiple-partial
F-test involving `var1`

and `var2`

. If an F-statistic will already be
calculated regardless of the `U`

specification,
then any naming convention specified via `name ~ var1`

will be ignored.
The multiple partial tests must be the last terms specified in the model (i.e. no other predictors can
follow them).

## Examples

```
# Loading dataset
data(mri)
# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)
#>
#> Call:
#> regress(fnctl = "mean", formula = atrophy ~ age, data = mri)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -36.870 -8.589 -0.870 7.666 51.203
#>
#> Coefficients:
#> Estimate Naive SE Robust SE 95%L 95%H
#> [1] Intercept -16.06 6.256 6.701 -29.22 -2.907
#> [2] age 0.6980 0.08368 0.09002 0.5213 0.8747
#> F stat df Pr(>F)
#> [1] Intercept 5.75 1 0.0168
#> [2] age 60.12 1 < 0.00005
#>
#> Residual standard error: 12.36 on 733 degrees of freedom
#> Multiple R-squared: 0.08669, Adjusted R-squared: 0.08545
#> F-statistic: 60.12 on 1 and 733 DF, p-value: 2.988e-14
#>
# Linear regression of atrophy on sex and height and their interaction,
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)
#>
#> Call:
#> regress(fnctl = "mean", formula = atrophy ~ height + sex + U(hs = ~height:sex),
#> data = mri)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -28.092 -8.799 -0.935 8.100 48.862
#>
#> Coefficients:
#> Estimate Naive SE Robust SE 95%L 95%H
#> [1] Intercept 47.60 16.28 14.75 18.64 76.57
#> [2] height -0.09272 0.1026 0.09296 -0.2752 0.08978
#> [3] sexMale 36.80 23.93 23.60 -9.543 83.14
#> [4] height:sexMale -0.1690 0.1442 0.1411 -0.4460 0.1079
#> F stat df Pr(>F)
#> [1] Intercept 10.41 1 0.0013
#> [2] height 0.99 1 0.3189
#> [3] sexMale 2.43 1 0.1194
#> [4] height:sexMale 1.44 1 0.2312
#>
#> Residual standard error: 12.51 on 731 degrees of freedom
#> Multiple R-squared: 0.06687, Adjusted R-squared: 0.06304
#> F-statistic: 16.9 on 3 and 731 DF, p-value: 1.274e-10
#>
# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)
#>
#> Call:
#> regress(fnctl = "odds", formula = sex_bin ~ atrophy, data = mri)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -1.636 -1.129 0.741 1.110 1.985
#>
#> Coefficients:
#>
#> Raw Model:
#> Estimate Naive SE Robust SE F stat df Pr(>F)
#> [1] Intercept 1.430 0.2372 0.2421 34.87 1 < 0.00005
#> [2] atrophy -0.03962 6.296e-03 6.504e-03 37.12 1 < 0.00005
#>
#> Transformed Model:
#> e(Est) e(95%L) e(95%H) F stat df Pr(>F)
#> [1] Intercept 4.177 2.597 6.719 34.87 1 < 0.00005
#> [2] atrophy 0.9612 0.9490 0.9735 37.12 1 < 0.00005
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 1018.91 on 734 degrees of freedom
#> Residual deviance: 975.38 on 733 degrees of freedom
#> AIC: 979.38
#>
#> Number of Fisher Scoring iterations: 4
#>
# Cox regression of age on survival
library(survival)
regress("hazard", Surv(obstime, death)~age, data=mri)
#>
#> Call:
#> regress(fnctl = "hazard", formula = Surv(obstime, death) ~ age,
#> data = mri)
#>
#>
#> Coefficients:
#>
#> Raw Model:
#> Estimate Naive SE Robust SE F stat df Pr(>F)
#> [1] age 0.06795 0.01354 0.01412 23.17 1 < 0.00005
#>
#> Transformed Model:
#> e(Est) e(95%L) e(95%H) F stat df Pr(>F)
#> [1] age 1.070 1.041 1.100 23.17 1 < 0.00005
#> n = 735, number of events= 133
#> Overall significance test:
#> Likelihood ratio test= 22.33 on 1 df, p=2.292e-06
#> Wald test = 25.17 on 1 df, p=5.24e-07
#> Score (logrank) test = 25.44 on 1 df, p=4.554e-07
#>
#>
```