Sunday, April 20, 2008

survreg

Just checking that I understand the model being fit by survreg from the survival package:

##Generate n variates from the standard (mu=0, sigma=1)
##Smallest Extreme Value distribution.
rsev <- function(n) log(-log(runif(n)))

##g -- RNG for the underlying standard distribution
##for the log-location-scale family.
##Defaults to rsev to produce Weibull variates.
gen.data <- function(x, beta0=3, beta1=9, sigma=4, g=rsev) {
  n <- length(x)
  exp(beta0 + beta1 * x + sigma * g(n))
}
> library(survival)
> x <- rnorm(10000)
> Time <- gen.data(x=x)
> f <- survreg(Surv(Time, rep(1, 10000)) ~ x)
> f
Call:
survreg(formula = Surv(Time, rep(1, 10000)) ~ x)

Coefficients:
(Intercept)           x
   3.032166    9.037986

Scale= 3.968822

Loglik(model)= -37402.9 Loglik(intercept only)= -46087.6
Chisq= 17369.46 on 1 degrees of freedom, p= 0
n= 10000
> Time <- gen.data(x=x, g=rnorm)
> f <- survreg(Surv(Time, rep(1, 10000)) ~ x, dist="lognormal")
> f
Call:
survreg(formula = Surv(Time, rep(1, 10000)) ~ x, dist = "lognormal")

Coefficients:
(Intercept)           x
   3.018112    8.926638

Scale= 3.995970

Loglik(model)= -58522.6 Loglik(intercept only)= -67530.3
Chisq= 18015.41 on 1 degrees of freedom, p= 0
n= 10000
>
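
As a side check on the claim that rsev produces Weibull variates: if Z is standard SEV, then exp(mu + sigma * Z) should be Weibull with shape = 1/sigma and scale = exp(mu) in the rweibull/qweibull parameterization. Here is a minimal sketch of that check. The seed, sample size, and probability grid are arbitrary choices of mine, not part of the session above.

```r
## Sketch: compare quantiles of exp(mu + sigma * Z), Z ~ standard SEV,
## against the Weibull quantiles with shape = 1/sigma, scale = exp(mu).
rsev <- function(n) log(-log(runif(n)))

set.seed(1)              # arbitrary seed, only for reproducibility
mu <- 3                  # plays the role of beta0 (with x = 0)
sigma <- 4               # same value as the gen.data default
n <- 1e5                 # arbitrary sample size

t.sev <- exp(mu + sigma * rsev(n))

## Empirical vs. theoretical quantiles on an arbitrary probability grid.
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
q.emp <- quantile(t.sev, p, names = FALSE)
q.theo <- qweibull(p, shape = 1 / sigma, scale = exp(mu))

print(cbind(p, q.emp, q.theo))
```

The two quantile columns should agree up to sampling noise, which is fairly large here because sigma = 4 makes the distribution very heavy-tailed.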


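So, it looks like the model being fit is P(T <= t) = F((log(t) - mu) / sigma), where mu = X * beta and F is the CDF of the underlying standard distribution. Equivalently, log(T) = X * beta + sigma * eps with eps ~ F. What's it doing in the case of dist="exponential", "gaussian", or "logistic" then? Oh, you can specify the transformation, e.g., log, via the "trans", "dtrans", and "itrans" fields of the distribution object, where leaving the fields unspecified seems to imply that the identity transformation should be used. (See survreg.distributions. The "exponential" entry there does set a log transformation, with the scale fixed at 1, so it is the Weibull case with sigma = 1; "gaussian" and "logistic" have no transformation.)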
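
To see this in the distribution objects themselves, the following sketch lists which entries of survreg.distributions carry a "trans" field, and then checks the identity-transformation reading for dist="gaussian" by comparing an uncensored fit with ordinary least squares. The simulated data (intercept 50, slope 2, error sd 3) and the names has.trans, f.gauss, and f.lm are made up for illustration.

```r
library(survival)

## Which built-in distributions define a transformation of the response?
has.trans <- sapply(survreg.distributions, function(d) !is.null(d$trans))
print(has.trans)

## The Weibull entry refers to another distribution on the log scale.
wb <- survreg.distributions$weibull
print(wb$dist)           # name of the underlying distribution
print(wb$trans(exp(2)))  # transformation applied to the response
print(wb$itrans(2))      # its inverse

## With no transformation and no censoring, a gaussian fit should
## reproduce the least-squares coefficients.
set.seed(2)                            # arbitrary seed
x <- rnorm(1000)
y <- 50 + 2 * x + 3 * rnorm(1000)      # hypothetical data, identity scale
f.gauss <- survreg(Surv(y, rep(1, 1000)) ~ x, dist = "gaussian")
f.lm <- lm(y ~ x)

print(rbind(survreg = coef(f.gauss), lm = coef(f.lm)))
```

The coefficient rows should match closely. The Scale reported by survreg will be slightly smaller than the residual standard error from lm, since the former is a maximum likelihood estimate and the latter divides by the residual degrees of freedom.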
