THESIS
2012
Abstract
Response-biased sampling, in which samples are drawn from a population according
to the values of the response variable, is common in biomedical, epidemiological,
economic and social studies. In particular, the complete observations
in data with censoring, truncation or missing covariates can be regarded as
response-biased sampling under certain conditions. This work proposes to use
transformation models, known as the generalized accelerated failure time model
in econometrics, for regression analysis with response-biased sampling. With unknown
error distribution, the transformation models are broad enough to cover
linear regression models, the Cox model and the proportional odds model as
special cases. To the best of our knowledge, except for the case-control logistic
regression, there is no report in the literature that a prospective estimation approach
can work for biased sampling without any modification. We prove that
the maximum rank correlation estimation is valid for response-biased sampling
and establish its consistency and asymptotic normality. Unlike the inverse probability
methods, the proposed method of estimation does not involve the sampling
probabilities, which are often difficult to obtain in practice. Because it does not
require estimating the unknown transformation function or the error distribution, the
proposed method is numerically easy to implement with the Nelder-Mead simplex
algorithm, which does not require convexity or continuity of the objective function.
We propose an inference procedure based on random weighting, which avoids the density
estimation required by the plug-in rule for variance estimation. Numerical studies
supporting the proposed method are presented. Applications are illustrated with the
Forbes Global 2000 data and the Stanford heart transplant data. Inspired by
the maximum rank correlation estimation with response-biased data, a similar rank-based
method for partial rank estimation with censored data is proposed. This method is also
supported by simulation studies.
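
The estimation idea summarized above can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the usual pairwise form of the maximum rank correlation objective (the fraction of pairs whose response order agrees with the order of the linear index), a scale normalization that fixes the first coefficient at 1, and a hypothetical sampling rule that keeps an observation with a probability depending only on its response. The helper names (`mrc_objective`, `nelder_mead`), the simulated data and the selection probabilities are all illustrative assumptions, and the simplex routine is a bare-bones version written for self-containment.

```python
import math
import random


def mrc_objective(beta, X, y):
    """Fraction of pairs whose response order matches the order of x'beta."""
    index = [sum(b * v for b, v in zip(beta, row)) for row in X]
    n = len(y)
    concordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if (y[i] > y[j] and index[i] > index[j]) or (
                y[i] < y[j] and index[i] < index[j]
            ):
                concordant += 1
    return concordant / (n * (n - 1) / 2)


def nelder_mead(f, x0, step=0.5, iters=60):
    """Bare-bones Nelder-Mead minimizer; it uses function values only."""
    n = len(x0)
    points = [list(x0)] + [
        [x0[k] + (step if k == i else 0.0) for k in range(n)] for i in range(n)
    ]
    simplex = [(f(p), p) for p in points]
    for _ in range(iters):
        simplex.sort(key=lambda pair: pair[0])
        f_best, f_worst, worst = simplex[0][0], simplex[-1][0], simplex[-1][1]
        centroid = [sum(p[k] for _val, p in simplex[:-1]) / n for k in range(n)]

        def move(t):
            # Point on the line through the centroid and the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = move(1.0)
        f_ref = f(reflected)
        if f_ref < f_best:
            expanded = move(2.0)
            f_exp = f(expanded)
            simplex[-1] = (f_exp, expanded) if f_exp < f_ref else (f_ref, reflected)
        elif f_ref < simplex[-2][0]:
            simplex[-1] = (f_ref, reflected)
        else:
            contracted = move(-0.5)
            f_con = f(contracted)
            if f_con < f_worst:
                simplex[-1] = (f_con, contracted)
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0][1]
                shrunk = [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)]
                    for _val, p in simplex[1:]
                ]
                simplex = [simplex[0]] + [(f(p), p) for p in shrunk]
    return min(simplex, key=lambda pair: pair[0])


# Simulated data from a transformation model log(y) = x'beta + error,
# followed by response-biased selection (hypothetical keep probabilities).
random.seed(0)
true_beta = [1.0, 2.0]
X, y = [], []
while len(y) < 120:
    row = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    linear = sum(b * v for b, v in zip(true_beta, row))
    response = math.exp(linear + random.gauss(0.0, 1.0))
    keep_prob = 0.9 if response > 1.0 else 0.2  # depends on the response only
    if random.random() < keep_prob:
        X.append(row)
        y.append(response)

# Maximize the objective over theta, with beta = (1, theta).
start = [0.0]
neg_value, theta_hat = nelder_mead(
    lambda theta: -mrc_objective([1.0, theta[0]], X, y), start
)
print("estimated second coefficient:", theta_hat[0])
print("objective at the estimate:", -neg_value)
```

The objective uses only the ranks of the responses, so neither the transformation function nor the sampling probabilities enter the computation. Because the objective is a step function of the coefficients, a derivative-free search such as the simplex method is a natural fit, although a sketch this small may stop at a local plateau.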