Assignment 6 – BS3b Statistical Lifetime-Models – Oxford HT 2013
Model testing, proportional-hazards, accelerated life
1. (a) Suppose that we have a random sample which includes right-censored data (censoring assumed non-informative). We wish to decide whether or not a Weibull distribution is appropriate. Using an estimator of the survival function, how might we graphically investigate the appropriateness of the model? Given that the model appears to be appropriate, how would you test whether or not the special case of an exponential model is valid? Suppose that the Weibull model does not appear to be appropriate; what graph would you use to consider a log-logistic model?
(b) Now suppose that there are two groups to be considered (e.g. smokers v. non-smokers). What graphs would be appropriate for consideration of a proportional hazards model and an accelerated life model respectively?
(c) Gehan (1965) studied 42 leukaemia patients. Some were treated with the drug 6-mercaptopurine and the rest were controls. The trial was designed as matched pairs, but both members of a pair were observed until both came out of remission or the study ended. (The data are included under the name gehan in the R package MASS. The description attached to these data there says that in each pair both were withdrawn from the trial when either came out of remission. If you have a look at the data, you can see that this is clearly not true.) The observed times to recurrence (in months) were:
Controls: 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23
Treatment: 6+, 6, 6, 6, 7, 9+, 10+, 10, 11+, 13, 16, 17+, 19+, 20+, 22, 23, 25+,
Here + indicates censored times. Investigate these data with respect to both (a) and (b).
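The graphical checks asked for in (a) can be sketched as follows. This is a minimal illustration, not a model answer: the function names `kaplan_meier` and `slope` are introduced here, and only the control group is used because its listed times are all uncensored. Under a Weibull model log(-log S(t)) is linear in log t (slope 1 in the exponential special case); under a log-logistic model log((1 - S(t))/S(t)) is linear in log t.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- observed times
    events -- 1 for an observed event, 0 for a right-censored time
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everything tied at time t; censored subjects still count as at risk at t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def slope(points):
    """Least-squares slope through a list of (x, y) points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx

# Control group exactly as listed in the question (no censored times).
controls = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
km = kaplan_meier(controls, [1] * len(controls))

# Diagnostic coordinates; points with S = 0 are dropped because the logs diverge.
weibull_pts = [(math.log(t), math.log(-math.log(s))) for t, s in km if 0 < s < 1]
loglogistic_pts = [(math.log(t), math.log((1 - s) / s)) for t, s in km if 0 < s < 1]

print("rough Weibull shape estimate (slope):", slope(weibull_pts))
```

For (b), the same transformed curves can be drawn per group: roughly parallel log(-log S) curves (a vertical shift) are consistent with proportional hazards, while survival curves that are horizontal shifts of each other on the log t axis are consistent with an accelerated life model.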
2. (a) Describe the proportional hazards model, explaining what is meant by the partial likelihood and how this can be used to estimate regression coefficients. How might standard errors be generated?
(b) Drug addicts are treated at two clinics (clinic 0 and clinic 1) on a drug replacement therapy. The response variables are the time to relapse (to re-taking drugs) and the status (relapse = 1, censored = 0). There are three explanatory variables: clinic (0 or 1), previous stay in prison (no = 0, yes = 1) and the prescribed amount of the replacement dose. The following results are obtained using a proportional hazards model, h(t, x) = exp(βx) h0(t).
What is the estimated hazard ratio for a subject from clinic 1 who has not been in prison as compared to a subject from clinic 0 who has been in prison, given that they are each assigned the same dose?
(c) Find a 95% confidence interval for the hazard ratio comparing those who have been in prison
to those who have not, given that clinic and dose are the same.
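The arithmetic behind (b) and (c) can be sketched as below. The fitted-model table is not reproduced in the text above, so every coefficient and standard error in this sketch is a HYPOTHETICAL placeholder and the printed numbers are not the answer to the question. The helper names `hazard_ratio` and `wald_ci` are also introduced here for illustration.

```python
import math

# HYPOTHETICAL fitted values, for illustration only: substitute the
# coefficients and standard errors from the actual output table.
coef = {"clinic": -1.0, "prison": 0.3, "dose": -0.04}
se = {"clinic": 0.2, "prison": 0.17, "dose": 0.006}

def hazard_ratio(coef, x_a, x_b):
    """exp(beta'(x_a - x_b)): hazard of subject a relative to subject b."""
    return math.exp(sum(coef[k] * (x_a[k] - x_b[k]) for k in coef))

def wald_ci(beta, std_err, z=1.96):
    """Approximate 95% interval for exp(beta), from a Wald interval for beta."""
    return math.exp(beta - z * std_err), math.exp(beta + z * std_err)

# Clinic 1, no prison versus clinic 0, prison; same (arbitrary) dose for both.
a = {"clinic": 1, "prison": 0, "dose": 60}
b = {"clinic": 0, "prison": 1, "dose": 60}
hr = hazard_ratio(coef, a, b)  # reduces to exp(coef_clinic - coef_prison)

# Prison versus no prison, clinic and dose held equal.
lower, upper = wald_ci(coef["prison"], se["prison"])

print("hazard ratio:", hr)
print("95% CI for prison hazard ratio:", (lower, upper))
```

Because the dose is the same for both subjects, its coefficient cancels from the ratio; the interval is formed on the β scale and then exponentiated.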
3. (a) Sketch the shape of the hazard function in the following cases, paying attention to any
changes of shape due to changes in value of κ where appropriate.
(b) Suppose that it is thought that an accelerated life model is valid and that the hazard function
has a maximum at a non-zero time point. Which parametric models might be appropriate?
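As a numerical aid for (a) and (b), the sketch below evaluates Weibull and log-logistic hazards. The parametrisations are assumptions made here, since the question does not state them: Weibull S(t) = exp(-(ρt)^κ) and log-logistic S(t) = 1/(1 + (ρt)^κ). Under these, the Weibull hazard is monotone in t, whereas the log-logistic hazard has an interior maximum when κ > 1. (The log-normal hazard also rises to a maximum and then falls; it is not coded here.)

```python
import math

def weibull_hazard(t, rho, kappa):
    """Hazard for S(t) = exp(-(rho*t)**kappa); monotone in t (assumed form)."""
    return kappa * rho**kappa * t ** (kappa - 1)

def loglogistic_hazard(t, rho, kappa):
    """Hazard for S(t) = 1 / (1 + (rho*t)**kappa) (assumed form)."""
    return kappa * rho**kappa * t ** (kappa - 1) / (1 + (rho * t) ** kappa)

def loglogistic_mode(rho, kappa):
    """Time of the maximum hazard, or None when the hazard is decreasing (kappa <= 1)."""
    if kappa <= 1:
        return None
    return (kappa - 1) ** (1 / kappa) / rho

# Illustrative parameter values only.
rho, kappa = 0.5, 2.0
t_max = loglogistic_mode(rho, kappa)
print("log-logistic hazard peaks at t =", t_max)
print("hazard near the peak:",
      [loglogistic_hazard(t, rho, kappa) for t in (0.9 * t_max, t_max, 1.1 * t_max)])
```

The mode follows from setting the derivative of log h(t) to zero, which gives (ρt)^κ = κ - 1.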
(c) Suppose that y1, ..., yn are observations from a lifetime distribution with respective vectors of covariates x1, ..., xn. It is thought that an appropriate distribution for lifetime y is Weibull with parameters ρ, κ, where the link is log ρ = β′x. In the case that there is no censoring, write down the likelihood and, using maximum likelihood, give equations from which the vector of estimated regression coefficients β (and also the estimate for κ) could be found. What would be the asymptotic distribution of the vector of estimators? How would the likelihood differ if some of the observations yi were right censored (assuming independent censoring)?
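One way to set out the likelihood is sketched below. It assumes the Weibull parametrisation S(t) = exp(-(ρt)^κ), which the question does not state explicitly; a different convention changes the algebra but not the method. The symbols ρ_i = exp(β′x_i) and the censoring indicator δ_i are notation introduced here.

```latex
% Assumed parametrisation: S(t) = \exp\{-(\rho t)^{\kappa}\}, \quad \rho_i = \exp(\beta' x_i)
\ell(\beta,\kappa) = \sum_{i=1}^{n} \left[ \log\kappa + \kappa\,\beta' x_i + (\kappa-1)\log y_i - (\rho_i y_i)^{\kappa} \right]

% Score equations
\frac{\partial \ell}{\partial \beta} = \kappa \sum_{i=1}^{n} x_i \left[ 1 - (\rho_i y_i)^{\kappa} \right] = 0

\frac{\partial \ell}{\partial \kappa} = \sum_{i=1}^{n} \left[ \frac{1}{\kappa} + \log(\rho_i y_i)\left\{ 1 - (\rho_i y_i)^{\kappa} \right\} \right] = 0

% With right censoring (\delta_i = 1 for an observed lifetime, 0 for a censored one)
\ell_c(\beta,\kappa) = \sum_{i=1}^{n} \left[ \delta_i \left\{ \log\kappa + \kappa\,\beta' x_i + (\kappa-1)\log y_i \right\} - (\rho_i y_i)^{\kappa} \right]
```

The score equations have no closed-form solution and would be solved numerically. Under the usual regularity conditions the estimators are asymptotically multivariate normal, centred at the true parameter values with covariance given by the inverse Fisher information. In the censored version each censored observation contributes its survival function rather than its density.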
4. Coronary Heart Disease (CHD) remains the leading cause of death in many countries. The evidence is substantial that males are at higher risk than females, but the role of genetic factors versus the gender factor is still under investigation. A study was performed to assess the gender risk of death from CHD, controlling for genetic factors. A dataset consisting of non-identical twins was assembled. The age at which each person died of CHD was recorded. Individuals who either had not died or had died from other causes had censored survival times (age). A randomly selected subsample from the data is as follows. (* indicates a censored observation.)
(a) Write down the times of events and list the associated risk sets.
(b) Suppose the censoring mechanism is independent of death times due to CHD, and that the mortality rates for male and female twins satisfy the PH assumption, and let β be the regression coefficient for the binary covariate that codes gender as 0 or 1 for male or female respectively. Write down the partial-likelihood function. Using a computer or programmable calculator, compute and plot the partial likelihood for a range of values of β. What is the Cox-regression estimate for β? What does this mean?
(c) Estimate the survival function for male twins.
(d) Suppose now only that the censoring mechanism is independent of death times due to CHD. Perform the log-rank test for equality of hazards between these two groups. Contrast the test statistic and associated p-value with the results from the Fleming-Harrington test using a weight W(ti) = ˆ
(e) Do you think the assumption of an independent censoring mechanism is appropriate? Give
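The computations in (b) and (d) can be sketched as below. The subsample table is not reproduced in the text above, so the `data` list is HYPOTHETICAL toy data (with no tied times) used only to make the sketch runnable; the function names are likewise introduced here. The code evaluates the Cox log partial likelihood on a grid of β values and computes the unweighted two-group log-rank statistic. The Fleming-Harrington weights are not attempted because the weight formula is cut off in the question.

```python
import math

# HYPOTHETICAL toy data: (age, event, z) with event = 1 for a CHD death,
# 0 for a censored age, and z = 0 for male, 1 for female.
data = [(50, 1, 0), (56, 1, 0), (58, 0, 1), (63, 1, 1),
        (66, 1, 0), (70, 0, 0), (74, 1, 1), (79, 0, 1)]

def log_partial_likelihood(beta, data):
    """Cox log partial likelihood for one binary covariate, assuming no tied event times."""
    total = 0.0
    for t_i, event_i, z_i in data:
        if event_i:
            # Risk set: everyone still under observation just before t_i.
            risk = sum(math.exp(beta * z_j) for t_j, _, z_j in data if t_j >= t_i)
            total += beta * z_i - math.log(risk)
    return total

def log_rank(data):
    """Two-group log-rank statistic (chi-squared on 1 df) and its p-value."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, event, _ in data if event}):
        n = sum(1 for s, _, _ in data if s >= t)                     # at risk, both groups
        n1 = sum(1 for s, _, z in data if s >= t and z == 1)         # at risk, group z = 1
        d = sum(1 for s, e, _ in data if s == t and e)               # events at t
        d1 = sum(1 for s, e, z in data if s == t and e and z == 1)   # events at t, group z = 1
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-squared with 1 df
    return stat, p_value

# Grid search for the maximiser; these (beta, value) pairs are what one would plot.
grid = [b / 100 for b in range(-400, 401)]
values = [log_partial_likelihood(b, data) for b in grid]
beta_hat = grid[values.index(max(values))]

stat, p_value = log_rank(data)
print("grid estimate of beta:", beta_hat)
print("log-rank statistic and p-value:", stat, p_value)
```

A grid search is used rather than Newton-Raphson because the question asks for the partial likelihood to be plotted over a range of β; exp(beta_hat) is then the estimated hazard ratio of females relative to males.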
5. [Optional extra question] The life span distribution of machine components of a particular type is affected by varying levels of stress. The stress level is measured by a non-negative variable x. The base level is denoted by x = 0 and the median lifetime becomes shorter with increasing x.
Assume that the cumulative hazard rate function for an item under stress x is of the form
where H0(t) is the cumulative hazard rate at baseline x = 0. What type of model does this describe?
(a) Show that this is equivalent to assuming that a lifetime T (x) under stress x has the same
(b) Suppose that independent data are to be collected for stress levels x1, x2, ..., xn. Show that the assumed form of distribution leads to a regression model for ln T(xj) (generally non-normal) of the form
where εj are independent, identically distributed and with zero mean. What is the connection between the constant α and ln T(0)?
(c) If the baseline distribution is unknown, suggest how you might estimate α and β. Give these estimators and also their variances (you will need to use Var(ln T(0)) for the latter). Is there any way in which you could test the validity of the model?
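A least-squares approach to (c) can be sketched as below. The regression equation itself is missing from the text above, so the sketch ASSUMES the form ln T(xj) = α - βxj + εj (so that positive β shortens lifetimes); if the intended sign convention differs, only the sign of the slope changes. The function names and the example stress levels are hypothetical.

```python
def fit_log_linear(xs, log_ts):
    """Least-squares fit of log T = alpha - beta * x (assumed form); returns (alpha_hat, beta_hat)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(log_ts) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, log_ts))
    slope = sxy / sxx
    return y_bar - slope * x_bar, -slope

def estimator_variances(xs, sigma2):
    """Variances of (alpha_hat, beta_hat) when Var(ln T(0)) = sigma2."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sigma2 * (1 / n + x_bar**2 / sxx), sigma2 / sxx

# Hypothetical stress levels and noise-free log-lifetimes, purely to exercise the code.
xs = [0.0, 1.0, 2.0, 3.0]
log_ts = [2.0 - 0.5 * x for x in xs]
alpha_hat, beta_hat = fit_log_linear(xs, log_ts)
var_alpha, var_beta = estimator_variances(xs, sigma2=1.0)
print(alpha_hat, beta_hat, var_alpha, var_beta)
```

Least squares needs no knowledge of the baseline distribution, only that the errors have zero mean and common variance Var(ln T(0)); plotting residuals against x is one informal check of the model.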