Time-to-event Models: Comparison, Modification and Application

ABSTRACT

The Cox model is the primary tool for modelling time-to-event data over a study period, from the time origin to the event of interest. It has a notable limitation: it assumes that the data contain no outliers, no heteroscedasticity, and no time-dependent covariates. When any of these is present, the Cox model fails to estimate the true effect of the covariates. The literature offers a separate model for each problem: the robust Cox model for outliers, the weighted least squares (WLS) model for heteroscedasticity, and the time-dependent Cox model for time-dependent covariates. However, no single model handles all three problems simultaneously.
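For orientation, the standard textbook forms of the two Cox specifications mentioned above can be written as follows. The notation ($h_0$ for the baseline hazard, $\beta$ for the coefficient vector, $x$ for the covariates) is assumed here and is not taken from the thesis itself.

$$
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
h(t \mid x(t)) = h_0(t)\,\exp\!\left(\beta^{\top} x(t)\right)
$$

The first expression is the usual proportional hazards model with fixed covariates; the second lets the covariate vector $x(t)$ change over time, which is the time-dependent extension.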

This study modifies the existing Cox model so that it simultaneously handles outliers, heteroscedasticity, and time-dependent covariates in censored time-to-event data. It compares the Cox, robust Cox, weighted least squares, time-dependent Cox, and modified Cox models. The simulation experiments consider four scenarios that allow for outliers, heteroscedasticity, and time-dependent covariates, with varying sample sizes, outlier quantities, and outlier magnitudes. The first scenario deals with the presence of outliers and heteroscedasticity. We evaluate the performance of the modified Cox model and the existing models and find that the robust Cox model outperforms the others. Increasing the sample size slightly improves the performance of all models. The position of the outliers does not affect model performance because of the cross-sectional behavior of the data. The RMSE of all models improves slightly when the parameter theta of the exponential distribution increases from one to two. In the second scenario, we evaluate the performance of the modified Cox model in the presence of outliers, heteroscedasticity, and time-dependent covariates. The results show that the RMSE, MAE, and MAPE of the modified Cox model are smaller than those of the other survival analysis models. Larger outlier quantities and magnitudes reduce the performance of all models, while larger samples improve it; the RMSE of all models improves slightly when theta increases from one to two and when the sample size increases from 100 to 500.
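The kind of simulated data described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's data-generating process: the function name `simulate_censored`, the treatment of theta as the rate of the exponential distribution, the independent exponential censoring, and the way outliers are created (multiplying a random fraction of event times by a constant) are all hypothetical choices.

```python
import random


def simulate_censored(n, theta, outlier_fraction=0.05, outlier_scale=10.0,
                      censor_rate=0.5, seed=0):
    """Return n (observed_time, event_indicator) pairs.

    Assumptions (hypothetical, for illustration only):
    - event times are exponential with rate theta,
    - censoring times are independent exponentials with rate censor_rate,
    - outliers are event times inflated by outlier_scale.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        event_time = rng.expovariate(theta)
        # Contaminate a random fraction of observations with outliers.
        if rng.random() < outlier_fraction:
            event_time *= outlier_scale
        censor_time = rng.expovariate(censor_rate)
        # Right censoring: observe whichever time comes first.
        observed_time = min(event_time, censor_time)
        event = 1 if event_time <= censor_time else 0
        rows.append((observed_time, event))
    return rows


# Two of the settings mentioned in the abstract: n = 100 and theta = 1 or 2.
sample_theta_1 = simulate_censored(n=100, theta=1.0)
sample_theta_2 = simulate_censored(n=100, theta=2.0)
```

Varying `n`, `outlier_fraction`, and `outlier_scale` mirrors the sample size, outlier quantity, and outlier magnitude that the scenarios vary.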

The third scenario considers heteroscedasticity together with time-dependent covariates, and again the modified Cox model performs best among the survival analysis models, as judged by RMSE, MAE, and MAPE. The final scenario considers outliers, heteroscedasticity, and time-dependent covariates, and the modified Cox model again performs better than all existing survival analysis models. After completing the simulation exercise for the different scenarios, we apply the suggested modified Cox model to a national dataset from Pakistan, the Labour Force Survey (LFS, 2021). The empirical application shows that education, monthly income, gender, and marital status significantly increase the likelihood of returning to work after a major injury or disease, while age, region, and government employment significantly decrease it.

In the first scenario, with outliers and heteroscedasticity, the robust Cox model performed better. In the other three scenarios and in the real-data application, however, the suggested modified Cox model performed better. Increasing the sample size and the exponential distribution parameter theta slightly improves the RMSE of all models, whereas the quantity and magnitude of the outliers degrade the performance of all models. The effect of a larger theta is positive for every model, with a slight improvement in each model's loss function.
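The three performance indicators used throughout the comparison (RMSE, MAE, and MAPE) follow their standard definitions, which can be sketched as below. The function names and the toy numbers are illustrative assumptions; the abstract does not state which quantities (for example, estimated versus true coefficients) the thesis feeds into these measures.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy values, chosen only to make the arithmetic easy to follow.
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(actual, predicted))  # 1.0
print(mae(actual, predicted))   # 1.0
print(mape(actual, predicted))  # 37.5
```

Smaller values of all three indicate better performance, which is the sense in which one model is said to outperform another in the scenarios above.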

Nauman Ahmad

Meta Data

Author: Nauman Ahmad
Supervisor: Amena Urooj
Co-Supervisor: Saud Ahmed Khan
Internal Examiner: Eatzaz Ahmed
External Examiner: Iftikhar Hussain Adil, Ahsan ul Haq
Keywords: Hazard Ratio, Heteroscedasticity, Outlier, Schoenfeld residuals, Survival Analysis, Time-dependent
