Comparison And Evaluation Of Methods For Handling Data Clusters In Regression Models
Author: Saima Ishaque


Mixture models and their variants are widely used across disciplines. In this thesis we consider four families of mixture models and address several modeling considerations through empirical applications. We employ a step-by-step model selection process to assess the usefulness of the approaches taken, and we incorporate auxiliary variables for prediction purposes. The empirical setups address several well-known issues in mixture modeling, including dependencies among observations, dependence between covariates and indicators, sparse data, and model selection. We apply Step-1 and Step-3 approaches to discuss separately the inclusion of distal outcomes and covariates in latent class cluster models, regression mixtures, growth mixtures, and Markov models. Univariate and multivariate mixed-mode data are employed. Unconditional and conditional models are estimated and compared through a model evaluation toolkit consisting of absolute, relative, and bootstrap-based criteria. For latent class cluster models, a job quality typology is sought and compared across the basic unconditional model, the direct-effects case, and continuous-factor versions of latent class models. The resulting typology describes four clusters of job quality variation in the considered sample of American workers. For the latent class regression case, we find no differential effects of job satisfaction predictors. For growth mixture variants, in an empirical application to employment status growth patterns, we find three clusters of active, inactive, and moderately active participants over an age span of 16 years. In the Markov modeling setup, the model variants further improve fit compared with growth models, since autocorrelation, heterogeneity of the data, and measurement error are addressed simultaneously. The transition and switching probabilities for three clusters of employed, unemployed, and inactive are calculated and compared.

Meta Data

Keywords: auxiliary variables, differential effects, direct effects, growth differences, mixed-mode data, mixture modeling, Step-3 analysis, transition status, typology building, unobserved heterogeneity
Supervisor: Ahsan-ul-Haq
Cosupervisor: Asad Zaman
