Structural Break Detection And Model Selection: Comparison Of Regularization Techniques With Autometrics
Author: Sara


The indicator saturation method is a popular approach to structural break and outlier detection that detects breaks and outliers simultaneously within a model. Step Indicator Saturation (SIS) places no restriction on the number or length of breaks, or on breaks at the start or end of the sample. In contrast, Impulse Indicator Saturation (IIS) is a more efficient technique for handling outliers in cross-sectional modeling. The indicator saturation method relies on Autometrics for computation. However, the final model depends on the choice of significance level: at a significance level of 0.01 or 0.001 the model drops significant breaks, whereas at a nominal significance level of 0.05 it retains irrelevant breaks.
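As a minimal illustration of what "saturation" means here, the sketch below builds the two indicator sets with NumPy: one impulse dummy per observation (IIS) and one step dummy per observation (SIS). The function names, and the convention that the step indicator for date j equals 1 for every observation up to and including j, are assumptions made for illustration. The Autometrics search itself is not reproduced.

```python
import numpy as np

def impulse_indicators(T):
    """Hypothetical helper: T x T identity, one impulse dummy per observation."""
    return np.eye(T, dtype=int)

def step_indicators(T):
    """Hypothetical helper: T x T matrix whose column j is 1 for t <= j, else 0.

    The last column is all ones and therefore coincides with an intercept.
    """
    return np.triu(np.ones((T, T), dtype=int))

T = 6
S = step_indicators(T)

# A level shift of 2.0 that affects only the first three observations
# is a multiple of a single step column (assumed convention above).
shift = 2.0 * S[:, 2]
```

Adding all T columns to a model that already has regressors gives more candidate variables than observations, which is why either a block search or a penalized estimator is needed.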

Meanwhile, regularization techniques deal efficiently with the saturated model even when the number of regressors far exceeds the number of observations. This study applies four well-known regularization techniques, the Least Absolute Shrinkage and Selection Operator (LASSO), Adaptive LASSO (AdaLASSO), Minimax Concave Penalty (MCP), and Smoothly Clipped Absolute Deviation (SCAD), to structural break and outlier detection and compares them with Autometrics. We assess the performance of the regularization techniques in terms of Gauge ('Size'), Potency ('Power'), RMSE, and MAE under different Data Generating Processes (DGP) in a simulation study. For structural break detection, the simulation experiments consider three scenarios: a single break at the end of the sample, a single break at the start of the sample, and an unknown break with two step indicators. For outlier detection, we consider two scenarios: outliers with an AR(1) process and outliers of different magnitudes. The second simulation experiment uses a static multivariate model with 5%, 10%, and 20% outlying observations, obtained by assuming ε_i ~ (0, σ + 4) and ε_i ~ (0, σ + 6). The final simulation experiment concerns the selection of a covariate and its lags with varying autocorrelation coefficients and sample sizes in time series modeling.
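The evaluation criteria named above can be sketched as follows, assuming the usual definitions: gauge is the share of irrelevant candidates that a procedure retains, and potency is the share of relevant candidates that it retains. The function names and the example indices are hypothetical and are not taken from the study.

```python
import numpy as np

def gauge_and_potency(selected, relevant, n_candidates):
    """Return (gauge, potency) for one replication.

    selected     : indices retained by the selection procedure
    relevant     : indices that truly enter the DGP
    n_candidates : total number of candidate indicators/regressors
    """
    selected, relevant = set(selected), set(relevant)
    n_irrelevant = n_candidates - len(relevant)
    false_retained = len(selected - relevant)   # irrelevant but kept
    true_retained = len(selected & relevant)    # relevant and kept
    gauge = false_retained / n_irrelevant if n_irrelevant else float("nan")
    potency = true_retained / len(relevant) if relevant else float("nan")
    return gauge, potency

def rmse(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat)))

# Hypothetical replication: 100 candidate step indicators, true breaks at
# positions 20 and 60, and a procedure that keeps 20, 35, 60 and 80.
g, p = gauge_and_potency(selected=[20, 35, 60, 80],
                         relevant=[20, 60],
                         n_candidates=100)
```

In a Monte Carlo study these quantities would be averaged over replications, so that the average gauge can be compared with the nominal significance level and the average potency with 1.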

The simulation results indicate that, with a fixed tuning parameter, MCP and SCAD perform close to Autometrics in average potency for both single and multiple break detection. LASSO, on the other hand, works well for single break detection but selects more irrelevant dummy indicators when there are multiple breaks. Compared to Autometrics, SCAD and MCP perform better in forecasting and covariate selection in the simulations with 4SD outliers (20% and 5% outlying observations). Meanwhile, LASSO and AdaLASSO select more covariates and have higher RMSE than SCAD and MCP. Overall, SCAD and MCP have lower RMSE than Autometrics. For the selection of a covariate and its lags, however, WLAdaLASSO outperforms Autometrics in both selection and forecasting, especially when there is strong linear dependence between predictors. The potency of Autometrics, in contrast, decreases under strong linear dependence between predictors. With a large sample and weak linear dependence between predictors, the potency of Autometrics approaches 1 and its gauge approaches α.

In contrast, LASSO, SCAD, and MCP select more covariates and have higher RMSE than WLAdaLASSO and Autometrics. A real data analysis was performed for each simulation experiment using a popular macroeconomic variable of Pakistan, and its results align with the simulation findings.

Meta Data

Keywords : Autometrics, Data Generating Process, Regularization Techniques, Simulation, Step Indicator Saturation
Supervisor: Amena Urooj
