Regression analysis is a widely used statistical technique for investigating the relationship between a response variable and one or more explanatory variables. Logistic regression examines this relationship when the response variable is dichotomous, i.e., has two possible levels, and the explanatory variables may be categorical or continuous. Maximum likelihood estimation is widely used to estimate the parameters of the logistic regression model. When the covariates are correlated, however, the variance and standard error of the estimator are inflated and the regression model shows a high coefficient of determination. This is the problem of multicollinearity: the traditional estimation method becomes unstable and can lead to incorrect conclusions about the relationships among the variables. Various methods have been proposed to address multicollinearity in the regression model, including the Ridge estimator, the Stein estimator, the Bayesian estimator and Liu estimators. We therefore propose a modified estimator for the parameters of the logit model in the presence of multicollinearity, obtained by modifying the existing Liu logistic estimator. The modified estimator is applied to real-life data. The results show that the Modified Liu Logistic estimator outperformed the existing estimators considered in this study, with smaller variance, bias and mean square error (MSE).
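To make the variance inflation concrete, the following is a minimal sketch on simulated data, not the authors' procedure or data. It fits a logistic model by maximum likelihood and then applies the standard Liu logistic estimator as it is commonly stated in the literature, (X'WX + I)^(-1) (X'WX + dI) b_ML. The abstract does not give the form of the proposed modification, so it is not shown here. The function names, the simulated covariates and the shrinkage value d = 0.5 are all assumptions made for illustration.

```python
import numpy as np


def fit_logistic_mle(X, y, n_iter=50, tol=1e-8):
    """Maximum likelihood fit of a logit model by Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        info = X.T @ (w[:, None] * X)          # X'WX, the information matrix
        step = np.linalg.solve(info, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Recompute X'WX at the final estimate so it matches the returned beta.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    info = X.T @ (w[:, None] * X)
    return beta, info


def liu_logistic(beta_ml, info, d):
    """Standard Liu logistic estimator: (X'WX + I)^-1 (X'WX + d I) beta_ml."""
    identity = np.eye(info.shape[0])
    return np.linalg.solve(info + identity, (info + d * identity) @ beta_ml)


# Simulated data (an assumption): two covariates built from a shared factor z,
# so they are almost perfectly correlated.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + 0.05 * rng.normal(size=n),
                     z + 0.05 * rng.normal(size=n)])
beta_true = np.array([1.0, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

beta_ml, info = fit_logistic_mle(X, y)
se_ml = np.sqrt(np.diag(np.linalg.inv(info)))   # asymptotic standard errors
beta_liu = liu_logistic(beta_ml, info, d=0.5)   # d = 0.5 is a hypothetical choice

print("correlation of covariates:", np.corrcoef(X[:, 0], X[:, 1])[0, 1])
print("ML estimate:", beta_ml, "standard errors:", se_ml)
print("Liu estimate (d = 0.5):", beta_liu)
```

With d = 1 the Liu estimator reduces to the maximum likelihood estimate, and for 0 <= d < 1 it shrinks the estimate, trading some bias for lower variance. In practice d is chosen from the data rather than fixed in advance.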
Published in | International Journal of Data Science and Analysis (Volume 8, Issue 6) |
DOI | 10.11648/j.ijdsa.20220806.13 |
Page(s) | 187-193 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2022. Published by Science Publishing Group |
Keywords | Logistic Regression, Multicollinearity, Maximum Likelihood Estimation, Bias, Variance, Mean Square Error |
APA Style
Runyi Emmanuel Francis, Maureen Tobe Nwakuya. (2022). Logistic Estimation Method in the Presence of Collinearity and It’s Application. International Journal of Data Science and Analysis, 8(6), 187-193. https://doi.org/10.11648/j.ijdsa.20220806.13
ACS Style
Runyi Emmanuel Francis; Maureen Tobe Nwakuya. Logistic Estimation Method in the Presence of Collinearity and It’s Application. Int. J. Data Sci. Anal. 2022, 8(6), 187-193. doi: 10.11648/j.ijdsa.20220806.13
@article{10.11648/j.ijdsa.20220806.13,
  author = {Runyi Emmanuel Francis and Maureen Tobe Nwakuya},
  title = {Logistic Estimation Method in the Presence of Collinearity and It’s Application},
  journal = {International Journal of Data Science and Analysis},
  volume = {8},
  number = {6},
  pages = {187-193},
  doi = {10.11648/j.ijdsa.20220806.13},
  url = {https://doi.org/10.11648/j.ijdsa.20220806.13},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdsa.20220806.13},
  year = {2022}
}
TY - JOUR
T1 - Logistic Estimation Method in the Presence of Collinearity and It’s Application
AU - Runyi Emmanuel Francis
AU - Maureen Tobe Nwakuya
Y1 - 2022/12/15
PY - 2022
N1 - https://doi.org/10.11648/j.ijdsa.20220806.13
DO - 10.11648/j.ijdsa.20220806.13
T2 - International Journal of Data Science and Analysis
JF - International Journal of Data Science and Analysis
JO - International Journal of Data Science and Analysis
SP - 187
EP - 193
PB - Science Publishing Group
SN - 2575-1891
UR - https://doi.org/10.11648/j.ijdsa.20220806.13
VL - 8
IS - 6
ER -