The problem of separation in logistic regression, a solution and an application
DOI:
https://doi.org/10.17533/udea.rfnsp.8770
Keywords:
logistic model, maximum likelihood estimation, menarche
Abstract
Logistic regression is one of the most widely used statistical techniques for explaining the probabilistic behavior of a given phenomenon. Data separation is a frequent problem in this model: when successes appear separated from failures, the maximum likelihood estimators cannot be found. Objective: to present a review of the problem and a solution to it, and to compare that solution with others. Methodology: a simulation of the logistic model, with the bias of the parameter estimates assessed for the proposed solution with fictitious observations, in its classical and Bayesian versions, and for the Firth method. Results: the bias is lower when the pair of fictitious observations is generated using the Bayesian method. An example on the age at which menarche occurs is presented. Discussion: an appropriate solution to the problem of separation is provided using a simulation in a simple logistic model. Conclusions: generating fictitious observations within the separation region is recommended, and the best solution method is the one based on Bayesian theory, which achieves convergence of the parameters of the logistic model.
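The effect of separation on the likelihood can be illustrated with a small numerical sketch. The data, the slope values, and the placement of the two fictitious observations below are hypothetical choices made for illustration only; they are not the authors' simulation design, nor their classical or Bayesian procedure for generating the fictitious pair.

```python
import numpy as np

def log_likelihood(beta0, beta1, x, y):
    """Bernoulli log-likelihood of a simple logistic model."""
    eta = beta0 + beta1 * x
    # log(p) = -log(1 + exp(-eta)) and log(1 - p) = -log(1 + exp(eta));
    # np.logaddexp(0, z) evaluates log(1 + exp(z)) in a numerically stable way.
    return float(np.sum(-y * np.logaddexp(0.0, -eta)
                        - (1 - y) * np.logaddexp(0.0, eta)))

# Hypothetical, completely separated data: all failures lie below x = 3.5
# and all successes lie above it.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])

# Move along the direction that keeps the decision boundary at x = 3.5
# (beta0 = -3.5 * beta1) while the slope grows.
slopes = [1.0, 5.0, 25.0]
separated = [log_likelihood(-3.5 * b, b, x, y) for b in slopes]

# Assumed illustration: add one fictitious success and one fictitious failure
# inside the separation region, placed so that the two groups overlap.
x_aug = np.append(x, [3.25, 3.75])
y_aug = np.append(y, [1, 0])
augmented = [log_likelihood(-3.5 * b, b, x_aug, y_aug) for b in slopes]

for b, ll_sep, ll_aug in zip(slopes, separated, augmented):
    print(f"slope={b:6.1f}  separated={ll_sep:.6g}  augmented={ll_aug:.6g}")
```

With the separated data the log-likelihood keeps increasing toward zero as the slope grows, so no finite maximizer exists. Once the overlapping pair is added, the log-likelihood falls again for large slopes along this direction, which is what makes a finite estimate possible. The sketch inspects a single direction in the parameter space; it does not compute the estimate itself.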
References
(1). Albert A, Anderson JA. On the existence of maximum likelihood estimates in logistic regression models. Biometrika 1984; 71: 1-10. DOI: https://doi.org/10.1093/biomet/71.1.1
(2). Christmann A, Rousseeuw PJ. Measuring overlap in binary regression. Computational Statistics and Data Analysis 2001; 37: 65-75. DOI: https://doi.org/10.1016/S0167-9473(00)00063-3
(3). Christmann A, Rousseeuw PJ. Robustness against separation and outliers in logistic regression. Computational Statistics and Data Analysis 2003; 43: 315-332. DOI: https://doi.org/10.1016/S0167-9473(02)00304-3
(4). King E, Ryan TP. A preliminary investigation of maximum likelihood logistic regression versus exact logistic regression. American Statistical Association 2002; 56(3): 163-170. DOI: https://doi.org/10.1198/00031300283
(5). Lesaffre E, Albert A. Partial Separation in Logistic Discrimination. Journal of the Royal Statistical Society. Series B (Methodological) 1989; 51(1): 109-116. DOI: https://doi.org/10.1111/j.2517-6161.1989.tb01752.x
(6). Rindskopf D. Infinite parameter estimates in logistic regression: Opportunities, not problems. Journal of Educational and Behavioral Statistics 2002; 27(2): 147-161. DOI: https://doi.org/10.3102/10769986027002147
(7). Gentleman R, Ihaka R. R: A Language and Environment for Statistical Computing. R Development Core Team [internet]. R Foundation for Statistical Computing: Vienna; 2009 [accessed 7 November 2010]. Available from: www.R-project.org.
(8). Santner TJ, Duffy DE. A note on A. Albert and J. A. Anderson’s conditions for the existence of maximum likelihood estimates in logistic regression models. Biometrika 1986; 73(3): 755-758. DOI: https://doi.org/10.1093/biomet/73.3.755
(9). Ying So. A Tutorial on Logistic Regression [online journal]. Journal Of Marriage And The Family 1995; 57(4): 1-6. Available from: http://www.mendeley.com/research/a-tutorial-on-logistic-regression/ DOI: https://doi.org/10.2307/353415
(10). Heinze G, Schemper M. A solution to the problem of separation in logistic regression. Statist. Med 2002; 21: 2409-2419. DOI: https://doi.org/10.1002/sim.1047
(11). Firth D. Bias reduction, the Jeffreys prior and GLIM. In: Fahrmeir L, Francis B, Gilchrist R, Tutz G, editors. Advances in GLIM and Statistical Modelling. New York: Springer-Verlag; 1992. p. 91-100. DOI: https://doi.org/10.1007/978-1-4612-2952-0_15
License
Copyright (c) 2021 Juan C. Correa M., Marisol Valencia C.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


