EVALUATING THE PREDICTIVE POWER OF LOGISTIC REGRESSION MODELS IN CLASSIFYING BINARY OUTCOMES

Authors

  • Intesar N. El-Saeiti, Department of Statistics, Faculty of Science, University of Benghazi, Libya.
  • Aman Pannu, Principal, Advanced Analytics, DirecTV, USA.

Keywords:

Logistic Regression, Predictive Accuracy, ROC Curve, Goodness-of-Fit, Model Selection

Abstract

This study investigates the predictive performance of logistic regression models with varying parameter specifications in classifying binary outcomes. Using SAS software, the analysis focuses on key predictive metrics, including sensitivity, specificity, and overall classification accuracy. The model's predictive strength is quantified using the concordance index (c), which for a binary outcome equals the area under the Receiver Operating Characteristic (ROC) curve; the observed value of 0.738 indicates acceptable discrimination. Goodness-of-fit assessments, such as the Hosmer-Lemeshow and Pearson tests, reveal no significant lack of fit, supporting the model's adequacy. A backward elimination approach is employed to refine the model, balancing predictive power with interpretability by selecting a parsimonious set of main effects and interaction terms. Parameter estimates, confidence intervals, and significance levels are provided for key predictors, including smoking and alcohol use, which exhibit significant associations with binary health outcomes. The analysis also examines the sensitivity of parameter estimates to unbalanced data, demonstrating how a change to a single observation can influence model outcomes. This study emphasizes the critical role of model selection and fit diagnostics in logistic regression, offering practical guidance for optimizing predictive models in the classification of categorical data.
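
As an illustration of the metrics named in the abstract, the following is a minimal Python sketch (the study itself used SAS, and this is not the authors' code). It computes sensitivity, specificity, and overall accuracy at a chosen cutoff, and the concordance index c as the share of (event, non-event) pairs in which the event receives the higher predicted probability, with ties counted as one half. The function names, the 0.5 cutoff, and the toy data are assumptions made only for this example.

```python
from itertools import product


def classification_metrics(y_true, p_hat, cutoff=0.5):
    """Sensitivity, specificity and accuracy at a probability cutoff (assumed 0.5)."""
    y_pred = [1 if p >= cutoff else 0 for p in p_hat]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)  # true positives
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)  # false negatives
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)  # true negatives
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)  # false positives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def concordance_index(y_true, p_hat):
    """c statistic: fraction of concordant (event, non-event) pairs, ties = 0.5."""
    events = [p for y, p in zip(y_true, p_hat) if y == 1]
    non_events = [p for y, p in zip(y_true, p_hat) if y == 0]
    score = 0.0
    for p_event, p_non in product(events, non_events):
        if p_event > p_non:
            score += 1.0
        elif p_event == p_non:
            score += 0.5
    return score / (len(events) * len(non_events))


# Toy data invented for illustration only; not taken from the study.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.8, 0.3]

print(classification_metrics(y, p))  # all three metrics equal 0.75 here
print(concordance_index(y, p))       # 0.9375 (15 of 16 pairs concordant)
```

On this reading, the study's reported c of 0.738 would mean that in roughly 74% of such pairs the model assigns the higher predicted probability to the observation that experienced the event.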

Author Biography

  • Intesar N. El-Saeiti, Department of Statistics, Faculty of Science, University of Benghazi, Libya.

Volume 5, Issue 2, July-December 2024 (IJE)

Published

2024-12-30

How to Cite

EVALUATING THE PREDICTIVE POWER OF LOGISTIC REGRESSION MODELS IN CLASSIFYING BINARY OUTCOMES. (2024). INTERNATIONAL JOURNAL OF EDUCATION (IJE), 5(2), 93-103. https://lib-index.com/index.php/IJE/article/view/1637