Linear Regression Approach to Solving Multicollinearity and Overfitting in Predictive Analysis
Keywords:
Linear Regression, Multicollinearity, Overfitting, Predictive Analysis, Exploratory Data AnalysisAbstract
Multicollinearity and overfitting are ubiquitous problems in predictive analysis, especially in linear regression models, which significantly hinder the precision and interpretability of predicted results providing critical insights for data-driven decision-making in diverse industries. This research examines a linear regression approach to address the dual challenges of multicollinearity and overfitting in predictive analysis. The dataset, sourced from the National Center for Disease Control (NCDC), was analyzed using multiple regression techniques, including Linear Regression, Ridge Regression, LASSO Regression, and Elastic Net Regression. The study aimed to assess and compare the efficacy of these methods in mitigating multicollinearity (measured by Variance Inflation Factor) and reducing overfitting through Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) metrics. Data was analyzed both with all features and after applying feature selection. Results demonstrated that while all models effectively addressed multicollinearity and overfitting, Elastic Net Regression exhibited superior performance, offering the best generalization capabilities with minimal MSE and RMSE discrepancies between internal and external data. These findings highlight the potential of advanced regularization techniques in improving predictive accuracy and interpretability, particularly in high-dimensional data contexts such as those involving COVID-19 outcomes. The study underscores the importance of further research into enhanced machine learning techniques and the inclusion of broader datasets to refine predictive models for practical decision-making across sectors.
Published
How to Cite
Issue
Section
Copyright (c) 2025 Edeh John Otse, Georgina N. Obunadike, Ahmad Abubakar (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial — You may not use the material for commercial purposes.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.