Ground data analysis for PM2.5 prediction using predictive modeling techniques
Abstract
Introduction: Air quality forecasting, particularly predicting Particulate Matter (PM2.5) concentrations, has gained significant attention due to its critical implications for public health and environmental management. Accurately predicting PM2.5, a harmful air pollutant associated with respiratory and cardiovascular diseases, is vital for effective air quality management in densely populated urban areas.
Materials and methods: Using various combinations of meteorological and environmental data from Tehran, Iran, this study investigates the efficacy of three predictive modeling techniques, Autoregressive Integrated Moving Average (ARIMA), Extreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM), in forecasting daily and monthly PM2.5 levels. The models were evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²).
Results: XGBoost excelled in daily predictions when using solely meteorological data, achieving an R² score of 0.998674, while ARIMA demonstrated strong predictive capacity but struggled with added complexity. LSTM maintained reasonable performance as the amount of input data increased but faced challenges in both daily and monthly forecasts. Monthly predictions from all models proved less reliable; ARIMA in particular yielded negative R² values, meaning it performed worse than a naive baseline that always predicts the mean of the observations.
Conclusion: The findings highlight the importance of model selection and feature engineering in accurately predicting PM2.5 levels. The study suggests a shift towards hybrid modeling approaches and the incorporation of diverse environmental data to enhance forecasting accuracy in air quality management, particularly for long-term predictions.
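For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how RMSE, MAE, and R² are conventionally computed, and of how R² becomes negative when a model's errors exceed those of a mean-only baseline. The function names and the PM2.5 values are hypothetical and chosen purely for illustration; they are not data or code from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    SS_tot is the squared error of a baseline that always predicts the
    mean of the observations, so R² < 0 means the model is worse than
    that baseline.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical PM2.5 observations and two sets of forecasts (illustrative only).
observed = [30.0, 35.0, 40.0, 45.0, 50.0]
good = [31.0, 34.0, 41.0, 44.0, 51.0]   # close to the observations
poor = [50.0, 45.0, 40.0, 35.0, 30.0]   # trend reversed

for name, pred in (("good", good), ("poor", poor)):
    print(name, rmse(observed, pred), mae(observed, pred), r2(observed, pred))
```

In this toy example the close forecasts give an R² of about 0.98, while the reversed forecasts give an R² of about -3, the kind of negative value reported for ARIMA's monthly predictions.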