Prediction of Air Quality Index Using Supervised Machine Learning
2022 (English) Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]
Background: Air pollution has become a serious environmental issue. It is responsible for hundreds of fatalities each year and poses a serious threat to human health and the environment. It contributes to global warming and the greenhouse effect, and it causes respiratory problems such as asthma and lung cancer. Predicting air quality is an important step toward regulating air pollution. The air quality index (AQI) is a measure that describes the level of air pollution, and machine learning algorithms can help predict it. In this thesis, linear regression, LASSO regression, ridge regression, and support vector regression (SVR) were used to forecast the AQI.
Objectives: The main objective of the thesis is to build and train models using machine learning algorithms and to identify the model that predicts the AQI most accurately.
Methods: A literature review and experimentation were chosen as the methods to answer the research questions. A number of research papers have been written on AQI prediction, and the literature review guided the research and supplied references. Experimentation was used to find the most accurate machine learning model for predicting air quality. In the experimentation phase, four machine learning algorithms were trained on air quality data to create predictive models for forecasting the AQI.
Results: Linear Regression, Ridge Regression, LASSO Regression, and SVR were selected through the literature review. Training the algorithms on the "Air Quality Data in India (2015-2020)" data set showed that Ridge Regression has the lowest MAE and RMSE and the highest R-squared, which indicates that it performs best in predicting the AQI.
Conclusions: Four models were built by training Linear Regression, Ridge Regression, LASSO Regression, and SVR on the "Air Quality Data in India (2015-2020)" data set. After experimentation, it was found that Ridge Regression and LASSO Regression have the better performance in the prediction of AQI.
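The three metrics named in the results (MAE, RMSE, and R-squared) can be illustrated with a minimal sketch. The function names and the AQI values below are assumptions made up for illustration; they are not taken from the thesis or its data set, and the thesis may have computed these metrics with a library rather than by hand.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more than MAE does."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: share of variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI values, for illustration only (not thesis data).
actual = [100.0, 150.0, 200.0, 250.0]
model_a = [110.0, 140.0, 210.0, 240.0]  # predictions of a hypothetical model A
model_b = [130.0, 120.0, 230.0, 220.0]  # predictions of a hypothetical model B

for name, predicted in (("model A", model_a), ("model B", model_b)):
    print(name, mae(actual, predicted), rmse(actual, predicted),
          r_squared(actual, predicted))
```

In this made-up example, model A has the lower MAE and RMSE and the higher R-squared, which is the pattern the abstract uses to rank the trained models.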
Place, publisher, year, edition, pages
2022.
Keywords [en]
Air Quality Index, Air Quality Monitoring, Machine Learning, Regression
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-23439
OAI: oai:DiVA.org:bth-23439
DiVA, id: diva2:1681590
Subject / course
DV1478 Bachelor Thesis in Computer Science
Educational program
DVGDT Bachelor Qualification Plan in Computer Science 60.0 hp
Presentation
2022-05-27, J1640, Blekinge Institute of Technology, Karlskrona, 10:30 (English)
Supervisors
Examiners
2022-07-08 2022-07-07 2022-07-08 Bibliographically approved