A novel feature engineering algorithm for air quality datasets

Raja Sher Afgun Usmani, Wan Nurul Farah Binti Wan Azmi, Akibu Mahmoud Abdullahi, Ibrahim Abaker Targio Hashem, Thulasyammal Ramiah Pillai

Abstract


Feature engineering (FE) is one of the most important steps in data science research, because it produces the features used in later stages of a study. Due to climate change, research in Malaysia is increasingly focused on air quality estimation and the health impacts of air pollution. Malaysia has 66 air quality monitoring (AQM) stations, and the Department of Environment, Malaysia, provides the air quality data for research as Excel worksheets. The data generated by the AQM stations is in a raw custom format, and the sheer number of files makes it virtually impossible to clean and engineer this data manually. Hence, we propose a novel feature engineering algorithm to transform and combine this data into a usable format. The results show that the proposed algorithm efficiently extracts and combines the hourly and daily values of pollutant and meteorological variables into a useful row format. The algorithm will help researchers who use data from the AQM stations in Malaysia, as well as researchers in other countries that use the same AQM stations. An implementation of the algorithm is available on GitHub (https://github.com/rajasherafgun/featureengineeringaq) under the AFL-3.0 license.
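As a rough illustration of the kind of transformation described above, the following minimal Python sketch flattens a wide sheet (one row per date, one column per hour) into one record per hourly reading and then derives daily means. It is not the authors' implementation: the input layout, the function names `wide_to_rows` and `daily_means`, the station code `S01`, and the sample values are all assumptions made for this example, since the abstract does not specify the raw custom format.

```python
from collections import defaultdict
from statistics import mean


def wide_to_rows(sheet, station, pollutant):
    """Flatten a hypothetical wide sheet into one record per hourly reading."""
    rows = []
    for record in sheet:
        for hour, value in enumerate(record["hourly"]):
            if value is None:  # skip missing readings
                continue
            rows.append({
                "station": station,
                "date": record["date"],
                "hour": hour,
                "pollutant": pollutant,
                "value": value,
            })
    return rows


def daily_means(rows):
    """Average the hourly values per (station, date, pollutant)."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["station"], r["date"], r["pollutant"])].append(r["value"])
    return {key: mean(values) for key, values in groups.items()}


# Made-up sample data standing in for one pollutant worksheet (assumed layout).
pm10_sheet = [
    {"date": "2018-01-01", "hourly": [40.0, 42.0, None, 44.0]},
    {"date": "2018-01-02", "hourly": [30.0, 34.0]},
]

hourly = wide_to_rows(pm10_sheet, station="S01", pollutant="PM10")
daily = daily_means(hourly)
```

In this sketch, `hourly` holds one dictionary per non-missing reading and `daily` maps each (station, date, pollutant) key to its mean, which mirrors the hourly and daily row outputs mentioned in the abstract only in spirit.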

Keywords


Air pollution; Feature engineering; Data cleaning; Air quality; Air quality monitoring station; Data science

DOI: http://doi.org/10.11591/ijeecs.v19.i3.pp1444-1451

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
