A Generalized Approach for Feature Selection in Water Quality Monitoring
Epicoco, Nicola;
2023-01-01
Abstract
The application of Artificial Intelligence (AI) and Machine Learning (ML) in IoT smart sensor technologies has opened wide possibilities in the field of Water Quality Monitoring (WQM). Power-saving and price-per-unit requirements, which are fundamental for Wide Distributed Sensor Networks (WDSN), drive research into AI-model reduction techniques that make algorithms faster and cheaper in terms of hardware resources and battery consumption. Before any optimization process, Feature Selection (FS) is needed to reduce the number of basic operations in the smart-sensor workflow, thus lightening the data acquisition phase and decreasing the size of the input data for the subsequent AI process. However, selecting the FS method that best fits the specific requirements of the considered application is not trivial, given the numerous available FS methods and the large number of possible feature subsets. In this context, this paper presents a generalized and versatile algorithm, based on the concept of ensemble-FS, to support and speed up the AI-unit design process. The method compares different FS methods, providing precise information about the accuracy (and any other requirement) of the selected FS method as a function of the number of acquired features. The proposed methodology is tested on a real WQM case study, using both the popular, high-performing XGBoost algorithm and several ready-to-use FS-ranker methods available in the Waikato Environment for Knowledge Analysis (WEKA). Results show that XGBoost is the best performer for the case study in terms of stability and accuracy.
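
The comparison described in the abstract, in which several FS rankers are each scored by accuracy as a function of the number of acquired features, can be illustrated with a minimal sketch. This is not the paper's algorithm. The data are synthetic, the two rankers (absolute correlation and variance) are simple stand-ins for XGBoost and the WEKA rankers, the nearest-centroid classifier is a placeholder for the AI unit, and every function name is hypothetical.

```python
import random


def make_data(n=400, seed=0):
    """Synthetic stand-in for sensor readings (assumption, not real WQM data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([
            rng.gauss(3.0 * label, 1.0),  # feature 0: strongly informative
            rng.gauss(1.0 * label, 1.0),  # feature 1: weakly informative
            rng.gauss(0.0, 1.0),          # feature 2: noise
            rng.gauss(0.0, 3.0),          # feature 3: high-variance noise
            rng.gauss(0.0, 1.0),          # feature 4: noise
        ])
        y.append(label)
    return X, y


def mean(v):
    return sum(v) / len(v)


def column(X, j):
    return [row[j] for row in X]


def abs_correlation(v, y):
    """Absolute Pearson correlation between one feature column and the labels."""
    mv, my = mean(v), mean(y)
    cov = sum((a - mv) * (b - my) for a, b in zip(v, y))
    norm = (sum((a - mv) ** 2 for a in v) * sum((b - my) ** 2 for b in y)) ** 0.5
    return abs(cov / norm) if norm else 0.0


def rank_by_correlation(X, y):
    """Supervised ranker: features most correlated with the label come first."""
    scores = [abs_correlation(column(X, j), y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def rank_by_variance(X, y):
    """Unsupervised ranker: highest-variance features come first (ignores y)."""
    scores = []
    for j in range(len(X[0])):
        col = column(X, j)
        m = mean(col)
        scores.append(sum((a - m) ** 2 for a in col) / len(col))
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def nearest_centroid_accuracy(Xtr, ytr, Xte, yte, feats):
    """Placeholder classifier: assign each test row to the closest class centroid."""
    centroids = {}
    for label in set(ytr):
        rows = [r for r, l in zip(Xtr, ytr) if l == label]
        centroids[label] = [mean([r[j] for r in rows]) for j in feats]
    hits = 0
    for row, label in zip(Xte, yte):
        dists = {l: sum((row[j] - cj) ** 2 for j, cj in zip(feats, c))
                 for l, c in centroids.items()}
        hits += min(dists, key=dists.get) == label
    return hits / len(yte)


def accuracy_curve(ranker, Xtr, ytr, Xte, yte):
    """Accuracy obtained with the top-k ranked features, for k = 1..n_features."""
    order = ranker(Xtr, ytr)
    return [nearest_centroid_accuracy(Xtr, ytr, Xte, yte, order[:k])
            for k in range(1, len(order) + 1)]


X, y = make_data()
split = int(0.7 * len(X))  # simple hold-out split, chosen only for illustration
Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]

rankers = {"correlation": rank_by_correlation, "variance": rank_by_variance}
curves = {name: accuracy_curve(r, Xtr, ytr, Xte, yte) for name, r in rankers.items()}
for name, curve in curves.items():
    print(name, [round(a, 3) for a in curve])
```

Each printed list is one accuracy-versus-feature-count curve. Placing the curves side by side shows how many features each ranker needs before accuracy levels off, which is the kind of information a designer could use to trade sensing cost against performance.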