Università LUM Giuseppe Degennaro - Research site, Institutional Research Information System


A Generalized Approach for Feature Selection in Water Quality Monitoring

Pavone, Marino;Epicoco, Nicola;Magliocca, Francesco;Pola, Giordano

2023-01-01

Abstract

The application of Artificial Intelligence (AI) and Machine Learning (ML) to IoT smart sensor technologies has opened wide possibilities in the field of Water Quality Monitoring (WQM). Power-saving and price-per-unit requirements, which are fundamental for Wide Distributed Sensor Networks (WDSN), drive research into AI-model reduction techniques that make algorithms faster and cheaper in terms of hardware resources and battery consumption. Before any optimization process, Feature Selection (FS) is needed to reduce the number of basic operations in the smart sensor workflow, thus lightening the data acquisition phase and decreasing the size of the input data for the subsequent AI process. However, selecting the FS method that best fits the specific requirements of the considered application is not trivial, given the many available FS methods and the large number of possible feature subsets. In this context, this paper presents a generalized and versatile algorithm, based on the concept of ensemble FS, to support and speed up the AI-unit design process. The method compares different FS methods, providing information about the accuracy (and any other requirement) of the selected FS method with respect to the number of acquired features. The proposed methodology is tested on a real WQM case study by analyzing the results obtained with both the popular, high-performing XGBoost algorithm and some ready-to-use FS ranker methods available in the Waikato Environment for Knowledge Analysis (WEKA). Results show that XGBoost is the best performer for the case study in terms of stability and accuracy.
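The comparison described in the abstract, accuracy of each FS method as a function of the number of retained features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' algorithm: the data are synthetic, the two rankers (`fisher_score`, `stump_accuracy`) are simple stand-ins rather than XGBoost or the WEKA rankers, the evaluator is a nearest-centroid classifier, and all function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def make_data(n=200, seed=0):
    """Synthetic two-class data: features 0 and 1 are informative, 2-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternating labels keep both classes balanced
        X.append([
            rng.gauss(3.0 * label, 1.0),  # strongly informative
            rng.gauss(1.0 * label, 1.0),  # weakly informative
            rng.gauss(0.0, 1.0),          # noise
            rng.gauss(0.0, 1.0),          # noise
            rng.gauss(0.0, 1.0),          # noise
        ])
        y.append(label)
    return X, y


def split_by_class(col, y):
    a = [v for v, t in zip(col, y) if t == 0]
    b = [v for v, t in zip(col, y) if t == 1]
    return a, b


def fisher_score(col, y):
    """Stand-in ranker 1: squared class-mean gap over summed class variances."""
    a, b = split_by_class(col, y)
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)


def stump_accuracy(col, y):
    """Stand-in ranker 2: training accuracy of a single midpoint threshold."""
    a, b = split_by_class(col, y)
    thr = (mean(a) + mean(b)) / 2
    high = 1 if mean(b) > mean(a) else 0  # class predicted above the threshold
    hits = sum((high if v > thr else 1 - high) == t for v, t in zip(col, y))
    return hits / len(y)


def rank_features(X, y, scorer):
    """Return feature indices sorted from most to least relevant."""
    scores = [scorer(col, y) for col in zip(*X)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(Xtr, ytr, Xte, yte, feats):
    """Test accuracy of a nearest-centroid classifier restricted to `feats`."""
    cents = {}
    for label in set(ytr):
        rows = [r for r, t in zip(Xtr, ytr) if t == label]
        cents[label] = [mean(r[j] for r in rows) for j in feats]
    hits = 0
    for r, t in zip(Xte, yte):
        pred = min(cents, key=lambda c: sum(
            (r[j] - m) ** 2 for j, m in zip(feats, cents[c])))
        hits += pred == t
    return hits / len(yte)


def accuracy_curves(X, y, scorers, split=0.7):
    """For each ranker, accuracy when keeping its top-k features, k = 1..n."""
    cut = int(len(X) * split)
    Xtr, ytr, Xte, yte = X[:cut], y[:cut], X[cut:], y[cut:]
    curves = {}
    for name, scorer in scorers.items():
        order = rank_features(Xtr, ytr, scorer)
        curves[name] = [centroid_accuracy(Xtr, ytr, Xte, yte, order[:k])
                        for k in range(1, len(order) + 1)]
    return curves


if __name__ == "__main__":
    X, y = make_data()
    scorers = {"fisher": fisher_score, "stump": stump_accuracy}
    for name, curve in accuracy_curves(X, y, scorers).items():
        print(name, [round(a, 2) for a in curve])
```

Each printed list is one ranker's accuracy curve over the number of retained features. In a design setting, such curves could be read alongside sensor cost or power constraints to choose both the ranker and the feature count.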


Year: 2023
ISBN: 979-8-3503-1543-1
Keywords: Feature Selection; Tiny Machine Learning; WEKA; XG-Boost
Appears in types: 4.1 Conference proceedings contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12572/14405
