Indoor occupancy monitoring using environmental feature fusion and semi-supervised machine learning models.
Smart buildings optimize energy consumption and occupant comfort through the management of heating, ventilation, and air conditioning (HVAC) and lighting. Nevertheless, large venues require data fusion techniques to improve analysis and forecasting. This study evaluates the effectiveness of different feature fusion techniques, environmental sensors, and semi-supervised learning for estimating indoor occupancy in a 230 m² office. Using five Internet of Things (IoT) devices measuring air temperature, relative humidity, and barometric pressure, data were collected for 99 days with 6800 entries (on average), only 14% of which were labeled. Eight feature selection methods were evaluated along with three supervised and two semi-supervised classification methods. Results indicate that the Chi-squared-based approach to feature fusion outperformed the others. Similarly, the semi-supervised Self-Training model achieved better performance than the supervised methods. This research shows that combining semi-supervised learning and data fusion allows the occupancy level in large indoor spaces to be estimated with high accuracy and low labeling costs.

Highlights

This study pioneers the exploration of semi-supervised learning and distinct feature fusion methods for estimating indoor occupancy levels in a 230 m² open office using only IoT environmental sensors (air temperature, relative humidity, and barometric pressure).

A comprehensive comparison of statistical methods, feature selection, and dimensionality reduction techniques is conducted to determine their ability to generate robust feature fusion sets.

The feature fusion selected through the Chi-squared test stood out with a high F1-score (average of 0.95) and an average accuracy of 0.99.

The Self-Training model achieved the best performance among the semi-supervised methods, with an average F1-score of 0.90 and an average accuracy of 0.97, based on a dataset with a large proportion of unlabeled data (16,847 entries) and only 9367 labeled entries.

For supervised learning, Random Forest achieved a high accuracy (average of 0.98) and F1-score (average of 0.93) across various feature sets.

[ABSTRACT FROM AUTHOR]
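The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which base classifier, confidence measure, or threshold was used. The nearest-centroid classifier, the distance-based confidence score, the `threshold` value, the function names, and the sample readings (air temperature in °C, relative humidity in %) are all assumptions chosen only to keep the example short and self-contained.

```python
from math import dist


def centroids(X, y):
    """Mean feature vector per class, computed from the labeled samples."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_with_confidence(cents, row):
    """Return the nearest class and a heuristic confidence score.

    The score is 1 minus the share of the summed centroid distances taken
    by the nearest centroid (an assumption of this sketch; it expects at
    least two classes).
    """
    distances = {c: dist(row, m) for c, m in cents.items()}
    best = min(distances, key=distances.get)
    total = sum(distances.values())
    conf = 1.0 - distances[best] / total if total > 0 else 1.0
    return best, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_iter=10):
    """Self-training loop: fit, pseudo-label confident samples, refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        cents = centroids(X_lab, y_lab)
        remaining, added = [], 0
        for row in pool:
            label, conf = predict_with_confidence(cents, row)
            if conf >= threshold:
                # Confident prediction: treat it as a label from now on.
                X_lab.append(row)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(row)
        pool = remaining
        if added == 0 or not pool:
            break  # nothing new was learned, or nothing is left to label
    return centroids(X_lab, y_lab), pool


# Hypothetical readings: (air temperature in °C, relative humidity in %).
X_labeled = [(20.0, 40.0), (20.5, 41.0), (24.0, 50.0), (24.5, 51.0)]
y_labeled = ["empty", "empty", "occupied", "occupied"]
X_unlabeled = [(20.2, 40.5), (24.2, 50.5), (22.3, 45.6)]

model, leftover = self_train(X_labeled, y_labeled, X_unlabeled)
print(predict_with_confidence(model, (24.4, 50.8))[0])
print(leftover)  # samples that never reached the confidence threshold
```

In this sketch the two unlabeled readings close to a class centroid are pseudo-labeled in the first pass, while the ambiguous reading between the classes stays unlabeled. This is the general behavior that lets a self-training model benefit from a large unlabeled pool without propagating low-confidence guesses.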
Copyright of Journal of Building Performance Simulation is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)