THESIS
2023
1 online resource (xii, 73 pages) : color illustrations
Abstract
Automated algae classification using machine learning is more efficient and effective than
manual classification, which can be tedious and time-consuming. However, the
practical application of this classification approach is restricted by the scarcity of labeled
freshwater algae datasets, especially for rarer algae. To overcome this limitation, this study
generates artificial algal images with StyleGAN2-ADA and uses both the real and the
generated images to train machine-learning-driven algae classification models. This approach
significantly enhances the performance of classification models, particularly in their ability to
identify rare algae. Overall, the proposed approach improves the F1-score of lightweight
MobileNetV3 classification models covering all 20 of the freshwater algae in this research from
88.4% to 96.2%, while for the models that cover only the rarer algae, the experiments show an
improvement from 80% to 96.5% in terms of F1-score. The results show that the approach
enables the trained algae classification systems to effectively cover algae with limited image
data. Additionally, a multi-genera algae detection system is developed to detect the genera
and locations of algae in microscopic images. In particular, a multi-genera algal image dataset
is prepared by image composition and annotation generation. The composed images are used
as additional training data to enrich the original dataset. Using both the original and the artificial
images, a mixed-image model significantly outperforms the baseline real-image model,
improving bounding-box mAP by 16.4% for all 20 algae and by 30.2% for rare algae.
Finally, this research improves the accessibility of the algae detection system by
converting the trained model to the ONNX format, developing a user-friendly web
interface with Gradio, and hosting the entire system on Hugging Face Spaces. This research
developed an automated algae classification and detection system from a limited and
imbalanced dataset, contributed to the early detection and management of harmful
algal blooms, and highlighted the potential of machine learning in environmental
management.
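
The abstract mentions preparing a multi-genera dataset by image composition and annotation generation but does not give the procedure. The following is only a minimal sketch of one way such a step could look, not the thesis's method: single-alga crops are pasted at random, non-overlapping positions on a blank canvas, and a bounding box is recorded for each paste. The function names, the grayscale nested-list image representation, the blank background, and the labels `genus_a` and `genus_b` are all assumptions made for illustration.

```python
import random


def boxes_overlap(a, b):
    """Return True if two (x0, y0, x1, y1) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def compose_image(crops, canvas_w, canvas_h, background=255, max_tries=50, seed=0):
    """Paste labelled crops onto a blank canvas (hypothetical helper).

    crops: list of (label, pixels), where pixels is a list of rows of
           grayscale values (an assumed, dependency-free representation).
    Returns (canvas, annotations); each annotation holds the label and
    the pixel bounding box (x0, y0, x1, y1) of the pasted crop.
    """
    rng = random.Random(seed)
    canvas = [[background] * canvas_w for _ in range(canvas_h)]
    annotations = []
    for label, pixels in crops:
        h, w = len(pixels), len(pixels[0])
        if w > canvas_w or h > canvas_h:
            continue  # crop is larger than the canvas, so skip it
        for _ in range(max_tries):
            x0 = rng.randint(0, canvas_w - w)
            y0 = rng.randint(0, canvas_h - h)
            box = (x0, y0, x0 + w, y0 + h)
            # Reject positions that would cover an earlier paste.
            if any(boxes_overlap(box, a["bbox"]) for a in annotations):
                continue
            for dy, row in enumerate(pixels):
                canvas[y0 + dy][x0:x0 + w] = row
            annotations.append({"label": label, "bbox": box})
            break
    return canvas, annotations


# Usage: two placeholder crops composed onto a 16 x 16 canvas.
example_crops = [
    ("genus_a", [[0] * 4 for _ in range(3)]),
    ("genus_b", [[50] * 2 for _ in range(2)]),
]
example_canvas, example_annotations = compose_image(example_crops, 16, 16)
```

Because the annotations are produced at paste time, every composed image comes with exact bounding boxes and needs no manual labelling. A real pipeline would likely also blend crops into microscope backgrounds and export the boxes in the detector's annotation format.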