THESIS
2023
1 online resource (81 pages) : color illustrations
Abstract
Assessing freshwater algae is pivotal for understanding aquatic ecosystems, yet imbalanced
and scarce data for many algal genera limit the effectiveness of detection models. This
study introduces an innovative method to tackle this issue by leveraging the capabilities of
StyleGAN2-ADA for data augmentation, thereby enhancing the detection performance for
algal images. By generating artificial algal instances using StyleGAN2-ADA, this approach
provides a robust solution to the critical issues of data scarcity and imbalance prevalent in
freshwater algal datasets. The method was applied to a dataset of 645 images of various
freshwater algae genera. A Cascade Mask R-CNN model with a Swin-Tiny backbone, trained on a
combined dataset of real and GAN-generated multi-genera algal images, showed
marked improvements in detection accuracy. The mAP scores for bounding box detection
increased substantially, with a notable improvement in detecting rare algal genera.
Adding 50% more artificial data to the training set proved beneficial, improving
model performance further without requiring excessive use of artificial data.
Moreover, this GAN-based data augmentation technique was successfully applied to both
Swin-Tiny and ResNet-50 backbone models, indicating its versatility and applicability across
different detection backbones. The substantial increase in mAP scores across these
models validates the robustness of GAN-based data augmentation, regardless of each model's
baseline performance. The study also deployed the freshwater algae detection
model as an online service. Converting the model to the ONNX format and hosting it on the
Hugging Face Spaces cloud platform made real-time, precise algae detection accessible
to a broader range of users, improving efficiency and flexibility. The deployed model maintains
high performance while significantly reducing the time required for microscope algae detection.
In conclusion, this study presents a promising data augmentation solution for algae detection
and monitoring, offering a noteworthy contribution to aquatic ecology research and
applications challenged by imbalanced and scarce data.
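
The "50% more artificial data" setting described in the abstract can be illustrated with a minimal sketch. It assumes the ratio means one synthetic image for every two real training images; the abstract does not spell out the exact procedure, and the function name, file names, and parameters below are hypothetical, not taken from the thesis.

```python
import random


def build_training_list(real_paths, synthetic_paths, ratio=0.5, seed=0):
    """Combine real images with a random sample of synthetic ones.

    ratio is the number of synthetic images added per real image
    (0.5 is assumed here to correspond to "50% more artificial data").
    """
    n_extra = int(len(real_paths) * ratio)
    if n_extra > len(synthetic_paths):
        raise ValueError("not enough synthetic images for the requested ratio")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    extra = rng.sample(synthetic_paths, n_extra)
    combined = list(real_paths) + extra
    rng.shuffle(combined)
    return combined


# Hypothetical file names, for illustration only.
real = [f"real_{i}.png" for i in range(400)]
synthetic = [f"gan_{i}.png" for i in range(1000)]
train = build_training_list(real, synthetic, ratio=0.5)
print(len(train))
```

Sampling a fixed fraction, rather than using every generated image, keeps the real data in the majority, which matches the abstract's remark about avoiding excessive use of artificial data.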