Advancements in Data Augmentation and Transfer Learning: A Comprehensive Survey to Address Data Scarcity Challenges
Source: Recent Advances in Computer Science and Communications, Volume 17, Issue 8, Nov 2024, p. 14 - 35
01 Nov 2024
Abstract
Deep Learning (DL) models have demonstrated remarkable proficiency in image classification and recognition tasks, in some cases surpassing human performance. This gain in performance can be attributed largely to the availability of extensive datasets, and DL models correspondingly have very large data requirements. Extending the learning capability of such models to limited samples remains a challenge, given the intrinsic constraints of small datasets. Several compounding factors, namely limited labeled data, privacy constraints, poor generalization performance, and the cost of annotation, make robust model performance still harder to achieve. To address this issue, our study examines established methodologies, in particular Data Augmentation and Transfer Learning, which offer promising solutions to data scarcity. Data Augmentation enlarges small datasets through a diverse array of strategies: geometric transformations, kernel filters, neural style transfer, random erasing, Generative Adversarial Networks, feature-space augmentation, and adversarial and meta-learning training paradigms. Transfer Learning, in turn, leverages pre-trained models to transfer knowledge between models or to retrain a model on an analogous dataset. Through this survey, we show how applying the two techniques together can significantly improve classification performance by effectively enlarging scarce datasets. The resulting increase in available data not only addresses the immediate challenges posed by limited datasets but also helps DL applications realize the benefits usually associated with Big Data.
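
As an illustration of two of the augmentation strategies named in the abstract, the following is a minimal sketch of a geometric transformation (horizontal flip) and random erasing. It is not taken from the survey: the function names, the list-of-rows image representation, the 2 x 2 erase size, and the 0.5 flip probability are all assumptions chosen to keep the example self-contained.

```python
import random


def horizontal_flip(image):
    """Geometric transformation: mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def random_erase(image, erase_h, erase_w, fill=0, rng=None):
    """Random erasing: overwrite one randomly placed rectangle with `fill`."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    # Pick the top-left corner so the rectangle fits inside the image.
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    erased = [row[:] for row in image]  # copy so the input stays unchanged
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            erased[r][c] = fill
    return erased


def augment(image, rng=None):
    """Hypothetical pipeline: flip with probability 0.5, then erase a 2 x 2 patch."""
    rng = rng or random.Random()
    flipped = horizontal_flip(image) if rng.random() < 0.5 else [row[:] for row in image]
    return random_erase(flipped, erase_h=2, erase_w=2, rng=rng)


if __name__ == "__main__":
    sample = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    # Each call yields a new variant of the same labeled sample.
    for variant in (augment(sample, random.Random(seed)) for seed in range(3)):
        print(variant)
```

Calling `augment` repeatedly on each training image produces several label-preserving variants per sample, which is the sense in which augmentation enlarges a small dataset. In practice an image library would handle these operations on real pixel arrays.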