Regularization Frameworks for Overfitting Prevention in Language Model Fine-Tuning
Abstract
Regularization is a core technique in both large-scale machine learning and deep learning. As language models grow in scale and complexity, they become increasingly prone to overfitting when they are applied to specific tasks. We present adaptive regularization techniques as a way to mitigate overfitting in large-scale language model fine-tuning. We examine established approaches such as dropout and weight decay, as well as state-of-the-art methods such as adaptive weight noise and differential privacy. The study assesses these methods and their impact on model performance. The findings inform how these techniques can be used to improve generalization and accuracy on the target task, rather than producing models that fall short of the intended goal.