Result Details
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?
Generative large language models (LLMs) are increasingly being used for data
augmentation tasks, where text samples are LLM-paraphrased and then used for
classifier fine-tuning. Previous studies have compared LLM-based augmentations
with established augmentation techniques, but the results are contradictory: some
report the superiority of LLM-based augmentations, while others report only marginal
increases (or even decreases) in the performance of downstream classifiers.
Research confirming a clear cost-benefit advantage of LLMs over more
established augmentation methods is largely missing. To study if (and when)
LLM-based augmentation is advantageous, we compared the effects of recent LLM
augmentation methods with those of established ones on 6 datasets, 3 classifiers, and 2
fine-tuning methods. We also varied the number of seed and collected samples to
better explore the downstream model accuracy space. Finally, we performed
a cost-benefit analysis and show that LLM-based methods are worth deploying
only when a very small number of seed samples is used. Moreover, in many cases,
established methods lead to similar or better model accuracies.
data-efficient training, data augmentation, analysis
@inproceedings{BUT193745,
author="Ján {Čegiň} and Jakub {Šimko}",
title="LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?",
booktitle="Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
year="2025",
pages="10476--10496",
publisher="Association for Computational Linguistics",
address="Albuquerque, New Mexico",
doi="10.18653/v1/2025.naacl-long.526",
isbn="979-8-8917-6189-6",
url="https://aclanthology.org/2025.naacl-long.526/"
}