News
Day: 11 November 2025
In November, Ján Čegiň from the Institute of Computer Graphics and Multimedia will defend his dissertation
Ing. Ján Čegiň will present his dissertation on Tuesday, November 25, 2025, at 9:30 a.m. in room G108. The thesis, entitled "Machine Learning With Human in the Loop for Textual Augmentation in the Era of LLMs," was written under the supervision of doc. Jakub Šimko (ÚPGM).
Čegiň's thesis responds to rapid advances in large language models (LLMs), which have sparked interest in their potential to improve data augmentation, especially compared with traditional human-based methods. Data augmentation, that is, creating new training data without collecting additional real-world samples, is key to improving artificial intelligence models. Traditionally, this process has required costly and time-consuming crowdsourcing. The work explores how LLMs can not only replace human workers but in some cases even outperform them in generating diverse, valid, and cost-effective training data. "This work bridges human computational labor and artificial intelligence techniques, creating space for more efficient, scalable, and sustainable approaches to training smaller and more efficient models," the author says, summarizing the contribution of the work in the most general terms.
In his research, Čegiň addresses the following main questions:
- How effective are LLMs compared to human workers in data augmentation?
- How transferable are human computation techniques to LLM prompting?
- What are the costs and benefits of an LLM-based approach compared to traditional methods?
Through extensive experiments, Čegiň demonstrates that LLMs can generate more diverse and valid text data than human workers while significantly reducing costs. "We also showed that techniques inspired by human behavior (e.g., providing examples as hints) improve the performance of subsequent models. And we also found that LLM-based augmentation is particularly valuable in data-scarce environments, i.e., when few labeled examples are available," he adds.
You can read the abstract of the dissertation here.
You are cordially invited to attend the defense!
