Dissertation Topic

Improving performance of Large Language Models for downstream tasks

Academic Year: 2024/2025

Supervisor: Bieliková Mária, prof. Ing., Ph.D.

Department: Department of Computer Graphics and Multimedia

Programs:
Information Technology (DIT) - combined study
Information Technology (DIT-EN) - combined study

Large language models (LLMs) are increasingly used for a wide range of downstream tasks, where they often perform well in zero-shot and few-shot settings compared to specialized fine-tuned models, especially on tasks where they can draw on the vast knowledge acquired during pre-training. However, they lag behind specialized fine-tuned models on tasks that require more specific domain knowledge and adaptation. Additionally, they often suffer from problems such as hallucinations, i.e., coherent but factually false or nonsensical outputs, and generated text laden with biases propagated from the pre-training data. Various approaches have recently been proposed to address these issues, such as improved prompting strategies including in-context learning, retrieval-augmented generation, or adaptation of LLMs through efficient fine-tuning.

Each of these approaches (or a combination thereof) presents opportunities for new discoveries. Orthogonal to these approaches are other important properties of models, such as their level of alignment with human values, their robustness, and their explainability or interpretability; advances in these areas are welcome as well, both in AI generally and in the mentioned approaches in particular.

There are many downstream tasks to which research on LLM adaptation methods can be applied. These include (but are not limited to) detection of false information (disinformation), detection of credibility signals, auditing of social media algorithms and their tendency to spread disinformation, and support for manual or automated fact-checking.

Relevant publications:

Macko, D., Moro, R., Uchendu, A., Lucas, J.S., Yamashita, M., Pikuliak, M., Srba, I., Le, T., Lee, D., Simko, J., and Bielikova, M., 2023. MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. https://arxiv.org/abs/2310.13606
Vykopal, I., Pikuliak, M., Srba, I., Moro, R., Macko, D., and Bielikova, M., 2023. Disinformation Capabilities of Large Language Models. Preprint at arXiv. https://arxiv.org/abs/2311.08838

The research will be performed at the Kempelen Institute of Intelligent Technologies (KInIT, https://kinit.sk) in Bratislava, in collaboration with industrial partners or researchers from highly respected research units involved in international projects. A combined (external) form of study and full-time employment at KInIT are expected.