Day: 26 May 2026

How can we detect manipulation by large language models (LLMs)? A collaboration between Jakub Reš, a PhD student at the Faculty of Information Technology at Brno University of Technology, and Red Hat is seeking an answer

Tags: partner

Large language models (LLMs) have been among the most frequently discussed topics in artificial intelligence in recent years. Technology that works with human language, such as that behind generative chatbots, is increasingly finding its way into everyday use, both in business and in private life. Naturally, attempts to manipulate or misuse this technology are growing in frequency and severity, and with them the importance of securing these models.

It was in this field that doctoral research from FIT VUT found an interesting application in corporate practice last year, and with a leading global developer of open-source software solutions for businesses at that. Jakub Reš, a doctoral student working under the supervision of Assoc. Prof. Kamil Malinka and a member of the Security@FIT research group, began a research internship at Red Hat. Broadly speaking, their collaboration focuses on securing next-generation artificial intelligence systems against manipulation.

A specific example of such LLM manipulation is a phenomenon known as fact injection. Its goal is clear: to covertly modify the language model so that it confidently presents fabricated data as true, thereby undermining its reliability and intentionally deceiving users. Attackers achieve this by making highly targeted, precise, and difficult-to-detect adjustments to the model’s parameters. A major challenge is distinguishing these malicious modifications from legitimate ones, such as harmless model updates. The impact can be severe: widespread dissemination of sophisticated disinformation leading to erroneous decisions in sensitive areas (e.g., finance), damage to an organization’s reputation, or, more generally, a loss of users’ trust in artificial intelligence systems. Cybersecurity research, meanwhile, focuses on the technical aspects of attacks with the aim of eliminating their impact.

Reš defines the field in which he has recently begun collaborating with Red Hat as follows: the integrity of language models and their security. This is, of course, a broad definition, so he immediately adds a clarification: he is interested in verifying the integrity of AI models, i.e., developing software methods for preventing or detecting undisclosed manipulation of AI models.

Interference with the functioning of LLMs can occur in several scenarios. It can happen at any point between the model’s creation by the developer and its use by the end user. The problem may also arise on the creator’s side. A third possibility involves community contributors to the model (developers with their own datasets), who routinely modify it to improve it for a specific domain (when users do not want a general-purpose model, but perhaps one focused on programming).

And what exactly makes manipulation of LLMs harmful? This depends heavily on the specific use case. “Let me give an example: a chatbot on a gardening website that recommends competitors or links to inappropriate products. However, there are also abuses leading to potential massive political influence on users or, for instance, the generation of dangerous code. And a very serious manifestation of hidden model modifications can be erroneous advice regarding our health,” says Jakub Reš, outlining the potential risks.

The verification he is working on combines two main approaches: a) prevention, which ensures trustworthy computations from the model source all the way to users by means of cryptography and digital model signatures; b) detection of undisclosed modifications, where it is necessary to delve deep into the model and examine not only how it behaves but also its “inner workings”: how much it has been modified and how this affects its internal states.
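To make the two approaches more concrete, here is a minimal illustrative sketch in Python. It is not the method used by Reš or Red Hat, whose actual techniques the article does not describe. All names (`fingerprint`, `sign`, `verify`, `relative_change`) and the toy data are hypothetical. For prevention, an HMAC over a SHA-256 digest stands in for a real digital signature, which in practice would use asymmetric keys. For detection, a per-layer relative distance between a trusted copy of the weights and a candidate copy is only one simple way to measure how much a model has been modified.

```python
import hashlib
import hmac
import math


def fingerprint(weights: bytes) -> str:
    """SHA-256 digest of the serialized model weights."""
    return hashlib.sha256(weights).hexdigest()


def sign(digest: str, key: bytes) -> str:
    """Publisher side: tag the digest.

    Assumption: HMAC (a shared-secret scheme) stands in for a real
    digital signature made with an asymmetric key pair.
    """
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def verify(weights: bytes, tag: str, key: bytes) -> bool:
    """User side: recompute the tag and compare in constant time."""
    expected = sign(fingerprint(weights), key)
    return hmac.compare_digest(expected, tag)


def relative_change(reference: dict, candidate: dict) -> dict:
    """Per-layer relative L2 distance between two sets of weights."""
    changes = {}
    for name, ref in reference.items():
        cand = candidate[name]
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, cand)))
        norm = math.sqrt(sum(a * a for a in ref)) or 1.0  # avoid division by zero
        changes[name] = diff / norm
    return changes


# a) Prevention: any change to the published bytes invalidates the tag.
key = b"demo-key"                      # hypothetical shared secret
published = b"\x00\x01\x02\x03"        # toy stand-in for a weights file
tag = sign(fingerprint(published), key)
untouched_ok = verify(published, tag, key)
tampered_ok = verify(b"\x00\x01\x02\x04", tag, key)

# b) Detection: see which layers moved, and by how much.
trusted = {"layer0": [3.0, 4.0], "layer1": [1.0, 0.0]}
suspect = {"layer0": [3.0, 4.0], "layer1": [1.0, 0.5]}
changes = relative_change(trusted, suspect)
```

In this toy run the untouched file verifies, the tampered one does not, and only `layer1` shows a non-zero change. A real detector would also have to tell such a change apart from a legitimate update, which is the hard part the article points to.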

How did Reš find his way to a major player in software development? “If you have an interesting research topic, don’t keep it to yourself. Talk to people, spread awareness about it. Sometimes you’ll find like-minded people offering to collaborate. In my case, the main thanks go to Martin Ukrop and Marek Grác from Red Hat,” the student says, recalling the start of the collaboration. The research internship on which the collaboration is based is, in his view, a little-known and underutilized way of involving students in corporate practice. “It’s not very well known among students, certainly less so than a traditional internship, and that’s a shame.” Yet the connection between university research and corporate practice often gives students a strong sense of purpose and motivation, as Reš confirms: “For me, it means a clear vision of what I want to achieve and what it will lead to. I won’t stop at an article that will just sit on a shelf somewhere. Academic results can be put to immediate practical use, and there’s someone out there who might be interested in this. For me, it’s an incredible source of motivation, a spark for what I do.”

We hope Jakub Reš continues to enjoy this collaboration throughout his doctoral studies. It certainly has meaning and social impact.

Group photo: Security@FIT
