Safe LLM applications
Generative Artificial Intelligence (GenAI) opens up new possibilities for automating complex tasks. Chatbots and AI agents, for example, have spread rapidly across the internet in recent years, replacing search engines, drafting emails, and summarizing books. The underlying Large Language Models (LLMs) can efficiently process and generate natural language, as well as handle images and documents. Studies estimate that GenAI could unlock efficiency gains worth billions of euros across the economy [1].
For safety-critical applications in particular, reliability and robustness are key prerequisites for any technology to deliver lasting practical efficiency gains. However, the statistical nature of GenAI and its hard-to-interpret training process make this a genuine challenge, since there is fundamentally no guarantee of correct or helpful outputs (“hallucinations”). As a result, the gap between what is technologically possible and what is verifiably safe in GenAI applications continues to widen [2].
Fraunhofer Institute for Cognitive Systems IKS