Hallucination
Layerup Security uses a custom-trained hallucination model based on curated proprietary datasets to detect hallucinations.
What is Hallucination?
Hallucination in the context of LLMs refers to the generation of incorrect or fictitious information presented as factual. This phenomenon poses significant security risks and can undermine model reliability.
Why Security Teams Should Prioritize Preventing Hallucination
The prevention of hallucinations is critical for maintaining security in systems employing LLMs for several reasons:
- Prevention of Insecure Output Handling: Hallucinations can result in the generation of insecure or manipulated outputs that might be used to mislead users or misinform decision-making processes. Ensuring the integrity of output is essential to prevent these risks.
- Reduction of Vulnerability to Information Attacks: Fabricated information can be exploited in information warfare or social engineering attacks. Preventing hallucinations helps protect against scenarios where malicious actors could leverage incorrect model outputs to perpetrate fraud or disseminate false information.
- Safeguarding Data Privacy: Hallucinations can inadvertently lead to the disclosure of information that mimics sensitive or private data, which could be mistaken for real data breaches. This is particularly crucial in compliance with data protection regulations.
- Preventing Phishing Attacks: Phishing attacks leverage deceptive techniques to trick users into divulging sensitive information, clicking on harmful links, or unwittingly granting access to secure systems. In the context of LLMs, hallucinations can inadvertently support such schemes by generating or amplifying deceptive content.
How to protect your Gen AI application against Hallucination
The datasets are curated based on hallucinations involving many scenarios, including:
- Assumptions made about the data or prompt that were not explicitly stated
- Incorrect statements based on commonly known facts or general knowledge
Our hallucination model supports RAG-based prompts, long context sizes, and more.
Invoke hallucination detectiong using the layerup.hallucination
guardrail. Additionally, view hallucination score and justifications via the Hallucination Center on the Layerup Security dashboard.