Security measures and controls designed to protect against risks associated with large language models, including prompt injection, data extraction, and model manipulation attacks.
LLM security addresses the risks that arise when employees use language models for work tasks. Threats include prompt injection attacks that coax the model into revealing sensitive data, jailbreaking attempts that bypass safety measures, and indirect prompt injection delivered through malicious documents. Organizations must implement input sanitization, output filtering, access controls, and monitoring of LLM interactions to prevent data exposure and maintain security boundaries. The OWASP Top 10 for LLM Applications provides a framework for categorizing and mitigating these risks.
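A minimal sketch of what input sanitization, output filtering, and interaction monitoring can look like around an LLM call. The patterns, function names (`sanitize_input`, `filter_output`, `guarded_completion`, `call_model`), and regexes below are illustrative assumptions, not a production control set or any particular vendor's API.

```python
import re

# Phrases commonly seen in prompt-injection / jailbreak attempts (illustrative only).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
    re.compile(r"you are now in developer mode", re.IGNORECASE),
]

# Sensitive data that should not cross the security boundary in a response.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "api_key": re.compile(r"(?:sk|key)-[A-Za-z0-9]{16,}"),
}


def sanitize_input(prompt: str) -> str:
    """Reject prompts that match known injection patterns."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            raise ValueError(f"Potential prompt injection detected: {pattern.pattern}")
    return prompt


def filter_output(response: str) -> str:
    """Redact sensitive data from model output before returning it to the user."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        response = pattern.sub(f"[REDACTED {label.upper()}]", response)
    return response


def guarded_completion(prompt: str, call_model) -> str:
    """Wrap an arbitrary model call with input sanitization, output filtering,
    and a simple audit record of the interaction."""
    clean_prompt = sanitize_input(prompt)
    raw_response = call_model(clean_prompt)  # call_model stands in for any LLM client
    safe_response = filter_output(raw_response)
    print(f"audit: prompt_len={len(clean_prompt)} response_len={len(safe_response)}")
    return safe_response


if __name__ == "__main__":
    # Stand-in for a real LLM client; echoes a response containing sensitive data.
    fake_model = lambda p: "Contact admin@example.com with key sk-abcdef1234567890."
    print(guarded_completion("Summarize our security policy.", fake_model))
```

In practice these checks would sit alongside access controls and centralized logging rather than replace them; pattern matching alone cannot catch every injection or leakage path.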