Post

LLM - OWASP Top 10 for LLM

OWASP Top 10 for LLM

Table of contents:

ref:

  • https://arxiv.org/abs/2310.02059
  • https://neo4j.com/developer-blog/knowledge-graphs-llms-multi-hop-question-answering/

Overview

  • This undertaking is not a one-time effort but a continuous process, mirroring the ever-evolving nature of cyber threats. With the rapid advancements in LLMs, their potential for both utility and abuse will continue to grow, making the task of security a continually moving target that demands the attention and expertise.

  • In closing, the quest for robust security measures for LLMs is ongoing.

  • ensure that the tools are not just powerful and effective, but also secure and ethically used.


OWASP Top 10 for LLM

  • The frenzy of interest in Large Language Models (LLMs) following the mass-market pre-trained chatbots in late 2022 has been remarkable.

  • Businesses, eager to harness the potential of LLMs, are rapidly integrating them into their operations and client-facing offerings. Yet, the breakneck speed at which LLMs are being adopted has outpaced the establishment of comprehensive security protocols, leaving many applications vulnerable to high-risk issues.

  • The absence of a unified resource addressing these security concerns in LLMs was evident. Developers, unfamiliar with the specific risks associated with LLMs, were left with scattered resources and OWASP’s mission seemed a perfect fit to help drive safer adoption of this technology.

Who is it for?

  • the primary audience is developers, data scientists and security experts tasked with designing and building applications and plug-ins leveraging LLM technologies. aim to provide practical, actionable, and concise security guidance to help these professionals navigate the complex and evolving terrain of LLM security.

The Making of the List

  • The creation of the OWASP Top 10 for LLMs list was a major undertaking, built on the collective expertise of an international team of nearly 500 experts, with over 125 active contributors. the contributors come from diverse backgrounds, including AI companies, security companies, ISVs, cloud hyperscalers, hardware providers and academia.

  • Over the course of a month, brainstormed and proposed potential vulnerabilities, with team members writing up 43 distinct threats. Through multiple rounds of voting, refined these proposals down to a concise list of the ten most critical vulnerabilities. Each vulnerability was then further scrutinized and refined by dedicated sub-teams and subjected to public review, ensuring the most comprehensive and actionable final list.

  • Each of these vulnerabilities, along with common examples, prevention tips, attack scenarios, and references, was further scrutinized and refined by dedicated sub-teams and subjected to public review, ensuring the most comprehensive and actionable final list.

Relating to other OWASP Top 10 Lists

  • While the list shares DNA with vulnerability types found in other OWASP Top 10 lists, do not simply reiterate these vulnerabilities. Instead, delve into the unique implications these vulnerabilities have when encountered in applications utilizing LLMs.

  • the goal is to bridge the divide between general application security principles and the specific challenges posed by LLMs. This includes exploring how conventional vulnerabilities may pose different risks or might be exploited in novel ways within LLMs, as well as how traditional remediation strategies need to be adapted for applications utilizing LLMs.

The Future

  • This first version of the list will not be the last. expect to update it on a periodic basis to keep pace with the state of the industry. will be working with the broader community to push the state of the art, and creating more educational materials for a range of uses. also seek to collaborate with standards bodies and governments on AI security topics. welcome you to join the group and contribute.

Top 10

LLM Blog Cover -- OG

LLM01: Prompt Injection

  • This manipulates a large language model (LLM) through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.

LLM02: Insecure Output Handling

  • This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.

LLM03: Training Data Poisoning

  • This occurs when LLM training data is tampered, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, & books.

LLM04: Model Denial of Service

  • Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and unpredictability of user inputs.

LLM05: Supply Chain Vulnerabilities

  • LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.

LLM06: Sensitive Information Disclosure

  • LLM’s may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.

LLM07: Insecure Plugin Design

  • LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.

LLM08: Excessive Agency 过度代理

  • LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.

LLM09: Overreliance 过度依赖

  • Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.

  • LLM10: Model Theft

    • This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.
This post is licensed under CC BY 4.0 by the author.

Comments powered by Disqus.