Safeguard your Gen AI journey with the OWASP Top 10 for LLM Apps

In this blog, John Sotiropoulos of Kainos and Steve Wilson of Contrast Security share their advice on using the OWASP Top 10 for LLM Applications to manage your security risks.
Date posted: 11 August 2023
Reading time: 8 minutes

On August 1, OWASP published its Top 10 for Large Language Model (LLM) Applications. OWASP is a global non-profit organisation dedicated to software security. Its Top 10 lists have become the reference for application security. The fact that it has turned its attention to LLMs reflects the meteoric rise of those applications.    

The list was produced by a community of nearly 500 experts around the world. Steve Wilson of Contrast Security led the project, and Kainos was an active supporter, with John Sotiropoulos serving on the core team to provide expertise and governance. Working together, the community researched and agreed on the critical security risks for LLM applications. The list aims to give developers and security professionals practical, actionable security guidance, backed by concrete examples and references.

The list can also function as a valuable security compass to help CISOs, CTOs, and Programme and Technology leaders navigate the security risks in this rapidly evolving area.   

Our view is that organisations will adopt Generative AI iteratively. In this blog, we walk you through one common approach and its typical adoption stages, and relate each stage to the OWASP Top 10 for LLM Applications to help you manage your security risks.

Getting started with LLM applications

The first step in harnessing the power of LLMs could be as simple as an application communicating with a public LLM (e.g., OpenAI’s ChatGPT) via an Application Programming Interface (API). The model will most likely be sandboxed, with limited access to the internet and other sources. The user or application communicates with the LLM using prompts.
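For illustration, the pattern can be as small as the sketch below, which calls a hosted model through the OpenAI Python client. The client version and model name are assumptions; substitute whatever your platform provides.

```python
# A minimal sketch of the "simple app talking to a public LLM" pattern,
# using the OpenAI Python client (v1.x). The model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant for our support team."},
        {"role": "user", "content": "Summarise our refund policy in two sentences."},
    ],
)
print(response.choices[0].message.content)
```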

Prompts are free-form text, so it is no surprise that prompt injection sits at the top of the OWASP list. Prompt injections are carefully crafted prompts that cause unintended actions by the LLM. In a sandboxed model, direct prompt injections or jailbreaking can enable attackers to produce harmful or biased content or reveal sensitive data. Jailbreaking refers to bypassing the mechanisms or constraints implemented to ensure the model operates safely, ethically, and within its intended boundaries. Adding guardrails and separating types of content based on trust are among the mitigations in this scenario. Guardrails are additional programmable safety controls applied to interactions with LLMs. NVIDIA NeMo Guardrails and Guardrails AI are two frameworks that can help build guardrails for LLMs.
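As a rough illustration of separating trusted and untrusted content (this is not the API of NeMo Guardrails or Guardrails AI, which go much further), a minimal pre-prompt screen might look like the sketch below. The patterns are hypothetical and easily bypassed, so treat this as defence in depth rather than a complete control.

```python
# Illustrative sketch only: keep trusted instructions and untrusted user text
# in separate message roles, and apply a crude pre-prompt screen.
import re

SUSPICIOUS_PATTERNS = [
    r"ignore (all|previous) instructions",
    r"you are now",
    r"reveal (your|the) system prompt",
]

def screen_user_input(text: str) -> str:
    """Reject obviously adversarial prompts before they reach the model."""
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            raise ValueError("Potential prompt injection detected")
    return text

def build_messages(user_text: str) -> list[dict]:
    # Trusted content lives only in the system role; user content is never
    # concatenated into the trusted instructions.
    return [
        {"role": "system", "content": "Answer questions about our product catalogue only."},
        {"role": "user", "content": screen_user_input(user_text)},
    ]
```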

If another system consumes the output, insecure output handling becomes a related vulnerability that requires output validation. Since the output is free text, this may not always be straightforward, and if you’re ingesting untrusted information into your prompts, you should treat all output from the LLM as untrusted too. Having a human review outputs is an effective defence against both of these vulnerabilities. A good example is code generation, which could otherwise lead to harmful code execution.
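A hedged sketch of what output handling can look like when another system consumes the response, assuming a hypothetical JSON schema for structured output:

```python
# Treat LLM output as untrusted before anything else consumes it: escape it
# for HTML rendering, or validate it against the shape you expect.
import html
import json

def render_safely(llm_output: str) -> str:
    """Escape free-text output before embedding it in a web page."""
    return html.escape(llm_output)

def parse_structured_output(llm_output: str) -> dict:
    """Validate output that downstream systems will act on."""
    data = json.loads(llm_output)  # raises if the model returned non-JSON
    if set(data) != {"action", "reason"}:  # hypothetical schema
        raise ValueError("Unexpected fields in model output")
    if data["action"] not in {"approve", "escalate", "reject"}:
        raise ValueError("Unexpected action value")
    return data
```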

But there is another vital reason for human review, captured in the overreliance entry. LLMs produce incredibly convincing responses, but the results will not always be accurate, especially as LLMs are susceptible to “hallucinations”, where the LLM’s statistical model generates false information yet states it as authoritative. Models are designed to produce plausible-sounding language and are not currently capable of reasoning or fact-checking. Humans tend to trust information that comes from computer systems as accurate, and today that is often not the case with simple LLM queries.

Finally, at this stage, it is important to safeguard the data you use in prompts. Your data could be used to further train the model, and so could potentially be exposed to other users. The sensitive information disclosure entry covers this, and your GDPR compliance could also be at risk. To reduce this risk, you must understand the platform’s terms and conditions and make sure your data isn’t used for any other purpose. The Top 10 covers this in more detail under supply-chain vulnerabilities.
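One hedged way to reduce what leaves your boundary is to redact obvious identifiers before they are placed in a prompt. The patterns below are illustrative only and nowhere near exhaustive; dedicated PII-detection tooling is a better fit for production.

```python
# Illustrative redaction of personal identifiers before text is sent to a
# third-party LLM. The patterns and placeholders are assumptions.
import re

REDACTIONS = {
    r"[\w\.-]+@[\w\.-]+\.\w+": "[EMAIL]",
    r"\b0\d{9,10}\b": "[PHONE]",
}

def redact(text: str) -> str:
    for pattern, placeholder in REDACTIONS.items():
        text = re.sub(pattern, placeholder, text)
    return text

prompt = redact("Please draft a reply to jane.doe@example.com about invoice 1042.")
```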


Extending LLM applications with plugins

Sandboxed models have limited value, so plugins are a popular way to extend a model’s functionality. Plugins rely on the platform’s extensibility APIs. OpenAI offers its own plugin marketplace, with plugins ranging from providing access to external content, such as internet search results, to taking contextual action, such as booking a flight with Expedia or triggering workflows with Zapier. Plugins are REST APIs driven by the model, and the application has no control over their execution. The supply-chain vulnerabilities entry highlights the risks of using third-party plugins without proper diligence.


 

Since the model and user prompts drive plugins, there is little opportunity for the application to perform input or output validation. As a result, plugins can be susceptible to both direct and indirect injections. Indirect prompt injections can include content that biases a response – for instance, links with misinformation – or helps create phishing links and privilege escalation.

We’ve seen examples of Git repository takeovers and email exfiltration because authorisation flows through plugins (the “confused deputy” problem). We’ve also seen plugins impersonating the user’s identity when talking to external applications. Inadequate access control can allow an attacker to exploit excessive agency via a plugin, enabling the LLM to cause damaging effects when driving downstream systems that blindly accept its output.

The insecure plugin design entry covers all of these and explains how to safeguard a plugin if you’re writing one. It focuses on input validation, authentication, authorisation, and explicit user confirmation when the plugin is about to perform a sensitive operation. The entry can be especially useful when evaluating third-party plugins too.
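To make that concrete, here is a sketch of those safeguards on a hypothetical flight-booking plugin endpoint, written with FastAPI. The endpoint name, fields, and helper functions are assumptions rather than any real plugin’s API.

```python
# Sketch of plugin-side safeguards: strict input validation, authentication
# of the caller, and an explicit confirmation step before the sensitive
# operation runs. All names and fields are hypothetical.
import re
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()

class BookingRequest(BaseModel):
    flight_number: str
    passenger_id: str
    confirmed_by_user: bool = False  # the model cannot set this implicitly

@app.post("/book-flight")
def book_flight(request: BookingRequest, authorization: str = Header(...)):
    # Input validation: reject anything that doesn't look like a flight number.
    if not re.fullmatch(r"[A-Z]{2}\d{1,4}", request.flight_number):
        raise HTTPException(status_code=422, detail="Invalid flight number")
    # Authentication and authorisation of the caller.
    if not is_valid_service_token(authorization):
        raise HTTPException(status_code=401, detail="Unauthorised")
    # Explicit user confirmation before the sensitive operation.
    if not request.confirmed_by_user:
        raise HTTPException(status_code=403, detail="User confirmation required")
    return create_booking(request)

def is_valid_service_token(token: str) -> bool: ...   # hypothetical helper
def create_booking(request: BookingRequest): ...      # hypothetical helper
```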

 

Bringing your own data

Organisations will want to enhance the relevance of LLM responses by bringing their own data. For example, a legal firm could develop an application that finds and references legal statutes, making the generation of legal documents faster and more accurate. There are several options for doing this; one is to create appropriate document representations as embeddings, store them in a vector database, and then find them using retrieval-based techniques such as Retrieval-Augmented Generation (RAG) to enrich the model’s responses with your own data.
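A minimal retrieval sketch, assuming the OpenAI embeddings API and an in-memory document list standing in for a vector database, might look like this (the model name and documents are placeholders):

```python
# RAG-style retrieval sketch: embed documents, embed the query, and pull the
# closest matches into the prompt. A real system would use a vector database.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    result = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in result.data])

documents = ["Statute A: limitation periods ...", "Statute B: contract formation ..."]
doc_vectors = embed(documents)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    q = embed([query])[0]
    # Cosine similarity between the query and each document embedding.
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:top_k]]

context = "\n".join(retrieve("What is the limitation period for contract claims?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
```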

The supply-chain vulnerabilities entry is relevant to the tools used to generate embeddings and to the vector databases themselves. OpenAI offers a semantic retrieval plugin to access and update vector databases. Note that the risks discussed under plugins apply here, too: if you use the plugin, you will need to ensure authentication and robust access control are in place.

Access control is a key mitigation for the biggest risk in this area: data confidentiality. The sensitive information disclosure entry discusses the role of access control, especially for external data sources. You can also learn how to prevent prompt injections from being used to leak data from connected services.
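One hedged way to apply that access control is to filter retrieved documents by the requesting user’s entitlements before anything reaches the prompt. The metadata fields and permission model below are assumptions.

```python
# Filter candidate documents by the user's entitlements so the LLM never
# sees data the user isn't allowed to see. The schema is hypothetical.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_groups: set[str]

def authorised_context(candidates: list[Document], user_groups: set[str]) -> list[str]:
    return [doc.text for doc in candidates if doc.allowed_groups & user_groups]

# Usage: run retrieval first, then drop anything the user cannot access.
# context = authorised_context(retrieved_docs, groups_for(current_user))
```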


Hosting and Fine Tuning your own LLM

Organisations may choose to host their own LLMs for privacy reasons, and they may go further and fine-tune them. Fine-tuning adapts a pre-trained LLM to a specific task or domain by further training it on a smaller, task-specific dataset.

This provides complete control over the LLM stack, but it brings additional risks typically found in developing and hosting API solutions. The supply-chain vulnerabilities entry covers hosting providers, while pre-trained models and data sources supplied by third parties are susceptible to training data poisoning.

Researchers have already demonstrated how to achieve this on the Hugging Face model hub, and the breach of Meta accounts on Hugging Face underlines the problem. Training data poisoning can also happen with data you curate, and the entry’s mitigations apply to both scenarios. Model training and fine-tuning bring the risk of sensitive data being memorised and subsequently disclosed. As the entry highlights, unlike retrieval-based approaches (e.g., RAG), it is nearly impossible to apply granular access controls to data memorised during training.
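A small, hedged mitigation that helps with both supply-chain and poisoning risks is to pin and verify the checksums of third-party model and dataset artefacts before they enter your pipeline. The paths and hashes below are placeholders.

```python
# Verify pinned checksums of model and dataset artefacts before loading them.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = {
    "models/base-model.bin": "<published-checksum>",
    "data/fine-tune-set.jsonl": "<checksum-recorded-at-curation-time>",
}

def verify(path: str) -> None:
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    if digest != EXPECTED_SHA256[path]:
        raise RuntimeError(f"Checksum mismatch for {path}; refusing to load it")

for artefact in EXPECTED_SHA256:
    verify(artefact)
```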

 
Model theft is another relevant entry. It covers physical leaks, like the one of a researcher leaking Meta’s LLM (LLaMA), but also cloning a model. Traditional extraction attacks that compare inputs and outputs are unlikely to work against LLMs because of their size, but also because of the variability of their outputs.

Instead, research has demonstrated how a second, baseline LLM (such as LLaMA) can be used to drive the victim model and create appropriate training data, which is then used to replicate the victim model by training the baseline model. Microsoft Research has refined the approach using additional response traces and has replicated OpenAI’s GPT-4 for research purposes. The Top 10 entry discusses preventions focusing on rate limiting and monitoring requests for cloning attacks. Using a second LLM to help you monitor for such attacks is an option to explore.

 
Finally, LLMs introduce their own version of denial-of-service attacks: model denial of service, which exploits prompts to starve the model of resources. Resource capping, API rate limiting, and monitoring are some of the defences against this type of attack.
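A simple illustration of those defences, with illustrative limits, is to cap prompt size and rate-limit requests per caller. The same monitoring hook can also help flag the high-volume querying associated with the cloning attempts discussed above.

```python
# Illustrative resource capping and per-caller rate limiting. Limits are
# assumptions; a production system would use shared, persistent state.
import time
from collections import defaultdict, deque

MAX_PROMPT_TOKENS = 2_000
MAX_REQUESTS_PER_MINUTE = 30
_request_log: dict[str, deque] = defaultdict(deque)

def check_request(caller_id: str, prompt_tokens: int) -> None:
    if prompt_tokens > MAX_PROMPT_TOKENS:
        raise ValueError("Prompt exceeds the resource cap")
    now = time.time()
    window = _request_log[caller_id]
    while window and now - window[0] > 60:
        window.popleft()  # drop requests outside the one-minute window
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        raise RuntimeError("Rate limit exceeded; flag caller for review")
    window.append(now)
```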

Advanced AI Automation

Frameworks like LangChain offer rich abstractions and allow developers to create integrations with various external systems and applications using dynamic workflows (chains), which can also drive multiple LLMs.
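For context, a minimal chain in the style LangChain used at the time of writing looks like the sketch below; the framework’s APIs evolve quickly, so treat it as illustrative rather than current usage.

```python
# A minimal LangChain-style chain (LLMChain + PromptTemplate), as an
# illustration of the abstraction; newer LangChain releases use different APIs.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)
prompt = PromptTemplate(
    input_variables=["ticket"],
    template="Classify this support ticket and suggest a next action:\n{ticket}",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("Customer cannot reset their password."))
```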

These are sophisticated architectures that will take time to mature. Yet the activity level around frameworks like LangChain suggests this area will evolve quickly. Most of the Top 10 items apply here. Their impact becomes more pronounced when the application comprises multiple layers of integration. The frameworks themselves are a supply-chain vulnerability risk, with vulnerabilities already raised and fixed in LangChain.   

Two Top 10 items stand out here: excessive agency and overreliance. Excessive functionality, permissions, and autonomy in rich automation can easily lead to damaging actions in downstream systems driven by LLM responses. Key preventions include least-privilege access, granular authentication, the least functionality needed, and output validation. The complexity of advanced automation may make relying on an LLM an easy option, but the same overreliance risks apply. It is critical to identify major decision points and introduce a human into the loop.
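A hedged sketch of such a decision point: LLM-proposed actions above a certain impact level are routed to a human approver instead of executing automatically. The action names and approval mechanism are assumptions.

```python
# Human-in-the-loop gate for LLM-driven automation. The action schema and
# the set of high-impact actions are hypothetical.
HIGH_IMPACT_ACTIONS = {"delete_record", "send_payment", "merge_pull_request"}

def execute_with_oversight(action: dict, approve_fn, execute_fn):
    """Route high-impact, LLM-proposed actions through a human reviewer."""
    if action["name"] in HIGH_IMPACT_ACTIONS:
        if not approve_fn(action):       # e.g. a ticket or chat approval step
            return {"status": "rejected by reviewer"}
    return execute_fn(action)            # low-impact actions run directly
```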

Adopting a risk-based approach    

As we have seen, different risks apply at each stage, and adopting a risk-based approach helps you avoid being overwhelmed or missing security considerations. Understand your application and map its risks and vulnerabilities. Each Top 10 entry has common vulnerability examples and example attack scenarios backed with references, which will help you understand and prioritise your risks. The entries are an ideal tool for standardising your threat modelling exercises, while the prevention sections can help map mitigations.

 
This is a rapidly changing area, and security risks will evolve, as will the OWASP Top 10 for LLM Applications. It is important to be part of the conversation the Top 10 brings: it helps you stay up to date in your security thinking, benefit from research and other people’s experiences, and influence the Top 10 with your own journey.

We encourage everyone to reach out and give us their feedback. We'll be happy to walk you through the OWASP Top 10 for LLM Apps and review how it applies to your Generative AI and LLM journey.   

Co-authored by

John Sotiropoulos
Sr Security Architect, Kainos, and Core Team Expert, OWASP Top 10 for LLM Applications
Steve Wilson
Chief Product Officer, Contrast Security, and Project Lead, OWASP Top 10 for LLM Applications