Threat Modeling Insider – October 2024

Threat Modeling Insider Newsletter

38th Edition – October 2024

Welcome!

It’s that time again! We’re back with another edition of the Threat Modeling Insider Newsletter!

This packed edition features a guest article by Ben Ramirez on Threat Modeling for Retrieval-Augmented Generation (RAG) AI Applications. As promised last month, our Threat Modeling experts, Sebastien Deleersnyder and Steven Wierckx, share their insights from their trip to ThreatModCon in San Francisco.

But that’s not all! Let’s dive into what else we have in store for this month’s edition:

Threat Modeling Insider edition

Welcome!

Threat Modeling Insider edition

It’s that time again! We’re back with another edition of the Threat Modeling Insider Newsletter!

This packed edition features a guest article by Ben Ramirez on Threat Modeling for Retrieval-Augmented Generation (RAG) AI Applications. As promised last month, our Threat Modeling experts, Sebastien Deleersnyder and Steven Wierckx, share their insights from their trip to ThreatModCon in San Francisco.

But that’s not all! Let’s dive into what else we have in store for this month’s edition:

On this edition

Tips & tricks
DrawIO Attack trees plugin

Training update
An update on our upcoming training sessions.

Guest article

Threat Modeling for Retrieval-Augmented Generation (RAG) AI Applications

Introduction

AI systems, particularly those leveraging large language models (LLMs), have advanced significantly in recent years. Among these, Retrieval-Augmented Generation (RAG) architectures stand out for their ability to combine real-time information retrieval with generative capabilities, offering more accurate and contextually relevant outputs. However, this enhanced functionality also broadens the attack surface, exposing systems to security and privacy threats. 

RAG systems introduce threats because they integrate real-time corporate data retrieval with generative AI. Traditional AI systems, like standard LLMs, are static—their knowledge is limited to the data they were trained on. In contrast, RAG systems dynamically fetch and integrate corporate information, significantly increasing both complexity and the potential exposure of sensitive corporate information.. 

What Makes RAG Unique?

1. Real-Time Data Access: Unlike static LLMs, RAG systems query external, real-time corporate data sources (e.g., databases, APIs, web documents, HR applications, corporate chats, project management tools, document suites). This introduces things like: 

  • Knowledge Sources Manipulation: An attacker could compromise an external data source or insert malicious content into the system. Such tampering may lead the RAG system to generate misleading or harmful outputs (Zou et al.). Traditional LLMs, which do not retrieve data post-training, are not susceptible to this issue. 
  • Similarity Search Threats: Similarly search of vector databases in RAG systems represents a new attack surface that could be used to reverse engineer, data leakage or even generate a Denial of Service (Huang, 2023). 

2. Dual-System Complexity: 
Since RAG systems combine both retrieval and generative components, vulnerabilities in either system (retrieval or generation) can cascade, creating broader attack vectors. For instance, a RAG system might rely on proprietary databases. If these databases have weak access controls, attackers can exploit the retrieval component to gain unauthorized access to sensitive data, which the generative model might then expose. 

In the financial sector, imagine a RAG system generating real-time financial reports by retrieving data from various financial databases and news sources. If an attacker were to alter or inject false data into one of these external sources, the RAG system might generate reports with inaccurate stock prices or fabricated financial news. In such a sensitive domain, this could lead to significant financial losses or even market manipulation. In contrast, traditional AI systems—trained only on historical financial data—would not face this risk, as they don’t access real-time, manipulable data sources. 

In this article, a generic threat model for RAG systems is presented, aiming to highlight potential threats common to RAG applications. The focus is on providing a holistic view of relevant threats in corporate environments that may be overlooked. 

Overview of Retrieval-Augmented Generation (RAG) AI Systems

The primary advantage of RAG systems is their ability to provide up-to-date and factually accurate responses by referencing external, non-static data sources. Traditional LLMs are limited to the knowledge contained within their training datasets, which may become outdated. In contrast, RAG systems can fetch real-time data, enhancing the accuracy of responses in dynamic environments such as financial reporting, medical information, and customer support systems. 

Threat Modeling Approach

With the increasing adoption of RAG systems in sensitive and critical domains, there is a need to develop threat models. RAG systems incorporate multiple subsystems that could make them vulnerable to a wide array of threats—ranging from adversarial manipulation of the retrieval system to the exploitation of the generative model. 

Previous work has shown the general risk of the RAG model at different levels (Huang, 2023). This article uses  Adam Shostack’s 4-question framework (Shostack, 2014), a widely recognized approach among threat modelers, to threat model a generic RAG architecture. The context is an enterprise application that retrieves company knowledge to answer queries supporting business processes. 

What Are We Working On?

RAG Architecture Overview

A typical RAG system architecture consists of two key components: 

  • Retrieval Module: Queries a database or knowledge store (e.g., vector store, search engine, or proprietary data repository) based on user input or system prompts. 
  • Generation Module: A generative AI (usually a pre-trained large language model) that synthesizes retrieved information into a natural language response. 

Below is the architecture (Dichone, 2024), which will serve as the foundation for this threat model (Diagram generated using OWASP Threat Dragon (https://owasp.org/www-project-threat-dragon/): 

architecture Dichone
  • User (Actor): The individual or application interacting with the RAG system by submitting queries. The user expects the system to generate accurate and informative responses based on both pre-existing knowledge and real-time retrieval.
  • User Query (Data Flow): Represents the user’s input, typically in natural language. The system processes the query to search for relevant documents.
  • Embedding Model (Process): Converts both the user query and chunks of documents into vector representations (embeddings). These embeddings capture the semantic meaning of the text, allowing for efficient comparisons between the query and stored knowledge. This component is a bottleneck for both retrieval and generation.
  • Sources (Store): Repositories, databases, or systems that store large amounts of structured or unstructured data. Examples include websites, research papers, articles, and proprietary data. The system retrieves documents from these sources to provide a contextually relevant response.
  • Sources Retrieved Information (Data Flow): The communication between the external sources and the RAG system, one of the most critical data flows in the architecture.
  • Parsing and Preprocessing (Process): A crucial step that prepares user queries and knowledge sources for retrieval and generation. Raw input data is transformed into a structured, normalized format for the embedding models and retrieval components. This process can become complex in large-scale RAG systems due to data variety and the need for precision.
  • Chunks (Data Flow): Retrieved documents are broken into smaller segments or “chunks” for efficient processing.
  • Indexing (Data Flow): Embeddings of document chunks are indexed and stored in a vector database for fast retrieval, facilitating efficient searches based on user queries.
  • Vector Database (Store): Stores embeddings (numerical representations) of document chunks. When a query is submitted, the system compares the query’s embedding with those in the database to find the most semantically similar documents or sections.
  • Retrieve (Data Flow): Once the most similar document embeddings are identified, the system retrieves the corresponding document chunks.
  • Query Info (Data Flow): Used by the “Search” component to find relevant documents based on the query.
  • Search (Most Similar Results) (Process): Compares the user query’s embedding with the embeddings stored in the vector database. The system retrieves the most relevant document chunks based on similarity scores to ensure that responses are grounded in external knowledge.
  • Prompt + Relevant Docs + Query (Data Flow): This flow carries the user query, the relevant document chunks, and the prompt instructions for the LLM to generate the response.
  • Generate Response – LLM (Process): Involves the large language model (LLM), such as GPT or other generative models. The LLM processes the prompt, which includes the user query and the relevant documents, and generates a coherent, fact-based response.
  • Response (Data Flow): After the generative model produces the response, the system sends it back to the user in a readable and concise format, completing the interaction.
  • There are 4 trust boundaries drawn in the diagram:

    • User Interaction Boundary: This is the primary surface where attackers can attempt to inject malicious input or exploit the system.
    • RAG/Sources Boundary: A critical boundary where attackers could target external data sources, such as through weak API security or over-scoped tokens.
    • Vector Database Boundary: The vector database is a sensitive target, as it holds the embeddings essential for retrieval.
    • Retrieval and Generation Boundary: In complex systems, the separation between retrieval and generation adds another attack vector, especially if the models are hosted separately without proper security.

What Can Go Wrong?

The threat actors targeting a Retrieval-Augmented Generation (RAG) system can vary depending on the system’s goals, exposure, and users. If a RAG system is open or exposed to the internet, it could be targeted by criminals or financially motivated threat actors, especially if the system facilitates activities they seek. Information extraction for reconnaissance in the early stages of an attack or espionage efforts can also motivate such actors. However, one of the primary threat actors for these systems is often the insider threat. Accidental or deliberate information sharing and system exploitation by malicious insiders can lead to serious security risks, whether as part of post-exploitation activity or attempts to compromise corporate information for personal gain (Huang et al., 2024).

The primary threat category associated with RAG systems is Information Disclosure, given the vast amount of sensitive data these systems may access. Additionally, a lack of awareness of how these systems function could expose organizations to compliance risks, especially if RAG systems handle Personally Identifiable Information (PII) or Payment  Information.

Despite their sophisticated design, RAG systems are susceptible to various threats, many stemming from their reliance on external data and generative capabilities. Vulnerabilities in these systems can exacerbate existing problems in corporate environments, such as poor access control or inadequate data governance and classification.

For this threat model, the STRIDE mnemonic was used to identify potential threats through a component-based approach. While this list is not exhaustive, it covers key threats relevant to a generic context. However, the specific threats may vary based on the project, company, or industry. No threat scoring is provided here, as risk levels depend on the unique characteristics of each organization or system.

Identified Threats:

  1. Sensitive Information Disclosure through LLM and Retrieval Components (Information Disclosure): If sensitive or confidential information (e.g., corporate secrets, personal data) is inadvertently exposed through the retrieval process or fed into the LLM, it could lead to significant data breaches, privacy violations, and compliance failures. This is particularly critical for industries dealing with highly sensitive data (e.g., healthcare, finance).
  2. Data Poisoning and Source Manipulation (Tampering): Compromising the data sources or poisoning the embeddings and indexing process can lead to the generation of biased, harmful, or outright incorrect information. This can have major repercussions in high-stakes fields like medicine, law, or financial systems, where the accuracy of information is crucial.
  3. Weak access control to sources (Information Disclosure, Spoofing): If access controls on the data sources are not strong enough inside the company, unauthorized users or threat actors could gain access to confidential information, leading to data breaches quicker and easier. Users could have inadvertently discovered access they had but they didn’t realize.
  4. Exposure of information through APIs (Information Disclosure): Access to external data sources often relies on APIs, which introduce an additional attack surface. Vulnerabilities such as insufficient access control, weak API security, poor secret management, over-scoped tokens or inadequate API hardening could expose sensitive company information.
  5. Hallucinations (Denial of Service): RAG systems may generate false or fabricated information (hallucinations) when retrieved data is ambiguous or incomplete. This poses a significant risk in high-stakes fields like healthcare, finance, or legal services.
  6. Sources unavailable due to resource overconsumption (Denial of Service): Excessive resource requests during parsing and preprocessing could overwhelm external sources, resulting in a denial of service of key corporate resources. This poses a significant risk for large-scale RAG systems processing thousands or millions of user queries. Additionally, abusing the similarity search of the vector database by crafting targeted prompts is also a threat, given the substantial resources already consumed by similarity searches.
  7. Prompt Injection and Misuse of Query Information (Information Disclosure, Tampering): Malicious actors may craft prompts or queries intended to manipulate the RAG system into retrieving sensitive or inaccurate information. This can result in unauthorized access to confidential data or the generation of misleading responses, particularly when the RAG system operates in an open-ended, user-facing environment.
  8. No granular audit trail on source access (Repudiation): RAG systems often access external data using system credentials. Without a detailed audit trail, it becomes difficult to track who accessed specific information through the RAG system, making it harder to establish accountability. This might be relevant for highly regulated industries.

What Can We Do About It?

To mitigate the threats associated with RAG systems, a combination of best practices, regular security audits, and cutting-edge defense mechanisms should be implemented.

  • Implement strong access controls on source data: Granting appropriate access to corporate knowledge retrieved by the RAG solution is crucial to avoid perpetuating existing security issues. While it may seem trivial, many companies still struggle to adhere to the least privilege principle, especially in large, dynamic corporate environments. Since the responses generated by the LLM often depend on the current permissions of the user querying the information, permissions should be tailored based on data sensitivity, ensuring that corporate knowledge is accessed and used only by those with explicit authorization. Regular audits and reviews of both permissions and the outputs of the RAG system should be conducted to prevent privilege escalation and unauthorized access.(Threats 1, 3, 4)
  • Apply data masking and classification to prevent sensitive data exposure in documents sent to LLMs: Masking or redacting sensitive information, such as personal data or corporate secrets, before processing it through the LLM minimizes the risk of data breaches. Automated data classification tools and redaction techniques should be employed to prevent unauthorized data exposure. (Threats 1, 2, 7)
  • Secure API integrations: This may seem like an obvious recommendation, but it cannot be emphasized enough, as basic security issues are still commonly found in APIs. Enforce strict API security measures such as authentication, rate limiting, and end-to-end encryption to minimize the risk of exposing sensitive information. Regularly review token scopes and access controls in APIs, and avoid using overscoped tokens that may grant access to sensitive data to users who do not have the need to know. Strong API hardening and proper token management are essential to reducing the risk of information disclosure. (Threat 4)
  • Deploy monitoring and alert systems for data source integrity: Continuous monitoring of data sources for tampering, unauthorized changes, or unusual activity is crucial for detecting source manipulation. While this should be part of the existing defense and detection systems, consider scenarios—depending on the RAG system architecture—where additional integrity validations are required directly within the RAG system for critical sources. (Threat 2)
  • Implement Adequate System Resilience: Usual measures such as rate limiting and throttling to control excessive requests, alongside load balancing and caching, can reduce strain on external sources. Optimizing query preprocessing and similarity search algorithms helps minimize unnecessary resource usage, while anomaly detection for suspicious query patterns prevents abuse. Additionally, setting resource quotas and utilizing failover mechanisms ensures overall system resilience. (Threat 6)
  • Output and sources validation and verification: To mitigate the threat of hallucinations in RAG systems, especially in high-stakes fields such as healthcare, finance, and legal services, it is essential to implement rigorous validation and verification processes for the retrieved data. This involves cross-referencing outputs against trusted sources to ensure accuracy and reliability. Additionally, incorporating feedback loops that allow users to flag inaccuracies can help refine the system’s understanding over time. Leveraging human-in-the-loop approaches, where domain experts review critical outputs, can further enhance the quality of information generated. Regularly updating the knowledge base and employing advanced error-checking algorithms will also minimize the chances of ambiguity and incompleteness in the retrieved data, thereby reducing the potential for hallucinations.(Threat 5)
  • Prompt filtering and validation mechanisms to detect and block injection attempts: Implement input validation and prompt sanitization to prevent prompt injection attacks. Predefined templates or rules can help ensure only safe inputs are processed by the LLM, reducing the risk of information disclosure and tampering. Additionally, incorporating contextual understanding can improve security by detecting anomalous inputs based on prior query patterns, and real-time monitoring with alert systems can flag suspicious input behavior for further investigation, enhancing the overall protection against prompt injection threats. (Threats 7, 1)
  • Enable detailed logging and audit trails for source access actions: Maintain detailed logs of who accessed data, when, and what actions were taken. Granular audit trails help in forensic analysis and traceability, especially in regulated industries. Securely store logs for periodic review to detect anomalies or suspicious activities. To further enhance security, consider log anonymization to protect user privacy while maintaining traceability, and implement automated alerts to flag abnormal or unauthorized access patterns, improving the responsiveness and effectiveness of your logging system.(Threat 8)

Did We Do a Good Job?

The threat model presented here provides a foundation for identifying threats and recommending mitigation strategies for RAG AI systems. However, like any threat model, it has some inherent limitations that should be acknowledged:

  • Context-Specific Customization: This model is intentionally generalized. Each RAG system may face unique threats based on its specific context, such as industry regulations, user base, and operational environment that are key to determine the relevant threats and their prioritization.
  • Corporate Environment Approach: This threat model was produced from a corporate perspective, focusing on a RAG solution that accesses company information to retrieve or answer user queries. This approach may have left out other threats relevant to different applications, environments, or use cases.
  • Evolving Attack Vectors: Cyber threats evolve rapidly, and new attack techniques may emerge that exploit weaknesses in generative AI and retrieval mechanisms. Continuous threat modeling is essential to keep pace with these developments and ensure the system remains secure.
  • Technical Complexity: RAG systems consist of numerous interconnected components. Comprehensive security requires protecting not only the retrieval and generation modules but also the underlying infrastructure, databases, web interfaces, and APIs. Achieving this level of protection requires coordinated efforts across teams, including security, engineering, and data governance.
  • Human Factor: Many threats, such as social engineering attacks or insider threats, stem from human error or malicious intent. Even the most technically sound systems can be compromised if users or administrators do not follow proper security protocols.

Summary of Key Points:

  • RAG systems combine external data retrieval with generative models, enabling more relevant and fact-based responses.
  • Key threats include unauthorized access, data leakage, denial of service, source tampering, and prompt injection.
  • Mitigation strategies focus on strong access controls, detailed logging, input validation, rate limiting, and regular audits.
  • The threat model serves as a source that could be used along with others to help identify your own project threats.

In summary, securing RAG systems requires a proactive, multi-layered approach, integrating technical safeguards with organizational best practices.

References

Dichone, P. (2024, August 1). Learn RAG Fundamentals and Advanced Techniques. freeCodeCamp. Retrieved September 24, 2024, from https://www.freecodecamp.org/news/learn-rag-fundamentals-and-advanced-techniques/

Huang, K. (2023, November 22). Mitigating Security Risks in RAG LLM Applications | CSA. Cloud Security Alliance. Retrieved September 26, 2024, from https://cloudsecurityalliance.org/blog/2023/11/22/mitigating-security-risks-in-retrieval-augmented-generation-rag-llm-applications

Huang, K., Wang, Y., Goertzel, B., Li, Y., Wright, S., & Ponnapalli, J. (Eds.). (2024). Generative AI Security: Theories and Practices. Springer Nature Switzerland, Imprint: Springer.

Shostack, A. (2014). Threat Modeling: Designing for Security. Wiley.

Zou, W., Geng, R., Wang, B., & Jia, J. (n.d.). PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. Cryptography and Security. https://arxiv.org/abs/2402.07867

Advance your career with our in-company Threat Modeling Practitioner certification - tailored training options available!

Testimonial from a Happy Customer:

The Threat Modeling training from Toreon was a game changer. The trainers were true experts, answering all our questions and guiding us through complex topics. The hands-on sessions and well-structured approach gave our team the confidence to tackle threat modeling effectively, no matter their experience level. It’s made a real impact on how we protect our organization.

Maxine McFarlane, Application Security SME, Lloyds Banking Group

CURATED CONTENT

Handpicked for you

Threat Modeling Trends and Insights from ThreatModCon 2024

The AI Risk Repository

In this article, our threat modeling experts, Sebastien Deleersnyder and Steven Wierckx, share their experience attending two significant events in the U.S.: the OWASP Global AppSec 2024 and the second annual Threat Modeling Conference. Organized by Threat Modeling Connect, this unique event gathers cybersecurity professionals and enthusiasts from around the world to explore the latest trends, tools, and techniques in threat modeling.

Sebastien and Steven provide key insights from the conference, highlighting the discussions and innovations shaping the future of threat modeling practices.

We provide a curated overview of the AI Risk Repository, created by MIT FutureTech, which serves as an essential resource outlining the landscape of AI risks. The repository includes three key components: a database of over 700 risks drawn from 43 frameworks, a causal taxonomy explaining how and why these risks arise, and a domain taxonomy categorizing them into seven domains and 23 subdomains.

It serves as a valuable tool for researchers, developers, businesses, and policymakers by providing an up-to-date overview, aiding research and policy development, and offering a shared reference point for AI risk analysis.

Application Security Blog - True Positives

Evan Oslick explores the challenges of evolving communication habits and their impact on application security. Using a humorous meme, he questions whether emergency contacts will respond in urgent situations, highlighting the trend of ignoring unknown calls.

Oslick discusses “alert fatigue” and its effects on critical notifications in emergency apps. He urges developers to broaden their threat models by considering user behavior and external factors, advocating for insights from behavioral experts to create more resilient applications that can anticipate unforeseen threats.

TIPS & TRICKS

DrawIO Attack trees plugin

This curated content introduces the Attack Graphs Plugin for Draw.io, an extension designed to enhance cyber attack modeling in Draw.io. The plugin adds a user-friendly interface with new shapes for creating detailed attack graphs that visualize potential threats.

Key features include dynamic attribute calculations that display computed values based on shape connections and the ability to link attack graphs across multiple pages. Users can also annotate edges with impact parameters, making the graphs more informative.

This tool is especially useful for cybersecurity professionals, providing essential resources for visualizing and analyzing threats effectively.

Upcoming trainings & events

Book a seat in our upcoming trainings & events

Threat Modeling Practitioner training, hybrid online, hosted by DPI

Cohort starting on 6 Dec 2024

Agile Whiteboard Hacking a.k.a. Hands-on Threat Modeling, online, hosted by Black Hat Europe, London

Next training dates:
9-10 December 2024

Agile Whiteboard Hacking a.k.a. Hands-on Threat Modeling, in-person, hosted by NDC Security, Oslo

9-10 December 2024

Threat Modeling Practitioner training, hybrid online, hosted by DPI 

Cohort starting on 6 Dec 2024

Agile Whiteboard Hacking a.k.a. Hands-on Threat Modeling, online, hosted by Black Hat Europe, London

Next training dates:
9-10 December 2024

Agile Whiteboard Hacking a.k.a. Hands-on Threat Modeling, in-person, hosted by NDC Security, Oslo

9-10 December 2024

Threat Modeling Insider Newsletter

Delivering the latest Threat Modeling articles and tips straight to your mailbox.

Start typing and press Enter to search

Shopping Cart