This post is written by Bartosz Pietrucha
Building enterprise-grade LLM applications has become a necessity in today's business environment. While models and APIs are more accessible than ever, one significant challenge remains: securing these applications and effectively managing their permissions.
To address this, Fine-Grained Authorization (FGA) and Retrieval Augmented Generation (RAG) are effective strategies for building secure, context-aware AI applications that maintain strict access control. In this article, we’ll explore how FGA and RAG can be applied in a healthcare setting while safeguarding sensitive data.
We’ll do this by guiding you through the implementation of a Relationship-Based Access Control (ReBAC) authorization system that supports real-time updates, using three tools: AstraDB, Langflow, and Permit.io.
To better understand the complexity of authorization in LLM applications, and the solutions offered by FGA and RAG, we can look at the digital healthcare space, which presents a perfect example of a domain where both AI capabilities and stringent security are essential. Healthcare providers increasingly want to leverage LLMs to streamline workflows, improve decision-making, and provide better patient care. Doctors and patients alike want easy access to medical records through intuitive AI interfaces such as chatbots.
However, medical data is highly sensitive and strictly regulated. While LLMs can provide intelligent insights, we must ensure that they only access and reveal information that users are authorized to see. Doctors, for example, should only see diagnoses from their assigned medical centers, and patients should only be able to access their own records.
Continuing with the digital healthcare theme, let’s look at a sample medical application. This application comprises several resources, a couple of roles, and a few relationships between these entities:
Resource Types:
- Medical Center (e.g., Warsaw Medical Center)
- Visit (e.g., Morning-Visit, Afternoon-Visit, Evening-Visit)
- Diagnosis (e.g., Virus, Diabetes)

Roles:
- Doctor
- Patient

Relationships:
- A Medical Center contains Visits
- A Visit contains Diagnoses
- A Doctor is assigned to a Medical Center
- A Patient owns their own Visits
As you can see, the hierarchical relationships between our resources mean that traditional role-based access control, where permissions are assigned directly, will be insufficient.
The complexity of this application's authorization requires a more fine-grained authorization (FGA) solution - in this case, Relationship-Based Access Control (ReBAC).
ReBAC, an authorization model inspired by Google's Zanzibar paper, derives permissions from relationships between entities in the system - unlike traditional role-based access control (RBAC), where permissions are assigned directly.
The power of ReBAC lies in how permissions are derived through these relationships. Let’s look at a visual representation of our example:
In the above example, Dr. Bartosz has access to the Virus diagnosis not because of a directly granted permission, but because he is assigned to Warsaw Medical Center, which contains the Afternoon Visit, which in turn contains the diagnosis. The relationships between these resources form a chain from which access permissions can be derived.
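To make this concrete, here is a minimal, self-contained Python sketch of deriving access by walking relationship tuples. The entity names and the naive graph traversal are purely illustrative - a production ReBAC engine such as the one behind Permit.io does this evaluation for you:

```python
# A minimal sketch of ReBAC permission derivation: access is not granted
# directly, but discovered by walking a chain of relationship tuples.
# Entity names and the naive traversal are illustrative only.

RELATIONSHIPS = {
    ("dr_bartosz", "assigned_to", "warsaw_medical_center"),
    ("warsaw_medical_center", "contains", "afternoon_visit"),
    ("afternoon_visit", "contains", "virus_diagnosis"),
}

def can_access(user: str, resource: str) -> bool:
    """Return True if a relationship chain connects the user to the resource."""
    frontier = {obj for subj, _, obj in RELATIONSHIPS if subj == user}
    visited = set()
    while frontier:
        node = frontier.pop()
        if node == resource:
            return True
        visited.add(node)
        frontier |= {
            obj for subj, _, obj in RELATIONSHIPS
            if subj == node and obj not in visited
        }
    return False

print(can_access("dr_bartosz", "virus_diagnosis"))  # True - derived, not directly granted
```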
There are clear benefits to using this approach:
- Permissions mirror the real structure of the organization instead of being managed one by one
- Access updates automatically when relationships change - for example, when a doctor is reassigned to a different medical center
- New resources (visits, diagnoses) inherit the correct permissions from their parents as soon as they are created
But the challenge doesn’t end there: since we are building a system that works with LLMs, it must be able to evaluate these relationship chains in real time. In the next section, we will see how to create an implementation that allows this.
Before we continue, let's quickly review the authorization rules we want to ensure are in place:
- Doctors can only access diagnoses from visits at the medical centers they are assigned to
- Patients can only access their own records
- The LLM must never receive, and therefore can never reveal, data the requesting user is not authorized to see
These requirements can be enforced within a Retrieval Augmented Generation (RAG) pipeline.
RAG (Retrieval Augmented Generation) is a technique that enhances LLM outputs by combining two key steps: first, retrieving relevant information from a knowledge base, and then using that information to augment the LLM's context for more accurate generation. While RAG can work with traditional databases or document stores, vector databases are particularly powerful for this purpose because they can perform semantic similarity search, finding conceptually related information even when exact keywords don't match.
In practice, this means that when a user asks about "heart problems," the system can retrieve relevant documents about "cardiac issues" or "cardiovascular disease," making the LLM's responses both more accurate and comprehensive. The "generation" part then involves the LLM synthesizing this retrieved context with its pre-trained knowledge to produce relevant, factual responses that are grounded in your specific data.
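As a toy illustration of why embeddings enable this, consider the following sketch. The vectors are fabricated for the example (a real system would obtain them from an embedding model such as mistral-embed), but they show how cosine similarity surfaces the "cardiac" document for a "heart problems" query despite zero keyword overlap:

```python
# Toy illustration of semantic retrieval with fabricated embedding vectors.
import math

DOCS = {
    "Patient reports recurring cardiac issues": [0.90, 0.10, 0.20],
    "Seasonal migraine with light sensitivity": [0.10, 0.80, 0.30],
    "Cardiovascular disease risk factors discussed": [0.85, 0.15, 0.25],
}
QUERY = [0.88, 0.12, 0.22]  # pretend embedding of "heart problems"

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

ranked = sorted(DOCS, key=lambda text: cosine(DOCS[text], QUERY), reverse=True)
print(ranked[0])  # the cardiac document ranks first despite no shared keywords
```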
For our implementation, we will use AstraDB as our vector database. AstraDB offers the following benefits:
- A fully managed, serverless vector database, so there is no infrastructure to operate
- Built-in vector similarity search for the semantic retrieval step of our RAG pipeline
- Native integration with Langflow on the DataStax platform
To implement our RAG pipeline, we'll also use Langflow, an open-source framework that makes building these systems intuitive through its visual interface. Langflow can run in a local Python environment or on the cloud-hosted DataStax platform. In our case, we choose the latter, creating a serverless (vector) AstraDB database at https://astra.datastax.com
In our implementation, authorization checks should happen at a crucial moment - after retrieving data from the vector database but before providing it to the LLM as context. This way, we maintain search efficiency by first finding all relevant information and later filtering out unauthorized data before it ever reaches the LLM. The LLM can only use and reveal information the user is authorized to see.
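Conceptually, this filtering step looks something like the following sketch. It assumes Permit's Python SDK (`pip install permit`) and a locally running PDP; the API key, resource type name, and document shape are placeholders for illustration:

```python
# Sketch of the post-retrieval authorization filter. The API key, PDP
# address, resource type, and document shape are placeholders.
import asyncio
from permit import Permit

permit = Permit(
    token="<your-permit-api-key>",   # placeholder
    pdp="http://localhost:7766",     # local Policy Decision Point (see below)
)

async def filter_authorized(user_key: str, retrieved_docs: list[dict]) -> list[dict]:
    """Keep only the diagnoses the user is allowed to read."""
    allowed = []
    for doc in retrieved_docs:
        # permit.check(user, action, resource) asks the PDP to evaluate the
        # relationship chain between this user and this specific diagnosis.
        if await permit.check(user_key, "read", {"type": "diagnosis", "key": doc["id"]}):
            allowed.append(doc)
    return allowed

# Usage (retrieval step not shown): only authorized docs reach the LLM context.
# docs = vector_store.similarity_search(question)
# context = asyncio.run(filter_authorized("bartosz@health.app", docs))
```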
These security checks are implemented using Permit.io, which provides the infrastructure for evaluating complex relationship chains in real time. As your data grows and relationships become more complex, the system continues to ensure that each piece of information is only accessible to those with proper authorization.
To get started with Permit, you can easily create a free account by visiting the website at https://app.permit.io. Once your free account is created, you'll have access to Permit's dashboard, where you can set up your authorization policies, manage users and roles, and integrate Permit into your applications. The free tier offers all the necessary features to create a digital healthcare example with relationship-based access control (ReBAC).
Both Langflow and Permit offer free accounts to get started, so you can build a system like this and see how it works yourself without paying anything.
Before we dive into the implementation details, it's important to understand the tool we'll be using - Langflow. Built on top of LangChain, Langflow is an open-source framework that simplifies the creation of complex LLM applications through a visual interface. LangChain provides a robust foundation by offering standardized components for common LLM operations like text splitting, embedding generation, and chain-of-thought prompting. These components can be assembled into powerful pipelines that handle everything from data ingestion to response generation.
What makes Langflow particularly valuable for our use case is its visual builder interface, which allows us to construct these pipelines by connecting components graphically - similar to how you might draw a flowchart. This visual approach makes it easier to understand and modify the flow of data through our application, from initial user input to the final authorized response. Additionally, Langflow's open-source nature means it's both free to use and can be extended with custom components, which is crucial for implementing our authorization checks.
Our Langflow solution leverages two distinct yet interconnected flows to provide secure access to medical information: an ingestion flow and a chat flow.
The ingestion flow is responsible for loading diagnoses into AstraDB along with their respective embeddings. We use MistralAI to generate an embedding for each diagnosis, making it possible to perform semantic searches on the diagnosis data later. The key components involved in this flow are (a rough code sketch follows the list):
- A data loader that reads the diagnosis records
- A MistralAI embeddings component that converts each diagnosis into a vector
- An AstraDB component that stores each diagnosis together with its embedding
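Outside of Langflow, this step boils down to roughly the following. The Mistral embeddings call reflects the public REST endpoint as we understand it (verify against the current API docs), and `store_in_astradb` is a hypothetical helper standing in for the AstraDB component that Langflow wires up for you:

```python
# Rough sketch of the ingestion step outside Langflow.
import os
import requests

DIAGNOSES = [
    "Seasonal Migraine",
    "Flu virus with high fever",
]

def embed(texts: list[str]) -> list[list[float]]:
    """Embed a batch of diagnosis texts with the mistral-embed model."""
    resp = requests.post(
        "https://api.mistral.ai/v1/embeddings",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={"model": "mistral-embed", "input": texts},
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]

def store_in_astradb(text: str, vector: list[float]) -> None:
    """Hypothetical helper: insert the text and its vector into the collection."""
    ...

for text, vector in zip(DIAGNOSES, embed(DIAGNOSES)):
    store_in_astradb(text, vector)
```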
The chat flow is responsible for interacting with users and serving them the required diagnosis data. The images below should be read from left to right (the right side of the first image continues as the left side of the second):
Note: There is an additional "Pip Install" component that is executed only once to install the permit module. This is because we are running Langflow on the DataStax low-code platform; the step is equivalent to executing pip install permit locally.
The sequence of operations in the chat flow is as follows:
1. The user submits a question through the chat interface
2. The question is embedded and used to perform a semantic search against AstraDB, retrieving the most relevant diagnoses
3. The PermitFilter component checks, for each retrieved diagnosis, whether the current user is authorized to read it
4. Only the authorized diagnoses are assembled into the prompt as context
5. The LLM generates an answer grounded exclusively in data the user is allowed to see

With our example data, the prompt handed to the LLM looks like this:
```
Seasonal Migraine
Flu virus with high fever
---
You are a doctor's assistant and help to retrieve information about patients' diagnoses.
Given the patients' diagnoses above, answer the question as best as possible.
The retrieved diagnoses may belong to multiple patients.

Question: list all the recent diagnoses
Answer:
```
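For reference, here is an illustrative Python sketch (names are ours, not Langflow's) of how such a prompt can be assembled from the authorized diagnoses and the user's question:

```python
# Illustrative sketch of how the chat flow assembles the final prompt:
# authorized diagnoses are injected above the instruction template shown above.
PROMPT_TEMPLATE = """{diagnoses}
---
You are a doctor's assistant and help to retrieve information about patients' diagnoses.
Given the patients' diagnoses above, answer the question as best as possible.
The retrieved diagnoses may belong to multiple patients.

Question: {question}
Answer:"""

def build_prompt(authorized_diagnoses: list[str], question: str) -> str:
    return PROMPT_TEMPLATE.format(
        diagnoses="\n".join(authorized_diagnoses),
        question=question,
    )

print(build_prompt(["Seasonal Migraine", "Flu virus with high fever"],
                   "list all the recent diagnoses"))
```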
To run the PermitFilter component, which plays a crucial role in our implementation, we need a running instance of Permit's Policy Decision Point (PDP). The PDP is responsible for evaluating policies and making decisions on whether a given action is permitted for a specific user and resource. By enforcing this permission check before the context reaches the language model, we prevent the leakage of sensitive information and ensure the enforcement of access control policies.
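A common way to get a PDP running locally is Permit's container image; the sketch below records the typical invocation (verify the flags against Permit's docs) and adds a quick connectivity check:

```python
# The PermitFilter component needs a reachable PDP. A common way to run one
# locally is Permit's container image - flags below are as we recall them
# from Permit's docs, so verify before use:
#
#   docker run -p 7766:7000 --env PDP_API_KEY=<your-permit-api-key> permitio/pdp-v2:latest
#
# Small sanity check that the PDP answers HTTP on the expected port:
import requests

def pdp_is_up(base_url: str = "http://localhost:7766") -> bool:
    try:
        requests.get(base_url, timeout=2)
        return True
    except requests.RequestException:
        return False

print(pdp_is_up())  # should print True once the container is running
```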
The complete implementation is available in our GitHub repository, where you'll find the full Langflow flows and the custom PermitFilter component described in this article.
To start interacting with our AI assistant with authorization checks in place, we can simply start the Langflow playground. In the example below, I am authenticated as bartosz@health.app, which means I have access only to Afternoon-Visit and Evening-Visit, but not to Morning-Visit and its Diabetes diagnosis. As a result, the LLM has no information about diabetes in its context.
Securing access to sensitive healthcare data while leveraging LLM capabilities is both a priority and a challenge. By combining RAG and fine-grained authorization, we can build AI applications that are both intelligent and secure. The key benefits are:
- Relevant, grounded answers thanks to semantic retrieval over your own data
- Fine-grained, relationship-based access control that mirrors real organizational structure
- Enforcement before generation: unauthorized data never reaches the LLM's context
Using tools like LangFlow and Permit.io, healthcare providers can implement relationship-based access control systems that respond dynamically to role and relationship changes, ensuring data is accessible only to authorized individuals. By integrating these solutions, healthcare organizations can effectively harness AI to improve patient care without compromising on security.