
Building a RAG application with Microsoft Fabric



Introduction

In this article, we will build a generative AI application in Microsoft Fabric: a Retrieval Augmented Generation (RAG) system that uses Azure OpenAI for embeddings and chat completions, and Microsoft Fabric Eventhouse as the vector store.

Why MS Fabric Eventhouse?

Fabric Eventhouse is built on the Kusto engine, which provides best-in-class performance for large-scale similarity search.

If you are looking to build a RAG application over a large number of embedding vectors, look no further: Microsoft Fabric provides both the tools to build a vector database and the high-performance engine that powers the Fabric Eventhouse DB.

To learn more about how to use Fabric Eventhouse as your vector store, please see the following links:

Azure Data Explorer for vector similarity search

Optimizing vector similarity search in Azure Data Explorer – Performance updates

Optimizing vector similarity search at scale

RAG – What is Retrieval Augmented Generation?

Large language models (LLMs) are very good at generating human-like text.
Out of the box, an LLM’s knowledge comes from the wide range of datasets it was trained on. This makes it flexible, but it may lack the specialized focus or knowledge required for a particular topic.

Retrieval Augmented Generation (RAG) is a technique that improves the relevance and accuracy of LLM responses by incorporating relevant information retrieved at query time. With RAG, the LLM is paired with a retrieval system that searches unstructured text for relevant passages, which are then used to ground the LLM’s response.
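
Before diving in, here is a minimal sketch of the RAG loop we will implement in this article. The embed, search, and llm callables are placeholders standing in for the concrete pieces we build later (Azure OpenAI embeddings, Eventhouse similarity search, and GPT-4):

from typing import Callable, List

def rag_answer(question: str,
               embed: Callable[[str], List[float]],
               search: Callable[[List[float], int], List[str]],
               llm: Callable[[str], str],
               k: int = 2) -> str:
    # 1. embed the user question
    question_vector = embed(question)
    # 2. retrieve the k chunks closest to the question in embedding space
    context = " ".join(search(question_vector, k))
    # 3. ask the LLM to answer using only the retrieved context
    prompt = "Question: {}\nInformation: {}".format(question, context)
    return llm(prompt)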

What is a vector database?

The vector database is an essential component of the retrieval step in RAG, allowing us to quickly and efficiently identify the sections of text most relevant to a query, based on how closely their embeddings match the query’s embedding.

More generally, a vector database is a data store optimized for storing and processing vector data. Vectors can represent data types such as geometric shapes, spatial data, or more abstract high-dimensional data used in machine learning applications, such as embeddings.

These databases are designed to efficiently handle operations such as similarity search, nearest neighbor search, and other operations commonly used when dealing with high-dimensional vector spaces.

For example, in machine learning it is common to use models such as word embeddings or image embeddings to transform text, images, or other complex data into high-dimensional vectors. To search and compare these vectors efficiently, vector databases or vector stores with specialized indexing and search algorithms are used.
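
To make the core operation concrete, here is a small, self-contained illustration of brute-force nearest-neighbor search with cosine similarity, the same measure Eventhouse’s series_cosine_similarity function uses later in this article. The toy 4-dimensional vectors are made up for the example (real Ada embeddings have 1536 dimensions):

import numpy as np

def cosine_similarity(a, b):
    # cosine similarity = dot product divided by the product of the vector norms
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy "embeddings" for two text chunks
corpus = {
    "whale": np.array([0.9, 0.1, 0.0, 0.2]),
    "ship":  np.array([0.1, 0.8, 0.3, 0.0]),
}
query = np.array([0.8, 0.2, 0.1, 0.1])

# pick the corpus entry whose embedding is closest to the query embedding
best = max(corpus, key=lambda name: cosine_similarity(query, corpus[name]))
print(best)  # -> whale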

In our case, we will use the Azure OpenAI Ada Embeddings model to generate embeddings. Embeddings are vector representations of text that we index and store in a Microsoft Fabric Eventhouse DB.

Code

You can find the code here.

We will use the Project Gutenberg book Moby Dick in PDF format as our knowledge base.

We read a PDF file, chop the text into 1000-character chunks, compute embeddings for each chunk, and then store the text and embeddings in a vector database (Fabric Eventhouse).

Then we ask questions: for each question we retrieve the most relevant chunks from the vector DB using similarity search, and send the question together with those chunks to Azure OpenAI GPT-4 to get a natural-language response.

File processing and embedding indexing

We’ll only do this once – to generate the embeddings and then store them in our vector database, Fabric Eventhouse.

  • Read the file from the Fabric Lakehouse.
  • Create embeddings from the text using the Azure OpenAI Ada embeddings model.
  • Store the text and embeddings in the Fabric Eventhouse DB.

RAG – Get Answers

Whenever we ask a question against our knowledge base, we do the following:

  • Create an embedding for the question and use similarity search to retrieve the most relevant chunks from Fabric Eventhouse.
  • Combine the question with the retrieved chunks and call the Azure OpenAI GPT-4 model to get a “natural language” answer.

Prerequisites

To follow this guide, you will need to ensure you have access to the following services and have the necessary credentials and keys set up:

  • Microsoft Fabric.
  • Azure OpenAI Studio, to manage and deploy OpenAI models.

Setup

Creating a Fabric Workspace

Creating a Lakehouse

Upload the Moby Dick PDF file

Create an Eventhouse DB called “GenAI_eventhouse”.

Click on the DB name and then click “Explore Data” in the top right.

Create a “bookEmbeddings” table

Paste the following command and run it.

.create table bookEmbeddings (document_name:string, content:string, embedding:dynamic)
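
If you want to verify the table was created before moving on, you can optionally run a quick count (it should return 0 at this point, since we have not ingested any data yet):

bookEmbeddings
| count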

Import the notebook

Get your Azure OpenAI endpoint and secret key and paste them into the notebook, changing the model deployment names if necessary.

Get the Eventhouse URI and paste it into your notebook as “KUSTO_URI”.

Connect the notebook to the Lakehouse

Let’s run the notebook

Run the first cell. This will install all the required Python libraries.

%pip install openai==1.12.0 azure-kusto-data langchain tenacity langchain-openai pypdf

After configuring the environment variables, run cell 2.

OPENAI_GPT4_DEPLOYMENT_NAME = "gpt-4"
OPENAI_DEPLOYMENT_ENDPOINT = ""
OPENAI_API_KEY = ""
OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME = "text-embedding-ada-002"
KUSTO_URI = ""
# the Eventhouse database and table created earlier; both are used again below when writing and reading data
KUSTO_DATABASE = "GenAI_eventhouse"
KUSTO_TABLE = "bookEmbeddings"

Run cell 3

Here we create an Azure OpenAI client and define a function to compute embeddings.

from openai import AzureOpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt

client = AzureOpenAI(
    azure_endpoint=OPENAI_DEPLOYMENT_ENDPOINT,
    api_key=OPENAI_API_KEY,
    api_version="2023-09-01-preview"
)

# we use the tenacity library to add delays and retries when calling the
# Azure OpenAI embeddings API, to avoid hitting throttling limits
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def generate_embeddings(text):
    # replace newlines, which can negatively affect performance
    txt = text.replace("\n", " ")
    return client.embeddings.create(input=[txt], model=OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME).data[0].embedding
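
As an optional sanity check, you can call the function directly; with the text-embedding-ada-002 model the returned vector has 1536 dimensions:

vec = generate_embeddings("Call me Ishmael.")
print(len(vec))  # 1536 for text-embedding-ada-002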

Run cell 4

Read the file and split it into 1000-character chunks.

# splitting into 1000-char chunks with 30-char overlap
# (default separators: ["\n\n", "\n", " ", ""])
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=30,
)

documentName = "moby dick book"
# Copy the File API path of the uploaded PDF from the Lakehouse
fileName = "/lakehouse/default/Files/moby dick.pdf"
loader = PyPDFLoader(fileName)
pages = loader.load_and_split(text_splitter=splitter)
print("Number of pages: ", len(pages))


Run cell 5

Save the text chunks to a pandas DataFrame.

#save all the pages into a pandas dataframe
import pandas as pd
df = pd.DataFrame(columns=['document_name', 'content', 'embedding'])
for page in pages:
    df.loc[len(df.index)] = [documentName, page.page_content, ""]  
df.head()

 

Run cell 6

Compute the embeddings.

# calculate the embeddings using openAI ada
df["embedding"] = df.content.apply(lambda x: generate_embeddings(x))
print(df.head(2))

Run cell 7

Write the data to MS Fabric Eventhouse.

# get a token to access the Eventhouse (Kusto) cluster from the Fabric notebook
accessToken = mssparkutils.credentials.getToken(KUSTO_URI)

df_sp = spark.createDataFrame(df)
df_sp.write.\
    format("com.microsoft.kusto.spark.synapse.datasource").\
    option("kustoCluster", KUSTO_URI).\
    option("kustoDatabase", KUSTO_DATABASE).\
    option("kustoTable", KUSTO_TABLE).\
    option("accessToken", accessToken).\
    mode("Append").save()

Let’s check that the data is stored in the vector database.

Go to the Eventhouse and run this query.

bookEmbeddings
| take 10

Go back to your notebook and run the remaining cells.

Create a function that calls GPT-4 to get natural-language answers.

def call_openAI(text):
    response = client.chat.completions.create(
        model=OPENAI_GPT4_DEPLOYMENT_NAME,
        messages=text,
        temperature=0
    )
    return response.choices[0].message.content

Create a function that retrieves answers from Eventhouse using similarity search over the embeddings.

def get_answer_from_eventhouse(question, nr_of_answers=1):
    # embed the question and run a cosine-similarity search in Eventhouse
    searchedEmbedding = generate_embeddings(question)
    kusto_query = KUSTO_TABLE + " | extend similarity = series_cosine_similarity(dynamic(" + str(searchedEmbedding) + "), embedding) | top " + str(nr_of_answers) + " by similarity desc "
    kustoDf = spark.read \
        .format("com.microsoft.kusto.spark.synapse.datasource") \
        .option("kustoCluster", KUSTO_URI) \
        .option("kustoDatabase", KUSTO_DATABASE) \
        .option("accessToken", accessToken) \
        .option("kustoQuery", kusto_query) \
        .load()
    return kustoDf
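
For reference, the KQL query this function builds looks like the following (with the embedding literal abbreviated here; the real query inlines all 1536 values):

bookEmbeddings
| extend similarity = series_cosine_similarity(dynamic([0.0123, -0.0045, ...]), embedding)
| top 2 by similarity desc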

Retrieve the top 2 answers from Eventhouse.

nr_of_answers = 2
question = "Why does the coffin prepared for Queequeg become Ishmael's life buoy once the Pequod sinks?"
answers_df = get_answer_from_eventhouse(question, nr_of_answers)

Concatenate the retrieved answers.

answer = ""
for row in answers_df.rdd.toLocalIterator():
    answer = answer + " " + row['content']

Build a prompt for GPT-4 that contains the question and the retrieved answers.

prompt="Question: {}".format(question) + '\n' + 'Information: {}'.format(answer)
# prepare prompt
messages = [{"role": "system", "content": "You are a HELPFUL assistant answering users questions. Answer the question using the provided information and do not add anything else."},
            {"role": "user", "content": prompt}]
result = call_openAI(messages)
display(result)

You have now built your first RAG app using MS Fabric.

The full code is available here.

Thank you,

Dennis




