Building a RAG application with Microsoft Fabric

by info.odysseyx@gmail.com, August 13, 2024

Introduction

This article walks you through building a generative AI application in Microsoft Fabric: a Retrieval Augmented Generation (RAG) system that uses Azure OpenAI for embeddings and chat completions, and Microsoft Fabric Eventhouse as the vector store.

Why MS Fabric Eventhouse?

Fabric Eventhouse is built on the Kusto engine, which delivers best-in-class performance for large-scale similarity search. If you want to build a RAG application over a large number of embedding vectors, look no further: with MS Fabric you get both the processing power to build the vector index and the high-performance engine that powers the Fabric Eventhouse DB.

To learn more about using Fabric Eventhouse as a vector store, see the following links:

Azure Data Explorer for vector similarity search
Optimizing vector similarity search in Azure Data Explorer – Performance updates
Optimizing vector similarity search at scale

RAG – What is Retrieval Augmented Generation?

Large language models (LLMs) are very good at generating human-like text. Out of the box, an LLM has broad knowledge drawn from the wide range of datasets it was trained on. That breadth provides flexibility, but it may lack the specialized focus or knowledge required for a particular topic.

Retrieval Augmented Generation (RAG) is a technique that improves the relevance and accuracy of LLM responses by incorporating relevant information retrieved at query time. With RAG, the LLM is paired with a retrieval system that searches unstructured text for relevant passages, which are then used to ground the LLM's response.

What is a vector database?

A vector database is an essential component of the retrieval step in RAG: it lets us quickly and efficiently identify the sections of text most relevant to a query, based on how closely their embeddings match the query's embedding.

A vector DB is a data store optimized for storing and processing vector data. Vector data can represent geometric shapes, spatial data, or more abstract high-dimensional data used in machine learning applications, such as embeddings. These databases are designed to efficiently handle operations such as similarity search, nearest-neighbor search, and other operations common in high-dimensional vector spaces.

In machine learning, for example, it is common to use models such as word embeddings or image embeddings to transform text, images, or other complex data into high-dimensional vectors. To search and compare these vectors efficiently, vector databases or vector stores rely on specialized indexing and search algorithms.

In our case, we will use the Azure OpenAI Ada embeddings model to generate the embeddings. Embeddings are vector representations of text, which we index and store in a Microsoft Fabric Eventhouse DB.

Code

You can find the code here.

We will use the Project Gutenberg book Moby Dick in PDF format as our knowledge base. We read the PDF file, split the text into 1000-character chunks, compute an embedding for each chunk, and store the text and embeddings in our vector database (Fabric Eventhouse). Then we ask a question, retrieve the most relevant chunks from the vector DB, and send the question together with the retrieved chunks to Azure OpenAI GPT-4 to get a response in natural language.
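To make "similarity search" concrete: retrieval boils down to comparing the question's embedding vector against each stored chunk's embedding vector, typically by cosine similarity. Here is a minimal, self-contained sketch of that comparison using plain numpy and tiny made-up 4-dimensional vectors (real ada-002 embeddings have 1,536 dimensions); it is an illustration only, not part of the notebook:

import numpy as np

def cosine_similarity(a, b):
    # cosine similarity = dot(a, b) / (|a| * |b|); 1.0 means identical direction
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# toy 4-dimensional "embeddings" for illustration only
query   = np.array([0.10, 0.30, 0.50, 0.10])
chunk_a = np.array([0.12, 0.28, 0.52, 0.08])   # points in nearly the same direction
chunk_b = np.array([0.90, 0.05, 0.01, 0.04])   # points elsewhere

print(cosine_similarity(query, chunk_a))  # close to 1.0 -> likely relevant
print(cosine_similarity(query, chunk_b))  # much lower -> likely irrelevant

Eventhouse's series_cosine_similarity function, used later in this guide, computes this same score at scale over the stored embeddings.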
File processing and embedding indexing

We do this only once, to generate the embeddings and store them in our vector database, Fabric Eventhouse:

Read the file from the Fabric Lakehouse.
Create embeddings from the text using the Azure OpenAI Ada embeddings model.
Store the text and embeddings in the Fabric Eventhouse DB.

RAG – Getting answers

Every time we ask a question of our knowledge base, we do the following:

Create an embedding for the question and use similarity search to retrieve the most relevant chunks from Fabric Eventhouse.
Combine the question and the retrieved chunks and call the Azure OpenAI GPT-4 model to get a "natural language" answer.

Prerequisites

To follow this guide, make sure you have access to the following services, with the necessary credentials and keys set up:

Microsoft Fabric.
Azure OpenAI Studio, to manage and deploy OpenAI models.

Setup

Create a Fabric workspace.
Create a Lakehouse.
Upload the Moby Dick PDF file.
Create an Eventhouse DB called "GenAI_eventhouse".
Click the DB name, then click "Explore Data" at the top right.

Create the "bookEmbeddings" table by pasting the following command and running it:

.create table bookEmbeddings (document_name:string, content:string, embedding:dynamic)

Create a notebook. Get your Azure OpenAI endpoint and secret key and paste them into the notebook, changing the model deployment names if necessary. Get the Eventhouse URI and paste it into the notebook as "KUSTO_URI".

Attach the notebook to your Lakehouse.

Let's run the notebook

Run cell 1. This installs all the required Python libraries:

%pip install openai==1.12.0 azure-kusto-data langchain tenacity langchain-openai pypdf

Run cell 2 after configuring the environment variables. The scraped post only shows the OpenAI settings; the Kusto database, table, and access token used by the later cells are added here, with values matching the Eventhouse DB and table created above (in a Fabric notebook, mssparkutils can issue a token for the Kusto endpoint):

OPENAI_GPT4_DEPLOYMENT_NAME = "gpt-4"
OPENAI_DEPLOYMENT_ENDPOINT = ""
OPENAI_API_KEY = ""
OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME = "text-embedding-ada-002"
KUSTO_URI = ""
# names of the Eventhouse DB and table created in the Setup section
KUSTO_DATABASE = "GenAI_eventhouse"
KUSTO_TABLE = "bookEmbeddings"
# token for the Eventhouse (Kusto) endpoint, issued for the current Fabric user
accessToken = mssparkutils.credentials.getToken(KUSTO_URI)

Run cell 3. Here we create the Azure OpenAI client and define a function that computes embeddings:

from openai import AzureOpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt

client = AzureOpenAI(
    azure_endpoint=OPENAI_DEPLOYMENT_ENDPOINT,
    api_key=OPENAI_API_KEY,
    api_version="2023-09-01-preview"
)

# we use the tenacity library to add delays and retries when calling the openAI
# embeddings API, to avoid hitting throttling limits
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def generate_embeddings(text):
    # replace newlines, which can negatively affect performance
    txt = text.replace("\n", " ")
    return client.embeddings.create(input=[txt], model=OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME).data[0].embedding

Run cell 4. Read the file and split it into 1000-character chunks (import paths may vary with your langchain version; newer releases move these into langchain_community and langchain_text_splitters):

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader

# splitting into 1000-char long chunks with 30-char overlap,
# using the default separators ["\n\n", "\n", " ", ""]
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=30,
)

documentName = "moby dick book"
# Copy File API path
fileName = "/lakehouse/default/Files/moby dick.pdf"
loader = PyPDFLoader(fileName)
pages = loader.load_and_split(text_splitter=splitter)
print("Number of pages: ", len(pages))
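If you want to sanity-check the chunking parameters before processing the whole book, a quick experiment like the following (the same splitter class, with deliberately tiny chunk sizes so the effect is visible, and a sample sentence chosen here just for illustration) shows how the overlap lets neighboring chunks share a little context:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# tiny chunks so the overlap is easy to see; the real pipeline uses 1000/30
demo_splitter = RecursiveCharacterTextSplitter(chunk_size=40, chunk_overlap=10)
chunks = demo_splitter.split_text(
    "Call me Ishmael. Some years ago - never mind how long precisely - "
    "having little or no money in my purse, I thought I would sail about."
)
for c in chunks:
    print(repr(c))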
Run cell 5. Save the text chunks into a pandas dataframe:

# save all the pages into a pandas dataframe
import pandas as pd

df = pd.DataFrame(columns=['document_name', 'content', 'embedding'])
for page in pages:
    df.loc[len(df.index)] = [documentName, page.page_content, ""]
df.head()

Run cell 6. Compute the embeddings:

# calculate the embeddings using openAI ada
df["embedding"] = df.content.apply(lambda x: generate_embeddings(x))
print(df.head(2))

Run cell 7. Write the data to MS Fabric Eventhouse:

df_sp = spark.createDataFrame(df)

df_sp.write.\
    format("com.microsoft.kusto.spark.synapse.datasource").\
    option("kustoCluster", KUSTO_URI).\
    option("kustoDatabase", KUSTO_DATABASE).\
    option("kustoTable", KUSTO_TABLE).\
    option("accessToken", accessToken).\
    mode("Append").save()

Let's check that the data was stored in the vector database. Go to the Eventhouse and run this query:

bookEmbeddings | take 10

Go back to your notebook and run the remaining cells.

Create a function that calls GPT-4 for natural-language answers:

def call_openAI(text):
    response = client.chat.completions.create(
        model=OPENAI_GPT4_DEPLOYMENT_NAME,
        messages=text,
        temperature=0
    )
    return response.choices[0].message.content

Create a function that retrieves answers from Eventhouse using similarity search over the embeddings:

def get_answer_from_eventhouse(question, nr_of_answers=1):
    searchedEmbedding = generate_embeddings(question)
    # rank the stored chunks by cosine similarity to the question's embedding
    kusto_query = KUSTO_TABLE + " | extend similarity = series_cosine_similarity(dynamic(" \
        + str(searchedEmbedding) + "), embedding) | top " + str(nr_of_answers) + " by similarity desc"
    kustoDf = spark.read\
        .format("com.microsoft.kusto.spark.synapse.datasource")\
        .option("kustoCluster", KUSTO_URI)\
        .option("kustoDatabase", KUSTO_DATABASE)\
        .option("accessToken", accessToken)\
        .option("kustoQuery", kusto_query).load()
    return kustoDf

Retrieve the 2 best answers from Eventhouse:

nr_of_answers = 2
question = "Why does the coffin prepared for Queequeg become Ishmael's life buoy once the Pequod sinks?"
answers_df = get_answer_from_eventhouse(question, nr_of_answers)

Concatenate the answers:

answer = ""
for row in answers_df.rdd.toLocalIterator():
    answer = answer + " " + row['content']

Build a prompt for GPT-4 containing the question and the two retrieved answers:

# prepare prompt
prompt = "Question: {}".format(question) + '\n' + 'Information: {}'.format(answer)
messages = [{"role": "system", "content": "You are a HELPFUL assistant answering users questions. Answer the question using the provided information and do not add anything else."},
            {"role": "user", "content": prompt}]

result = call_openAI(messages)
display(result)

You have now built your first RAG app using MS Fabric. You can find all the code here.

Thank you,
Dennis
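To ask further questions without re-running the individual cells, the retrieval and generation steps can be folded into a single helper. This is just a convenience sketch that reuses the functions defined above, and the sample question at the end is hypothetical:

def ask(question, nr_of_answers=2):
    # retrieve the nr_of_answers most similar chunks from Eventhouse
    answers_df = get_answer_from_eventhouse(question, nr_of_answers)

    # concatenate the retrieved chunks into a single context string
    context = ""
    for row in answers_df.rdd.toLocalIterator():
        context = context + " " + row['content']

    # build the grounded prompt and let GPT-4 phrase the final answer
    prompt = "Question: {}".format(question) + '\n' + 'Information: {}'.format(context)
    messages = [{"role": "system", "content": "You are a HELPFUL assistant answering users questions. Answer the question using the provided information and do not add anything else."},
                {"role": "user", "content": prompt}]
    return call_openAI(messages)

# hypothetical follow-up question, for illustration
print(ask("What does the white whale symbolize for Ahab?"))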