MongoRAG: Leveraging MongoDB Atlas as a Vector Database with Databricks-Deployed Embedding Model and LLMs for Retrieval-Augmented Generation
towardsai.net
Author(s): Dwaipayan Bandyopadhyay. Originally published on Towards AI.

In this article, I will give a detailed walkthrough of how we can leverage MongoDB's own Atlas as a vector search index, together with an embedding model and an LLM served as endpoints in the Databricks portal, to perform Retrieval-Augmented Generation (RAG) on a piece of data.

Source: Image by Author

In today's AI world, where large amounts of structured and unstructured data are generated daily, using knowledge accurately has become the cornerstone of modern technology. Retrieval-Augmented Generation (RAG) is a widely used approach that solves real-world data problems by combining the power of generative AI and information retrieval.

RAG generally consists of three major steps, explained briefly below.

Information Retrieval: The first step involves retrieving relevant information from a knowledge base, database, or vector database, where we store the embeddings of the data from which we will retrieve information. Retrieval is typically done via similarity search, in which we compare the embedded query against the embeddings already stored in the vector database.

Augmentation: The retrieved information is then combined with the query asked by the user, so that the model has context for what has been asked and can form a better answer.

Generation: This is the final step, where a large language model comes into play. We feed the augmented information to the LLM, and it generates a proper, human-readable answer based on the information provided.
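The similarity search at the heart of the retrieval step is easy to illustrate in isolation. Below is a minimal, self-contained sketch that ranks stored vectors by cosine similarity to a query vector; it uses toy 3-dimensional vectors rather than real embeddings (the GTE-Large model used later in this article emits 1024-dimensional ones):

```python
import math

def cosine_similarity(a, b):
    # cosine similarity = dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; a real vector database stores one vector per chunk.
query_vec = [0.1, 0.9, 0.2]
doc_vecs = {"doc_a": [0.1, 0.8, 0.3], "doc_b": [0.9, 0.1, 0.0]}

# Rank the stored vectors by similarity to the query, highest first.
ranked = sorted(doc_vecs,
                key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```

A vector database such as Atlas performs the same kind of ranking at scale, using approximate nearest-neighbor indexes instead of a brute-force sort.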
Feeding in the augmented information is crucial; otherwise the AI might generate random information, since it has no context for what has been asked.

What is MongoDB Atlas?

Atlas is a multi-cloud database service provided by MongoDB in which developers can create clusters, databases, and indexes directly in the cloud, without installing anything locally. Essentially, it is MongoDB on the cloud. Users can create an account by signing up on the official website:

MongoDB Atlas: Cloud Document Database | MongoDB

After signing in for the very first time, follow the steps in the documentation below to spin up a free cluster:

Get Started with Atlas (MongoDB Atlas)

After the cluster has been created, it is time to create a database and a collection. Since MongoDB is a NoSQL database, we first create a database (similar in concept to a schema in SQL databases), and inside the database we create a collection in which we store documents (comparable to creating a table inside a database). If this feels confusing, refer to the MongoDB documentation on creating a database and a collection, but remember: do not add any documents yet; just create the database and the collection.

Connecting MongoDB with Python: the coding part starts now.

We will connect MongoDB with Python so that we can do the remaining steps programmatically, without using the UI at all.

To connect to and access MongoDB Atlas from Python, we need to install a package called pymongo. It can be installed with the following pip command:

```
pip install pymongo
```

After it has been installed, we will import the MongoClient class to connect to MongoDB from Python. For that we need the connection string, which can be found under the Drivers settings after clicking Connect on the cluster.
The process is shown in Step 2 of the following guide:

Quick Start: Getting Started With MongoDB Atlas and Python | MongoDB

Once you have the connection string, write and execute the code below to connect to MongoDB:

```python
from pymongo import MongoClient

client = MongoClient("YOUR_CONNECTION_URL")

dbName = "YOUR_DATABASE_NAME"
collectionName = "YOUR_COLLECTION_NAME"
collection = client[dbName][collectionName]
```

If no errors are raised, the connection to MongoDB has been established successfully.

With the connection in place, let's cover the other packages we need for the entire RAG process, apart from pymongo. Install the following packages via pip:

```
pip install langchain
pip install langchain_databricks
pip install langchain_mongodb
```

We only need these three packages for the whole process. After they are installed, import all the necessary classes:

```python
from pymongo import MongoClient
from langchain_mongodb.vectorstores import MongoDBAtlasVectorSearch
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_databricks import ChatDatabricks
from langchain_databricks import DatabricksEmbeddings
```

As we have already established the connection with MongoDB, let's load our data and chunk it using RecursiveCharacterTextSplitter.
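Before configuring the real splitter, it may help to see what fixed-size chunking with overlap does conceptually. The sketch below is a simplified stand-in for RecursiveCharacterTextSplitter (it ignores separators and the recursive fallback logic), with the sizes shrunk so the overlap is visible:

```python
def chunk_text(text, chunk_size, chunk_overlap):
    # Slide a window of chunk_size characters over the text, stepping by
    # (chunk_size - chunk_overlap) so consecutive chunks share some context.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "abcdefghij"  # stands in for the contents of story.txt
print(chunk_text(sample, chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

The overlap means a sentence cut off at a chunk boundary still appears intact in the next chunk, which helps retrieval return coherent context.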
We will keep each chunk at 1,000 characters with an overlap of 100 characters, using the paragraph break (\n\n) as the separator.

```python
# Importing the data using TextLoader
loader = TextLoader("story.txt")
data = loader.load()

# Configuring the chunking strategy (note: separators expects a list)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
    separators=["\n\n"],
)

# Keeping the chunks in this variable
chunked_docs = text_splitter.split_documents(data)
```

Configuring the LLM and Embedding Models

Next, we will configure the embedding model and the large language model we are going to use. Here we use models served as endpoints in the Databricks portal. If you don't have access to Databricks, you can take the usual approach of using the OpenAIEmbeddings and ChatOpenAI classes and configuring them accordingly.

```python
embeddings = DatabricksEmbeddings(
    endpoint="databricks-gte-large-en"
)

llm = ChatDatabricks(
    target_uri="databricks",
    endpoint="databricks-meta-llama-3-1-70b-instruct",
    temperature=0.0,
)
```

We will use the GTE-Large embedding model and the Meta Llama 3.1 70B Instruct model for this demo.

Creating a Vector Search Index in MongoDB Atlas

With all the configuration done, we will create a vector search index in Atlas, store our embeddings in it, and use them later for RAG. There are two ways to create the vector search index: through the UI or via code. Atlas provides a default search index name, vector_index; if that name works for you, just write and execute the following code:

```python
vectorStore = MongoDBAtlasVectorSearch.from_documents(
    chunked_docs, embeddings, collection=collection
)
vectorStore.create_vector_search_index(dimensions=1024)
```

This creates a vector search index named vector_index with dimension 1024, inside the collection we created earlier.
We just have to pass the chunked documents, along with the embeddings and the collection handle through which we connected to Atlas.

Image before the creation of the search index (before executing the code above). Source: Image by Author

Image after executing the code above (the default search index has been created). Source: Image by Author

As we can see, after executing the code above, our search index with the default name vector_index has been created, and 129 documents have been inserted (the number of chunks created earlier).

If you want to go a step further and create a search index with a custom name of your own, the code needs some changes. We first create the custom index with the user-provided name, and then insert the embeddings into it; done programmatically, this cannot happen in one go.

Creating the custom vector search index:

```python
MongoDBAtlasVectorSearch(
    index_name="mongo_rag",
    collection=collection,
    embedding=embeddings,
).create_vector_search_index(dimensions=1024)
```

Here we first create an index called mongo_rag with dimension 1024. The dimension is crucial whether we create the default index or a custom one: if it does not match the output dimension of the embedding model, the application will not even run. For the embedding model used here, GTE-Large, the dimension is 1024.

Image after creating the custom index (embeddings not yet added). Source: Image by Author

As we can see, the index has been created successfully, but the document count is still 129, because we have not yet populated the embeddings into this index.
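Since a dimension mismatch between the embedding model and the index only surfaces at runtime, it can be worth a quick sanity check before populating the index. The helper below is a hypothetical sketch, not a library function; in the real pipeline the vector could come from embeddings.embed_query("test"), while here we fake a 1024-dimensional one:

```python
def check_index_dimensions(embedding_vector, index_dimensions):
    # The vector length produced by the embedding model must equal the
    # dimensions value configured on the Atlas vector search index.
    actual = len(embedding_vector)
    if actual != index_dimensions:
        raise ValueError(
            f"Embedding model returns {actual}-dimensional vectors, "
            f"but the index is configured for {index_dimensions}."
        )
    return True

fake_vector = [0.0] * 1024  # stand-in for a GTE-Large embedding
check_index_dimensions(fake_vector, 1024)  # passes silently
```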
We should delete the previously added chunks first; otherwise we will just push the same chunks again, and the duplicated context might degrade the generated answers.

Populating the Custom Index with Embeddings

Using the following code, we can populate the custom index with the embeddings:

```python
vectorStore = MongoDBAtlasVectorSearch.from_documents(
    documents=chunked_docs,
    embedding=embeddings,
    collection=collection,
    index_name="mongo_rag",
)
```

In this approach, we provide index_name while inserting the embeddings, which stores them in that particular search index.

Designing the RAG Function

In this step, we design a generic RAG function using the LLM and endpoint configuration we defined earlier.

```python
def query_data(query):
    # Perform Atlas Vector Search using LangChain's vectorStore.
    # similarity_search returns the MongoDB documents most similar to the query.
    docs = vectorStore.similarity_search(query, k=3)

    # Putting the similar chunks into a list to print later
    similar_chunks = [chunk for chunk in docs]

    # Setting up the retriever defined using MongoDBAtlasVectorSearch
    retriever = vectorStore.as_retriever()

    # Load the "stuff" documents chain. The stuff documents chain takes a list
    # of documents, inserts them all into a prompt, and passes that prompt to an LLM.
    qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)

    # Execute the chain
    retriever_output = qa.invoke(query)

    # Return the Atlas Vector Search output and the RAG-generated answer
    return f"Similar Chunks\n-{similar_chunks}\n, Answer-{retriever_output}"
```

Now we will pass a sample query and check how it works:

```python
query = "Explain the Character of Macbeth"
query_data(query)
```

Answer:

'Similar Chunks\n-[Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b544\', \'source\': \'story.txt\'}, page_content="1606\\nTHE TRAGEDY OF MACBETH\\n\\n\\nby William Shakespeare\\n\\n\\n\\nDramatis Personae\\n\\n DUNCAN, King of Scotland\\n MACBETH, Thane of Glamis and Cawdor, a general in the King\'s\\narmy\\n LADY MACBETH, his wife\\n MACDUFF, Thane of Fife, a nobleman of Scotland\\n LADY MACDUFF, his wife\\n MALCOLM, elder son of Duncan\\n DONALBAIN, younger son of Duncan\\n BANQUO, Thane of Lochaber, a general in the King\'s army\\n FLEANCE, his son\\n LENNOX, nobleman of Scotland\\n ROSS, nobleman of Scotland\\n MENTEITH nobleman of Scotland\\n ANGUS, nobleman of Scotland\\n CAITHNESS, nobleman of Scotland\\n SIWARD, Earl of Northumberland, general of the English forces\\n YOUNG SIWARD, his son\\n SEYTON, attendant to Macbeth\\n HECATE, Queen of the Witches\\n The Three Witches\\n Boy, Son of Macduff \\n Gentlewoman attending on Lady Macbeth\\n An English Doctor\\n A Scottish Doctor\\n A Sergeant\\n A Porter\\n An Old Man\\n The Ghost of Banquo and other Apparitions\\n Lords, Gentlemen, Officers, Soldiers, Murtherers, Attendants,"), Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b5a7\', \'source\': \'story.txt\'}, page_content="Was a most sainted king; the queen that bore thee,\\n Oftener upon her knees than on her feet,\\n Died every day she lived. Fare thee well!\\n These evils thou repeat\'st upon thyself\\n Have banish\'d me from Scotland. O my breast,\\n Thy hope ends here!\\n MALCOLM. 
Macduff, this noble passion,\\n Child of integrity, hath from my soul\\n Wiped the black scruples, reconciled my thoughts\\n To thy good truth and honor. Devilish Macbeth\\n By many of these trains hath sought to win me\\n Into his power, and modest wisdom plucks me\\n From over-credulous haste. But God above\\n Deal between thee and me! For even now\\n I put myself to thy direction and \\n Unspeak mine own detraction; here abjure\\n The taints and blames I laid upon myself,\\n For strangers to my nature. I am yet\\n Unknown to woman, never was forsworn,\\n Scarcely have coveted what was mine own,\\n At no time broke my faith, would not betray\\n The devil to his fellow, and delight"), Document(metadata={\'_id\': \'6798ecf0a3137ba55ff6b57d\', \'source\': \'story.txt\'}, page_content="Particular addition, from the bill\\n That writes them all alike; and so of men. \\n Now if you have a station in the file,\\n Not i\' the worst rank of manhood, say it,\\n And I will put that business in your bosoms\\n Whose execution takes your enemy off,\\n Grapples you to the heart and love of us,\\n Who wear our health but sickly in his life,\\n Which in his death were perfect.\\n SECOND MURTHERER. I am one, my liege,\\n Whom the vile blows and buffets of the world\\n Have so incensed that I am reckless what\\n I do to spite the world.\\n FIRST MURTHERER. And I another\\n So weary with disasters, tugg\'d with fortune,\\n That I would set my life on any chance,\\n To mend it or be rid on\'t.\\n MACBETH. Both of you\\n Know Banquo was your enemy.\\n BOTH MURTHERERS. True, my lord.\\n MACBETH. So is he mine, and in such bloody distance\\n That every minute of his being thrusts \\n Against my near\'st of life; and though I could")]\n, Answer-{\'query\': \'Explain the Character of Macbeth\', \'result\': "Based on the provided context, Macbeth is a complex character who is the Thane of Glamis and Cawdor, and a general in the King\'s army. 
He is a prominent figure in the play and is driven by a desire for power and prestige. \\n\\nInitially, Macbeth is portrayed as a respected and accomplished military leader, but as the play progresses, his darker qualities are revealed. He is shown to be ruthless, ambitious, and willing to do whatever it takes to achieve his goals, including murder. \\n\\nMacbeth\'s relationship with his wife, Lady Macbeth, also plays a significant role in shaping his character. He is influenced by her goading and encouragement, which pushes him to commit regicide and seize the throne. \\n\\nHowever, Macbeth\'s actions are also motivated by a sense of insecurity and paranoia, as he becomes increasingly obsessed with the idea of being overthrown and killed. This fear drives him to order the murder of his friend Banquo and his family, further highlighting his descent into darkness and tyranny.\\n\\nThroughout the play, Macbeth\'s character undergoes a significant transformation, from a respected nobleman to a tyrannical and isolated ruler. His downfall is ultimately sealed when he is killed by Macduff, and his head is brought to Malcolm, the rightful king. \\n\\nIt\'s worth noting that the provided context only gives a glimpse into Macbeth\'s character, and a more comprehensive understanding would require a broader analysis of the entire play."}'

The output can be further modified based on the requirement.

Published via Towards AI