# RAG Chatbot

A RAG chatbot with a localhost MCP server.

This project builds a RAG-based HR chatbot that answers questions about workplace rules, using a localhost MCP server as a function-calling hub.
## Overview
This project implements a Retrieval-Augmented Generation (RAG) chatbot using Streamlit and the MCP server. Users can upload PDF files, and the chatbot retrieves relevant information from the PDFs to answer natural language questions. The system leverages OpenAI models, LangChain utilities, and an in-memory vector store for efficient document retrieval.
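The retrieval core described above can be sketched in plain Python. This is an illustrative toy, not the project's code: a bag-of-words counter stands in for `OpenAIEmbeddings`, and a plain chunk list stands in for the vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Hypothetical stand-in for OpenAIEmbeddings: a sparse bag-of-words vector.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by cosine similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Employees receive 25 days of paid leave per year.",
    "The office kitchen is cleaned every Friday.",
    "Remote work requires manager approval.",
]
print(retrieve("How many days of paid leave do employees get?", chunks, k=1))
```

In the actual app, the embeddings are dense vectors from the OpenAI API and the ranking is handled by the in-memory vector store, but the retrieval logic is the same idea.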
## Features
- **MCP Tool Integration:** Tool orchestration with MCP ensures smooth communication between document indexing, retrieval, and answer generation. Backend tools can be extended or replaced easily as new functionality is added.
- **PDF Upload and Parsing:** Upload a PDF file, which is parsed using `PDFPlumberLoader` to extract its text content.
- **Text Chunking:** The extracted text is split into smaller chunks using `RecursiveCharacterTextSplitter` to facilitate efficient indexing and retrieval.
- **Document Indexing:** Chunks are indexed in an in-memory vector store, with embeddings generated via `OpenAIEmbeddings`.
- **Similarity Search (Cosine Similarity):** When a user submits a query, the chatbot performs a similarity search to retrieve the document chunks most relevant to the query.
- **Prompt-Based Answer Generation:** A custom prompt template combines the user question with the retrieved context, and a GPT-4-powered LLM (using `ChatOpenAI`) generates the final answer.
- **Interactive Interface:** The application uses Streamlit to provide an interactive, chat-like interface that displays user questions and bot responses.
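The chunking step can be illustrated with a much-simplified stand-in for `RecursiveCharacterTextSplitter`: fixed-size character windows with overlap, so that sentences cut at a boundary still appear whole in an adjacent chunk. The function name and parameters here are illustrative, not the project's actual configuration.

```python
def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    # Simplified chunker: slide a fixed-size window over the text,
    # stepping by (chunk_size - overlap) so consecutive chunks share context.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

policy = (
    "Employees receive 25 days of paid leave per year. "
    "Unused leave may be carried over with manager approval."
)
print(split_text(policy, chunk_size=60, overlap=15))
```

The real splitter is smarter — it recurses over separators (paragraphs, sentences, words) before falling back to raw characters — but the chunk-size/overlap trade-off it exposes is the same.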
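For the prompt-based answer generation step, a minimal template might look like the following. The wording and function names are assumptions for illustration, not the project's actual prompt; in the app, the assembled prompt is sent to the GPT-4 model via `ChatOpenAI`, which is left out here.

```python
# Hypothetical prompt template combining the user question with retrieved context.
PROMPT_TEMPLATE = """You are an HR assistant. Answer the question using only the context below.
If the answer is not in the context, say you don't know.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    # Join the retrieved chunks into a single context block and fill the template.
    context = "\n\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    "How many days of paid leave do employees get?",
    ["Employees receive 25 days of paid leave per year."],
)
print(prompt)
```

Grounding the model in retrieved context this way, with an explicit "say you don't know" instruction, is what keeps the HR bot from inventing policies that aren't in the uploaded documents.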