AISite: Conversational AI Site for a Private Knowledge Base

A full-stack application leveraging Retrieval-Augmented Generation (RAG) to provide accurate answers from private documents.

[Screenshot: AI Chatbot Interface]

Project Overview

AISite is a full-stack conversational AI platform that serves as an intelligent interface to an internal knowledge base. It lets users get fast, accurate answers from a private collection of documents (PDFs, DOCX, etc.) through an intuitive chat interface, while keeping the AI from hallucinating or answering from outside its designated knowledge scope.

Core AI Technology: Retrieval-Augmented Generation (RAG)

The chatbot's intelligence comes from a RAG pipeline designed to ground every answer in the provided documents. The process works as follows (illustrative sketches of the main stages follow the list):

  1. Query Transformation: The user's raw question is first processed by a small, fast LLM (Llama 3.2 3B) to rephrase it into an optimized search query.
  2. Vector Retrieval: The optimized query is used to search a FAISS vector database, retrieving the 16 most semantically similar text chunks from the knowledge base.
  3. Relevance Gate: To reduce noise, a hybrid filter ensures only the most relevant chunks are used. The top-scoring chunk is always included, while the rest must pass a similarity threshold to proceed.
  4. Prompt Engineering: The filtered, relevant text chunks are compiled into a single context block, which is then combined with the user's original question to create a final, context-rich prompt.
  5. Answer Generation: This final prompt is sent to the main LLM (Llama 3.2 3B or DeepSeek Qwen 1.5B), which is strictly instructed to formulate an answer based *only* on the provided context, and to answer "I don't know" if the information is not present.
  6. Streaming Response: The generated answer is streamed back to the user word-by-word, creating a responsive and dynamic chat experience.
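
As a rough illustration of steps 1 and 2, the sketch below rewrites the question with a small LLM and queries a FAISS index for the 16 nearest chunks. The `llm`, `embed`, and chunk-list objects are stand-ins; the project's actual client code and index construction are not shown here.

```python
# Sketch of steps 1-2: query rewrite with a small LLM, then FAISS similarity search.
import faiss
import numpy as np

TOP_K = 16  # number of chunks retrieved per query (step 2 above)


def rewrite_query(raw_question: str, llm) -> str:
    """Step 1: ask the small LLM (e.g. Llama 3.2 3B) to turn the raw question
    into a concise search query. llm.generate() is a stand-in for the real client call."""
    prompt = (
        "Rewrite the following question as a short search query. "
        "Return only the query.\n\n"
        f"Question: {raw_question}"
    )
    return llm.generate(prompt).strip()


def retrieve_chunks(query: str, index: faiss.Index, chunks: list[str], embed) -> list[tuple[str, float]]:
    """Step 2: embed the optimized query and return the TOP_K most similar
    chunks together with their similarity scores."""
    vector = np.asarray([embed(query)], dtype="float32")
    scores, ids = index.search(vector, TOP_K)
    return [(chunks[i], float(s)) for i, s in zip(ids[0], scores[0]) if i != -1]
```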
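
Step 3's hybrid filter could look like the following sketch: the best chunk always passes, while the others must meet a similarity threshold. The threshold value and the best-first sort order are assumptions, not the project's actual settings.

```python
# Sketch of step 3: hybrid relevance gate.
SIMILARITY_THRESHOLD = 0.35  # illustrative value; the project's real threshold may differ


def filter_chunks(ranked: list[tuple[str, float]]) -> list[str]:
    """ranked is assumed to be sorted best-first by similarity
    (e.g. inner-product scores from the FAISS search above)."""
    if not ranked:
        return []
    best, rest = ranked[0], ranked[1:]
    kept = [best[0]]  # the top-scoring chunk is always included
    kept += [chunk for chunk, score in rest if score >= SIMILARITY_THRESHOLD]
    return kept
```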
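
Steps 4 through 6 assemble the prompt and stream the answer. The system-prompt wording and the `llm.stream()` call below are illustrative; the real pipeline may phrase its instruction and invoke its model differently.

```python
# Sketch of steps 4-6: prompt assembly and streamed generation.
from typing import Iterator

SYSTEM_PROMPT = (
    "Answer the question using ONLY the context below. "
    "If the answer is not in the context, reply \"I don't know\"."
)


def build_prompt(question: str, chunks: list[str]) -> str:
    """Step 4: compile the filtered chunks into one context block and combine
    it with the user's original question."""
    context = "\n\n".join(chunks)
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"


def stream_answer(question: str, chunks: list[str], llm) -> Iterator[str]:
    """Steps 5-6: send the prompt to the main LLM and yield the answer
    incrementally so the frontend can render it word by word."""
    prompt = build_prompt(question, chunks)
    for token in llm.stream(prompt):  # llm.stream() is a stand-in for the streaming client call
        yield token
```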

Full-Stack Architecture

The application is built on a modern, decoupled architecture with three main components communicating via a REST API (a minimal endpoint sketch follows the component list).

  • Frontend: An interactive and responsive user interface built with Next.js, TypeScript, and Material-UI (MUI).
  • Backend: A high-performance API server built with FastAPI (Python) and SQLAlchemy for business logic, AI processing, and database interactions.
  • Database: Microsoft SQL Server is used for persistent data storage, including user information, chat history, and document metadata.
  • Deployment: The entire stack is deployed on a Windows Server, with the Frontend and Backend processes managed as continuous services by NSSM, and unified under a single entry point using IIS as a reverse proxy.
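
For a sense of how the pieces connect, here is a minimal sketch of a FastAPI endpoint that the Next.js frontend could call and render token by token. The route path, request model, and `run_rag_pipeline()` placeholder are assumptions for illustration, not the project's actual API.

```python
# Sketch: streaming chat endpoint exposed by the FastAPI backend.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    question: str


def run_rag_pipeline(question: str):
    """Placeholder for the rewrite -> retrieve -> filter -> generate pipeline
    sketched above; yields answer tokens as they are produced."""
    yield f"(placeholder answer for: {question})"


@app.post("/api/chat")
def chat(req: ChatRequest) -> StreamingResponse:
    # Stream plain-text tokens back to the client as they are generated.
    return StreamingResponse(run_rag_pipeline(req.question), media_type="text/plain")
```

The frontend can consume this response with a streaming fetch and append tokens to the chat window as they arrive, which is what produces the word-by-word effect described above.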