Coding Track for Module 3

Building a Library RAG Assistant

The Coding Track for Module 3 introduces the core architecture behind document-grounded chatbots. In a library setting, this means teaching an assistant to answer from local policies, guides, and service documentation instead of relying only on what the model absorbed during training.


Project Reference: IMLS_Rag_lib

Use the project repository as a working example of how a library-focused RAG system can be structured, tested, and extended across ingestion, retrieval, and response generation.

View IMLS_Rag_lib on GitHub →

Key Concepts

The RAG Pipeline

A RAG workflow usually has four major stages: loading documents, chunking them into smaller passages, embedding those passages as vectors, and retrieving the most relevant chunks at question time.

Loading: bring in PDFs, policy files, FAQs, LibGuide text, or web pages.
Chunking: split long documents into pieces small enough for precise retrieval.
Embedding: convert each chunk into a numeric vector that captures meaning.
Retrieval: find the chunks most semantically similar to a user question.
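The four stages above can be sketched end to end in plain Python. This is a toy illustration, not how a production system embeds text: the "embedding" is a simple word-count vector, the policy text and its numbers are invented for the example, and the helper names (`chunk_sentences`, `embed`, `retrieve`) are hypothetical.

```python
import math
import re
from collections import Counter

# Stage 1 (loading): a made-up policy text standing in for a loaded file.
DOCUMENT = (
    "Graduate students may borrow up to 50 items at a time. "
    "Undergraduate students may borrow up to 25 items at a time. "
    "Study rooms can be reserved for two hours per day. "
    "Overdue laptops are fined one dollar per hour."
)

def chunk_sentences(text):
    """Stage 2 (chunking): one chunk per sentence."""
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]

def embed(text):
    """Stage 3 (embedding): toy bag-of-words vector, not a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, index, k=1):
    """Stage 4 (retrieval): return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

chunks = chunk_sentences(DOCUMENT)
index = [(c, embed(c)) for c in chunks]
top = retrieve("How many items can graduate students borrow?", index, k=1)
print(top[0])
```

Word counts only reward shared vocabulary, so this sketch behaves like keyword search. Swapping `embed` for a real embedding model is what makes retrieval semantic.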

LangChain Basics

LangChain helps connect the parts of a RAG workflow into one pipeline. In practice, you can "chain" a document retriever with an LLM so the model receives both the user question and the most relevant library passages before generating an answer.
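The idea of "chaining" can be illustrated without any framework: each step takes a dictionary, adds something to it, and hands it to the next step. The sketch below is only an analogy for what LangChain wires together. The `chain` helper, the two sample passages, and the `fake_llm` stand-in are all assumptions made for illustration, and no model is actually called.

```python
from functools import reduce

# Made-up passages standing in for indexed library documents.
PASSAGES = [
    "Study rooms can be reserved for up to two hours per day.",
    "Interlibrary loan requests usually arrive within one week.",
]

def tokens(text):
    """Lowercased words with surrounding punctuation removed."""
    return {w.strip("?.,").lower() for w in text.split()}

def retrieve(state):
    """Step 1: attach passages that share at least one word with the question."""
    question_words = tokens(state["question"])
    hits = [p for p in PASSAGES if question_words & tokens(p)]
    return {**state, "context": hits}

def build_prompt(state):
    """Step 2: combine the question and the retrieved passages into one prompt."""
    context = "\n".join(state["context"])
    prompt = f"Context:\n{context}\n\nQuestion: {state['question']}"
    return {**state, "prompt": prompt}

def fake_llm(state):
    """Step 3: stand-in for a model call; it only reports what it received."""
    answer = f"(model would answer using {len(state['context'])} passage(s))"
    return {**state, "answer": answer}

def chain(*steps):
    """Compose steps left to right, passing the state dict along."""
    return lambda state: reduce(lambda acc, step: step(acc), steps, state)

rag_chain = chain(retrieve, build_prompt, fake_llm)
result = rag_chain({"question": "How long can I reserve a study room?"})
print(result["answer"])
```

The design point is that the model step never sees the question alone: by the time it runs, the retrieved passages are already part of the prompt.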

Vector Databases (FAISS / Chroma)

Tools like FAISS and Chroma store the embedding vector for each document chunk and index those vectors for fast similarity search. Instead of matching only exact keywords, they support semantic search, which means the system can retrieve passages that are conceptually related to the question even when the wording is different.
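A small example may make the difference between keyword matching and semantic search concrete. The vectors below are hand-assigned for illustration (an assumption: a real embedding model would produce them, with hundreds of dimensions rather than three), and the passages are invented.

```python
import math

# Hand-assigned toy vectors. Dimensions loosely mean [borrowing, spaces, fees].
PASSAGES = {
    "You can extend your loan twice online.": [0.9, 0.0, 0.1],
    "Group study rooms are on the second floor.": [0.0, 1.0, 0.0],
    "Late fees are capped at ten dollars.": [0.3, 0.0, 0.9],
}
QUERY_TEXT = "How do I renew a book?"
QUERY_VECTOR = [1.0, 0.0, 0.0]  # assumed embedding of the query

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def words(text):
    """Lowercased words with surrounding punctuation removed."""
    return {w.strip("?.,").lower() for w in text.split()}

def keyword_hits(query, passage):
    """Words that the query and the passage literally share."""
    return words(query) & words(passage)

# Vector search: pick the passage whose vector points most nearly the same way.
best = max(PASSAGES, key=lambda p: cosine(QUERY_VECTOR, PASSAGES[p]))
print(best)
print(keyword_hits(QUERY_TEXT, best))
```

The query and the best passage share no words ("renew a book" versus "extend your loan"), so a keyword search would miss it, while the vector comparison still ranks it first.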

Prompt Templates

A strong system message tells the assistant how to behave. For library bots, prompt templates can define tone, citation behavior, boundaries, and fallback rules so the model acts like a professional librarian instead of a generic chatbot.

💡 Coding Projects

Python + LangChain

Task A: Understand the RAG Pipeline in Code

The example below shows how the major RAG stages connect in a single workflow: load library documents, split them into chunks, embed them, store them in a vector index, and retrieve relevant context for a user question.

# Imports for loading, chunking, embedding, and indexing
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# 1. Load documents
loader = TextLoader("library_policy.txt")
documents = loader.load()

# 2. Chunk the documents
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# 3. Embed and index (OpenAIEmbeddings reads the OPENAI_API_KEY environment variable)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)

# 4. Retrieve relevant chunks
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
results = retriever.invoke("What is the borrowing limit for graduate students?")
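To see what `chunk_size=500` and `chunk_overlap=100` mean in the listing above, here is a simplified fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break at separators such as paragraphs and sentences, so its real boundaries will differ. The `sliding_chunks` helper is hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# A 1,200-character dummy text made of repeating digits.
text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text, chunk_size=500, chunk_overlap=100)

print([len(c) for c in chunks])
# The tail of one chunk is repeated at the head of the next.
print(chunks[0][-100:] == chunks[1][:100])
```

The overlap exists so that a sentence falling on a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.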

Why this matters in libraries

This pattern allows a chatbot to answer from local library documents such as circulation policies, study room rules, instruction guides, and service FAQs instead of relying on generalized internet text.


🚀 Once retrieval works, the next step is teaching the assistant how to answer like a librarian.

💡 Mini Exercise

Prompt Design

Task B: Write a Library System Prompt

Scenario: “Answer like a professional librarian.”
In this task, learners design a prompt template that tells the model to stay grounded in retrieved context, avoid unsupported claims, and escalate uncertain questions to library staff.

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", """
You are a professional academic librarian assistant.
Answer using only the retrieved library context.
If the context does not contain the answer, say you are not certain.
Do not invent policies, hours, or holdings.
When appropriate, recommend contacting library staff for confirmation.
"""),
    ("human", """
Question: {question}

Retrieved context:
{context}
""")
])
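The `{context}` slot in the template expects a single string, so the retrieved chunks have to be joined before the prompt is filled. The helper below is one possible way to do that, sketched in plain Python. The `format_context` name, the dictionary shape of each chunk, the file names, and the `max_chars` limit are all assumptions for illustration.

```python
def format_context(chunks, max_chars=1500):
    """Join retrieved chunks into one numbered, source-labelled string."""
    lines = []
    used = 0
    for number, chunk in enumerate(chunks, start=1):
        entry = f"[{number}] ({chunk['source']}) {chunk['text']}"
        if used + len(entry) > max_chars:
            break  # stop before the context grows past the budget
        lines.append(entry)
        used += len(entry)
    return "\n".join(lines) if lines else "No relevant passages were retrieved."

# Made-up retrieval results for demonstration.
retrieved = [
    {"source": "circulation_policy.txt", "text": "Loans renew automatically unless recalled."},
    {"source": "faq.txt", "text": "Renewals can also be requested at the service desk."},
]
context = format_context(retrieved)
print(context)
```

Numbering the passages and naming their sources gives the model something concrete to cite, and the explicit "no passages" message gives the system prompt's uncertainty rule something to act on.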

Good Prompt Behaviors

Uses retrieved passages before answering.
Maintains a helpful, professional tone.
Admits uncertainty when evidence is missing.
Protects against hallucinated library policies.
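Some of these behaviors can also be enforced in code rather than left entirely to the prompt. The sketch below skips the model call when no retrieved chunk is similar enough to the question. It assumes scores where higher means more similar (some vector stores return distances, where lower is better), and the threshold, the sample passages, and the `stub_generate` stand-in are hypothetical.

```python
FALLBACK = (
    "I'm not certain based on the documents I have. "
    "Please contact library staff to confirm."
)

def select_context(scored_chunks, min_score=0.75):
    """Keep only chunks whose similarity score clears the threshold."""
    return [text for text, score in scored_chunks if score >= min_score]

def answer(question, scored_chunks, generate):
    """Return the fallback when evidence is missing, else call the generator."""
    context = select_context(scored_chunks)
    if not context:
        return FALLBACK
    return generate(question, context)

def stub_generate(question, context):
    """Stand-in for a model call; a real system would send a prompt to an LLM."""
    return f"Based on {len(context)} passage(s): {context[0]}"

# Made-up retrieval results: (passage, similarity score).
weak = [("Study rooms close at 10 pm.", 0.42)]
strong = [("Graduate loans last 120 days.", 0.91)]

print(answer("Can alumni borrow laptops?", weak, stub_generate))
print(answer("How long are graduate loans?", strong, stub_generate))
```

A suitable threshold depends on the embedding model and the collection, so it would need to be tuned against real questions rather than copied from this example.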

Where FAISS / Chroma Fit

Store chunk embeddings for fast similarity search.
Return semantically related passages to the prompt.
Improve answers even when wording does not exactly match.
Support scalable search across large document collections.
