Learn how retrieval, vector search, and prompt design come together to build grounded library assistants with code.
The Coding track for Module 3 introduces the core architecture behind document-grounded chatbots. In a library setting, this means teaching an assistant to answer with local policies, guides, and service documentation instead of relying only on generic model memory.
Use the project repository as a working example of how a library-focused RAG system can be structured, tested, and extended across ingestion, retrieval, and response generation.
View IMLS_Rag_lib on GitHub
A RAG workflow usually has four major stages: loading documents, chunking them into smaller passages, embedding those passages as vectors, and retrieving the most relevant chunks at question time.
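To make the chunking stage concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplified illustration, not the algorithm a library splitter such as RecursiveCharacterTextSplitter uses (that one also tries to break on separators like paragraphs and sentences). The function name `chunk_text` and its defaults are assumptions chosen for this example.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into character windows of chunk_size that overlap by chunk_overlap.

    Hypothetical helper for illustration only; real splitters are smarter
    about where they cut.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval at the cost of storing some text twice.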
LangChain helps connect the parts of a RAG workflow into one pipeline. In practice, you can "chain" a document retriever with an LLM so the model receives both the user question and the most relevant library passages before generating an answer.
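The idea of chaining can be sketched without any framework. The example below is a plain-Python illustration of the pattern, not LangChain's API: `build_rag_chain`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the two stand-in functions exist only so the sketch runs without a vector store or an API key.

```python
from typing import Callable


def build_rag_chain(
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
) -> Callable[[str], str]:
    """Return a function that retrieves passages, builds a prompt, and calls the model."""

    def chain(question: str) -> str:
        passages = retrieve(question)          # step 1: find relevant passages
        context = "\n\n".join(passages)        # step 2: merge them into one context string
        prompt = f"Question: {question}\n\nRetrieved context:\n{context}"
        return generate(prompt)                # step 3: hand question + context to the model

    return chain


# Stand-ins (assumptions for this sketch): a real system would query a
# vector store and call an LLM here.
def fake_retrieve(question: str) -> list[str]:
    return ["Placeholder passage about borrowing limits."]


def fake_generate(prompt: str) -> str:
    return f"[model would answer from a prompt of {len(prompt)} characters]"


rag_chain = build_rag_chain(fake_retrieve, fake_generate)
print(rag_chain("What is the borrowing limit for graduate students?"))
```

Keeping retrieval and generation as separate, swappable callables is the main design point: either side can be replaced or tested on its own.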
Tools like FAISS and Chroma store document chunks as mathematical vectors. Instead of matching only exact keywords, they support semantic search, which means the system can retrieve passages that are conceptually related to the question even when the wording is different.
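As a rough illustration of what "nearest vectors" means, the sketch below ranks a few passages by cosine similarity in plain Python. The three-dimensional vectors are hand-made toy values, not real embeddings, and the helper names are assumptions. Production stores such as FAISS and Chroma use optimized indexes and may use a different distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k passage texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (passage text, made-up vector). Real embeddings have hundreds
# or thousands of dimensions and come from an embedding model.
toy_index = [
    ("Loan periods and renewals", [0.9, 0.1, 0.0]),
    ("Study room booking rules", [0.1, 0.9, 0.0]),
    ("Interlibrary loan requests", [0.7, 0.0, 0.3]),
]
query = [1.0, 0.0, 0.1]  # stands in for an embedded borrowing question

print(top_k(query, toy_index, k=2))
```

With these toy values the two loan-related passages rank above the study room passage, because their vectors point in nearly the same direction as the query.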
A strong system message tells the assistant how to behave. For library bots, prompt templates can define tone, citation behavior, boundaries, and fallback rules so the model acts like a professional librarian instead of a generic chatbot.
The example below shows how the major RAG stages connect in a single workflow: load library documents, split them into chunks, embed them, store them in a vector index, and retrieve relevant context for a user question.
# Imports for loading, splitting, embedding, and indexing
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# 1. Load documents
loader = TextLoader("library_policy.txt")
documents = loader.load()
# 2. Chunk the documents
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = splitter.split_documents(documents)
# 3. Embed and index (OpenAIEmbeddings reads the OPENAI_API_KEY environment variable)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
# 4. Retrieve relevant chunks
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
results = retriever.invoke("What is the borrowing limit for graduate students?")
This pattern allows a chatbot to answer from local library documents such as circulation policies, study room rules, instruction guides, and service FAQs instead of relying on generalized internet text.
Once retrieval works, the next step is teaching the assistant how to answer like a librarian.
Scenario: "Answer like a professional librarian."
In this task, learners design a prompt template that tells the model to stay grounded in retrieved context, avoid unsupported claims, and escalate uncertain questions to library staff.
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", """
You are a professional academic librarian assistant.
Answer using only the retrieved library context.
If the context does not contain the answer, say you are not certain.
Do not invent policies, hours, or holdings.
When appropriate, recommend contacting library staff for confirmation.
"""),
("human", """
Question: {question}
Retrieved context:
{context}
""")
])
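The template above expects two values, `question` and `context`. One way to prepare them from retrieved chunks is sketched below in plain Python. The helper names, the numbered-source format, and the fallback wording are assumptions for illustration; they are not part of the project repository or of LangChain.

```python
from typing import Optional

# Hypothetical fallback message, used when retrieval returns nothing usable.
FALLBACK = (
    "I am not certain based on the available library documents. "
    "Please contact library staff for confirmation."
)


def format_context(chunks: list[str]) -> str:
    """Number each chunk so the model (and the reader) can refer to sources."""
    return "\n\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1))


def build_prompt_inputs(question: str, chunks: list[str]) -> Optional[dict[str, str]]:
    """Return values for the {question} and {context} placeholders.

    Returns None when no non-empty chunk was retrieved, so the caller can
    reply with FALLBACK instead of sending an empty context to the model.
    """
    usable = [c for c in chunks if c.strip()]
    if not usable:
        return None
    return {"question": question.strip(), "context": format_context(usable)}
```

With LangChain, the returned dictionary could then be passed to the template, for example `prompt.format_messages(**inputs)`. Checking for an empty result before calling the model enforces the "say you are not certain" rule in code as well as in the prompt.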