StudyBuddy
AI Study Guide with RAG
2024 · Product Owner
Trust
Every output is grounded in verifiable source material.
Learning
Designed for progressive mastery and retention.
PROBLEM & USERS
Students struggle to create effective study materials from dense lecture PDFs. Generic AI tools hallucinate content not covered in the curriculum.
Target users: University students preparing for exams who need curriculum-aligned study materials.
CONSTRAINTS
- Generated content must be 100% grounded in uploaded materials
- Must handle complex, unstructured PDF layouts
- Fast enough for real-time study sessions
APPROACH
RAG pipeline design
Built a retrieval-augmented generation pipeline that chunks and embeds the uploaded material, then retrieves the relevant passages before generating quiz questions, so that every answer maps back to source material.
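The retrieve-then-generate flow can be illustrated with a minimal sketch. It is not the shipped implementation: the word-count `embed` function is a toy stand-in for a real embedding model, and the names `retrieve` and `build_prompt`, the prompt wording, and the sample chunks are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[tuple[int, str]]:
    """Return the k chunks most similar to the query, paired with their indices."""
    q = embed(query)
    ranked = sorted(
        enumerate(chunks),
        key=lambda pair: cosine(q, embed(pair[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(topic: str, passages: list[tuple[int, str]]) -> str:
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"[{i}] {text}" for i, text in passages)
    return (
        "Write one quiz question using ONLY the passages below, "
        "and cite the passage number that contains the answer.\n"
        f"Topic: {topic}\n{context}"
    )


# Hypothetical chunks standing in for parsed lecture material.
chunks = [
    "Mitosis produces two genetically identical daughter cells.",
    "Meiosis produces four genetically distinct gametes.",
    "The French Revolution began in 1789.",
]
top = retrieve("How many cells does mitosis produce?", chunks, k=1)
prompt = build_prompt("mitosis", top)
```

Keeping the chunk index alongside each passage is what lets a generated answer be traced back to the material it came from; the prompt string would then be sent to whichever LLM API is in use.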
Unstructured input handling
Developed robust PDF parsing that handles tables, diagram references, and multi-column layouts common in academic materials.
Prompt engineering
Iteratively refined prompts so that generated questions test understanding rather than surface-level recall, while staying strictly within curriculum bounds.
WHAT SHIPPED
A web application where students upload lecture PDFs and receive auto-generated, curriculum-grounded quizzes with detailed analytics on knowledge gaps.
Architecture Snapshot
Input
Lecture PDFs uploaded by university students preparing for exams.
Core Decision
RAG architecture grounds every answer in source documents
Output
Zero hallucinated content in generated quizzes
Stack
- React
- Python
- RAG Pipeline
- LLM APIs
IMPACT
- Eliminated hallucinated quiz content through RAG grounding
- Students report 40% faster exam preparation
LEARNINGS
- PDF parsing quality directly impacts RAG retrieval accuracy
- Chunk size and overlap significantly affect answer quality
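To make the chunk size and overlap trade-off concrete, here is a minimal sketch of word-based chunking with a sliding window. The function name `chunk_words` and the default values are assumptions for illustration, not the settings used in the project; a production pipeline might split on tokens or sentences instead.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so that ideas spanning a boundary are not cut off."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = chunk_words("a b c d e f g h i j", size=4, overlap=1)
# sample == ["a b c d", "d e f g", "g h i j"]
```

Larger chunks carry more context per retrieved passage but dilute the match with the query; more overlap reduces the chance of splitting an answer across chunks at the cost of storing and embedding more text.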
NEXT STEPS
- Add support for collaborative study groups
- Implement adaptive difficulty based on quiz performance