LLM-as-Judge

From Chat-with-PDF to Quiz-Master: Live-Grading RAG with LLM-as-Judge in Python

Coming soon: This live-coded session goes beyond passive search and shows how to build an interactive “exam engine” from complex documents. Learn how to combine layout-aware ingestion, synthetic QA generation, and an LLM-as-judge pipeline to turn basic retrieval into real-time, human-in-the-loop evaluation, using Docling, DeepEval, and Marimo.
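
For a rough idea of what the grading step could look like, here is a minimal sketch in plain Python. It is not the session's code and does not use the Docling, DeepEval, or Marimo APIs. The names `QAPair`, `Verdict`, `toy_judge`, and `grade` are hypothetical, and `toy_judge` is a word-overlap stand-in for a real LLM call, included only so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QAPair:
    """One synthetic quiz item (hypothetical shape, assumed for this sketch)."""
    question: str
    reference_answer: str


@dataclass
class Verdict:
    score: float  # 0.0 (wrong) to 1.0 (fully matches the reference)
    reason: str


# Assumed judge interface: (question, reference, candidate) -> Verdict.
# In a real pipeline this would wrap an LLM call.
Judge = Callable[[str, str, str], Verdict]


def toy_judge(question: str, reference: str, candidate: str) -> Verdict:
    """Stand-in for an LLM judge: fraction of reference words found in the answer."""
    ref_words = set(reference.lower().split())
    cand_words = set(candidate.lower().split())
    if not ref_words:
        return Verdict(0.0, "empty reference answer")
    matched = len(ref_words & cand_words)
    return Verdict(matched / len(ref_words),
                   f"{matched} of {len(ref_words)} reference words matched")


def grade(pair: QAPair, candidate: str, judge: Judge,
          threshold: float = 0.5,
          human_override: Optional[float] = None) -> dict:
    """Score one answer; a human reviewer may replace the judge's score."""
    verdict = judge(pair.question, pair.reference_answer, candidate)
    final = verdict.score if human_override is None else human_override
    return {
        "question": pair.question,
        "score": final,
        "passed": final >= threshold,
        "reason": verdict.reason,
        "overridden": human_override is not None,
    }


if __name__ == "__main__":
    pair = QAPair("What is the capital of France?", "the capital is paris")
    print(grade(pair, "Paris", toy_judge))
    # A reviewer who accepts the short answer can override the judge's score.
    print(grade(pair, "Paris", toy_judge, human_override=1.0))
```

Passing the judge in as a callable keeps the grading loop independent of any one model or evaluation library, and the `human_override` argument is one simple way to represent the human-in-the-loop step.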