Designing the knowledge substrate
Docs, wikis, PDFs, and structured stores: start with a normalization plan.
session outline
- Content inventory: authoritative sources vs. stale mirrors.
- Metadata contracts: ACLs, freshness, ownership.
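A metadata contract can be pictured as a small, explicit record attached to every document before it is indexed. The sketch below is one possible shape, not a prescribed schema: the class name `DocMetadata`, all field names, and the `eligible` helper are hypothetical, chosen only to show ACLs, freshness, ownership, and the authoritative-vs-mirror distinction in one place.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DocMetadata:
    """Hypothetical per-document contract; field names are assumptions."""
    source_id: str
    owner: str                      # team or person accountable for the content
    allowed_groups: frozenset[str]  # ACL: groups permitted to read the document
    last_verified: datetime         # when the content was last confirmed accurate
    max_age: timedelta              # freshness budget before the document is stale
    authoritative: bool = True      # False for mirrors or copies of another source

    def is_fresh(self, now: datetime) -> bool:
        return now - self.last_verified <= self.max_age

    def readable_by(self, user_groups: set[str]) -> bool:
        # Readable if the user shares at least one group with the ACL.
        return bool(self.allowed_groups & user_groups)


def eligible(doc: DocMetadata, user_groups: set[str], now: datetime) -> bool:
    """Retrievable only if authoritative, fresh, and visible to the user."""
    return doc.authoritative and doc.is_fresh(now) and doc.readable_by(user_groups)


# Example with made-up values.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
policy = DocMetadata(
    source_id="wiki/hr-policy",
    owner="hr-team",
    allowed_groups=frozenset({"hr", "legal"}),
    last_verified=datetime(2024, 1, 1, tzinfo=timezone.utc),
    max_age=timedelta(days=90),
)
stale_mirror = DocMetadata(
    source_id="share/hr-policy-copy",
    owner="unknown",
    allowed_groups=frozenset({"hr"}),
    last_verified=datetime(2022, 6, 1, tzinfo=timezone.utc),
    max_age=timedelta(days=90),
    authoritative=False,
)
```

Making the record immutable and checking eligibility at query time is one design choice among several; filtering at index time is an equally plausible alternative when ACLs change rarely.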
labs
- Sketch chunk boundaries for two real document types you bring (anonymized).
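As a starting point for the chunk-boundary lab, the following is a minimal sketch of one simple strategy: pack whole paragraphs into chunks up to a size budget and never cut inside a paragraph. The function name `chunk_paragraphs`, the blank-line paragraph convention, and the character budget are all assumptions for illustration; real document types may call for boundaries at headings, table rows, or clauses instead.

```python
def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A paragraph longer than max_chars is kept whole as its own chunk
    rather than being split mid-paragraph.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Either the paragraph fits, or it starts a new (possibly oversized) chunk.
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
chunks = chunk_paragraphs(sample, max_chars=10)
```

With a 10-character budget the first two paragraphs share a chunk and the third starts a new one, which makes the boundary rule easy to check by hand before trying it on a real document.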
beyond-catalog topics (custom)
- Hybrid lexical + dense retrieval pairings for regulated vocabularies.
- Handling tables, scanned PDFs, and multi-language corpora common in India and Middle East rollouts.
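For the hybrid lexical + dense pairing topic, one widely used way to merge two ranked result lists is reciprocal rank fusion (RRF). The outline does not name a fusion method, so RRF is an assumption here, as are the function name and the sample document IDs; the constant `k = 60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results: a lexical (keyword) ranking and a dense (embedding) ranking.
lexical_hits = ["a", "b", "c"]
dense_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([lexical_hits, dense_hits])
```

Because RRF uses only ranks, it needs no score calibration between the lexical and dense retrievers, which can help when a regulated vocabulary makes exact-term matches important but their raw scores are not comparable to embedding similarities.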