2 docs tagged with "llm"

Prompt Architecture

Comprehensive architecture documentation for the prompt processing and chat completion system, covering client-side flow, server-side processing, WebSocket streaming, and performance optimizations.

Rapid Reply Architecture

Comprehensive architecture documentation for the Rapid Reply feature, which returns an immediate response from a fast mini model while the main model processes the complete request.