davila7/claude-code-templates · AI/ML
vLLM achieves up to 24x higher throughput than Hugging Face Transformers through PagedAttention (a block-based KV cache) and continuous batching (scheduling prefill and decode requests together in the same batch).
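
The block-based KV cache idea can be illustrated with a toy allocator. This is a minimal sketch for intuition only, not vLLM's implementation: the class `ToyBlockAllocator`, its fields, and the block sizes are all hypothetical. It shows how each sequence gets a block table that grows one fixed-size block at a time, so memory is reserved as tokens arrive rather than for the maximum sequence length up front.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyBlockAllocator:
    """Toy model of a paged KV cache (hypothetical, for illustration only)."""

    num_blocks: int   # total physical blocks in the cache
    block_size: int   # tokens stored per block
    free_blocks: list[int] = field(default_factory=list)
    block_tables: dict[str, list[int]] = field(default_factory=dict)
    seq_lens: dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every physical block starts out free.
        self.free_blocks = list(range(self.num_blocks))

    def append_token(self, seq_id: str) -> None:
        """Reserve cache space for one more token of the given sequence."""
        length = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        # A new block is needed only when the last one is full (or none exists).
        if length % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1

    def free(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)
```

In this sketch, a sequence of 17 tokens with `block_size=16` occupies two blocks, and those blocks become available to other sequences as soon as `free` is called, which is what lets a scheduler admit new requests while others are still decoding.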