
LLM Engineer's Handbook

Symbols
4-bit NormalFloat (NF4) 215
32-bit floating point (fp32) 211, 212
A
acceptance tests 464
actions 437
Activation-aware Weight Quantization (AWQ) 313
advanced RAG
advanced RAG post-retrieval optimization
advanced RAG pre-retrieval optimizations 324
advanced RAG retrieval optimization
    filtered vector search 332-334
advanced RAG techniques
    post-retrieval optimization 334-338
    pre-retrieval optimizations 324-332
    retrieval optimization 332-334
alerts 473
AlpacaEval 264
Amazon Resource Name (ARN) 375
Application Auto Scaling 396, 397
Application Load Balancer (ALB) 395
asynchronous inference 361, 362
scalable policy, creating 397
...