Building Production LLM Systems is the comprehensive engineering guide for deploying open-source language models at enterprise scale. Written by Cedar Moon, a veteran who has shipped multiple million-dollar LLM systems, this practitioner's handbook covers everything from selecting the right model family (LLaMA, Mistral, Mixtral) to building production-ready infrastructure that rivals proprietary APIs.
Across 15 detailed chapters, you'll master quantization techniques that run 70B models on consumer GPUs, inference optimization strategies delivering 3-8× higher throughput, fine-tuning pipelines for 405B-parameter models on a single card, and RAG systems ready for regulatory scrutiny. The book includes production-tested code, real infrastructure cost models, security hardening for regulated industries, and MLOps patterns for continuous improvement.
This isn't theory—it's battle-tested wisdom from real deployments across finance, healthcare, and defense. Whether you're running your first 7B model or orchestrating a thousand-GPU cluster, you'll gain the complete playbook to build LLM systems that are faster, cheaper, and more secure than any API—while maintaining full ownership of your AI stack.
Stop renting intelligence. Start owning it.
The synopsis may refer to a different edition of this title.
Seller: California Books, Miami, FL, USA
Condition: New. Print on Demand. Seller inventory number: I-9798277266137
Quantity: More than 20 available
Seller: CitiRetail, Stevenage, United Kingdom
Paperback. Condition: new. This item is printed on demand. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Seller inventory number: 9798277266137
Quantity: 1 available