Written for ML architects, ML and LLM engineers, and technical leads, with platform and SRE engineers as a strong secondary audience.
Large language models fail in production in ways no load test anticipates. A retrieval-augmented pipeline answers confidently with fabricated citations when the vector index drifts. An agent loop burns a month of API budget in forty minutes because one tool call returned an unexpected schema. A prompt that cleared every red-team review gets hijacked by a malicious document in the first week of real traffic. Latency spikes vanish with no correlated metric because the KV cache was sized for a context window half as long as users actually send. These are the predictable failure modes of LLM systems, and teams that ship reliable LLM features design against them with patterns that hold across models, vendors, and inference frameworks.
Inside this book, readers will learn how to:
The synopsis may refer to a different edition of this title.
Seller: California Books, Miami, FL, USA
Condition: New. Seller inventory number: I-9798904980153
Quantity: More than 20 available
Seller: PBShop.store US, Wood Dale, IL, USA
PAP. Condition: New. New Book. Shipped from UK. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number: L0-9798904980153
Quantity: More than 20 available
Seller: Bluemindbooks, PACHECO, CA, USA
Condition: New. New Book. Seller inventory number: NJ-INGR-9798904980153
Seller: PBShop.store UK, Fairford, GLOS, United Kingdom
PAP. Condition: New. New Book. Delivered from our UK warehouse in 4 to 14 business days. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number: L0-9798904980153
Quantity: More than 20 available
Seller: CitiRetail, Stevenage, United Kingdom
Paperback. Condition: new.
Inside this book, readers will learn how to:
Operate LLMs in production with the reliability, cost discipline, and observability the rest of the stack already expects.
Select and size models using a structured framework that weighs capability, latency, cost, and compliance before the first token is generated.
Choose between fine-tuning and prompting, applying LoRA, PEFT, and instruction-tuning where they outperform prompt engineering and recognizing when they do not.
Serve LLMs at scale with continuous batching, paged attention, and KV cache management to maximize throughput and meet latency SLOs.
Engineer LLM costs through semantic caching, model tiering, and FinOps practices that keep inference spend predictable as usage grows.
Defend against prompt injection and detect hallucinations using layered input validation, output verification, and guardrails that hold under adversarial conditions.
Instrument LLM systems for observability, capturing reasoning traces, semantic drift signals, and metrics that distinguish a degraded model from a degraded pipeline.
Drive quality with eval-driven development, building replay harnesses, canary deployments, and evaluation gates so every model update ships with a known risk profile.
Build production RAG and agent systems with retrieval quality metrics, tool-call guardrails, and loop termination policies that keep costs bounded.
Navigate compliance obligations, including the EU AI Act and sector-specific rules, without stalling LLM feature delivery.
The LLM tooling landscape rotates fast: the serving framework favored today will be superseded before the next model generation ships. The underlying patterns do not rotate: how to reason about context budget and cache hit rate, how to structure an evaluation harness that survives a model swap, and how to build an observability layer that surfaces model-layer failures the application cannot see. Teams that internalize these patterns move faster when tools change because they replace syntax, not understanding.
The book is organized in four parts: Foundations covers the LLM landscape, model selection, and adaptation patterns; Adaptation and Serving addresses fine-tuning, serving architectures, and cost engineering; Production Concerns covers safety, compliance, observability, and LLM CI/CD; and Frontier Patterns and Case Studies applies all prior material to RAG, agentic systems, and tool-using LLMs, closing with end-to-end reference architectures.
Every chapter opens with a production incident, teaches canonical patterns by name, and closes with a checklist the team can apply the same day. Readers finish with the vocabulary, playbook, and pattern library
Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Seller inventory number: 9798904980153
Quantity: 1 available