Managing Production Large Language Models: Playbook for Designing, Deploying, and Operating LLM at Scale and Machine Learning FinOps Blueprints (Enterprise Machine Learning Operations) - Softcover

O'Neal, Jordan

ISBN 13: 9798904980153

Publisher: Cybersoft Publishing LLC, 2026

All copies of this ISBN edition

0 Used

5 New

From EUR 29.37

Written for ML architects, ML and LLM engineers, and technical leads, with platform and SRE engineers as a strong secondary audience.
Large language models fail in production in ways no load test anticipates. A retrieval-augmented pipeline answers confidently with fabricated citations when the vector index drifts. An agent loop burns a month of API budget in forty minutes because one tool call returned an unexpected schema. A prompt that cleared every red-team review gets hijacked by a malicious document in the first week of real traffic. Latency spikes vanish with no correlated metric because the KV cache was sized for a context window half as long as users actually send. These are the predictable failure modes of LLM systems, and teams that ship reliable LLM features design against them with patterns that hold across models, vendors, and inference frameworks.
Inside this book, readers will learn how to:

Operate LLMs in production with the reliability, cost discipline, and observability the rest of the stack already expects.
Select and size models using a structured framework that weighs capability, latency, cost, and compliance before the first token is generated.
Choose between fine-tuning and prompting, applying LoRA, PEFT, and instruction-tuning where they outperform prompt engineering and recognizing when they do not.
Serve LLMs at scale with continuous batching, paged attention, and KV cache management to maximize throughput and meet latency SLOs.
Engineer LLM costs through semantic caching, model tiering, and FinOps practices that keep inference spend predictable as usage grows.
Defend against prompt injection and detect hallucinations using layered input validation, output verification, and guardrails that hold under adversarial conditions.
Instrument LLM systems for observability, capturing reasoning traces, semantic drift signals, and metrics that distinguish a degraded model from a degraded pipeline.
Drive quality with eval-driven development, building replay harnesses, canary deployments, and evaluation gates so every model update ships with a known risk profile.
Build production RAG and agent systems with retrieval quality metrics, tool-call guardrails, and loop termination policies that keep costs bounded.
Navigate compliance obligations, including the EU AI Act and sector-specific rules, without stalling LLM feature delivery.

The LLM tooling landscape rotates fast: the serving framework favored today will be superseded before the next model generation ships. The underlying patterns do not rotate: how to reason about context budget and cache hit rate, how to structure an evaluation harness that survives a model swap, and how to build an observability layer that surfaces model-layer failures the application cannot see. Teams that internalize these patterns move faster when tools change because they replace syntax, not understanding.
The book is organized in four parts: Foundations covers the LLM landscape, model selection, and adaptation patterns; Adaptation and Serving addresses fine-tuning, serving architectures, and cost engineering; Production Concerns covers safety, compliance, observability, and LLM CI/CD; and Frontier Patterns and Case Studies applies all prior material to RAG, agentic systems, and tool-using LLMs, closing with end-to-end reference architectures.
Every chapter opens with a production incident, teaches canonical patterns by name, and closes with a checklist the team can apply the same day. Readers finish with the vocabulary, playbook, and pattern library to ship reliable LLM features with confidence.

The synopsis may refer to a different edition of this title.

Publisher: Cybersoft Publishing LLC
Publication date: 2026
Language: English
ISBN 13: 9798904980153
Binding: Paperback
Number of pages: 406
Manufacturer contact: Manufactured by Amazon on behalf of the author
https://www.amazon.de/hz/contact-us

c/o Amazon Media EU S.à r.l., 38 Avenue John F. Kennedy
L-1855 Luxembourg
Luxembourg

Search results for Managing Production Large Language Models: Playbook...

Managing Production Large Language Models: Playbook for Designing, Deploying, and Operating LLM at Scale and Machine Learning FinOps Blueprints (Enterprise Machine Learning Operations)

O'Neal, Jordan

Publisher: Cybersoft Publishing LLC, 2026

ISBN 13: 9798904980153

New Softcover

Seller: California Books, Miami, FL, USA

Seller rating: 4 out of 5 stars

Condition: New. Seller inventory number: I-9798904980153

Buy new

EUR 29.37

Free shipping
Shipping within USA

Quantity: More than 20 available

Managing Production Large Language Models

Jordan O'Neal

Publisher: Cybersoft Publishing LLC, 2026

ISBN 13: 9798904980153

New PAP

Print-on-Demand

Seller: PBShop.store US, Wood Dale, IL, USA

Seller rating: 5 out of 5 stars

PAP. Condition: New. New Book. Shipped from UK. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number: L0-9798904980153

Buy new

EUR 32.64

Free shipping
Shipping within USA

Quantity: More than 20 available

Managing Production Large Language Models: Playbook for Designing, Deploying, and Operating LLM at Scale and Machine Learning FinOps Blueprints

O'Neal, Jordan

Publisher: Cybersoft Publishing LLC, 2026

ISBN 13: 9798904980153

New Softcover

Seller: Bluemindbooks, PACHECO, CA, USA

Seller rating: 5 out of 5 stars

Condition: New. New Book. Seller inventory number: NJ-INGR-9798904980153

Buy new

EUR 33.25

Free shipping
Shipping within USA

Quantity: 1 available

Managing Production Large Language Models

Jordan O'Neal

Publisher: Cybersoft Publishing LLC, 2026

ISBN 13: 9798904980153

New PAP

Print-on-Demand

Seller: PBShop.store UK, Fairford, GLOS, United Kingdom

Seller rating: 5 out of 5 stars

PAP. Condition: New. New Book. Delivered from our UK warehouse in 4 to 14 business days. THIS BOOK IS PRINTED ON DEMAND. Established seller since 2000. Seller inventory number: L0-9798904980153

Buy new

EUR 29.39

EUR 5.82 shipping
Shipping from United Kingdom to USA

Quantity: More than 20 available

Managing Production Large Language Models (Paperback)

Jordan O'Neal

Publisher: Cybersoft Publishing LLC, 2026

ISBN 13: 9798904980153

New Paperback

Seller: CitiRetail, Stevenage, United Kingdom

Seller rating: 5 out of 5 stars

Paperback. Condition: New. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Seller inventory number: 9798904980153

Buy new

EUR 33.40

EUR 42.87 shipping
Shipping from United Kingdom to USA

Quantity: 1 available
