This timely text presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, and silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance (ABFT). Emphasis is placed on analytical performance models. This is followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Features: provides a survey of resilience methods and performance models; examines the various sources of errors and faults in large-scale systems; reviews the spectrum of techniques that can be applied to design a fault-tolerant MPI; investigates different approaches to replication; discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems.
The synopsis may refer to a different edition of this title.
This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing.
Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Supérieure de Lyon, France, and a Visiting Research Scholar in the ICL.
Shipping: EUR 30.00 from Germany to the USA
Shipping: EUR 3.43 within the USA
Seller: Universitätsbuchhandlung Herta Hold GmbH, Berlin, Germany
ix, 320 pp. Hardcover. Dispatched from Germany via air mail. Imperfect copy: the cover is slightly bumped, so the book is stamped as a damaged copy (Mängelexemplar); otherwise in very good condition. Computer Communications and Networks. Language: English. Seller inventory number: 4823IB
Quantity: 2 available
Seller: Books Puddle, New York, NY, USA
Condition: New. pp. 320. Seller inventory number: 26372815544
Quantity: 1 available
Seller: Majestic Books, Hounslow, United Kingdom
Condition: New. pp. 320. Seller inventory number: 374278503
Quantity: 1 available
Seller: GreatBookPrices, Columbia, MD, USA
Condition: New. Seller inventory number: 23922726-n
Quantity: More than 20 available
Seller: Lucky's Textbooks, Dallas, TX, USA
Condition: New. Seller inventory number: ABLIING23Mar3113020090860
Quantity: More than 20 available
Seller: Biblios, Frankfurt am Main, Hesse, Germany
Condition: New. pp. 320. Seller inventory number: 18372815538
Quantity: 1 available
Seller: Buchpark, Trebbin, Germany
Condition: Very good. Language: English. Product type: Books. Seller inventory number: 25708812/12
Quantity: 1 available
Seller: GreatBookPrices, Columbia, MD, USA
Condition: As New. Unread book in perfect condition. Seller inventory number: 23922726
Quantity: More than 20 available
Seller: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. In. Seller inventory number: ria9783319209425_new
Quantity: More than 20 available
Seller: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany
Book. Condition: New. This item is printed on demand and takes 3-4 days longer. New stock. 332 pp. Language: English. Seller inventory number: 9783319209425
Quantity: 2 available