Scouting and Scoring: How We Know What We Know about Baseball - Hardcover

Phillips, Christopher

 
9780691180212: Scouting and Scoring: How We Know What We Know about Baseball

Synopsis

An in-depth look at the intersection of judgment and statistics in baseball

Scouting and scoring are considered fundamentally different ways of ascertaining value in baseball. Scouting seems to rely on experience and intuition, scoring on performance metrics and statistics. In Scouting and Scoring, Christopher Phillips rejects these simplistic divisions. He shows how both scouts and scorers rely on numbers, bureaucracy, trust, and human labor in order to make sound judgments about the value of baseball players.

Tracing baseball’s story from the nineteenth century to today, Phillips explains that the sport was one of the earliest and most consequential fields for the introduction of numerical analysis. New technologies and methods of data collection were supposed to enable teams to quantify the drafting and managing of players—replacing scouting with scoring. But that’s not how things turned out. Over the decades, scouting and scoring started looking increasingly similar. Scouts expressed their judgments in highly formulaic ways, using numerical grades and scientific instruments to evaluate players. Scorers drew on moral judgments, depended on human labor to maintain and correct data, and designed bureaucratic systems to make statistics appear reliable. From the invention of official scorers and Statcast to the creation of the Major League Scouting Bureau, the history of baseball reveals the inextricable connections between human expertise and data science.

A unique consideration of the role of quantitative measurement and human judgment, Scouting and Scoring provides an entirely fresh understanding of baseball by showing what the sport reveals about reliable knowledge in the modern world.

The synopsis may refer to a different edition of this title.

About the Author

Christopher J. Phillips is associate professor of history at Carnegie Mellon University. He is the author of The New Math: A Political History. His work has appeared in such publications as the New York Times, Science, and Nature. He lives in Pittsburgh.

Excerpt. © Reprinted by permission. All rights reserved.

Scouting and Scoring

How We Know What We Know about Baseball

By Christopher J. Phillips

PRINCETON UNIVERSITY PRESS

Copyright © 2019 Princeton University Press
All rights reserved.
ISBN: 978-0-691-18021-2

Contents

Introduction
1 The Bases of Data
2 Henry Chadwick and Scoring Technology
3 Official Scoring
4 From Project Scoresheet to Big Data
5 The Practice of Pricing the Body
6 Measuring Head and Heart
7 A Machine for Objectivity
Conclusion
Acknowledgments
Abbreviations Used in Notes
Notes
Index


CHAPTER 1

The Bases of Data


"He has the most doubles of any right-handed batter in history." It was a claim confidently bandied about as Craig Biggio was considered for the National Baseball Hall of Fame. And it seemed an objectively true and relatively simple sort of fact — the number of doubles he had hit was greater than the number of doubles any other right-handed player had hit. It was easy to check by heading to baseball-reference.com or some other encyclopedia.

Baseball Reference's list of most doubles hit in a career includes Tris Speaker, Pete Rose, Stan Musial, and Ty Cobb (all either switch- or left-handed hitters) and then Craig Biggio, with 668 career doubles. But how do we know that Biggio really hit 11 more doubles than the next right-handed hitter on the list, Nap Lajoie? Lajoie played from 1896 to 1916, before sabermetrics, fantasy leagues, and highlight reels, and even before radio broadcasts or daily statistical updates. More troubling, at the time of Biggio's candidacy, the powers that be in Major League Baseball disagreed with Baseball Reference and other organizations about whether Lajoie or Cobb had the highest batting average in 1910. Given such distance and uncertainty, how can we be confident Lajoie didn't have more doubles lurking in the records, or that he actually did hit exactly 30 doubles in, say, 1907?

Even a simple claim about performance statistics implies a reliable record of hits going back nearly 150 years. It may seem an objective fact, right or wrong, but that's not to suggest it is a simple or easy thing to be confident about. Believing that Biggio set a record for doubles requires believing in an entire history of recordkeeping and error checking, in an entire structure of people and tools meant to ensure the accuracy and reliability of facts. If we want to figure out how we know that Lajoie hit 30 doubles in 1907, then we might as well start by asking where Baseball Reference actually got that number.

Baseball Reference's clean interface makes the facts displayed there seem natural, eternal, and indisputable. Biggio's page reveals a dizzying array of numbers, sorted neatly by year and category. Some of the stats provided, like "batting average" and "runs scored," are essentially as old as professional baseball; others, such as "wins above replacement" and "adjusted batting runs," are more recent creations. The interface provides a clever bubble that appears when the cursor is hovered over a statistic, explaining how the number has been calculated. The site even allows users to sum across seasons or other subcategories. The whole structure is geared toward providing a clear display of mathematical certainty. Or, as the founder of the site, Sean Forman, explained, the site's purpose is to "answer questions as quickly, easily, and accurately as possible."

When Forman first put his website online in mid-2000, its ability to generate quick answers was its selling point. Even in this relatively early stage of the internet, there were already other places where fans could find similar data online, including stats.com and totalbaseball.com. These competitors often also had big names, or at least names with authority — totalbaseball.com had signed agreements with Sports Illustrated, and stats.com was licensed by a variety of national publications.

The advantage baseball-reference.com offered was a superior interface, which Forman called putting a "friendly face" on existing data. He minimized images and ads, with 95 percent of the pages under 20 kilobytes (KB), no minor thing given that residential download speeds generally maxed out at 56 kilobits (about 7 KB) per second at the turn of the century. Forman had started Baseball Reference as he was finishing his doctoral dissertation on computational protein folding, a field seemingly irrelevant to baseball until he explains that his research was "basically optimization." Forman was good at taking a complicated mess of facts and interconnections, analyzing them, and cleaning them up.

The casual fan might assume Baseball Reference's numbers were coming directly from Major League Baseball or from its official statistician, the Elias Sports Bureau. At the time of Biggio's candidacy, however, Forman had no formal relationship with Elias, and he had never spoken with anyone there. As is the case with many encyclopedias, the specific origins of any given statistic, save a generic note at the bottom of every page, were left unspecified. However elegant its interface, Baseball Reference didn't — and doesn't — provide many overt reasons to trust the statistics that appear there.

As it turns out, Forman had initially taken his data from the statistical database freely provided online in 1996 by another internet-savvy baseball fan, Sean Lahman, at baseball.com. Lahman, in turn, had built his database using the CD-ROM that came along with the third edition of the groundbreaking encyclopedia Total Baseball in 1993. The CD included image files of the entire encyclopedia, with its own reader on the disk to view the individual files. Lahman noticed that the publishers of the CD, Creative Multi-media Corporation, didn't protect its contents very well. With a day job designing databases of digital images for Kodak, Lahman had the skills to post Total Baseball's statistics online for anyone to download.

It's misleading to talk about posting statistics online as if Lahman were simply copying the files from the CD-ROM. A book is just as much a technology for holding and displaying data as a computer file — and perhaps has proved more robust and user-friendly. But Lahman didn't just want to read the book on a computer. Lahman wanted to reverse-engineer a database. He gathered ("scraped") the statistics from the files and then organized them into a relational database by assigning unique IDs to each player, team, and statistical category so that they would be easily searchable. Ultimately, he was able to create his own database, one that relied on the facts as conveyed by Total Baseball but that was presented not as the image of a printed table, but as an editable Microsoft Access file.
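The kind of relational structure described above can be sketched roughly as follows. This is only an illustration in Python with SQLite, not Lahman's actual Microsoft Access schema: the table names, column names, ID formats, and the helper functions `build_demo_db` and `lookup` are all hypothetical, and a teams table is left out for brevity. The single data row is the figure quoted earlier in this chapter (Lajoie's 30 doubles in 1907).

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables and IDs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE players (
            player_id TEXT PRIMARY KEY,   -- hypothetical unique player ID
            name      TEXT NOT NULL
        );
        CREATE TABLE categories (
            category_id TEXT PRIMARY KEY, -- hypothetical unique category ID
            label       TEXT NOT NULL
        );
        CREATE TABLE season_stats (
            player_id   TEXT    NOT NULL REFERENCES players(player_id),
            category_id TEXT    NOT NULL REFERENCES categories(category_id),
            year        INTEGER NOT NULL,
            value       INTEGER NOT NULL,
            PRIMARY KEY (player_id, category_id, year)
        );
        """
    )
    # The one fact taken from the text: Lajoie, 30 doubles, 1907.
    conn.execute("INSERT INTO players VALUES (?, ?)", ("lajoie01", "Nap Lajoie"))
    conn.execute("INSERT INTO categories VALUES (?, ?)", ("2B", "Doubles"))
    conn.execute(
        "INSERT INTO season_stats VALUES (?, ?, ?, ?)",
        ("lajoie01", "2B", 1907, 30),
    )
    conn.commit()
    return conn


def lookup(conn: sqlite3.Connection, name: str, label: str, year: int) -> Optional[int]:
    """Return one player's value for one category and season, or None if absent."""
    row = conn.execute(
        """
        SELECT s.value
        FROM season_stats AS s
        JOIN players    AS p ON p.player_id   = s.player_id
        JOIN categories AS c ON c.category_id = s.category_id
        WHERE p.name = ? AND c.label = ? AND s.year = ?
        """,
        (name, label, year),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    demo = build_demo_db()
    print(lookup(demo, "Nap Lajoie", "Doubles", 1907))  # prints 30
```

Because every row in `season_stats` points at a player and a category by ID, a question such as "how many doubles in 1907?" becomes a join across tables, which is what makes such a database searchable in a way an image of a printed table is not.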

Lahman decided to put his database online as a result of two frustrations: first, that so many online repositories unexpectedly disappeared in the early days of the internet, and second, that baseball data were often presented in ways that were not conducive to research. Watching Ken Burns's film Baseball during the 1994 strike had given Lahman the idea of combining his computing skills with his interest in baseball. In this period before it was common to "surf" the "world wide web," many connoisseurs of baseball statistics had found their way to an active Usenet group: rec.sports.baseball. Lahman found the baseball group a useful resource for sharing statistical...

"About this title" may refer to a different edition of this title.

Other popular editions of the same title

9780691217161: Scouting and Scoring: How We Know What We Know About Baseball

Featured edition

ISBN 10: 0691217165 ISBN 13: 9780691217161
Publisher: Princeton University Press, 2021
Softcover