The Lahman Baseball Database

2014 Version
Release Date: January 24, 2015

----------------------------------------------------------------------

README CONTENTS
0.1 Copyright Notice
0.2 Contact Information

1.0 Release Contents
1.1 Introduction
1.2 What's New
1.3 Acknowledgements
1.4 Using this Database
1.5 Revision History

2.0 Data Tables
2.1 MASTER table
2.2 Batting Table
2.3 Pitching table
2.4 Fielding Table
2.5 All-Star table
2.6 Hall of Fame table
2.7 Managers table
2.8 Teams table
2.9 BattingPost table
2.10 PitchingPost table
2.11 TeamFranchises table
2.12 FieldingOF table
2.13 ManagersHalf table
2.14 TeamsHalf table
2.15 Salaries table
2.16 SeriesPost table
2.17 AwardsManagers table
2.18 AwardsPlayers table
2.19 AwardsShareManagers table
2.20 AwardsSharePlayers table
2.21 FieldingPost table
2.22 Appearances table
2.23 Schools table
2.24 CollegePlaying table


----------------------------------------------------------------------

0.1 Copyright Notice & Limited Use License

This database is copyright 1996-2015 by Sean Lahman.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
For details see: http://creativecommons.org/licenses/by-sa/3.0/


For licensing or other information, contact Sean Lahman
at: [email protected]

----------------------------------------------------------------------

0.2 Contact Information

Web site: http://www.baseball1.com
E-Mail : [email protected]

If you're interested in contributing to the maintenance of this
database or making suggestions for improvement, please consider
joining our mailing list at:

http://groups.yahoo.com/group/baseball-databank/

If you are interested in similar databases for other sports, please
visit the Open Source Sports website at http://OpenSourceSports.com

----------------------------------------------------------------------
1.0 Release Contents

This release of the database can be downloaded in several formats. The
contents of each version are listed below.

MS Access Versions:
lahman2014.mdb
2014readme.txt

SQL version
lahman2014.sql
lahman2014_tables.sql
2014readme.txt

Comma Delimited Version:
2014readme.txt
AllStarFull.csv
Appearances.csv
AwardsManagers.csv
AwardsPlayers.csv
AwardsShareManagers.csv
AwardsSharePlayers.csv
Batting.csv
BattingPost.csv
CollegePlaying.csv
Fielding.csv
FieldingOF.csv
FieldingPost.csv
HallOfFame.csv
Managers.csv
ManagersHalf.csv
Master.csv
Pitching.csv
PitchingPost.csv
Salaries.csv
Schools.csv
SeriesPost.csv
Teams.csv
TeamsFranchises.csv
TeamsHalf.csv

----------------------------------------------------------------------
1.1 Introduction

This database contains pitching, hitting, and fielding statistics for
Major League Baseball from 1871 through 2014.
It includes data from the two current leagues (American and National),
the four other "major" leagues (American Association, Union
Association, Players League, and Federal League), and the National
Association of 1871-1875.

This database was created by Sean Lahman, who pioneered the effort to
make baseball statistics freely available to the general public. What
started as a one-man effort in 1994 has grown tremendously, and now a
team of researchers has pooled its efforts to make this the
largest and most accurate source for baseball statistics available
anywhere. (See Acknowledgements below for a list of the key
contributors to this project.)

None of what we have done would have been possible without the
pioneering work of Hy Turkin, S.C. Thompson, David Neft, and Pete
Palmer (among others). All baseball fans owe a debt of gratitude
to the people who have worked so hard to build the tremendous set
of data that we have today. Our thanks also to the many members of
the Society for American Baseball Research who have helped us over
the years. We strongly urge you to support and join their efforts.
Please visit their website (www.sabr.org).

If you have any problems or find any errors, please let us know. Any
feedback is appreciated.

----------------------------------------------------------------------
1.2 What's New in 2014

Player stats have been updated through the 2014 season.

Removed two deprecated fields from the batting table. The G_batting and
G_old fields were rendered obsolete when we created the appearances table.
They've been removed from the batting table starting with this version.

SchoolsPlayers has been replaced with a new table called CollegePlaying.
This reflects advances in the compilation of this data, largely led by
Ted Turocy. The old table reported college attendance for major league
players by listing a start date and end date.
The new version has a separate record for each year that a player
attended. This allows us to better account for players who attended
multiple colleges or skipped a season, as well as to identify teammates.


----------------------------------------------------------------------
1.3 Acknowledgements

Much of the raw data contained in this database comes from the work of
Pete Palmer, the legendary statistician, who has had a hand in most
of the baseball encyclopedias published since 1974. He is largely
responsible for bringing the batting, pitching, and fielding data out
of the dark ages and into the computer era. Without him, none of this
would be possible. For more on Pete's work, please read his own
account at: http://sabr.org/cmsfiles/PalmerDatabaseHistory.pdf

Three people have been key contributors to the work that followed, first
by taking the raw data and creating a relational database, and later
by extending the database to make it more accessible to researchers.

Sean Lahman launched the Baseball Archive's website back before
most people had heard of the world wide web. Frustrated by the
lack of sports data available, he led the effort to build a
baseball database that everyone could use. Baseball researchers
everywhere owe him a debt of gratitude. Lahman served as an associate
editor for three editions of Total Baseball and contributed to five
editions of The ESPN Baseball Encyclopedia. He has also been active in
developing databases for other sports.

The work of Sean Forman to create and maintain an online encyclopedia
at "baseball-reference.com" has been remarkable. Recognized as the
premier online reference source, Forman's site provides an outstanding
interface to the raw data. His efforts to help streamline the database
have been extremely helpful.
Most importantly, Forman has spearheaded the effort to provide
standards that enable several different baseball databases to be used
together. He was also instrumental in launching the Baseball Databank,
a forum for researchers to gather and share their work.

Since 2001, these two Seans have led a group of researchers
who volunteered to maintain and update the database.

Ted Turocy has done the lion's share of the work of updating the main
data tables since 2012, including significant improvements to the
demographic data in the master table. In his role as SABR data czar,
he led the effort to document college playing stints for all
major league players. Turocy also spearheads the Chadwick Baseball
Bureau. For more details on his tools and services, visit:
http://chadwick.sourceforge.net/doc/index.html

A handful of researchers have made substantial contributions to
maintaining this database over the years. Listed alphabetically, they
are: Derek Adair, Mike Crain, Kevin Johnson, Rod Nelson, Tom Tango,
and Paul Wendt. These folks did much of the heavy lifting, and are
largely responsible for the improvements made since 2000.

Others who made important contributions include: Dvd Avins,
Clifford Blau, Bill Burgess, Clem Comly, Jeff Burk, Randy Cox,
Mitch Dickerman, Paul DuBois, Mike Emeigh, F.X. Flinn, Bill Hickman,
Jerry Hoffman, Dan Holmes, Micke Hovmoller, Peter Kreutzer,
Danile Levine, Bruce Macleod, Ken Matinale, Michael Mavrogiannis,
Cliff Otto, Alberto Perdomo, Dave Quinn, John Rickert, Tom Ruane,
Theron Skyles, Hans Van Slooten, Michael Westbay, and Rob Wood.

Many other people have made significant contributions to the database
over the years. Tom Ruane's contribution to the overall quality of the
underlying data has been tremendous. His work at
retrosheet.org integrates the yearly data with the day-by-day data,
creating a reference source of startling depth.
It is unlikely that any individual has contributed as much to the
field of baseball research in the past five years as Ruane has.

Sean Holtz helped with a major overhaul and redesign before the
2000 season. Keith Woolner was instrumental in helping turn
a huge collection of stats into a relational database in the mid-1990s.
Clifford Otto & Ted Nye also helped provide guidance to the early
versions. Lee Sinnis, John Northey & Erik Greenwood helped supply key
pieces of data. Many others have written in with corrections and
suggestions that made each subsequent version even better than what
preceded it.

The work of the SABR Baseball Records Committee, led by Lyle Spatz,
has been invaluable. So has the work of Bill Carle and the SABR
Biographical Committee. David Vincent, keeper of the Home Run Log and
other bits of hard to find info, has always been helpful. The recent
addition of colleges to player bios is the result of much research by
members of SABR's Collegiate Baseball committee.

Salary data was first supplied by Doug Pappas, who passed away during
the summer of 2004. He was the leading authority on many subjects,
most significantly the financial history of Major League Baseball.
We are grateful that he allowed us to include some of the data he
compiled. His work has been continued by the SABR Business of
Baseball committee.

Thanks are also due to the staff at the National Baseball Library
in Cooperstown who have been so helpful over the years, including
Tim Wiles, Jim Gates, Bruce Markusen, and the rest of the staff.

A special debt of gratitude is owed to Dave Smith and the folks at
Retrosheet. There is no other group working so hard to compile and
share baseball data. Their website (www.retrosheet.org) will give
you a taste of the wealth of information Dave and the gang have collected.

Thanks to all contributors great and small.
What you have created is a wonderful thing.

----------------------------------------------------------------------
1.4 Using this Database

This version of the database is available in Microsoft Access
format, SQL files or in a generic, comma delimited format. Because this is a
relational database, you will not be able to use the data in a
flat-database application.

Please note that this is not a stand-alone application. It requires
a database application or some other application designed specifically
to interact with the database.

If you are unable to import the data directly, you should download the
database in the delimited text format. Then use the documentation
in sections 2.1 through 2.24 of this document to import the data into
your database application.

----------------------------------------------------------------------
1.5 Revision History

Version   Date             Comments
1.0       December 1992    Database ported from dBase
1.1       May 1993         Becomes fully relational
1.2       July 1993        Corrections made to full database
1.21      December 1993    1993 statistics added
1.3       July 1994        Pre-1900 data added
1.31      February 1995    1994 Statistics added
1.32      August 1995      Statistics added for other leagues
1.4       September 1995   Fielding Data added
1.41      November 1995    1995 statistics added
1.42      March 1996       HOF/All-Star tables added
1.5-MS    October 1996     1st public release - MS Access format
1.5-GV    October 1996     Released generic comma-delimited files
1.6-MS    December 1996    Updated with 1996 stats, some corrections
1.61-MS   December 1996    Corrected error in MASTER table
1.62      February 1997    Corrected 1914-1915 batters data and updated
2.0       February 1998    Major Revisions-added teams & managers
2.1       October 1998     Interim release w/1998 stats
2.2       January 1999     New release w/post-season stats & awards added
3.0       November 1999    Major release - fixed errors and 1999 statistics added
4.0       May 2001         Major release - proofed & redesigned tables
4.5       March 2002       Updated with 2001 stats and added new biographical data
5.0       December 2002    Major revision - new tables and data
5.1       January 2004     Updated with 2003 data, and new pitching categories
5.2       November 2004    Updated with 2004 season statistics.
5.3       December 2005    Updated with 2005 season statistics.
5.4       December 2006    Updated with 2006 season statistics.
5.5       December 2007    Updated with 2007 season statistics.
5.6       December 2008    Updated with 2008 season statistics.
5.7       December 2009    Updated for 2009 and added several tables.
5.8       December 2010    Updated with 2010 season statistics.
5.9       December 2011    Updated for 2011 and removed obsolete tables.
2012      December 2012    Updated with 2012 season statistics
2013      December 2013    Updated with 2013 season statistics
2014      December 2014    Updated with 2014 season statistics



------------------------------------------------------------------------------
2.0 Data Tables

The design follows these general principles. Each player is assigned a
unique code (playerID). All of the information relating to that player
is tagged with his playerID.
The playerIDs are linked to names and birthdates in the MASTER table.

The database is comprised of the following main tables:

MASTER - Player names, DOB, and biographical info
Batting - batting statistics
Pitching - pitching statistics
Fielding - fielding statistics

It is supplemented by these tables:

AllStarFull - All-Star appearances
HallOfFame - Hall of Fame voting data
Managers - managerial statistics
Teams - yearly stats and standings
BattingPost - post-season batting statistics
PitchingPost - post-season pitching statistics
TeamFranchises - franchise information
FieldingOF - outfield position data
FieldingPost - post-season fielding data
ManagersHalf - split season data for managers
TeamsHalf - split season data for teams
Salaries - player salary data
SeriesPost - post-season series information
AwardsManagers - awards won by managers
AwardsPlayers - awards won by players
AwardsShareManagers - award voting for manager awards
AwardsSharePlayers - award voting for player awards
Appearances - details on the positions a player appeared at
Schools - list of colleges that players attended
CollegePlaying - list of players and the colleges they attended


Sections 2.1 through 2.24 of this document describe each of the tables in
detail and the fields that each contains.


--------------------------------------------------------------------------
2.1 MASTER table


playerID        A unique code assigned to each player.
                The playerID links the data in this file with records in
                the other files.
birthYear       Year player was born
birthMonth      Month player was born
birthDay        Day player was born
birthCountry    Country where player was born
birthState      State where player was born
birthCity       City where player was born
deathYear       Year player died
deathMonth      Month player died
deathDay        Day player died
deathCountry    Country where player died
deathState      State where player died
deathCity       City where player died
nameFirst       Player's first name
nameLast        Player's last name
nameGiven       Player's given name (typically first and middle)
weight          Player's weight in pounds
height          Player's height in inches
bats            Player's batting hand (left, right, or both)
throws          Player's throwing hand (left or right)
debut           Date that player made first major league appearance
finalGame       Date that player made last major league appearance (blank if still active)
retroID         ID used by retrosheet
bbrefID         ID used by Baseball Reference website


------------------------------------------------------------------------------
2.2 Batting Table

playerID        Player ID code
yearID          Year
stint           player's stint (order of appearances within a season)
teamID          Team
lgID            League
G               Games
AB              At Bats
R               Runs
H               Hits
2B              Doubles
3B              Triples
HR              Homeruns
RBI             Runs Batted In
SB              Stolen Bases
CS              Caught Stealing
BB              Base on Balls
SO              Strikeouts
IBB             Intentional walks
HBP             Hit by pitch
SH              Sacrifice hits
SF              Sacrifice flies
GIDP            Grounded into double plays

------------------------------------------------------------------------------
2.3 Pitching table

playerID        Player ID code
yearID          Year
stint           player's stint (order of appearances within a season)
teamID          Team
lgID            League
W               Wins
L               Losses
G               Games
GS              Games Started
CG              Complete Games
SHO             Shutouts
SV              Saves
IPOuts          Outs Pitched (innings pitched x 3)
H               Hits
ER              Earned Runs
HR              Homeruns
BB              Walks
SO              Strikeouts
BAOpp           Opponent's Batting Average
ERA             Earned Run Average
IBB             Intentional Walks
WP              Wild Pitches
HBP             Batters Hit By Pitch
BK              Balks
BFP             Batters faced by Pitcher
GF              Games Finished
R               Runs Allowed
SH              Sacrifices by opposing batters
SF              Sacrifice flies by opposing batters
GIDP            Grounded into double plays by opposing batter

------------------------------------------------------------------------------
2.4 Fielding Table

playerID        Player ID code
yearID          Year
stint           player's stint (order of appearances within a season)
teamID          Team
lgID            League
Pos             Position
G               Games
GS              Games Started
InnOuts         Time played in the field expressed as outs
PO              Putouts
A               Assists
E               Errors
DP              Double Plays
PB              Passed Balls (by catchers)
WP              Wild Pitches (by catchers)
SB              Opponent Stolen Bases (by catchers)
CS              Opponents Caught Stealing (by catchers)
ZR              Zone Rating

------------------------------------------------------------------------------
2.5 AllstarFull table

playerID        Player ID code
YearID          Year
gameNum         Game number (zero if only one All-Star game played that season)
gameID          Retrosheet ID for the game
teamID          Team
lgID            League
GP              1 if Played in the game
startingPos     If player was game starter, the position played

------------------------------------------------------------------------------
2.6 HallOfFame table

playerID        Player ID code
yearID          Year of ballot
votedBy         Method by which player was voted upon
ballots         Total ballots cast in that year
needed          Number of votes needed for selection in that year
votes           Total votes received
inducted        Whether player was inducted by that vote or not (Y or N)
category        Category in which candidate was honored
needed_note     Explanation of qualifiers for special elections

------------------------------------------------------------------------------
2.7 Managers table

playerID        Player ID Number
yearID          Year
teamID          Team
lgID            League
inseason        Managerial order. Zero if the individual managed the team
                the entire year. Otherwise denotes where the manager appeared
                in the managerial order (1 for first manager, 2 for second, etc.)
G               Games managed
W               Wins
L               Losses
rank            Team's final position in standings that year
plyrMgr         Player Manager (denoted by 'Y')

------------------------------------------------------------------------------
2.8 Teams table

yearID          Year
lgID            League
teamID          Team
franchID        Franchise (links to TeamsFranchise table)
divID           Team's division
Rank            Position in final standings
G               Games played
GHome           Games played at home
W               Wins
L               Losses
DivWin          Division Winner (Y or N)
WCWin           Wild Card Winner (Y or N)
LgWin           League Champion (Y or N)
WSWin           World Series Winner (Y or N)
R               Runs scored
AB              At bats
H               Hits by batters
2B              Doubles
3B              Triples
HR              Homeruns by batters
BB              Walks by batters
SO              Strikeouts by batters
SB              Stolen bases
CS              Caught stealing
HBP             Batters hit by pitch
SF              Sacrifice flies
RA              Opponents runs scored
ER              Earned runs allowed
ERA             Earned run average
CG              Complete games
SHO             Shutouts
SV              Saves
IPOuts          Outs Pitched (innings pitched x 3)
HA              Hits allowed
HRA             Homeruns allowed
BBA             Walks allowed
SOA             Strikeouts by pitchers
E               Errors
DP              Double Plays
FP              Fielding percentage
name            Team's full name
park            Name of team's home ballpark
attendance      Home attendance total
BPF             Three-year park factor for batters
PPF             Three-year park factor for pitchers
teamIDBR        Team ID used by Baseball Reference website
teamIDlahman45  Team ID used in Lahman database version 4.5
teamIDretro     Team ID used by Retrosheet

------------------------------------------------------------------------------
2.9 BattingPost table

yearID          Year
round           Level of playoffs
playerID        Player ID code
teamID          Team
lgID            League
G               Games
AB              At Bats
R               Runs
H               Hits
2B              Doubles
3B              Triples
HR              Homeruns
RBI             Runs Batted In
SB              Stolen Bases
CS              Caught stealing
BB              Base on Balls
SO              Strikeouts
IBB             Intentional walks
HBP             Hit by pitch
SH              Sacrifices
SF              Sacrifice flies
GIDP            Grounded into double plays

------------------------------------------------------------------------------
2.10 PitchingPost table

playerID        Player ID code
yearID          Year
round           Level of playoffs
teamID          Team
lgID            League
W               Wins
L               Losses
G               Games
GS              Games Started
CG              Complete Games
SHO             Shutouts
SV              Saves
IPOuts          Outs Pitched (innings pitched x 3)
H               Hits
ER              Earned Runs
HR              Homeruns
BB              Walks
SO              Strikeouts
BAOpp           Opponents' batting average
ERA             Earned Run Average
IBB             Intentional Walks
WP              Wild Pitches
HBP             Batters Hit By Pitch
BK              Balks
BFP             Batters faced by Pitcher
GF              Games Finished
R               Runs Allowed
SH              Sacrifice Hits allowed
SF              Sacrifice Flies allowed
GIDP            Grounded into Double Plays

------------------------------------------------------------------------------
2.11 TeamFranchises table

franchID        Franchise ID
franchName      Franchise name
active          Whether team is currently active (Y or N)
NAassoc         ID of National Association team franchise played as

------------------------------------------------------------------------------
2.12 FieldingOF table

playerID        Player ID code
yearID          Year
stint           player's stint (order of appearances within a season)
Glf             Games played in left field
Gcf             Games played in center field
Grf             Games played in right field

------------------------------------------------------------------------------
2.13 ManagersHalf table

playerID        Manager ID code
yearID          Year
teamID          Team
lgID            League
inseason        Managerial order. One if the individual managed the team
                the entire year.
                Otherwise denotes where the manager appeared
                in the managerial order (1 for first manager, 2 for second, etc.)
half            First or second half of season
G               Games managed
W               Wins
L               Losses
rank            Team's position in standings for the half

------------------------------------------------------------------------------
2.14 TeamsHalf table

yearID          Year
lgID            League
teamID          Team
half            First or second half of season
divID           Division
DivWin          Won Division (Y or N)
rank            Team's position in standings for the half
G               Games played
W               Wins
L               Losses

------------------------------------------------------------------------------
2.15 Salaries table

yearID          Year
teamID          Team
lgID            League
playerID        Player ID code
salary          Salary

------------------------------------------------------------------------------
2.16 SeriesPost table

yearID          Year
round           Level of playoffs
teamIDwinner    Team ID of the team that won the series
lgIDwinner      League ID of the team that won the series
teamIDloser     Team ID of the team that lost the series
lgIDloser       League ID of the team that lost the series
wins            Wins by team that won the series
losses          Losses by team that won the series
ties            Tie games

------------------------------------------------------------------------------
2.17 AwardsManagers table

playerID        Manager ID code
awardID         Name of award won
yearID          Year
lgID            League
tie             Award was a tie (Y or N)
notes           Notes about the award

------------------------------------------------------------------------------
2.18 AwardsPlayers table

playerID        Player ID code
awardID         Name of award won
yearID          Year
lgID            League
tie             Award was a tie (Y or N)
notes           Notes about the award

------------------------------------------------------------------------------
2.19 AwardsShareManagers table

awardID         name of award votes were received for
yearID          Year
lgID            League
playerID        Manager ID code
pointsWon       Number of points received
pointsMax       Maximum number of points possible
votesFirst      Number of first place votes

------------------------------------------------------------------------------
2.20 AwardsSharePlayers table

awardID         name of award votes were received for
yearID          Year
lgID            League
playerID        Player ID code
pointsWon       Number of points received
pointsMax       Maximum number of points possible
votesFirst      Number of first place votes

------------------------------------------------------------------------------
2.21 FieldingPost table

playerID        Player ID code
yearID          Year
teamID          Team
lgID            League
round           Level of playoffs
Pos             Position
G               Games
GS              Games Started
InnOuts         Time played in the field expressed as outs
PO              Putouts
A               Assists
E               Errors
DP              Double Plays
TP              Triple Plays
PB              Passed Balls
SB              Stolen Bases allowed (by catcher)
CS              Caught Stealing (by catcher)

------------------------------------------------------------------------------
2.22 Appearances table

yearID          Year
teamID          Team
lgID            League
playerID        Player ID code
G_all           Total games played
GS              Games started
G_batting       Games in which player batted
G_defense       Games in which player appeared on defense
G_p             Games as pitcher
G_c             Games as catcher
G_1b            Games as first baseman
G_2b            Games as second baseman
G_3b            Games as third baseman
G_ss            Games as shortstop
G_lf            Games as left fielder
G_cf            Games as center fielder
G_rf            Games as right fielder
G_of            Games as outfielder
G_dh            Games as designated hitter
G_ph            Games as pinch hitter
G_pr            Games as pinch runner


------------------------------------------------------------------------------
2.23 Schools table

schoolID        school ID code
schoolName      school name
schoolCity      city where school is located
schoolState     state where school's city is located
schoolNick      nickname for school's baseball team


------------------------------------------------------------------------------
2.24 CollegePlaying table

playerID        Player ID code
schoolID        school ID code
year            year


<end of file>
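
Example: linking tables by playerID

The tables above are tied together by playerID, and the pitching tables
store playing time as IPOuts (innings pitched x 3). The sketch below
shows one possible way to work with the comma delimited files in Python.
It is only an illustration: the sample rows, the player IDs (doejo01,
roeri01), the team codes, and the helper names read_rows,
innings_pitched, and join_names are all made up for this example and are
not part of the database. Only a few of the documented columns are used.

```python
import csv
import io

# Made-up sample rows (NOT real database records), laid out like a few
# columns of Master.csv and Pitching.csv as described in sections 2.1 and 2.3.
MASTER_CSV = """playerID,nameFirst,nameLast
doejo01,John,Doe
roeri01,Richard,Roe
"""

PITCHING_CSV = """playerID,yearID,stint,teamID,lgID,IPOuts,ER
doejo01,2014,1,AAA,NL,600,70
roeri01,2014,1,BBB,AL,91,12
"""


def read_rows(text):
    """Parse comma delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def innings_pitched(ipouts):
    """Convert outs pitched to innings pitched (IPOuts = innings pitched x 3)."""
    return int(ipouts) / 3


def join_names(stat_rows, master_rows):
    """Attach each player's name to his stat rows by matching on playerID."""
    names = {
        m["playerID"]: m["nameFirst"] + " " + m["nameLast"] for m in master_rows
    }
    # Rows whose playerID is missing from the master rows get an empty name.
    return [dict(row, name=names.get(row["playerID"], "")) for row in stat_rows]


rows = join_names(read_rows(PITCHING_CSV), read_rows(MASTER_CSV))
for row in rows:
    print(row["name"], row["yearID"], round(innings_pitched(row["IPOuts"]), 1))
```

To try this against the real files, the in-memory strings could be
replaced by opening Master.csv and Pitching.csv from the comma delimited
version; the column names would need to be checked against the actual
file headers first.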