baseball.computer is a free, open-source database of historical baseball statistics and play-by-play data. It builds on historical data sources and new database technology to provide a novel combination of detail, flexibility, and power for baseball analysis. It is currently in preview release, so expect bugs in both data and functionality.
You can query and visualize data from your browser using the Query Engine, or you can use the database from any number of programming languages (examples in Python and R are provided in the links above).
Nearly all of the information in the database is sourced from Retrosheet, a volunteer-run organization that collects and distributes historical baseball data. The data is available for free and requires only the following attribution:
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".
I am a data engineer living in New York. If you have any questions or suggestions, feel free to reach out:
david.roher@baseball.computer