# Camelot: PDF Table Parsing for Humans

Camelot is a Python 2.7 library and command-line tool for extracting tabular data from PDF files.

## Usage


    >>> import camelot
    >>> tables = camelot.read_pdf("foo.pdf")
    >>> tables
    <TableList n=2>
    >>> tables.export("foo.csv", f="csv", compress=True)  # json, excel, html
    >>> tables[0]
    <Table shape=(3,4)>
    >>> tables[0].to_csv("foo.csv")  # to_json, to_excel, to_html
    >>> tables[0].parsing_report
    {
        "accuracy": 96,
        "whitespace": 80,
        "order": 1,
        "page": 1
    }
    >>> df = tables[0].df

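
The parsing report can be used to decide which tables to keep. The sketch below works on plain dictionaries shaped like the report shown above, so it runs without a PDF. `meets_threshold`, the threshold of 90, and the second report are hypothetical and used only for illustration; they are not part of Camelot.

```python
# Hypothetical helper, not part of Camelot's API.
def meets_threshold(report, min_accuracy=90):
    """Return True if a parsing report's accuracy is at least min_accuracy."""
    return report.get("accuracy", 0) >= min_accuracy


# First report copied from the usage session; the second is invented
# purely to show a table being filtered out.
reports = [
    {"accuracy": 96, "whitespace": 80, "order": 1, "page": 1},
    {"accuracy": 42, "whitespace": 10, "order": 2, "page": 1},
]

kept = [r for r in reports if meets_threshold(r)]
print([r["order"] for r in kept])

# With real tables, and assuming the TableList can be iterated over,
# the same filter would read:
#   good = [t for t in tables if meets_threshold(t.parsing_report)]
```

A report with no `"accuracy"` key is treated as failing the check, which seems the safer default when the report format is uncertain.
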
## Dependencies
The dependencies include [tk](https://wiki.tcl.tk/3743) and [ghostscript](https://www.ghostscript.com/).
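
Before installing, it may help to check whether both dependencies are already present. The following is a minimal sketch, not something Camelot ships: `check_dependencies` is a hypothetical name, and it assumes that Ghostscript is installed as an executable called `gs` and that tk is reachable through Python's `Tkinter`/`tkinter` module.

```python
# Hypothetical helper, not part of Camelot: report whether the two
# system dependencies named above appear to be available.
try:
    from shutil import which  # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7


def check_dependencies():
    """Return a dict mapping dependency name to True/False."""
    status = {}

    # Ghostscript: assumed to be on PATH as `gs` (typical on Linux and OS X).
    status["ghostscript"] = which("gs") is not None

    # tk: the binding module is `Tkinter` on Python 2, `tkinter` on Python 3.
    try:
        import Tkinter  # noqa: F401
        status["tk"] = True
    except ImportError:
        try:
            import tkinter  # noqa: F401
            status["tk"] = True
        except ImportError:
            status["tk"] = False

    return status


if __name__ == "__main__":
    for name, found in sorted(check_dependencies().items()):
        print("{}: {}".format(name, "found" if found else "missing"))
```

A "missing" result only means the check above did not find the dependency under the assumed name; it may still be installed elsewhere.
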
## Installation
Make sure you have the latest versions of `pip` and `setuptools`. You can update them with:

    pip install -U pip setuptools

### Installing dependencies

tk and ghostscript can be installed using your system's default package manager.

#### Linux

* Ubuntu

      sudo apt-get install python-opencv python-tk ghostscript

* Arch Linux

      sudo pacman -S tk ghostscript

#### OS X

    brew install tcl-tk ghostscript

Finally, `cd` into the project directory and install by

    python setup.py install

## Development

### Code

You can check the latest sources with the command:

    git clone https://github.com/socialcopsdev/camelot.git

### Contributing

See [Contributing guidelines]().

### Testing

    python setup.py test

## License

BSD License