A Python library to extract tabular data from PDFs
 
 
Vinayak Mehta d7afe56711 Add CONTRIBUTING and CODE_OF_CONDUCT 2018-09-12 18:52:30 +05:30

Camelot: PDF Table Parsing for Humans


Camelot is a Python library that makes it easy for anyone to extract tables from PDF files!

Usage

>>> import camelot
>>> tables = camelot.read_pdf('foo.pdf')
>>> tables
<TableList n=2>
>>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html
>>> tables[0]
<Table shape=(3,4)>
>>> tables[0].parsing_report
{
    'accuracy': 96,
    'whitespace': 80,
    'order': 1,
    'page': 1
}
>>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html
>>> tables[0].df # get a pandas DataFrame!

There's a command-line interface too!
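
The `parsing_report` shown in the usage example can help decide which extracted tables are worth keeping. The following is a minimal sketch, not part of Camelot itself: `filter_by_accuracy` is a hypothetical helper, and it assumes each table exposes a `parsing_report` dict with an `'accuracy'` key and that the object returned by `camelot.read_pdf()` can be iterated. Stand-in objects are used so the snippet runs without a PDF.

```python
from types import SimpleNamespace


def filter_by_accuracy(tables, min_accuracy=90):
    """Return the tables whose parsing report meets the accuracy threshold.

    Hypothetical helper: assumes each table has a `parsing_report` dict
    with an 'accuracy' key, as in the usage example above.
    """
    return [t for t in tables if t.parsing_report["accuracy"] >= min_accuracy]


# Stand-in objects that mimic only the `parsing_report` attribute.
# With Camelot you would pass the result of camelot.read_pdf() instead.
fake_tables = [
    SimpleNamespace(parsing_report={"accuracy": 96, "whitespace": 80, "order": 1, "page": 1}),
    SimpleNamespace(parsing_report={"accuracy": 42, "whitespace": 95, "order": 2, "page": 1}),
]

good = filter_by_accuracy(fake_tables, min_accuracy=90)

# Show which tables survived the filter, identified by their order on the page.
print([t.parsing_report["order"] for t in good])
```

The threshold of 90 is an arbitrary choice for illustration; a suitable value depends on your PDFs.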

Installation

After installing the dependencies, you can install Camelot with pip:

$ pip install camelot-py

Documentation

The documentation is available at link.

Development

The Contributor's Guide has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README.

Source code

You can check out the latest sources with the command:

$ git clone https://www.github.com/socialcopsdev/camelot.git

Setting up development environment

You can install the development dependencies with the command:

$ pip install camelot-py[dev]

Testing

After installation, you can run tests using:

$ python setup.py test

License

This project is licensed under the MIT License.