The Stream class would raise an IndexError when the 'columns' argument was specified
and the number of tables identified was larger than the number of items in the
'columns' argument.
This IndexError makes the Stream parser cumbersome for extracting tables from a
PDF composed mainly of known, consistent table structures of interest to the
caller that may vary in height, starting position, or number.
This is especially true in an automated or programmatic context.
The caller must either call 'camelot.read_pdf' once per page or
manipulate the 'columns' argument to avoid the IndexError. The former
isn't guaranteed to work, because a single page can contain multiple tables;
in that situation the caller must resort to the latter even when
extracting tables from a single page.
The Stream class continues to function exactly the same when the 'table_areas'
argument is provided; this commit only changes the behavior of the Stream parser
when 'table_areas' is not provided.
This commit allows all tables to be easily extracted by specifying 'pages=all'
and providing the appropriate 'columns' argument value to
'camelot.read_pdf'.
Extracting all tables from such a PDF is already possible with the
Lattice parser; this commit makes it possible with the Stream
parser as well.
Callers are responsible for filtering out any extraneous tables.
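
The usage described above can be sketched as follows. This is a minimal
illustration, not part of this commit: the helper names 'filter_tables' and
'extract_known_tables', the file path, the column separators, and the expected
column count are all hypothetical. It assumes each returned table exposes a
'shape' tuple of (rows, columns) and filters on column count, which is only one
possible way to discard extraneous tables.

```python
def filter_tables(tables, expected_columns):
    """Keep only tables whose column count matches the known structure.

    Hypothetical helper: assumes each table has a `shape` tuple
    of (rows, columns).
    """
    return [t for t in tables if t.shape[1] == expected_columns]


def extract_known_tables(path, columns, expected_columns):
    """Run the Stream parser over every page, then drop extraneous tables.

    `columns` is a single comma-separated string of x-coordinates,
    e.g. "72,95,209" (illustrative values only).
    """
    # Imported lazily so filter_tables can be used without camelot installed.
    import camelot

    tables = camelot.read_pdf(
        path,
        flavor='stream',
        pages='all',
        columns=[columns],
    )
    return filter_tables(tables, expected_columns)
```

A call might look like extract_known_tables('report.pdf', '72,95,209', 4),
where every argument value is a placeholder for the caller's own data.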
* Make setup.py pep8
Add new line at end of file, fix bare except, remove unused import.
* Make tests/*.py pep8
Add some newlines at end of files and a visual indent.
* Make docs/*.py pep8
Fix block comments and add new lines at end of files.
* Make camelot/*.py pep8
Fix unused import, a few weirdly ordered imports, a docstring typo and many new lines at the end of files.
* Fix imports
Fix import order and remove a couple more unused imports.
* Fix indents
Fix indentation (no opening delimiter alignment).
* Add newlines
* Add unknown flavor test
* Add input kwargs test
* Remove unused utils
* Add unsupported format test
* Add stream unequal tables-columns length test
* Add python3 compat
* Add no tables found test
* Convert util info log to warning