48 Commits (f0b2cffb176d1914924400274101de70c81c28d4)

Author SHA1 Message Date
Francois Huet f0b2cffb17 Replace constant padding with expansion heuristic
Fixed all unit tests.
Removed constant padding added around tables in the last step of the
initial discovery mode of the stream algorithm.
Replaced it with a heuristic that attempts to expand the table up while
respecting columns identified so far.
Updated unit tests to reflect new behavior, improved rejection of
extraneous information in a few cases.
Added unit test covering a use case where the header has vertical text.
Made improvements to better support vertical text in tables.
2020-04-05 17:05:06 -07:00
Vinayak Mehta 2115a0e177 Blacken code 2019-07-03 23:47:42 +05:30
Vinayak Mehta de3281c1b6 Add test 2019-05-27 22:18:23 +05:30
Vinayak Mehta b2a8348f13 Fix #312 2019-05-26 17:13:59 +05:30
Vinayak Mehta f94777038a Update stream table regions logic 2019-01-04 20:27:53 +05:30
Vinayak Mehta 03f301b25c Add table regions support 2019-01-04 19:17:54 +05:30
Vinayak Mehta 605ffdd444 Add test 2019-01-03 16:13:41 +05:30
Vinayak Mehta f605bd8f94 Fix #239 2019-01-03 14:55:47 +05:30
Vinayak Mehta 50b4468aff Rename kwargs and add tests 2018-12-21 15:09:37 +05:30
Vinayak Mehta f6aa21c31f Add strip_text 2018-12-20 16:32:16 +05:30
Vinayak Mehta 3f5af18738 Add resolution 2018-12-20 15:01:29 +05:30
Vinayak Mehta e0090fbb0a Add edge close tolerance 2018-12-20 13:58:54 +05:30
Vinayak Mehta 48b2dce633 Update advanced docs 2018-12-19 18:19:39 +05:30
Vinayak Mehta ca6cefa362 Add extra_kwargs 2018-12-17 11:49:05 +05:30
Vinayak Mehta 69136431b6 Fix #215 2018-12-13 14:36:50 +05:30
Vinayak Mehta 5e71f0b0e6 Fix #192 2018-12-13 12:50:30 +05:30
Vinayak Mehta 33cea45346 Fix #105 2018-12-13 00:45:22 +05:30
Vinayak Mehta 591cfd5291 Change kwarg name 2018-12-12 10:15:04 +05:30
Vinayak Mehta e50f9c8847 Change suppress_warnings to verbose 2018-12-12 09:58:34 +05:30
Vinayak Mehta 87a2f4fdc9 Add textedge plot type 2018-12-12 07:36:07 +05:30
Vinayak Mehta 23ec6b55f7 Add docstrings and update docs 2018-11-23 21:04:10 +05:30
Vinayak Mehta 1f71513004 Fix no table found warning and add tests for two tables 2018-11-23 19:28:55 +05:30
Vinayak Mehta 0251422e33 Add fix to include table headers 2018-11-23 03:27:23 +05:30
Vinayak Mehta a1e1fd781d Fix comments 2018-11-23 02:51:22 +05:30
Vinayak Mehta 4e2aee18c3 Add get_table_areas textedges method 2018-11-22 19:48:51 +05:30
Vinayak Mehta a587ea3782 Add get_relevant textedges method 2018-11-22 18:24:31 +05:30
Vinayak Mehta 123227aa8c Add TextEdge and TextEdges helper classes 2018-11-22 05:31:02 +05:30
Vinayak Mehta defaead679 Add table bbox attribute (#193) 2018-11-04 01:33:41 +05:30
Parth P Panchal 32df09ad1c Renames the keyword `table_area` to `table_areas` (#171)
`table_areas` sounds more apt since it is a list and there can be
multiple table areas on a page.

Closes #165
2018-10-24 23:06:53 +05:30
Oshawk 90aaba6eec [MRG + 1] Make pep8 (#125)
* Make setup.py pep8

Add new line at end of file, fix bare except, remove unused import.

* Make tests/*.py pep8

Add some newlines at end of files and a visual indent.

* Make docs/*.py pep8

Fix block comments and add new lines at end of files.

* Make camelot/*.py pep8

Fixed unused import, a few weirdly ordered imports, a docstring typo and many new lines at the end of lines.

* Fix imports

Fix import order and remove a couple more unused imports.

* Fix indents

Fix indentation (no opening delimiter alignment).

* Add newlines
2018-10-05 16:55:43 +05:30
Vinayak Mehta c5bde5e2ad [MRG] Add error/warning tests (#113)
* Add unknown flavor test

* Add input kwargs test

* Remove unused utils

* Add unsupported format test

* Add stream unequal tables-columns length test

* Add python3 compat

* Add no tables found test

* Convert util info log to warning
2018-10-02 19:28:42 +05:30
Vinayak Mehta fc0542bd3c Add Python 3 compatibility (#109)
* Add python3 compat

* Update .gitignore

* Update .gitignore again

* Remove debugging return

* Add unicode_literals import

* Bump version

* Add python3-tk note
2018-09-28 21:58:29 +05:30
Vinayak Mehta be2733ebd2 Add utf8 header 2018-09-24 16:27:26 +05:30
Vinayak Mehta a70befe528 Update docs 2018-09-23 14:04:21 +05:30
Vinayak Mehta 959a252aa3 Fix CLI 2018-09-23 12:45:01 +05:30
Vinayak Mehta 7aaa7b2460 Deprecate debug and add plot docstrings 2018-09-23 11:56:40 +05:30
Vinayak Mehta 3170a9689f Add flavors 2018-09-23 10:53:32 +05:30
Vinayak Mehta 0ba3469d21 Add Stream benchmarks 2018-09-12 07:21:35 +05:30
Vinayak Mehta 17ea5f335e Fix docstrings and interlinks 2018-09-11 08:31:37 +05:30
Vinayak Mehta 9878de4dfc Add docstrings and update docs 2018-09-09 10:00:22 +05:30
Vinayak Mehta 7c3e531b07 Port tests 2018-09-09 05:29:24 +05:30
Vinayak Mehta 04383920b4 Rename parser keyword arguments 2018-09-08 05:38:43 +05:30
Vinayak Mehta e615580e55 Fix plot_geometry 2018-09-07 06:25:13 +05:30
Vinayak Mehta b3f840bba9 Change utils function names 2018-09-07 06:04:45 +05:30
Vinayak Mehta 20acda2259 Fix current logging 2018-09-07 05:53:19 +05:30
Vinayak Mehta 0c329634e7 Add export to TableList and Table 2018-09-07 05:13:34 +05:30
Vinayak Mehta 557189da24 Refactor core 2018-09-06 07:42:41 +05:30
Vinayak Mehta b91df8a1b8 Create parsers module 2018-09-06 06:13:58 +05:30