Frh
8a63e8e794
Minor linting
2020-04-29 12:31:02 -07:00
Frh
c0903b8ca9
Improve column detection for hybrid flavor
No longer rely on the mode but on the parsing analysis during network
detection.
Added unit test for complex table with vertical header and mixed
horizontal / vertical text.
2020-04-29 11:46:40 -07:00
Frh
04fc542dc3
Fix off-by-one error in column identification
2020-04-29 09:45:55 -07:00
Frh
918416e7e4
Improve hybrid table body discovery algo
While searching for table body boundaries, exclude rows that include
cells crossing previously discovered rows.
2020-04-28 22:43:55 -07:00
Frh
c51c24a416
Linting
2020-04-25 22:47:23 -07:00
Frh
2624010197
Remove f-strings, fix url based unit tests
f-strings fail unit tests in Python <3.6; replaced them with .format.
Made download_url send a Mozilla/5.0 User-Agent to restore unit tests,
since the targeted server was returning 403.
2020-04-25 21:14:56 -07:00
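A minimal sketch of the two changes described in the commit above, assuming
the download goes through the standard library's urllib. The helper names
build_request and describe are hypothetical and are not taken from the
project; the real download_url may be written differently.

```python
from urllib.request import Request


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a request carrying a browser-like User-Agent header.

    Hypothetical helper: some servers answer 403 to the default
    Python-urllib agent string, so a custom one is supplied here.
    """
    return Request(url, headers={"User-Agent": user_agent})


def describe(name, count):
    # str.format runs on Python 3.5, where an f-string is a syntax error.
    return "{} has {} tables".format(name, count)
```

The request object would then be passed to urllib.request.urlopen in place
of the bare URL.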
Frh
5290fb6a7d
Refactor out _text_bbox
2020-04-24 15:18:38 -07:00
Frh
414708d8c7
Move generic code to utils
2020-04-22 19:08:06 -07:00
Frh
36d5a09ad6
Refactor common code hybrid / stream
2020-04-22 17:33:15 -07:00
Frh
7b0ac03f8e
Prefer showing diffs at the row level
2020-04-22 14:50:45 -07:00
Frh
0be58de1cb
Fix in table diff
2020-04-22 14:23:52 -07:00
Frh
9a82408a9a
Prettier plotting, improve gaps calculation
2020-04-22 14:08:22 -07:00
Frh
175655d31b
Add support for region/area for hybrid
2020-04-20 11:20:59 -07:00
Frh
58823e57e9
More refactoring / linting
2020-04-19 15:41:45 -07:00
Frh
c27a8026d6
More linting, refactor
2020-04-19 14:42:18 -07:00
Frh
50f11867af
Lint, refactor
2020-04-19 14:30:32 -07:00
Frh
583868756a
Prep work for new hybrid parser introduction
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f"{}"
2020-04-19 11:32:22 -07:00
Frh
697289e409
Refactor base classes and improve plotting
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-04-18 23:03:27 -07:00
Frh
816471e426
Fix unit tests, lint, drop Python 2 support
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Python versions.
Linting.
2020-04-18 17:25:47 -07:00
Vinayak Mehta
a97b50ef21
Update flavor kwargs
2019-07-06 22:59:51 +05:30
Dimiter Naydenov
240ea6c411
Fixed strip_text argument getting ignored
2019-07-04 12:12:52 +03:00
Vinayak Mehta
2115a0e177
Blacken code
2019-07-03 23:47:42 +05:30
Vinayak Mehta
ce727d9558
Fix split text bug
2019-03-22 02:28:29 +05:30
Vinayak Mehta
03f301b25c
Add table regions support
2019-01-04 19:17:54 +05:30
Vinayak Mehta
9d90cadac0
Fix variable name
2019-01-03 15:47:05 +05:30
Vinayak Mehta
f605bd8f94
Fix #239
2019-01-03 14:55:47 +05:30
Vinayak Mehta
62ed4753cd
Make python2 compat
2018-12-24 13:10:48 +05:30
Vinayak Mehta
2b3461deab
Add support to read from url
2018-12-24 12:55:52 +05:30
Vinayak Mehta
50b4468aff
Rename kwargs and add tests
2018-12-21 15:09:37 +05:30
Vinayak Mehta
f6aa21c31f
Add strip_text
2018-12-20 16:32:16 +05:30
Vinayak Mehta
ca6cefa362
Add extra_kwargs
2018-12-17 11:49:05 +05:30
Vinayak Mehta
5e71f0b0e6
Fix #192
2018-12-13 12:50:30 +05:30
Oshawk
90aaba6eec
[MRG + 1] Make pep8 (#125)
* Make setup.py pep8
Add new line at end of file, fix bare except, remove unused import.
* Make tests/*.py pep8
Add some newlines at end of files and a visual indent.
* Make docs/*.py pep8
Fix block comments and add new lines at end of files.
* Make camelot/*.py pep8
Fixed unused import, a few weirdly ordered imports, a docstring typo and many new lines at the end of files.
* Fix imports
Fix import order and remove a couple more unused imports.
* Fix indents
Fix indentation (no opening delimiter alignment).
* Add newlines
2018-10-05 16:55:43 +05:30
Vinayak Mehta
6e8079df84
[MRG] Add tests for output formats and parser kwargs (#126)
* Remove unused image processing code
* Add opencv back-compat comment
* Add tests for parser special cases
* Fix lattice table area test
* Add tests for output format
* Add openpyxl dep
2018-10-05 16:15:30 +05:30
Vinayak Mehta
c5bde5e2ad
[MRG] Add error/warning tests (#113)
* Add unknown flavor test
* Add input kwargs test
* Remove unused utils
* Add unsupported format test
* Add stream unequal tables-columns length test
* Add python3 compat
* Add no tables found test
* Convert util info log to warning
2018-10-02 19:28:42 +05:30
Vinayak Mehta
fc0542bd3c
Add Python 3 compatibility (#109)
* Add python3 compat
* Update .gitignore
* Update .gitignore again
* Remove debugging return
* Add unicode_literals import
* Bump version
* Add python3-tk note
2018-09-28 21:58:29 +05:30
Vinayak Mehta
3170a9689f
Add flavors
2018-09-23 10:53:32 +05:30
Vinayak Mehta
17ea5f335e
Fix docstrings and interlinks
2018-09-11 08:31:37 +05:30
Vinayak Mehta
7bb1aee9b6
Add CLI
2018-09-10 15:16:41 +05:30
Vinayak Mehta
d3beaafc99
Add temporary directory context manager
2018-09-09 18:10:55 +05:30
Vinayak Mehta
9a6ed555c8
Fix get_rotation
2018-09-09 10:04:54 +05:30
Vinayak Mehta
9878de4dfc
Add docstrings and update docs
2018-09-09 10:00:22 +05:30
Vinayak Mehta
04383920b4
Rename parser keyword arguments
2018-09-08 05:38:43 +05:30
Vinayak Mehta
b3f840bba9
Change utils function names
2018-09-07 06:04:45 +05:30
Vinayak Mehta
20acda2259
Fix current logging
2018-09-07 05:53:19 +05:30
Vinayak Mehta
b91df8a1b8
Create parsers module
2018-09-06 06:13:58 +05:30
Vinayak Mehta
96af09d9cd
Add BaseParser and refactor extract_tables
2018-09-06 05:28:34 +05:30
Vinayak Mehta
a4d3165e94
Add docstring stubs
2018-09-05 19:35:46 +05:30
Vinayak Mehta
bf63432494
Remove docstrings
2018-09-05 19:04:40 +05:30
Vinayak Mehta
9124e3374c
Add properties to Table
2018-09-05 18:20:46 +05:30