Commit Graph

60 Commits (a401d33fd93e37b12dcf348fd8a14b8d838a1a19)

Author SHA1 Message Date
Frh a401d33fd9 Refactor out _text_bbox 2020-06-11 17:20:36 -07:00
Frh 7ad5b843ab Move generic code to utils 2020-06-11 17:20:36 -07:00
Frh 14cd328644 Refactor common code hybrid / stream 2020-06-11 17:20:36 -07:00
Frh db645627ff Prefer showing diffs at the row level 2020-06-11 17:20:36 -07:00
Frh a2a831110e Fix in table diff 2020-06-11 17:20:36 -07:00
Frh 1a47c3df89 Prettier plotting, improve gaps calculation 2020-06-11 17:20:36 -07:00
Frh e0e3ff4e07 Add support for region/area for hybrid 2020-06-11 17:20:36 -07:00
Frh 64576fd836 More refactoring / linting 2020-06-11 17:20:36 -07:00
Frh f37ed50fed More linting, refactor 2020-06-11 17:20:36 -07:00
Frh 20f18b478f Lint, refactor 2020-06-11 17:20:36 -07:00
Frh 37483ca202 Prep work for new hybrid parser introduction
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f"{}"
2020-06-11 17:20:36 -07:00
Frh 161f71230d Refactor base classes and improve plotting
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-06-11 17:20:36 -07:00
Frh bd2aab5b2d Fix unit tests, lint, drop Python 2 support
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Py.
Linting.
2020-06-11 17:20:35 -07:00
Vinayak Mehta a97b50ef21 Update flavor kwargs 2019-07-06 22:59:51 +05:30
Dimiter Naydenov 240ea6c411 Fixed strip_text argument getting ignored 2019-07-04 12:12:52 +03:00
Vinayak Mehta 2115a0e177 Blacken code 2019-07-03 23:47:42 +05:30
Vinayak Mehta ce727d9558 Fix split text bug 2019-03-22 02:28:29 +05:30
Vinayak Mehta 03f301b25c Add table regions support 2019-01-04 19:17:54 +05:30
Vinayak Mehta 9d90cadac0 Fix variable name 2019-01-03 15:47:05 +05:30
Vinayak Mehta f605bd8f94 Fix #239 2019-01-03 14:55:47 +05:30
Vinayak Mehta 62ed4753cd Make python2 compat 2018-12-24 13:10:48 +05:30
Vinayak Mehta 2b3461deab Add support to read from url 2018-12-24 12:55:52 +05:30
Vinayak Mehta 50b4468aff Rename kwargs and add tests 2018-12-21 15:09:37 +05:30
Vinayak Mehta f6aa21c31f Add strip_text 2018-12-20 16:32:16 +05:30
Vinayak Mehta ca6cefa362 Add extra_kwargs 2018-12-17 11:49:05 +05:30
Vinayak Mehta 5e71f0b0e6 Fix #192 2018-12-13 12:50:30 +05:30
Oshawk 90aaba6eec [MRG + 1] Make pep8 (#125)
* Make setup.py pep8

Add new line at end of file, fix bare except, remove unused import.

* Make tests/*.py pep8

Add some newlines at end of files and a visual indent.

* Make docs/*.py pep8

Fix block comments and add new lines at end of files.

* Make camelot/*.py pep8

Fixed unused import, a few weirdly ordered imports, a docstring typo and many new lines at the end of lines.

* Fix imports

Fix import order and remove a couple more unused imports.

* Fix indents

Fix indentation (no opening delimiter alignment).

* Add newlines
2018-10-05 16:55:43 +05:30
Vinayak Mehta 6e8079df84 [MRG] Add tests for output formats and parser kwargs (#126)
* Remove unused image processing code

* Add opencv back-compat comment

* Add tests for parser special cases

* Fix lattice table area test

* Add tests for output format

* Add openpyxl dep
2018-10-05 16:15:30 +05:30
Vinayak Mehta c5bde5e2ad [MRG] Add error/warning tests (#113)
* Add unknown flavor test

* Add input kwargs test

* Remove unused utils

* Add unsupported format test

* Add stream unequal tables-columns length test

* Add python3 compat

* Add no tables found test

* Convert util info log to warning
2018-10-02 19:28:42 +05:30
Vinayak Mehta fc0542bd3c Add Python 3 compatibility (#109)
* Add python3 compat

* Update .gitignore

* Update .gitignore again

* Remove debugging return

* Add unicode_literals import

* Bump version

* Add python3-tk note
2018-09-28 21:58:29 +05:30
Vinayak Mehta 3170a9689f Add flavors 2018-09-23 10:53:32 +05:30
Vinayak Mehta 17ea5f335e Fix docstrings and interlinks 2018-09-11 08:31:37 +05:30
Vinayak Mehta 7bb1aee9b6 Add CLI 2018-09-10 15:16:41 +05:30
Vinayak Mehta d3beaafc99 Add temporary directory context manager 2018-09-09 18:10:55 +05:30
Vinayak Mehta 9a6ed555c8 Fix get_rotation 2018-09-09 10:04:54 +05:30
Vinayak Mehta 9878de4dfc Add docstrings and update docs 2018-09-09 10:00:22 +05:30
Vinayak Mehta 04383920b4 Rename parser keyword arguments 2018-09-08 05:38:43 +05:30
Vinayak Mehta b3f840bba9 Change utils function names 2018-09-07 06:04:45 +05:30
Vinayak Mehta 20acda2259 Fix current logging 2018-09-07 05:53:19 +05:30
Vinayak Mehta b91df8a1b8 Create parsers module 2018-09-06 06:13:58 +05:30
Vinayak Mehta 96af09d9cd Add BaseParser and refactor extract_tables 2018-09-06 05:28:34 +05:30
Vinayak Mehta a4d3165e94 Add docstring stubs 2018-09-05 19:35:46 +05:30
Vinayak Mehta bf63432494 Remove docstrings 2018-09-05 19:04:40 +05:30
Vinayak Mehta 9124e3374c Add properties to Table 2018-09-05 18:20:46 +05:30
Vinayak Mehta e252e476b9 Add better y-cuts detection 2017-04-25 18:44:53 +05:30
Vinayak Mehta 5c5bd6199c Fix warnings and exceptions 2017-04-21 14:20:33 +05:30
Vinayak Mehta 4da754ddcb [ENH] Add OCR and better joint detection
* Add iterations for dilation

* Add OCRLattice and OCRStream

* Add debug
2017-04-18 18:25:47 +05:30
Vinayak Mehta 72233f25ce Parameterize thresholding blocksize and constant 2017-04-10 21:15:54 +05:30
Vinayak Mehta bc86346154 Don't let processes modify instance attributes 2017-02-07 22:13:33 +05:30
Vinayak Mehta 70f626373b Cosmits
* Remove unnecessary kwargs

* Direct ghostscript call output to /dev/null

* Change char_margin's default value
2017-01-07 15:58:45 +05:30