Frh
37483ca202
Prep work for new hybrid parser introduction
...
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f"{}"
2020-06-11 17:20:36 -07:00
Frh
161f71230d
Refactor base classes and improve plotting
...
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-06-11 17:20:36 -07:00
Frh
bd2aab5b2d
Fix unit tests, lint, drop Python 2 support
...
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Python versions.
Linting.
2020-06-11 17:20:35 -07:00
Vinayak Mehta
a97b50ef21
Update flavor kwargs
2019-07-06 22:59:51 +05:30
Dimiter Naydenov
240ea6c411
Fixed strip_text argument getting ignored
2019-07-04 12:12:52 +03:00
Vinayak Mehta
2115a0e177
Blacken code
2019-07-03 23:47:42 +05:30
Vinayak Mehta
ce727d9558
Fix split text bug
2019-03-22 02:28:29 +05:30
Vinayak Mehta
03f301b25c
Add table regions support
2019-01-04 19:17:54 +05:30
Vinayak Mehta
9d90cadac0
Fix variable name
2019-01-03 15:47:05 +05:30
Vinayak Mehta
f605bd8f94
Fix #239
2019-01-03 14:55:47 +05:30
Vinayak Mehta
62ed4753cd
Make python2 compat
2018-12-24 13:10:48 +05:30
Vinayak Mehta
2b3461deab
Add support to read from url
2018-12-24 12:55:52 +05:30
Vinayak Mehta
50b4468aff
Rename kwargs and add tests
2018-12-21 15:09:37 +05:30
Vinayak Mehta
f6aa21c31f
Add strip_text
2018-12-20 16:32:16 +05:30
Vinayak Mehta
ca6cefa362
Add extra_kwargs
2018-12-17 11:49:05 +05:30
Vinayak Mehta
5e71f0b0e6
Fix #192
2018-12-13 12:50:30 +05:30
Oshawk
90aaba6eec
[MRG + 1] Make pep8 (#125)
...
* Make setup.py pep8
Add new line at end of file, fix bare except, remove unused import.
* Make tests/*.py pep8
Add some newlines at end of files and a visual indent.
* Make docs/*.py pep8
Fix block comments and add new lines at end of files.
* Make camelot/*.py pep8
Fixed unused import, a few weirdly ordered imports, a docstring typo and many newlines at the end of files.
* Fix imports
Fix import order and remove a couple more unused imports.
* Fix indents
Fix indentation (no opening delimiter alignment).
* Add newlines
2018-10-05 16:55:43 +05:30
Vinayak Mehta
6e8079df84
[MRG] Add tests for output formats and parser kwargs (#126)
...
* Remove unused image processing code
* Add opencv back-compat comment
* Add tests for parser special cases
* Fix lattice table area test
* Add tests for output format
* Add openpyxl dep
2018-10-05 16:15:30 +05:30
Vinayak Mehta
c5bde5e2ad
[MRG] Add error/warning tests (#113)
...
* Add unknown flavor test
* Add input kwargs test
* Remove unused utils
* Add unsupported format test
* Add stream unequal tables-columns length test
* Add python3 compat
* Add no tables found test
* Convert util info log to warning
2018-10-02 19:28:42 +05:30
Vinayak Mehta
fc0542bd3c
Add Python 3 compatibility (#109)
...
* Add python3 compat
* Update .gitignore
* Update .gitignore again
* Remove debugging return
* Add unicode_literals import
* Bump version
* Add python3-tk note
2018-09-28 21:58:29 +05:30
Vinayak Mehta
3170a9689f
Add flavors
2018-09-23 10:53:32 +05:30
Vinayak Mehta
17ea5f335e
Fix docstrings and interlinks
2018-09-11 08:31:37 +05:30
Vinayak Mehta
7bb1aee9b6
Add CLI
2018-09-10 15:16:41 +05:30
Vinayak Mehta
d3beaafc99
Add temporary directory context manager
2018-09-09 18:10:55 +05:30
Vinayak Mehta
9a6ed555c8
Fix get_rotation
2018-09-09 10:04:54 +05:30
Vinayak Mehta
9878de4dfc
Add docstrings and update docs
2018-09-09 10:00:22 +05:30
Vinayak Mehta
04383920b4
Rename parser keyword arguments
2018-09-08 05:38:43 +05:30
Vinayak Mehta
b3f840bba9
Change utils function names
2018-09-07 06:04:45 +05:30
Vinayak Mehta
20acda2259
Fix current logging
2018-09-07 05:53:19 +05:30
Vinayak Mehta
b91df8a1b8
Create parsers module
2018-09-06 06:13:58 +05:30
Vinayak Mehta
96af09d9cd
Add BaseParser and refactor extract_tables
2018-09-06 05:28:34 +05:30
Vinayak Mehta
a4d3165e94
Add docstring stubs
2018-09-05 19:35:46 +05:30
Vinayak Mehta
bf63432494
Remove docstrings
2018-09-05 19:04:40 +05:30
Vinayak Mehta
9124e3374c
Add properties to Table
2018-09-05 18:20:46 +05:30
Vinayak Mehta
e252e476b9
Add better y-cuts detection
2017-04-25 18:44:53 +05:30
Vinayak Mehta
5c5bd6199c
Fix warnings and exceptions
2017-04-21 14:20:33 +05:30
Vinayak Mehta
4da754ddcb
[ENH] Add OCR and better joint detection
...
* Add iterations for dilation
* Add OCRLattice and OCRStream
* Add debug
2017-04-18 18:25:47 +05:30
Vinayak Mehta
72233f25ce
Parameterize thresholding blocksize and constant
2017-04-10 21:15:54 +05:30
Vinayak Mehta
bc86346154
Don't let processes modify instance attributes
2017-02-07 22:13:33 +05:30
Vinayak Mehta
70f626373b
Cosmits
...
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
2017-01-07 15:58:45 +05:30
Vinayak Mehta
b01edee337
Handle rotation at entry
2016-10-18 15:33:38 +05:30
Vinayak Mehta
2a203a1865
Log warning when len(header) != len(cols)
2016-10-17 18:16:39 +05:30
Vinayak Mehta
40d30c1ab9
Add superscript and subscript flagging
...
* Add superscript flagging
* Add flagging param
* Add np.round to account for rotation error
2016-10-12 19:27:18 +05:30
Vinayak Mehta
a43d5ca2c7
Replace chars with textlines
...
* Add split function
* Add split_text and shift_text params
* Change get_rotation
* Move get_column_index to utils
* Add split_text and shift_text
* Fix split_text
2016-10-12 13:17:02 +05:30
Vinayak Mehta
4b8e96a86a
Update docs
...
* Update README
* Update index.rst
* Update docstrings
* Fix typo
* Edit docs
* Add error messages
2016-10-04 17:50:48 +05:30
Vinayak Mehta
79afb45e2e
Support for vertical tables in Stream
...
* Change var names
* Add test pdf
* Add tests for Lattice rotation
* Add support for vertical tables in Stream, test pdfs
* Add tests for Stream rotation
2016-09-15 20:51:59 +05:30
Vinayak Mehta
d86630e70b
Add table_area
...
[MRG] Add table_area
2016-09-05 18:51:59 +05:30
Vinayak Mehta
b2dd5f68fe
Fix vertical text detection in cells
...
* Fix vertical text detection in cells
* Add Cell instance method
* Change var names
2016-09-01 01:42:27 +05:30
Vinayak Mehta
552f9cf422
Add various metrics to score the quality of a parse
2016-08-30 14:52:49 +05:30
Vinayak Mehta
e9602bb353
Create python package
...
Add version support
Add new test file
[RFC] First phase
[RFC] Second phase
[RFC] Third phase
Add logging
Update README
Add debug
Add debug, fixes
Add pep8 changes
Add fix
Rename CLI tool
Add csv fix
Update README
Add fix for numpages
Update README
Update requirements.txt
Use yield
Add tuple unpacking fix
Fix n00b mistake
Add check for None
Fix check for None
Fix unicode
Add relative imports
2016-07-29 21:09:39 +05:30