Jonathan Lloyd
3def4a5aea
[MRG + 1] Add suppress_warnings flag ( #155 )
...
* Add suppress_warnings flag
* Add --quiet flag to cli (to suppress warnings)
* Remove TODO and update comment
2018-10-19 16:55:00 +05:30
Vinayak Mehta
45e7f7570e
Bump version
2018-10-08 03:54:21 +05:30
Vinayak Mehta
fe68328ef2
Move opencv-python to extra_requires ( #134 )
2018-10-08 01:10:48 +05:30
Vinayak Mehta
9b2fc53e58
Bump version
2018-10-05 20:22:46 +05:30
Vaibhav Mule
c53ea795fd
[MRG + 1] Add tests for repr ( #128 )
...
* add tests for repr
* remove repr for Cell
* add round for repr of Cell
* change decimal places to 2
* change tests for 2 decimal places
2018-10-05 20:19:24 +05:30
Oshawk
90aaba6eec
[MRG + 1] Make pep8 ( #125 )
...
* Make setup.py pep8
Add new line at end of file, fix bare except, remove unused import.
* Make tests/*.py pep8
Add some newlines at end of files and a visual indent.
* Make docs/*.py pep8
Fix block comments and add new lines at end of files.
* Make camelot/*.py pep8
Fixed unused import, a few weirdly ordered imports, a docstring typo and many new lines at the end of files.
* Fix imports
Fix import order and remove a couple more unused imports.
* Fix indents
Fix indentation (no opening delimiter alignment).
* Add newlines
2018-10-05 16:55:43 +05:30
Vinayak Mehta
6e8079df84
[MRG] Add tests for output formats and parser kwargs ( #126 )
...
* Remove unused image processing code
* Add opencv back-compat comment
* Add tests for parser special cases
* Fix lattice table area test
* Add tests for output format
* Add openpyxl dep
2018-10-05 16:15:30 +05:30
Vinayak Mehta
cf7823f33c
[MRG] Add ghostscript fix for windows ( #124 )
...
* Add ghostscript fix for windows
* Add python2 fix
* Update install.rst
2018-10-05 02:06:37 +05:30
Vinayak Mehta
c5bde5e2ad
[MRG] Add error/warning tests ( #113 )
...
* Add unknown flavor test
* Add input kwargs test
* Remove unused utils
* Add unsupported format test
* Add stream unequal tables-columns length test
* Add python3 compat
* Add no tables found test
* Convert util info log to warning
2018-10-02 19:28:42 +05:30
Vinayak Mehta
fc0542bd3c
Add Python 3 compatibility ( #109 )
...
* Add python3 compat
* Update .gitignore
* Update .gitignore again
* Remove debugging return
* Add unicode_literals import
* Bump version
* Add python3-tk note
2018-09-28 21:58:29 +05:30
Vinayak Mehta
dfb0d4fb4c
Fix TableList repr
2018-09-27 04:42:23 +05:30
Vinayak Mehta
759e635a3c
Bump version
2018-09-25 12:32:01 +05:30
Vinayak Mehta
7731497a5b
Fix relative links
...
Fix broken links
2018-09-24 22:15:43 +05:30
Vinayak Mehta
be2733ebd2
Add utf8 header
2018-09-24 16:27:26 +05:30
Vinayak Mehta
93b4dabcc2
Update CLI
2018-09-24 01:00:30 +05:30
Vinayak Mehta
a70befe528
Update docs
2018-09-23 14:04:21 +05:30
Vinayak Mehta
959a252aa3
Fix CLI
2018-09-23 12:45:01 +05:30
Vinayak Mehta
7aaa7b2460
Deprecate debug and add plot docstrings
2018-09-23 11:56:40 +05:30
Vinayak Mehta
71d91fbebd
Fix plot_text
2018-09-23 11:45:20 +05:30
Vinayak Mehta
3170a9689f
Add flavors
2018-09-23 10:53:32 +05:30
Vinayak Mehta
021aca8f97
Update __version__.py
2018-09-15 03:34:04 +05:30
Vinayak Mehta
a4fcdc7781
Add advanced guide illustrations
2018-09-13 21:12:25 +05:30
Vinayak Mehta
3a980a46c1
Add quickstart
2018-09-13 15:50:30 +05:30
Vinayak Mehta
0ba3469d21
Add Stream benchmarks
2018-09-12 07:21:35 +05:30
Vinayak Mehta
b276909a4f
Add Lattice benchmarks
2018-09-12 05:58:22 +05:30
Vinayak Mehta
094be1a1dd
Add better table detection image
2018-09-12 02:29:25 +05:30
Vinayak Mehta
dc533e73e2
Add agstat to benchmark
2018-09-12 02:05:34 +05:30
Vinayak Mehta
17ea5f335e
Fix docstrings and interlinks
2018-09-11 08:31:37 +05:30
Vinayak Mehta
656808b8e2
Fix setup.py
2018-09-11 08:31:37 +05:30
Vinayak Mehta
118aac47bc
Merge pull request #99 from socialcopsdev/cli
...
Add CLI
2018-09-10 16:06:14 +05:30
Vinayak Mehta
544e0c9c3f
Update CLI help and README
2018-09-10 16:05:51 +05:30
Vinayak Mehta
7bb1aee9b6
Add CLI
2018-09-10 15:16:41 +05:30
Vinayak Mehta
1b013178a8
Add docstrings to table to_format methods
2018-09-09 18:41:40 +05:30
Vinayak Mehta
d3beaafc99
Add temporary directory context manager
2018-09-09 18:10:55 +05:30
Vinayak Mehta
9a6ed555c8
Fix get_rotation
2018-09-09 10:04:54 +05:30
Vinayak Mehta
9878de4dfc
Add docstrings and update docs
2018-09-09 10:00:22 +05:30
Vinayak Mehta
c91a9bb36d
Add future import
2018-09-09 05:36:07 +05:30
Vinayak Mehta
7c3e531b07
Port tests
2018-09-09 05:29:24 +05:30
Vinayak Mehta
04383920b4
Rename parser keyword arguments
2018-09-08 05:38:43 +05:30
Vinayak Mehta
e615580e55
Fix plot_geometry
2018-09-07 06:25:13 +05:30
Vinayak Mehta
b3f840bba9
Change utils function names
2018-09-07 06:04:45 +05:30
Vinayak Mehta
20acda2259
Fix current logging
2018-09-07 05:53:19 +05:30
Vinayak Mehta
09ac8f4640
Add property n to TableList
2018-09-07 05:17:09 +05:30
Vinayak Mehta
0c329634e7
Add export to TableList and Table
2018-09-07 05:13:34 +05:30
Vinayak Mehta
557189da24
Refactor core
2018-09-06 07:42:41 +05:30
Vinayak Mehta
ffeb853c55
Rename plot.py to plotting.py
2018-09-06 06:21:54 +05:30
Vinayak Mehta
42d7a4ac02
Add import os
2018-09-06 06:15:13 +05:30
Vinayak Mehta
b91df8a1b8
Create parsers module
2018-09-06 06:13:58 +05:30
Vinayak Mehta
d0005101a7
Add BaseParser docstring stub
2018-09-06 05:55:05 +05:30
Vinayak Mehta
96af09d9cd
Add BaseParser and refactor extract_tables
2018-09-06 05:28:34 +05:30
Vinayak Mehta
a4d3165e94
Add docstring stubs
2018-09-05 19:35:46 +05:30
Vinayak Mehta
bf63432494
Remove docstrings
2018-09-05 19:04:40 +05:30
Vinayak Mehta
08cbababca
Add properties to GeometryList
2018-09-05 19:00:30 +05:30
Vinayak Mehta
73e52939f5
Add parsing_report property
2018-09-05 18:50:10 +05:30
Vinayak Mehta
9124e3374c
Add properties to Table
2018-09-05 18:20:46 +05:30
Vinayak Mehta
b9d77cb983
Decouple debug geometry from tables
2018-09-05 15:18:31 +05:30
Vinayak Mehta
941994f0bf
Make present code work with new API
2018-09-04 23:34:49 +05:30
Vinayak Mehta
e3aabb720f
Add stream and lattice to parsers
2018-09-04 21:28:37 +05:30
Vinayak Mehta
5d29f0c21c
Move Pdf class to core as FileHandler
2018-09-04 07:02:30 +05:30
Vinayak Mehta
c689735da2
Move cell and table to core
2018-09-04 03:49:43 +05:30
Vinayak Mehta
72c42c74db
Remove ocr
2018-09-01 16:23:54 +05:30
Vinayak Mehta
861ed0b64e
Fix lattice fill
2017-05-05 15:02:29 +05:30
Vinayak Mehta
e252e476b9
Add better y-cuts detection
2017-04-25 18:44:53 +05:30
Vinayak Mehta
76e1d32417
Add minor fix
...
Minor fix
2017-04-24 16:53:54 +05:30
Vinayak Mehta
bef33c75b1
Fix ValueError
2017-04-21 20:15:35 +05:30
Vinayak Mehta
fdb4b0d494
Update version
2017-04-21 15:41:32 +05:30
Vinayak Mehta
5c5bd6199c
Fix warnings and exceptions
2017-04-21 14:20:33 +05:30
Vinayak Mehta
18e1a799a1
Remove remove_empty
2017-04-21 13:22:37 +05:30
Vinayak Mehta
d28e4b8c1e
Change default value for iterations
2017-04-21 13:20:48 +05:30
Vinayak Mehta
4da754ddcb
[ENH] Add OCR and better joint detection
...
* Add iterations for dilation
* Add OCRLattice and OCRStream
* Add debug
2017-04-18 18:25:47 +05:30
Vinayak Mehta
7246e1a73d
Parallelize pdf split
2017-04-11 18:30:05 +05:30
Vinayak Mehta
4a87a77003
Remove ncols
2017-04-11 15:50:12 +05:30
Vinayak Mehta
72233f25ce
Parameterize thresholding blocksize and constant
2017-04-10 21:15:54 +05:30
Vinayak Mehta
84d354ba10
Add deepcopy and debug scripts
2017-04-10 18:59:48 +05:30
Vinayak Mehta
3eb18ef199
More logs
2017-02-07 22:23:05 +05:30
Vinayak Mehta
bc86346154
Don't let processes modify instance attributes
2017-02-07 22:13:33 +05:30
Vinayak Mehta
970256e19d
Add OCR support for image based pdfs with lines
...
* Cosmits
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
* Add image attribute in Table and Cell
* Add OCR
* Fix coordinates
* Add table_area
* Add ocr options to cli
* Direct ghostscript call output to /dev/null
* Add ocr docstring
* Add requirements
* Update README
2017-01-07 16:37:56 +05:30
Vinayak Mehta
70f626373b
Cosmits
...
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
2017-01-07 15:58:45 +05:30
Vinayak Mehta
bd1d57a561
Update version
2017-01-07 15:50:20 +05:30
Vinayak Mehta
10eda3f204
Deprecate Stream ncolumns
2016-11-07 21:30:48 +05:30
Vinayak Mehta
72c2a0020f
Minor fix
2016-10-20 18:54:06 +05:30
Vinayak Mehta
5c6a74fb2a
Add new params
2016-10-18 18:23:35 +05:30
Vinayak Mehta
b01edee337
Handle rotation at entry
2016-10-18 15:33:38 +05:30
Vinayak Mehta
2a203a1865
Log warning when len(header) != len(cols)
2016-10-17 18:16:39 +05:30
Vinayak Mehta
adb948d363
Fix column parameter
2016-10-13 16:54:45 +05:30
Vinayak Mehta
40d30c1ab9
Add superscript and subscript flagging
...
* Add superscript flagging
* Add flagging param
* Add np.round to account for rotation error
2016-10-12 19:27:18 +05:30
Vinayak Mehta
e8b93a9624
Add headers param
2016-10-12 13:59:10 +05:30
Vinayak Mehta
a43d5ca2c7
Replace chars with textlines
...
* Add split function
* Add split_text and shift_text params
* Change get_rotation
* Move get_column_index to utils
* Add split_text and shift_text
* Fix split_text
2016-10-12 13:17:02 +05:30
Vinayak Mehta
52a2876ab1
Fix tarea type conversion
2016-10-04 19:57:53 +05:30
Vinayak Mehta
4b8e96a86a
Update docs
...
* Update README
* Update index.rst
* Update docstrings
* Fix typo
* Edit docs
* Add error messages
2016-10-04 17:50:48 +05:30
Vinayak Mehta
d46eeeab1a
Change jpg to png
2016-09-27 18:37:38 +05:30
Vinayak Mehta
75c7deffaa
Minor Stream fix
2016-09-27 17:27:34 +05:30
Vinayak Mehta
79afb45e2e
Support for vertical tables in Stream
...
* Change var names
* Add test pdf
* Add tests for Lattice rotation
* Add support for vertical tables in Stream, test pdfs
* Add tests for Stream rotation
2016-09-15 20:51:59 +05:30
Vinayak Mehta
8ce7b74671
Replace imagemagick with ghostscript
...
* Replace imagemagick with ghostscript
* Add quiet option
* Avoid repetition
* Remove Wand requirement
* Replace jpeg with png
2016-09-13 17:35:07 +05:30
Vinayak Mehta
757ba0444a
Remove jtol
2016-09-13 17:28:21 +05:30
Vinayak Mehta
439059817d
Update tests with new API
...
* Update Lattice tests with new API
* Update Stream tests with new API, fix CLI
* Add table_area test, Stream fixes
2016-09-09 16:56:25 +05:30
Vinayak Mehta
a94c350a7b
Fix param flow
...
* Fix param flow
* Add check for None
2016-09-09 14:52:38 +05:30
Vinayak Mehta
766260d5d9
Remove hybrid.py
2016-09-08 21:17:24 +05:30
Vinayak Mehta
98f47d1bd7
Fix table_bbox when no tarea is given
2016-09-05 21:26:16 +05:30
Vinayak Mehta
d86630e70b
Add table_area
...
[MRG] Add table_area
2016-09-05 18:51:59 +05:30