Vinayak Mehta
|
d0005101a7
|
Add BaseParser docstring stub
|
2018-09-06 05:55:05 +05:30 |
Vinayak Mehta
|
96af09d9cd
|
Add BaseParser and refactor extract_tables
|
2018-09-06 05:28:34 +05:30 |
Vinayak Mehta
|
a4d3165e94
|
Add docstring stubs
|
2018-09-05 19:35:46 +05:30 |
Vinayak Mehta
|
bf63432494
|
Remove docstrings
|
2018-09-05 19:04:40 +05:30 |
Vinayak Mehta
|
08cbababca
|
Add properties to GeometryList
|
2018-09-05 19:00:30 +05:30 |
Vinayak Mehta
|
73e52939f5
|
Add parsing_report property
|
2018-09-05 18:50:10 +05:30 |
Vinayak Mehta
|
9124e3374c
|
Add properties to Table
|
2018-09-05 18:20:46 +05:30 |
Vinayak Mehta
|
b9d77cb983
|
Decouple debug geometry from tables
|
2018-09-05 15:18:31 +05:30 |
Vinayak Mehta
|
941994f0bf
|
Make present code work with new API
|
2018-09-04 23:34:49 +05:30 |
Vinayak Mehta
|
e3aabb720f
|
Add stream and lattice to parsers
|
2018-09-04 21:28:37 +05:30 |
Vinayak Mehta
|
5d29f0c21c
|
Move Pdf class to core as FileHandler
|
2018-09-04 07:02:30 +05:30 |
Vinayak Mehta
|
0c9e21d881
|
Update README
|
2018-09-04 03:53:30 +05:30 |
Vinayak Mehta
|
c689735da2
|
Move cell and table to core
|
2018-09-04 03:49:43 +05:30 |
Vinayak Mehta
|
ae64264d3e
|
Update README and requirements
|
2018-09-02 19:04:24 +05:30 |
Vinayak Mehta
|
d65ee180e5
|
Update README
|
2018-09-01 16:26:15 +05:30 |
Vinayak Mehta
|
72c42c74db
|
Remove ocr
|
2018-09-01 16:23:54 +05:30 |
Vinayak Mehta
|
9753889ea2
|
Add option to specify end in page range
|
2017-08-16 14:53:15 +05:30 |
Vinayak Mehta
|
861ed0b64e
|
Fix lattice fill
|
2017-05-05 15:02:29 +05:30 |
Vinayak Mehta
|
e252e476b9
|
Add better y-cuts detection
|
2017-04-25 18:44:53 +05:30 |
Vinayak Mehta
|
76e1d32417
|
Add minor fix
Minor fix
|
2017-04-24 16:53:54 +05:30 |
Vinayak Mehta
|
bef33c75b1
|
Fix ValueError
|
2017-04-21 20:15:35 +05:30 |
Vinayak Mehta
|
fdb4b0d494
|
Update version
|
2017-04-21 15:41:32 +05:30 |
Vinayak Mehta
|
5c5bd6199c
|
Fix warnings and exceptions
|
2017-04-21 14:20:33 +05:30 |
Vinayak Mehta
|
18e1a799a1
|
Remove remove_empty
|
2017-04-21 13:22:37 +05:30 |
Vinayak Mehta
|
d28e4b8c1e
|
Change default value for iterations
|
2017-04-21 13:20:48 +05:30 |
Vinayak Mehta
|
4b3e7fb6f6
|
Add debug script
|
2017-04-18 18:32:18 +05:30 |
Vinayak Mehta
|
ae83972f80
|
Update README
|
2017-04-18 18:27:38 +05:30 |
Vinayak Mehta
|
4da754ddcb
|
[ENH] Add OCR and better joint detection
* Add iterations for dilation
* Add OCRLattice and OCRStream
* Add debug
|
2017-04-18 18:25:47 +05:30 |
Vinayak Mehta
|
dd909e2b53
|
Fix debug script
|
2017-04-11 20:26:01 +05:30 |
Vinayak Mehta
|
7246e1a73d
|
Parallelize pdf split
|
2017-04-11 18:30:05 +05:30 |
Vinayak Mehta
|
4a87a77003
|
Remove ncols
|
2017-04-11 15:50:12 +05:30 |
Vinayak Mehta
|
8e8f5bbb3b
|
Add zip of csvs option
|
2017-04-11 14:14:54 +05:30 |
Vinayak Mehta
|
72233f25ce
|
Parameterize thresholding blocksize and constant
|
2017-04-10 21:15:54 +05:30 |
Vinayak Mehta
|
8b07aa2702
|
Minor fixes
|
2017-04-10 19:08:39 +05:30 |
Vinayak Mehta
|
778366b2dd
|
Remove directory
|
2017-04-10 19:03:43 +05:30 |
Vinayak Mehta
|
84d354ba10
|
Add deepcopy and debug scripts
|
2017-04-10 18:59:48 +05:30 |
Vinayak Mehta
|
4dd0d2330e
|
Fix shift text
|
2017-03-21 16:04:55 +05:30 |
Vinayak Mehta
|
3651fb2347
|
Remove ncolumns everywhere
|
2017-03-01 19:53:48 +05:30 |
Vinayak Mehta
|
edcf770d93
|
Remove verbose option
|
2017-02-07 23:44:01 +05:30 |
Vinayak Mehta
|
3eb18ef199
|
More logs
|
2017-02-07 22:23:05 +05:30 |
Vinayak Mehta
|
bc86346154
|
Don't let processes modify instance attributes
|
2017-02-07 22:13:33 +05:30 |
Vinayak Mehta
|
970256e19d
|
Add OCR support for image based pdfs with lines
* Cosmetics
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
* Add image attribute in Table and Cell
* Add OCR
* Fix coordinates
* Add table_area
* Add ocr options to cli
* Add ocr docstring
* Add requirements
* Update README
|
2017-01-07 16:37:56 +05:30 |
Vinayak Mehta
|
70f626373b
|
Cosmetics
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
|
2017-01-07 15:58:45 +05:30 |
Vinayak Mehta
|
bd1d57a561
|
Update version
|
2017-01-07 15:50:20 +05:30 |
Vinayak Mehta
|
10eda3f204
|
Deprecate Stream ncolumns
|
2016-11-07 21:30:48 +05:30 |
Vinayak Mehta
|
72c2a0020f
|
Minor fix
|
2016-10-20 18:54:06 +05:30 |
Vinayak Mehta
|
ed44d603f5
|
Update README
|
2016-10-18 18:27:24 +05:30 |
Vinayak Mehta
|
5c6a74fb2a
|
Add new params
|
2016-10-18 18:23:35 +05:30 |
Vinayak Mehta
|
b01edee337
|
Handle rotation at entry
|
2016-10-18 15:33:38 +05:30 |
Vinayak Mehta
|
2a203a1865
|
Log warning when len(header) != len(cols)
|
2016-10-17 18:16:39 +05:30 |