Vinayak Mehta
|
20acda2259
|
Fix current logging
|
2018-09-07 05:53:19 +05:30 |
Vinayak Mehta
|
09ac8f4640
|
Add property n to TableList
|
2018-09-07 05:17:09 +05:30 |
Vinayak Mehta
|
0c329634e7
|
Add export to TableList and Table
|
2018-09-07 05:13:34 +05:30 |
Vinayak Mehta
|
557189da24
|
Refactor core
|
2018-09-06 07:42:41 +05:30 |
Vinayak Mehta
|
ffeb853c55
|
Rename plot.py to plotting.py
|
2018-09-06 06:21:54 +05:30 |
Vinayak Mehta
|
42d7a4ac02
|
Add import os
|
2018-09-06 06:15:13 +05:30 |
Vinayak Mehta
|
b91df8a1b8
|
Create parsers module
|
2018-09-06 06:13:58 +05:30 |
Vinayak Mehta
|
d0005101a7
|
Add BaseParser docstring stub
|
2018-09-06 05:55:05 +05:30 |
Vinayak Mehta
|
96af09d9cd
|
Add BaseParser and refactor extract_tables
|
2018-09-06 05:28:34 +05:30 |
Vinayak Mehta
|
a4d3165e94
|
Add docstring stubs
|
2018-09-05 19:35:46 +05:30 |
Vinayak Mehta
|
bf63432494
|
Remove docstrings
|
2018-09-05 19:04:40 +05:30 |
Vinayak Mehta
|
08cbababca
|
Add properties to GeometryList
|
2018-09-05 19:00:30 +05:30 |
Vinayak Mehta
|
73e52939f5
|
Add parsing_report property
|
2018-09-05 18:50:10 +05:30 |
Vinayak Mehta
|
9124e3374c
|
Add properties to Table
|
2018-09-05 18:20:46 +05:30 |
Vinayak Mehta
|
b9d77cb983
|
Decouple debug geometry from tables
|
2018-09-05 15:18:31 +05:30 |
Vinayak Mehta
|
941994f0bf
|
Make present code work with new API
|
2018-09-04 23:34:49 +05:30 |
Vinayak Mehta
|
e3aabb720f
|
Add stream and lattice to parsers
|
2018-09-04 21:28:37 +05:30 |
Vinayak Mehta
|
5d29f0c21c
|
Move Pdf class to core as FileHandler
|
2018-09-04 07:02:30 +05:30 |
Vinayak Mehta
|
c689735da2
|
Move cell and table to core
|
2018-09-04 03:49:43 +05:30 |
Vinayak Mehta
|
72c42c74db
|
Remove ocr
|
2018-09-01 16:23:54 +05:30 |
Vinayak Mehta
|
861ed0b64e
|
Fix lattice fill
|
2017-05-05 15:02:29 +05:30 |
Vinayak Mehta
|
e252e476b9
|
Add better y-cuts detection
|
2017-04-25 18:44:53 +05:30 |
Vinayak Mehta
|
76e1d32417
|
Add minor fix
Minor fix
|
2017-04-24 16:53:54 +05:30 |
Vinayak Mehta
|
bef33c75b1
|
Fix ValueError
|
2017-04-21 20:15:35 +05:30 |
Vinayak Mehta
|
fdb4b0d494
|
Update version
|
2017-04-21 15:41:32 +05:30 |
Vinayak Mehta
|
5c5bd6199c
|
Fix warnings and exceptions
|
2017-04-21 14:20:33 +05:30 |
Vinayak Mehta
|
18e1a799a1
|
Remove remove_empty
|
2017-04-21 13:22:37 +05:30 |
Vinayak Mehta
|
d28e4b8c1e
|
Change default value for iterations
|
2017-04-21 13:20:48 +05:30 |
Vinayak Mehta
|
4da754ddcb
|
[ENH] Add OCR and better joint detection
* Add iterations for dilation
* Add OCRLattice and OCRStream
* Add debug
|
2017-04-18 18:25:47 +05:30 |
Vinayak Mehta
|
7246e1a73d
|
Parallelize pdf split
|
2017-04-11 18:30:05 +05:30 |
Vinayak Mehta
|
4a87a77003
|
Remove ncols
|
2017-04-11 15:50:12 +05:30 |
Vinayak Mehta
|
72233f25ce
|
Parameterize thresholding blocksize and constant
|
2017-04-10 21:15:54 +05:30 |
Vinayak Mehta
|
84d354ba10
|
Add deepcopy and debug scripts
|
2017-04-10 18:59:48 +05:30 |
Vinayak Mehta
|
3eb18ef199
|
More logs
|
2017-02-07 22:23:05 +05:30 |
Vinayak Mehta
|
bc86346154
|
Don't let processes modify instance attributes
|
2017-02-07 22:13:33 +05:30 |
Vinayak Mehta
|
970256e19d
|
Add OCR support for image based pdfs with lines
* Cosmits
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
* Add image attribute in Table and Cell
* Add OCR
* Fix coordinates
* Add table_area
* Add ocr options to cli
* Direct ghostscript call output to /dev/null
* Add ocr docstring
* Add requirements
* Update README
|
2017-01-07 16:37:56 +05:30 |
Vinayak Mehta
|
70f626373b
|
Cosmits
* Remove unnecessary kwargs
* Direct ghostscript call output to /dev/null
* Change char_margin's default value
|
2017-01-07 15:58:45 +05:30 |
Vinayak Mehta
|
bd1d57a561
|
Update version
|
2017-01-07 15:50:20 +05:30 |
Vinayak Mehta
|
10eda3f204
|
Deprecate Stream ncolumns
|
2016-11-07 21:30:48 +05:30 |
Vinayak Mehta
|
72c2a0020f
|
Minor fix
|
2016-10-20 18:54:06 +05:30 |
Vinayak Mehta
|
5c6a74fb2a
|
Add new params
|
2016-10-18 18:23:35 +05:30 |
Vinayak Mehta
|
b01edee337
|
Handle rotation at entry
|
2016-10-18 15:33:38 +05:30 |
Vinayak Mehta
|
2a203a1865
|
Log warning when len(header) != len(cols)
|
2016-10-17 18:16:39 +05:30 |
Vinayak Mehta
|
adb948d363
|
Fix column parameter
|
2016-10-13 16:54:45 +05:30 |
Vinayak Mehta
|
40d30c1ab9
|
Add superscript and subscript flagging
* Add superscript flagging
* Add flagging param
* Add np.round to account for rotation error
|
2016-10-12 19:27:18 +05:30 |
Vinayak Mehta
|
e8b93a9624
|
Add headers param
|
2016-10-12 13:59:10 +05:30 |
Vinayak Mehta
|
a43d5ca2c7
|
Replace chars with textlines
* Add split function
* Add split_text and shift_text params
* Change get_rotation
* Move get_column_index to utils
* Add split_text and shift_text
* Fix split_text
|
2016-10-12 13:17:02 +05:30 |
Vinayak Mehta
|
52a2876ab1
|
Fix tarea type conversion
|
2016-10-04 19:57:53 +05:30 |
Vinayak Mehta
|
4b8e96a86a
|
Update docs
* Update README
* Update index.rst
* Update docstrings
* Fix typo
* Edit docs
* Add error messages
|
2016-10-04 17:50:48 +05:30 |
Vinayak Mehta
|
d46eeeab1a
|
Change jpg to png
|
2016-09-27 18:37:38 +05:30 |