81 Commits (077b3c3e0fe2b3bf7b59b586ed8026ef154e3860)

Author SHA1 Message Date
Vinayak Mehta 077b3c3e0f Fix setup.py 2018-09-11 06:00:04 +05:30
Vinayak Mehta 118aac47bc Merge pull request #99 from socialcopsdev/cli 2018-09-10 16:06:14 +05:30
    Add CLI
Vinayak Mehta 544e0c9c3f Update CLI help and README 2018-09-10 16:05:51 +05:30
Vinayak Mehta 7bb1aee9b6 Add CLI 2018-09-10 15:16:41 +05:30
Vinayak Mehta 1b013178a8 Add docstrings to table to_format methods 2018-09-09 18:41:40 +05:30
Vinayak Mehta d3beaafc99 Add temporary directory context manager 2018-09-09 18:10:55 +05:30
Vinayak Mehta 9a6ed555c8 Fix get_rotation 2018-09-09 10:04:54 +05:30
Vinayak Mehta 9878de4dfc Add docstrings and update docs 2018-09-09 10:00:22 +05:30
Vinayak Mehta c91a9bb36d Add future import 2018-09-09 05:36:07 +05:30
Vinayak Mehta 7c3e531b07 Port tests 2018-09-09 05:29:24 +05:30
Vinayak Mehta 04383920b4 Rename parser keyword arguments 2018-09-08 05:38:43 +05:30
Vinayak Mehta e615580e55 Fix plot_geometry 2018-09-07 06:25:13 +05:30
Vinayak Mehta b3f840bba9 Change utils function names 2018-09-07 06:04:45 +05:30
Vinayak Mehta 20acda2259 Fix current logging 2018-09-07 05:53:19 +05:30
Vinayak Mehta 09ac8f4640 Add property n to TableList 2018-09-07 05:17:09 +05:30
Vinayak Mehta 0c329634e7 Add export to TableList and Table 2018-09-07 05:13:34 +05:30
Vinayak Mehta 557189da24 Refactor core 2018-09-06 07:42:41 +05:30
Vinayak Mehta ffeb853c55 Rename plot.py to plotting.py 2018-09-06 06:21:54 +05:30
Vinayak Mehta 42d7a4ac02 Add import os 2018-09-06 06:15:13 +05:30
Vinayak Mehta b91df8a1b8 Create parsers module 2018-09-06 06:13:58 +05:30
Vinayak Mehta d0005101a7 Add BaseParser docstring stub 2018-09-06 05:55:05 +05:30
Vinayak Mehta 96af09d9cd Add BaseParser and refactor extract_tables 2018-09-06 05:28:34 +05:30
Vinayak Mehta a4d3165e94 Add docstring stubs 2018-09-05 19:35:46 +05:30
Vinayak Mehta bf63432494 Remove docstrings 2018-09-05 19:04:40 +05:30
Vinayak Mehta 08cbababca Add properties to GeometryList 2018-09-05 19:00:30 +05:30
Vinayak Mehta 73e52939f5 Add parsing_report property 2018-09-05 18:50:10 +05:30
Vinayak Mehta 9124e3374c Add properties to Table 2018-09-05 18:20:46 +05:30
Vinayak Mehta b9d77cb983 Decouple debug geometry from tables 2018-09-05 15:18:31 +05:30
Vinayak Mehta 941994f0bf Make present code work with new API 2018-09-04 23:34:49 +05:30
Vinayak Mehta e3aabb720f Add stream and lattice to parsers 2018-09-04 21:28:37 +05:30
Vinayak Mehta 5d29f0c21c Move Pdf class to core as FileHandler 2018-09-04 07:02:30 +05:30
Vinayak Mehta c689735da2 Move cell and table to core 2018-09-04 03:49:43 +05:30
Vinayak Mehta 72c42c74db Remove ocr 2018-09-01 16:23:54 +05:30
Vinayak Mehta 861ed0b64e Fix lattice fill 2017-05-05 15:02:29 +05:30
Vinayak Mehta e252e476b9 Add better y-cuts detection 2017-04-25 18:44:53 +05:30
Vinayak Mehta 76e1d32417 Add minor fix 2017-04-24 16:53:54 +05:30
    Minor fix
Vinayak Mehta bef33c75b1 Fix ValueError 2017-04-21 20:15:35 +05:30
Vinayak Mehta fdb4b0d494 Update version 2017-04-21 15:41:32 +05:30
Vinayak Mehta 5c5bd6199c Fix warnings and exceptions 2017-04-21 14:20:33 +05:30
Vinayak Mehta 18e1a799a1 Remove remove_empty 2017-04-21 13:22:37 +05:30
Vinayak Mehta d28e4b8c1e Change default value for iterations 2017-04-21 13:20:48 +05:30
Vinayak Mehta 4da754ddcb [ENH] Add OCR and better joint detection 2017-04-18 18:25:47 +05:30
    * Add iterations for dilation
    * Add OCRLattice and OCRStream
    * Add debug
Vinayak Mehta 7246e1a73d Parallelize pdf split 2017-04-11 18:30:05 +05:30
Vinayak Mehta 4a87a77003 Remove ncols 2017-04-11 15:50:12 +05:30
Vinayak Mehta 72233f25ce Parameterize thresholding blocksize and constant 2017-04-10 21:15:54 +05:30
Vinayak Mehta 84d354ba10 Add deepcopy and debug scripts 2017-04-10 18:59:48 +05:30
Vinayak Mehta 3eb18ef199 More logs 2017-02-07 22:23:05 +05:30
Vinayak Mehta bc86346154 Don't let processes modify instance attributes 2017-02-07 22:13:33 +05:30
Vinayak Mehta 970256e19d Add OCR support for image based pdfs with lines 2017-01-07 16:37:56 +05:30
    * Cosmits
    * Remove unnecessary kwargs
    * Direct ghostscript call output to /dev/null
    * Change char_margin's default value
    * Add image attribute in Table and Cell
    * Add OCR
    * Fix coordinates
    * Add table_area
    * Add ocr options to cli
    * Direct ghostscript call output to /dev/null
    * Add ocr dostring
    * Add requirements
    * Update README
Vinayak Mehta 70f626373b Cosmits 2017-01-07 15:58:45 +05:30
    * Remove unnecessary kwargs
    * Direct ghostscript call output to /dev/null
    * Change char_margin's default value