216 Commits (58b2c1d0fd88cf7a2c725425eb514f2c3d11fabe)

Author SHA1 Message Date
Frh 58b2c1d0fd Define TextEdge as a bounded TextAlignment 2020-04-23 18:26:55 -07:00
Frh 5db49d4fde More refactoring across stream and hybrid.
Stream now much faster, whole test is 72s instead of 92s
2020-04-23 14:42:13 -07:00
Frh adb14d3522 Refactoring TextEdges code across hybrid and stream 2020-04-23 12:55:09 -07:00
Frh 414708d8c7 Move generic code to utils 2020-04-22 19:08:06 -07:00
Frh 36d5a09ad6 Refactor common code hybrid / stream 2020-04-22 17:33:15 -07:00
Frh 489e996bd8 Address last unit test 2020-04-22 16:02:49 -07:00
Frh 7b0ac03f8e Prefer showing diffs at the row level 2020-04-22 14:50:45 -07:00
Frh df3d28837d Loosen cells header expansion algorithm
Accept cells if they're at least 50% within the table's bounds.
2020-04-22 14:24:47 -07:00
Frh 0be58de1cb Fix in table diff 2020-04-22 14:23:52 -07:00
Frh 9a82408a9a Prettier plotting, improve gaps calculation 2020-04-22 14:08:22 -07:00
Frh cd338ff4e2 Draw parse constraints for easier debug
* Display regions and areas rectangles
2020-04-21 14:24:44 -07:00
Frh ad27a11d35 Refactor code in plotting 2020-04-21 13:57:12 -07:00
Frh fb69bd9299 Improve hybrid plotting
* plot info passed through debug_info
* display each text edge
2020-04-20 16:54:06 -07:00
Frh 175655d31b Add support for region/area for hybrid 2020-04-20 11:20:59 -07:00
Frh 57c5957bad Interim check-in, test failing and lots of todos 2020-04-19 18:26:38 -07:00
Frh 69c7728867 More linting 2020-04-19 17:05:33 -07:00
Frh 89fe090ec4 Linting 2020-04-19 16:40:14 -07:00
Frh d520a77bb7 Initial Hybrid parser, for now identical to Stream 2020-04-19 16:27:01 -07:00
Frh 58823e57e9 More refactoring / linting 2020-04-19 15:41:45 -07:00
Frh c27a8026d6 More linting, refactor 2020-04-19 14:42:18 -07:00
Frh 50f11867af Lint, refactor 2020-04-19 14:30:32 -07:00
Frh cff7a9698b Further refactor
Move common parse error stats computation to base parser
Move copy_spanning_text logic to the table
2020-04-19 13:28:17 -07:00
Frh 583868756a Prep work for new hybrid parser introduction
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f"{}"
2020-04-19 11:32:22 -07:00
Frh 697289e409 Refactor base classes and improve plotting
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-04-18 23:03:27 -07:00
Frh 816471e426 Fix unit tests, lint, drop Python 2 support
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Py.
Linting.
2020-04-18 17:25:47 -07:00
Dimiter Naydenov b2929a9e92 Merge pull request #34 from KOLANICH/win_ghostscript_callback_fix
Fixed calling convention of callback functions
2019-07-24 13:39:18 +03:00
KOLANICH 5687fbc8b2 Fixed calling convention of callback functions 2019-07-16 21:08:34 +03:00
KOLANICH 9e356b1b0a Fixed library discovery on Windows 2019-07-16 21:07:23 +03:00
Vinayak Mehta 0efb3ca1b0 Update HISTORY.md and bump version 2019-07-07 16:07:28 +05:30
Vinayak Mehta a97b50ef21 Update flavor kwargs 2019-07-06 22:59:51 +05:30
Dimiter Naydenov 0f8cda4793 Merge pull request #5 from camelot-dev/fix-cli-group-name
[MRG] No need to monkey-patch Click.HelpFormatter
2019-07-04 18:26:35 +03:00
Dimiter Naydenov 13616c2fb4 No need to monkey-patch Click.HelpFormatter 2019-07-04 13:13:32 +03:00
Dimiter Naydenov 240ea6c411 Fixed strip_text argument getting ignored 2019-07-04 12:12:52 +03:00
Vinayak Mehta 16ddd10644 Update image_processing.py 2019-07-04 00:06:46 +05:30
Vinayak Mehta 2115a0e177 Blacken code 2019-07-03 23:47:42 +05:30
Vinayak Mehta de3281c1b6 Add test 2019-05-27 22:18:23 +05:30
Vinayak Mehta b2a8348f13 Fix #312 2019-05-26 17:13:59 +05:30
Vinayak Mehta 355ae818a0 Merge branch 'master' into fix-split-bug 2019-04-20 21:06:47 +05:30
Vinayak Mehta ce727d9558 Fix split text bug 2019-03-22 02:28:29 +05:30
Sym Roe 8446271aa4 Always sort TableList after reading PDF 2019-02-25 09:48:47 +00:00
Sym Roe c019e582bf Add __lt__ to Table to allow sorting
Refs #277
2019-02-25 09:20:09 +00:00
yatintaluja 6c4b468800 Fix #245 2019-01-16 16:33:17 +05:30
yatintaluja 5330620ea2 Bump version 2019-01-16 16:30:05 +05:30
Vinayak Mehta 45ae980988 Bump version 2019-01-06 13:00:08 +05:30
Vinayak Mehta 215e5ea2a5 Move ghostscript import 2019-01-06 01:50:54 +05:30
Vinayak Mehta 9d38b2f5af Bump version 2019-01-05 13:23:31 +05:30
Vinayak Mehta ab5391c76f Merge branch 'master' of github.com:socialcopsdev/camelot into replace-gs-c-api 2019-01-05 11:22:38 +05:30
Vinayak Mehta 506cec7f6b Add sqlite support 2019-01-05 01:50:27 +05:30
Vinayak Mehta f94777038a Update stream table regions logic 2019-01-04 20:27:53 +05:30
Vinayak Mehta eaca147b9d Apply mask at threshold level 2019-01-04 20:15:41 +05:30
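
Note on df3d28837d ("Loosen cells header expansion algorithm"): the message says cells are accepted if they are at least 50% within the table's bounds. The commit itself is not shown here, so the following is only a minimal sketch of one way such a rule could be checked. It assumes axis-aligned bounding boxes given as (x0, y0, x1, y1) tuples and measures "within" by overlapping area; the function names and the `min_fraction` parameter are hypothetical and are not taken from the repository.

```python
def fraction_inside(cell, table):
    """Return the fraction of the cell's area that lies inside the table box.

    Both arguments are hypothetical (x0, y0, x1, y1) tuples with x0 <= x1
    and y0 <= y1; this layout is an assumption, not the project's API.
    """
    cx0, cy0, cx1, cy1 = cell
    tx0, ty0, tx1, ty1 = table
    cell_area = (cx1 - cx0) * (cy1 - cy0)
    if cell_area <= 0:
        # Degenerate cell: treat it as not overlapping at all.
        return 0.0
    # Width and height of the intersection rectangle, clamped at zero.
    overlap_w = max(0.0, min(cx1, tx1) - max(cx0, tx0))
    overlap_h = max(0.0, min(cy1, ty1) - max(cy0, ty0))
    return (overlap_w * overlap_h) / cell_area


def accept_cell(cell, table, min_fraction=0.5):
    """Accept the cell when at least min_fraction of its area is in bounds."""
    return fraction_inside(cell, table) >= min_fraction


table_bbox = (0, 0, 100, 100)
half_in = (90, 10, 110, 20)      # half of this cell's width is inside
mostly_out = (95, 10, 115, 20)   # only a quarter of its width is inside

print(accept_cell(half_in, table_bbox))
print(accept_cell(mostly_out, table_bbox))
```

A cell straddling the table edge is kept as long as half of its area is inside, which is looser than requiring full containment. The code avoids f-strings, in keeping with the Python 3.5 compatibility note in 583868756a.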