Commit Graph

88 Commits (84ec5c6acd6ada91ee2f7a943acfa277e4465041)

Author SHA1 Message Date
Frh 84ec5c6acd Rename member for clarity, fixed unit test 2020-04-25 17:15:16 -07:00
    _textlines_alignments becomes _textline_to_alignments
Frh 22f4287788 Improve edgeplot for hybrid 2020-04-25 13:31:10 -07:00
Frh bb842f21b9 Further refactoring 2020-04-24 21:11:31 -07:00
Frh f42557ab8b Common parent TextBaseParser for Stream and Hybrid 2020-04-24 15:54:58 -07:00
Frh 5290fb6a7d Refactor out _text_bbox 2020-04-24 15:18:38 -07:00
Frh 8ad9e569cf Further simplification 2020-04-24 12:48:51 -07:00
Frh efe81292ca Enforce text_edge as subcase of text_alignment 2020-04-24 12:42:13 -07:00
    TextNetworks is a list of TextAlignments
Frh 58b2c1d0fd Define TextEdge as a bounded TextAlignment 2020-04-23 18:26:55 -07:00
Frh 5db49d4fde More refactoring across stream and hybrid. 2020-04-23 14:42:13 -07:00
    Stream now much faster, whole test is 72s instead of 92s
Frh adb14d3522 Refactoring TextEdges code across hybrid and stream 2020-04-23 12:55:09 -07:00
Frh 414708d8c7 Move generic code to utils 2020-04-22 19:08:06 -07:00
Frh 36d5a09ad6 Refactor common code hybrid / stream 2020-04-22 17:33:15 -07:00
Frh 489e996bd8 Address last unit test 2020-04-22 16:02:49 -07:00
Frh df3d28837d Loosen cells header expansion algorithm 2020-04-22 14:24:47 -07:00
    Accept cells if they're at least 50% within the table's bounds.
Frh 9a82408a9a Prettier plotting, improve gaps calculation 2020-04-22 14:08:22 -07:00
Frh fb69bd9299 Improve hybrid plotting 2020-04-20 16:54:06 -07:00
    * plot info passed through debug_info
    * display each text edge
Frh 175655d31b Add support for region/area for hybrid 2020-04-20 11:20:59 -07:00
Frh 57c5957bad Interim check-in, test failing and lots of todos 2020-04-19 18:26:38 -07:00
Frh 89fe090ec4 Linting 2020-04-19 16:40:14 -07:00
Frh d520a77bb7 Initial Hybrid parser, for now identical to Stream 2020-04-19 16:27:01 -07:00
Frh 58823e57e9 More refactoring / linting 2020-04-19 15:41:45 -07:00
Frh c27a8026d6 More linting, refactor 2020-04-19 14:42:18 -07:00
Frh 50f11867af Lint, refactor 2020-04-19 14:30:32 -07:00
Frh cff7a9698b Further refactor 2020-04-19 13:28:17 -07:00
    Move common parse error stats computation to base parser
    Move copy_spanning_text logic to the table
Frh 583868756a Prep work for new hybrid parser introduction 2020-04-19 11:32:22 -07:00
    Refactor parsers by moving common code to the base class
    Maintain Python 3.5 compatibility by removing f"{}"
Frh 697289e409 Refactor base classes and improve plotting 2020-04-18 23:03:27 -07:00
    Move common code to base class to reduce duplication
    Stream plots display pdf background for better context
Frh 816471e426 Fix unit tests, lint, drop Python 2 support 2020-04-18 17:25:47 -07:00
    Drop EOL Python 2 support. Resolve unit test discrepancies.
    Update unit tests to pass in Travis across all supported Py.
    Linting.
Vinayak Mehta 2115a0e177 Blacken code 2019-07-03 23:47:42 +05:30
Vinayak Mehta de3281c1b6 Add test 2019-05-27 22:18:23 +05:30
Vinayak Mehta b2a8348f13 Fix #312 2019-05-26 17:13:59 +05:30
Vinayak Mehta 215e5ea2a5 Move ghostscript import 2019-01-06 01:50:54 +05:30
Vinayak Mehta ab5391c76f Merge branch 'master' of github.com:socialcopsdev/camelot into replace-gs-c-api 2019-01-05 11:22:38 +05:30
Vinayak Mehta f94777038a Update stream table regions logic 2019-01-04 20:27:53 +05:30
Vinayak Mehta eaca147b9d Apply mask at threshold level 2019-01-04 20:15:41 +05:30
Vinayak Mehta 03f301b25c Add table regions support 2019-01-04 19:17:54 +05:30
Vinayak Mehta 605ffdd444 Add test 2019-01-03 16:13:41 +05:30
Vinayak Mehta f605bd8f94 Fix #239 2019-01-03 14:55:47 +05:30
Vinayak Mehta 27fa226c71 Fix merge conflict 2018-12-22 11:07:24 +05:30
Vinayak Mehta 50b4468aff Rename kwargs and add tests 2018-12-21 15:09:37 +05:30
Vinayak Mehta f6aa21c31f Add strip_text 2018-12-20 16:32:16 +05:30
Vinayak Mehta 3f5af18738 Add resolution 2018-12-20 15:01:29 +05:30
Vinayak Mehta e0090fbb0a Add edge close tolerance 2018-12-20 13:58:54 +05:30
Vinayak Mehta 48b2dce633 Update advanced docs 2018-12-19 18:19:39 +05:30
Vinayak Mehta 736fb25b56 Change gs resolution 2018-12-18 20:47:09 +05:30
Vinayak Mehta 4938c48853 Remove _errors and ghostscript test 2018-12-18 07:43:52 +05:30
Vinayak Mehta 9879a87c6f Add ghostscript 2018-12-17 19:09:57 +05:30
Vinayak Mehta 9aa219695f Fix merge conflict 2018-12-17 15:33:38 +05:30
Vinayak Mehta 6301fee523 Fix AttributeError 2018-12-17 12:00:41 +05:30
Vinayak Mehta ca6cefa362 Add extra_kwargs 2018-12-17 11:49:05 +05:30
Vinayak Mehta 69136431b6 Fix #215 2018-12-13 14:36:50 +05:30
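
Note on commit df3d28837d ("Accept cells if they're at least 50% within the table's bounds"): the log does not show how the overlap is measured, so the following is only a minimal sketch of one plausible reading, assuming axis-aligned bounding boxes given as (x0, y0, x1, y1) and measuring overlap by area. The names overlap_fraction, accept_cell and min_fraction are hypothetical and are not taken from the repository.

```python
def overlap_fraction(cell, table):
    """Return the fraction of the cell's area that lies inside the table.

    Both arguments are (x0, y0, x1, y1) boxes with x0 <= x1 and y0 <= y1.
    This area-based measure is an assumption, not the repository's code.
    """
    cx0, cy0, cx1, cy1 = cell
    tx0, ty0, tx1, ty1 = table
    cell_area = (cx1 - cx0) * (cy1 - cy0)
    if cell_area <= 0:
        # Degenerate cell: treat it as having no overlap.
        return 0.0
    # Width and height of the intersection rectangle, clamped at zero.
    width = max(0.0, min(cx1, tx1) - max(cx0, tx0))
    height = max(0.0, min(cy1, ty1) - max(cy0, ty0))
    return (width * height) / cell_area


def accept_cell(cell, table, min_fraction=0.5):
    """Accept the cell if at least min_fraction of it is inside the table."""
    return overlap_fraction(cell, table) >= min_fraction
```

With a table box of (0, 0, 100, 100), a cell at (90, 0, 110, 10) has exactly half of its area inside the table and is accepted, while a cell at (95, 0, 115, 10) has only a quarter inside and is rejected.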