89 Commits (81de841ca04208e32c257746a9745320a90d370c)

Author SHA1 Message Date
Frh 81de841ca0 Plot improvements, address 132
Plot takes an optional axes parameter, allowing notebooks more
flexibility.
Header heuristic in hybrid won't include headers which span the
entire table.
Added unit test for issue #132

Fixes https://github.com/camelot-dev/camelot/issues/132
2020-06-11 17:20:36 -07:00
Frh dbaab66e43 Rename member for clarity, fixed unit test
_textlines_alignments becomes _textline_to_alignments
2020-06-11 17:20:36 -07:00
Frh a0e46916e2 Improve edgeplot for hybrid 2020-06-11 17:20:36 -07:00
Frh c9a73a1ad7 Further refactoring 2020-06-11 17:20:36 -07:00
Frh 18581640be Common parent TextBaseParser for Stream and Hybrid 2020-06-11 17:20:36 -07:00
Frh a401d33fd9 Refactor out _text_bbox 2020-06-11 17:20:36 -07:00
Frh 87d95a098c Further simplification 2020-06-11 17:20:36 -07:00
Frh 22b6e33efa Enforce text_edge as subcase of text_alignment
TextNetworks is a list of TextAlignments
2020-06-11 17:20:36 -07:00
Frh 2d97fbc036 Define TextEdge as a bounded TextAlignment 2020-06-11 17:20:36 -07:00
Frh 8903ef77d4 More refactoring across stream and hybrid.
Stream now much faster, whole test is 72s instead of 92s
2020-06-11 17:20:36 -07:00
Frh 92c8abdca3 Refactoring TextEdges code across hybrid and stream 2020-06-11 17:20:36 -07:00
Frh 7ad5b843ab Move generic code to utils 2020-06-11 17:20:36 -07:00
Frh 14cd328644 Refactor common code hybrid / stream 2020-06-11 17:20:36 -07:00
Frh bfc2719aff Address last unit test 2020-06-11 17:20:36 -07:00
Frh 356af846db Loosen cells header expansion algorithm
Accept cells if they're at least 50% within the table's bounds.
2020-06-11 17:20:36 -07:00
Frh 1a47c3df89 Prettier plotting, improve gaps calculation 2020-06-11 17:20:36 -07:00
Frh 1ccaa0630d Improve hybrid plotting
* plot info passed through debug_info
* display each text edge
2020-06-11 17:20:36 -07:00
Frh e0e3ff4e07 Add support for region/area for hybrid 2020-06-11 17:20:36 -07:00
Frh f5fe92c22e Interim check-in, test failing and lots of todos 2020-06-11 17:20:36 -07:00
Frh 07e2e1640d Linting 2020-06-11 17:20:36 -07:00
Frh f9a6543c36 Initial Hybrid parser, for now identical to Stream 2020-06-11 17:20:36 -07:00
Frh 64576fd836 More refactoring / linting 2020-06-11 17:20:36 -07:00
Frh f37ed50fed More linting, refactor 2020-06-11 17:20:36 -07:00
Frh 20f18b478f Lint, refactor 2020-06-11 17:20:36 -07:00
Frh ff2ce6f47c Further refactor
Move common parse error stats computation to base parser
Move copy_spanning_text logic to the table
2020-06-11 17:20:36 -07:00
Frh 37483ca202 Prep work for new hybrid parser introduction
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f"{}"
2020-06-11 17:20:36 -07:00
Frh 161f71230d Refactor base classes and improve plotting
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-06-11 17:20:36 -07:00
Frh bd2aab5b2d Fix unit tests, lint, drop Python 2 support
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Py.
Linting.
2020-06-11 17:20:35 -07:00
Vinayak Mehta 2115a0e177 Blacken code 2019-07-03 23:47:42 +05:30
Vinayak Mehta de3281c1b6 Add test 2019-05-27 22:18:23 +05:30
Vinayak Mehta b2a8348f13 Fix #312 2019-05-26 17:13:59 +05:30
Vinayak Mehta 215e5ea2a5 Move ghostscript import 2019-01-06 01:50:54 +05:30
Vinayak Mehta ab5391c76f Merge branch 'master' of github.com:socialcopsdev/camelot into replace-gs-c-api 2019-01-05 11:22:38 +05:30
Vinayak Mehta f94777038a Update stream table regions logic 2019-01-04 20:27:53 +05:30
Vinayak Mehta eaca147b9d Apply mask at threshold level 2019-01-04 20:15:41 +05:30
Vinayak Mehta 03f301b25c Add table regions support 2019-01-04 19:17:54 +05:30
Vinayak Mehta 605ffdd444 Add test 2019-01-03 16:13:41 +05:30
Vinayak Mehta f605bd8f94 Fix #239 2019-01-03 14:55:47 +05:30
Vinayak Mehta 27fa226c71 Fix merge conflict 2018-12-22 11:07:24 +05:30
Vinayak Mehta 50b4468aff Rename kwargs and add tests 2018-12-21 15:09:37 +05:30
Vinayak Mehta f6aa21c31f Add strip_text 2018-12-20 16:32:16 +05:30
Vinayak Mehta 3f5af18738 Add resolution 2018-12-20 15:01:29 +05:30
Vinayak Mehta e0090fbb0a Add edge close tolerance 2018-12-20 13:58:54 +05:30
Vinayak Mehta 48b2dce633 Update advanced docs 2018-12-19 18:19:39 +05:30
Vinayak Mehta 736fb25b56 Change gs resolution 2018-12-18 20:47:09 +05:30
Vinayak Mehta 4938c48853 Remove _errors and ghostscript test 2018-12-18 07:43:52 +05:30
Vinayak Mehta 9879a87c6f Add ghostscript 2018-12-17 19:09:57 +05:30
Vinayak Mehta 9aa219695f Fix merge conflict 2018-12-17 15:33:38 +05:30
Vinayak Mehta 6301fee523 Fix AttributeError 2018-12-17 12:00:41 +05:30
Vinayak Mehta ca6cefa362 Add extra_kwargs 2018-12-17 11:49:05 +05:30