Frh
81de841ca0
Plot improvements, address #132
Plot takes an optional axes parameter, allowing notebooks more
flexibility.
Header heuristic in hybrid won't include headers which span the
entire table.
Added unit test for issue #132
Fixes https://github.com/camelot-dev/camelot/issues/132
2020-06-11 17:20:36 -07:00
Frh
dbaab66e43
Rename member for clarity, fixed unit test
_textlines_alignments becomes _textline_to_alignments
2020-06-11 17:20:36 -07:00
Frh
a0e46916e2
Improve edgeplot for hybrid
2020-06-11 17:20:36 -07:00
Frh
c9a73a1ad7
Further refactoring
2020-06-11 17:20:36 -07:00
Frh
18581640be
Common parent TextBaseParser for Stream and Hybrid
2020-06-11 17:20:36 -07:00
Frh
a401d33fd9
Refactor out _text_bbox
2020-06-11 17:20:36 -07:00
Frh
87d95a098c
Further simplification
2020-06-11 17:20:36 -07:00
Frh
22b6e33efa
Enforce text_edge as subcase of text_alignment
TextNetworks is a list of TextAlignments
2020-06-11 17:20:36 -07:00
Frh
2d97fbc036
Define TextEdge as a bounded TextAlignment
2020-06-11 17:20:36 -07:00
Frh
8903ef77d4
More refactoring across stream and hybrid.
Stream is now much faster: the whole test run takes 72s instead of 92s
2020-06-11 17:20:36 -07:00
Frh
92c8abdca3
Refactoring TextEdges code across hybrid and stream
2020-06-11 17:20:36 -07:00
Frh
7ad5b843ab
Move generic code to utils
2020-06-11 17:20:36 -07:00
Frh
14cd328644
Refactor common code hybrid / stream
2020-06-11 17:20:36 -07:00
Frh
bfc2719aff
Address last unit test
2020-06-11 17:20:36 -07:00
Frh
db645627ff
Prefer showing diffs at the row level
2020-06-11 17:20:36 -07:00
Frh
356af846db
Loosen cells header expansion algorithm
Accept cells if they're at least 50% within the table's bounds.
2020-06-11 17:20:36 -07:00
Frh
a2a831110e
Fix in table diff
2020-06-11 17:20:36 -07:00
Frh
1a47c3df89
Prettier plotting, improve gaps calculation
2020-06-11 17:20:36 -07:00
Frh
d2cf8520cb
Draw parse constraints for easier debug
* Display region and area rectangles
2020-06-11 17:20:36 -07:00
Frh
310a8cd80a
Refactor code in plotting
2020-06-11 17:20:36 -07:00
Frh
1ccaa0630d
Improve hybrid plotting
* plot info passed through debug_info
* display each text edge
2020-06-11 17:20:36 -07:00
Frh
e0e3ff4e07
Add support for region/area for hybrid
2020-06-11 17:20:36 -07:00
Frh
f5fe92c22e
Interim check-in, test failing and lots of todos
2020-06-11 17:20:36 -07:00
Frh
878ef96fa7
More linting
2020-06-11 17:20:36 -07:00
Frh
07e2e1640d
Linting
2020-06-11 17:20:36 -07:00
Frh
f9a6543c36
Initial Hybrid parser, for now identical to Stream
2020-06-11 17:20:36 -07:00
Frh
64576fd836
More refactoring / linting
2020-06-11 17:20:36 -07:00
Frh
f37ed50fed
More linting, refactor
2020-06-11 17:20:36 -07:00
Frh
20f18b478f
Lint, refactor
2020-06-11 17:20:36 -07:00
Frh
ff2ce6f47c
Further refactor
Move common parse error stats computation to base parser
Move copy_spanning_text logic to the table
2020-06-11 17:20:36 -07:00
Frh
37483ca202
Prep work for new hybrid parser introduction
Refactor parsers by moving common code to the base class
Maintain Python 3.5 compatibility by removing f-strings (f"{}")
2020-06-11 17:20:36 -07:00
Frh
161f71230d
Refactor base classes and improve plotting
Move common code to base class to reduce duplication
Stream plots display pdf background for better context
2020-06-11 17:20:36 -07:00
Frh
bd2aab5b2d
Fix unit tests, lint, drop Python 2 support
Drop EOL Python 2 support. Resolve unit test discrepancies.
Update unit tests to pass in Travis across all supported Python versions.
Linting.
2020-06-11 17:20:35 -07:00
Dimiter Naydenov
b2929a9e92
Merge pull request #34 from KOLANICH/win_ghostscript_callback_fix
Fixed calling convention of callback functions
2019-07-24 13:39:18 +03:00
KOLANICH
5687fbc8b2
Fixed calling convention of callback functions
2019-07-16 21:08:34 +03:00
KOLANICH
9e356b1b0a
Fixed library discovery on Windows
2019-07-16 21:07:23 +03:00
Vinayak Mehta
0efb3ca1b0
Update HISTORY.md and bump version
2019-07-07 16:07:28 +05:30
Vinayak Mehta
a97b50ef21
Update flavor kwargs
2019-07-06 22:59:51 +05:30
Dimiter Naydenov
0f8cda4793
Merge pull request #5 from camelot-dev/fix-cli-group-name
[MRG] No need to monkey-patch Click.HelpFormatter
2019-07-04 18:26:35 +03:00
Dimiter Naydenov
13616c2fb4
No need to monkey-patch Click.HelpFormatter
2019-07-04 13:13:32 +03:00
Dimiter Naydenov
240ea6c411
Fixed strip_text argument getting ignored
2019-07-04 12:12:52 +03:00
Vinayak Mehta
16ddd10644
Update image_processing.py
2019-07-04 00:06:46 +05:30
Vinayak Mehta
2115a0e177
Blacken code
2019-07-03 23:47:42 +05:30
Vinayak Mehta
de3281c1b6
Add test
2019-05-27 22:18:23 +05:30
Vinayak Mehta
b2a8348f13
Fix #312
2019-05-26 17:13:59 +05:30
Vinayak Mehta
355ae818a0
Merge branch 'master' into fix-split-bug
2019-04-20 21:06:47 +05:30
Vinayak Mehta
ce727d9558
Fix split text bug
2019-03-22 02:28:29 +05:30
Sym Roe
8446271aa4
Always sort TableList after reading PDF
2019-02-25 09:48:47 +00:00
Sym Roe
c019e582bf
Add __lt__ to Table to allow sorting
Refs #277
2019-02-25 09:20:09 +00:00
yatintaluja
6c4b468800
Fix #245
2019-01-16 16:33:17 +05:30