Commit Graph

  • d3d625a08d Unit test fixes Frh 2020-04-22 15:36:37 -0700
  • 13268beb6f Unit test fix Frh 2020-04-22 14:50:59 -0700
  • db645627ff Prefer showing diffs at the row level Frh 2020-04-22 14:50:45 -0700
  • 549ab0ebe6 Unit test fix Frh 2020-04-22 14:25:03 -0700
  • 356af846db Loosen cells header expansion algorithm Frh 2020-04-22 14:24:47 -0700
  • a2a831110e Fix in table diff Frh 2020-04-22 14:23:52 -0700
  • 1a47c3df89 Prettier plotting, improve gaps calculation Frh 2020-04-22 14:08:22 -0700
  • d2cf8520cb Draw parse constraints for easier debug Frh 2020-04-21 14:24:44 -0700
  • 310a8cd80a Refactor code in plotting Frh 2020-04-21 13:57:12 -0700
  • 1ccaa0630d Improve hybrid plotting Frh 2020-04-20 16:54:06 -0700
  • e0e3ff4e07 Add support for region/area for hybrid Frh 2020-04-20 11:20:59 -0700
  • f5fe92c22e Interim check-in, test failing and lots of todos Frh 2020-04-19 18:26:38 -0700
  • c1c9358778 More linting Frh 2020-04-19 17:35:19 -0700
  • 931b2f20f6 Try to silence bandit messages on valid asserts Frh 2020-04-19 17:17:25 -0700
  • 878ef96fa7 More linting Frh 2020-04-19 17:05:33 -0700
  • 07e2e1640d Linting Frh 2020-04-19 16:40:14 -0700
  • e8e80a8cbb Fix unit test Frh 2020-04-19 16:38:25 -0700
  • f9a6543c36 Initial Hybrid parser, for now identical to Stream Frh 2020-04-19 16:27:01 -0700
  • 64576fd836 More refactoring / linting Frh 2020-04-19 15:41:45 -0700
  • 8ed4cdf399 Fix unit test with plotting Frh 2020-04-19 15:07:59 -0700
  • f37ed50fed More linting, refactor Frh 2020-04-19 14:42:18 -0700
  • 20f18b478f Lint, refactor Frh 2020-04-19 14:30:32 -0700
  • ff2ce6f47c Further refactor Frh 2020-04-19 13:28:17 -0700
  • 37483ca202 Prep work for new hybrid parser introduction Frh 2020-04-19 11:32:22 -0700
  • 161f71230d Refactor base classes and improve plotting Frh 2020-04-18 23:03:27 -0700
  • bd2aab5b2d Fix unit tests, lint, drop Python 2 support Frh 2020-04-18 17:25:47 -0700
  • 5efbcdcebb Update requirements.txt Vinayak Mehta 2020-05-24 19:04:50 +0530
  • 189fe58bf2 Update requirements.txt Vinayak Mehta 2020-05-24 19:01:03 +0530
  • 1575ec1bf0 Add .readthedocs.yml Vinayak Mehta 2020-05-24 18:56:33 +0530
  • d5d6a5962b Bump version and update HISTORY.md v0.8.0 Vinayak Mehta 2020-05-24 18:36:13 +0530
  • 420d5aa624 Merge pull request #146 from camelot-dev/add-python38-travis Vinayak Mehta 2020-05-24 18:31:27 +0530
  • a22fa63c4e Fix syntax errors Vinayak Mehta 2020-05-24 18:19:48 +0530
  • 52b2a595b4 Add f-strings and remove python3.5 test job Vinayak Mehta 2020-05-24 18:14:43 +0530
  • afa1ba7c1f Fix test indent Vinayak Mehta 2020-05-24 17:38:48 +0530
  • f725f04223 Remove future imports Vinayak Mehta 2020-05-24 17:33:13 +0530
  • 3afb72b872 Fix read_pdf(url) and test data Vinayak Mehta 2020-05-24 17:26:52 +0530
  • 6dd9b6ce01 Create FUNDING.yml Vinayak Mehta 2020-05-24 16:14:43 +0530
  • fc1b6f6227 Add python38 test job for travis Vinayak Mehta 2020-05-24 15:27:48 +0530
  • ba5169b33d Enable process_background option for hybrid Frh 2020-05-08 15:08:12 -0700
  • 51909a886f Merge 27c3a41a46 into 7d4c9e53c6 Nguyễn Xuân Bình 2020-05-05 08:13:56 +0000
  • 27c3a41a46 [REF] add DataFrame to_excel params, update requirements.txt since Pandas needs XlsxWriter>=0.9.8 Binh Nguyen 2020-05-05 15:03:02 +0700
  • ae429fc248 Hybrid parser fixes Frh 2020-05-04 18:52:11 -0700
  • 79ea4adcd1 Add baseline test for hybrid Frh 2020-05-04 17:41:57 -0700
  • 77d289bd86 WIP: Introduce actual hybrid parser Frh 2020-05-04 16:27:01 -0700
  • 6711f877bf Rename WIP parser "network", actual Hybrid to come Frh 2020-05-02 16:14:03 -0700
  • c7ab3a4c32 Raise tolerance of plot differences Frh 2020-04-30 17:06:45 -0700
  • d663dd18fd Fix plotting unit tests Frh 2020-04-30 16:54:37 -0700
  • f3aded5b17 Linting Frh 2020-04-29 13:52:58 -0700
  • 8a63e8e794 Minor linting Frh 2020-04-29 12:31:02 -0700
  • c0903b8ca9 Improve column detection for hybrid flavor Frh 2020-04-29 11:46:40 -0700
  • 04fc542dc3 Fix off by one error in column identification Frh 2020-04-29 09:45:55 -0700
  • 918416e7e4 Improve hybrid table body discovery algo Frh 2020-04-28 22:43:55 -0700
  • 3220b02ebc Create notebook to help debug hybrid parser algo; Plot vertical col anchors found by hybrid parser; Include vertical text in col/row generation Frh 2020-04-28 12:26:12 -0700
  • 6add19ae27 Prep for vertical text improvements Frh 2020-04-28 11:46:12 -0700
  • c51c24a416 Linting Frh 2020-04-25 22:47:23 -0700
  • a2c5ee7f06 Add parser comparizon notebook Frh 2020-04-25 21:55:21 -0700
  • 30a0b2e4bc Add Parser comparison notebook to help visualizing Frh 2020-04-25 21:55:01 -0700
  • 56dd31090c Remove another f-string Frh 2020-04-25 21:33:15 -0700
  • 2624010197 Remove f-strings, fix url based unit tests Frh 2020-04-25 21:14:56 -0700
  • 016776939e Plot improvements, address 132 Frh 2020-04-25 20:51:00 -0700
  • 84ec5c6acd Rename member for clarity, fixed unit test Frh 2020-04-25 17:15:16 -0700
  • 22f4287788 Improve edgeplot for hybrid Frh 2020-04-25 13:31:10 -0700
  • bb842f21b9 Further refactoring Frh 2020-04-24 21:11:31 -0700
  • f42557ab8b Common parent TextBaseParser for Stream and Hybrid Frh 2020-04-24 15:54:58 -0700
  • 5290fb6a7d Refactor out _text_bbox Frh 2020-04-24 15:18:38 -0700
  • 8ad9e569cf Further simplification Frh 2020-04-24 12:48:51 -0700
  • efe81292ca Enforce text_edge as subcase of text_alignment Frh 2020-04-24 12:42:13 -0700
  • 58b2c1d0fd Define TextEdge as a bounded TextAlignment Frh 2020-04-23 18:26:55 -0700
  • 3ea8d81900 Update test to reflect different order of edges Frh 2020-04-23 14:45:35 -0700
  • 5db49d4fde More refactoring across stream and hybrid. Frh 2020-04-23 14:42:13 -0700
  • adb14d3522 Refactoring TextEdges code across hybrid and stream Frh 2020-04-23 12:55:09 -0700
  • 7fd08f84db Merge 882b168dd5 into 7d4c9e53c6 KOLANICH 2020-04-23 14:20:52 +0300
  • 414708d8c7 Move generic code to utils Frh 2020-04-22 19:08:06 -0700
  • 36d5a09ad6 Refactor common code hybrid / stream Frh 2020-04-22 17:33:15 -0700
  • 489e996bd8 Address last unit test Frh 2020-04-22 16:02:49 -0700
  • ec0ca1e009 Unit test fixes Frh 2020-04-22 15:36:37 -0700
  • 6962c714f9 Unit test fix Frh 2020-04-22 14:50:59 -0700
  • 7b0ac03f8e Prefer showing diffs at the row level Frh 2020-04-22 14:50:45 -0700
  • fab13ee5b8 Unit test fix Frh 2020-04-22 14:25:03 -0700
  • df3d28837d Loosen cells header expansion algorithm Frh 2020-04-22 14:24:47 -0700
  • 0be58de1cb Fix in table diff Frh 2020-04-22 14:23:52 -0700
  • 9a82408a9a Prettier plotting, improve gaps calculation Frh 2020-04-22 14:08:22 -0700
  • cd338ff4e2 Draw parse constraints for easier debug Frh 2020-04-21 14:24:44 -0700
  • ad27a11d35 Refactor code in plotting Frh 2020-04-21 13:57:12 -0700
  • fb69bd9299 Improve hybrid plotting Frh 2020-04-20 16:54:06 -0700
  • 175655d31b Add support for region/area for hybrid Frh 2020-04-20 11:20:59 -0700
  • 57c5957bad Interim check-in, test failing and lots of todos Frh 2020-04-19 18:26:38 -0700
  • d0bd1cfd1f More linting Frh 2020-04-19 17:35:19 -0700
  • dec8f2d0eb Try to silence bandit messages on valid asserts Frh 2020-04-19 17:17:25 -0700
  • 69c7728867 More linting Frh 2020-04-19 17:05:33 -0700
  • 89fe090ec4 Linting Frh 2020-04-19 16:40:14 -0700
  • e59b3f5efb Fix unit test Frh 2020-04-19 16:38:25 -0700
  • d520a77bb7 Initial Hybrid parser, for now identical to Stream Frh 2020-04-19 16:27:01 -0700
  • 58823e57e9 More refactoring / linting Frh 2020-04-19 15:41:45 -0700
  • d673a3b6e0 Fix unit test with plotting Frh 2020-04-19 15:07:59 -0700
  • c27a8026d6 More linting, refactor Frh 2020-04-19 14:42:18 -0700
  • 50f11867af Lint, refactor Frh 2020-04-19 14:30:32 -0700
  • cff7a9698b Further refactor Frh 2020-04-19 13:28:17 -0700
  • 583868756a Prep work for new hybrid parser introduction Frh 2020-04-19 11:32:22 -0700
  • 697289e409 Refactor base classes and improve plotting Frh 2020-04-18 23:03:27 -0700