camelot-py

Commit Graph

Author	SHA1	Message	Date
Jose Vargas	692b8fcf57	Merge `0962b8f4d4` into `8ca30f3a3c`	2020-10-25 23:33:15 +09:00
Martin Abente Lahaye	13a50e2ba2	handlers: Close file streams explicitly. No harm in closing these streams explicitly. In the best case, this prevents descriptor leaks; in the worst case, it reduces the number of messages like the following during tests: ResourceWarning: unclosed file	2020-10-22 11:43:01 -03:00
Vinayak Mehta	5d20d56e48	Prevent taking max of an empty set	2020-08-25 22:50:31 +05:30
Vinayak Mehta	9087429501	Merge pull request #188 from anakin87/master [MRG] Add encoding kwarg to camelot.core.Table.to_html method	2020-08-25 19:16:50 +05:30
anakin87	579bc16be5	Update core.py Correct method camelot.core.Table.to_html	2020-08-25 15:27:17 +02:00
pevisscher	aae2c6b3d4	use correct re.sub signature. `text_strip` currently passes the regex flags as the count parameter; the flags are hardcoded to `re.UNICODE` (value 32), so only the first 32 matches are replaced. See https://docs.python.org/3/library/re.html#re.sub for the signature.	2020-08-24 16:51:06 +02:00
Vinayak Mehta	705473198f	Merge pull request #121 from jedie/patch-2 [MRG] Save plot when filename is specified	2020-08-14 02:36:28 +05:30
Vinayak Mehta	b741c0a9e9	Check for none and return none	2020-08-14 02:35:50 +05:30
Vinayak Mehta	fbe576ffcb	Revert the changes in v0.8.1	2020-07-27 17:38:14 +05:30
Vinayak Mehta	16beb15c43	Bump version and update HISTORY.md	2020-07-21 21:48:29 +05:30
Vinayak Mehta	a13e2f6f1f	Change error name and update pdfminer.six version	2020-07-21 21:21:01 +05:30
Vinayak Mehta	d5d6a5962b	Bump version and update HISTORY.md	2020-05-24 18:36:13 +05:30
Vinayak Mehta	a22fa63c4e	Fix syntax errors	2020-05-24 18:19:48 +05:30
Vinayak Mehta	52b2a595b4	Add f-strings and remove python3.5 test job	2020-05-24 18:14:43 +05:30
Vinayak Mehta	f725f04223	Remove future imports	2020-05-24 17:33:13 +05:30
Vinayak Mehta	3afb72b872	Fix read_pdf(url) and test data	2020-05-24 17:26:52 +05:30
Jens Diemer	f8b6181988	Fix #120 - Save plot	2020-03-15 13:20:27 +01:00
Jose Vargas	52adbbd796	[parsers.stream] - Use fall back column coordinates. The Stream class would raise an IndexError when the 'columns' argument was specified and the number of tables identified was larger than the number of items in the 'columns' argument. This IndexError makes extracting tables from a PDF comprised mainly of known, consistent table structures of interest to the caller, but that may be variable in height, starting position, or number, rather cumbersome with the Stream parser. This is especially true within an automated or programmatic context. Either the caller must call 'camelot.read_pdf' once per page, or manipulate the 'columns' argument so as to avoid the IndexError. The former isn't guaranteed to work, as a single page can contain multiple tables, and therefore, in such a situation, the caller must resort to the latter even if extracting tables from a single page. The Stream class continues to function exactly the same when the 'table_areas' argument is provided; this commit only changes the behavior of the Stream parser when 'table_areas' is not provided. This commit allows all tables to be easily extracted by specifying 'pages=all' and providing the appropriate 'columns' argument value to 'camelot.read_pdf'. Extracting all tables from such a PDF is already possible with the Lattice parser, this commit makes this possible with the Stream parser as well. Callers are responsible for filtering out any extraneous tables.	2020-01-31 20:29:26 -05:00
Dimiter Naydenov	b2929a9e92	Merge pull request #34 from KOLANICH/win_ghostscript_callback_fix Fixed calling convention of callback functions	2019-07-24 13:39:18 +03:00
KOLANICH	5687fbc8b2	Fixed calling convention of callback functions	2019-07-16 21:08:34 +03:00
KOLANICH	9e356b1b0a	Fixed library discovery on Windows	2019-07-16 21:07:23 +03:00
Vinayak Mehta	0efb3ca1b0	Update HISTORY.md and bump version	2019-07-07 16:07:28 +05:30
Vinayak Mehta	a97b50ef21	Update flavor kwargs	2019-07-06 22:59:51 +05:30
Dimiter Naydenov	0f8cda4793	Merge pull request #5 from camelot-dev/fix-cli-group-name [MRG] No need to monkey-patch Click.HelpFormatter	2019-07-04 18:26:35 +03:00
Dimiter Naydenov	13616c2fb4	No need to monkey-patch Click.HelpFormatter	2019-07-04 13:13:32 +03:00
Dimiter Naydenov	240ea6c411	Fixed strip_text argument getting ignored	2019-07-04 12:12:52 +03:00
Vinayak Mehta	16ddd10644	Update image_processing.py	2019-07-04 00:06:46 +05:30
Vinayak Mehta	2115a0e177	Blacken code	2019-07-03 23:47:42 +05:30
Vinayak Mehta	de3281c1b6	Add test	2019-05-27 22:18:23 +05:30
Vinayak Mehta	b2a8348f13	Fix #312	2019-05-26 17:13:59 +05:30
Vinayak Mehta	355ae818a0	Merge branch 'master' into fix-split-bug	2019-04-20 21:06:47 +05:30
Vinayak Mehta	ce727d9558	Fix split text bug	2019-03-22 02:28:29 +05:30
Sym Roe	8446271aa4	Always sort TableList after reading PDF	2019-02-25 09:48:47 +00:00
Sym Roe	c019e582bf	Add __lt__ to Table to allow sorting Refs #277	2019-02-25 09:20:09 +00:00
yatintaluja	6c4b468800	Fix #245	2019-01-16 16:33:17 +05:30
yatintaluja	5330620ea2	Bump version	2019-01-16 16:30:05 +05:30
Vinayak Mehta	45ae980988	Bump version	2019-01-06 13:00:08 +05:30
Vinayak Mehta	215e5ea2a5	Move ghostscript import	2019-01-06 01:50:54 +05:30
Vinayak Mehta	9d38b2f5af	Bump version	2019-01-05 13:23:31 +05:30
Vinayak Mehta	ab5391c76f	Merge branch 'master' of github.com:socialcopsdev/camelot into replace-gs-c-api	2019-01-05 11:22:38 +05:30
Vinayak Mehta	506cec7f6b	Add sqlite support	2019-01-05 01:50:27 +05:30
Vinayak Mehta	f94777038a	Update stream table regions logic	2019-01-04 20:27:53 +05:30
Vinayak Mehta	eaca147b9d	Apply mask at threshold level	2019-01-04 20:15:41 +05:30
Vinayak Mehta	03f301b25c	Add table regions support	2019-01-04 19:17:54 +05:30
Vinayak Mehta	605ffdd444	Add test	2019-01-03 16:13:41 +05:30
Vinayak Mehta	9d90cadac0	Fix variable name	2019-01-03 15:47:05 +05:30
Vinayak Mehta	f605bd8f94	Fix #239	2019-01-03 14:55:47 +05:30
Vinayak Mehta	7a0acd7929	Update CLI	2019-01-02 16:36:25 +05:30
Vinayak Mehta	859610e0dc	Add pages test	2019-01-02 16:35:49 +05:30
Vinayak Mehta	ea5747c5c4	Bump version	2018-12-24 15:51:29 +05:30
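
Commit aae2c6b3d4 in the log above describes regex flags being passed in the `count` position of `re.sub`. The sketch below reproduces that class of bug in isolation. The helper names `strip_chars_buggy` and `strip_chars_fixed` are hypothetical and are not camelot's `text_strip`; only the standard-library `re.sub(pattern, repl, string, count=0, flags=0)` signature is relied on.

```python
import re
import warnings


def strip_chars_buggy(text, strip):
    """Hypothetical helper: remove every character in `strip` from `text`."""
    pattern = "[{}]".format("".join(map(re.escape, strip)))
    with warnings.catch_warnings():
        # Newer Python versions warn when count/flags are passed positionally.
        warnings.simplefilter("ignore", DeprecationWarning)
        # Bug: the fourth positional parameter of re.sub is `count`, not
        # `flags`, so re.UNICODE (integer value 32) caps the replacements.
        return re.sub(pattern, "", text, re.UNICODE)


def strip_chars_fixed(text, strip):
    """Same helper with the flags passed by keyword."""
    pattern = "[{}]".format("".join(map(re.escape, strip)))
    return re.sub(pattern, "", text, flags=re.UNICODE)


sample = "." * 40  # 40 characters that should all be stripped
leftover_buggy = strip_chars_buggy(sample, ".")
leftover_fixed = strip_chars_fixed(sample, ".")
print(len(leftover_buggy), len(leftover_fixed))
```

Passing `flags` by keyword avoids the problem regardless of how many optional positional parameters precede it.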

209 Commits (692b8fcf5725f4e0ebe86e16baa03a686a410811)
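
Commits c019e582bf and 8446271aa4 in the log mention adding `__lt__` to `Table` so that a `TableList` can be sorted. As a minimal sketch of why defining `__lt__` is enough for sorting: the class `PageTable` and its `page` and `order` fields are hypothetical stand-ins, and the comparison key is an assumption rather than camelot's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class PageTable:
    """Hypothetical stand-in for a parsed table; not camelot's Table class."""

    page: int
    order: int

    def __lt__(self, other):
        # Assumed ordering: by page first, then by position on the page.
        if not isinstance(other, PageTable):
            return NotImplemented
        return (self.page, self.order) < (other.page, other.order)


tables = [PageTable(2, 1), PageTable(1, 2), PageTable(1, 1)]
ordered = sorted(tables)  # sorted() only requires __lt__ on the elements
print([(t.page, t.order) for t in ordered])
```

Because `sorted()` and `list.sort()` compare elements with `<` only, a single `__lt__` method is sufficient here; `functools.total_ordering` could derive the remaining comparisons if they were needed.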