From 4eba7b6486a4cfdf22f8b66837943a6c9ebf6980 Mon Sep 17 00:00:00 2001
From: Vinayak Mehta
Date: Mon, 28 Jun 2021 00:42:05 +0530
Subject: [PATCH] Update docs

---
 HISTORY.md               |  6 ++++++
 README.md                |  6 +++---
 docs/index.rst           | 10 +++++-----
 docs/user/quickstart.rst |  4 ++--
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/HISTORY.md b/HISTORY.md
index 42a3c41..0b3c011 100755
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -4,6 +4,12 @@ Release History
 master
 ------
 
+**Improvements**
+
+- Add markdown export format. [#222](https://github.com/camelot-dev/camelot/pull/222/) by [Lucas Cimon](https://github.com/Lucas-C).
+
+**Documentation**
+
 - Add faq section. [#216](https://github.com/camelot-dev/camelot/pull/216) by [Stefano Fiorucci](https://github.com/anakin87).
 
 0.9.0 (2021-06-15)
diff --git a/README.md b/README.md
index 81fd71e..b5bde5c 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@
 >>> tables = camelot.read_pdf('foo.pdf')
 >>> tables
 <TableList n=1>
->>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, sqlite
+>>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite
 >>> tables[0]
 <Table shape=(7, 7)>
 >>> tables[0].parsing_report
@@ -32,7 +32,7 @@
     'order': 1,
     'page': 1
 }
->>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_sqlite
+>>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite
 >>> tables[0].df # get a pandas DataFrame!
@@ -55,7 +55,7 @@ You can check out some frequently asked questions [here](https://camelot-py.read
 - **Configurability**: Camelot gives you control over the table extraction process with [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
 - **Metrics**: You can discard bad tables based on metrics like accuracy and whitespace, without having to manually look at each table.
-- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML, Markdown, and Sqlite.
 
 See [comparison with similar libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
diff --git a/docs/index.rst b/docs/index.rst
index 65376b7..d82318f 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -54,7 +54,7 @@ Release v\ |version|. (:ref:`Installation `)
     >>> tables = camelot.read_pdf('foo.pdf')
     >>> tables
-    >>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html
+    >>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite
     >>> tables[0]
     >>> tables[0].parsing_report
@@ -64,7 +64,7 @@ Release v\ |version|. (:ref:`Installation `)
         'order': 1,
         'page': 1
     }
-    >>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html
+    >>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite
     >>> tables[0].df # get a pandas DataFrame!
 
 .. csv-table::
@@ -79,9 +79,9 @@ Camelot also comes packaged with a :ref:`command-line interface `!
 Why Camelot?
 ------------
 
-- **Configurability**: Camelot gives you control over the table extraction process with its :ref:`tweakable settings `.
-- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
-- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
+- **Configurability**: Camelot gives you control over the table extraction process with :ref:`tweakable settings `.
+- **Metrics**: You can discard bad tables based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML, Markdown, and Sqlite.
 
 .. _ETL and data analysis workflows: https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873
 
diff --git a/docs/user/quickstart.rst b/docs/user/quickstart.rst
index 144a302..ec7410c 100644
--- a/docs/user/quickstart.rst
+++ b/docs/user/quickstart.rst
@@ -56,7 +56,7 @@ Woah! The accuracy is top-notch and there is less whitespace, which means the ta
 .. csv-table::
    :file: ../_static/csv/foo.csv
 
-Looks good! You can now export the table as a CSV file using its :meth:`to_csv() ` method. Alternatively you can use :meth:`to_json() `, :meth:`to_excel() ` :meth:`to_html() ` or :meth:`to_sqlite() ` methods to export the table as JSON, Excel, HTML files or a sqlite database respectively.
+Looks good! You can now export the table as a CSV file using its :meth:`to_csv() ` method. Alternatively, you can use the :meth:`to_json() `, :meth:`to_excel() `, :meth:`to_html() `, :meth:`to_markdown() ` or :meth:`to_sqlite() ` methods to export the table as a JSON, Excel, HTML, or Markdown file, or a sqlite database, respectively.
 
 ::
 
@@ -76,7 +76,7 @@ You can also export all tables at once, using the :class:`tables ` method exports files with a ``page-*-table-*`` suffix. In the example above, the single table in the list will be exported to ``foo-page-1-table-1.csv``.
If the list contains multiple tables, multiple CSV files will be created. To avoid filling up your path with multiple files, you can use ``compress=True``, which will create a single ZIP file at your path with all the CSV files.