Update docs
parent 56efcaa925
commit 4eba7b6486

@@ -4,6 +4,12 @@ Release History
 master
 ------

+**Improvements**
+
+- Add markdown export format. [#222](https://github.com/camelot-dev/camelot/pull/222/) by [Lucas Cimon](https://github.com/Lucas-C).
+
+**Documentation**
+
 - Add faq section. [#216](https://github.com/camelot-dev/camelot/pull/216) by [Stefano Fiorucci](https://github.com/anakin87).

 0.9.0 (2021-06-15)

@@ -22,7 +22,7 @@
 >>> tables = camelot.read_pdf('foo.pdf')
 >>> tables
 <TableList n=1>
->>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, sqlite
+>>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite
 >>> tables[0]
 <Table shape=(7, 7)>
 >>> tables[0].parsing_report
@@ -32,7 +32,7 @@
 'order': 1,
 'page': 1
 }
->>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_sqlite
+>>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite
 >>> tables[0].df # get a pandas DataFrame!
 </pre>

@@ -55,7 +55,7 @@ You can check out some frequently asked questions [here](https://camelot-py.read

 - **Configurability**: Camelot gives you control over the table extraction process with [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
 - **Metrics**: You can discard bad tables based on metrics like accuracy and whitespace, without having to manually look at each table.
-- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML, Markdown, and Sqlite.

 See [comparison with similar libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).

@@ -54,7 +54,7 @@ Release v\ |version|. (:ref:`Installation <install>`)
 >>> tables = camelot.read_pdf('foo.pdf')
 >>> tables
 <TableList n=1>
->>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html
+>>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite
 >>> tables[0]
 <Table shape=(7, 7)>
 >>> tables[0].parsing_report
@@ -64,7 +64,7 @@ Release v\ |version|. (:ref:`Installation <install>`)
 'order': 1,
 'page': 1
 }
->>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html
+>>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite
 >>> tables[0].df # get a pandas DataFrame!

 .. csv-table::
@@ -79,9 +79,9 @@ Camelot also comes packaged with a :ref:`command-line interface <cli>`!
 Why Camelot?
 ------------

-- **Configurability**: Camelot gives you control over the table extraction process with its :ref:`tweakable settings <advanced>`.
-- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
-- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
+- **Configurability**: Camelot gives you control over the table extraction process with :ref:`tweakable settings <advanced>`.
+- **Metrics**: You can discard bad tables based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML, Markdown, and Sqlite.

 .. _ETL and data analysis workflows: https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873

@@ -56,7 +56,7 @@ Woah! The accuracy is top-notch and there is less whitespace, which means the ta
 .. csv-table::
 :file: ../_static/csv/foo.csv

-Looks good! You can now export the table as a CSV file using its :meth:`to_csv() <camelot.core.Table.to_csv>` method. Alternatively you can use :meth:`to_json() <camelot.core.Table.to_json>`, :meth:`to_excel() <camelot.core.Table.to_excel>` :meth:`to_html() <camelot.core.Table.to_html>` or :meth:`to_sqlite() <camelot.core.Table.to_sqlite>` methods to export the table as JSON, Excel, HTML files or a sqlite database respectively.
+Looks good! You can now export the table as a CSV file using its :meth:`to_csv() <camelot.core.Table.to_csv>` method. Alternatively, you can use the :meth:`to_json() <camelot.core.Table.to_json>`, :meth:`to_excel() <camelot.core.Table.to_excel>`, :meth:`to_html() <camelot.core.Table.to_html>`, :meth:`to_markdown() <camelot.core.Table.to_markdown>` or :meth:`to_sqlite() <camelot.core.Table.to_sqlite>` methods to export the table as JSON, Excel, HTML or Markdown files, or a SQLite database, respectively.

 ::

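For readers unfamiliar with the target format, the following is a minimal sketch of what a Markdown table export produces. It is not Camelot's implementation: the helper ``rows_to_markdown`` and the sample rows are hypothetical, and the output of Camelot's own ``to_markdown()`` may differ in alignment and escaping.

```python
def rows_to_markdown(header, rows):
    """Render a header and data rows as a pipe-delimited Markdown table.

    Hypothetical helper for illustration only; not part of Camelot.
    """
    def fmt(cells):
        # Escape literal pipes so cell text cannot break the column layout.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    # Header row, then the separator row that Markdown tables require.
    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


# Made-up sample data, not taken from foo.pdf.
table_md = rows_to_markdown(["name", "qty"], [["apple", 3], ["a|b", 5]])
print(table_md)
```

The separator row of dashes is what distinguishes a Markdown table from plain pipe-delimited text, which is why the sketch emits it unconditionally.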
@@ -76,7 +76,7 @@ You can also export all tables at once, using the :class:`tables <camelot.core.T

 $ camelot --format csv --output foo.csv lattice foo.pdf

-This will export all tables as CSV files at the path specified. Alternatively, you can use ``f='json'``, ``f='excel'``, ``f='html'`` or ``f='sqlite'``.
+This will export all tables as CSV files at the path specified. Alternatively, you can use ``f='json'``, ``f='excel'``, ``f='html'``, ``f='markdown'`` or ``f='sqlite'``.

 .. note:: The :meth:`export() <camelot.core.TableList.export>` method exports files with a ``page-*-table-*`` suffix. In the example above, the single table in the list will be exported to ``foo-page-1-table-1.csv``. If the list contains multiple tables, multiple CSV files will be created. To avoid filling up your path with multiple files, you can use ``compress=True``, which will create a single ZIP file at your path with all the CSV files.

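The ``page-*-table-*`` naming described in the note above can be sketched as follows. This is an illustration of the documented naming pattern only, not Camelot's code: the helper ``expected_export_paths`` is hypothetical, and it assumes each table is identified by the ``page`` and ``order`` values shown in the parsing report.

```python
import os


def expected_export_paths(path, tables):
    """Return the file names export() is documented to create.

    Hypothetical helper: `tables` is an iterable of (page, order) pairs,
    mirroring the 'page' and 'order' keys of a table's parsing report.
    """
    # Split 'foo.csv' into ('foo', '.csv') so the suffix goes before the extension.
    root, ext = os.path.splitext(path)
    return [f"{root}-page-{page}-table-{order}{ext}" for page, order in tables]


# One table on page 1, as in the example above.
single = expected_export_paths("foo.csv", [(1, 1)])
# Assumed multi-table case: two tables on page 1 and one on page 2.
several = expected_export_paths("foo.csv", [(1, 1), (1, 2), (2, 1)])
print(single)
print(several)
```

With several tables the list grows by one file per table, which is the situation ``compress=True`` is meant to tidy up by bundling the files into a single ZIP archive.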