Update README and index.rst

2020-09-08 00:35:32 +05:30 · 2020-09-08 00:35:32 +05:30 · 2a7a4f5b34
parent 6b42094db5
commit 2a7a4f5b34
2 changed files with 37 additions and 66 deletions
--- a/README.md
+++ b/README.md
@@ -10,13 +10,13 @@
 [![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black) [![image](https://img.shields.io/badge/continous%20quality-deepsource-lightgrey)](https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge)


-**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that can help you extract tables from PDFs!

-**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), which is a web interface for Camelot!
+**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), the web interface to Camelot!

 ---

-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).
+**Here's how you can extract tables from PDFs.** You can check out the PDF used in this example [here](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).

 <pre>
 >>> import camelot
@@ -46,24 +46,27 @@
 | 2032_2     | 0.17      | 57.8          | 21.7%                | 0.3%            | 2.7%            | 1.2%           |
 | 4171_1     | 0.07      | 173.9         | 58.1%                | 1.6%            | 2.1%            | 0.5%           |

-There's a [command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html) too!
+Camelot also comes packaged with a [command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html)!

 **Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)

 ## Why Camelot?

-- **You are in control.**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
-- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
-- **Export** to multiple formats, including JSON, Excel, HTML and Sqlite.
+- **Configurability**: Camelot gives you control over the table extraction process with its [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
+- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.

-See [comparison with other PDF table extraction libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
+See [comparison with similar libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
+
+## Support the development
+
+If Camelot has helped you, please consider supporting its development with a one-time or monthly donation [on OpenCollective](https://opencollective.com/camelot).

 ## Installation

 ### Using conda

-The easiest way to install Camelot is to install it with [conda](https://conda.io/docs/), which is a package manager and  environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.
+The easiest way to install Camelot is with [conda](https://conda.io/docs/), which is a package manager and environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.

 <pre>
 $ conda install -c conda-forge camelot-py
@@ -71,7 +74,7 @@ $ conda install -c conda-forge camelot-py

 ### Using pip

-After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/bionic/python/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can simply use pip to install Camelot:
+After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/bionic/python/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can also just use pip to install Camelot:

 <pre>
 $ pip install "camelot-py[cv]"
@@ -94,40 +97,16 @@ $ pip install ".[cv]"

 ## Documentation

-Great documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).
-
-## Development
-
-The [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README.
-
-### Source code
-
-You can check the latest sources with:
-
-<pre>
-$ git clone https://www.github.com/camelot-dev/camelot
-</pre>
-
-### Setting up a development environment
-
-You can install the development dependencies easily, using pip:
-
-<pre>
-$ pip install "camelot-py[dev]"
-</pre>
-
-### Testing
-
-After installation, you can run tests using:
-
-<pre>
-$ python setup.py test
-</pre>
+The documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).

 ## Wrappers

 - [camelot-php](https://github.com/randomstate/camelot-php) provides a [PHP](https://www.php.net/) wrapper on Camelot.

+## Contributing
+
+The [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing issues, documentation, code, and tests.
+
 ## Versioning

 Camelot uses [Semantic Versioning](https://semver.org/). For the available versions, see the tags on this repository. For the changelog, you can check out [HISTORY.md](https://github.com/camelot-dev/camelot/blob/master/HISTORY.md).
@@ -135,9 +114,3 @@ Camelot uses [Semantic Versioning](https://semver.org/). For the available versi
 ## License

 This project is licensed under the MIT License, see the [LICENSE](https://github.com/camelot-dev/camelot/blob/master/LICENSE) file for details.
-
-## Support the development
-
-You can support our work on Camelot with a one-time or monthly donation [on OpenCollective](https://opencollective.com/camelot). Organizations who use camelot can also sponsor the project for an acknowledgement on [our documentation site](https://camelot-py.readthedocs.io/en/master/) and this README.
-
-Special thanks to all the users, organizations and contributors that support Camelot!
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -36,15 +36,15 @@ Release v\ |version|. (:ref:`Installation <install>`)
 .. image:: https://img.shields.io/badge/continous%20quality-deepsource-lightgrey
    :target: https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge

-**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that can help you extract tables from PDFs!

-.. note:: You can also check out `Excalibur`_, which is a web interface for Camelot!
+.. note:: You can also check out `Excalibur`_, the web interface to Camelot!

 .. _Excalibur: https://github.com/camelot-dev/excalibur

 ----

-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_.
+**Here's how you can extract tables from PDFs.** You can check out the PDF used in this example `here`_.

 .. _here: _static/pdf/foo.pdf

@@ -70,7 +70,7 @@ Release v\ |version|. (:ref:`Installation <install>`)
 .. csv-table::
  :file: _static/csv/foo.csv

-There's a :ref:`command-line interface <cli>` too!
+Camelot also comes packaged with a :ref:`command-line interface <cli>`!

 .. note:: Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)

@@ -79,27 +79,27 @@ There's a :ref:`command-line interface <cli>` too!
 Why Camelot?
 ------------

-- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
-- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_.
-- **Export** to multiple formats, including JSON, Excel and HTML.
-
-See `comparison with other PDF table extraction libraries and tools`_.
+- **Configurability**: Camelot gives you control over the table extraction process with its :ref:`tweakable settings <advanced>`.
+- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.

 .. _ETL and data analysis workflows: https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873
-.. _comparison with other PDF table extraction libraries and tools: https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools

-Support us on OpenCollective
----------------------------
+See `comparison with similar libraries and tools`_.

-If Camelot helped you extract tables from PDFs, please consider supporting its development by `becoming a backer or a sponsor on OpenCollective`_!
+.. _comparison with similar libraries and tools: https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools

-.. _becoming a backer or a sponsor on OpenCollective: https://opencollective.com/camelot
+Support the development
+-----------------------
+
+If Camelot has helped you, please consider supporting its development with a one-time or monthly donation `on OpenCollective`_!
+
+.. _on OpenCollective: https://opencollective.com/camelot

 The User Guide
 --------------

-This part of the documentation begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.
+This part of the documentation begins with some background information about why Camelot was created, takes you through some implementation details, and then focuses on step-by-step instructions for getting the most out of Camelot.

 .. toctree::
   :maxdepth: 2
@@ -115,8 +115,7 @@ This part of the documentation begins with some background information about why
 The API Documentation/Guide
 ---------------------------

-If you are looking for information on a specific function, class, or method,
-this part of the documentation is for you.
+If you are looking for information on a specific function, class, or method, this part of the documentation is for you.

 .. toctree::
   :maxdepth: 2
@@ -126,8 +125,7 @@ this part of the documentation is for you.
 The Contributor Guide
 ---------------------

-If you want to contribute to the project, this part of the documentation is for
-you.
+If you want to contribute to the project, this part of the documentation is for you.

 .. toctree::
   :maxdepth: 2