diff --git a/README.md b/README.md
index 74cef27..8324b9f 100644
--- a/README.md
+++ b/README.md
@@ -10,13 +10,13 @@
 [![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
 [![image](https://img.shields.io/badge/continous%20quality-deepsource-lightgrey)](https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge)
 
-**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that can help you extract tables from PDFs!
 
-**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), which is a web interface for Camelot!
+**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), the web interface to Camelot!
 
 ---
 
-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).
+**Here's how you can extract tables from PDFs.** You can check out the PDF used in this example [here](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).
 
 
 >>> import camelot
@@ -46,24 +46,27 @@
 | 2032_2     | 0.17      | 57.8          | 21.7%                | 0.3%            | 2.7%            | 1.2%           |
 | 4171_1     | 0.07      | 173.9         | 58.1%                | 1.6%            | 2.1%            | 0.5%           |
 
-There's a [command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html) too!
+Camelot also comes packaged with a [command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html)!
 
 **Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 
 ## Why Camelot?
 
-- **You are in control.**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
-- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
-- **Export** to multiple formats, including JSON, Excel, HTML and Sqlite.
+- **Configurability**: Camelot gives you control over the table extraction process with its [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
+- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
 
-See [comparison with other PDF table extraction libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
+See [comparison with similar libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
+
+## Support the development
+
+If Camelot has helped you, please consider supporting its development with a one-time or monthly donation [on OpenCollective](https://opencollective.com/camelot).
 
 ## Installation
 
 ### Using conda
 
-The easiest way to install Camelot is to install it with [conda](https://conda.io/docs/), which is a package manager and  environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.
+The easiest way to install Camelot is with [conda](https://conda.io/docs/), which is a package manager and environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.
 
 
 $ conda install -c conda-forge camelot-py
@@ -71,7 +74,7 @@ $ conda install -c conda-forge camelot-py
 
 ### Using pip
 
-After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/bionic/python/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can simply use pip to install Camelot:
+After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/bionic/python/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can also just use pip to install Camelot:
 
 
 $ pip install "camelot-py[cv]"
@@ -94,40 +97,16 @@ $ pip install ".[cv]"
 
 ## Documentation
 
-Great documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).
-
-## Development
-
-The [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README.
-
-### Source code
-
-You can check the latest sources with:
-
-
-$ git clone https://www.github.com/camelot-dev/camelot
-
-
-### Setting up a development environment
-
-You can install the development dependencies easily, using pip:
-
-
-$ pip install "camelot-py[dev]"
-
-
-### Testing
-
-After installation, you can run tests using:
-
-
-$ python setup.py test
-
+The documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).
 
 ## Wrappers
 
 - [camelot-php](https://github.com/randomstate/camelot-php) provides a [PHP](https://www.php.net/) wrapper on Camelot.
 
+## Contributing
+
+The [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing issues, documentation, code, and tests.
+
 ## Versioning
 
 Camelot uses [Semantic Versioning](https://semver.org/). For the available versions, see the tags on this repository. For the changelog, you can check out [HISTORY.md](https://github.com/camelot-dev/camelot/blob/master/HISTORY.md).
@@ -135,9 +114,3 @@ Camelot uses [Semantic Versioning](https://semver.org/). For the available versi
 ## License
 
 This project is licensed under the MIT License, see the [LICENSE](https://github.com/camelot-dev/camelot/blob/master/LICENSE) file for details.
-
-## Support the development
-
-You can support our work on Camelot with a one-time or monthly donation [on OpenCollective](https://opencollective.com/camelot). Organizations who use camelot can also sponsor the project for an acknowledgement on [our documentation site](https://camelot-py.readthedocs.io/en/master/) and this README.
-
-Special thanks to all the users, organizations and contributors that support Camelot!
diff --git a/docs/index.rst b/docs/index.rst
index 950240e..c3e1de4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -36,15 +36,15 @@ Release v\ |version|. (:ref:`Installation `)
 .. image:: https://img.shields.io/badge/continous%20quality-deepsource-lightgrey
    :target: https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge
 
-**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that can help you extract tables from PDFs!
 
-.. note:: You can also check out `Excalibur`_, which is a web interface for Camelot!
+.. note:: You can also check out `Excalibur`_, the web interface to Camelot!
 
 .. _Excalibur: https://github.com/camelot-dev/excalibur
 
 ----
 
-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_.
+**Here's how you can extract tables from PDFs.** You can check out the PDF used in this example `here`_.
 
 .. _here: _static/pdf/foo.pdf
 
@@ -70,7 +70,7 @@ Release v\ |version|. (:ref:`Installation `)
 .. csv-table::
    :file: _static/csv/foo.csv
 
-There's a :ref:`command-line interface ` too!
+Camelot also comes packaged with a :ref:`command-line interface `!
 
 .. note:: Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 
@@ -79,27 +79,27 @@ There's a :ref:`command-line interface ` too!
 Why Camelot?
 ------------
 
-- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
-- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_.
-- **Export** to multiple formats, including JSON, Excel and HTML.
-
-See `comparison with other PDF table extraction libraries and tools`_.
+- **Configurability**: Camelot gives you control over the table extraction process with its :ref:`tweakable settings `.
+- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_. You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
 
 .. _ETL and data analysis workflows: https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873
-.. _comparison with other PDF table extraction libraries and tools: https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools
 
-Support us on OpenCollective
-----------------------------
+See `comparison with similar libraries and tools`_.
 
-If Camelot helped you extract tables from PDFs, please consider supporting its development by `becoming a backer or a sponsor on OpenCollective`_!
+.. _comparison with similar libraries and tools: https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools
 
-.. _becoming a backer or a sponsor on OpenCollective: https://opencollective.com/camelot
+Support the development
+-----------------------
+
+If Camelot has helped you, please consider supporting its development with a one-time or monthly donation `on OpenCollective`_!
+
+.. _on OpenCollective: https://opencollective.com/camelot
 
 The User Guide
 --------------
 
-This part of the documentation begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.
+This part of the documentation begins with some background information about why Camelot was created, takes you through some implementation details, and then focuses on step-by-step instructions for getting the most out of Camelot.
 
 .. toctree::
    :maxdepth: 2
@@ -115,8 +115,7 @@ This part of the documentation begins with some background information about why
 The API Documentation/Guide
 ---------------------------
 
-If you are looking for information on a specific function, class, or method,
-this part of the documentation is for you.
+If you are looking for information on a specific function, class, or method, this part of the documentation is for you.
 
 .. toctree::
    :maxdepth: 2
@@ -126,8 +125,7 @@ this part of the documentation is for you.
 The Contributor Guide
 ---------------------
 
-If you want to contribute to the project, this part of the documentation is for
-you.
+If you want to contribute to the project, this part of the documentation is for you.
 
 .. toctree::
    :maxdepth: 2