Update README

Update docs
pull/2/head
Vinayak Mehta 2018-10-03 12:32:17 +05:30
parent 2a1d21af32
commit 9ff61c70d3
3 changed files with 5 additions and 3 deletions

@@ -43,7 +43,7 @@
 There's a [command-line interface](https://camelot-py.readthedocs.io/en/latest/user/cli.html) too!
-**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
+**Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 ## Why Camelot?

@@ -55,7 +55,9 @@ Release v\ |version|. (:ref:`Installation <install>`)
 There's a :ref:`command-line interface <cli>` too!
-.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
+.. note:: Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
+.. _explains: https://github.com/tabulapdf/tabula#why-tabula
 Why Camelot?
 ------------

@@ -19,7 +19,7 @@ Why another PDF table extraction library?
 There are both open (`Tabula`_, `pdf-table-extract`_) and closed-source (`smallpdf`_, `PDFTables`_) tools that are widely used to extract tables from PDF files. They either give a nice output or fail miserably. There is no in between. This is not helpful since everything in the real world, including PDF table extraction, is fuzzy. This leads to the creation of ad-hoc table extraction scripts for each type of PDF table.
-We created Camelot to offer users complete control over table extraction. If you can't get your desired output with the default settings, you can tweak them and get the job done!
+Camelot was created to offer users complete control over table extraction. If you can't get your desired output with the default settings, you can tweak them and get the job done!
 Here is a `comparison`_ of Camelot's output with outputs from other open-source PDF parsing libraries and tools.