From 9ff61c70d3380eb645d1461148aba044865bad67 Mon Sep 17 00:00:00 2001
From: Vinayak Mehta
Date: Wed, 3 Oct 2018 12:32:17 +0530
Subject: [PATCH] Update README

Update docs
---
 README.md           | 2 +-
 docs/index.rst      | 4 +++-
 docs/user/intro.rst | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 8c9fd56..03cc5b2 100644
--- a/README.md
+++ b/README.md
@@ -43,7 +43,7 @@
 
 There's a [command-line interface](https://camelot-py.readthedocs.io/en/latest/user/cli.html) too!
 
-**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
+**Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 
 ## Why Camelot?
 
diff --git a/docs/index.rst b/docs/index.rst
index e2d5857..b93451e 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -55,7 +55,9 @@ Release v\ |version|. (:ref:`Installation `)
 
 There's a :ref:`command-line interface ` too!
 
-.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
+.. note:: Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
+
+.. _explains: https://github.com/tabulapdf/tabula#why-tabula
 
 Why Camelot?
 ------------
diff --git a/docs/user/intro.rst b/docs/user/intro.rst
index a0bcd65..4ec50d7 100644
--- a/docs/user/intro.rst
+++ b/docs/user/intro.rst
@@ -19,7 +19,7 @@ Why another PDF table extraction library?
 
 There are both open (`Tabula`_, `pdf-table-extract`_) and closed-source (`smallpdf`_, `PDFTables`_) tools that are widely used to extract tables from PDF files. They either give a nice output or fail miserably. There is no in between. This is not helpful since everything in the real world, including PDF table extraction, is fuzzy. This leads to the creation of ad-hoc table extraction scripts for each type of PDF table.
 
-We created Camelot to offer users complete control over table extraction. If you can't get your desired output with the default settings, you can tweak them and get the job done!
+Camelot was created to offer users complete control over table extraction. If you can't get your desired output with the default settings, you can tweak them and get the job done!
 
 Here is a `comparison`_ of Camelot's output with outputs from other open-source PDF parsing libraries and tools.
 