diff --git a/README.md b/README.md index e0462cf..4975fb1 100644 --- a/README.md +++ b/README.md @@ -4,8 +4,6 @@ Camelot is a Python library and command-line tool for extracting tables from PDF ## Usage -### API -
 >>> import camelot
 >>> tables = camelot.read_pdf('foo.pdf')
@@ -22,139 +20,51 @@ Camelot is a Python library and command-line tool for extracting tables from PDF
     'page': 1
 }
 >>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html
->>> tables[0].df
+>>> tables[0].df # get a pandas dataframe!
 
-### Command-line interface - -
-$ camelot --help
-Usage: camelot [OPTIONS] FILEPATH
-
-Options:
-  -p, --pages TEXT                Comma-separated page numbers to parse.
-                                  Example: 1,3,4 or 1,4-end
-  -o, --output TEXT               Output filepath.
-  -f, --format [csv|json|excel|html]
-                                  Output file format.
-  -z, --zip                       Whether or not to create a ZIP archive.
-  -m, --mesh                      Whether or not to use Lattice method of
-                                  parsing. Stream is used by default.
-  -T, --table_area TEXT           Table areas (x1,y1,x2,y2) to process.
-                                  x1, y1
-                                  -> left-top and x2, y2 -> right-bottom
-  -split, --split_text            Whether or not to split text if it spans
-                                  across multiple cells.
-  -flag, --flag_size              (inactive) Whether or not to flag text which
-                                  has uncommon size. (Useful to detect
-                                  super/subscripts)
-  -M, --margins <FLOAT FLOAT FLOAT>...
-                                  char_margin, line_margin, word_margin for
-                                  PDFMiner.
-  -C, --columns TEXT              x-coordinates of column separators.
-  -r, --row_close_tol INTEGER     Rows will be formed by combining text
-                                  vertically within this tolerance.
-  -c, --col_close_tol INTEGER     Columns will be formed by combining text
-                                  horizontally within this tolerance.
-  -back, --process_background     (with --mesh) Whether or not to process
-                                  lines that are in background.
-  -scale, --line_size_scaling INTEGER
-                                  (with --mesh) Factor by which the page
-                                  dimensions will be divided to get smallest
-                                  length of detected lines.
-  -copy, --copy_text [h|v]        (with --mesh) Specify direction in which
-                                  text will be copied over in a spanning cell.
-  -shift, --shift_text [l|r|t|b]  (with --mesh) Specify direction in which
-                                  text in a spanning cell should flow.
-  -l, --line_close_tol INTEGER    (with --mesh) Tolerance parameter used to
-                                  merge close vertical lines and close
-                                  horizontal lines.
-  -j, --joint_close_tol INTEGER   (with --mesh) Tolerance parameter used to
-                                  decide whether the detected lines and points
-                                  lie close to each other.
-  -block, --threshold_blocksize INTEGER
-                                  (with --mesh) For adaptive thresholding,
-                                  size of a pixel neighborhood that is used to
-                                  calculate a threshold value for the pixel:
-                                  3, 5, 7, and so on.
-  -const, --threshold_constant INTEGER
-                                  (with --mesh) For adaptive thresholding,
-                                  constant subtracted from the mean or
-                                  weighted mean.
-                                  Normally, it is positive but
-                                  may be zero or negative as well.
-  -I, --iterations INTEGER        (with --mesh) Number of times for
-                                  erosion/dilation is applied.
-  -G, --geometry_type [text|table|contour|joint|line]
-                                  Plot geometry found on pdf page for
-                                  debugging.
-                                  text: Plot text objects. (Useful to get
-                                        table_area and columns coordinates)
-                                  table: Plot parsed table.
-                                  contour (with --mesh): Plot detected rectangles.
-                                  joint (with --mesh): Plot detected line intersections.
-                                  line (with --mesh): Plot detected lines.
-  --help                          Show this message and exit.
-
- -## Dependencies - -The dependencies include [tk](https://wiki.tcl.tk/3743) and [ghostscript](https://www.ghostscript.com/). +There's a [command-line tool]() too! ## Installation -Make sure you have the most updated versions for `pip` and `setuptools`. You can update them by +After [installing dependencies](), you can simply use pip:
-$ pip install -U pip setuptools
+$ pip install camelot-py
 
-### Installing dependencies +## Documentation -tk and ghostscript can be installed using your system's default package manager. - -#### Linux - -* Ubuntu - -
-$ sudo apt-get install python-tk ghostscript
-
- -* Arch Linux - -
-$ sudo pacman -S tk ghostscript
-
- -#### OS X - -
-$ brew install tcl-tk ghostscript
-
- -Finally, `cd` into the project directory and install by - -
-$ python setup.py install
-
+Th documentation is available at [link](). ## Development -### Code +The [Contributor's Guide]() has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README. + +### Source code You can check the latest sources with the command:
-$ git clone https://github.com/socialcopsdev/camelot.git
+$ git clone https://www.github.com/socialcopsdev/camelot.git
 
-### Contributing +### Setting up development environment -See [Contributing guidelines](). +You can install the development dependencies with the command: + +
+$ pip install camelot-py[dev]
+
### Testing +After installation, you can run tests using: +
 $ python setup.py test
-
\ No newline at end of file + + +## License + +This project is licensed under the [MIT License](LICENSE). \ No newline at end of file