Merge branch 'master' into format-markdown

2021-06-28 00:32:00 +05:30 · 2021-06-28 00:32:00 +05:30 · acb8f005c2
parent 955e4b62d0 216ec3c90b
commit acb8f005c2
23 changed files with 831 additions and 313 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@ -10,20 +10,25 @@ assignees: ''
 <!-- Please read the filing issues section of the contributor's guide first: https://camelot-py.readthedocs.io/en/master/dev/contributing.html -->
 **Describe the bug**
-A clear and concise description of what the bug is.
+
+<!-- A clear and concise description of what the bug is. -->
 **Steps to reproduce the bug**
-Steps used to install `camelot`:
-1. Add step here (you can add more steps too)
-Steps to reproduce the behavior:
-1. Add step here (you can add more steps too)
+<!-- Steps used to install `camelot`:
+1. Add step here (you can add more steps too) -->
+<!-- Steps to be used to reproduce behavior:
+1. Add step here (you can add more steps too) -->
 **Expected behavior**
-A clear and concise description of what you expected to happen.
+
+<!-- A clear and concise description of what you expected to happen. -->
 **Code**
-Add the Camelot code snippet that you used.
+
+<!-- Add the Camelot code snippet that you used. -->
 ```
 import camelot
@ -31,18 +36,22 @@ import camelot
 ```
 **PDF**
-Add the PDF file that you want to extract tables from.
+
+<!-- Add the PDF file that you want to extract tables from. -->
 **Screenshots**
-If applicable, add screenshots to help explain your problem.
+
+<!-- If applicable, add screenshots to help explain your problem. -->
 **Environment**
- - OS: [e.g. MacOS]
- - Python version:
- - Numpy version:
- - OpenCV version:
- - Ghostscript version:
- - Camelot version:
+
+- OS: [e.g. macOS]
+- Python version:
+- Numpy version:
+- OpenCV version:
+- Ghostscript version:
+- Camelot version:
 **Additional context**
-Add any other context about the problem here.
+
+<!-- Add any other context about the problem here. -->
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@ -0,0 +1,23 @@
+name: tests
+on: [pull_request]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: [3.6, 3.7, 3.8, 3.9]
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install camelot with dependencies
+        run: |
+          make install
+      - name: Test with pytest
+        run: |
+          make test
--- a/.travis.yml
+++ b/.travis.yml
@ -1,29 +0,0 @@
-sudo: true
-language: python
-cache: pip
-addons:
-  apt:
-    update: true
-install:
-  - make install
-jobs:
-  include:
-    - stage: test
-      script:
-        - make test
-      python: '3.6'
-    - stage: test
-      script:
-        - make test
-      python: '3.7'
-      dist: xenial
-    - stage: test
-      script:
-        - make test
-      python: '3.8'
-      dist: xenial
-    - stage: coverage
-      python: '3.8'
-      script:
-        - make test
-        - codecov --verbose
--- a/HISTORY.md
+++ b/HISTORY.md
@ -4,6 +4,33 @@ Release History
 master
 ------
+- Add faq section. [#216](https://github.com/camelot-dev/camelot/pull/216) by [Stefano Fiorucci](https://github.com/anakin87).
+0.9.0 (2021-06-15)
+------------------
+**Bugfixes**
+- Fix use of resolution argument to generate image with ghostscript. [#231](https://github.com/camelot-dev/camelot/pull/231) by [Tiago Samaha Cordeiro](https://github.com/tiagosamaha).
+- [#15](https://github.com/camelot-dev/camelot/issues/15) Fix duplicate strings being assigned to the same cell. [#206](https://github.com/camelot-dev/camelot/pull/206) by [Eduardo Gonzalez Lopez de Murillas](https://github.com/edugonza).
+- Save plot when filename is specified. [#121](https://github.com/camelot-dev/camelot/pull/121) by [Jens Diemer](https://github.com/jedie).
+- Close file streams explicitly. [#202](https://github.com/camelot-dev/camelot/pull/202) by [Martin Abente Lahaye](https://github.com/tchx84).
+- Use correct re.sub signature. [#186](https://github.com/camelot-dev/camelot/pull/186) by [pevisscher](https://github.com/pevisscher).
+- [#183](https://github.com/camelot-dev/camelot/issues/183) Fix UnicodeEncodeError when using Stream flavor by adding encoding kwarg to `to_html`. [#188](https://github.com/camelot-dev/camelot/pull/188) by [Stefano Fiorucci](https://github.com/anakin87).
+- [#179](https://github.com/camelot-dev/camelot/issues/179) Fix `max() arg is an empty sequence` error on PDFs with blank pages. [#189](https://github.com/camelot-dev/camelot/pull/189) by Vinayak Mehta.
+**Improvements**
+- Add `line_overlap` and `boxes_flow` to `LAParams`. [#219](https://github.com/camelot-dev/camelot/pull/219) by [Arnie97](https://github.com/Arnie97).
+- [Add bug report template.](https://github.com/camelot-dev/camelot/commit/0a3944e54d133b701edfe9c7546ff11289301ba8)
+- Move from [Travis to GitHub Actions](https://github.com/camelot-dev/camelot/pull/241).
+- Update `.readthedocs.yml` and [remove requirements.txt](https://github.com/camelot-dev/camelot/commit/7ab5db39d07baa4063f975e9e00f6073340e04c1#diff-cde814ef2f549dc093f5b8fc533b7e8f47e7b32a8081e0760e57d5c25a1139d9)
+**Documentation**
+- [#193](https://github.com/camelot-dev/camelot/issues/193) Add better checks to confirm proper installation of ghostscript. [#196](https://github.com/camelot-dev/camelot/pull/196) by [jimhall](https://github.com/jimhall).
+- Update `advanced.rst` plotting examples. [#119](https://github.com/camelot-dev/camelot/pull/119) by [Jens Diemer](https://github.com/jedie).
 0.8.2 (2020-07-27)
 ------------------
--- a/2
+++ b/2
@ -1,6 +1,6 @@
 MIT License
-Copyright (c) 2019-2020 Camelot Developers
+Copyright (c) 2019-2021 Camelot Developers
 Copyright (c) 2018-2019 Peeply Private Ltd (Singapore)
 Permission is hereby granted, free of charge, to any person obtaining a copy
--- a/README.md
+++ b/README.md
@ -4,11 +4,10 @@
 # Camelot: PDF Table Extraction for Humans
-[![Build Status](https://travis-ci.org/camelot-dev/camelot.svg?branch=master)](https://travis-ci.org/camelot-dev/camelot) [![Documentation Status](https://readthedocs.org/projects/camelot-py/badge/?version=master)](https://camelot-py.readthedocs.io/en/master/)
+![Build Status](https://github.com/camelot-dev/camelot/actions/workflows/tests.yml/badge.svg) [![Documentation Status](https://readthedocs.org/projects/camelot-py/badge/?version=master)](https://camelot-py.readthedocs.io/en/master/)
 [![codecov.io](https://codecov.io/github/camelot-dev/camelot/badge.svg?branch=master&service=github)](https://codecov.io/github/camelot-dev/camelot?branch=master)
 [![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![Gitter chat](https://badges.gitter.im/camelot-dev/Lobby.png)](https://gitter.im/camelot-dev/Lobby)
-[![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black) [![image](https://img.shields.io/badge/continous%20quality-deepsource-lightgrey)](https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge)
+[![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
 **Camelot** is a Python library that can help you extract tables from PDFs!
@ -50,10 +49,12 @@ Camelot also comes packaged with a [command-line interface](https://camelot-py.r
 **Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
+You can check out some frequently asked questions [here](https://camelot-py.readthedocs.io/en/master/user/faq.html).
 ## Why Camelot?
-- **Configurability**: Camelot gives you control over the table extraction process with its [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
+- **Configurability**: Camelot gives you control over the table extraction process with [tweakable settings](https://camelot-py.readthedocs.io/en/master/user/advanced.html).
-- **Metrics**: Bad tables can be discarded based on metrics like accuracy and whitespace, without having to manually look at each table.
+- **Metrics**: You can discard bad tables based on metrics like accuracy and whitespace, without having to manually look at each table.
 - **Output**: Each table is extracted into a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873). You can also export tables to multiple formats, which include CSV, JSON, Excel, HTML and Sqlite.
 See [comparison with similar libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
--- a/camelot/version.py
+++ b/camelot/version.py
@ -1,6 +1,6 @@
 # -*- coding: utf-8 -*-
-VERSION = (0, 8, 2)
+VERSION = (0, 9, 0)
 PRERELEASE = None  # alpha, beta or rc
 REVISION = None
--- a/camelot/core.py
+++ b/camelot/core.py
@ -55,7 +55,9 @@ class TextEdge(object):
        x = round(self.x, 2)
        y0 = round(self.y0, 2)
        y1 = round(self.y1, 2)
-        return f"<TextEdge x={x} y0={y0} y1={y1} align={self.align} valid={self.is_valid}>"
+        return (
+            f"<TextEdge x={x} y0={y0} y1={y1} align={self.align} valid={self.is_valid}>"
+        )
    def update_coords(self, x, y0, edge_tol=50):
        """Updates the text edge's x and bottom y coordinates and sets
@ -102,8 +104,7 @@ class TextEdges(object):
        return None
    def add(self, textline, align):
-        """Adds a new text edge to the current dict.
-        """
+        """Adds a new text edge to the current dict."""
        x = self.get_x_coord(textline, align)
        y0 = textline.y0
        y1 = textline.y1
@ -111,8 +112,7 @@ class TextEdges(object):
        self._textedges[align].append(te)
    def update(self, textline):
-        """Updates an existing text edge in the current dict.
-        """
+        """Updates an existing text edge in the current dict."""
        for align in ["left", "right", "middle"]:
            x_coord = self.get_x_coord(textline, align)
            idx = self.find(x_coord, align)
@ -304,8 +304,7 @@ class Cell(object):
    @property
    def bound(self):
-        """The number of sides on which the cell is bounded.
-        """
+        """The number of sides on which the cell is bounded."""
        return self.top + self.bottom + self.left + self.right
@ -361,8 +360,7 @@ class Table(object):
    @property
    def data(self):
-        """Returns two-dimensional list of strings in table.
-        """
+        """Returns two-dimensional list of strings in table."""
        d = []
        for row in self.cells:
            d.append([cell.text.strip() for cell in row])
@ -383,8 +381,7 @@ class Table(object):
        return report
    def set_all_edges(self):
-        """Sets all table edges to True.
-        """
+        """Sets all table edges to True."""
        for row in self.cells:
            for cell in row:
                cell.left = cell.right = cell.top = cell.bottom = True
@ -526,8 +523,7 @@ class Table(object):
        return self
    def set_border(self):
-        """Sets table border edges to True.
-        """
+        """Sets table border edges to True."""
        for r in range(len(self.rows)):
            self.cells[r][0].left = True
            self.cells[r][len(self.cols) - 1].right = True
--- a/camelot/ext/ghostscript/init.py
+++ b/camelot/ext/ghostscript/init.py
@ -81,8 +81,7 @@ class __Ghostscript(object):
 def Ghostscript(*args, **kwargs):
-    """Factory function for setting up a Ghostscript instance
-    """
+    """Factory function for setting up a Ghostscript instance"""
    global __instance__
    # Ghostscript only supports a single instance
    if __instance__ is None:
--- a/camelot/handlers.py
+++ b/camelot/handlers.py
@ -167,9 +167,7 @@ class PDFHandler(object):
        with TemporaryDirectory() as tempdir:
            for p in self.pages:
                self._save_page(self.filepath, p, tempdir)
-            pages = [
-                os.path.join(tempdir, f"page-{p}.pdf") for p in self.pages
-            ]
+            pages = [os.path.join(tempdir, f"page-{p}.pdf") for p in self.pages]
            parser = Lattice(**kwargs) if flavor == "lattice" else Stream(**kwargs)
            for p in pages:
                t = parser.extract_tables(
--- a/camelot/parsers/base.py
+++ b/camelot/parsers/base.py
@ -6,8 +6,7 @@ from ..utils import get_page_layout, get_text_objects
 class BaseParser(object):
-    """Defines a base parser.
-    """
+    """Defines a base parser."""
    def _generate_layout(self, filename, layout_kwargs):
        self.filename = filename
--- a/camelot/parsers/lattice.py
+++ b/camelot/parsers/lattice.py
@ -211,8 +211,8 @@ class Lattice(BaseParser):
        from ..ext.ghostscript import Ghostscript
        self.imagename = "".join([self.rootname, ".png"])
-        gs_call = "-q -sDEVICE=png16m -o {} -r300 {}".format(
+        gs_call = "-q -sDEVICE=png16m -o {} -r{} {}".format(
-            self.imagename, self.filename
+            self.imagename, self.resolution, self.filename
        )
        gs_call = gs_call.encode().split()
        null = open(os.devnull, "wb")
--- a/camelot/parsers/stream.py
+++ b/camelot/parsers/stream.py
@ -65,7 +65,7 @@ class Stream(BaseParser):
        edge_tol=50,
        row_tol=2,
        column_tol=0,
-        **kwargs
+        **kwargs,
    ):
        self.table_regions = table_regions
        self.table_areas = table_areas
@ -362,10 +362,10 @@ class Stream(BaseParser):
                    if len(elements):
                        ncols = max(set(elements), key=elements.count)
                    else:
-                        warnings.warn(
+                        warnings.warn(f"No tables found in table area {table_idx + 1}")
-                            f"No tables found in table area {table_idx + 1}"
+                cols = [
-                        )
+                    (t.x0, t.x1) for r in rows_grouped if len(r) == ncols for t in r
-                cols = [(t.x0, t.x1) for r in rows_grouped if len(r) == ncols for t in r]
+                ]
                cols = self._merge_columns(sorted(cols), column_tol=self.column_tol)
                inner_text = []
                for i in range(1, len(cols)):
--- a/camelot/plotting.py
+++ b/camelot/plotting.py
@ -34,13 +34,9 @@ class PlotMethods(object):
            raise ImportError("matplotlib is required for plotting.")
        if table.flavor == "lattice" and kind in ["textedge"]:
-            raise NotImplementedError(
-                f"Lattice flavor does not support kind='{kind}'"
-            )
+            raise NotImplementedError(f"Lattice flavor does not support kind='{kind}'")
        elif table.flavor == "stream" and kind in ["joint", "line"]:
-            raise NotImplementedError(
-                f"Stream flavor does not support kind='{kind}'"
-            )
+            raise NotImplementedError(f"Stream flavor does not support kind='{kind}'")
        plot_method = getattr(self, kind)
        fig = plot_method(table)
@ -48,7 +44,7 @@ class PlotMethods(object):
        if filename is not None:
            fig.savefig(filename)
            return None
-            
+
        return fig
    def text(self, table):
--- a/camelot/utils.py
+++ b/camelot/utils.py
@ -838,23 +838,27 @@ def compute_whitespace(d):
 def get_page_layout(
    filename,
+    line_overlap=0.5,
    char_margin=1.0,
    line_margin=0.5,
    word_margin=0.1,
+    boxes_flow=0.5,
    detect_vertical=True,
    all_texts=True,
 ):
    """Returns a PDFMiner LTPage object and page dimension of a single
-    page pdf. See https://euske.github.io/pdfminer/ to get definitions
+    page pdf. To get the definitions of kwargs, see
-    of kwargs.
+    https://pdfminersix.rtfd.io/en/latest/reference/composable.html.
    Parameters
    ----------
    filename : string
        Path to pdf file.
+    line_overlap : float
    char_margin : float
    line_margin : float
    word_margin : float
+    boxes_flow : float
    detect_vertical : bool
    all_texts : bool
@ -870,11 +874,15 @@ def get_page_layout(
        parser = PDFParser(f)
        document = PDFDocument(parser)
        if not document.is_extractable:
-            raise PDFTextExtractionNotAllowed(f"Text extraction is not allowed: {filename}")
+            raise PDFTextExtractionNotAllowed(
+                f"Text extraction is not allowed: {filename}"
+            )
        laparams = LAParams(
+            line_overlap=line_overlap,
            char_margin=char_margin,
            line_margin=line_margin,
            word_margin=word_margin,
+            boxes_flow=boxes_flow,
            detect_vertical=detect_vertical,
            all_texts=all_texts,
        )
--- a/docs/_themes/flask_theme_support.py
+++ b/docs/_themes/flask_theme_support.py
@ -1,7 +1,19 @@
 # flasky pygments style based on tango style
 from pygments.style import Style
-from pygments.token import Keyword, Name, Comment, String, Error, \
+from pygments.token import (
-     Number, Operator, Generic, Whitespace, Punctuation, Other, Literal
+    Keyword,
+    Name,
+    Comment,
+    String,
+    Error,
+    Number,
+    Operator,
+    Generic,
+    Whitespace,
+    Punctuation,
+    Other,
+    Literal,
+)
 class FlaskyStyle(Style):
@ -11,76 +23,67 @@ class FlaskyStyle(Style):
    styles = {
        # No corresponding class for the following:
        # Text:                    "", # class:  ''
-        Whitespace:                "underline #f8f8f8",       # class: 'w'
+        Whitespace: "underline #f8f8f8",  # class: 'w'
-        Error:                     "#a40000 border:#ef2929",  # class: 'err'
+        Error: "#a40000 border:#ef2929",  # class: 'err'
-        Other:                     "#000000",                 # class 'x'
+        Other: "#000000",  # class 'x'
-
+        Comment: "italic #8f5902",  # class: 'c'
-        Comment:                   "italic #8f5902",  # class: 'c'
+        Comment.Preproc: "noitalic",  # class: 'cp'
-        Comment.Preproc:           "noitalic",        # class: 'cp'
+        Keyword: "bold #004461",  # class: 'k'
-
+        Keyword.Constant: "bold #004461",  # class: 'kc'
-        Keyword:                   "bold #004461",    # class: 'k'
+        Keyword.Declaration: "bold #004461",  # class: 'kd'
-        Keyword.Constant:          "bold #004461",    # class: 'kc'
+        Keyword.Namespace: "bold #004461",  # class: 'kn'
-        Keyword.Declaration:       "bold #004461",    # class: 'kd'
+        Keyword.Pseudo: "bold #004461",  # class: 'kp'
-        Keyword.Namespace:         "bold #004461",    # class: 'kn'
+        Keyword.Reserved: "bold #004461",  # class: 'kr'
-        Keyword.Pseudo:            "bold #004461",    # class: 'kp'
+        Keyword.Type: "bold #004461",  # class: 'kt'
-        Keyword.Reserved:          "bold #004461",    # class: 'kr'
+        Operator: "#582800",  # class: 'o'
-        Keyword.Type:              "bold #004461",    # class: 'kt'
+        Operator.Word: "bold #004461",  # class: 'ow' - like keywords
-
+        Punctuation: "bold #000000",  # class: 'p'
-        Operator:                  "#582800",   # class: 'o'
-        Operator.Word:             "bold #004461",   # class: 'ow' - like keywords
-        Punctuation:               "bold #000000",   # class: 'p'
        # because special names such as Name.Class, Name.Function, etc.
        # are not recognized as such later in the parsing, we choose them
        # to look the same as ordinary variables.
-        Name:                      "#000000",         # class: 'n'
+        Name: "#000000",  # class: 'n'
-        Name.Attribute:            "#c4a000",         # class: 'na' - to be revised
+        Name.Attribute: "#c4a000",  # class: 'na' - to be revised
-        Name.Builtin:              "#004461",         # class: 'nb'
+        Name.Builtin: "#004461",  # class: 'nb'
-        Name.Builtin.Pseudo:       "#3465a4",         # class: 'bp'
+        Name.Builtin.Pseudo: "#3465a4",  # class: 'bp'
-        Name.Class:                "#000000",         # class: 'nc' - to be revised
+        Name.Class: "#000000",  # class: 'nc' - to be revised
-        Name.Constant:             "#000000",         # class: 'no' - to be revised
+        Name.Constant: "#000000",  # class: 'no' - to be revised
-        Name.Decorator:            "#888",            # class: 'nd' - to be revised
+        Name.Decorator: "#888",  # class: 'nd' - to be revised
-        Name.Entity:               "#ce5c00",         # class: 'ni'
+        Name.Entity: "#ce5c00",  # class: 'ni'
-        Name.Exception:            "bold #cc0000",    # class: 'ne'
+        Name.Exception: "bold #cc0000",  # class: 'ne'
-        Name.Function:             "#000000",         # class: 'nf'
+        Name.Function: "#000000",  # class: 'nf'
-        Name.Property:             "#000000",         # class: 'py'
+        Name.Property: "#000000",  # class: 'py'
-        Name.Label:                "#f57900",         # class: 'nl'
+        Name.Label: "#f57900",  # class: 'nl'
-        Name.Namespace:            "#000000",         # class: 'nn' - to be revised
+        Name.Namespace: "#000000",  # class: 'nn' - to be revised
-        Name.Other:                "#000000",         # class: 'nx'
+        Name.Other: "#000000",  # class: 'nx'
-        Name.Tag:                  "bold #004461",    # class: 'nt' - like a keyword
+        Name.Tag: "bold #004461",  # class: 'nt' - like a keyword
-        Name.Variable:             "#000000",         # class: 'nv' - to be revised
+        Name.Variable: "#000000",  # class: 'nv' - to be revised
-        Name.Variable.Class:       "#000000",         # class: 'vc' - to be revised
+        Name.Variable.Class: "#000000",  # class: 'vc' - to be revised
-        Name.Variable.Global:      "#000000",         # class: 'vg' - to be revised
+        Name.Variable.Global: "#000000",  # class: 'vg' - to be revised
-        Name.Variable.Instance:    "#000000",         # class: 'vi' - to be revised
+        Name.Variable.Instance: "#000000",  # class: 'vi' - to be revised
-
+        Number: "#990000",  # class: 'm'
-        Number:                    "#990000",         # class: 'm'
+        Literal: "#000000",  # class: 'l'
-
+        Literal.Date: "#000000",  # class: 'ld'
-        Literal:                   "#000000",         # class: 'l'
+        String: "#4e9a06",  # class: 's'
-        Literal.Date:              "#000000",         # class: 'ld'
+        String.Backtick: "#4e9a06",  # class: 'sb'
-
+        String.Char: "#4e9a06",  # class: 'sc'
-        String:                    "#4e9a06",         # class: 's'
+        String.Doc: "italic #8f5902",  # class: 'sd' - like a comment
-        String.Backtick:           "#4e9a06",         # class: 'sb'
+        String.Double: "#4e9a06",  # class: 's2'
-        String.Char:               "#4e9a06",         # class: 'sc'
+        String.Escape: "#4e9a06",  # class: 'se'
-        String.Doc:                "italic #8f5902",  # class: 'sd' - like a comment
+        String.Heredoc: "#4e9a06",  # class: 'sh'
-        String.Double:             "#4e9a06",         # class: 's2'
+        String.Interpol: "#4e9a06",  # class: 'si'
-        String.Escape:             "#4e9a06",         # class: 'se'
+        String.Other: "#4e9a06",  # class: 'sx'
-        String.Heredoc:            "#4e9a06",         # class: 'sh'
+        String.Regex: "#4e9a06",  # class: 'sr'
-        String.Interpol:           "#4e9a06",         # class: 'si'
+        String.Single: "#4e9a06",  # class: 's1'
-        String.Other:              "#4e9a06",         # class: 'sx'
+        String.Symbol: "#4e9a06",  # class: 'ss'
-        String.Regex:              "#4e9a06",         # class: 'sr'
+        Generic: "#000000",  # class: 'g'
-        String.Single:             "#4e9a06",         # class: 's1'
+        Generic.Deleted: "#a40000",  # class: 'gd'
-        String.Symbol:             "#4e9a06",         # class: 'ss'
+        Generic.Emph: "italic #000000",  # class: 'ge'
-
+        Generic.Error: "#ef2929",  # class: 'gr'
-        Generic:                   "#000000",         # class: 'g'
+        Generic.Heading: "bold #000080",  # class: 'gh'
-        Generic.Deleted:           "#a40000",         # class: 'gd'
+        Generic.Inserted: "#00A000",  # class: 'gi'
-        Generic.Emph:              "italic #000000",  # class: 'ge'
+        Generic.Output: "#888",  # class: 'go'
-        Generic.Error:             "#ef2929",         # class: 'gr'
+        Generic.Prompt: "#745334",  # class: 'gp'
-        Generic.Heading:           "bold #000080",    # class: 'gh'
+        Generic.Strong: "bold #000000",  # class: 'gs'
-        Generic.Inserted:          "#00A000",         # class: 'gi'
+        Generic.Subheading: "bold #800080",  # class: 'gu'
-        Generic.Output:            "#888",            # class: 'go'
+        Generic.Traceback: "bold #a40000",  # class: 'gt'
-        Generic.Prompt:            "#745334",         # class: 'gp'
-        Generic.Strong:            "bold #000000",    # class: 'gs'
-        Generic.Subheading:        "bold #800080",    # class: 'gu'
-        Generic.Traceback:         "bold #a40000",    # class: 'gt'
    }
--- a/docs/conf.py
+++ b/docs/conf.py
@ -22,8 +22,8 @@ import sys
 # sys.path.insert(0, os.path.abspath('..'))
 # Insert Camelot's path into the system.
-sys.path.insert(0, os.path.abspath('..'))
+sys.path.insert(0, os.path.abspath(".."))
-sys.path.insert(0, os.path.abspath('_themes'))
+sys.path.insert(0, os.path.abspath("_themes"))
 import camelot
@ -38,33 +38,33 @@ import camelot
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
 extensions = [
-    'sphinx.ext.autodoc',
+    "sphinx.ext.autodoc",
-    'sphinx.ext.napoleon',
+    "sphinx.ext.napoleon",
-    'sphinx.ext.intersphinx',
+    "sphinx.ext.intersphinx",
-    'sphinx.ext.todo',
+    "sphinx.ext.todo",
-    'sphinx.ext.viewcode',
+    "sphinx.ext.viewcode",
 ]
 # Add any paths that contain templates here, relative to this directory.
-templates_path = ['_templates']
+templates_path = ["_templates"]
 # The suffix(es) of source filenames.
 # You can specify multiple suffix as a list of string:
 #
 # source_suffix = ['.rst', '.md']
-source_suffix = '.rst'
+source_suffix = ".rst"
 # The encoding of source files.
 #
 # source_encoding = 'utf-8-sig'
 # The master toctree document.
-master_doc = 'index'
+master_doc = "index"
 # General information about the project.
-project = u'Camelot'
+project = u"Camelot"
-copyright = u'2020, Camelot Developers'
+copyright = u"2021, Camelot Developers"
-author = u'Vinayak Mehta'
+author = u"Vinayak Mehta"
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
@ -94,7 +94,7 @@ language = None
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This patterns also effect to html_static_path and html_extra_path
-exclude_patterns = ['_build']
+exclude_patterns = ["_build"]
 # The reST default role (used for this markup: `text`) to use for all
 # documents.
@ -114,7 +114,7 @@ add_module_names = True
 # show_authors = False
 # The name of the Pygments (syntax highlighting) style to use.
-pygments_style = 'flask_theme_support.FlaskyStyle'
+pygments_style = "flask_theme_support.FlaskyStyle"
 # A list of ignored prefixes for module index sorting.
 # modindex_common_prefix = []
@ -130,18 +130,18 @@ todo_include_todos = True
 # The theme to use for HTML and HTML Help pages.  See the documentation for
 # a list of builtin themes.
-html_theme = 'alabaster'
+html_theme = "alabaster"
 # Theme options are theme-specific and customize the look and feel of a theme
 # further.  For a list of options available for each theme, see the
 # documentation.
 html_theme_options = {
-    'show_powered_by': False,
+    "show_powered_by": False,
-    'github_user': 'camelot-dev',
+    "github_user": "camelot-dev",
-    'github_repo': 'camelot',
+    "github_repo": "camelot",
-    'github_banner': True,
+    "github_banner": True,
-    'show_related': False,
+    "show_related": False,
-    'note_bg': '#FFF59C'
+    "note_bg": "#FFF59C",
 }
 # Add any paths that contain custom themes here, relative to this directory.
@ -164,12 +164,12 @@ html_theme_options = {
 # The name of an image file (relative to this directory) to use as a favicon of
 # the docs.  This file should be a Windows icon file (.ico) being 16x16 or 32x32
 # pixels large.
-html_favicon = '_static/favicon.ico'
+html_favicon = "_static/favicon.ico"
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+html_static_path = ["_static"]
 # Add any extra paths that contain custom files (such as robots.txt or
 # .htaccess) here, relative to this directory. These files are copied
@ -189,10 +189,21 @@ html_use_smartypants = True
 # Custom sidebar templates, maps document names to template names.
 html_sidebars = {
-    'index': ['sidebarintro.html', 'relations.html', 'sourcelink.html',
+    "index": [
-              'searchbox.html', 'hacks.html'],
+        "sidebarintro.html",
-    '**': ['sidebarlogo.html', 'localtoc.html', 'relations.html',
+        "relations.html",
-           'sourcelink.html', 'searchbox.html', 'hacks.html']
+        "sourcelink.html",
+        "searchbox.html",
+        "hacks.html",
+    ],
+    "**": [
+        "sidebarlogo.html",
+        "localtoc.html",
+        "relations.html",
+        "sourcelink.html",
+        "searchbox.html",
+        "hacks.html",
+    ],
 }
 # Additional templates that should be rendered to pages, maps page names to
@ -249,34 +260,30 @@ html_show_copyright = True
 # html_search_scorer = 'scorer.js'
 # Output file base name for HTML help builder.
-htmlhelp_basename = 'Camelotdoc'
+htmlhelp_basename = "Camelotdoc"
 # -- Options for LaTeX output ---------------------------------------------
 latex_elements = {
-     # The paper size ('letterpaper' or 'a4paper').
+    # The paper size ('letterpaper' or 'a4paper').
-     #
+    #
-     # 'papersize': 'letterpaper',
+    # 'papersize': 'letterpaper',
-
+    # The font size ('10pt', '11pt' or '12pt').
-     # The font size ('10pt', '11pt' or '12pt').
+    #
-     #
+    # 'pointsize': '10pt',
-     # 'pointsize': '10pt',
+    # Additional stuff for the LaTeX preamble.
-
+    #
-     # Additional stuff for the LaTeX preamble.
+    # 'preamble': '',
-     #
+    # Latex figure (float) alignment
-     # 'preamble': '',
+    #
-
+    # 'figure_align': 'htbp',
-     # Latex figure (float) alignment
-     #
-     # 'figure_align': 'htbp',
 }
 # Grouping the document tree into LaTeX files. List of tuples
 # (source start file, target name, title,
 #  author, documentclass [howto, manual, or own class]).
 latex_documents = [
-    (master_doc, 'Camelot.tex', u'Camelot Documentation',
+    (master_doc, "Camelot.tex", u"Camelot Documentation", u"Vinayak Mehta", "manual"),
-     u'Vinayak Mehta', 'manual'),
 ]
 # The name of an image file (relative to this directory) to place at the top of
@ -316,10 +323,7 @@ latex_documents = [
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
-man_pages = [
+man_pages = [(master_doc, "Camelot", u"Camelot Documentation", [author], 1)]
    (master_doc, 'Camelot', u'Camelot Documentation',
     [author], 1)
 ]
 # If true, show URL addresses after external links.
 #
@ -332,9 +336,15 @@ man_pages = [
 # (source start file, target name, title, author,
 #  dir menu entry, description, category)
 texinfo_documents = [
-    (master_doc, 'Camelot', u'Camelot Documentation',
+    (
-     author, 'Camelot', 'One line description of project.',
+        master_doc,
-     'Miscellaneous'),
+        "Camelot",
        u"Camelot Documentation",
        author,
        "Camelot",
        "One line description of project.",
        "Miscellaneous",
    ),
 ]
 # Documents to append as an appendix to all manuals.
@ -356,6 +366,6 @@ texinfo_documents = [
 # Example configuration for intersphinx: refer to the Python standard library.
 intersphinx_mapping = {
-    'https://docs.python.org/2': None,
+    "https://docs.python.org/2": None,
-    'http://pandas.pydata.org/pandas-docs/stable': None
+    "http://pandas.pydata.org/pandas-docs/stable": None,
 }
--- a/docs/index.rst
+++ b/docs/index.rst
@ -110,6 +110,7 @@ This part of the documentation begins with some background information about why
   user/how-it-works
   user/quickstart
   user/advanced
   user/faq
   user/cli
 The API Documentation/Guide
--- a/docs/user/advanced.rst
+++ b/docs/user/advanced.rst
@ -618,7 +618,7 @@ Tweak layout generation
 Camelot is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences. In some cases (such as `#170 <https://github.com/camelot-dev/camelot/issues/170>`_ and `#215 <https://github.com/camelot-dev/camelot/issues/215>`_), PDFMiner can group characters that should belong to the same sentence into separate sentences.
-To deal with such cases, you can tweak PDFMiner's `LAParams kwargs <https://github.com/euske/pdfminer/blob/master/pdfminer/layout.py#L33>`_ to improve layout generation, by passing the keyword arguments as a dict using ``layout_kwargs`` in :meth:`read_pdf() <camelot.read_pdf>`. To know more about the parameters you can tweak, you can check out `PDFMiner docs <https://euske.github.io/pdfminer/>`_.
+To deal with such cases, you can tweak PDFMiner's `LAParams kwargs <https://github.com/euske/pdfminer/blob/master/pdfminer/layout.py#L33>`_ to improve layout generation, by passing the keyword arguments as a dict using ``layout_kwargs`` in :meth:`read_pdf() <camelot.read_pdf>`. To know more about the parameters you can tweak, you can check out `PDFMiner docs <https://pdfminersix.rtfd.io/en/latest/reference/composable.html>`_.
 ::
--- a/docs/user/faq.rst
+++ b/docs/user/faq.rst
@ -0,0 +1,56 @@
 .. _faq:
 Frequently Asked Questions
 ==========================
 This part of the documentation answers some common questions. To add questions, please open an issue `here <https://github.com/camelot-dev/camelot/issues/new>`_.
 Does Camelot work with image-based PDFs?
 ----------------------------------------
 **No**, Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains <https://github.com/tabulapdf/tabula#why-tabula>`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 How to reduce memory usage for long PDFs?
 -----------------------------------------
 During table extraction from long PDF documents, RAM usage can grow significantly.
 A simple workaround is to divide the extraction into chunks, and save extracted data to disk at the end of every chunk.
 For more details, check out this code snippet from `@anakin87 <https://github.com/anakin87>`_:
 ::
    import camelot
    def chunks(l, n):
        """Yield successive n-sized chunks from l."""
        for i in range(0, len(l), n):
            yield l[i : i + n]
    def extract_tables(filepath, pages, chunk_size=50, export_path=".", params={}):
        """
        Divide the extraction work into chunks of chunk_size pages. At the end
        of every chunk, save data on disk and free RAM.
        filepath : str
            Filepath or URL of the PDF file.
        pages : str
            Comma-separated page numbers.
            Example: '1,3,4' or '1,4-end' or 'all'.
        chunk_size : int, optional (default: 50)
            Number of pages to process per chunk.
        """
        # get list of pages from camelot.handlers.PDFHandler
        handler = camelot.handlers.PDFHandler(filepath)
        page_list = handler._get_pages(filepath, pages=pages)
        # chunk pages list (the size parameter must not shadow the chunks() helper)
        page_chunks = list(chunks(page_list, chunk_size))
        # extraction and export
        for chunk in page_chunks:
            pages_string = str(chunk).replace("[", "").replace("]", "")
            tables = camelot.read_pdf(filepath, pages=pages_string, **params)
            tables.export(f"{export_path}/tables.csv")
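 The page-chunking step can be tried on its own, without a PDF. The following is a minimal sketch that assumes the page list has already been resolved to plain integers; ``page_list`` and ``page_strings`` are illustrative names, and the list of seven pages is a made-up stand-in for what the handler would return.

```python
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i : i + n]


# Hypothetical stand-in for the resolved page numbers of a 7-page PDF.
page_list = list(range(1, 8))

# Build one comma-separated page string per chunk of 3 pages, in the same
# form that the snippet above passes to the ``pages`` argument.
page_strings = [", ".join(str(p) for p in chunk) for chunk in chunks(page_list, 3)]

print(page_strings)  # ['1, 2, 3', '4, 5, 6', '7']
```

 The last chunk is simply shorter when the number of pages is not a multiple of the chunk size.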
--- a/docs/user/install-deps.rst
+++ b/docs/user/install-deps.rst
@ -43,8 +43,9 @@ For Ubuntu/MacOS::
 For Windows::
    >>> import ctypes
    >>> from ctypes.util import find_library
-    >>> find_library("".join(("gsdll", str(ctypes.sizeof(ctypes.c_voidp) * 8), ".dll"))
+    >>> find_library("".join(("gsdll", str(ctypes.sizeof(ctypes.c_voidp) * 8), ".dll")))
    <name-of-ghostscript-library-on-windows>
 **Check:** The output of the ``find_library`` function should not be empty.
--- a/setup.py
+++ b/setup.py
@ -6,39 +6,38 @@ from setuptools import find_packages
 here = os.path.abspath(os.path.dirname(__file__))
 about = {}
-with open(os.path.join(here, 'camelot', '__version__.py'), 'r') as f:
+with open(os.path.join(here, "camelot", "__version__.py"), "r") as f:
    exec(f.read(), about)
-with open('README.md', 'r') as f:
+with open("README.md", "r") as f:
    readme = f.read()
 requires = [
-    'chardet>=3.0.4',
+    "chardet>=3.0.4",
-    'click>=6.7',
+    "click>=6.7",
-    'numpy>=1.13.3',
+    "numpy>=1.13.3",
-    'openpyxl>=2.5.8',
+    "openpyxl>=2.5.8",
-    'pandas>=0.23.4',
+    "pandas>=0.23.4",
-    'pdfminer.six>=20200726',
+    "pdfminer.six>=20200726",
-    'PyPDF2>=1.26.0',
+    "PyPDF2>=1.26.0",
-    'tabulate'
+    "tabulate>=0.8.9",
 ]
-cv_requires = [
+cv_requires = ["opencv-python>=3.4.2.17"]
    'opencv-python>=3.4.2.17'
 ]
 plot_requires = [
-    'matplotlib>=2.2.3',
+    "matplotlib>=2.2.3",
 ]
 dev_requires = [
-    'codecov>=2.0.15',
+    "codecov>=2.0.15",
-    'pytest>=5.4.3',
+    "pytest>=5.4.3",
-    'pytest-cov>=2.10.0',
+    "pytest-cov>=2.10.0",
-    'pytest-mpl>=0.11',
+    "pytest-mpl>=0.11",
-    'pytest-runner>=5.2',
+    "pytest-runner>=5.2",
-    'Sphinx>=3.1.2'
+    "Sphinx>=3.1.2",
    "sphinx-autobuild>=2021.3.14",
 ]
 all_requires = cv_requires + plot_requires
@ -46,36 +45,39 @@ dev_requires = dev_requires + all_requires
 def setup_package():
-    metadata = dict(name=about['__title__'],
+    metadata = dict(
-                    version=about['__version__'],
+        name=about["__title__"],
-                    description=about['__description__'],
+        version=about["__version__"],
-                    long_description=readme,
+        description=about["__description__"],
-                    long_description_content_type="text/markdown",
+        long_description=readme,
-                    url=about['__url__'],
+        long_description_content_type="text/markdown",
-                    author=about['__author__'],
+        url=about["__url__"],
-                    author_email=about['__author_email__'],
+        author=about["__author__"],
-                    license=about['__license__'],
+        author_email=about["__author_email__"],
-                    packages=find_packages(exclude=('tests',)),
+        license=about["__license__"],
-                    install_requires=requires,
+        packages=find_packages(exclude=("tests",)),
-                    extras_require={
+        install_requires=requires,
-                        'all': all_requires,
+        extras_require={
-                        'cv': cv_requires,
+            "all": all_requires,
-                        'dev': dev_requires,
+            "cv": cv_requires,
-                        'plot': plot_requires
+            "dev": dev_requires,
-                    },
+            "plot": plot_requires,
-                    entry_points={
+        },
-                        'console_scripts': [
+        entry_points={
-                            'camelot = camelot.cli:cli',
+            "console_scripts": [
-                        ],
+                "camelot = camelot.cli:cli",
-                    },
+            ],
-                    classifiers=[
+        },
-                        # Trove classifiers
+        classifiers=[
-                        # Full list: https://pypi.python.org/pypi?%3Aaction=list_classifiers
+            # Trove classifiers
-                        'License :: OSI Approved :: MIT License',
+            # Full list: https://pypi.python.org/pypi?%3Aaction=list_classifiers
-                        'Programming Language :: Python :: 3.6',
+            "License :: OSI Approved :: MIT License",
-                        'Programming Language :: Python :: 3.7',
+            "Programming Language :: Python :: 3.6",
-                        'Programming Language :: Python :: 3.8'
+            "Programming Language :: Python :: 3.7",
-                    ])
+            "Programming Language :: Python :: 3.8",
            "Programming Language :: Python :: 3.9",
        ],
    )
    try:
        from setuptools import setup
@ -85,5 +87,5 @@ def setup_package():
    setup(**metadata)
-if __name__ == '__main__':
+if __name__ == "__main__":
    setup_package()
--- a/tests/data.py
+++ b/tests/data.py
@ -2800,49 +2800,467 @@ data_stream_layout_kwargs = [
 ]
 data_stream_duplicated_text = [
-    ['', '2012 BETTER VARIETIES Harvest Report for Minnesota Central  [ MNCE ]', '', '', '', '', '', '', '', '',
+    [
-     'ALL SEASON TEST'],
+        "",
-    ['', 'Doug Toreen, Renville County, MN 55310          [ BIRD ISLAND ]', '', '', '', '', '', '', '', '',
+        "2012 BETTER VARIETIES Harvest Report for Minnesota Central  [ MNCE ]",
-     '1.3 - 2.0 MAT. GROUP'],
+        "",
-    ['PREV. CROP/HERB:', 'Corn / Surpass, Roundup', '', '', '', '', '', '', '', '', 'S2MNCE01'],
+        "",
-    ['SOIL DESCRIPTION:', '', 'Canisteo clay loam, mod. well drained, non-irrigated', '', '', '', '', '', '', '', ''],
+        "",
-    ['SOIL CONDITIONS:', '', 'High P, high K, 6.7 pH, 3.9% OM, Low SCN', '', '', '', '', '', '', '', '30" ROW SPACING'],
+        "",
-    ['TILLAGE/CULTIVATION:', 'conventional w/ fall till', '', '', '', '', '', '', '', '', ''],
+        "",
-    ['PEST MANAGEMENT:', 'Roundup twice', '', '', '', '', '', '', '', '', ''],
+        "",
-    ['SEEDED - RATE:', 'May 15', '140 000 /A', '', '', '', '', '', '', 'TOP 30 for YIELD of 63 TESTED', ''],
+        "",
-    ['HARVESTED - STAND:', 'Oct 3', '122 921 /A', '', '', '', '', '', '', 'AVERAGE of (3) REPLICATIONS', ''],
+        "",
-    ['', '', '', '', 'SCN', 'Seed', 'Yield', 'Moisture', 'Lodging', 'Stand', 'Gross'],
+        "ALL SEASON TEST",
-    ['Company/Brand', 'Product/Brand†', 'Technol.†', 'Mat.', 'Resist.', 'Trmt.†', 'Bu/A', '%', '%', '(x 1000)',
+    ],
-     'Income'], ['Kruger', 'K2 1901', 'RR2Y', '1.9', 'R', 'Ac,PV', '56.4', '7.6', '0', '126.3', '$846'],
+    [
-    ['Stine', '19RA02 §', 'RR2Y', '1.9', 'R', 'CMB', '55.3', '7.6', '0', '120.0', '$830'],
+        "",
-    ['Wensman', 'W 3190NR2', 'RR2Y', '1.9', 'R', 'Ac', '54.5', '7.6', '0', '119.5', '$818'],
+        "Doug Toreen, Renville County, MN 55310          [ BIRD ISLAND ]",
-    ['Hefty', 'H17Y12', 'RR2Y', '1.7', 'MR', 'I', '53.7', '7.7', '0', '124.4', '$806'],
+        "",
-    ['Dyna-Gro', 'S15RY53', 'RR2Y', '1.5', 'R', 'Ac', '53.6', '7.7', '0', '126.8', '$804'],
+        "",
-    ['LG Seeds', 'C2050R2', 'RR2Y', '2.1', 'R', 'Ac', '53.6', '7.7', '0', '123.9', '$804'],
+        "",
-    ['Titan Pro', '19M42', 'RR2Y', '1.9', 'R', 'CMB', '53.6', '7.7', '0', '121.0', '$804'],
+        "",
-    ['Stine', '19RA02 (2) §', 'RR2Y', '1.9', 'R', 'CMB', '53.4', '7.7', '0', '123.9', '$801'],
+        "",
-    ['Asgrow', 'AG1832 §', 'RR2Y', '1.8', 'MR', 'Ac,PV', '52.9', '7.7', '0', '122.0', '$794'],
+        "",
-    ['Prairie Brand', 'PB-1566R2', 'RR2Y', '1.5', 'R', 'CMB', '52.8', '7.7', '0', '122.9', '$792'],
+        "",
-    ['Channel', '1901R2', 'RR2Y', '1.9', 'R', 'Ac,PV', '52.8', '7.6', '0', '123.4', '$791'],
+        "",
-    ['Titan Pro', '20M1', 'RR2Y', '2.0', 'R', 'Am', '52.5', '7.5', '0', '124.4', '$788'],
+        "1.3 - 2.0 MAT. GROUP",
-    ['Kruger', 'K2-2002', 'RR2Y', '2.0', 'R', 'Ac,PV', '52.4', '7.9', '0', '125.4', '$786'],
+    ],
-    ['Channel', '1700R2', 'RR2Y', '1.7', 'R', 'Ac,PV', '52.3', '7.9', '0', '123.9', '$784'],
+    [
-    ['Hefty', 'H16Y11', 'RR2Y', '1.6', 'MR', 'I', '51.4', '7.6', '0', '123.9', '$771'],
+        "PREV. CROP/HERB:",
-    ['Anderson', '162R2Y', 'RR2Y', '1.6', 'R', 'None', '51.3', '7.5', '0', '119.5', '$770'],
+        "Corn / Surpass, Roundup",
-    ['Titan Pro', '15M22', 'RR2Y', '1.5', 'R', 'CMB', '51.3', '7.8', '0', '125.4', '$769'],
+        "",
-    ['Dairyland', 'DSR-1710R2Y', 'RR2Y', '1.7', 'R', 'CMB', '51.3', '7.7', '0', '122.0', '$769'],
+        "",
-    ['Hefty', 'H20R3', 'RR2Y', '2.0', 'MR', 'I', '50.5', '8.2', '0', '121.0', '$757'],
+        "",
-    ['Prairie Brand', 'PB 1743R2', 'RR2Y', '1.7', 'R', 'CMB', '50.2', '7.7', '0', '125.8', '$752'],
+        "",
-    ['Gold Country', '1741', 'RR2Y', '1.7', 'R', 'Ac', '50.1', '7.8', '0', '123.9', '$751'],
+        "",
-    ['Trelay', '20RR43', 'RR2Y', '2.0', 'R', 'Ac,Ex', '49.9', '7.6', '0', '127.8', '$749'],
+        "",
-    ['Hefty', 'H14R3', 'RR2Y', '1.4', 'MR', 'I', '49.7', '7.7', '0', '122.9', '$746'],
+        "",
-    ['Prairie Brand', 'PB-2099NRR2', 'RR2Y', '2.0', 'R', 'CMB', '49.6', '7.8', '0', '126.3', '$743'],
+        "",
-    ['Wensman', 'W 3174NR2', 'RR2Y', '1.7', 'R', 'Ac', '49.3', '7.6', '0', '122.5', '$740'],
+        "S2MNCE01",
-    ['Kruger', 'K2 1602', 'RR2Y', '1.6', 'R', 'Ac,PV', '48.7', '7.6', '0', '125.4', '$731'],
+    ],
-    ['NK Brand', 'S18-C2 §', 'RR2Y', '1.8', 'R', 'CMB', '48.7', '7.7', '0', '126.8', '$731'],
+    [
-    ['Kruger', 'K2 1902', 'RR2Y', '1.9', 'R', 'Ac,PV', '48.7', '7.5', '0', '124.4', '$730'],
+        "SOIL DESCRIPTION:",
-    ['Prairie Brand', 'PB-1823R2', 'RR2Y', '1.8', 'R', 'None', '48.5', '7.6', '0', '121.0', '$727'],
+        "",
-    ['Gold Country', '1541', 'RR2Y', '1.5', 'R', 'Ac', '48.4', '7.6', '0', '110.4', '$726'],
+        "Canisteo clay loam, mod. well drained, non-irrigated",
-    ['', '', '', '', '', 'Test Average  =', '47.6', '7.7', '0', '122.9', '$713'],
+        "",
-    ['', '', '', '', '', 'LSD (0.10)  =', '5.7', '0.3', 'ns', '37.8', '566.4']
+        "",
        "",
        "",
        "",
        "",
        "",
        "",
    ],
    [
        "SOIL CONDITIONS:",
        "",
        "High P, high K, 6.7 pH, 3.9% OM, Low SCN",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        '30" ROW SPACING',
    ],
    [
        "TILLAGE/CULTIVATION:",
        "conventional w/ fall till",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
    ],
    ["PEST MANAGEMENT:", "Roundup twice", "", "", "", "", "", "", "", "", ""],
    [
        "SEEDED - RATE:",
        "May 15",
        "140 000 /A",
        "",
        "",
        "",
        "",
        "",
        "",
        "TOP 30 for YIELD of 63 TESTED",
        "",
    ],
    [
        "HARVESTED - STAND:",
        "Oct 3",
        "122 921 /A",
        "",
        "",
        "",
        "",
        "",
        "",
        "AVERAGE of (3) REPLICATIONS",
        "",
    ],
    ["", "", "", "", "SCN", "Seed", "Yield", "Moisture", "Lodging", "Stand", "Gross"],
    [
        "Company/Brand",
        "Product/Brand†",
        "Technol.†",
        "Mat.",
        "Resist.",
        "Trmt.†",
        "Bu/A",
        "%",
        "%",
        "(x 1000)",
        "Income",
    ],
    [
        "Kruger",
        "K2 1901",
        "RR2Y",
        "1.9",
        "R",
        "Ac,PV",
        "56.4",
        "7.6",
        "0",
        "126.3",
        "$846",
    ],
    [
        "Stine",
        "19RA02 §",
        "RR2Y",
        "1.9",
        "R",
        "CMB",
        "55.3",
        "7.6",
        "0",
        "120.0",
        "$830",
    ],
    [
        "Wensman",
        "W 3190NR2",
        "RR2Y",
        "1.9",
        "R",
        "Ac",
        "54.5",
        "7.6",
        "0",
        "119.5",
        "$818",
    ],
    ["Hefty", "H17Y12", "RR2Y", "1.7", "MR", "I", "53.7", "7.7", "0", "124.4", "$806"],
    [
        "Dyna-Gro",
        "S15RY53",
        "RR2Y",
        "1.5",
        "R",
        "Ac",
        "53.6",
        "7.7",
        "0",
        "126.8",
        "$804",
    ],
    [
        "LG Seeds",
        "C2050R2",
        "RR2Y",
        "2.1",
        "R",
        "Ac",
        "53.6",
        "7.7",
        "0",
        "123.9",
        "$804",
    ],
    [
        "Titan Pro",
        "19M42",
        "RR2Y",
        "1.9",
        "R",
        "CMB",
        "53.6",
        "7.7",
        "0",
        "121.0",
        "$804",
    ],
    [
        "Stine",
        "19RA02 (2) §",
        "RR2Y",
        "1.9",
        "R",
        "CMB",
        "53.4",
        "7.7",
        "0",
        "123.9",
        "$801",
    ],
    [
        "Asgrow",
        "AG1832 §",
        "RR2Y",
        "1.8",
        "MR",
        "Ac,PV",
        "52.9",
        "7.7",
        "0",
        "122.0",
        "$794",
    ],
    [
        "Prairie Brand",
        "PB-1566R2",
        "RR2Y",
        "1.5",
        "R",
        "CMB",
        "52.8",
        "7.7",
        "0",
        "122.9",
        "$792",
    ],
    [
        "Channel",
        "1901R2",
        "RR2Y",
        "1.9",
        "R",
        "Ac,PV",
        "52.8",
        "7.6",
        "0",
        "123.4",
        "$791",
    ],
    [
        "Titan Pro",
        "20M1",
        "RR2Y",
        "2.0",
        "R",
        "Am",
        "52.5",
        "7.5",
        "0",
        "124.4",
        "$788",
    ],
    [
        "Kruger",
        "K2-2002",
        "RR2Y",
        "2.0",
        "R",
        "Ac,PV",
        "52.4",
        "7.9",
        "0",
        "125.4",
        "$786",
    ],
    [
        "Channel",
        "1700R2",
        "RR2Y",
        "1.7",
        "R",
        "Ac,PV",
        "52.3",
        "7.9",
        "0",
        "123.9",
        "$784",
    ],
    ["Hefty", "H16Y11", "RR2Y", "1.6", "MR", "I", "51.4", "7.6", "0", "123.9", "$771"],
    [
        "Anderson",
        "162R2Y",
        "RR2Y",
        "1.6",
        "R",
        "None",
        "51.3",
        "7.5",
        "0",
        "119.5",
        "$770",
    ],
    [
        "Titan Pro",
        "15M22",
        "RR2Y",
        "1.5",
        "R",
        "CMB",
        "51.3",
        "7.8",
        "0",
        "125.4",
        "$769",
    ],
    [
        "Dairyland",
        "DSR-1710R2Y",
        "RR2Y",
        "1.7",
        "R",
        "CMB",
        "51.3",
        "7.7",
        "0",
        "122.0",
        "$769",
    ],
    ["Hefty", "H20R3", "RR2Y", "2.0", "MR", "I", "50.5", "8.2", "0", "121.0", "$757"],
    [
        "Prairie Brand",
        "PB 1743R2",
        "RR2Y",
        "1.7",
        "R",
        "CMB",
        "50.2",
        "7.7",
        "0",
        "125.8",
        "$752",
    ],
    [
        "Gold Country",
        "1741",
        "RR2Y",
        "1.7",
        "R",
        "Ac",
        "50.1",
        "7.8",
        "0",
        "123.9",
        "$751",
    ],
    [
        "Trelay",
        "20RR43",
        "RR2Y",
        "2.0",
        "R",
        "Ac,Ex",
        "49.9",
        "7.6",
        "0",
        "127.8",
        "$749",
    ],
    ["Hefty", "H14R3", "RR2Y", "1.4", "MR", "I", "49.7", "7.7", "0", "122.9", "$746"],
    [
        "Prairie Brand",
        "PB-2099NRR2",
        "RR2Y",
        "2.0",
        "R",
        "CMB",
        "49.6",
        "7.8",
        "0",
        "126.3",
        "$743",
    ],
    [
        "Wensman",
        "W 3174NR2",
        "RR2Y",
        "1.7",
        "R",
        "Ac",
        "49.3",
        "7.6",
        "0",
        "122.5",
        "$740",
    ],
    [
        "Kruger",
        "K2 1602",
        "RR2Y",
        "1.6",
        "R",
        "Ac,PV",
        "48.7",
        "7.6",
        "0",
        "125.4",
        "$731",
    ],
    [
        "NK Brand",
        "S18-C2 §",
        "RR2Y",
        "1.8",
        "R",
        "CMB",
        "48.7",
        "7.7",
        "0",
        "126.8",
        "$731",
    ],
    [
        "Kruger",
        "K2 1902",
        "RR2Y",
        "1.9",
        "R",
        "Ac,PV",
        "48.7",
        "7.5",
        "0",
        "124.4",
        "$730",
    ],
    [
        "Prairie Brand",
        "PB-1823R2",
        "RR2Y",
        "1.8",
        "R",
        "None",
        "48.5",
        "7.6",
        "0",
        "121.0",
        "$727",
    ],
    [
        "Gold Country",
        "1541",
        "RR2Y",
        "1.5",
        "R",
        "Ac",
        "48.4",
        "7.6",
        "0",
        "110.4",
        "$726",
    ],
    ["", "", "", "", "", "Test Average  =", "47.6", "7.7", "0", "122.9", "$713"],
    ["", "", "", "", "", "LSD (0.10)  =", "5.7", "0.3", "ns", "37.8", "566.4"],
 ]