Fix merge conflict

pull/2/head
Vinayak Mehta 2018-12-17 15:33:38 +05:30
commit 9aa219695f
63 changed files with 2109 additions and 583 deletions

10
.editorconfig 100644
View File

@ -0,0 +1,10 @@
root = true
[*]
end_of_line = lf
insert_final_newline = true
[*.py]
charset = utf-8
indent_style = space
indent_size = 4

View File

@ -1,23 +1,32 @@
sudo: false sudo: true
language: python language: python
cache: pip cache: pip
python:
- "2.7"
- "3.5"
- "3.6"
matrix:
include:
- python: 3.7
dist: xenial
sudo: true
before_install:
- sudo apt-get install python-tk python3-tk ghostscript
addons: addons:
apt: apt:
update: true update: true
install: install:
- pip install ".[dev]" - make install
script: jobs:
- pytest --verbose --cov-config .coveragerc --cov-report term --cov-report xml --cov=camelot tests include:
after_success: - stage: test
- codecov --verbose script:
- make test
python: '2.7'
- stage: test
script:
- make test
python: '3.5'
- stage: test
script:
- make test
python: '3.6'
- stage: test
script:
- make test
python: '3.7'
dist: xenial
- stage: coverage
python: '3.6'
script:
- make test
- codecov --verbose

View File

@ -1,6 +1,98 @@
Release History Release History
=============== ===============
master
------
0.5.0 (2018-12-13)
------------------
**Improvements**
* [#207](https://github.com/socialcopsdev/camelot/issues/207) Add a plot type for Stream text edges and detected table areas. [#224](https://github.com/socialcopsdev/camelot/pull/224) by Vinayak Mehta.
* [#204](https://github.com/socialcopsdev/camelot/issues/204) `suppress_warnings` is now called `suppress_stdout`. [#225](https://github.com/socialcopsdev/camelot/pull/225) by Vinayak Mehta.
**Bugfixes**
* [#217](https://github.com/socialcopsdev/camelot/issues/217) Fix IndexError when scale is large.
* [#105](https://github.com/socialcopsdev/camelot/issues/105), [#192](https://github.com/socialcopsdev/camelot/issues/192) and [#215](https://github.com/socialcopsdev/camelot/issues/215) in [#227](https://github.com/socialcopsdev/camelot/pull/227) by Vinayak Mehta.
**Documentation**
* Add pdfplumber comparison and update Tabula (stream) comparison. Check out the [wiki page](https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
0.4.1 (2018-12-05)
------------------
**Bugfixes**
* Add chardet to `install_requires` to fix [#210](https://github.com/socialcopsdev/camelot/issues/210). More details in [pdfminer.six#213](https://github.com/pdfminer/pdfminer.six/issues/213).
0.4.0 (2018-11-23)
------------------
**Improvements**
* [#102](https://github.com/socialcopsdev/camelot/issues/102) Detect tables automatically when Stream is used. [#206](https://github.com/socialcopsdev/camelot/pull/206) Add implementation of Anssi Nurminen's table detection algorithm by Vinayak Mehta.
0.3.2 (2018-11-04)
------------------
**Improvements**
* [#186](https://github.com/socialcopsdev/camelot/issues/186) Add `_bbox` attribute to table. [#193](https://github.com/socialcopsdev/camelot/pull/193) by Vinayak Mehta.
* You can use `table._bbox` to get coordinates of the detected table.
0.3.1 (2018-11-02)
------------------
**Improvements**
* Matplotlib is now an optional requirement. [#190](https://github.com/socialcopsdev/camelot/pull/190) by Vinayak Mehta.
* You can install it using `$ pip install camelot-py[plot]`.
* [#127](https://github.com/socialcopsdev/camelot/issues/127) Add tests for plotting. Coverage is now at 87%! [#179](https://github.com/socialcopsdev/camelot/pull/179) by [Suyash Behera](https://github.com/Suyash458).
0.3.0 (2018-10-28)
------------------
**Improvements**
* [#162](https://github.com/socialcopsdev/camelot/issues/162) Add password keyword argument. [#180](https://github.com/socialcopsdev/camelot/pull/180) by [rbares](https://github.com/rbares).
* An encrypted PDF can now be decrypted by passing `password='<PASSWORD>'` to `read_pdf` or `--password <PASSWORD>` to the command-line interface. (Limited encryption algorithm support from PyPDF2.)
* [#139](https://github.com/socialcopsdev/camelot/issues/139) Add suppress_warnings keyword argument. [#155](https://github.com/socialcopsdev/camelot/pull/155) by [Jonathan Lloyd](https://github.com/jonathanlloyd).
* Warnings raised by Camelot can now be suppressed by passing `suppress_warnings=True` to `read_pdf` or `--quiet` to the command-line interface.
* [#154](https://github.com/socialcopsdev/camelot/issues/154) The CLI can now be run using `python -m`. Try `python -m camelot --help`. [#159](https://github.com/socialcopsdev/camelot/pull/159) by [Parth P Panchal](https://github.com/pqrth).
* [#165](https://github.com/socialcopsdev/camelot/issues/165) Rename `table_area` to `table_areas`. [#171](https://github.com/socialcopsdev/camelot/pull/171) by [Parth P Panchal](https://github.com/pqrth).
**Bugfixes**
* Raise error if the ghostscript executable is not on the PATH variable. [#166](https://github.com/socialcopsdev/camelot/pull/166) by Vinayak Mehta.
* Convert filename to lowercase to check for PDF extension. [#169](https://github.com/socialcopsdev/camelot/pull/169) by [Vinicius Mesel](https://github.com/vmesel).
**Files**
* [#114](https://github.com/socialcopsdev/camelot/issues/114) Add Makefile and make codecov run only once. [#132](https://github.com/socialcopsdev/camelot/pull/132) by [Vaibhav Mule](https://github.com/vaibhavmule).
* Add .editorconfig. [#151](https://github.com/socialcopsdev/camelot/pull/151) by [KOLANICH](https://github.com/KOLANICH).
* Downgrade numpy version from 1.15.2 to 1.13.3.
* Add requirements.txt for readthedocs.
**Documentation**
* Add "Using conda" section to installation instructions.
* Add readthedocs badge.
0.2.3 (2018-10-08)
------------------
* Remove hard dependencies on requirements versions.
0.2.2 (2018-10-08)
------------------
**Bugfixes**
* Move opencv-python to extra\_requires. [#134](https://github.com/socialcopsdev/camelot/pull/134) by Vinayak Mehta.
0.2.1 (2018-10-05) 0.2.1 (2018-10-05)
------------------ ------------------
@ -51,4 +143,4 @@ Release History
0.1.0 (2018-09-24) 0.1.0 (2018-09-24)
------------------ ------------------
* Birth! * Rebirth!

View File

@ -1 +1 @@
include MANIFEST.in README.md HISTORY.md LICENSE requirements.txt requirements-dev.txt setup.py setup.cfg include MANIFEST.in README.md HISTORY.md LICENSE setup.py setup.cfg

28
Makefile 100644
View File

@ -0,0 +1,28 @@
.PHONY: docs
INSTALL :=
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
INSTALL := @sudo apt install python-tk python3-tk ghostscript
else ifeq ($(UNAME_S),Darwin)
INSTALL := @brew install tcl-tk ghostscript
else
INSTALL := @echo "Please install tk and ghostscript"
endif
install:
$(INSTALL)
pip install --upgrade pip
pip install ".[dev]"
test:
pytest --verbose --cov-config .coveragerc --cov-report term --cov-report xml --cov=camelot --mpl
docs:
cd docs && make html
@echo "\033[95m\n\nBuild successful! View the docs homepage at docs/_build/html/index.html.\n\033[0m"
publish:
pip install twine
python setup.py sdist
twine upload dist/*
rm -fr build dist .egg camelot_py.egg-info

View File

@ -4,11 +4,14 @@
# Camelot: PDF Table Extraction for Humans # Camelot: PDF Table Extraction for Humans
[![Build Status](https://travis-ci.org/socialcopsdev/camelot.svg?branch=master)](https://travis-ci.org/socialcopsdev/camelot) [![codecov.io](https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github)](https://codecov.io/github/socialcopsdev/camelot?branch=master) [![Build Status](https://travis-ci.org/socialcopsdev/camelot.svg?branch=master)](https://travis-ci.org/socialcopsdev/camelot) [![Documentation Status](https://readthedocs.org/projects/camelot-py/badge/?version=master)](https://camelot-py.readthedocs.io/en/master/)
[![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![codecov.io](https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github)](https://codecov.io/github/socialcopsdev/camelot?branch=master)
[![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![Gitter chat](https://badges.gitter.im/camelot-dev/Lobby.png)](https://gitter.im/camelot-dev/Lobby)
**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files! **Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), which is a web interface for Camelot!
--- ---
**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf). **Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf).
@ -41,7 +44,7 @@
| 2032_2 | 0.17 | 57.8 | 21.7% | 0.3% | 2.7% | 1.2% | | 2032_2 | 0.17 | 57.8 | 21.7% | 0.3% | 2.7% | 1.2% |
| 4171_1 | 0.07 | 173.9 | 58.1% | 1.6% | 2.1% | 0.5% | | 4171_1 | 0.07 | 173.9 | 58.1% | 1.6% | 2.1% | 0.5% |
There's a [command-line interface](https://camelot-py.readthedocs.io/en/latest/user/cli.html) too! There's a [command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html) too!
**Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".) **Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
@ -56,15 +59,25 @@ See [comparison with other PDF table extraction libraries and tools](https://git
## Installation ## Installation
After [installing the dependencies](https://camelot-py.readthedocs.io/en/latest/user/install.html) ([tk](https://packages.ubuntu.com/trusty/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can simply use pip to install Camelot: ### Using conda
The easiest way to install Camelot is to install it with [conda](https://conda.io/docs/), which is a package manager and environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.
<pre> <pre>
$ pip install camelot-py $ conda install -c conda-forge camelot-py
</pre> </pre>
### Alternatively ### Using pip
After [installing the dependencies](https://camelot-py.readthedocs.io/en/latest/user/install.html), clone the repo using: After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/trusty/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can simply use pip to install Camelot:
<pre>
$ pip install camelot-py[cv]
</pre>
### From the source code
After [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install.html#using-pip), clone the repo using:
<pre> <pre>
$ git clone https://www.github.com/socialcopsdev/camelot $ git clone https://www.github.com/socialcopsdev/camelot
@ -74,18 +87,16 @@ and install Camelot using pip:
<pre> <pre>
$ cd camelot $ cd camelot
$ pip install . $ pip install ".[cv]"
</pre> </pre>
**Note:** Use a [virtualenv](https://virtualenv.pypa.io/en/stable/) if you don't want to affect your global Python installation.
## Documentation ## Documentation
Great documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/). Great documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).
## Development ## Development
The [Contributor's Guide](https://camelot-py.readthedocs.io/en/latest/dev/contributing.html) has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README. The [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README.
### Source code ### Source code

View File

@ -2,10 +2,21 @@
import logging import logging
from click import HelpFormatter
from .__version__ import __version__ from .__version__ import __version__
from .io import read_pdf from .io import read_pdf
from .plotting import PlotMethods
def _write_usage(self, prog, args='', prefix='Usage: '):
return self._write_usage('camelot', args, prefix=prefix)
# monkey patch click.HelpFormatter
HelpFormatter._write_usage = HelpFormatter.write_usage
HelpFormatter.write_usage = _write_usage
# set up logging # set up logging
logger = logging.getLogger('camelot') logger = logging.getLogger('camelot')
@ -15,3 +26,6 @@ handler = logging.StreamHandler()
handler.setFormatter(formatter) handler.setFormatter(formatter)
logger.addHandler(handler) logger.addHandler(handler)
# instantiate plot method
plot = PlotMethods()

View File

@ -0,0 +1,16 @@
# -*- coding: utf-8 -*-
from __future__ import absolute_import
__all__ = ('main',)
def main():
from camelot.cli import cli
cli()
if __name__ == "__main__":
main()

View File

@ -1,11 +1,23 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
VERSION = (0, 2, 1) VERSION = (0, 5, 0)
PRERELEASE = None # alpha, beta or rc
REVISION = None
def generate_version(version, prerelease=None, revision=None):
version_parts = ['.'.join(map(str, version))]
if prerelease is not None:
version_parts.append('-{}'.format(prerelease))
if revision is not None:
version_parts.append('.{}'.format(revision))
return ''.join(version_parts)
__title__ = 'camelot-py' __title__ = 'camelot-py'
__description__ = 'PDF Table Extraction for Humans.' __description__ = 'PDF Table Extraction for Humans.'
__url__ = 'http://camelot-py.readthedocs.io/' __url__ = 'http://camelot-py.readthedocs.io/'
__version__ = '.'.join(map(str, VERSION)) __version__ = generate_version(VERSION, prerelease=PRERELEASE, revision=REVISION)
__author__ = 'Vinayak Mehta' __author__ = 'Vinayak Mehta'
__author_email__ = 'vmehta94@gmail.com' __author_email__ = 'vmehta94@gmail.com'
__license__ = 'MIT License' __license__ = 'MIT License'

View File

@ -3,9 +3,14 @@
import logging import logging
import click import click
try:
import matplotlib.pyplot as plt
except ImportError:
_HAS_MPL = False
else:
_HAS_MPL = True
from . import __version__ from . import __version__, read_pdf, plot
from .io import read_pdf
logger = logging.getLogger('camelot') logger = logging.getLogger('camelot')
@ -25,8 +30,10 @@ pass_config = click.make_pass_decorator(Config)
@click.group() @click.group()
@click.version_option(version=__version__) @click.version_option(version=__version__)
@click.option('-q', '--quiet', is_flag=False, help='Suppress logs and warnings.')
@click.option('-p', '--pages', default='1', help='Comma-separated page numbers.' @click.option('-p', '--pages', default='1', help='Comma-separated page numbers.'
' Example: 1,3,4 or 1,4-end.') ' Example: 1,3,4 or 1,4-end.')
@click.option('-pw', '--password', help='Password for decryption.')
@click.option('-o', '--output', help='Output file path.') @click.option('-o', '--output', help='Output file path.')
@click.option('-f', '--format', @click.option('-f', '--format',
type=click.Choice(['csv', 'json', 'excel', 'html']), type=click.Choice(['csv', 'json', 'excel', 'html']),
@ -47,7 +54,7 @@ def cli(ctx, *args, **kwargs):
@cli.command('lattice') @cli.command('lattice')
@click.option('-T', '--table_area', default=[], multiple=True, @click.option('-T', '--table_areas', default=[], multiple=True,
help='Table areas to process. Example: x1,y1,x2,y2' help='Table areas to process. Example: x1,y1,x2,y2'
' where x1, y1 -> left-top and x2, y2 -> right-bottom.') ' where x1, y1 -> left-top and x2, y2 -> right-bottom.')
@click.option('-back', '--process_background', is_flag=True, @click.option('-back', '--process_background', is_flag=True,
@ -78,8 +85,8 @@ def cli(ctx, *args, **kwargs):
@click.option('-I', '--iterations', default=0, @click.option('-I', '--iterations', default=0,
help='Number of times for erosion/dilation will be applied.') help='Number of times for erosion/dilation will be applied.')
@click.option('-plot', '--plot_type', @click.option('-plot', '--plot_type',
type=click.Choice(['text', 'table', 'contour', 'joint', 'line']), type=click.Choice(['text', 'grid', 'contour', 'joint', 'line']),
help='Plot geometry found on PDF page, for debugging.') help='Plot elements found on PDF page for visual debugging.')
@click.argument('filepath', type=click.Path(exists=True)) @click.argument('filepath', type=click.Path(exists=True))
@pass_config @pass_config
def lattice(c, *args, **kwargs): def lattice(c, *args, **kwargs):
@ -89,31 +96,39 @@ def lattice(c, *args, **kwargs):
output = conf.pop('output') output = conf.pop('output')
f = conf.pop('format') f = conf.pop('format')
compress = conf.pop('zip') compress = conf.pop('zip')
quiet = conf.pop('quiet')
plot_type = kwargs.pop('plot_type') plot_type = kwargs.pop('plot_type')
filepath = kwargs.pop('filepath') filepath = kwargs.pop('filepath')
kwargs.update(conf) kwargs.update(conf)
table_area = list(kwargs['table_area']) table_areas = list(kwargs['table_areas'])
kwargs['table_area'] = None if not table_area else table_area kwargs['table_areas'] = None if not table_areas else table_areas
copy_text = list(kwargs['copy_text']) copy_text = list(kwargs['copy_text'])
kwargs['copy_text'] = None if not copy_text else copy_text kwargs['copy_text'] = None if not copy_text else copy_text
kwargs['shift_text'] = list(kwargs['shift_text']) kwargs['shift_text'] = list(kwargs['shift_text'])
tables = read_pdf(filepath, pages=pages, flavor='lattice', **kwargs)
click.echo('Found {} tables'.format(tables.n))
if plot_type is not None: if plot_type is not None:
for table in tables: if not _HAS_MPL:
table.plot(plot_type) raise ImportError('matplotlib is required for plotting.')
else: else:
if output is None: if output is None:
raise click.UsageError('Please specify output file path using --output') raise click.UsageError('Please specify output file path using --output')
if f is None: if f is None:
raise click.UsageError('Please specify output file format using --format') raise click.UsageError('Please specify output file format using --format')
tables = read_pdf(filepath, pages=pages, flavor='lattice',
suppress_stdout=quiet, **kwargs)
click.echo('Found {} tables'.format(tables.n))
if plot_type is not None:
for table in tables:
plot(table, kind=plot_type)
plt.show()
else:
tables.export(output, f=f, compress=compress) tables.export(output, f=f, compress=compress)
@cli.command('stream') @cli.command('stream')
@click.option('-T', '--table_area', default=[], multiple=True, @click.option('-T', '--table_areas', default=[], multiple=True,
help='Table areas to process. Example: x1,y1,x2,y2' help='Table areas to process. Example: x1,y1,x2,y2'
' where x1, y1 -> left-top and x2, y2 -> right-bottom.') ' where x1, y1 -> left-top and x2, y2 -> right-bottom.')
@click.option('-C', '--columns', default=[], multiple=True, @click.option('-C', '--columns', default=[], multiple=True,
@ -123,8 +138,8 @@ def lattice(c, *args, **kwargs):
@click.option('-c', '--col_close_tol', default=0, help='Tolerance parameter' @click.option('-c', '--col_close_tol', default=0, help='Tolerance parameter'
' used to combine text horizontally, to generate columns.') ' used to combine text horizontally, to generate columns.')
@click.option('-plot', '--plot_type', @click.option('-plot', '--plot_type',
type=click.Choice(['text', 'table']), type=click.Choice(['text', 'grid', 'contour', 'textedge']),
help='Plot geometry found on PDF page for debugging.') help='Plot elements found on PDF page for visual debugging.')
@click.argument('filepath', type=click.Path(exists=True)) @click.argument('filepath', type=click.Path(exists=True))
@pass_config @pass_config
def stream(c, *args, **kwargs): def stream(c, *args, **kwargs):
@ -134,23 +149,31 @@ def stream(c, *args, **kwargs):
output = conf.pop('output') output = conf.pop('output')
f = conf.pop('format') f = conf.pop('format')
compress = conf.pop('zip') compress = conf.pop('zip')
quiet = conf.pop('quiet')
plot_type = kwargs.pop('plot_type') plot_type = kwargs.pop('plot_type')
filepath = kwargs.pop('filepath') filepath = kwargs.pop('filepath')
kwargs.update(conf) kwargs.update(conf)
table_area = list(kwargs['table_area']) table_areas = list(kwargs['table_areas'])
kwargs['table_area'] = None if not table_area else table_area kwargs['table_areas'] = None if not table_areas else table_areas
columns = list(kwargs['columns']) columns = list(kwargs['columns'])
kwargs['columns'] = None if not columns else columns kwargs['columns'] = None if not columns else columns
tables = read_pdf(filepath, pages=pages, flavor='stream', **kwargs)
click.echo('Found {} tables'.format(tables.n))
if plot_type is not None: if plot_type is not None:
for table in tables: if not _HAS_MPL:
table.plot(plot_type) raise ImportError('matplotlib is required for plotting.')
else: else:
if output is None: if output is None:
raise click.UsageError('Please specify output file path using --output') raise click.UsageError('Please specify output file path using --output')
if f is None: if f is None:
raise click.UsageError('Please specify output file format using --format') raise click.UsageError('Please specify output file format using --format')
tables = read_pdf(filepath, pages=pages, flavor='stream',
suppress_stdout=quiet, **kwargs)
click.echo('Found {} tables'.format(tables.n))
if plot_type is not None:
for table in tables:
plot(table, kind=plot_type)
plt.show()
else:
tables.export(output, f=f, compress=compress) tables.export(output, f=f, compress=compress)

View File

@ -3,11 +3,208 @@
import os import os
import zipfile import zipfile
import tempfile import tempfile
from itertools import chain
from operator import itemgetter
import numpy as np import numpy as np
import pandas as pd import pandas as pd
from .plotting import *
# minimum number of vertical textline intersections for a textedge
# to be considered valid
TEXTEDGE_REQUIRED_ELEMENTS = 4
# y coordinate tolerance for extending textedge
TEXTEDGE_EXTEND_TOLERANCE = 50
# padding added to table area on the left, right and bottom
TABLE_AREA_PADDING = 10
class TextEdge(object):
"""Defines a text edge coordinates relative to a left-bottom
origin. (PDF coordinate space)
Parameters
----------
x : float
x-coordinate of the text edge.
y0 : float
y-coordinate of bottommost point.
y1 : float
y-coordinate of topmost point.
align : string, optional (default: 'left')
{'left', 'right', 'middle'}
Attributes
----------
intersections: int
Number of intersections with horizontal text rows.
is_valid: bool
A text edge is valid if it intersections with at least
TEXTEDGE_REQUIRED_ELEMENTS horizontal text rows.
"""
def __init__(self, x, y0, y1, align='left'):
self.x = x
self.y0 = y0
self.y1 = y1
self.align = align
self.intersections = 0
self.is_valid = False
def __repr__(self):
return '<TextEdge x={} y0={} y1={} align={} valid={}>'.format(
round(self.x, 2), round(self.y0, 2), round(self.y1, 2), self.align, self.is_valid)
def update_coords(self, x, y0):
"""Updates the text edge's x and bottom y coordinates and sets
the is_valid attribute.
"""
if np.isclose(self.y0, y0, atol=TEXTEDGE_EXTEND_TOLERANCE):
self.x = (self.intersections * self.x + x) / float(self.intersections + 1)
self.y0 = y0
self.intersections += 1
# a textedge is valid only if it extends uninterrupted
# over a required number of textlines
if self.intersections > TEXTEDGE_REQUIRED_ELEMENTS:
self.is_valid = True
class TextEdges(object):
"""Defines a dict of left, right and middle text edges found on
the PDF page. The dict has three keys based on the alignments,
and each key's value is a list of camelot.core.TextEdge objects.
"""
def __init__(self):
self._textedges = {'left': [], 'right': [], 'middle': []}
@staticmethod
def get_x_coord(textline, align):
"""Returns the x coordinate of a text row based on the
specified alignment.
"""
x_left = textline.x0
x_right = textline.x1
x_middle = x_left + (x_right - x_left) / 2.0
x_coord = {'left': x_left, 'middle': x_middle, 'right': x_right}
return x_coord[align]
def find(self, x_coord, align):
"""Returns the index of an existing text edge using
the specified x coordinate and alignment.
"""
for i, te in enumerate(self._textedges[align]):
if np.isclose(te.x, x_coord, atol=0.5):
return i
return None
def add(self, textline, align):
"""Adds a new text edge to the current dict.
"""
x = self.get_x_coord(textline, align)
y0 = textline.y0
y1 = textline.y1
te = TextEdge(x, y0, y1, align=align)
self._textedges[align].append(te)
def update(self, textline):
"""Updates an existing text edge in the current dict.
"""
for align in ['left', 'right', 'middle']:
x_coord = self.get_x_coord(textline, align)
idx = self.find(x_coord, align)
if idx is None:
self.add(textline, align)
else:
self._textedges[align][idx].update_coords(x_coord, textline.y0)
def generate(self, textlines):
"""Generates the text edges dict based on horizontal text
rows.
"""
for tl in textlines:
if len(tl.get_text().strip()) > 1: # TODO: hacky
self.update(tl)
def get_relevant(self):
"""Returns the list of relevant text edges (all share the same
alignment) based on which list intersects horizontal text rows
the most.
"""
intersections_sum = {
'left': sum(te.intersections for te in self._textedges['left'] if te.is_valid),
'right': sum(te.intersections for te in self._textedges['right'] if te.is_valid),
'middle': sum(te.intersections for te in self._textedges['middle'] if te.is_valid)
}
# TODO: naive
# get vertical textedges that intersect maximum number of
# times with horizontal textlines
relevant_align = max(intersections_sum.items(), key=itemgetter(1))[0]
return self._textedges[relevant_align]
def get_table_areas(self, textlines, relevant_textedges):
"""Returns a dict of interesting table areas on the PDF page
calculated using relevant text edges.
"""
def pad(area, average_row_height):
x0 = area[0] - TABLE_AREA_PADDING
y0 = area[1] - TABLE_AREA_PADDING
x1 = area[2] + TABLE_AREA_PADDING
# add a constant since table headers can be relatively up
y1 = area[3] + average_row_height * 5
return (x0, y0, x1, y1)
# sort relevant textedges in reading order
relevant_textedges.sort(key=lambda te: (-te.y0, te.x))
table_areas = {}
for te in relevant_textedges:
if te.is_valid:
if not table_areas:
table_areas[(te.x, te.y0, te.x, te.y1)] = None
else:
found = None
for area in table_areas:
# check for overlap
if te.y1 >= area[1] and te.y0 <= area[3]:
found = area
break
if found is None:
table_areas[(te.x, te.y0, te.x, te.y1)] = None
else:
table_areas.pop(found)
updated_area = (
found[0], min(te.y0, found[1]), max(found[2], te.x), max(found[3], te.y1))
table_areas[updated_area] = None
# extend table areas based on textlines that overlap
# vertically. it's possible that these textlines were
# eliminated during textedges generation since numbers and
# chars/words/sentences are often aligned differently.
# drawback: table areas that have paragraphs on their sides
# will include the paragraphs too.
sum_textline_height = 0
for tl in textlines:
sum_textline_height += tl.y1 - tl.y0
found = None
for area in table_areas:
# check for overlap
if tl.y0 >= area[1] and tl.y1 <= area[3]:
found = area
break
if found is not None:
table_areas.pop(found)
updated_area = (
min(tl.x0, found[0]), min(tl.y0, found[1]), max(found[2], tl.x1), max(found[3], tl.y1))
table_areas[updated_area] = None
average_textline_height = sum_textline_height / float(len(textlines))
# add some padding to table areas
table_areas_padded = {}
for area in table_areas:
table_areas_padded[pad(area, average_textline_height)] = None
return table_areas_padded
class Cell(object): class Cell(object):
@ -251,7 +448,7 @@ class Table(object):
self.cells[L][J].top = True self.cells[L][J].top = True
J += 1 J += 1
elif i == []: # only bottom edge elif i == []: # only bottom edge
I = len(self.rows) - 1 L = len(self.rows) - 1
if k: if k:
K = k[0] K = k[0]
while J < K: while J < K:
@ -321,33 +518,6 @@ class Table(object):
cell.hspan = True cell.hspan = True
return self return self
def plot(self, geometry_type):
"""Plot geometry found on PDF page based on geometry_type
specified, useful for debugging and playing with different
parameters to get the best output.
Parameters
----------
geometry_type : str
The geometry type for which a plot should be generated.
Can be 'text', 'table', 'contour', 'joint', 'line'
"""
if self.flavor == 'stream' and geometry_type in ['contour', 'joint', 'line']:
raise NotImplementedError("{} cannot be plotted with flavor='stream'".format(
geometry_type))
if geometry_type == 'text':
plot_text(self._text)
elif geometry_type == 'table':
plot_table(self)
elif geometry_type == 'contour':
plot_contour(self._image)
elif geometry_type == 'joint':
plot_joint(self._image)
elif geometry_type == 'line':
plot_line(self._segments)
def to_csv(self, path, **kwargs): def to_csv(self, path, **kwargs):
"""Writes Table to a comma-separated values (csv) file. """Writes Table to a comma-separated values (csv) file.

View File

@ -1,6 +1,7 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
import os import os
import sys
from PyPDF2 import PdfFileReader, PdfFileWriter from PyPDF2 import PdfFileReader, PdfFileWriter
@ -21,14 +22,22 @@ class PDFHandler(object):
Path to PDF file. Path to PDF file.
pages : str, optional (default: '1') pages : str, optional (default: '1')
Comma-separated page numbers. Comma-separated page numbers.
Example: 1,3,4 or 1,4-end. Example: '1,3,4' or '1,4-end'.
password : str, optional (default: None)
Password for decryption.
""" """
def __init__(self, filename, pages='1'): def __init__(self, filename, pages='1', password=None):
self.filename = filename self.filename = filename
if not self.filename.endswith('.pdf'): if not filename.lower().endswith('.pdf'):
raise NotImplementedError("File format not supported") raise NotImplementedError("File format not supported")
self.pages = self._get_pages(self.filename, pages) self.pages = self._get_pages(self.filename, pages)
if password is None:
self.password = ''
else:
self.password = password
if sys.version_info[0] < 3:
self.password = self.password.encode('ascii')
def _get_pages(self, filename, pages): def _get_pages(self, filename, pages):
"""Converts pages string to list of ints. """Converts pages string to list of ints.
@ -52,6 +61,8 @@ class PDFHandler(object):
page_numbers.append({'start': 1, 'end': 1}) page_numbers.append({'start': 1, 'end': 1})
else: else:
infile = PdfFileReader(open(filename, 'rb'), strict=False) infile = PdfFileReader(open(filename, 'rb'), strict=False)
if infile.isEncrypted:
infile.decrypt(self.password)
if pages == 'all': if pages == 'all':
page_numbers.append({'start': 1, 'end': infile.getNumPages()}) page_numbers.append({'start': 1, 'end': infile.getNumPages()})
else: else:
@ -84,7 +95,7 @@ class PDFHandler(object):
with open(filename, 'rb') as fileobj: with open(filename, 'rb') as fileobj:
infile = PdfFileReader(fileobj, strict=False) infile = PdfFileReader(fileobj, strict=False)
if infile.isEncrypted: if infile.isEncrypted:
infile.decrypt('') infile.decrypt(self.password)
fpath = os.path.join(temp, 'page-{0}.pdf'.format(page)) fpath = os.path.join(temp, 'page-{0}.pdf'.format(page))
froot, fext = os.path.splitext(fpath) froot, fext = os.path.splitext(fpath)
p = infile.getPage(page - 1) p = infile.getPage(page - 1)
@ -103,7 +114,7 @@ class PDFHandler(object):
os.rename(fpath, fpath_new) os.rename(fpath, fpath_new)
infile = PdfFileReader(open(fpath_new, 'rb'), strict=False) infile = PdfFileReader(open(fpath_new, 'rb'), strict=False)
if infile.isEncrypted: if infile.isEncrypted:
infile.decrypt('') infile.decrypt(self.password)
outfile = PdfFileWriter() outfile = PdfFileWriter()
p = infile.getPage(0) p = infile.getPage(0)
if rotation == 'anticlockwise': if rotation == 'anticlockwise':
@ -114,7 +125,7 @@ class PDFHandler(object):
with open(fpath, 'wb') as f: with open(fpath, 'wb') as f:
outfile.write(f) outfile.write(f)
def parse(self, flavor='lattice', **kwargs): def parse(self, flavor='lattice', suppress_stdout=False, **kwargs):
"""Extracts tables by calling parser.get_tables on all single """Extracts tables by calling parser.get_tables on all single
page PDFs. page PDFs.
@ -123,6 +134,8 @@ class PDFHandler(object):
flavor : str (default: 'lattice') flavor : str (default: 'lattice')
The parsing method to use ('lattice' or 'stream'). The parsing method to use ('lattice' or 'stream').
Lattice is used by default. Lattice is used by default.
suppress_stdout : str (default: False)
Suppress logs and warnings.
kwargs : dict kwargs : dict
See camelot.read_pdf kwargs. See camelot.read_pdf kwargs.
@ -130,9 +143,6 @@ class PDFHandler(object):
------- -------
tables : camelot.core.TableList tables : camelot.core.TableList
List of tables found in PDF. List of tables found in PDF.
geometry : camelot.core.GeometryList
List of geometry objects (contours, lines, joints) found
in PDF.
""" """
tables = [] tables = []
@ -143,6 +153,6 @@ class PDFHandler(object):
for p in self.pages] for p in self.pages]
parser = Lattice(**kwargs) if flavor == 'lattice' else Stream(**kwargs) parser = Lattice(**kwargs) if flavor == 'lattice' else Stream(**kwargs)
for p in pages: for p in pages:
t = parser.extract_tables(p) t = parser.extract_tables(p, suppress_stdout=suppress_stdout)
tables.extend(t) tables.extend(t)
return TableList(tables) return TableList(tables)

View File

@ -1,10 +1,12 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
import warnings
from .handlers import PDFHandler from .handlers import PDFHandler
from .utils import validate_input, remove_extra from .utils import validate_input, remove_extra
def read_pdf(filepath, pages='1', flavor='lattice', **kwargs): def read_pdf(filepath, pages='1', password=None, flavor='lattice',
suppress_stdout=False, **kwargs):
"""Read PDF and return extracted tables. """Read PDF and return extracted tables.
Note: kwargs annotated with ^ can only be used with flavor='stream' Note: kwargs annotated with ^ can only be used with flavor='stream'
@ -16,11 +18,15 @@ def read_pdf(filepath, pages='1', flavor='lattice', **kwargs):
Path to PDF file. Path to PDF file.
pages : str, optional (default: '1') pages : str, optional (default: '1')
Comma-separated page numbers. Comma-separated page numbers.
Example: 1,3,4 or 1,4-end. Example: '1,3,4' or '1,4-end'.
password : str, optional (default: None)
Password for decryption.
flavor : str (default: 'lattice') flavor : str (default: 'lattice')
The parsing method to use ('lattice' or 'stream'). The parsing method to use ('lattice' or 'stream').
Lattice is used by default. Lattice is used by default.
table_area : list, optional (default: None) suppress_stdout : bool, optional (default: True)
Print all logs and warnings.
table_areas : list, optional (default: None)
List of table area strings of the form x1,y1,x2,y2 List of table area strings of the form x1,y1,x2,y2
where (x1, y1) -> left-top and (x2, y2) -> right-bottom where (x1, y1) -> left-top and (x2, y2) -> right-bottom
in PDF coordinate space. in PDF coordinate space.
@ -85,8 +91,12 @@ def read_pdf(filepath, pages='1', flavor='lattice', **kwargs):
raise NotImplementedError("Unknown flavor specified." raise NotImplementedError("Unknown flavor specified."
" Use either 'lattice' or 'stream'") " Use either 'lattice' or 'stream'")
validate_input(kwargs, flavor=flavor) with warnings.catch_warnings():
p = PDFHandler(filepath, pages) if suppress_stdout:
kwargs = remove_extra(kwargs, flavor=flavor) warnings.simplefilter("ignore")
tables = p.parse(flavor=flavor, **kwargs)
return tables validate_input(kwargs, flavor=flavor)
p = PDFHandler(filepath, pages=pages, password=password)
kwargs = remove_extra(kwargs, flavor=flavor)
tables = p.parse(flavor=flavor, suppress_stdout=suppress_stdout, **kwargs)
return tables

View File

@ -31,7 +31,7 @@ class Lattice(BaseParser):
Parameters Parameters
---------- ----------
table_area : list, optional (default: None) table_areas : list, optional (default: None)
List of table area strings of the form x1,y1,x2,y2 List of table area strings of the form x1,y1,x2,y2
where (x1, y1) -> left-top and (x2, y2) -> right-bottom where (x1, y1) -> left-top and (x2, y2) -> right-bottom
in PDF coordinate space. in PDF coordinate space.
@ -79,12 +79,12 @@ class Lattice(BaseParser):
For more information, refer `PDFMiner docs <https://euske.github.io/pdfminer/>`_. For more information, refer `PDFMiner docs <https://euske.github.io/pdfminer/>`_.
""" """
def __init__(self, table_area=None, process_background=False, def __init__(self, table_areas=None, process_background=False,
line_size_scaling=15, copy_text=None, shift_text=['l', 't'], line_size_scaling=15, copy_text=None, shift_text=['l', 't'],
split_text=False, flag_size=False, line_close_tol=2, split_text=False, flag_size=False, line_close_tol=2,
joint_close_tol=2, threshold_blocksize=15, threshold_constant=-2, joint_close_tol=2, threshold_blocksize=15, threshold_constant=-2,
iterations=0, margins=(1.0, 0.5, 0.1), **kwargs): iterations=0, margins=(1.0, 0.5, 0.1), **kwargs):
self.table_area = table_area self.table_areas = table_areas
self.process_background = process_background self.process_background = process_background
self.line_size_scaling = line_size_scaling self.line_size_scaling = line_size_scaling
self.copy_text = copy_text self.copy_text = copy_text
@ -210,9 +210,9 @@ class Lattice(BaseParser):
self.threshold, direction='horizontal', self.threshold, direction='horizontal',
line_size_scaling=self.line_size_scaling, iterations=self.iterations) line_size_scaling=self.line_size_scaling, iterations=self.iterations)
if self.table_area is not None: if self.table_areas is not None:
areas = [] areas = []
for area in self.table_area: for area in self.table_areas:
x1, y1, x2, y2 = area.split(",") x1, y1, x2, y2 = area.split(",")
x1 = float(x1) x1 = float(x1)
y1 = float(y1) y1 = float(y1)
@ -237,10 +237,11 @@ class Lattice(BaseParser):
tk, self.vertical_segments, self.horizontal_segments) tk, self.vertical_segments, self.horizontal_segments)
t_bbox['horizontal'] = text_in_bbox(tk, self.horizontal_text) t_bbox['horizontal'] = text_in_bbox(tk, self.horizontal_text)
t_bbox['vertical'] = text_in_bbox(tk, self.vertical_text) t_bbox['vertical'] = text_in_bbox(tk, self.vertical_text)
self.t_bbox = t_bbox
for direction in t_bbox: t_bbox['horizontal'].sort(key=lambda x: (-x.y0, x.x0))
t_bbox[direction].sort(key=lambda x: (-x.y0, x.x0)) t_bbox['vertical'].sort(key=lambda x: (x.x0, -x.y0))
self.t_bbox = t_bbox
cols, rows = zip(*self.table_bbox[tk]) cols, rows = zip(*self.table_bbox[tk])
cols, rows = list(cols), list(rows) cols, rows = list(cols), list(rows)
@ -274,7 +275,9 @@ class Lattice(BaseParser):
table = table.set_span() table = table.set_span()
pos_errors = [] pos_errors = []
for direction in self.t_bbox: # TODO: have a single list in place of two directional ones?
# sorted on x-coordinate based on reading order i.e. LTR or RTL
for direction in ['vertical', 'horizontal']:
for t in self.t_bbox[direction]: for t in self.t_bbox[direction]:
indices, error = get_table_index( indices, error = get_table_index(
table, t, direction, split_text=self.split_text, table, t, direction, split_text=self.split_text,
@ -307,12 +310,14 @@ class Lattice(BaseParser):
table._text = _text table._text = _text
table._image = (self.image, self.table_bbox_unscaled) table._image = (self.image, self.table_bbox_unscaled)
table._segments = (self.vertical_segments, self.horizontal_segments) table._segments = (self.vertical_segments, self.horizontal_segments)
table._textedges = None
return table return table
def extract_tables(self, filename): def extract_tables(self, filename, suppress_stdout=False):
self._generate_layout(filename) self._generate_layout(filename)
logger.info('Processing {}'.format(os.path.basename(self.rootname))) if not suppress_stdout:
logger.info('Processing {}'.format(os.path.basename(self.rootname)))
if not self.horizontal_text: if not self.horizontal_text:
warnings.warn("No tables found on {}".format( warnings.warn("No tables found on {}".format(
@ -328,6 +333,7 @@ class Lattice(BaseParser):
self.table_bbox.keys(), key=lambda x: x[1], reverse=True)): self.table_bbox.keys(), key=lambda x: x[1], reverse=True)):
cols, rows, v_s, h_s = self._generate_columns_and_rows(table_idx, tk) cols, rows, v_s, h_s = self._generate_columns_and_rows(table_idx, tk)
table = self._generate_table(table_idx, cols, rows, v_s=v_s, h_s=h_s) table = self._generate_table(table_idx, cols, rows, v_s=v_s, h_s=h_s)
table._bbox = tk
_tables.append(table) _tables.append(table)
return _tables return _tables

View File

@ -9,7 +9,7 @@ import numpy as np
import pandas as pd import pandas as pd
from .base import BaseParser from .base import BaseParser
from ..core import Table from ..core import TextEdges, Table
from ..utils import (text_in_bbox, get_table_index, compute_accuracy, from ..utils import (text_in_bbox, get_table_index, compute_accuracy,
compute_whitespace) compute_whitespace)
@ -26,7 +26,7 @@ class Stream(BaseParser):
Parameters Parameters
---------- ----------
table_area : list, optional (default: None) table_areas : list, optional (default: None)
List of table area strings of the form x1,y1,x2,y2 List of table area strings of the form x1,y1,x2,y2
where (x1, y1) -> left-top and (x2, y2) -> right-bottom where (x1, y1) -> left-top and (x2, y2) -> right-bottom
in PDF coordinate space. in PDF coordinate space.
@ -50,10 +50,10 @@ class Stream(BaseParser):
For more information, refer `PDFMiner docs <https://euske.github.io/pdfminer/>`_. For more information, refer `PDFMiner docs <https://euske.github.io/pdfminer/>`_.
""" """
def __init__(self, table_area=None, columns=None, split_text=False, def __init__(self, table_areas=None, columns=None, split_text=False,
flag_size=False, row_close_tol=2, col_close_tol=0, flag_size=False, row_close_tol=2, col_close_tol=0,
margins=(1.0, 0.5, 0.1), **kwargs): margins=(1.0, 0.5, 0.1), **kwargs):
self.table_area = table_area self.table_areas = table_areas
self.columns = columns self.columns = columns
self._validate_columns() self._validate_columns()
self.split_text = split_text self.split_text = split_text
@ -116,7 +116,7 @@ class Stream(BaseParser):
row_y = t.y0 row_y = t.y0
temp.append(t) temp.append(t)
rows.append(sorted(temp, key=lambda t: t.x0)) rows.append(sorted(temp, key=lambda t: t.x0))
__ = rows.pop(0) # hacky __ = rows.pop(0) # TODO: hacky
return rows return rows
@staticmethod @staticmethod
@ -241,15 +241,42 @@ class Stream(BaseParser):
return cols return cols
def _validate_columns(self): def _validate_columns(self):
if self.table_area is not None and self.columns is not None: if self.table_areas is not None and self.columns is not None:
if len(self.table_area) != len(self.columns): if len(self.table_areas) != len(self.columns):
raise ValueError("Length of table_area and columns" raise ValueError("Length of table_areas and columns"
" should be equal") " should be equal")
def _nurminen_table_detection(self, textlines):
"""A general implementation of the table detection algorithm
described by Anssi Nurminen's master's thesis.
Link: https://dspace.cc.tut.fi/dpub/bitstream/handle/123456789/21520/Nurminen.pdf?sequence=3
Assumes that tables are situated relatively far apart
vertically.
"""
# TODO: add support for arabic text #141
# sort textlines in reading order
textlines.sort(key=lambda x: (-x.y0, x.x0))
textedges = TextEdges()
# generate left, middle and right textedges
textedges.generate(textlines)
# select relevant edges
relevant_textedges = textedges.get_relevant()
self.textedges.extend(relevant_textedges)
# guess table areas using textlines and relevant edges
table_bbox = textedges.get_table_areas(textlines, relevant_textedges)
# treat whole page as table area if no table areas found
if not len(table_bbox):
table_bbox = {(0, 0, self.pdf_width, self.pdf_height): None}
return table_bbox
def _generate_table_bbox(self): def _generate_table_bbox(self):
if self.table_area is not None: self.textedges = []
if self.table_areas is not None:
table_bbox = {} table_bbox = {}
for area in self.table_area: for area in self.table_areas:
x1, y1, x2, y2 = area.split(",") x1, y1, x2, y2 = area.split(",")
x1 = float(x1) x1 = float(x1)
y1 = float(y1) y1 = float(y1)
@ -257,7 +284,8 @@ class Stream(BaseParser):
y2 = float(y2) y2 = float(y2)
table_bbox[(x1, y2, x2, y1)] = None table_bbox[(x1, y2, x2, y1)] = None
else: else:
table_bbox = {(0, 0, self.pdf_width, self.pdf_height): None} # find tables based on nurminen's detection algorithm
table_bbox = self._nurminen_table_detection(self.horizontal_text)
self.table_bbox = table_bbox self.table_bbox = table_bbox
def _generate_columns_and_rows(self, table_idx, tk): def _generate_columns_and_rows(self, table_idx, tk):
@ -265,10 +293,11 @@ class Stream(BaseParser):
t_bbox = {} t_bbox = {}
t_bbox['horizontal'] = text_in_bbox(tk, self.horizontal_text) t_bbox['horizontal'] = text_in_bbox(tk, self.horizontal_text)
t_bbox['vertical'] = text_in_bbox(tk, self.vertical_text) t_bbox['vertical'] = text_in_bbox(tk, self.vertical_text)
self.t_bbox = t_bbox
for direction in self.t_bbox: t_bbox['horizontal'].sort(key=lambda x: (-x.y0, x.x0))
self.t_bbox[direction].sort(key=lambda x: (-x.y0, x.x0)) t_bbox['vertical'].sort(key=lambda x: (x.x0, -x.y0))
self.t_bbox = t_bbox
text_x_min, text_y_min, text_x_max, text_y_max = self._text_bbox(self.t_bbox) text_x_min, text_y_min, text_x_max, text_y_max = self._text_bbox(self.t_bbox)
rows_grouped = self._group_rows(self.t_bbox['horizontal'], row_close_tol=self.row_close_tol) rows_grouped = self._group_rows(self.t_bbox['horizontal'], row_close_tol=self.row_close_tol)
@ -286,10 +315,21 @@ class Stream(BaseParser):
cols.append(text_x_max) cols.append(text_x_max)
cols = [(cols[i], cols[i + 1]) for i in range(0, len(cols) - 1)] cols = [(cols[i], cols[i + 1]) for i in range(0, len(cols) - 1)]
else: else:
# calculate mode of the list of number of elements in
# each row to guess the number of columns
ncols = max(set(elements), key=elements.count) ncols = max(set(elements), key=elements.count)
if ncols == 1: if ncols == 1:
warnings.warn("No tables found on {}".format( # if mode is 1, the page usually contains not tables
os.path.basename(self.rootname))) # but there can be cases where the list can be skewed,
# try to remove all 1s from list in this case and
# see if the list contains elements, if yes, then use
# the mode after removing 1s
elements = list(filter(lambda x: x != 1, elements))
if len(elements):
ncols = max(set(elements), key=elements.count)
else:
warnings.warn("No tables found in table area {}".format(
table_idx + 1))
cols = [(t.x0, t.x1) for r in rows_grouped if len(r) == ncols for t in r] cols = [(t.x0, t.x1) for r in rows_grouped if len(r) == ncols for t in r]
cols = self._merge_columns(sorted(cols), col_close_tol=self.col_close_tol) cols = self._merge_columns(sorted(cols), col_close_tol=self.col_close_tol)
inner_text = [] inner_text = []
@ -311,8 +351,11 @@ class Stream(BaseParser):
def _generate_table(self, table_idx, cols, rows, **kwargs): def _generate_table(self, table_idx, cols, rows, **kwargs):
table = Table(cols, rows) table = Table(cols, rows)
table = table.set_all_edges() table = table.set_all_edges()
pos_errors = [] pos_errors = []
for direction in self.t_bbox: # TODO: have a single list in place of two directional ones?
# sorted on x-coordinate based on reading order i.e. LTR or RTL
for direction in ['vertical', 'horizontal']:
for t in self.t_bbox[direction]: for t in self.t_bbox[direction]:
indices, error = get_table_index( indices, error = get_table_index(
table, t, direction, split_text=self.split_text, table, t, direction, split_text=self.split_text,
@ -341,12 +384,14 @@ class Stream(BaseParser):
table._text = _text table._text = _text
table._image = None table._image = None
table._segments = None table._segments = None
table._textedges = self.textedges
return table return table
def extract_tables(self, filename): def extract_tables(self, filename, suppress_stdout=False):
self._generate_layout(filename) self._generate_layout(filename)
logger.info('Processing {}'.format(os.path.basename(self.rootname))) if not suppress_stdout:
logger.info('Processing {}'.format(os.path.basename(self.rootname)))
if not self.horizontal_text: if not self.horizontal_text:
warnings.warn("No tables found on {}".format( warnings.warn("No tables found on {}".format(
@ -361,6 +406,7 @@ class Stream(BaseParser):
self.table_bbox.keys(), key=lambda x: x[1], reverse=True)): self.table_bbox.keys(), key=lambda x: x[1], reverse=True)):
cols, rows = self._generate_columns_and_rows(table_idx, tk) cols, rows = self._generate_columns_and_rows(table_idx, tk)
table = self._generate_table(table_idx, cols, rows) table = self._generate_table(table_idx, cols, rows)
table._bbox = tk
_tables.append(table) _tables.append(table)
return _tables return _tables

View File

@ -1,108 +1,244 @@
import cv2 # -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import matplotlib.patches as patches try:
import matplotlib.pyplot as plt
import matplotlib.patches as patches
except ImportError:
_HAS_MPL = False
else:
_HAS_MPL = True
def plot_text(text): class PlotMethods(object):
"""Generates a plot for all text present on the PDF page. def __call__(self, table, kind='text', filename=None):
"""Plot elements found on PDF page based on kind
specified, useful for debugging and playing with different
parameters to get the best output.
Parameters Parameters
---------- ----------
text : list table: camelot.core.Table
A Camelot Table.
kind : str, optional (default: 'text')
{'text', 'grid', 'contour', 'joint', 'line'}
The element type for which a plot should be generated.
filepath: str, optional (default: None)
Absolute path for saving the generated plot.
""" Returns
fig = plt.figure() -------
ax = fig.add_subplot(111, aspect='equal') fig : matplotlib.fig.Figure
xs, ys = [], []
for t in text: """
xs.extend([t[0], t[2]]) if not _HAS_MPL:
ys.extend([t[1], t[3]]) raise ImportError('matplotlib is required for plotting.')
ax.add_patch(
patches.Rectangle( if table.flavor == 'lattice' and kind in ['textedge']:
(t[0], t[1]), raise NotImplementedError("Lattice flavor does not support kind='{}'".format(
t[2] - t[0], kind))
t[3] - t[1] elif table.flavor == 'stream' and kind in ['joint', 'line']:
raise NotImplementedError("Stream flavor does not support kind='{}'".format(
kind))
plot_method = getattr(self, kind)
return plot_method(table)
def text(self, table):
"""Generates a plot for all text elements present
on the PDF page.
Parameters
----------
table : camelot.core.Table
Returns
-------
fig : matplotlib.fig.Figure
"""
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
xs, ys = [], []
for t in table._text:
xs.extend([t[0], t[2]])
ys.extend([t[1], t[3]])
ax.add_patch(
patches.Rectangle(
(t[0], t[1]),
t[2] - t[0],
t[3] - t[1]
)
) )
) ax.set_xlim(min(xs) - 10, max(xs) + 10)
ax.set_xlim(min(xs) - 10, max(xs) + 10) ax.set_ylim(min(ys) - 10, max(ys) + 10)
ax.set_ylim(min(ys) - 10, max(ys) + 10) return fig
plt.show()
def grid(self, table):
"""Generates a plot for the detected table grids
on the PDF page.
def plot_table(table): Parameters
"""Generates a plot for the table. ----------
table : camelot.core.Table
Parameters Returns
---------- -------
table : camelot.core.Table fig : matplotlib.fig.Figure
""" """
for row in table.cells: fig = plt.figure()
for cell in row: ax = fig.add_subplot(111, aspect='equal')
if cell.left: for row in table.cells:
plt.plot([cell.lb[0], cell.lt[0]], for cell in row:
[cell.lb[1], cell.lt[1]]) if cell.left:
if cell.right: ax.plot([cell.lb[0], cell.lt[0]],
plt.plot([cell.rb[0], cell.rt[0]], [cell.lb[1], cell.lt[1]])
[cell.rb[1], cell.rt[1]]) if cell.right:
if cell.top: ax.plot([cell.rb[0], cell.rt[0]],
plt.plot([cell.lt[0], cell.rt[0]], [cell.rb[1], cell.rt[1]])
[cell.lt[1], cell.rt[1]]) if cell.top:
if cell.bottom: ax.plot([cell.lt[0], cell.rt[0]],
plt.plot([cell.lb[0], cell.rb[0]], [cell.lt[1], cell.rt[1]])
[cell.lb[1], cell.rb[1]]) if cell.bottom:
plt.show() ax.plot([cell.lb[0], cell.rb[0]],
[cell.lb[1], cell.rb[1]])
return fig
def contour(self, table):
"""Generates a plot for all table boundaries present
on the PDF page.
def plot_contour(image): Parameters
"""Generates a plot for all table boundaries present on the ----------
PDF page. table : camelot.core.Table
Parameters Returns
---------- -------
image : tuple fig : matplotlib.fig.Figure
""" """
img, table_bbox = image try:
for t in table_bbox.keys(): img, table_bbox = table._image
cv2.rectangle(img, (t[0], t[1]), _FOR_LATTICE = True
(t[2], t[3]), (255, 0, 0), 20) except TypeError:
plt.imshow(img) img, table_bbox = (None, {table._bbox: None})
plt.show() _FOR_LATTICE = False
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
xs, ys = [], []
if not _FOR_LATTICE:
for t in table._text:
xs.extend([t[0], t[2]])
ys.extend([t[1], t[3]])
ax.add_patch(
patches.Rectangle(
(t[0], t[1]),
t[2] - t[0],
t[3] - t[1],
color='blue'
)
)
def plot_joint(image): for t in table_bbox.keys():
"""Generates a plot for all line intersections present on the ax.add_patch(
PDF page. patches.Rectangle(
(t[0], t[1]),
t[2] - t[0],
t[3] - t[1],
fill=False,
color='red'
)
)
if not _FOR_LATTICE:
xs.extend([t[0], t[2]])
ys.extend([t[1], t[3]])
ax.set_xlim(min(xs) - 10, max(xs) + 10)
ax.set_ylim(min(ys) - 10, max(ys) + 10)
Parameters if _FOR_LATTICE:
---------- ax.imshow(img)
image : tuple return fig
""" def textedge(self, table):
img, table_bbox = image """Generates a plot for relevant textedges.
x_coord = []
y_coord = []
for k in table_bbox.keys():
for coord in table_bbox[k]:
x_coord.append(coord[0])
y_coord.append(coord[1])
plt.plot(x_coord, y_coord, 'ro')
plt.imshow(img)
plt.show()
Parameters
----------
table : camelot.core.Table
def plot_line(segments): Returns
"""Generates a plot for all line segments present on the PDF page. -------
fig : matplotlib.fig.Figure
Parameters """
---------- fig = plt.figure()
segments : tuple ax = fig.add_subplot(111, aspect='equal')
xs, ys = [], []
for t in table._text:
xs.extend([t[0], t[2]])
ys.extend([t[1], t[3]])
ax.add_patch(
patches.Rectangle(
(t[0], t[1]),
t[2] - t[0],
t[3] - t[1],
color='blue'
)
)
ax.set_xlim(min(xs) - 10, max(xs) + 10)
ax.set_ylim(min(ys) - 10, max(ys) + 10)
""" for te in table._textedges:
vertical, horizontal = segments ax.plot([te.x, te.x],
for v in vertical: [te.y0, te.y1])
plt.plot([v[0], v[2]], [v[1], v[3]])
for h in horizontal: return fig
plt.plot([h[0], h[2]], [h[1], h[3]])
plt.show() def joint(self, table):
"""Generates a plot for all line intersections present
on the PDF page.
Parameters
----------
table : camelot.core.Table
Returns
-------
fig : matplotlib.fig.Figure
"""
img, table_bbox = table._image
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
x_coord = []
y_coord = []
for k in table_bbox.keys():
for coord in table_bbox[k]:
x_coord.append(coord[0])
y_coord.append(coord[1])
ax.plot(x_coord, y_coord, 'ro')
ax.imshow(img)
return fig
def line(self, table):
"""Generates a plot for all line segments present
on the PDF page.
Parameters
----------
table : camelot.core.Table
Returns
-------
fig : matplotlib.fig.Figure
"""
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
vertical, horizontal = table._segments
for v in vertical:
ax.plot([v[0], v[2]], [v[1], v[3]])
for h in horizontal:
ax.plot([h[0], h[2]], [h[1], h[3]])
return fig

View File

@ -344,9 +344,9 @@ def flag_font_size(textline, direction):
fchars = [t[0] for t in chars] fchars = [t[0] for t in chars]
if ''.join(fchars).strip(): if ''.join(fchars).strip():
flist.append(''.join(fchars)) flist.append(''.join(fchars))
fstring = ''.join(flist).strip('\n') fstring = ''.join(flist)
else: else:
fstring = ''.join([t.get_text() for t in textline]).strip('\n') fstring = ''.join([t.get_text() for t in textline])
return fstring return fstring
@ -419,7 +419,7 @@ def split_textline(table, textline, direction, flag_size=False):
             grouped_chars.append((key[0], key[1], flag_font_size([t[2] for t in chars], direction)))
         else:
             gchars = [t[2].get_text() for t in chars]
-            grouped_chars.append((key[0], key[1], ''.join(gchars).strip('\n')))
+            grouped_chars.append((key[0], key[1], ''.join(gchars)))
     return grouped_chars
@ -500,7 +500,7 @@ def get_table_index(table, t, direction, split_text=False, flag_size=False):
         if flag_size:
             return [(r_idx, c_idx, flag_font_size(t._objs, direction))], error
         else:
-            return [(r_idx, c_idx, t.get_text().strip('\n'))], error
+            return [(r_idx, c_idx, t.get_text())], error
 def compute_accuracy(error_weights):
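The three hunks above drop the trailing `.strip('\n')` calls, so newlines that belong to a cell's text are kept instead of being trimmed while the cell string is assembled. A small, self-contained illustration of the difference:

# Characters collected for one cell, as flag_font_size joins them.
flist = ["Sl.\n", "No.\n"]

old_behaviour = "".join(flist).strip("\n")  # 'Sl.\nNo.'   (trailing newline removed)
new_behaviour = "".join(flist)              # 'Sl.\nNo.\n' (newline preserved)

assert old_behaviour == "Sl.\nNo."
assert new_behaviour == "Sl.\nNo.\n"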

View File

@ -0,0 +1,96 @@
"0","1","2","3","4","5","6","7","8","9","10"
"Sl.
No.","District","n
o
i
t
a
l3
opu2-1hs)
P1k
d 20 la
er n
cto(I
ef
j
o
r
P","%
8
8
o )
s
ult t tkh
dna
Aalen l
v(I
i
u
q
E",")
y
n a
umptiomentadult/donnes)
nsres/h t
ouimk
Cqga
al re00n L
ot 4(I
T @
(","menteds, age)nes)
uireg sewastton
qn h
Reudis &ak
al cld L
tnen
To(Ife(I","","","","",""
"","","","","","","f
i
r
a
h
K","i
b
a
R","l
a
t
o
T","e
c
i
R","y
d
d
a
P"
"1","Balasore","23.65","20.81","3.04","3.47","2.78","0.86","3.64","0.17","0.25"
"2","Bhadrak","15.34","13.50","1.97","2.25","3.50","0.05","3.55","1.30","1.94"
"3","Balangir","17.01","14.97","2.19","2.50","6.23","0.10","6.33","3.83","5.72"
"4","Subarnapur","6.70","5.90","0.86","0.98","4.48","1.13","5.61","4.63","6.91"
"5","Cuttack","26.63","23.43","3.42","3.91","3.75","0.06","3.81","-0.10","-0.15"
"6","Jagatsingpur","11.49","10.11","1.48","1.69","2.10","0.02","2.12","0.43","0.64"
"7","Jajpur","18.59","16.36","2.39","2.73","2.13","0.04","2.17","-0.56","-0.84"
"8","Kendrapara","14.62","12.87","1.88","2.15","2.60","0.07","2.67","0.52","0.78"
"9","Dhenkanal","12.13","10.67","1.56","1.78","2.26","0.02","2.28","0.50","0.75"
"10","Angul","12.93","11.38","1.66","1.90","1.73","0.02","1.75","-0.15","-0.22"
"11","Ganjam","35.77","31.48","4.60","5.26","4.57","0.00","4.57","-0.69","-1.03"
"12","Gajapati","5.85","5.15","0.75","0.86","0.68","0.01","0.69","-0.17","-0.25"
"13","Kalahandi","16.12","14.19","2.07","2.37","5.42","1.13","6.55","4.18","6.24"
"14","Nuapada","6.18","5.44","0.79","0.90","1.98","0.08","2.06","1.16","1.73"
"15","Keonjhar","18.42","16.21","2.37","2.71","2.76","0.08","2.84","0.13","0.19"
"16","Koraput","14.09","12.40","1.81","2.07","2.08","0.34","2.42","0.35","0.52"
"17","Malkangiri","6.31","5.55","0.81","0.93","1.78","0.04","1.82","0.89","1.33"
"18","Nabarangpur","12.50","11.00","1.61","1.84","3.26","0.02","3.28","1.44","2.15"
"19","Rayagada","9.83","8.65","1.26","1.44","1.15","0.03","1.18","-0.26","-0.39"
"20","Mayurbhanj","25.61","22.54","3.29","3.76","4.90","0.06","4.96","1.20","1.79"
"21","Kandhamal","7.45","6.56","0.96","1.10","0.70","0.01","0.71","-0.39","-0.58"
"22","Boudh","4.51","3.97","0.58","0.66","1.73","0.03","1.76","1.10","1.64"
"23","Puri","17.29","15.22","2.22","2.54","2.45","0.99","3.44","0.90","1.34"
"24","Khordha","23.08","20.31","2.97","3.39","2.02","0.03","2.05","-1.34","-2.00"
"25","Nayagarh","9.78","8.61","1.26","1.44","2.10","0.00","2.10","0.66","0.99"
"26","Sambalpur","10.62","9.35","1.37","1.57","3.45","0.71","4.16","2.59","3.87"
"27","Bargarh","15.00","13.20","1.93","2.21","6.87","2.65","9.52","7.31","10.91"
"28","Deogarh","3.18","2.80","0.41","0.47","1.12","0.07","1.19","0.72","1.07"
"29","Jharsuguda","5.91","5.20","0.76","0.87","0.99","0.01","1.00","0.13","0.19"
"30","","","18.66","2.72","3.11","4.72","0.02","4.74","1.63","2.43"

View File

@ -0,0 +1,56 @@
"0","1","2","3","4","5","6","7"
"Rate of Accidental Deaths & Suicides and Population Growth During 1967 to 2013","","","","","","",""
"Sl.
No.","Year","Population
(in Lakh)","Accidental Deaths","","Suicides","","Percentage
Population
growth"
"","","","Incidence","Rate","Incidence","Rate",""
"(1)","(2)","(3)","(4)","(5)","(6)","(7)","(8)"
"1.","1967","4999","126762","25.4","38829","7.8","2.2"
"2.","1968","5111","126232","24.7","40688","8.0","2.2"
"3.","1969","5225","130755","25.0","43633","8.4","2.2"
"4.","1970","5343","139752","26.2","48428","9.1","2.3"
"5.","1971","5512","105601","19.2","43675","7.9","3.2"
"6.","1972","5635","106184","18.8","43601","7.7","2.2"
"7.","1973","5759","130654","22.7","40807","7.1","2.2"
"8.","1974","5883","110624","18.8","46008","7.8","2.2"
"9.","1975","6008","113016","18.8","42890","7.1","2.1"
"10.","1976","6136","111611","18.2","41415","6.7","2.1"
"11.","1977","6258","117338","18.8","39718","6.3","2.0"
"12.","1978","6384","118594","18.6","40207","6.3","2.0"
"13.","1979","6510","108987","16.7","38217","5.9","2.0"
"14.","1980","6636","116912","17.6","41663","6.3","1.9"
"15.","1981","6840","122221","17.9","40245","5.9","3.1"
"16.","1982","7052","125993","17.9","44732","6.3","3.1"
"17.","1983","7204","128576","17.8","46579","6.5","2.2"
"18.","1984","7356","134628","18.3","50571","6.9","2.1"
"19.","1985","7509","139657","18.6","52811","7.0","2.1"
"20.","1986","7661","147023","19.2","54357","7.1","2.0"
"21.","1987","7814","152314","19.5","58568","7.5","2.0"
"22.","1988","7966","163522","20.5","64270","8.1","1.9"
"23.","1989","8118","169066","20.8","68744","8.5","1.9"
"24.","1990","8270","174401","21.1","73911","8.9","1.9"
"25.","1991","8496","188003","22.1","78450","9.2","2.7"
"26.","1992","8677","194910","22.5","80149","9.2","2.1"
"27.","1993","8838","192357","21.8","84244","9.5","1.9"
"28.","1994","8997","190435","21.2","89195","9.9","1.8"
"29.","1995","9160","222487","24.3","89178","9.7","1.8"
"30.","1996","9319","220094","23.6","88241","9.5","1.7"
"31.","1997","9552","233903","24.5","95829","10.0","2.5"
"32.","1998","9709","258409","26.6","104713","10.8","1.6"
"33.","1999","9866","271918","27.6","110587","11.2","1.6"
"34.","2000","10021","255883","25.5","108593","10.8","1.6"
"35.","2001","10270","271019","26.4","108506","10.6","2.5"
"36.","2002","10506","260122","24.8","110417","10.5","2.3"
"37.","2003","10682","259625","24.3","110851","10.4","1.7"
"38.","2004","10856","277263","25.5","113697","10.5","1.6"
"39.","2005","11028","294175","26.7","113914","10.3","1.6"
"40.","2006","11198","314704","28.1","118112","10.5","1.5"
"41.","2007","11366","340794","30.0","122637","10.8","1.5"
"42.","2008","11531","342309","29.7","125017","10.8","1.4"
"43.","2009","11694","357021","30.5","127151","10.9","1.4"
"44.","2010","11858","384649","32.4","134599","11.4","1.4"
"45.","2011","12102","390884","32.3","135585","11.2","2.1"
"46.","2012","12134","394982","32.6","135445","11.2","1.0"
"47.","2013","12288","400517","32.6","134799","11.0","1.0"

View File

@ -0,0 +1,18 @@
"0","1","2"
"","e
bl
a
ail
v
a
t
o
n
a
t
a
D
*",""

View File

@ -0,0 +1,3 @@
"0"
"Sl."
"No."

View File

@ -0,0 +1,3 @@
"0"
"Table 6 : DISTRIBUTION (%) OF HOUSEHOLDS BY LITERACY STATUS OF"
"MALE HEAD OF THE HOUSEHOLD"

View File

@ -1,3 +1,7 @@
"[In thousands (11,062.6 represents 11,062,600) For year ending December 31. Based on Uniform Crime Reporting (UCR)","","","","","","","","",""
"Program. Represents arrests reported (not charged) by 12,910 agencies with a total population of 247,526,916 as estimated","","","","","","","","",""
"by the FBI. Some persons may be arrested more than once during a year, therefore, the data in this table, in some cases,","","","","","","","","",""
"could represent multiple arrests of the same person. See text, this section and source]","","","","","","","","",""
"","","Total","","","Male","","","Female","" "","","Total","","","Male","","","Female",""
"Offense charged","","Under 18","18 years","","Under 18","18 years","","Under 18","18 years" "Offense charged","","Under 18","18 years","","Under 18","18 years","","Under 18","18 years"
"","Total","years","and over","Total","years","and over","Total","years","and over" "","Total","years","and over","Total","years","and over","Total","years","and over"
@ -36,3 +40,4 @@
"Curfew and loitering law violations ..","91.0","91.0","(X)","63.1","63.1","(X)","28.0","28.0","(X)" "Curfew and loitering law violations ..","91.0","91.0","(X)","63.1","63.1","(X)","28.0","28.0","(X)"
"Runaways . . . . . . . .. .. .. .. .. ....","75.8","75.8","(X)","34.0","34.0","(X)","41.8","41.8","(X)" "Runaways . . . . . . . .. .. .. .. .. ....","75.8","75.8","(X)","34.0","34.0","(X)","41.8","41.8","(X)"
""," Represents zero. X Not applicable. 1 Buying, receiving, possessing stolen property. 2 Except forcible rape and prostitution.","","","","","","","","" ""," Represents zero. X Not applicable. 1 Buying, receiving, possessing stolen property. 2 Except forcible rape and prostitution.","","","","","","","",""
"","Source: U.S. Department of Justice, Federal Bureau of Investigation, Uniform Crime Reports, Arrests Master Files.","","","","","","","",""


View File

@ -1,3 +1,7 @@
"","Source: U.S. Department of Justice, Federal Bureau of Investigation, Uniform Crime Reports, Arrests Master Files.","","","",""
"Table 325. Arrests by Race: 2009","","","","",""
"[Based on Uniform Crime Reporting (UCR) Program. Represents arrests reported (not charged) by 12,371 agencies","","","","",""
"with a total population of 239,839,971 as estimated by the FBI. See headnote, Table 324]","","","","",""
"","","","","American","" "","","","","American",""
"Offense charged","","","","Indian/Alaskan","Asian Pacific" "Offense charged","","","","Indian/Alaskan","Asian Pacific"
"","Total","White","Black","Native","Islander" "","Total","White","Black","Native","Islander"
@ -34,3 +38,4 @@
"Curfew and loitering law violations . .. ... .. ....","89,578","54,439","33,207","872","1,060" "Curfew and loitering law violations . .. ... .. ....","89,578","54,439","33,207","872","1,060"
"Runaways . . . . . . . .. .. .. .. .. .. .... .. ..... .","73,616","48,343","19,670","1,653","3,950" "Runaways . . . . . . . .. .. .. .. .. .. .... .. ..... .","73,616","48,343","19,670","1,653","3,950"
"1 Except forcible rape and prostitution.","","","","","" "1 Except forcible rape and prostitution.","","","","",""
"","Source: U.S. Department of Justice, Federal Bureau of Investigation, “Crime in the United States, Arrests,” September 2010,","","","",""


View File

@ -1,35 +1,43 @@
"","","","","","SCN","Seed","Yield","Moisture","Lodgingg","g","Stand","","Gross" "","2012 BETTER VARIETIES Harvest Report for Minnesota Central [ MNCE ]2012 BETTER VARIETIES Harvest Report for Minnesota Central [ MNCE ]","","","","","","","","","","","","ALL SEASON TESTALL SEASON TEST",""
"Company/Brandpy","","Product/Brand†","Technol.†","Mat.","Resist.","Trmt.†","Bu/A","%","%","","(x 1000)(",")","Income" "","Doug Toreen, Renville County, MN 55310 [ BIRD ISLAND ]Doug Toreen, Renville County, MN 55310","","","","","[ BIRD ISLAND ]","","","","","","","1.3 - 2.0 MAT. GROUP1.3 - 2.0 MAT. GROUP",""
"KrugerKruger","","K2-1901K2 1901","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","56.456.4","7.67.6","00","","126.3126.3","","$846$846" "PREVPREV. CROP/HERB:","CROP/HERB","C/ S","Corn / Surpass, RoundupR","d","","","","","","","","","","S2MNCE01S2MNCE01"
"StineStine","","19RA02 §19RA02 §","RR2YRR2Y","1 91.9","RR","CMBCMB","55.355.3","7 67.6","00","","120 0120.0","","$830$830" "SOIL DESCRIPTION:","","C","Canisteo clay loam, mod. well drained, non-irrigated","","","","","","","","","","",""
"WensmanWensman","","W 3190NR2W 3190NR2","RR2YRR2Y","1 91.9","RR","AcAc","54 554.5","7 67.6","00","","119 5119.5","","$818$818" "SOIL CONDITIONS:","","","High P, high K, 6.7 pH, 3.9% OM, Low SCN","","","","","","","","","","","30"" ROW SPACING"
"H ftHefty","","H17Y12H17Y12","RR2YRR2Y","1 71.7","MRMR","II","53 753.7","7 77.7","00","","124 4124.4","","$806$806" "TILLAGE/CULTIVATION:TILLAGE/CULTIVATION:","","","conventional w/ fall tillconventional w/ fall till","","","","","","","","","","",""
"Dyna-Gro","","S15RY53","RR2Y","1.5","R","Ac","53.6","7.7","0","","126.8","","$804" "PEST MANAGEMENT:PEST MANAGEMENT:","","Roundup twiceRoundup twice","","","","","","","","","","","",""
"LG SeedsLG Seeds","","C2050R2C2050R2","RR2YRR2Y","2.12.1","RR","AcAc","53.653.6","7.77.7","00","","123.9123.9","","$804$804" "SEEDED - RATE:","","May 15M15","140,000 /A140 000 /A","","","","","","","","TOP 30 foTOP 30 for YIELD of 63 TESTED","","YIELD of 63 TESTED",""
"Titan ProTitan Pro","","19M4219M42","RR2YRR2Y","1.91.9","RR","CMBCMB","53.653.6","7.77.7","00","","121.0121.0","","$804$804" "HARVESTEDHARVESTED - STAND:","STAND","O t 3Oct 3","122 921 /A122,921 /A","","","","","","","","","AVERAGE of (3) REPLICATIONSAVERAGE of (3) REPLICATIONS","",""
"StineStine","","19RA02 (2) §19RA02 (2) §","RR2YRR2Y","1 91.9","RR","CMBCMB","53 453.4","7 77.7","00","","123 9123.9","","$801$801" "","","","","","","SCN","Seed","Yield","Moisture","Lodgingg","g","Stand","","Gross"
"AsgrowAsgrow","","AG1832 §AG1832 §","RR2YRR2Y","1 81.8","MRMR","Ac PVAc,PV","52 952.9","7 77.7","00","","122 0122.0","","$794$794" "","Company/Brandpy","Product/Brand†","","Technol.†","Mat.","Resist.","Trmt.†","Bu/A","%","%","","(x 1000)(",")","Income"
"Prairie Brandiid","","PB-1566R2662","RR2Y2","1.5","R","CMB","52.8","7.7","0","","122.9","","$792$" "","KrugerKruger","K2-1901K2 1901","","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","56.456.4","7.67.6","00","","126.3126.3","","$846$846"
"Channel","","1901R2","RR2Y","1.9","R","Ac,PV,","52.8","7.6","0","","123.4","","$791$" "","StineStine","19RA02 §19RA02 §","","RR2YRR2Y","1 91.9","RR","CMBCMB","55.355.3","7 67.6","00","","120 0120.0","","$830$830"
"Titan ProTitan Pro","","20M120M1","RR2YRR2Y","2.02.0","RR","AmAm","52.552.5","7.57.5","00","","124.4124.4","","$788$788" "","WensmanWensman","W 3190NR2W 3190NR2","","RR2YRR2Y","1 91.9","RR","AcAc","54 554.5","7 67.6","00","","119 5119.5","","$818$818"
"KrugerKruger","","K2-2002K2-2002","RR2YRR2Y","2 02.0","RR","Ac PVAc,PV","52 452.4","7 97.9","00","","125 4125.4","","$786$786" "","H ftHefty","H17Y12H17Y12","","RR2YRR2Y","1 71.7","MRMR","II","53 753.7","7 77.7","00","","124 4124.4","","$806$806"
"ChannelChannel","","1700R21700R2","RR2YRR2Y","1 71.7","RR","Ac PVAc,PV","52 352.3","7 97.9","00","","123 9123.9","","$784$784" "","Dyna-Gro","S15RY53","","RR2Y","1.5","R","Ac","53.6","7.7","0","","126.8","","$804"
"H ftHefty","","H16Y11H16Y11","RR2YRR2Y","1 61.6","MRMR","II","51 451.4","7 67.6","00","","123 9123.9","","$771$771" "","LG SeedsLG Seeds","C2050R2C2050R2","","RR2YRR2Y","2.12.1","RR","AcAc","53.653.6","7.77.7","00","","123.9123.9","","$804$804"
"Anderson","","162R2Y","RR2Y","1.6","R","None","51.3","7.5","0","","119.5","","$770" "","Titan ProTitan Pro","19M4219M42","","RR2YRR2Y","1.91.9","RR","CMBCMB","53.653.6","7.77.7","00","","121.0121.0","","$804$804"
"Titan ProTitan Pro","","15M2215M22","RR2YRR2Y","1.51.5","RR","CMBCMB","51.351.3","7.87.8","00","","125.4125.4","","$769$769" "","StineStine","19RA02 (2) §19RA02 (2) §","","RR2YRR2Y","1 91.9","RR","CMBCMB","53 453.4","7 77.7","00","","123 9123.9","","$801$801"
"DairylandDairyland","","DSR-1710R2YDSR-1710R2Y","RR2YRR2Y","1 71.7","RR","CMBCMB","51 351.3","7 77.7","00","","122 0122.0","","$769$769" "","AsgrowAsgrow","AG1832 §AG1832 §","","RR2YRR2Y","1 81.8","MRMR","Ac PVAc,PV","52 952.9","7 77.7","00","","122 0122.0","","$794$794"
"HeftyHefty","","H20R3H20R3","RR2YRR2Y","2 02.0","MRMR","II","50 550.5","8 28.2","00","","121 0121.0","","$757$757" "","Prairie Brandiid","PB-1566R2662","","RR2Y2","1.5","R","CMB","52.8","7.7","0","","122.9","","$792$"
"PPrairie BrandiiBd","","PB 1743R2PB-1743R2","RR2YRR2Y","1 71.7","RR","CMBCMB","50 250.2","7 77.7","00","","125 8125.8","","$752$752" "","Channel","1901R2","","RR2Y","1.9","R","Ac,PV,","52.8","7.6","0","","123.4","","$791$"
"Gold Country","","1741","RR2Y","1.7","R","Ac","50.1","7.8","0","","123.9","","$751" "","Titan ProTitan Pro","20M120M1","","RR2YRR2Y","2.02.0","RR","AmAm","52.552.5","7.57.5","00","","124.4124.4","","$788$788"
"Trelaye ay","","20RR4303","RR2Y","2.00","R","Ac,Exc,","49.99 9","7.66","00","","127.88","","$749$9" "","KrugerKruger","K2-2002K2-2002","","RR2YRR2Y","2 02.0","RR","Ac PVAc,PV","52 452.4","7 97.9","00","","125 4125.4","","$786$786"
"HeftyHefty","","H14R3H14R3","RR2YRR2Y","1.41.4","MRMR","II","49.749.7","7.77.7","00","","122.9122.9","","$746$746" "","ChannelChannel","1700R21700R2","","RR2YRR2Y","1 71.7","RR","Ac PVAc,PV","52 352.3","7 97.9","00","","123 9123.9","","$784$784"
"Prairie BrandPrairie Brand","","PB-2099NRR2PB-2099NRR2","RR2YRR2Y","2 02.0","RR","CMBCMB","49 649.6","7 87.8","00","","126 3126.3","","$743$743" "","H ftHefty","H16Y11H16Y11","","RR2YRR2Y","1 61.6","MRMR","II","51 451.4","7 67.6","00","","123 9123.9","","$771$771"
"WensmanWensman","","W 3174NR2W 3174NR2","RR2YRR2Y","1 71.7","RR","AcAc","49 349.3","7 67.6","00","","122 5122.5","","$740$740" "","Anderson","162R2Y","","RR2Y","1.6","R","None","51.3","7.5","0","","119.5","","$770"
"KKruger","","K2 1602K2-1602","RR2YRR2Y","1 61.6","R","Ac,PV","48.78","7.66","00","","125.412","","$731$31" "","Titan ProTitan Pro","15M2215M22","","RR2YRR2Y","1.51.5","RR","CMBCMB","51.351.3","7.87.8","00","","125.4125.4","","$769$769"
"NK Brand","","S18-C2 §§","RR2Y","1.8","R","CMB","48.7","7.7","0","","126.8","","$731$" "","DairylandDairyland","DSR-1710R2YDSR-1710R2Y","","RR2YRR2Y","1 71.7","RR","CMBCMB","51 351.3","7 77.7","00","","122 0122.0","","$769$769"
"KrugerKruger","","K2-1902K2 1902","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","48.748.7","7.57.5","00","","124.4124.4","","$730$730" "","HeftyHefty","H20R3H20R3","","RR2YRR2Y","2 02.0","MRMR","II","50 550.5","8 28.2","00","","121 0121.0","","$757$757"
"Prairie BrandPrairie Brand","","PB-1823R2PB-1823R2","RR2YRR2Y","1 81.8","RR","NoneNone","48 548.5","7 67.6","00","","121 0121.0","","$727$727" "","PPrairie BrandiiBd","PB 1743R2PB-1743R2","","RR2YRR2Y","1 71.7","RR","CMBCMB","50 250.2","7 77.7","00","","125 8125.8","","$752$752"
"Gold CountryGold Country","","15411541","RR2YRR2Y","1 51.5","RR","AcAc","48 448.4","7 67.6","00","","110 4110.4","","$726$726" "","Gold Country","1741","","RR2Y","1.7","R","Ac","50.1","7.8","0","","123.9","","$751"
"","","","","","","Test Average =","47 647.6","7 77.7","00","","122 9122.9","","$713$713" "","Trelaye ay","20RR4303","","RR2Y","2.00","R","Ac,Exc,","49.99 9","7.66","00","","127.88","","$749$9"
"","","","","","","LSD (0.10) =","5.7","0.3","ns","","37.8","","566.4" "","HeftyHefty","H14R3H14R3","","RR2YRR2Y","1.41.4","MRMR","II","49.749.7","7.77.7","00","","122.9122.9","","$746$746"
"","F.I.R.S.T. Managerg","","","","","C.V. =","8.8","2.9","","","56.4","","846.2" "","Prairie BrandPrairie Brand","PB-2099NRR2PB-2099NRR2","","RR2YRR2Y","2 02.0","RR","CMBCMB","49 649.6","7 87.8","00","","126 3126.3","","$743$743"
"","WensmanWensman","W 3174NR2W 3174NR2","","RR2YRR2Y","1 71.7","RR","AcAc","49 349.3","7 67.6","00","","122 5122.5","","$740$740"
"","KKruger","K2 1602K2-1602","","RR2YRR2Y","1 61.6","R","Ac,PV","48.78","7.66","00","","125.412","","$731$31"
"","NK Brand","S18-C2 §§","","RR2Y","1.8","R","CMB","48.7","7.7","0","","126.8","","$731$"
"","KrugerKruger","K2-1902K2 1902","","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","48.748.7","7.57.5","00","","124.4124.4","","$730$730"
"","Prairie BrandPrairie Brand","PB-1823R2PB-1823R2","","RR2YRR2Y","1 81.8","RR","NoneNone","48 548.5","7 67.6","00","","121 0121.0","","$727$727"
"","Gold CountryGold Country","15411541","","RR2YRR2Y","1 51.5","RR","AcAc","48 448.4","7 67.6","00","","110 4110.4","","$726$726"
"","","","","","","","Test Average =","47 647.6","7 77.7","00","","122 9122.9","","$713$713"
"","","","","","","","LSD (0.10) =","5.7","0.3","ns","","37.8","","566.4"


View File

@ -0,0 +1,39 @@
"TILLAGE/CULTIVATION:TILLAGE/CULTIVATION:","","conventional w/ fall tillconventional w/ fall till","","","","","","","","","","",""
"PEST MANAGEMENT:PEST MANAGEMENT:","","Roundup twiceRoundup twice","","","","","","","","","","",""
"SEEDED - RATE:","","May 15M15","140,000 /A140 000 /A","","","","","","","TOP 30 foTOP 30 for YIELD of 63 TESTED","","YIELD of 63 TESTED",""
"HARVESTEDHARVESTED - STAND:STAND","","O t 3Oct 3","122 921 /A122,921 /A","","","","","","","","AVERAGE of (3) REPLICATIONSAVERAGE of (3) REPLICATIONS","",""
"","","","","","SCN","Seed","Yield","Moisture","Lodgingg","g","Stand","","Gross"
"Company/Brandpy","","Product/Brand†","Technol.†","Mat.","Resist.","Trmt.†","Bu/A","%","%","","(x 1000)(",")","Income"
"KrugerKruger","","K2-1901K2 1901","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","56.456.4","7.67.6","00","","126.3126.3","","$846$846"
"StineStine","","19RA02 §19RA02 §","RR2YRR2Y","1 91.9","RR","CMBCMB","55.355.3","7 67.6","00","","120 0120.0","","$830$830"
"WensmanWensman","","W 3190NR2W 3190NR2","RR2YRR2Y","1 91.9","RR","AcAc","54 554.5","7 67.6","00","","119 5119.5","","$818$818"
"H ftHefty","","H17Y12H17Y12","RR2YRR2Y","1 71.7","MRMR","II","53 753.7","7 77.7","00","","124 4124.4","","$806$806"
"Dyna-Gro","","S15RY53","RR2Y","1.5","R","Ac","53.6","7.7","0","","126.8","","$804"
"LG SeedsLG Seeds","","C2050R2C2050R2","RR2YRR2Y","2.12.1","RR","AcAc","53.653.6","7.77.7","00","","123.9123.9","","$804$804"
"Titan ProTitan Pro","","19M4219M42","RR2YRR2Y","1.91.9","RR","CMBCMB","53.653.6","7.77.7","00","","121.0121.0","","$804$804"
"StineStine","","19RA02 (2) §19RA02 (2) §","RR2YRR2Y","1 91.9","RR","CMBCMB","53 453.4","7 77.7","00","","123 9123.9","","$801$801"
"AsgrowAsgrow","","AG1832 §AG1832 §","RR2YRR2Y","1 81.8","MRMR","Ac PVAc,PV","52 952.9","7 77.7","00","","122 0122.0","","$794$794"
"Prairie Brandiid","","PB-1566R2662","RR2Y2","1.5","R","CMB","52.8","7.7","0","","122.9","","$792$"
"Channel","","1901R2","RR2Y","1.9","R","Ac,PV,","52.8","7.6","0","","123.4","","$791$"
"Titan ProTitan Pro","","20M120M1","RR2YRR2Y","2.02.0","RR","AmAm","52.552.5","7.57.5","00","","124.4124.4","","$788$788"
"KrugerKruger","","K2-2002K2-2002","RR2YRR2Y","2 02.0","RR","Ac PVAc,PV","52 452.4","7 97.9","00","","125 4125.4","","$786$786"
"ChannelChannel","","1700R21700R2","RR2YRR2Y","1 71.7","RR","Ac PVAc,PV","52 352.3","7 97.9","00","","123 9123.9","","$784$784"
"H ftHefty","","H16Y11H16Y11","RR2YRR2Y","1 61.6","MRMR","II","51 451.4","7 67.6","00","","123 9123.9","","$771$771"
"Anderson","","162R2Y","RR2Y","1.6","R","None","51.3","7.5","0","","119.5","","$770"
"Titan ProTitan Pro","","15M2215M22","RR2YRR2Y","1.51.5","RR","CMBCMB","51.351.3","7.87.8","00","","125.4125.4","","$769$769"
"DairylandDairyland","","DSR-1710R2YDSR-1710R2Y","RR2YRR2Y","1 71.7","RR","CMBCMB","51 351.3","7 77.7","00","","122 0122.0","","$769$769"
"HeftyHefty","","H20R3H20R3","RR2YRR2Y","2 02.0","MRMR","II","50 550.5","8 28.2","00","","121 0121.0","","$757$757"
"PPrairie BrandiiBd","","PB 1743R2PB-1743R2","RR2YRR2Y","1 71.7","RR","CMBCMB","50 250.2","7 77.7","00","","125 8125.8","","$752$752"
"Gold Country","","1741","RR2Y","1.7","R","Ac","50.1","7.8","0","","123.9","","$751"
"Trelaye ay","","20RR4303","RR2Y","2.00","R","Ac,Exc,","49.99 9","7.66","00","","127.88","","$749$9"
"HeftyHefty","","H14R3H14R3","RR2YRR2Y","1.41.4","MRMR","II","49.749.7","7.77.7","00","","122.9122.9","","$746$746"
"Prairie BrandPrairie Brand","","PB-2099NRR2PB-2099NRR2","RR2YRR2Y","2 02.0","RR","CMBCMB","49 649.6","7 87.8","00","","126 3126.3","","$743$743"
"WensmanWensman","","W 3174NR2W 3174NR2","RR2YRR2Y","1 71.7","RR","AcAc","49 349.3","7 67.6","00","","122 5122.5","","$740$740"
"KKruger","","K2 1602K2-1602","RR2YRR2Y","1 61.6","R","Ac,PV","48.78","7.66","00","","125.412","","$731$31"
"NK Brand","","S18-C2 §§","RR2Y","1.8","R","CMB","48.7","7.7","0","","126.8","","$731$"
"KrugerKruger","","K2-1902K2 1902","RR2YRR2Y","1.91.9","RR","Ac,PVAc,PV","48.748.7","7.57.5","00","","124.4124.4","","$730$730"
"Prairie BrandPrairie Brand","","PB-1823R2PB-1823R2","RR2YRR2Y","1 81.8","RR","NoneNone","48 548.5","7 67.6","00","","121 0121.0","","$727$727"
"Gold CountryGold Country","","15411541","RR2YRR2Y","1 51.5","RR","AcAc","48 448.4","7 67.6","00","","110 4110.4","","$726$726"
"","","","","","","Test Average =","47 647.6","7 77.7","00","","122 9122.9","","$713$713"
"","","","","","","LSD (0.10) =","5.7","0.3","ns","","37.8","","566.4"
"","F.I.R.S.T. Managerg","","","","","C.V. =","8.8","2.9","","","56.4","","846.2"

View File

@ -0,0 +1,66 @@
"0","1","2","3","4"
"","DLHS-4 (2012-13)","","DLHS-3 (2007-08)",""
"Indicators","TOTAL","RURAL","TOTAL","RURAL"
"Child feeding practices (based on last-born child in the reference period) (%)","","","",""
"Children age 0-5 months exclusively breastfed9 .......................................................................... 76.9 80.0
Children age 6-9 months receiving solid/semi-solid food and breast milk .................................... 78.6 75.0
Children age 12-23 months receiving breast feeding along with complementary feeding ........... 31.8 24.2
Children age 6-35 months exclusively breastfed for at least 6 months ........................................ 4.7 3.4
Children under 3 years breastfed within one hour of birth ............................................................ 42.9 46.5","","","NA","NA"
"","","","85.9","89.3"
"","","","NA","NA"
"","","","30.0","27.7"
"","","","50.6","52.9"
"Birth Weight (%) (age below 36 months)","","","",""
"Percentage of Children weighed at birth ...................................................................................... 38.8 41.0 NA NA
Percentage of Children with low birth weight (out of those who weighted) ( below 2.5 kg) ......... 12.8 14.6 NA NA","","","",""
"Awareness about Diarrhoea (%)","","","",""
"Women know about what to do when a child gets diarrhoea ..................................................... 96.3 96.2","","","94.4","94.2"
"Awareness about ARI (%)","","","",""
"Women aware about danger signs of ARI10 ................................................................................. 55.9 59.7","","","32.8","34.7"
"Treatment of childhood diseases (based on last two surviving children born during the","","","",""
"","","","",""
"reference period) (%)","","","",""
"","","","",""
"Prevalence of diarrhoea in last 2 weeks for under 5 years old children ....................................... 1.6 1.3 6.5 7.0
Children with diarrhoea in the last 2 weeks and received ORS11 ................................................. 100.0 100.0 54.8 53.3
Children with diarrhoea in the last 2 weeks and sought advice/treatment ................................... 100.0 50.0 72.9 73.3
Prevalence of ARI in last 2 weeks for under 5 years old children ............................................ 4.3 3.9 3.9 4.2
Children with acute respiratory infection or fever in last 2 weeks and sought advice/treatment 37.5 33.3 69.8 68.0
Children with diarrhoea in the last 2 weeks given Zinc along with ORS ...................................... 66.6 50.0 NA NA","","","6.5","7.0"
"","","","54.8","53.3"
"","","","72.9","73.3"
"","","","3.9","4.2"
"","","","69.8","68.0"
"Awareness of RTI/STI and HIV/AIDS (%)","","","",""
"Women who have heard of RTI/STI ............................................................................................. 55.8 57.1
Women who have heard of HIV/AIDS .......................................................................................... 98.9 99.0
Women who have any symptoms of RTI/STI .............................................................................. 13.9 13.5
Women who know the place to go for testing of HIV/AIDS12 ....................................................... 59.9 57.1
Women underwent test for detecting HIV/AIDS12 ........................................................................ 37.3 36.8","","","34.8","38.2"
"","","","98.3","98.1"
"","","","15.6","16.1"
"","","","48.6","46.3"
"","","","14.1","12.3"
"Utilization of Government Health Services (%)","","","",""
"Antenatal care .............................................................................................................................. 69.7 66.7 79.0 81.0
Treatment for pregnancy complications ....................................................................................... 57.1 59.3 88.0 87.8
Treatment for post-delivery complications ................................................................................... 33.3 33.3 68.4 68.4
Treatment for vaginal discharge ................................................................................................... 20.0 25.0 73.9 71.4
Treatment for children with diarrhoea13 ........................................................................................ 50.0 100.0 NA NA
Treatment for children with ARI13 ................................................................................................. NA NA NA NA","","","79.0","81.0"
"","","","88.0","87.8"
"","","","68.4","68.4"
"","","","73.9","71.4"
"Birth Registration (%)","","","",""
"Children below age 5 years having birth registration done .......................................................... 40.6 44.3 NA NA
Children below age 5 years who received birth certificate (out of those registered) .................... 65.9 63.6 NA NA","","","",""
"Personal Habits (age 15 years and above) (%)","","","",""
"Men who use any kind of smokeless tobacco ............................................................................. 74.6 74.2 NA NA
Women who use any kind of smokeless tobacco ........................................................................ 59.5 58.9 NA NA
Men who smoke ........................................................................................................................... 56.0 56.4 NA NA
Women who smoke ...................................................................................................................... 18.4 18.0 NA NA
Men who consume alcohol ........................................................................................................... 58.4 58.2 NA NA
Women who consume alcohol ..................................................................................................... 10.9 9.3 NA NA","","","",""
"9 Children Who were given nothing but breast milk till the survey date 10Acute Respiratory Infections11Oral Rehydration Solutions/Salts.12Based on","","","",""
"the women who have heard of HIV/AIDS.13 Last two weeks","","","",""

View File

@ -0,0 +1,44 @@
"0","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23"
"","Table: 5 Public Health Outlay 2012-13 (Budget Estimates) (Rs. in 000)","","","","","","","","","","","","","","","","","","","","","",""
"","States-A","","","Revenue","","","","","","Capital","","","","","","Total","","","Others(1)","","","Total",""
"","","","","","","","","","","","","","","","","Revenue &","","","","","","",""
"","","","Medical & Family Medical & Family
Public Welfare Public Welfare
Health Health","","","","","","","","","","","","","","","","","","","",""
"","","","","","","","","","","","","","","","","Capital","","","","","","",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"","Andhra Pradesh","","","47,824,589","","","9,967,837","","","1,275,000","","","15,000","","","59,082,426","","","14,898,243","","","73,980,669",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Arunachal Pradesh 2,241,609 107,549 23,000 0 2,372,158 86,336 2,458,494","","","","","","","","","","","","","","","","","","","","","","",""
"","Assam","","","14,874,821","","","2,554,197","","","161,600","","","0","","","17,590,618","","","4,408,505","","","21,999,123",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Bihar 21,016,708 4,332,141 5,329,000 0 30,677,849 2,251,571 32,929,420","","","","","","","","","","","","","","","","","","","","","","",""
"","Chhattisgarh","","","11,427,311","","","1,415,660","","","2,366,592","","","0","","","15,209,563","","","311,163","","","15,520,726",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Delhi 28,084,780 411,700 4,550,000 0 33,046,480 5,000 33,051,480","","","","","","","","","","","","","","","","","","","","","","",""
"","Goa","","","4,055,567","","","110,000","","","330,053","","","0","","","4,495,620","","","12,560","","","4,508,180",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Gujarat 26,328,400 6,922,900 12,664,000 42,000 45,957,300 455,860 46,413,160","","","","","","","","","","","","","","","","","","","","","","",""
"","Haryana","","","15,156,681","","","1,333,527","","","40,100","","","0","","","16,530,308","","","1,222,698","","","17,753,006",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Himachal Pradesh 8,647,229 1,331,529 580,800 0 10,559,558 725,315 11,284,873","","","","","","","","","","","","","","","","","","","","","","",""
"","Jammu & Kashmir","","","14,411,984","","","270,840","","","3,188,550","","","0","","","17,871,374","","","166,229","","","18,037,603",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Jharkhand 8,185,079 3,008,077 3,525,558 0 14,718,714 745,139 15,463,853","","","","","","","","","","","","","","","","","","","","","","",""
"","Karnataka","","","34,939,843","","","4,317,801","","","3,669,700","","","0","","","42,927,344","","","631,088","","","43,558,432",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Kerala 27,923,965 3,985,473 929,503 0 32,838,941 334,640 33,173,581","","","","","","","","","","","","","","","","","","","","","","",""
"","Madhya Pradesh","","","28,459,540","","","4,072,016","","","3,432,711","","","0","","","35,964,267","","","472,139","","","36,436,406",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Maharashtra 55,011,100 6,680,721 5,038,576 0 66,730,397 313,762 67,044,159","","","","","","","","","","","","","","","","","","","","","","",""
"","Manipur","","","2,494,600","","","187,700","","","897,400","","","0","","","3,579,700","","","0","","","3,579,700",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Meghalaya 2,894,093 342,893 705,500 5,000 3,947,486 24,128 3,971,614","","","","","","","","","","","","","","","","","","","","","","",""
"","Mizoram","","","1,743,501","","","84,185","","","10,250","","","0","","","1,837,936","","","17,060","","","1,854,996",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Nagaland 2,368,724 204,329 226,400 0 2,799,453 783,054 3,582,507","","","","","","","","","","","","","","","","","","","","","","",""
"","Odisha","","","14,317,179","","","2,552,292","","","1,107,250","","","0","","","17,976,721","","","451,438","","","18,428,159",""
"","","","","","","","","","","","","","","","","","","","","","","",""
"Puducherry 4,191,757 52,249 192,400 0 4,436,406 2,173 4,438,579","","","","","","","","","","","","","","","","","","","","","","",""
"","Punjab","","","19,775,485","","","2,208,343","","","2,470,882","","","0","","","24,454,710","","","1,436,522","","","25,891,232",""
"","","","","","","","","","","","","","","","","","","","","","","",""

View File

@ -0,0 +1,71 @@
"0","1","2","3","4"
"","DLHS-4 (2012-13)","","DLHS-3 (2007-08)",""
"Indicators","TOTAL","RURAL","TOTAL","RURAL"
"Reported Prevalence of Morbidity","","","",""
"Any Injury ..................................................................................................................................... 1.9 2.1
Acute Illness ................................................................................................................................. 4.5 5.6
Chronic Illness .............................................................................................................................. 5.1 4.1","","","",""
"","","","",""
"","","","",""
"Reported Prevalence of Chronic Illness during last one year (%)","","","",""
"Disease of respiratory system ...................................................................................................... 11.7 15.0
Disease of cardiovascular system ................................................................................................ 8.9 9.3
Persons suffering from tuberculosis ............................................................................................. 2.2 1.5","","","",""
"","","","",""
"","","","",""
"Anaemia Status by Haemoglobin Level14 (%)","","","",""
"Children (6-59 months) having anaemia ...................................................................................... 68.5 71.9
Children (6-59 months) having severe anaemia .......................................................................... 6.7 9.4
Children (6-9 Years) having anaemia - Male ................................................................................ 67.1 71.4
Children (6-9 Years) having severe anaemia - Male .................................................................... 4.4 2.4
Children (6-9 Years) having anaemia - Female ........................................................................... 52.4 48.8
Children (6-9 Years) having severe anaemia - Female ................................................................ 1.2 0.0
Children (6-14 years) having anaemia - Male ............................................................................. 50.8 62.5
Children (6-14 years) having severe anaemia - Male .................................................................. 3.7 3.6
Children (6-14 years) having anaemia - Female ......................................................................... 48.3 50.0
Children (6-14 years) having severe anaemia - Female .............................................................. 4.3 6.1
Children (10-19 Years15) having anaemia - Male ......................................................................... 37.9 51.2
Children (10-19 Years15) having severe anaemia - Male ............................................................. 3.5 4.0
Children (10-19 Years15) having anaemia - Female ..................................................................... 46.6 52.1
Children (10-19 Years15) having severe anaemia - Female ......................................................... 6.4 6.5
Adolescents (15-19 years) having anaemia ................................................................................ 39.4 46.5
Adolescents (15-19 years) having severe anaemia ..................................................................... 5.4 5.1
Pregnant women (15-49 aged) having anaemia .......................................................................... 48.8 51.5
Pregnant women (15-49 aged) having severe anaemia .............................................................. 7.1 8.8
Women (15-49 aged) having anaemia ......................................................................................... 45.2 51.7
Women (15-49 aged) having severe anaemia ............................................................................. 4.8 5.9
Persons (20 years and above) having anaemia ........................................................................... 37.8 42.1
Persons (20 years and above) having Severe anaemia .............................................................. 4.6 4.8","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"","","","",""
"Blood Sugar Level (age 18 years and above) (%)","","","",""
"Blood Sugar Level >140 mg/dl (high) ........................................................................................... 12.9 11.1
Blood Sugar Level >160 mg/dl (very high) ................................................................................... 7.0 5.1","","","",""
"","","","",""
"Hypertension (age 18 years and above) (%)","","","",""
"Above Normal Range (Systolic >140 mm of Hg & Diastolic >90 mm of Hg ) .............................. 23.8 22.8
Moderately High (Systolic >160 mm of Hg & Diastolic >100 mm of Hg ) ..................................... 8.2 7.1
Very High (Systolic >180 mm of Hg & Diastolic >110 mm of Hg ) ............................................... 3.7 3.1","","","",""
"","","","",""
"","","","",""
"14 Any anaemia below 11g/dl, severe anaemia below 7g/dl. 15 Excluding age group 19 years","","","",""
"Chronic Illness :Any person with symptoms persisting for longer than one month is defined as suffering from chronic illness","","","",""

View File

@ -63,7 +63,7 @@ master_doc = 'index'
# General information about the project. # General information about the project.
project = u'Camelot' project = u'Camelot'
copyright = u'2018, Peeply Private Ltd (Singapore)' copyright = u'2018, <a href="https://socialcops.com" target="_blank">SocialCops</a>'
author = u'Vinayak Mehta' author = u'Vinayak Mehta'
# The version info for the project you're documenting, acts as replacement for # The version info for the project you're documenting, acts as replacement for

View File

@ -7,7 +7,7 @@ If you're reading this, you're probably looking to contributing to Camelot. *Tim
This document will help you get started with contributing documentation, code, testing and filing issues. If you have any questions, feel free to reach out to `Vinayak Mehta`_, the author and maintainer. This document will help you get started with contributing documentation, code, testing and filing issues. If you have any questions, feel free to reach out to `Vinayak Mehta`_, the author and maintainer.
.. _Vinayak Mehta: https://vinayak-mehta.github.io .. _Vinayak Mehta: https://www.vinayakmehta.com
Code Of Conduct Code Of Conduct
--------------- ---------------

View File

@ -11,6 +11,10 @@ Release v\ |version|. (:ref:`Installation <install>`)
.. image:: https://travis-ci.org/socialcopsdev/camelot.svg?branch=master .. image:: https://travis-ci.org/socialcopsdev/camelot.svg?branch=master
:target: https://travis-ci.org/socialcopsdev/camelot :target: https://travis-ci.org/socialcopsdev/camelot
.. image:: https://readthedocs.org/projects/camelot-py/badge/?version=master
:target: https://camelot-py.readthedocs.io/en/master/
:alt: Documentation Status
.. image:: https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github .. image:: https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github
:target: https://codecov.io/github/socialcopsdev/camelot?branch=master :target: https://codecov.io/github/socialcopsdev/camelot?branch=master
@ -23,8 +27,15 @@ Release v\ |version|. (:ref:`Installation <install>`)
.. image:: https://img.shields.io/pypi/pyversions/camelot-py.svg .. image:: https://img.shields.io/pypi/pyversions/camelot-py.svg
:target: https://pypi.org/project/camelot-py/ :target: https://pypi.org/project/camelot-py/
.. image:: https://badges.gitter.im/camelot-dev/Lobby.png
:target: https://gitter.im/camelot-dev/Lobby
**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files! **Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
.. note:: You can also check out `Excalibur`_, which is a web interface for Camelot!
.. _Excalibur: https://github.com/camelot-dev/excalibur
---- ----
**Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_. **Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_.
@ -81,6 +92,7 @@ This part of the documentation begins with some background information about why
:maxdepth: 2 :maxdepth: 2
user/intro user/intro
user/install-deps
user/install user/install
user/how-it-works user/how-it-works
user/quickstart user/quickstart

View File

@ -24,25 +24,34 @@ To process background lines, you can pass ``process_background=True``.
>>> tables = camelot.read_pdf('background_lines.pdf', process_background=True) >>> tables = camelot.read_pdf('background_lines.pdf', process_background=True)
>>> tables[1].df >>> tables[1].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -back background_lines.pdf
.. csv-table:: .. csv-table::
:file: ../_static/csv/background_lines.csv :file: ../_static/csv/background_lines.csv
Plot geometry Visual debugging
------------- ----------------
You can use a :class:`table <camelot.core.Table>` object's :meth:`plot() <camelot.core.TableList.plot>` method to plot various geometries that were detected by Camelot while processing the PDF page. This can help you select table areas, column separators and debug bad table outputs, by tweaking different configuration parameters. .. note:: Visual debugging using ``plot()`` requires `matplotlib <https://matplotlib.org/>`_ which is an optional dependency. You can install it using ``$ pip install camelot-py[plot]``.
The following geometries are available for plotting. You can pass them to the :meth:`plot() <camelot.core.TableList.plot>` method, which will then generate a `matplotlib <https://matplotlib.org/>`_ plot for the passed geometry. You can use the :class:`plot() <camelot.plotting.PlotMethods>` method to generate a `matplotlib <https://matplotlib.org/>`_ plot of various elements that were detected on the PDF page while processing it. This can help you select table areas, column separators and debug bad table outputs, by tweaking different configuration parameters.
You can specify the type of element you want to plot using the ``kind`` keyword argument. The generated plot can be saved to a file by passing a ``filename`` keyword argument. The following plot types are supported:
- 'text' - 'text'
- 'table' - 'grid'
- 'contour' - 'contour'
- 'line' - 'line'
- 'joint' - 'joint'
- 'textedge'
.. note:: The last three geometries can only be used with :ref:`Lattice <lattice>`, i.e. when ``flavor='lattice'``. .. note:: 'line' and 'joint' can only be used with :ref:`Lattice <lattice>` and 'textedge' can only be used with :ref:`Stream <stream>`.
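For example, once you have a ``tables`` list (as in the snippet just below), you can pick one of the supported types and save the plot straight to disk using the ``kind`` and ``filename`` keyword arguments described above (the output path here is only a placeholder)::
    >>> camelot.plot(tables[0], kind='contour', filename='contour.png')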
Let's generate a plot for each geometry using this `PDF <../_static/pdf/foo.pdf>`__ as an example. First, let's get all the tables out. Let's generate a plot for each type using this `PDF <../_static/pdf/foo.pdf>`__ as an example. First, let's get all the tables out.
:: ::
@ -50,8 +59,6 @@ Let's generate a plot for each geometry using this `PDF <../_static/pdf/foo.pdf>
>>> tables >>> tables
<TableList n=1> <TableList n=1>
.. _geometry_text:
text text
^^^^ ^^^^
@ -59,9 +66,16 @@ Let's plot all the text present on the table's PDF page.
:: ::
>>> tables[0].plot('text') >>> camelot.plot(tables[0], kind='text')
>>> plt.show()
.. figure:: ../_static/png/geometry_text.png .. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -plot text foo.pdf
.. figure:: ../_static/png/plot_text.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
@ -72,18 +86,23 @@ This, as we shall later see, is very helpful with :ref:`Stream <stream>` for not
.. note:: The *x-y* coordinates shown above change as you move your mouse cursor on the image, which can help you note coordinates. .. note:: The *x-y* coordinates shown above change as you move your mouse cursor on the image, which can help you note coordinates.
.. _geometry_table:
table table
^^^^^ ^^^^^
Let's plot the table (to see if it was detected correctly or not). This geometry type, along with contour, line and joint is useful for debugging and improving the extraction output, in case the table wasn't detected correctly. (More on that later.) Let's plot the table (to see if it was detected correctly or not). This plot type, along with contour, line and joint is useful for debugging and improving the extraction output, in case the table wasn't detected correctly. (More on that later.)
:: ::
>>> tables[0].plot('table') >>> camelot.plot(tables[0], kind='grid')
>>> plt.show()
.. figure:: ../_static/png/geometry_table.png .. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -plot grid foo.pdf
.. figure:: ../_static/png/plot_table.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
@ -92,8 +111,6 @@ Let's plot the table (to see if it was detected correctly or not). This geometry
The table is perfect! The table is perfect!
.. _geometry_contour:
contour contour
^^^^^^^ ^^^^^^^
@ -101,17 +118,22 @@ Now, let's plot all table boundaries present on the table's PDF page.
:: ::
>>> tables[0].plot('contour') >>> camelot.plot(tables[0], kind='contour')
>>> plt.show()
.. figure:: ../_static/png/geometry_contour.png .. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -plot contour foo.pdf
.. figure:: ../_static/png/plot_contour.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
:alt: A plot of all contours on a PDF page :alt: A plot of all contours on a PDF page
:align: left :align: left
.. _geometry_line:
line line
^^^^ ^^^^
@ -119,17 +141,22 @@ Cool, let's plot all line segments present on the table's PDF page.
:: ::
>>> tables[0].plot('line') >>> camelot.plot(tables[0], kind='line')
>>> plt.show()
.. figure:: ../_static/png/geometry_line.png .. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -plot line foo.pdf
.. figure:: ../_static/png/plot_line.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
:alt: A plot of all lines on a PDF page :alt: A plot of all lines on a PDF page
:align: left :align: left
.. _geometry_joint:
joint joint
^^^^^ ^^^^^
@ -137,19 +164,49 @@ Finally, let's plot all line intersections present on the table's PDF page.
:: ::
>>> tables[0].plot('joint') >>> camelot.plot(tables[0], kind='joint')
>>> plt.show()
.. figure:: ../_static/png/geometry_joint.png .. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -plot joint foo.pdf
.. figure:: ../_static/png/plot_joint.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
:alt: A plot of all line intersections on a PDF page :alt: A plot of all line intersections on a PDF page
:align: left :align: left
textedge
^^^^^^^^
You can also visualize the textedges found on a page by specifying ``kind='textedge'``. To know more about what a "textedge" is, you can see pages 20, 35 and 40 of `Anssi Nurminen's master's thesis <http://dspace.cc.tut.fi/dpub/bitstream/handle/123456789/21520/Nurminen.pdf?sequence=3>`_.
::
>>> camelot.plot(tables[0], kind='textedge')
>>> plt.show()
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot stream -plot textedge foo.pdf
.. figure:: ../_static/png/plot_textedge.png
:height: 674
:width: 1366
:scale: 50%
:alt: A plot of relevant textedges on a PDF page
:align: left
Specify table areas Specify table areas
------------------- -------------------
Since :ref:`Stream <stream>` treats the whole page as a table, `for now`_, it's useful to specify table boundaries in cases such as `these <../_static/pdf/table_areas.pdf>`__. You can :ref:`plot the text <geometry_text>` on this page and note the left-top and right-bottom coordinates of the table. In cases such as `these <../_static/pdf/table_areas.pdf>`__, it can be useful to specify table boundaries. You can plot the text on this page and note the top left and bottom right coordinates of the table.
Table areas that you want Camelot to analyze can be passed as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``table_areas`` keyword argument. Table areas that you want Camelot to analyze can be passed as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``table_areas`` keyword argument.
@ -160,27 +217,39 @@ Table areas that you want Camelot to analyze can be passed as a list of comma-se
>>> tables = camelot.read_pdf('table_areas.pdf', flavor='stream', table_areas=['316,499,566,337']) >>> tables = camelot.read_pdf('table_areas.pdf', flavor='stream', table_areas=['316,499,566,337'])
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot stream -T 316,499,566,337 table_areas.pdf
.. csv-table:: .. csv-table::
:file: ../_static/csv/table_areas.csv :file: ../_static/csv/table_areas.csv
Specify column separators Specify column separators
------------------------- -------------------------
In cases like `these <../_static/pdf/column_separators.pdf>`__, where the text is very close to each other, it is possible that Camelot may guess the column separators' coordinates incorrectly. To correct this, you can explicitly specify the *x* coordinate for each column separator by :ref:`plotting the text <geometry_text>` on the page. In cases like `these <../_static/pdf/column_separators.pdf>`__, where the text is very close to each other, it is possible that Camelot may guess the column separators' coordinates incorrectly. To correct this, you can explicitly specify the *x* coordinate for each column separator by plotting the text on the page.
You can pass the column separators as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``columns`` keyword argument. You can pass the column separators as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``columns`` keyword argument.
If you pass a single column separators string in the list and no table areas are specified, the separators are applied to the whole page. When a list of table areas is specified and you also need to specify column separators, **the length of both lists should be equal**. Each table area is mapped to its column separators string by index. If you pass a single column separators string in the list and no table areas are specified, the separators are applied to the whole page. When a list of table areas is specified and you also need to specify column separators, **the length of both lists should be equal**. Each table area is mapped to its column separators string by index.
For example, if you have specified two table areas, ``table_areas=['12,23,43,54', '20,33,55,67']``, and only want to specify column separators for the first table, you can pass an empty string for the second table in the column separators' list like this, ``columns=['10,120,200,400', '']``. For example, if you have specified two table areas, ``table_areas=['12,54,43,23', '20,67,55,33']``, and only want to specify column separators for the first table, you can pass an empty string for the second table in the column separators' list like this, ``columns=['10,120,200,400', '']``.
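As a concrete sketch of that index mapping, reusing the hypothetical areas and separators from the paragraph above (the file name is only a placeholder)::
    >>> tables = camelot.read_pdf('foo.pdf', flavor='stream',
    ...                           table_areas=['12,54,43,23', '20,67,55,33'],
    ...                           columns=['10,120,200,400', ''])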
Let's get back to the *x* coordinates we got from :ref:`plotting text <geometry_text>` that exists on this `PDF <../_static/pdf/column_separators.pdf>`__, and get the table out! Let's get back to the *x* coordinates we got from plotting the text that exists on this `PDF <../_static/pdf/column_separators.pdf>`__, and get the table out!
:: ::
>>> tables = camelot.read_pdf('column_separators.pdf', flavor='stream', columns=['72,95,209,327,442,529,566,606,683']) >>> tables = camelot.read_pdf('column_separators.pdf', flavor='stream', columns=['72,95,209,327,442,529,566,606,683'])
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot stream -C 72,95,209,327,442,529,566,606,683 column_separators.pdf
.. csv-table:: .. csv-table::
"...","...","...","...","...","...","...","...","...","..." "...","...","...","...","...","...","...","...","...","..."
@ -200,6 +269,12 @@ To deal with cases like the output from the previous section, you can pass ``spl
>>> tables = camelot.read_pdf('column_separators.pdf', flavor='stream', columns=['72,95,209,327,442,529,566,606,683'], split_text=True) >>> tables = camelot.read_pdf('column_separators.pdf', flavor='stream', columns=['72,95,209,327,442,529,566,606,683'], split_text=True)
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot -split stream -C 72,95,209,327,442,529,566,606,683 column_separators.pdf
.. csv-table:: .. csv-table::
"...","...","...","...","...","...","...","...","...","..." "...","...","...","...","...","...","...","...","...","..."
@ -227,6 +302,12 @@ You can solve this by passing ``flag_size=True``, which will enclose the supersc
>>> tables = camelot.read_pdf('superscript.pdf', flavor='stream', flag_size=True) >>> tables = camelot.read_pdf('superscript.pdf', flavor='stream', flag_size=True)
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot -flag stream superscript.pdf
.. csv-table:: .. csv-table::
"...","...","...","...","...","...","...","...","...","...","..." "...","...","...","...","...","...","...","...","...","...","..."
@ -259,6 +340,12 @@ You can pass ``row_close_tol=<+int>`` to group the rows closer together, as show
>>> tables = camelot.read_pdf('group_rows.pdf', flavor='stream', row_close_tol=10) >>> tables = camelot.read_pdf('group_rows.pdf', flavor='stream', row_close_tol=10)
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot stream -r 10 group_rows.pdf
.. csv-table:: .. csv-table::
"Clave","Nombre Entidad","Clave","","Nombre Municipio","Clave","Nombre Localidad" "Clave","Nombre Entidad","Clave","","Nombre Municipio","Clave","Nombre Localidad"
@ -282,23 +369,31 @@ Here's a `PDF <../_static/pdf/short_lines.pdf>`__ where small lines separating t
:alt: A PDF table with short lines :alt: A PDF table with short lines
:align: left :align: left
Let's :ref:`plot the table <geometry_table>` for this PDF. Let's plot the table for this PDF.
:: ::
>>> tables = camelot.read_pdf('short_lines.pdf') >>> tables = camelot.read_pdf('short_lines.pdf')
>>> tables[0].plot('table') >>> camelot.plot(tables[0], kind='grid')
>>> plt.show()
.. figure:: ../_static/png/short_lines_1.png .. figure:: ../_static/png/short_lines_1.png
:alt: A plot of the PDF table with short lines :alt: A plot of the PDF table with short lines
:align: left :align: left
Clearly, the smaller lines separating the headers couldn't be detected. Let's try with ``line_size_scaling=40`` and `plot the table <geometry_table>`_ again. Clearly, the smaller lines separating the headers couldn't be detected. Let's try with ``line_size_scaling=40`` and plot the table again.
:: ::
>>> tables = camelot.read_pdf('short_lines.pdf', line_size_scaling=40) >>> tables = camelot.read_pdf('short_lines.pdf', line_size_scaling=40)
>>> tables[0].plot('table') >>> camelot.plot(tables[0], kind='grid')
>>> plt.show()
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -scale 40 -plot grid short_lines.pdf
.. figure:: ../_static/png/short_lines_2.png .. figure:: ../_static/png/short_lines_2.png
:alt: An improved plot of the PDF table with short lines :alt: An improved plot of the PDF table with short lines
@ -363,6 +458,12 @@ No surprises there — it did remain in place (observe the strings "2400" and "A
>>> tables = camelot.read_pdf('short_lines.pdf', line_size_scaling=40, shift_text=['r', 'b']) >>> tables = camelot.read_pdf('short_lines.pdf', line_size_scaling=40, shift_text=['r', 'b'])
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -scale 40 -shift r -shift b short_lines.pdf
.. csv-table:: .. csv-table::
"Investigations","No. ofHHs","Age/Sex/Physiological Group","Preva-lence","C.I*","RelativePrecision","Sample sizeper State" "Investigations","No. ofHHs","Age/Sex/Physiological Group","Preva-lence","C.I*","RelativePrecision","Sample sizeper State"
@ -408,6 +509,12 @@ We don't need anything else. Now, let's pass ``copy_text=['v']`` to copy text in
>>> tables = camelot.read_pdf('copy_text.pdf', copy_text=['v']) >>> tables = camelot.read_pdf('copy_text.pdf', copy_text=['v'])
>>> tables[0].df >>> tables[0].df
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot lattice -copy v copy_text.pdf
.. csv-table:: .. csv-table::
"Sl. No.","Name of State/UT","Name of District","Disease/ Illness","No. of Cases","No. of Deaths","Date of start of outbreak","Date of reporting","Current Status","..." "Sl. No.","Name of State/UT","Name of District","Disease/ Illness","No. of Cases","No. of Deaths","Date of start of outbreak","Date of reporting","Current Status","..."

View File

@ -11,12 +11,14 @@ You can print the help for the interface by typing ``camelot --help`` in your fa
Usage: camelot [OPTIONS] COMMAND [ARGS]... Usage: camelot [OPTIONS] COMMAND [ARGS]...
Camelot: PDF Table Extraction for Humans Camelot: PDF Table Extraction for Humans
Options: Options:
--version Show the version and exit. --version Show the version and exit.
-q, --quiet TEXT Suppress logs and warnings.
-p, --pages TEXT Comma-separated page numbers. Example: 1,3,4 -p, --pages TEXT Comma-separated page numbers. Example: 1,3,4
or 1,4-end. or 1,4-end.
-pw, --password TEXT Password for decryption.
-o, --output TEXT Output file path. -o, --output TEXT Output file path.
-f, --format [csv|json|excel|html] -f, --format [csv|json|excel|html]
Output file format. Output file format.

View File

@ -5,24 +5,24 @@ How It Works
This part of the documentation includes a high-level explanation of how Camelot extracts tables from PDF files. This part of the documentation includes a high-level explanation of how Camelot extracts tables from PDF files.
You can choose between two table parsing methods, *Stream* and *Lattice*. These names for parsing methods inside Camelot were inspired from `Tabula`_. You can choose between two table parsing methods, *Stream* and *Lattice*. These names for parsing methods inside Camelot were inspired from `Tabula <https://github.com/tabulapdf/tabula>`_.
.. _Tabula: https://github.com/tabulapdf/tabula
.. _stream: .. _stream:
Stream Stream
------ ------
Stream can be used to parse tables that have whitespaces between cells to simulate a table structure. It looks for these spaces between text to form a table representation. Stream can be used to parse tables that have whitespaces between cells to simulate a table structure. It is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences, using `margins <https://euske.github.io/pdfminer/#tools>`_.
It is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences, using `margins`_. After getting the words on a page, it groups them into rows based on their *y* coordinates. It then tries to guess the number of columns the table might have by calculating the mode of the number of words in each row. This mode is used to calculate *x* ranges for the table's columns. It then adds columns to this column range list based on any words that may lie outside or inside the current column *x* ranges. 1. Words on the PDF page are grouped into text rows based on their *y* axis overlaps.
.. _margins: https://euske.github.io/pdfminer/#tools 2. Textedges are calculated and then used to guess interesting table areas on the PDF page. You can read `Anssi Nurminen's master's thesis <http://dspace.cc.tut.fi/dpub/bitstream/handle/123456789/21520/Nurminen.pdf?sequence=3>`_ to know more about this table detection technique. [See pages 20, 35 and 40]
.. note:: By default, Stream treats the whole PDF page as a table, which isn't ideal when there are more than two tables on a page with different number of columns. Automatic table detection for Stream is `in the works`_. 3. The number of columns inside each table area is then guessed. This is done by calculating the mode of the number of words in each text row. Based on this mode, words in each text row are chosen to calculate a list of column *x* ranges.
.. _in the works: https://github.com/socialcopsdev/camelot/issues/102 4. Words that lie inside/outside the current column *x* ranges are then used to extend the current list of columns.
5. Finally, a table is formed using the text rows' *y* ranges and column *x* ranges and words found on the page are assigned to the table's cells based on their *x* and *y* coordinates.
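The row grouping and column guessing described in steps 1 and 3 can be illustrated with a toy sketch (a simplification for illustration only, not Camelot's actual code; the word list and tolerance are made up)::
    from collections import Counter

    # each word is (x0, y0, x1, y1, text) in PDF coordinates (hypothetical values)
    words = [
        (72, 700, 120, 712, 'State'), (300, 700, 360, 712, 'Revenue'),
        (72, 680, 130, 692, 'Assam'), (300, 680, 370, 692, '14,874,821'),
    ]

    def group_rows(words, tol=2):
        """Group words into text rows based on (roughly) overlapping y coordinates."""
        rows = []
        for w in sorted(words, key=lambda w: -w[1]):  # top of the page first
            for row in rows:
                if abs(row[0][1] - w[1]) <= tol:
                    row.append(w)
                    break
            else:
                rows.append([w])
        return rows

    rows = group_rows(words)
    # mode of words-per-row ~ guessed number of columns (step 3)
    ncols = Counter(len(r) for r in rows).most_common(1)[0][0]
    modal_row = next(r for r in rows if len(r) == ncols)
    cols = [(w[0], w[2]) for w in sorted(modal_row)]  # column x ranges
    print(ncols, cols)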
.. _lattice: .. _lattice:
@ -39,7 +39,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
1. Line segments are detected. 1. Line segments are detected.
.. image:: ../_static/png/geometry_line.png .. image:: ../_static/png/plot_line.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
@ -49,7 +49,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
.. _and: https://en.wikipedia.org/wiki/Logical_conjunction .. _and: https://en.wikipedia.org/wiki/Logical_conjunction
.. image:: ../_static/png/geometry_joint.png .. image:: ../_static/png/plot_joint.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
@ -59,7 +59,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
.. _or: https://en.wikipedia.org/wiki/Logical_disjunction .. _or: https://en.wikipedia.org/wiki/Logical_disjunction
.. image:: ../_static/png/geometry_contour.png .. image:: ../_static/png/plot_contour.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
@ -75,7 +75,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
5. Spanning cells are detected using the line segments and line intersections. 5. Spanning cells are detected using the line segments and line intersections.
.. image:: ../_static/png/geometry_table.png .. image:: ../_static/png/plot_table.png
:height: 674 :height: 674
:width: 1366 :width: 1366
:scale: 50% :scale: 50%
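The line, joint and contour steps above can be approximated with a short OpenCV sketch (a simplified illustration, not Camelot's actual implementation; it assumes the PDF page has already been converted to an image, here a placeholder ``page.png``)::
    import cv2

    img = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
    # invert and binarize so that rulings become white on black
    thresh = cv2.adaptiveThreshold(255 - img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 15, -2)

    scale = 15  # similar in spirit to line_size_scaling
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (img.shape[1] // scale, 1))
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, img.shape[0] // scale))

    horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, h_kernel)  # line segments
    vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, v_kernel)

    joints = cv2.bitwise_and(horizontal, vertical)     # 'joint': line intersections
    table_mask = cv2.bitwise_or(horizontal, vertical)  # 'contour': table boundaries

    # outer contours of the OR mask approximate candidate table areas
    contours = cv2.findContours(table_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        print('possible table area:', (x, y, x + w, y + h))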

View File

@ -0,0 +1,76 @@
.. _install_deps:
Installation of dependencies
============================
The dependencies `Tkinter`_ and `ghostscript`_ can be installed using your system's package manager. You can run one of the following, based on your OS.
.. _Tkinter: https://wiki.python.org/moin/TkInter
.. _ghostscript: https://www.ghostscript.com
OS-specific instructions
------------------------
For Ubuntu
^^^^^^^^^^
::
$ apt install python-tk ghostscript
Or for Python 3::
$ apt install python3-tk ghostscript
For macOS
^^^^^^^^^
::
$ brew install tcl-tk ghostscript
For Windows
^^^^^^^^^^^
For Tkinter, you can download the `ActiveTcl Community Edition`_ from ActiveState. For ghostscript, you can get the installer at the `ghostscript downloads page`_.
After installing ghostscript, you'll need to reboot your system to make sure that the ghostscript executable's path is in the Windows PATH environment variable. If you don't want to reboot, you can manually add the ghostscript executable's path to the PATH variable, `as shown here`_.
.. _ActiveTcl Community Edition: https://www.activestate.com/activetcl/downloads
.. _ghostscript downloads page: https://www.ghostscript.com/download/gsdnld.html
.. _as shown here: https://java.com/en/download/help/path.xml
Checks to see if dependencies were installed correctly
------------------------------------------------------
You can do the following checks to see if the dependencies were installed correctly.
For Tkinter
^^^^^^^^^^^
Launch Python, and then at the prompt, type::
>>> import Tkinter
Or in Python 3::
>>> import tkinter
If you have Tkinter, Python will not print an error message, and if not, you will see an ``ImportError``.
For ghostscript
^^^^^^^^^^^^^^^
Run the following to check the ghostscript version.
For Ubuntu/macOS::
$ gs -version
For Windows::
C:\> gswin64c.exe -version
Or for Windows 32-bit::
C:\> gswin32c.exe -version
If you have ghostscript, you should see the ghostscript version and copyright information.

View File

@ -3,90 +3,37 @@
Installation of Camelot Installation of Camelot
======================= =======================
This part of the documentation covers how to install Camelot. First, you'll need to install the dependencies, which include `Tkinter`_ and `ghostscript`_. This part of the documentation covers the steps to install Camelot.
Using conda
-----------
The easiest way to install Camelot is to install it with `conda`_, which is a package manager and environment management system for the `Anaconda`_ distribution.
::
$ conda install -c conda-forge camelot-py
.. note:: Camelot is available for Python 2.7, 3.5 and 3.6 on Linux, macOS and Windows. For Windows, you will need to install ghostscript which you can get from their `downloads page`_.
.. _conda: https://conda.io/docs/
.. _Anaconda: http://docs.continuum.io/anaconda/
.. _downloads page: https://www.ghostscript.com/download/gsdnld.html
.. _conda-forge: https://conda-forge.org/
Using pip
---------
After :ref:`installing the dependencies <install_deps>`, which include `Tkinter`_ and `ghostscript`_, you can simply use pip to install Camelot::
$ pip install camelot-py[cv]
.. _Tkinter: https://wiki.python.org/moin/TkInter .. _Tkinter: https://wiki.python.org/moin/TkInter
.. _ghostscript: https://www.ghostscript.com .. _ghostscript: https://www.ghostscript.com
Install the dependencies From the source code
------------------------ --------------------
These can be installed using your system's package manager. You can run one of the following, based on your OS. After :ref:`installing the dependencies <install_deps>`, you can install from the source by:
For Ubuntu
^^^^^^^^^^
::
$ apt install python-tk ghostscript
Or for Python 3::
$ apt install python3-tk ghostscript
For macOS
^^^^^^^^^
::
$ brew install tcl-tk ghostscript
For Windows
^^^^^^^^^^^
For Tkinter, you can download the `ActiveTcl Community Edition`_ from ActiveState. For ghostscript, you can get the installer at the `ghostscript downloads page`_.
After installing ghostscript, you'll need to reboot your system to make sure that the ghostscript executable's path is in the windows PATH environment variable. In case you don't want to reboot, you can manually add the ghostscript executable's path to the PATH variable, `as shown here`_.
.. _ActiveTcl Community Edition: https://www.activestate.com/activetcl/downloads
.. _ghostscript downloads page: https://www.ghostscript.com/download/gsdnld.html
.. _as shown here: https://java.com/en/download/help/path.xml
----
You can do the following checks to see if the dependencies were installed correctly.
For Tkinter
^^^^^^^^^^^
Launch Python, and then at the prompt, type::
>>> import Tkinter
Or in Python 3::
>>> import tkinter
If you have Tkinter, Python will not print an error message, and if not, you will see an ``ImportError``.
For ghostscript
^^^^^^^^^^^^^^^
Run the following to check the ghostscript version.
For Ubuntu/macOS::
$ gs -version
For Windows::
C:\> gswin64c.exe -version
Or for Windows 32-bit::
C:\> gswin32c.exe -version
If you have ghostscript, you should see the ghostscript version and copyright information.
$ pip install camelot-py
------------------------
After installing the dependencies, you can simply use pip to install Camelot::
$ pip install camelot-py
Get the source code
-------------------
Alternatively, you can install from the source by:
1. Cloning the GitHub repository. 1. Cloning the GitHub repository.
:: ::
@ -97,4 +44,4 @@ Alternatively, you can install from the source by:
:: ::
$ cd camelot $ cd camelot
$ pip install . $ pip install ".[cv]"

View File

@ -70,6 +70,12 @@ You can also export all tables at once, using the :class:`tables <camelot.core.T
>>> tables.export('foo.csv', f='csv') >>> tables.export('foo.csv', f='csv')
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot --format csv --output foo.csv lattice foo.pdf
This will export all tables as CSV files at the path specified. Alternatively, you can use ``f='json'``, ``f='excel'`` or ``f='html'``. This will export all tables as CSV files at the path specified. Alternatively, you can use ``f='json'``, ``f='excel'`` or ``f='html'``.
.. note:: The :meth:`export() <camelot.core.TableList.export>` method exports files with a ``page-*-table-*`` suffix. In the example above, the single table in the list will be exported to ``foo-page-1-table-1.csv``. If the list contains multiple tables, multiple CSV files will be created. To avoid filling up your path with multiple files, you can use ``compress=True``, which will create a single ZIP file at your path with all the CSV files. .. note:: The :meth:`export() <camelot.core.TableList.export>` method exports files with a ``page-*-table-*`` suffix. In the example above, the single table in the list will be exported to ``foo-page-1-table-1.csv``. If the list contains multiple tables, multiple CSV files will be created. To avoid filling up your path with multiple files, you can use ``compress=True``, which will create a single ZIP file at your path with all the CSV files.
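For example, to bundle all the exported CSV files into a single ZIP archive, as described in the note above::
    >>> tables.export('foo.csv', f='csv', compress=True)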
@ -85,8 +91,42 @@ By default, Camelot only uses the first page of the PDF to extract tables. To sp
>>> camelot.read_pdf('your.pdf', pages='1,2,3') >>> camelot.read_pdf('your.pdf', pages='1,2,3')
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot --pages 1,2,3 lattice your.pdf
The ``pages`` keyword argument accepts pages as a comma-separated string of page numbers. You can also specify page ranges — for example, ``pages=1,4-10,20-30`` or ``pages=1,4-10,20-end``. The ``pages`` keyword argument accepts pages as a comma-separated string of page numbers. You can also specify page ranges — for example, ``pages=1,4-10,20-30`` or ``pages=1,4-10,20-end``.
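For example, to use a range that runs to the last page (reusing the placeholder file name from above)::
    >>> tables = camelot.read_pdf('your.pdf', pages='1,4-10,20-end')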
------------------------ Reading encrypted PDFs
----------------------
To extract tables from encrypted PDF files you must provide a password when calling :meth:`read_pdf() <camelot.read_pdf>`.
::
>>> tables = camelot.read_pdf('foo.pdf', password='userpass')
>>> tables
<TableList n=1>
.. tip::
Here's how you can do the same with the :ref:`command-line interface <cli>`.
::
$ camelot --password userpass lattice foo.pdf
Currently Camelot only supports PDFs encrypted with ASCII passwords and algorithm `code 1 or 2`_. An exception is thrown if the PDF cannot be read. This may be due to no password being provided, an incorrect password, or an unsupported encryption algorithm.
Further encryption support may be added in the future. In the meantime, if your PDF files use an unsupported encryption algorithm, you are advised to remove the encryption before calling :meth:`read_pdf() <camelot.read_pdf>`. This can be done with third-party tools such as `QPDF`_.
::
$ qpdf --password=<PASSWORD> --decrypt input.pdf output.pdf
.. _code 1 or 2: https://github.com/mstamy2/PyPDF2/issues/378
.. _QPDF: https://www.github.com/qpdf/qpdf
----
Ready for more? Check out the :ref:`advanced <advanced>` section. Ready for more? Check out the :ref:`advanced <advanced>` section.

View File

@ -1,5 +0,0 @@
codecov==2.0.15
pytest==3.8.0
pytest-cov==2.6.0
pytest-runner==4.2
Sphinx==1.7.9

18
requirements.txt 100644 → 100755
View File

@ -1,9 +1,9 @@
click==6.7 click>=6.7
ghostscript==0.6 ghostscript>=0.6
matplotlib==2.2.3 matplotlib>=2.2.3
numpy==1.15.2 numpy>=1.13.3
opencv-python==3.4.2.17 opencv-python>=3.4.2.17
openpyxl==2.5.8 openpyxl>=2.5.8
pandas==0.23.4 pandas>=0.23.4
pdfminer.six==20170720 pdfminer.six>=20170720
PyPDF2==1.26.0 PyPDF2>=1.26.0

View File

@ -2,5 +2,5 @@
test=pytest test=pytest
[tool:pytest] [tool:pytest]
addopts = --verbose --cov-config .coveragerc --cov-report term --cov-report xml --cov=camelot tests addopts = --verbose --cov-config .coveragerc --cov-report term --cov-report xml --cov=camelot --mpl
python_files = tests/test_*.py python_files = tests/test_*.py

View File

@ -13,17 +13,38 @@ with open('README.md', 'r') as f:
readme = f.read() readme = f.read()
requires = [
'chardet>=3.0.4',
'click>=6.7',
'numpy>=1.13.3',
'openpyxl>=2.5.8',
'pandas>=0.23.4',
'pdfminer.six>=20170720',
'PyPDF2>=1.26.0'
]
cv_requires = [
'opencv-python>=3.4.2.17'
]
plot_requires = [
'matplotlib>=2.2.3',
]
dev_requires = [
'codecov>=2.0.15',
'pytest>=3.8.0',
'pytest-cov>=2.6.0',
'pytest-mpl>=0.10',
'pytest-runner>=4.2',
'Sphinx>=1.7.9'
]
all_requires = cv_requires + plot_requires
dev_requires = dev_requires + all_requires
def setup_package(): def setup_package():
reqs = []
with open('requirements.txt', 'r') as f:
for line in f:
reqs.append(line.strip())
dev_reqs = []
with open('requirements-dev.txt', 'r') as f:
for line in f:
dev_reqs.append(line.strip())
metadata = dict(name=about['__title__'], metadata = dict(name=about['__title__'],
version=about['__version__'], version=about['__version__'],
description=about['__description__'], description=about['__description__'],
@ -34,9 +55,12 @@ def setup_package():
author_email=about['__author_email__'], author_email=about['__author_email__'],
license=about['__license__'], license=about['__license__'],
packages=find_packages(exclude=('tests',)), packages=find_packages(exclude=('tests',)),
install_requires=reqs, install_requires=requires,
extras_require={ extras_require={
'dev': dev_reqs 'all': all_requires,
'cv': cv_requires,
'dev': dev_requires,
'plot': plot_requires
}, },
entry_points={ entry_points={
'console_scripts': [ 'console_scripts': [

View File

@ -0,0 +1,2 @@
import matplotlib
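# use the non-interactive Agg backend so plotting tests can run headlessly (e.g. on CI)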
matplotlib.use('agg')

View File

@ -33,58 +33,141 @@ data_stream = [
["Nagaland", "2,368,724", "204,329", "226,400", "0", "2,799,453", "783,054", "3,582,507"], ["Nagaland", "2,368,724", "204,329", "226,400", "0", "2,799,453", "783,054", "3,582,507"],
["Odisha", "14,317,179", "2,552,292", "1,107,250", "0", "17,976,721", "451,438", "18,428,159"], ["Odisha", "14,317,179", "2,552,292", "1,107,250", "0", "17,976,721", "451,438", "18,428,159"],
["Puducherry", "4,191,757", "52,249", "192,400", "0", "4,436,406", "2,173", "4,438,579"], ["Puducherry", "4,191,757", "52,249", "192,400", "0", "4,436,406", "2,173", "4,438,579"],
["Punjab", "19,775,485", "2,208,343", "2,470,882", "0", "24,454,710", "1,436,522", "25,891,232"], ["Punjab", "19,775,485", "2,208,343", "2,470,882", "0", "24,454,710", "1,436,522", "25,891,232"]
["", "Health Sector Financing by Centre and States/UTs in India [2009-10 to 2012-13](Revised) P a g e |23", "", "", "", "", "", ""]
] ]
data_stream_table_rotated = [ data_stream_table_rotated = [
["", "", "Table 21 Current use of contraception by background characteristics\u2014Continued", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""], ["Table 21 Current use of contraception by background characteristics\u2014Continued", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "", "", "", "", "", "Modern method", "", "", "", "", "", "", "Traditional method", "", "", "", ""], ["", "", "", "", "", "Modern method", "", "", "", "", "", "", "Traditional method", "", "", "", ""],
["", "", "", "Any", "", "", "", "", "", "", "Other", "Any", "", "", "", "Not", "", "Number"], ["", "", "Any", "", "", "", "", "", "", "Other", "Any", "", "", "", "Not", "", "Number"],
["", "", "Any", "modern", "Female", "Male", "", "", "", "Condom/", "modern", "traditional", "", "With-", "Folk", "currently", "", "of"], ["", "Any", "modern", "Female", "Male", "", "", "", "Condom/", "modern", "traditional", "", "With-", "Folk", "currently", "", "of"],
["", "Background characteristic", "method", "method", "sterilization", "sterilization", "Pill", "IUD", "Injectables", "Nirodh", "method", "method", "Rhythm", "drawal", "method", "using", "Total", "women"], ["Background characteristic", "method", "method", "sterilization", "sterilization", "Pill", "IUD", "Injectables", "Nirodh", "method", "method", "Rhythm", "drawal", "method", "using", "Total", "women"],
["", "Caste/tribe", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""], ["Caste/tribe", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "Scheduled caste", "74.8", "55.8", "42.9", "0.9", "9.7", "0.0", "0.2", "2.2", "0.0", "19.0", "11.2", "7.4", "0.4", "25.2", "100.0", "1,363"], ["Scheduled caste", "74.8", "55.8", "42.9", "0.9", "9.7", "0.0", "0.2", "2.2", "0.0", "19.0", "11.2", "7.4", "0.4", "25.2", "100.0", "1,363"],
["", "Scheduled tribe", "59.3", "39.0", "26.8", "0.6", "6.4", "0.6", "1.2", "3.5", "0.0", "20.3", "10.4", "5.8", "4.1", "40.7", "100.0", "256"], ["Scheduled tribe", "59.3", "39.0", "26.8", "0.6", "6.4", "0.6", "1.2", "3.5", "0.0", "20.3", "10.4", "5.8", "4.1", "40.7", "100.0", "256"],
["", "Other backward class", "71.4", "51.1", "34.9", "0.0", "8.6", "1.4", "0.0", "6.2", "0.0", "20.4", "12.6", "7.8", "0.0", "28.6", "100.0", "211"], ["Other backward class", "71.4", "51.1", "34.9", "0.0", "8.6", "1.4", "0.0", "6.2", "0.0", "20.4", "12.6", "7.8", "0.0", "28.6", "100.0", "211"],
["", "Other", "71.1", "48.8", "28.2", "0.8", "13.3", "0.9", "0.3", "5.2", "0.1", "22.3", "12.9", "9.1", "0.3", "28.9", "100.0", "3,319"], ["Other", "71.1", "48.8", "28.2", "0.8", "13.3", "0.9", "0.3", "5.2", "0.1", "22.3", "12.9", "9.1", "0.3", "28.9", "100.0", "3,319"],
["", "Wealth index", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""], ["Wealth index", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "Lowest", "64.5", "48.6", "34.3", "0.5", "10.5", "0.6", "0.7", "2.0", "0.0", "15.9", "9.9", "4.6", "1.4", "35.5", "100.0", "1,258"], ["Lowest", "64.5", "48.6", "34.3", "0.5", "10.5", "0.6", "0.7", "2.0", "0.0", "15.9", "9.9", "4.6", "1.4", "35.5", "100.0", "1,258"],
["", "Second", "68.5", "50.4", "36.2", "1.1", "11.4", "0.5", "0.1", "1.1", "0.0", "18.1", "11.2", "6.7", "0.2", "31.5", "100.0", "1,317"], ["Second", "68.5", "50.4", "36.2", "1.1", "11.4", "0.5", "0.1", "1.1", "0.0", "18.1", "11.2", "6.7", "0.2", "31.5", "100.0", "1,317"],
["", "Middle", "75.5", "52.8", "33.6", "0.6", "14.2", "0.4", "0.5", "3.4", "0.1", "22.7", "13.4", "8.9", "0.4", "24.5", "100.0", "1,018"], ["Middle", "75.5", "52.8", "33.6", "0.6", "14.2", "0.4", "0.5", "3.4", "0.1", "22.7", "13.4", "8.9", "0.4", "24.5", "100.0", "1,018"],
["", "Fourth", "73.9", "52.3", "32.0", "0.5", "12.5", "0.6", "0.2", "6.3", "0.2", "21.6", "11.5", "9.9", "0.2", "26.1", "100.0", "908"], ["Fourth", "73.9", "52.3", "32.0", "0.5", "12.5", "0.6", "0.2", "6.3", "0.2", "21.6", "11.5", "9.9", "0.2", "26.1", "100.0", "908"],
["", "Highest", "78.3", "44.4", "19.5", "1.0", "9.7", "1.4", "0.0", "12.7", "0.0", "33.8", "18.2", "15.6", "0.0", "21.7", "100.0", "733"], ["Highest", "78.3", "44.4", "19.5", "1.0", "9.7", "1.4", "0.0", "12.7", "0.0", "33.8", "18.2", "15.6", "0.0", "21.7", "100.0", "733"],
["", "Number of living children", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""], ["Number of living children", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "No children", "25.1", "7.6", "0.3", "0.5", "2.0", "0.0", ["No children", "25.1", "7.6", "0.3", "0.5", "2.0", "0.0",
"0.0", "4.8", "0.0", "17.5", "9.0", "8.5", "0.0", "74.9", "100.0", "563"], "0.0", "4.8", "0.0", "17.5", "9.0", "8.5", "0.0", "74.9", "100.0", "563"],
["", "1 child", "66.5", "32.1", "3.7", "0.7", "20.1", "0.7", "0.1", "6.9", "0.0", "34.3", "18.9", "15.2", "0.3", "33.5", "100.0", "1,190"], ["1 child", "66.5", "32.1", "3.7", "0.7", "20.1", "0.7", "0.1", "6.9", "0.0", "34.3", "18.9", "15.2", "0.3", "33.5", "100.0", "1,190"],
["\x18\x18", "1 son", "66.8", "33.2", "4.1", "0.7", "21.1", "0.5", "0.3", "6.6", "0.0", "33.5", "21.2", "12.3", "0.0", "33.2", "100.0", "672"], ["1 son", "66.8", "33.2", "4.1", "0.7", "21.1", "0.5", "0.3", "6.6", "0.0", "33.5", "21.2", "12.3", "0.0", "33.2", "100.0", "672"],
["", "No sons", "66.1", "30.7", "3.1", "0.6", "18.8", "0.8", "0.0", "7.3", "0.0", "35.4", "15.8", "19.0", "0.6", "33.9", "100.0", "517"], ["No sons", "66.1", "30.7", "3.1", "0.6", "18.8", "0.8", "0.0", "7.3", "0.0", "35.4", "15.8", "19.0", "0.6", "33.9", "100.0", "517"],
["", "2 children", "81.6", "60.5", "41.8", "0.9", "11.6", "0.8", "0.3", "4.8", "0.2", "21.1", "12.2", "8.3", "0.6", "18.4", "100.0", "1,576"], ["2 children", "81.6", "60.5", "41.8", "0.9", "11.6", "0.8", "0.3", "4.8", "0.2", "21.1", "12.2", "8.3", "0.6", "18.4", "100.0", "1,576"],
["", "1 or more sons", "83.7", "64.2", "46.4", "0.9", "10.8", "0.8", "0.4", "4.8", "0.1", "19.5", "11.1", "7.6", "0.7", "16.3", "100.0", "1,268"], ["1 or more sons", "83.7", "64.2", "46.4", "0.9", "10.8", "0.8", "0.4", "4.8", "0.1", "19.5", "11.1", "7.6", "0.7", "16.3", "100.0", "1,268"],
["", "No sons", "73.2", "45.5", "23.2", "1.0", "15.1", "0.9", "0.0", "4.8", "0.5", "27.7", "16.8", "11.0", "0.0", "26.8", "100.0", "308"], ["No sons", "73.2", "45.5", "23.2", "1.0", "15.1", "0.9", "0.0", "4.8", "0.5", "27.7", "16.8", "11.0", "0.0", "26.8", "100.0", "308"],
["", "3 children", "83.9", "71.2", "57.7", "0.8", "9.8", "0.6", "0.5", "1.8", "0.0", "12.7", "8.7", "3.3", "0.8", "16.1", "100.0", "961"], ["3 children", "83.9", "71.2", "57.7", "0.8", "9.8", "0.6", "0.5", "1.8", "0.0", "12.7", "8.7", "3.3", "0.8", "16.1", "100.0", "961"],
["", "1 or more sons", "85.0", "73.2", "60.3", "0.9", "9.4", "0.5", "0.5", "1.6", "0.0", "11.8", "8.1", "3.0", "0.7", "15.0", "100.0", "860"], ["1 or more sons", "85.0", "73.2", "60.3", "0.9", "9.4", "0.5", "0.5", "1.6", "0.0", "11.8", "8.1", "3.0", "0.7", "15.0", "100.0", "860"],
["", "No sons", "74.7", "53.8", "35.3", "0.0", "13.7", "1.6", "0.0", "3.2", "0.0", "20.9", "13.4", "6.1", "1.5", "25.3", "100.0", "101"], ["No sons", "74.7", "53.8", "35.3", "0.0", "13.7", "1.6", "0.0", "3.2", "0.0", "20.9", "13.4", "6.1", "1.5", "25.3", "100.0", "101"],
["", "4+ children", "74.3", "58.1", "45.1", "0.6", "8.7", "0.6", "0.7", "2.4", "0.0", "16.1", "9.9", "5.4", "0.8", "25.7", "100.0", "944"], ["4+ children", "74.3", "58.1", "45.1", "0.6", "8.7", "0.6", "0.7", "2.4", "0.0", "16.1", "9.9", "5.4", "0.8", "25.7", "100.0", "944"],
["", "1 or more sons", "73.9", "58.2", "46.0", "0.7", "8.3", "0.7", "0.7", "1.9", "0.0", "15.7", "9.4", "5.5", "0.8", "26.1", "100.0", "901"], ["1 or more sons", "73.9", "58.2", "46.0", "0.7", "8.3", "0.7", "0.7", "1.9", "0.0", "15.7", "9.4", "5.5", "0.8", "26.1", "100.0", "901"],
["", "No sons", "(82.1)", "(57.3)", "(25.6)", "(0.0)", "(17.8)", "(0.0)", "(0.0)", "(13.9)", "(0.0)", "(24.8)", "(21.3)", "(3.5)", "(0.0)", "(17.9)", "100.0", "43"], ["No sons", "(82.1)", "(57.3)", "(25.6)", "(0.0)", "(17.8)", "(0.0)", "(0.0)", "(13.9)", "(0.0)", "(24.8)", "(21.3)", "(3.5)", "(0.0)", "(17.9)", "100.0", "43"],
["", "Total", "71.2", "49.9", "32.2", ["Total", "71.2", "49.9", "32.2",
"0.7", "11.7", "0.6", "0.3", "4.3", "0.1", "21.3", "12.3", "8.4", "0.5", "28.8", "100.0", "5,234"], "0.7", "11.7", "0.6", "0.3", "4.3", "0.1", "21.3", "12.3", "8.4", "0.5", "28.8", "100.0", "5,234"],
["", "NFHS-2 (1998-99)", "66.6", "47.3", "32.0", "1.8", "9.2", "1.4", "na", "2.9", "na", "na", "8.7", "9.8", "na", "33.4", "100.0", "4,116"], ["NFHS-2 (1998-99)", "66.6", "47.3", "32.0", "1.8", "9.2", "1.4", "na", "2.9", "na", "na", "8.7", "9.8", "na", "33.4", "100.0", "4,116"],
["", "NFHS-1 (1992-93)", "57.7", "37.6", "26.5", "4.3", "3.6", "1.3", "0.1", "1.9", "na", "na", "11.3", "8.3", "na", "42.3", "100.0", "3,970"], ["NFHS-1 (1992-93)", "57.7", "37.6", "26.5", "4.3", "3.6", "1.3", "0.1", "1.9", "na", "na", "11.3", "8.3", "na", "42.3", "100.0", "3,970"]
["", "", "Note: If more than one method is used, only the most effective method is considered in this tabulation. Total includes women for whom caste/tribe was not known or is missing, who are", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "not shown separately.", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "na = Not available", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "", "ns = Not shown; see table 2b, footnote 1", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "( ) Based on 25-49 unweighted cases.", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
["", "", "", "", "", "", "", "", "54", "", "", "", "", "", "", "", "", ""]
] ]
data_stream_table_area = [ data_stream_two_tables_1 = [
["[In thousands (11,062.6 represents 11,062,600) For year ending December 31. Based on Uniform Crime Reporting (UCR)", "", "", "", "", "", "", "", "", ""],
["Program. Represents arrests reported (not charged) by 12,910 agencies with a total population of 247,526,916 as estimated", "", "", "", "", "", "", "", "", ""],
["by the FBI. Some persons may be arrested more than once during a year, therefore, the data in this table, in some cases,", "", "", "", "", "", "", "", "", ""],
["could represent multiple arrests of the same person. See text, this section and source]", "", "", "", "", "", "", "", "", ""],
["", "", "Total", "", "", "Male", "", "", "Female", ""],
["Offense charged", "", "Under 18", "18 years", "", "Under 18", "18 years", "", "Under 18", "18 years"],
["", "Total", "years", "and over", "Total", "years", "and over", "Total", "years", "and over"],
["Total .\n .\n . . . . . .\n . .\n . .\n . .\n . .\n . .\n . .\n . .\n . . .", "11,062 .6", "1,540 .0", "9,522 .6", "8,263 .3", "1,071 .6", "7,191 .7", "2,799 .2", "468 .3", "2,330 .9"],
["Violent crime . . . . . . . .\n . .\n . .\n . .\n . .\n . .", "467 .9", "69 .1", "398 .8", "380 .2", "56 .5", "323 .7", "87 .7", "12 .6", "75 .2"],
["Murder and nonnegligent", "", "", "", "", "", "", "", "", ""],
["manslaughter . . . . . . . .\n. .\n. .\n. .\n. .\n.", "10.0", "0.9", "9.1", "9.0", "0.9", "8.1", "1.1", "", "1.0"],
["Forcible rape . . . . . . . .\n. .\n. .\n. .\n. .\n. .", "17.5", "2.6", "14.9", "17.2", "2.5", "14.7", "", "", ""],
["Robbery . . . .\n. .\n. . .\n. . .\n.\n. . .\n.\n. . .\n.\n.", "102.1", "25.5", "76.6", "90.0", "22.9", "67.1", "12.1", "2.5", "9.5"],
["Aggravated assault . . . . . . . .\n. .\n. .\n.", "338.4", "40.1", "298.3", "264.0", "30.2", "233.8", "74.4", "9.9", "64.5"],
["Property crime . . . .\n . .\n . . .\n . . .\n .\n . . . .", "1,396 .4", "338 .7", "1,057 .7", "875 .9", "210 .8", "665 .1", "608 .2", "127 .9", "392 .6"],
["Burglary . .\n. . . . . .\n. .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.", "240.9", "60.3", "180.6", "205.0", "53.4", "151.7", "35.9", "6.9", "29.0"],
["Larceny-theft . . . . . . . .\n. .\n. .\n. .\n. .\n. .", "1,080.1", "258.1", "822.0", "608.8", "140.5", "468.3", "471.3", "117.6", "353.6"],
["Motor vehicle theft . . . . .\n. .\n. . .\n.\n.\n. .", "65.6", "16.0", "49.6", "53.9", "13.3", "40.7", "11.7", "2.7", "8.9"],
["Arson .\n. . . . .\n. . .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. .", "9.8", "4.3", "5.5", "8.1", "3.7", "4.4", "1.7", "0.6", "1.1"],
["Other assaults .\n. . . . . .\n. . .\n.\n. . .\n.\n. .\n.", "1,061.3", "175.3", "886.1", "785.4", "115.4", "670.0", "276.0", "59.9", "216.1"],
["Forgery and counterfeiting .\n. . . . . . .\n.", "68.9", "1.7", "67.2", "42.9", "1.2", "41.7", "26.0", "0.5", "25.5"],
["Fraud .\n.\n.\n. .\n. . . .\n. .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n.", "173.7", "5.1", "168.5", "98.4", "3.3", "95.0", "75.3", "1.8", "73.5"],
["Embezzlement . . .\n. . . . .\n. . .\n.\n. . .\n.\n.\n.", "14.6", "", "14.1", "7.2", "", "6.9", "7.4", "", "7.2"],
["Stolen property 1 . . . . . . .\n. . .\n. .\n. .\n.\n.", "84.3", "15.1", "69.2", "66.7", "12.2", "54.5", "17.6", "2.8", "14.7"],
["Vandalism . . . . . . . .\n. .\n. .\n. .\n. .\n. .\n.\n.\n.", "217.4", "72.7", "144.7", "178.1", "62.8", "115.3", "39.3", "9.9", "29.4"],
["Weapons; carrying, possessing, etc. .", "132.9", "27.1", "105.8", "122.1", "24.3", "97.8", "10.8", "2.8", "8.0"],
["Prostitution and commercialized vice", "56.9", "1.1", "55.8", "17.3", "", "17.1", "39.6", "0.8", "38.7"],
["Sex offenses 2 . . . . .\n. . . . .\n. .\n. .\n. . .\n.", "61.5", "10.7", "50.7", "56.1", "9.6", "46.5", "5.4", "1.1", "4.3"],
["Drug abuse violations . . . . . . . .\n. .\n.\n.", "1,333.0", "136.6", "1,196.4", "1,084.3", "115.2", "969.1", "248.7", "21.4", "227.3"],
["Gambling .\n. . . . . .\n. .\n.\n. . .\n.\n. . .\n.\n. .\n.\n.", "8.2", "1.4", "6.8", "7.2", "1.4", "5.9", "0.9", "", "0.9"],
["Offenses against the family and", "", "", "", "", "", "", "", "", ""],
["children . . . .\n. . . .\n. .\n. .\n. .\n. .\n. .\n. . .\n.", "92.4", "3.7", "88.7", "68.9", "2.4", "66.6", "23.4", "1.3", "22.1"],
["Driving under the influence . . . . . .\n. .", "1,158.5", "109.2", "1,147.5", "895.8", "8.2", "887.6", "262.7", "2.7", "260.0"],
["Liquor laws . . . . . . . .\n. .\n. .\n. .\n. .\n. .\n. .", "48.2", "90.2", "368.0", "326.8", "55.4", "271.4", "131.4", "34.7", "96.6"],
["Drunkenness . . .\n. . . . .\n. . .\n.\n. . .\n.\n. .\n.", "488.1", "11.4", "476.8", "406.8", "8.5", "398.3", "81.3", "2.9", "78.4"],
["Disorderly conduct . .\n. . . . . . .\n. .\n. .\n. .", "529.5", "136.1", "393.3", "387.1", "90.8", "296.2", "142.4", "45.3", "97.1"],
["Vagrancy . . . .\n. . . . .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.", "26.6", "2.2", "24.4", "20.9", "1.6", "19.3", "5.7", "0.6", "5.1"],
["All other offenses (except traffic) . . .\n.", "306.1", "263.4", "2,800.8", "2,337.1", "194.2", "2,142.9", "727.0", "69.2", "657.9"],
["Suspicion . . . .\n. . . .\n. .\n. .\n. .\n. .\n. .\n. . .\n.", "1.6", "", "1.4", "1.2", "", "1.0", "", "", ""],
["Curfew and loitering law violations .\n.", "91.0", "91.0", "(X)", "63.1", "63.1", "(X)", "28.0", "28.0", "(X)"],
["Runaways . . . . . . . .\n. .\n. .\n. .\n. .\n. .\n.\n.\n.", "75.8", "75.8", "(X)", "34.0", "34.0", "(X)", "41.8", "41.8", "(X)"],
["", " Represents zero. X Not applicable. 1 Buying, receiving, possessing stolen property. 2 Except forcible rape and prostitution.", "", "", "", "", "", "", "", ""],
["", "Source: U.S. Department of Justice, Federal Bureau of Investigation, Uniform Crime Reports, Arrests Master Files.", "", "", "", "", "", "", "", ""]
]
data_stream_two_tables_2 = [
["", "Source: U.S. Department of Justice, Federal Bureau of Investigation, Uniform Crime Reports, Arrests Master Files.", "", "", "", ""],
["Table 325. Arrests by Race: 2009", "", "", "", "", ""],
["[Based on Uniform Crime Reporting (UCR) Program. Represents arrests reported (not charged) by 12,371 agencies", "", "", "", "", ""],
["with a total population of 239,839,971 as estimated by the FBI. See headnote, Table 324]", "", "", "", "", ""],
["", "", "", "", "American", ""],
["Offense charged", "", "", "", "Indian/Alaskan", "Asian Pacific"],
["", "Total", "White", "Black", "Native", "Islander"],
["Total .\n .\n .\n .\n . .\n . . .\n . . .\n .\n . . .\n .\n . . .\n . .\n .\n . . .\n .\n .\n .\n . .\n . .\n . .", "10,690,561", "7,389,208", "3,027,153", "150,544", "123,656"],
["Violent crime . . . . . . . .\n . .\n . .\n . .\n . .\n .\n .\n . .\n . .\n .\n .\n .\n .\n . .", "456,965", "268,346", "177,766", "5,608", "5,245"],
["Murder and nonnegligent manslaughter . .\n. .\n.\n. .", "9,739", "4,741", "4,801", "100", "97"],
["Forcible rape . . . . . . . .\n. .\n. .\n. .\n. .\n.\n.\n. .\n. .\n.\n.\n.\n.\n. .", "16,362", "10,644", "5,319", "169", "230"],
["Robbery . . . . .\n. . . . .\n.\n. . .\n.\n. . .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. . . .", "100,496", "43,039", "55,742", "726", "989"],
["Aggravated assault . . . . . . . .\n. .\n. .\n.\n.\n.\n.\n. .\n. .\n.\n.\n.", "330,368", "209,922", "111,904", "4,613", "3,929"],
["Property crime . . . . .\n . . . . .\n .\n . . .\n .\n . .\n .\n .\n .\n . .\n .\n . .\n .\n .", "1,364,409", "922,139", "406,382", "17,599", "18,289"],
["Burglary . . .\n. . . . .\n. . .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n. . . .", "234,551", "155,994", "74,419", "2,021", "2,117"],
["Larceny-theft . . . . . . . .\n. .\n. .\n. .\n. .\n.\n.\n. .\n. .\n.\n.\n.\n.\n. .", "1,056,473", "719,983", "306,625", "14,646", "15,219"],
["Motor vehicle theft . . . . . .\n. .\n.\n. . .\n.\n. .\n.\n.\n.\n. .\n.\n. .\n.", "63,919", "39,077", "23,184", "817", "841"],
["Arson .\n. . . .\n. .\n. .\n. .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. . . . . .", "9,466", "7,085", "2,154", "115", "112"],
["Other assaults .\n. . . . . . .\n.\n. . .\n.\n. . .\n.\n. .\n.\n.\n.\n. .\n.\n. .\n.", "1,032,502", "672,865", "332,435", "15,127", "12,075"],
["Forgery and counterfeiting .\n. . . . . . .\n.\n. .\n.\n.\n.\n. .\n. .\n.", "67,054", "44,730", "21,251", "345", "728"],
["Fraud .\n.\n. . . . . .\n. .\n. .\n. .\n. .\n. .\n. .\n. .\n. .\n. .\n.\n.\n. . . . . . .", "161,233", "108,032", "50,367", "1,315", "1,519"],
["Embezzlement . . . .\n. . . . .\n.\n. . .\n.\n. . .\n.\n.\n. .\n.\n. .\n.\n.\n.\n.", "13,960", "9,208", "4,429", "75", "248"],
["Stolen property; buying, receiving, possessing .\n. .", "82,714", "51,953", "29,357", "662", "742"],
["Vandalism . . . . . . . .\n. .\n. .\n. .\n. .\n. .\n. .\n.\n.\n. .\n. .\n.\n.\n.\n. .", "212,173", "157,723", "48,746", "3,352", "2,352"],
["Weapons—carrying, possessing, etc. .\n. .\n. .\n.\n. .\n. .", "130,503", "74,942", "53,441", "951", "1,169"],
["Prostitution and commercialized vice . .\n.\n. .\n. .\n. .\n.", "56,560", "31,699", "23,021", "427", "1,413"],
["Sex offenses 1 . . . . . . . .\n. .\n. .\n. .\n. .\n.\n.\n. .\n. .\n.\n.\n.\n.\n. .", "60,175", "44,240", "14,347", "715", "873"],
["Drug abuse violations . . . . . . . .\n. . .\n.\n.\n.\n. .\n. .\n.\n.\n.\n.", "1,301,629", "845,974", "437,623", "8,588", "9,444"],
["Gambling . . . . .\n. . . . .\n.\n. . .\n.\n. . .\n. .\n.\n. . .\n.\n.\n.\n.\n. .\n. .", "8,046", "2,290", "5,518", "27", "211"],
["Offenses against the family and children .\n.\n. .\n. .\n. .", "87,232", "58,068", "26,850", "1,690", "624"],
["Driving under the influence . . . . . . .\n. .\n.\n. .\n.\n.\n.\n.\n. .", "1,105,401", "954,444", "121,594", "14,903", "14,460"],
["Liquor laws . . . . . . . .\n. .\n. .\n. .\n. .\n. . .\n.\n.\n.\n. .\n. .\n.\n.\n.\n.", "444,087", "373,189", "50,431", "14,876", "5,591"],
["Drunkenness . .\n. . . . . . .\n.\n. . .\n.\n. . .\n.\n.\n.\n. . .\n.\n.\n.\n.\n.\n.", "469,958", "387,542", "71,020", "8,552", "2,844"],
["Disorderly conduct . . .\n. . . . . .\n. .\n. . .\n.\n.\n.\n. .\n. .\n.\n.\n.\n.", "515,689", "326,563", "176,169", "8,783", "4,174"],
["Vagrancy . . .\n. .\n. . . .\n. .\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. .\n.\n.\n. . . .", "26,347", "14,581", "11,031", "543", "192"],
["All other offenses (except traffic) . .\n. .\n. .\n. .\n.\n.\n.\n. .\n.", "2,929,217", "1,937,221", "911,670", "43,880", "36,446"],
["Suspicion . . .\n. . . . .\n. .\n. .\n. .\n. .\n. .\n. .\n. .\n.\n.\n.\n.\n. .\n. . . .", "1,513", "677", "828", "1", "7"],
["Curfew and loitering law violations . .\n. .\n.\n. .\n. .\n.\n.\n.", "89,578", "54,439", "33,207", "872", "1,060"],
["Runaways . . . . . . . .\n. .\n. .\n. .\n. .\n. .\n. .\n.\n.\n. .\n. .\n.\n.\n.\n. .", "73,616", "48,343", "19,670", "1,653", "3,950"],
["1 Except forcible rape and prostitution.", "", "", "", "", ""],
["", "Source: U.S. Department of Justice, Federal Bureau of Investigation, “Crime in the United States, Arrests,” September 2010,", "", "", "", ""]
]
data_stream_table_areas = [
["", "One Withholding"], ["", "One Withholding"],
["Payroll Period", "Allowance"], ["Payroll Period", "Allowance"],
["Weekly", "$71.15"], ["Weekly", "$\n71.15"],
["Biweekly", "142.31"], ["Biweekly", "142.31"],
["Semimonthly", "154.17"], ["Semimonthly", "154.17"],
["Monthly", "308.33"], ["Monthly", "308.33"],
@@ -187,14 +270,10 @@ data_stream_split_text = [
["", "", "", "", "1522 WEST LINDSEY", "", "", "", "", ""], ["", "", "", "", "1522 WEST LINDSEY", "", "", "", "", ""],
["632575", "BAW", "BASHU LEGENDS", "HYH HE CHUANG LLC", "STREET", "NORMAN", "OK", "73069", "-", "2014/07/21"], ["632575", "BAW", "BASHU LEGENDS", "HYH HE CHUANG LLC", "STREET", "NORMAN", "OK", "73069", "-", "2014/07/21"],
["", "", "", "DEEP FORK HOLDINGS", "", "", "", "", "", ""], ["", "", "", "DEEP FORK HOLDINGS", "", "", "", "", "", ""],
["543149", "BAW", "BEDLAM BAR-B-Q", "LLC", "610 NORTHEAST 50TH", "OKLAHOMA CITY", "OK", "73105", "(405) 528-7427", "2015/02/23"], ["543149", "BAW", "BEDLAM BAR-B-Q", "LLC", "610 NORTHEAST 50TH", "OKLAHOMA CITY", "OK", "73105", "(405) 528-7427", "2015/02/23"]
["", "", "", "", "Page 1 of 151", "", "", "", "", ""]
] ]
data_stream_flag_size = [ data_stream_flag_size = [
["", "TABLE 125: STATE-WISE COMPOSITION OF OUTSTANDING LIABILITIES - 1997 <s>(Contd.)</s>", "", "", "", "", "", "", "", "", ""],
["", "", "", "", "(As at end-March)", "", "", "", "", "", ""],
["", "", "", "", "", "", "", "", "", "", "(<s>`</s> Billion)"],
["States", "Total", "Market", "NSSF", "WMA", "Loans", "Loans", "Loans", "Loans", "Loans", "Loans"], ["States", "Total", "Market", "NSSF", "WMA", "Loans", "Loans", "Loans", "Loans", "Loans", "Loans"],
["", "Internal", "Loans", "", "from", "from", "from", "from", "from", "from SBI", "from"], ["", "Internal", "Loans", "", "from", "from", "from", "from", "from", "from SBI", "from"],
["", "Debt", "", "", "RBI", "Banks", "LIC", "GIC", "NABARD", "& Other", "NCDC"], ["", "Debt", "", "", "RBI", "Banks", "LIC", "GIC", "NABARD", "& Other", "NCDC"],
@@ -230,14 +309,12 @@ data_stream_flag_size = [
["Uttar Pradesh", "80.62", "74.89", "-", "4.34", "1.34", "0.6", "-", "-0.21", "0.18", "0.03"], ["Uttar Pradesh", "80.62", "74.89", "-", "4.34", "1.34", "0.6", "-", "-0.21", "0.18", "0.03"],
["West Bengal", "34.23", "32.19", "-", "-", "2.04", "0.77", "-", "0.06", "-", "0.51"], ["West Bengal", "34.23", "32.19", "-", "-", "2.04", "0.77", "-", "0.06", "-", "0.51"],
["NCT Delhi", "-", "-", "-", "-", "-", "-", "-", "-", "-", "-"], ["NCT Delhi", "-", "-", "-", "-", "-", "-", "-", "-", "-", "-"],
["ALL STATES", "513.38", "436.02", "-", "25.57", "51.06", "14.18", "-", "8.21", "11.83", "11.08"], ["ALL STATES", "513.38", "436.02", "-", "25.57", "51.06", "14.18", "-", "8.21", "11.83", "11.08"]
["<s>2</s> Includes `2.45 crore outstanding under “Market Loan Suspense”.", "", "", "", "", "", "", "", "", "", ""],
["", "", "", "", "445", "", "", "", "", "", ""]
] ]
data_lattice = [ data_lattice = [
["Cycle Name", "KI (1/km)", "Distance (mi)", "Percent Fuel Savings", "", "", ""], ["Cycle \nName", "KI \n(1/km)", "Distance \n(mi)", "Percent Fuel Savings", "", "", ""],
["", "", "", "Improved Speed", "Decreased Accel", "Eliminate Stops", "Decreased Idle"], ["", "", "", "Improved \nSpeed", "Decreased \nAccel", "Eliminate \nStops", "Decreased \nIdle"],
["2012_2", "3.30", "1.3", "5.9%", "9.5%", "29.2%", "17.4%"], ["2012_2", "3.30", "1.3", "5.9%", "9.5%", "29.2%", "17.4%"],
["2145_1", "0.68", "11.2", "2.4%", "0.1%", "9.5%", "2.7%"], ["2145_1", "0.68", "11.2", "2.4%", "0.1%", "9.5%", "2.7%"],
["4234_1", "0.59", "58.7", "8.5%", "1.3%", "8.5%", "3.3%"], ["4234_1", "0.59", "58.7", "8.5%", "1.3%", "8.5%", "3.3%"],
@@ -246,7 +323,7 @@ data_lattice = [
] ]
data_lattice_table_rotated = [ data_lattice_table_rotated = [
["State", "Nutritional Assessment (No. of individuals)", "", "", "", "IYCF Practices (No. of mothers: 2011-12)", "Blood Pressure (No. of adults: 2011-12)", "", "Fasting Blood Sugar (No. of adults:2011-12)", ""], ["State", "Nutritional Assessment \n(No. of individuals)", "", "", "", "IYCF Practices \n(No. of mothers: \n2011-12)", "Blood Pressure \n(No. of adults: \n2011-12)", "", "Fasting Blood Sugar \n(No. of adults:\n2011-12)", ""],
["", "1975-79", "1988-90", "1996-97", "2011-12", "", "Men", "Women", "Men", "Women"], ["", "1975-79", "1988-90", "1996-97", "2011-12", "", "Men", "Women", "Men", "Women"],
["Kerala", "5738", "6633", "8864", "8297", "245", "2161", "3195", "1645", "2391"], ["Kerala", "5738", "6633", "8864", "8297", "245", "2161", "3195", "1645", "2391"],
["Tamil Nadu", "7387", "10217", "5813", "7851", "413", "2134", "2858", "1119", "1739"], ["Tamil Nadu", "7387", "10217", "5813", "7851", "413", "2134", "2858", "1119", "1739"],
@@ -261,10 +338,42 @@ data_lattice_table_rotated = [
["Pooled", "38742", "53618", "60601", "86898", "4459", "21918", "27041", "14312", "18519"] ["Pooled", "38742", "53618", "60601", "86898", "4459", "21918", "27041", "14312", "18519"]
] ]
data_lattice_table_area = [ data_lattice_two_tables_1 = [
["State", "n", "Literacy Status", "", "", "", "", ""],
["", "", "Illiterate", "Read & \nWrite", "1-4 std.", "5-8 std.", "9-12 std.", "College"],
["Kerala", "2400", "7.2", "0.5", "25.3", "20.1", "41.5", "5.5"],
["Tamil Nadu", "2400", "21.4", "2.3", "8.8", "35.5", "25.8", "6.2"],
["Karnataka", "2399", "37.4", "2.8", "12.5", "18.3", "23.1", "5.8"],
["Andhra Pradesh", "2400", "54.0", "1.7", "8.4", "13.2", "18.8", "3.9"],
["Maharashtra", "2400", "22.0", "0.9", "17.3", "20.3", "32.6", "7.0"],
["Gujarat", "2390", "28.6", "0.1", "14.4", "23.1", "26.9", "6.8"],
["Madhya Pradesh", "2402", "29.1", "3.4", "8.5", "35.1", "13.3", "10.6"],
["Orissa", "2405", "33.2", "1.0", "10.4", "25.7", "21.2", "8.5"],
["West Bengal", "2293", "41.7", "4.4", "13.2", "17.1", "21.2", "2.4"],
["Uttar Pradesh", "2400", "35.3", "2.1", "4.5", "23.3", "27.1", "7.6"],
["Pooled", "23889", "30.9", "1.9", "12.3", "23.2", "25.2", "6.4"]
]
data_lattice_two_tables_2 = [
["State", "n", "Literacy Status", "", "", "", "", ""],
["", "", "Illiterate", "Read & \nWrite", "1-4 std.", "5-8 std.", "9-12 std.", "College"],
["Kerala", "2400", "8.8", "0.3", "20.1", "17.0", "45.6", "8.2"],
["Tamil Nadu", "2400", "29.9", "1.5", "8.5", "33.1", "22.3", "4.8"],
["Karnataka", "2399", "47.9", "2.5", "10.2", "18.8", "18.4", "2.3"],
["Andhra Pradesh", "2400", "66.4", "0.7", "6.8", "12.9", "11.4", "1.8"],
["Maharashtra", "2400", "41.3", "0.6", "14.1", "20.1", "21.6", "2.2"],
["Gujarat", "2390", "57.6", "0.1", "10.3", "16.5", "12.9", "2.7"],
["Madhya Pradesh", "2402", "58.7", "2.2", "6.6", "24.1", "5.3", "3.0"],
["Orissa", "2405", "50.0", "0.9", "8.1", "21.9", "15.1", "4.0"],
["West Bengal", "2293", "49.1", "4.8", "11.2", "16.8", "17.1", "1.1"],
["Uttar Pradesh", "2400", "67.3", "2.0", "3.1", "17.2", "7.7", "2.7"],
["Pooled", "23889", "47.7", "1.5", "9.9", "19.9", "17.8", "3.3"]
]
data_lattice_table_areas = [
["", "", "", "", "", "", "", "", ""], ["", "", "", "", "", "", "", "", ""],
["State", "n", "Literacy Status", "", "", "", "", "", ""], ["State", "n", "Literacy Status", "", "", "", "", "", ""],
["", "", "Illiterate", "Read & Write", "1-4 std.", "5-8 std.", "9-12 std.", "College", ""], ["", "", "Illiterate", "Read & \nWrite", "1-4 std.", "5-8 std.", "9-12 std.", "College", ""],
["Kerala", "2400", "7.2", "0.5", "25.3", "20.1", "41.5", "5.5", ""], ["Kerala", "2400", "7.2", "0.5", "25.3", "20.1", "41.5", "5.5", ""],
["Tamil Nadu", "2400", "21.4", "2.3", "8.8", "35.5", "25.8", "6.2", ""], ["Tamil Nadu", "2400", "21.4", "2.3", "8.8", "35.5", "25.8", "6.2", ""],
["Karnataka", "2399", "37.4", "2.8", "12.5", "18.3", "23.1", "5.8", ""], ["Karnataka", "2399", "37.4", "2.8", "12.5", "18.3", "23.1", "5.8", ""],
@@ -280,13 +389,13 @@ data_lattice_table_area = [
] ]
data_lattice_process_background = [ data_lattice_process_background = [
["State", "Date", "Halt stations", "Halt days", "Persons directly reached(in lakh)", "Persons trained", "Persons counseled" ,"Persons testedfor HIV"], ["State", "Date", "Halt \nstations", "Halt \ndays", "Persons \ndirectly \nreached\n(in lakh)", "Persons \ntrained", "Persons \ncounseled", "Persons \ntested\nfor HIV"],
["Delhi", "1.12.2009", "8", "17", "1.29", "3,665", "2,409", "1,000"], ["Delhi", "1.12.2009", "8", "17", "1.29", "3,665", "2,409", "1,000"],
["Rajasthan", "2.12.2009 to 19.12.2009", "", "", "", "", "", ""], ["Rajasthan", "2.12.2009 to \n19.12.2009", "", "", "", "", "", ""],
["Gujarat", "20.12.2009 to 3.1.2010", "6", "13", "6.03", "3,810", "2,317", "1,453"], ["Gujarat", "20.12.2009 to \n3.1.2010", "6", "13", "6.03", "3,810", "2,317", "1,453"],
["Maharashtra", "4.01.2010 to 1.2.2010", "13", "26", "1.27", "5,680", "9,027", "4,153"], ["Maharashtra", "4.01.2010 to \n1.2.2010", "13", "26", "1.27", "5,680", "9,027", "4,153"],
["Karnataka", "2.2.2010 to 22.2.2010", "11", "19", "1.80", "5,741", "3,658", "3,183"], ["Karnataka", "2.2.2010 to \n22.2.2010", "11", "19", "1.80", "5,741", "3,658", "3,183"],
["Kerala", "23.2.2010 to 11.3.2010", "9", "17", "1.42", "3,559", "2,173", "855"], ["Kerala", "23.2.2010 to \n11.3.2010", "9", "17", "1.42", "3,559", "2,173", "855"],
["Total", "", "47", "92", "11.81", "22,455", "19,584", "10,644"] ["Total", "", "47", "92", "11.81", "22,455", "19,584", "10,644"]
] ]
@@ -330,11 +439,11 @@ data_lattice_copy_text = [
["PCCM", "San Francisco", "Family Mosaic", "25"], ["PCCM", "San Francisco", "Family Mosaic", "25"],
["PCCM", "Total PHP Enrollment", "", "853"], ["PCCM", "Total PHP Enrollment", "", "853"],
["All Models Total Enrollments", "", "", "10,132,875"], ["All Models Total Enrollments", "", "", "10,132,875"],
["Source: Data Warehouse 12/14/15", "", "", ""] ["Source: Data Warehouse \n12/14/15", "", "", ""]
] ]
data_lattice_shift_text_left_top = [ data_lattice_shift_text_left_top = [
["Investigations", "No. ofHHs", "Age/Sex/Physiological Group", "Preva-lence", "C.I*", "RelativePrecision", "Sample sizeper State"], ["Investigations", "No. of\nHHs", "Age/Sex/\nPhysiological Group", "Preva-\nlence", "C.I*", "Relative\nPrecision", "Sample size\nper State"],
["Anthropometry", "2400", "All the available individuals", "", "", "", ""], ["Anthropometry", "2400", "All the available individuals", "", "", "", ""],
["Clinical Examination", "", "", "", "", "", ""], ["Clinical Examination", "", "", "", "", "", ""],
["History of morbidity", "", "", "", "", "", ""], ["History of morbidity", "", "", "", "", "", ""],
@@ -343,12 +452,12 @@ data_lattice_shift_text_left_top = [
["", "", "Women (≥ 18 yrs)", "", "", "", "1728"], ["", "", "Women (≥ 18 yrs)", "", "", "", "1728"],
["Fasting blood glucose", "2400", "Men (≥ 18 yrs)", "5%", "95%", "20%", "1825"], ["Fasting blood glucose", "2400", "Men (≥ 18 yrs)", "5%", "95%", "20%", "1825"],
["", "", "Women (≥ 18 yrs)", "", "", "", "1825"], ["", "", "Women (≥ 18 yrs)", "", "", "", "1825"],
["Knowledge &Practices on HTN &DM", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"], ["Knowledge &\nPractices on HTN &\nDM", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"],
["", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"] ["", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"]
] ]
data_lattice_shift_text_disable = [ data_lattice_shift_text_disable = [
["Investigations", "No. ofHHs", "Age/Sex/Physiological Group", "Preva-lence", "C.I*", "RelativePrecision", "Sample sizeper State"], ["Investigations", "No. of\nHHs", "Age/Sex/\nPhysiological Group", "Preva-\nlence", "C.I*", "Relative\nPrecision", "Sample size\nper State"],
["Anthropometry", "", "", "", "", "", ""], ["Anthropometry", "", "", "", "", "", ""],
["Clinical Examination", "2400", "", "All the available individuals", "", "", ""], ["Clinical Examination", "2400", "", "All the available individuals", "", "", ""],
["History of morbidity", "", "", "", "", "", ""], ["History of morbidity", "", "", "", "", "", ""],
@@ -357,12 +466,12 @@ data_lattice_shift_text_disable = [
["Blood Pressure #", "2400", "Women (≥ 18 yrs)", "10%", "95%", "20%", "1728"], ["Blood Pressure #", "2400", "Women (≥ 18 yrs)", "10%", "95%", "20%", "1728"],
["", "", "Men (≥ 18 yrs)", "", "", "", "1825"], ["", "", "Men (≥ 18 yrs)", "", "", "", "1825"],
["Fasting blood glucose", "2400", "Women (≥ 18 yrs)", "5%", "95%", "20%", "1825"], ["Fasting blood glucose", "2400", "Women (≥ 18 yrs)", "5%", "95%", "20%", "1825"],
["Knowledge &Practices on HTN &", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"], ["Knowledge &\nPractices on HTN &", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"],
["DM", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"] ["DM", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"]
] ]
data_lattice_shift_text_right_bottom = [ data_lattice_shift_text_right_bottom = [
["Investigations", "No. ofHHs", "Age/Sex/Physiological Group", "Preva-lence", "C.I*", "RelativePrecision", "Sample sizeper State"], ["Investigations", "No. of\nHHs", "Age/Sex/\nPhysiological Group", "Preva-\nlence", "C.I*", "Relative\nPrecision", "Sample size\nper State"],
["Anthropometry", "", "", "", "", "", ""], ["Anthropometry", "", "", "", "", "", ""],
["Clinical Examination", "", "", "", "", "", ""], ["Clinical Examination", "", "", "", "", "", ""],
["History of morbidity", "2400", "", "", "", "", "All the available individuals"], ["History of morbidity", "2400", "", "", "", "", "All the available individuals"],
@@ -372,5 +481,13 @@ data_lattice_shift_text_right_bottom = [
["", "", "Men (≥ 18 yrs)", "", "", "", "1825"], ["", "", "Men (≥ 18 yrs)", "", "", "", "1825"],
["Fasting blood glucose", "2400", "Women (≥ 18 yrs)", "5%", "95%", "20%", "1825"], ["Fasting blood glucose", "2400", "Women (≥ 18 yrs)", "5%", "95%", "20%", "1825"],
["", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"], ["", "2400", "Men (≥ 18 yrs)", "-", "-", "-", "1728"],
["Knowledge &Practices on HTN &DM", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"] ["Knowledge &\nPractices on HTN &\nDM", "2400", "Women (≥ 18 yrs)", "-", "-", "-", "1728"]
]
data_arabic = [
['ً\n\xa0\nﺎﺒﺣﺮﻣ', 'ﻥﺎﻄﻠﺳ\xa0ﻲﻤﺳﺍ'],
['ﻝﺎﻤﺸﻟﺍ\xa0ﺎﻨﻴﻟﻭﺭﺎﻛ\xa0ﺔﻳﻻﻭ\xa0ﻦﻣ\xa0ﺎﻧﺍ', '؟ﺖﻧﺍ\xa0ﻦﻳﺍ\xa0ﻦﻣ'],
['1234', 'ﻂﻄﻗ\xa047\xa0ﻱﺪﻨﻋ'],
['؟ﻙﺎﺒﺷ\xa0ﺖﻧﺍ\xa0ﻞﻫ', 'ﺔﻳﺰﻴﻠﺠﻧﻻﺍ\xa0ﻲﻓ\xa0Jeremy\xa0ﻲﻤﺳﺍ'],
['Jeremy\xa0is\xa0ﻲﻣﺮﺟ\xa0in\xa0Arabic', '']
] ]

Binary files not shown (9 files; listed image sizes: 33 KiB, 8.2 KiB, 35 KiB, 33 KiB, 6.6 KiB, 13 KiB, 8.8 KiB, 18 KiB).

View File

@@ -52,6 +52,30 @@ def test_cli_stream():
assert format_error in result.output assert format_error in result.output
def test_cli_password():
    with TemporaryDirectory() as tempdir:
        infile = os.path.join(testdir, 'health_protected.pdf')
        outfile = os.path.join(tempdir, 'health_protected.csv')
        runner = CliRunner()
        result = runner.invoke(cli, ['--password', 'userpass',
                                     '--format', 'csv', '--output', outfile,
                                     'stream', infile])
        assert result.exit_code == 0
        assert result.output == 'Found 1 tables\n'

        output_error = 'file has not been decrypted'
        # no password
        result = runner.invoke(cli, ['--format', 'csv', '--output', outfile,
                                     'stream', infile])
        assert output_error in str(result.exception)
        # bad password
        result = runner.invoke(cli, ['--password', 'wrongpass',
                                     '--format', 'csv', '--output', outfile,
                                     'stream', infile])
        assert output_error in str(result.exception)
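A minimal sketch of the library-level equivalent of the new --password option (the CSV path here is hypothetical):

import camelot

# decrypt with the user password, then parse with the Stream flavor
tables = camelot.read_pdf("health_protected.pdf", password="userpass", flavor="stream")
tables[0].to_csv("health_protected.csv")  # roughly what the CLI writes for --format csv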
def test_cli_output_format(): def test_cli_output_format():
with TemporaryDirectory() as tempdir: with TemporaryDirectory() as tempdir:
infile = os.path.join(testdir, 'health.pdf') infile = os.path.join(testdir, 'health.pdf')
@@ -77,3 +101,17 @@ def test_cli_output_format():
result = runner.invoke(cli, ['--zip', '--format', 'csv', '--output', outfile.format('csv'), result = runner.invoke(cli, ['--zip', '--format', 'csv', '--output', outfile.format('csv'),
'stream', infile]) 'stream', infile])
assert result.exit_code == 0 assert result.exit_code == 0
def test_cli_quiet():
    with TemporaryDirectory() as tempdir:
        infile = os.path.join(testdir, 'blank.pdf')
        outfile = os.path.join(tempdir, 'blank.csv')
        runner = CliRunner()
        result = runner.invoke(cli, ['--format', 'csv', '--output', outfile,
                                     'stream', infile])
        assert 'No tables found on page-1' in result.output

        result = runner.invoke(cli, ['--quiet', '--format', 'csv',
                                     '--output', outfile, 'stream', infile])
        assert 'No tables found on page-1' not in result.output
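The --quiet option appears to map to the suppress_stdout keyword exercised in the error tests below; a minimal sketch under that assumption:

import camelot

# silence the "No tables found on page-1" warning for a page without tables
tables = camelot.read_pdf("blank.pdf", flavor="stream", suppress_stdout=True)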

View File

@@ -25,6 +25,17 @@ def test_parsing_report():
assert tables[0].parsing_report == parsing_report assert tables[0].parsing_report == parsing_report
def test_password():
    df = pd.DataFrame(data_stream)

    filename = os.path.join(testdir, "health_protected.pdf")
    tables = camelot.read_pdf(filename, password="ownerpass", flavor="stream")
    assert df.equals(tables[0].df)

    tables = camelot.read_pdf(filename, password="userpass", flavor="stream")
    assert df.equals(tables[0].df)
def test_stream(): def test_stream():
df = pd.DataFrame(data_stream) df = pd.DataFrame(data_stream)
@@ -45,11 +56,23 @@ def test_stream_table_rotated():
assert df.equals(tables[0].df) assert df.equals(tables[0].df)
def test_stream_table_area(): def test_stream_two_tables():
df = pd.DataFrame(data_stream_table_area) df1 = pd.DataFrame(data_stream_two_tables_1)
    df2 = pd.DataFrame(data_stream_two_tables_2)
    filename = os.path.join(testdir, "tabula/12s0324.pdf")
    tables = camelot.read_pdf(filename, flavor='stream')
    assert len(tables) == 2
    assert df1.equals(tables[0].df)
    assert df2.equals(tables[1].df)


def test_stream_table_areas():
    df = pd.DataFrame(data_stream_table_areas)
filename = os.path.join(testdir, "tabula/us-007.pdf") filename = os.path.join(testdir, "tabula/us-007.pdf")
tables = camelot.read_pdf(filename, flavor="stream", table_area=["320,500,573,335"]) tables = camelot.read_pdf(filename, flavor="stream", table_areas=["320,500,573,335"])
assert df.equals(tables[0].df) assert df.equals(tables[0].df)
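The table_areas strings have the form "x1,y1,x2,y2", where (x1, y1) is the top-left and (x2, y2) the bottom-right of the region in PDF coordinate space; a minimal sketch using the same fixture:

import camelot

# restrict Stream parsing to one region instead of detecting areas automatically
tables = camelot.read_pdf("tabula/us-007.pdf", flavor="stream",
                          table_areas=["320,500,573,335"])
print(tables[0].parsing_report)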
@@ -100,11 +123,22 @@ def test_lattice_table_rotated():
assert df.equals(tables[0].df) assert df.equals(tables[0].df)
def test_lattice_table_area(): def test_lattice_two_tables():
df = pd.DataFrame(data_lattice_table_area) df1 = pd.DataFrame(data_lattice_two_tables_1)
    df2 = pd.DataFrame(data_lattice_two_tables_2)
filename = os.path.join(testdir, "twotables_2.pdf") filename = os.path.join(testdir, "twotables_2.pdf")
tables = camelot.read_pdf(filename, table_area=["80,693,535,448"]) tables = camelot.read_pdf(filename)
    assert len(tables) == 2
    assert df1.equals(tables[0].df)
    assert df2.equals(tables[1].df)


def test_lattice_table_areas():
    df = pd.DataFrame(data_lattice_table_areas)
    filename = os.path.join(testdir, "twotables_2.pdf")
    tables = camelot.read_pdf(filename, table_areas=["80,693,535,448"])
assert df.equals(tables[0].df) assert df.equals(tables[0].df)
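A minimal sketch of exporting a multi-table result in one call, assuming TableList.export and a hypothetical output name:

import camelot

tables = camelot.read_pdf("twotables_2.pdf")  # lattice is the default flavor
print(len(tables))                            # 2 tables detected on the page
tables.export("twotables.csv", f="csv")       # should write one CSV per table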
@@ -146,3 +180,11 @@ def test_repr():
assert repr(tables) == "<TableList n=1>" assert repr(tables) == "<TableList n=1>"
assert repr(tables[0]) == "<Table shape=(7, 7)>" assert repr(tables[0]) == "<Table shape=(7, 7)>"
assert repr(tables[0].cells[0][0]) == "<Cell x1=120.48 y1=218.42 x2=164.64 y2=233.89>" assert repr(tables[0].cells[0][0]) == "<Cell x1=120.48 y1=218.42 x2=164.64 y2=233.89>"
def test_arabic():
    df = pd.DataFrame(data_arabic)
    filename = os.path.join(testdir, "tabula/arabic.pdf")
    tables = camelot.read_pdf(filename)
    assert df.equals(tables[0].df)

View File

@@ -34,20 +34,70 @@ def test_unsupported_format():
def test_stream_equal_length(): def test_stream_equal_length():
message = ("Length of table_area and columns" message = ("Length of table_areas and columns"
" should be equal") " should be equal")
with pytest.raises(ValueError, message=message): with pytest.raises(ValueError, message=message):
tables = camelot.read_pdf(filename, flavor='stream', tables = camelot.read_pdf(filename, flavor='stream',
table_area=['10,20,30,40'], columns=['10,20,30,40', '10,20,30,40']) table_areas=['10,20,30,40'], columns=['10,20,30,40', '10,20,30,40'])
def test_no_tables_found(): def test_no_tables_found():
filename = os.path.join(testdir, 'blank.pdf') filename = os.path.join(testdir, 'blank.pdf')
# TODO: use pytest.warns
with warnings.catch_warnings(): with warnings.catch_warnings():
warnings.simplefilter('error') warnings.simplefilter('error')
try: with pytest.raises(UserWarning) as e:
tables = camelot.read_pdf(filename) tables = camelot.read_pdf(filename)
except Exception as e: assert str(e.value) == 'No tables found on page-1'
assert type(e).__name__ == 'UserWarning'
assert str(e) == 'No tables found on page-1'
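The removed TODO pointed at pytest.warns; a minimal sketch of that alternative, shown only for comparison:

import pytest
import camelot

def test_no_tables_found_warns():
    # hypothetical variant: collect the warning instead of escalating it to an error
    with pytest.warns(UserWarning) as record:
        camelot.read_pdf("blank.pdf")  # path shortened for the sketch
    assert "No tables found on page-1" in str(record[0].message)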
def test_no_tables_found_logs_suppressed():
    filename = os.path.join(testdir, 'foo.pdf')
    with warnings.catch_warnings():
        # the test should fail if any warning is thrown
        warnings.simplefilter('error')
        try:
            tables = camelot.read_pdf(filename, suppress_stdout=True)
        except Warning as e:
            warning_text = str(e)
            pytest.fail('Unexpected warning: {}'.format(warning_text))


def test_no_tables_found_warnings_suppressed():
    filename = os.path.join(testdir, 'blank.pdf')
    with warnings.catch_warnings():
        # the test should fail if any warning is thrown
        warnings.simplefilter('error')
        try:
            tables = camelot.read_pdf(filename, suppress_stdout=True)
        except Warning as e:
            warning_text = str(e)
            pytest.fail('Unexpected warning: {}'.format(warning_text))


def test_ghostscript_not_found(monkeypatch):
    import distutils

    def _find_executable_patch(arg):
        return ''

    monkeypatch.setattr(distutils.spawn, 'find_executable', _find_executable_patch)

    message = ('Please make sure that Ghostscript is installed and available'
               ' on the PATH environment variable')
    filename = os.path.join(testdir, 'foo.pdf')
    with pytest.raises(Exception, message=message):
        tables = camelot.read_pdf(filename)


def test_no_password():
    filename = os.path.join(testdir, 'health_protected.pdf')
    message = 'file has not been decrypted'
    with pytest.raises(Exception, message=message):
        tables = camelot.read_pdf(filename)


def test_bad_password():
    filename = os.path.join(testdir, 'health_protected.pdf')
    message = 'file has not been decrypted'
    with pytest.raises(Exception, message=message):
        tables = camelot.read_pdf(filename, password='wrongpass')
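A minimal sketch of handling the same decryption error outside the test suite, reusing the message and password asserted above:

import camelot

try:
    tables = camelot.read_pdf("health_protected.pdf")  # no password supplied
except Exception as err:
    if "file has not been decrypted" in str(err):
        tables = camelot.read_pdf("health_protected.pdf", password="userpass")
    else:
        raise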

View File

@@ -0,0 +1,67 @@
# -*- coding: utf-8 -*-

import os

import pytest

import camelot


testdir = os.path.dirname(os.path.abspath(__file__))
testdir = os.path.join(testdir, "files")


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_text_plot():
    filename = os.path.join(testdir, "foo.pdf")
    tables = camelot.read_pdf(filename)
    return camelot.plot(tables[0], kind='text')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_grid_plot():
    filename = os.path.join(testdir, "foo.pdf")
    tables = camelot.read_pdf(filename)
    return camelot.plot(tables[0], kind='grid')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_lattice_contour_plot():
    filename = os.path.join(testdir, "foo.pdf")
    tables = camelot.read_pdf(filename)
    return camelot.plot(tables[0], kind='contour')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_stream_contour_plot():
    filename = os.path.join(testdir, "tabula/12s0324.pdf")
    tables = camelot.read_pdf(filename, flavor='stream')
    return camelot.plot(tables[0], kind='contour')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_line_plot():
    filename = os.path.join(testdir, "foo.pdf")
    tables = camelot.read_pdf(filename)
    return camelot.plot(tables[0], kind='line')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_joint_plot():
    filename = os.path.join(testdir, "foo.pdf")
    tables = camelot.read_pdf(filename)
    return camelot.plot(tables[0], kind='joint')


@pytest.mark.mpl_image_compare(
    baseline_dir="files/baseline_plots", remove_text=True)
def test_textedge_plot():
    filename = os.path.join(testdir, "tabula/12s0324.pdf")
    tables = camelot.read_pdf(filename, flavor='stream')
    return camelot.plot(tables[0], kind='textedge')
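Outside pytest-mpl, the returned object can be saved like any Matplotlib figure; a minimal sketch assuming camelot.plot returns a Figure (which is what mpl_image_compare relies on) and a hypothetical output name:

import camelot

tables = camelot.read_pdf("tabula/12s0324.pdf", flavor="stream")
fig = camelot.plot(tables[0], kind="textedge")
fig.savefig("textedge.png")  # persist the plot that the baseline images capture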