Binary file modified .github/layout-parser.png
128 changes: 90 additions & 38 deletions README.md
@@ -1,68 +1,120 @@
<p align="center">
<img src="https://github.com/Layout-Parser/layout-parser/raw/master/.github/layout-parser.png" alt="Layout Parser Logo" width="35%">
<p align="center">
<h2 align="center">
A unified toolkit for Deep Learning Based Document Image Analysis
</p>
</h2>
</p>

<p align=center>
<a href="https://arxiv.org/abs/2103.15348"><img src="https://img.shields.io/badge/arXiv-2103.15348-b31b1b.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.github.io"><img src="https://img.shields.io/badge/website-layout--parser.github.io-informational.svg" title="Layout Parser Website"></a>
<a href="https://layout-parser.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/doc-layout--parser.readthedocs.io-light.svg" title="Layout Parser Documentation"></a>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/v/layoutparser?color=%23099cec&label=PyPI%20package&logo=pypi&logoColor=white" title="The current version of Layout Parser"></a>
<a href="https://github.com/Layout-Parser/layout-parser/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/layoutparser" title="Layout Parser uses Apache 2 License"></a>
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/layoutparser">
</p>

<p align=center>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/v/layoutparser?color=%23099cec&label=PyPI%20package&logo=pypi&logoColor=white" title="The current version of Layout Parser"></a>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/pyversions/layoutparser?color=%23099cec&" alt="Python 3.6 3.7 3.8" title="Layout Parser supports Python 3.6 and above"></a>
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/layoutparser">
<a href="https://github.com/Layout-Parser/layout-parser/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/layoutparser" title="Layout Parser uses Apache 2 License"></a>
<a href="https://arxiv.org/abs/2103.15348"><img src="https://img.shields.io/badge/paper-2103.15348-b31b1b.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.github.io"><img src="https://img.shields.io/badge/website-layout--parser.github.io-informational.svg" title="Layout Parser Website"></a>
<a href="https://layout-parser.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/doc-layout--parser.readthedocs.io-light.svg" title="Layout Parser Documentation"></a>
</p>

---

## Installation
## What is LayoutParser

![Example Usage](.github/example.png)

You can find detailed installation instructions in [installation.md](installation.md). But generally, it's just `pip install`
some libraries:
LayoutParser aims to provide a wide range of tools that streamline Document Image Analysis (DIA) tasks. Please check the LayoutParser [demo video](https://youtu.be/8yA5xB4Dg8c) (1 min) or [full talk](https://www.youtube.com/watch?v=YG0qepPgyGY) (15 min) for details. Here are some key features:

- LayoutParser provides a rich repository of deep learning models for layout detection as well as a set of unified APIs for using them. For example,

<details>
<summary>Perform DL layout detection in 4 lines of code</summary>

```python
import layoutparser as lp
model = lp.AutoLayoutModel('lp://EfficientDet/PubLayNet')
# image = Image.open("path/to/image")
layout = model.detect(image)
```

</details>

- LayoutParser comes with a set of layout data structures with carefully designed APIs that are optimized for document image analysis tasks. For example,

<details>
<summary>Selecting layout/textual elements in the left column of a page</summary>

```python
image_width = image.size[0]
left_column = lp.Interval(0, image_width/2, axis='x')
layout.filter_by(left_column, center=True) # select objects in the left column
```

</details>
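
To make the center-based selection above more concrete, here is a minimal pure-Python sketch of the idea. It does not call layoutparser: the `Box` tuple and both helper functions are hypothetical names used only for illustration, and it assumes that `center=True` means "test the element's center point against the interval" rather than requiring the whole box to fit inside it.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box (not a layoutparser type)."""
    x_1: float
    y_1: float
    x_2: float
    y_2: float


def center_in_x_interval(box: Box, start: float, end: float) -> bool:
    """Return True when the box's horizontal center lies in [start, end]."""
    center_x = (box.x_1 + box.x_2) / 2
    return start <= center_x <= end


def select_left_column(boxes: List[Box], image_width: float) -> List[Box]:
    """Keep the boxes whose center falls in the left half of the page."""
    return [b for b in boxes if center_in_x_interval(b, 0, image_width / 2)]


# Illustrative data: a 200-pixel-wide page with three boxes.
boxes = [Box(10, 10, 90, 40), Box(110, 10, 190, 40), Box(80, 50, 130, 80)]
left = select_left_column(boxes, image_width=200)
```

Under this assumption, a box that straddles the column boundary is assigned to whichever side contains its center, which is usually what you want for multi-column pages with slightly overhanging regions.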

<details>
<summary>Performing OCR for each detected Layout Region</summary>

```python
ocr_agent = lp.TesseractAgent()
for layout_region in layout:
    # Crop the page image to the detected region, then run OCR on the crop
    image_segment = layout_region.crop_image(image)
    text = ocr_agent.detect(image_segment)
```

</details>

<details>
<summary>Flexible APIs for visualizing the detected layouts</summary>

```python
lp.draw_box(image, layout, box_width=1, show_element_id=True, box_alpha=0.25)
```

</details>

<details>
<summary>Loading layout data stored in json, csv, and even PDFs</summary>

```python
layout = lp.load_json("path/to/json")
layout = lp.load_csv("path/to/csv")
pdf_layout = lp.load_pdf("path/to/pdf")
```

</details>

- LayoutParser is also an open platform that enables the sharing of layout detection models and DIA pipelines among the community.
<details>
<summary><a href="https://layout-parser.github.io/platform/">Check</a> the LayoutParser open platform</summary>
</details>

<details>
<summary><a href="https://github.com/Layout-Parser/platform">Submit</a> your models/pipelines to LayoutParser</summary>
</details>

```bash
pip install -U layoutparser
## Installation

# Install Detectron2 for using DL Layout Detection Model
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version.
pip install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2'
After several major updates, layoutparser provides various functionalities and deep learning models from different backends. But it is still easy to install layoutparser, and we designed the installation method so that you can choose to install only the dependencies your project needs:

# Install the ocr components when necessary
pip install layoutparser[ocr]
```bash
pip install layoutparser # Install the base layoutparser library with
pip install layoutparser[layoutmodels] # Install DL layout model toolkit
pip install layoutparser[ocr] # Install OCR toolkit
```

**For Windows Users:** Please read [installation.md](installation.md) for details about installing Detectron2.
Please check [installation.md](installation.md) for additional details on layoutparser installation.

## Quick Start
## Examples

We provide a series of examples to help you start using the layout parser library:

1. [Table OCR and Results Parsing](https://github.com/Layout-Parser/layout-parser/blob/master/examples/OCR%20Tables%20and%20Parse%20the%20Output.ipynb): `layoutparser` can be used to conveniently OCR documents and convert the output into structured data.

2. [Deep Layout Parsing Example](https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb): With the help of Deep Learning, `layoutparser` supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.


## DL Assisted Layout Prediction Example

![Example Usage](.github/example.png)

*The images shown in the figure above are: a screenshot of [this paper](https://arxiv.org/abs/2004.08686), an image from the [PRIMA Layout Analysis Dataset](https://www.primaresearch.org/dataset/), a screenshot of the [WSJ website](http://wsj.com), and an image from the [HJDataset](https://dell-research-harvard.github.io/HJDataset/).*

With only 4 lines of code in `layoutparser`, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the [ModelZoo](https://github.com/Layout-Parser/layout-parser/blob/master/docs/notes/modelzoo.md) or load a model that you trained on your own, then use the following code to predict the layout and visualize it:

```python
>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations
```

## Contributing

We encourage you to contribute to Layout Parser! Please check out the [Contributing guidelines](.github/CONTRIBUTING.md) for details on how to proceed. Join us!
37 changes: 22 additions & 15 deletions installation.md
@@ -2,32 +2,39 @@

## Install Python

Layout Parser is a Python package that requires Python >= 3.6. If you do not have Python installed on your computer, you might want to turn to [the official instruction](https://www.python.org/downloads/) to download and install the appropriate version of Python.
LayoutParser is a Python package that requires Python >= 3.6. If you do not have Python installed on your computer, you might want to turn to [the official instruction](https://www.python.org/downloads/) to download and install the appropriate version of Python.

## Install the Layout Parser main library

Installing the Layout Parser library is very straightforward: you just need to run the following command:

```bash
pip3 install -U layoutparser
```
## Install the LayoutParser library

After several major updates, LayoutParser provides various functionalities and deep learning models from different backends. However, you might only need a fraction of these functions, and installing every dependency would be unnecessary when most of them are not required. Therefore, we designed highly customizable ways of installing the LayoutParser library:

## [Optional] Install Detectron2 for Using Layout Models

### For Mac OS and Linux Users
| Command | Description |
| --- | --- |
| `pip install layoutparser` | **Install the base LayoutParser Library**<br>It will support all key functions in LayoutParser, including:<br />1. Layout Data Structure and operations<br />2. Layout Visualization <br />3. Load/export the layout data |
| `pip install layoutparser[effdet]` | **Install LayoutParser with Layout Detection Model Support**<br />It will install the LayoutParser base library as well as<br />supporting dependencies for the ***EfficientDet***-based layout detection models. |
| `pip install torch && pip install layoutparser[detectron2]` | **Install LayoutParser with Layout Detection Model Support**<br />It will install the LayoutParser base library as well as<br />supporting dependencies for the ***Detectron2***-based layout detection models. See details in [Additional Instruction: Install Detectron2 Layout Model Backend](#additional-instruction-install-detectron2-layout-model-backend). |
| `pip install layoutparser[paddledetection]` | **Install LayoutParser with Layout Detection Model Support**<br />It will install the LayoutParser base library as well as<br />supporting dependencies for the ***PaddleDetection***-based layout detection models. |
| `pip install layoutparser[ocr]` | **Install LayoutParser with OCR Support**<br />It will install the LayoutParser base library as well as<br />supporting dependencies for performing OCRs. See details in [Additional Instruction: Install OCR utils](#additional-instruction-install-ocr-utils). |
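
After choosing one of the commands above, it can be handy to confirm which optional backends actually ended up in your environment. The following is a minimal sketch using only the standard library; the mapping from each extra to an importable module name is an assumption made for illustration (for example, that the `ocr` extra provides `pytesseract`), so adjust it to match what you installed.

```python
from importlib.util import find_spec
from typing import Dict

# Assumed mapping from each install option to one module it should provide.
# These module names are illustrative guesses, not taken from layoutparser.
ASSUMED_BACKEND_MODULES = {
    "base": "layoutparser",
    "effdet": "effdet",
    "detectron2": "detectron2",
    "paddledetection": "paddle",
    "ocr": "pytesseract",
}


def available_backends(modules: Dict[str, str] = ASSUMED_BACKEND_MODULES) -> Dict[str, bool]:
    """Report, for each option, whether its module can be found for import."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, installed in available_backends().items():
        print(f"{extra:16s} {'installed' if installed else 'missing'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even when a heavy backend is present.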

If you would like to use deep learning models for layout detection, you also need to install Detectron2 on your computer. This could be done by running the following command:
### Additional Instruction: Install Detectron2 Layout Model Backend

#### For Mac OS and Linux Users

If you would like to use the Detectron2 models for layout detection, you might need to run the following command:

```bash
pip3 install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2'
pip install torch && pip install layoutparser[detectron2]
```

This might take some time as the command will *compile* the library. You might also want to install a Detectron2 version
with GPU support or encounter some issues during the installation process. Please refer to the official Detectron2
This might take some time as the command will *compile* the library. If you also want to install a Detectron2 version
with GPU support or encounter some issues during the installation process, please refer to the official Detectron2
[installation instruction](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md) for detailed
information.

### For Windows users
#### For Windows users

As reported by many users, the installation of Detectron2 can be rather tricky on Windows platforms. In our extensive tests, we find that it is nearly impossible to provide a one-line installation command for Windows users. As a workaround solution, for now we list the possible challenges for installing Detectron2 on Windows, and attach helpful resources for solving them. We are also investigating other possibilities to avoid installing Detectron2 to use pre-trained models. If you have any suggestions or ideas, please feel free to [submit an issue](https://github.com/Layout-Parser/layout-parser/issues) in our repo.

@@ -39,12 +46,12 @@ As reported by many users, the installation of Detectron2 can be rather tricky o
- `Detectron2` maintainers claim that they won't provide official support for Windows (see [1](https://github.com/facebookresearch/detectron2/issues/9#issuecomment-540974288) and [2](https://detectron2.readthedocs.io/en/latest/tutorials/install.html)), but Detectron2 is continuously built on windows with CircleCI (see [3](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues)). Hopefully this situation will be improved in the future.


## [Optional] Install OCR utils
### Additional Instruction: Install OCR utils

Layout Parser also comes with support for OCR functions. To use them, you need to install the OCR utils via:

```bash
pip3 install -U layoutparser[ocr]
pip install layoutparser[ocr]
```

Additionally, if you want to use the Tesseract-OCR engine, you also need to install it on your computer. Please check the