How to parse documents

Upload the file here and look at the result. 

Build and Run

DeDoc service start commands:

docker build . -t dedoc_container
docker run -p 1231:1231 --rm dedoc_container:latest python3 /dedoc/main.py
-> The service should become available on port 1231

Service parameters are configured in the config file (dedoc_project/dedoc/config.py)

The config is a Python file, so it can contain any standard Python expression. For example, the maximum file size can be computed as 512 * 1024 * 1024.
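To illustrate computing a value inside the config, here is a minimal sketch. The names `MEGABYTE`, `config`, `max_file_size` and `port` are hypothetical and are not taken from the actual dedoc/config.py; only the 512 * 1024 * 1024 expression and the port number come from the text above.

```python
# Hypothetical excerpt of a Python config file. The variable and key names
# are illustrative assumptions, not the real contents of dedoc/config.py.
MEGABYTE = 1024 * 1024

config = dict(
    # Maximum accepted file size, computed rather than hard-coded (512 MB).
    max_file_size=512 * MEGABYTE,
    # Port used in the docker run command above (assumed to be configurable).
    port=1231,
)
```

Writing the limit as an expression keeps the unit visible, which is easier to review than a literal byte count.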

How to use

Send the file in a POST request to host:1231/upload

The name of the uploaded file must be present in the form
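The upload described above might look like the following sketch, which uses only the Python standard library. The form field name "file", the base URL http://localhost:1231 and the helper names `build_upload_request` and `upload` are assumptions made for illustration; check them against your deployment before relying on them.

```python
import uuid
import urllib.request


def build_upload_request(filename: str, content: bytes,
                         base_url: str = "http://localhost:1231") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request to /upload."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        # The field name "file" is an assumption; the file name travels in the form.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(f"{base_url}/upload", data=body,
                                  headers=headers, method="POST")


def upload(filename: str, content: bytes) -> str:
    """Send the request to a running service and return the response text."""
    request = build_upload_request(filename, content)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the container running, `upload("example.txt", b"Hello")` would return the service response as text.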

Additional query options:

  1. language: string - document recognition language. The default value is "rus+eng". Available values: "rus+eng", "rus", "eng".
  2. with_attachments: boolean - enables analysis of attached files. The option is False by default. Available values: True, False.
  3. insert_table: boolean - enables embedding tables in the document tree. The option is False by default. Available values: True, False.
  4. return_format: string - format of the response. The default value is json. Available values: json, pretty_json, html, tree. Use the pretty_json, tree and html formats for debugging only.
    Warning: the html format is intended only for viewing the recognition result in a readable form. For further analysis, we recommend the json output format.
  5. structure_type: string - type of the output structure. Available values: 'linear', 'tree'.
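The options listed above can be checked before a request is sent. The sketch below assumes they are passed as URL query parameters and that booleans are serialized as "True" or "False"; the helper name `build_upload_url` and the base URL are hypothetical.

```python
from urllib.parse import urlencode

# Allowed values copied from the option list above.
ALLOWED_VALUES = {
    "language": {"rus+eng", "rus", "eng"},
    "with_attachments": {"True", "False"},
    "insert_table": {"True", "False"},
    "return_format": {"json", "pretty_json", "html", "tree"},
    "structure_type": {"linear", "tree"},
}


def build_upload_url(base_url: str = "http://localhost:1231", **options) -> str:
    """Validate the options and return the /upload URL with a query string."""
    params = {}
    for name, value in options.items():
        if name not in ALLOWED_VALUES:
            raise ValueError(f"unknown option: {name}")
        text = str(value)  # booleans become "True" / "False"
        if text not in ALLOWED_VALUES[name]:
            raise ValueError(f"invalid value for {name}: {text}")
        params[name] = text
    query = urlencode(params)  # percent-encodes "+" in "rus+eng"
    return f"{base_url}/upload?{query}" if query else f"{base_url}/upload"
```

For example, `build_upload_url(language="eng", with_attachments=True)` produces a URL ending in `/upload?language=eng&with_attachments=True`.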

Other useful links