Structure of additional metadata for different file types

For all files uploaded to or downloaded from the cloud

  1. bucket_name: str (required field) - name of the bucket in which the file is located. Included when a file from cloud storage is analyzed (example: "dedoc").
  2. cloud_file_path: str (required field) - absolute path to the file inside the bucket "bucket_name" in the cloud.
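
The two cloud fields above can be illustrated with a minimal Python sketch. The names CloudMetadata and validate_cloud_metadata are hypothetical helpers introduced only for this example, and the path value is made up; this is not necessarily how the library itself stores or checks these fields.

```python
from typing import TypedDict


class CloudMetadata(TypedDict):
    """Hypothetical container for the two cloud-related fields."""
    bucket_name: str      # bucket in which the file is located, e.g. "dedoc"
    cloud_file_path: str  # absolute path to the file inside the bucket


def validate_cloud_metadata(metadata: dict) -> CloudMetadata:
    """Check that both required fields are present and are strings."""
    for field in ("bucket_name", "cloud_file_path"):
        if field not in metadata:
            raise KeyError(f"missing required field: {field}")
        if not isinstance(metadata[field], str):
            raise TypeError(f"{field} must be str")
    return CloudMetadata(
        bucket_name=metadata["bucket_name"],
        cloud_file_path=metadata["cloud_file_path"],
    )


# Illustrative values only; the path is invented for the example.
example = validate_cloud_metadata(
    {"bucket_name": "dedoc", "cloud_file_path": "/reports/report.docx"}
)
```

Because both fields are marked as required, the sketch raises an error when either one is missing instead of silently filling in a default.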

Docx/doc/odt files

  1. document_subject: str (optional field) - the topic of the document's content.
  2. keywords: str (optional field) - a delimited set of keywords to support searching and indexing.
  3. category: str (optional field) - a categorization of the content of this document. Example values for this property might include: Resume, Letter, Financial Forecast, Proposal, Technical Presentation, and so on.
  4. author: str (optional field) - an entity primarily responsible for making the content of the document.
  5. last_modified_by: str (optional field) - the user who performed the last modification. The identification is environment-specific. Examples include a name, email address, or employee ID.
  6. created_date: str (optional field) - date of creation of the resource.
  7. modified_date: str (optional field) - date on which the resource was changed.
  8. last_printed_date: str (optional field) - the date and time of the last printing.
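
Since every docx/doc/odt field listed above is optional, one way to model them is a dataclass whose fields all default to None. The sketch below is an illustration under that assumption: DocxMetadata and present_fields are hypothetical names, and the date string format shown is only an example, since the list above specifies the type (str) but not the format.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DocxMetadata:
    """Hypothetical model of the optional docx/doc/odt metadata fields."""
    document_subject: Optional[str] = None
    keywords: Optional[str] = None
    category: Optional[str] = None
    author: Optional[str] = None
    last_modified_by: Optional[str] = None
    created_date: Optional[str] = None
    modified_date: Optional[str] = None
    last_printed_date: Optional[str] = None


def present_fields(meta: DocxMetadata) -> dict:
    """Return only the fields that were actually found in the document."""
    return {name: value for name, value in asdict(meta).items() if value is not None}


# Illustrative values; the date format here is an assumption.
meta = DocxMetadata(author="Jane Doe", category="Letter", created_date="2020-01-15")
fields = present_fields(meta)
```

Dropping the None values keeps the resulting dictionary limited to the properties the document actually carries, which matches the "optional field" wording in the list.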