paperap.models.document.metadata.model module
Document metadata models for Paperless-NgX.
This module provides models for representing document metadata in Paperless-NgX, including file information, checksums, and document properties. These models are used to access and manipulate metadata associated with documents stored in the Paperless-NgX system.
- class paperap.models.document.metadata.model.MetadataElement(**data)
Bases: BaseModel
Represents a key-value pair of document metadata in Paperless-NgX.
This model represents individual metadata elements extracted from document files, such as author, creation date, or other file-specific properties. Each element consists of a key and its corresponding value.
- key
The metadata field name or identifier.
- value
The value associated with the metadata field.
Examples
>>> metadata = MetadataElement(key="Author", value="John Doe")
>>> print(f"{metadata.key}: {metadata.value}")
Author: John Doe
- Parameters:
data (Any)
- key: str
- value: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
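Because extracted metadata arrives as a list of key-value elements, it can be convenient to collapse that list into a dictionary for lookups. The following is a minimal sketch, not part of paperap: `Element` is a stand-in dataclass that mirrors only the two documented fields (`key` and `value`) so the example runs without the library installed, and `elements_to_dict` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Stand-in for MetadataElement: the same two str fields, nothing else."""
    key: str
    value: str


def elements_to_dict(elements):
    """Collapse key/value elements into a dict; a later duplicate key wins."""
    return {element.key: element.value for element in elements}


elements = [
    Element(key="Author", value="John Doe"),
    Element(key="Producer", value="ExampleWriter 1.0"),
]
lookup = elements_to_dict(elements)
print(lookup["Author"])  # John Doe
```

With real `MetadataElement` instances the same comprehension should apply, since it relies only on the `key` and `value` attributes. If a file repeats a key, the dictionary keeps only the last value, so keep the list when duplicates matter.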
- class paperap.models.document.metadata.model.DocumentMetadata(**data)
Bases: StandardModel
Represents comprehensive metadata for a Paperless-NgX document.
This model encapsulates all metadata associated with a document in Paperless-NgX, including information about both the original document and its archived version (if available). It provides access to file properties such as checksums, sizes, MIME types, and extracted metadata elements.
The metadata is primarily read-only as it is generated by the Paperless-NgX system during document processing.
- original_checksum
The SHA256 checksum of the original document file.
- original_size
The size of the original document in bytes.
- original_mime_type
The MIME type of the original document (e.g., “application/pdf”).
- media_filename
The filename of the document in the Paperless-NgX media storage.
- has_archive_version
Whether the document has an archived version (typically a PDF/A).
- original_metadata
List of metadata elements extracted from the original document.
- archive_checksum
The SHA256 checksum of the archived document version.
- archive_media_filename
The filename of the archived version in media storage.
- original_filename
The original filename of the document when it was uploaded.
- lang
The detected language code of the document content.
- archive_size
The size of the archived document version in bytes.
- archive_metadata
List of metadata elements extracted from the archived version.
Examples
>>> # Access document metadata
>>> metadata = client.documents.get(123).metadata
>>> print(f"Original file: {metadata.original_filename}")
>>> print(f"Size: {metadata.original_size} bytes")
>>> print(f"MIME type: {metadata.original_mime_type}")
>>>
>>> # Iterate through extracted metadata elements
>>> for element in metadata.original_metadata:
...     print(f"{element.key}: {element.value}")
- Parameters:
data (Any)
- original_checksum: str | None
- original_size: int | None
- original_mime_type: str | None
- media_filename: str | None
- has_archive_version: bool | None
- original_metadata: list[MetadataElement]
- archive_checksum: str | None
- archive_media_filename: str | None
- original_filename: str | None
- lang: str | None
- archive_size: int | None
- archive_metadata: list[MetadataElement]
- class Meta(model)
Bases: Meta
Metadata configuration for the DocumentMetadata model.
This class defines metadata properties for the DocumentMetadata model, particularly specifying which fields are read-only.
- Parameters:
model (type[_Self])
- read_only_fields: ClassVar[set[str]] = {'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model: type[_Self]
- name: str
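The `supported_filtering_params` set above lists the only filter names accepted for this model. How paperap enforces that set is not shown here, so the following is only an illustrative sketch of checking filter keywords against it with plain set arithmetic. `unsupported_filters` is a hypothetical helper, and the set is copied by hand from the listing rather than imported.

```python
# Copied from the Meta listing above; an assumption that it stays in sync.
SUPPORTED_FILTERING_PARAMS = {"id", "id__in", "limit"}


def unsupported_filters(filters, supported=SUPPORTED_FILTERING_PARAMS):
    """Return the filter names that are not in the supported set, sorted."""
    return sorted(set(filters) - set(supported))


print(unsupported_filters({"id__in": [1, 2], "limit": 10}))  # []
print(unsupported_filters({"lang": "en", "id": 5}))          # ['lang']
```

A check like this can flag a mistyped or unsupported filter name before a request is built. Note that `lang` appears in `filtering_fields` but not in `supported_filtering_params`, which is why the second call reports it.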
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_post_init(context: Any, /) -> None
We need to both initialize private attributes and call the user-defined model_post_init method.
- id: int
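The checksum fields can be used to confirm that a locally downloaded file matches what the server reports. The sketch below makes two assumptions: the file is already on disk, and the checksum is a hex digest. `file_digest` and `matches_checksum` are hypothetical helpers, not part of paperap. The algorithm is passed explicitly and defaults to `sha256`, following the attribute descriptions above, so it can be changed if a server reports a different digest.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large documents are not read into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected, algorithm="sha256"):
    """Return True only when a checksum is present and equals the file's digest."""
    if expected is None:  # the documented fields are typed `str | None`
        return False
    return file_digest(path, algorithm) == expected.lower()


# Demo with a throwaway file standing in for a downloaded original.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example document bytes")
    demo_path = tmp.name

try:
    reported = hashlib.sha256(b"example document bytes").hexdigest()
    verified = matches_checksum(demo_path, reported)
    missing = matches_checksum(demo_path, None)
    tampered = matches_checksum(demo_path, "0" * 64)
finally:
    os.remove(demo_path)

print(verified, missing, tampered)  # True False False
```

In practice `expected` would come from `metadata.original_checksum` or `metadata.archive_checksum`. Both may be `None`, which the helper treats as a failed match rather than raising an error.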