paperap.models.document.model module
Provide document models for interacting with Paperless-ngx documents.
This module contains the Document and DocumentNote models, which represent documents and their associated notes in the Paperless-ngx system. These models provide methods for retrieving, updating, and managing document metadata, content, and relationships with other entities like tags, correspondents, and custom fields.
- class paperap.models.document.model.DocumentNote(**data)[source]
Bases:
StandardModel
Represent a note on a Paperless-ngx document.
This class models user-created notes that can be attached to documents in the Paperless-ngx system. Notes include information about when they were created, who created them, and their content.
- deleted_at
Timestamp when the note was deleted, or None if not deleted.
- Type:
datetime | None
- restored_at
Timestamp when the note was restored after deletion, or None.
- Type:
datetime | None
- transaction_id
ID of the transaction that created or modified this note.
- Type:
int | None
- created
Timestamp when the note was created.
- Type:
datetime
Examples
>>> note = client.document_notes().get(1)
>>> print(note.note)
This is an important document
>>> print(note.created)
2023-01-15 14:30:22
- Parameters:
data (
Any
)
- deleted_at: datetime | None
- restored_at: datetime | None
- transaction_id: int | None
- note: str
- created: datetime
- document: int
- user: int
- class Meta(model)[source]
Bases:
Meta
- Parameters:
model (type[_Self])
- read_only_fields: ClassVar[set[str]] = {'created', 'deleted_at', 'id', 'restored_at', 'transaction_id'}
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'created', 'deleted_at', 'document', 'id', 'note', 'restored_at', 'transaction_id', 'user'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model: type[_Self]
- name: str
- serialize_datetime(value)[source]
Serialize datetime fields to ISO format.
Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.
- Parameters:
value (
datetime | None
) – The datetime value to serialize.
- Returns:
The serialized datetime value as an ISO 8601 string, or None if the value is None.
- Return type:
str | None
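The serializer's described behavior (ISO 8601 string, or None passed through) can be sketched with the standard library alone. The function below is a stand-alone illustration that reuses the documented name; it is not the paperap implementation, and the exact string format paperap emits is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def serialize_datetime(value: Optional[datetime]) -> Optional[str]:
    """Return an ISO 8601 string for a datetime, or None when no value is set."""
    # datetime.isoformat() yields e.g. '2023-01-15T14:30:22'; an aware
    # datetime also carries its UTC offset, e.g. '+00:00'.
    return value.isoformat() if value is not None else None


naive = serialize_datetime(datetime(2023, 1, 15, 14, 30, 22))
aware = serialize_datetime(datetime(2023, 1, 15, 14, 30, 22, tzinfo=timezone.utc))
```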
- get_document()[source]
Get the document associated with this note.
Retrieves the full Document object that this note is attached to by making an API request using the document ID.
- Returns:
The document associated with this note.
- Return type:
Document
Example
>>> note = client.document_notes().get(1)
>>> document = note.get_document()
>>> print(document.title)
Invoice #12345
- get_user()[source]
Get the user who created this note.
Retrieves the full User object for the user who created this note by making an API request using the user ID.
- Returns:
The user who created this note.
- Return type:
User
Example
>>> note = client.document_notes().get(1)
>>> user = note.get_user()
>>> print(user.username)
admin
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_post_init(context: Any, /) → None
We need to both initialize private attributes and call the user-defined model_post_init method.
- id: int
- class paperap.models.document.model.Document(**data)[source]
Bases:
StandardModel
Represent a Paperless-ngx document.
This class models documents stored in the Paperless-ngx system, providing access to document metadata, content, and related objects. It supports operations like downloading, updating metadata, and managing tags and custom fields.
- added
Timestamp when the document was added to the system.
- Type:
datetime | None
- archive_checksum
Checksum of the archived version of the document.
- Type:
str | None
- archive_filename
Filename of the archived version.
- Type:
str | None
- archive_serial_number
Serial number in the archive system.
- Type:
int | None
- archived_file_name
Original name of the archived file.
- Type:
str | None
- checksum
Checksum of the original document.
- Type:
str | None
- correspondent_id
ID of the associated correspondent.
- Type:
int | None
- created
Timestamp when the document was created.
- Type:
datetime | None
- created_date
Creation date as a string.
- Type:
str | None
- custom_field_dicts
Custom fields associated with the document.
- Type:
list[CustomFieldValues]
- deleted_at
Timestamp when the document was deleted, or None.
- Type:
datetime | None
- document_type_id
ID of the document type.
- Type:
int | None
- filename
Current filename in the system.
- Type:
str | None
- is_shared_by_requester
Whether the document is shared by the requester.
- Type:
bool
- notes
Notes attached to this document.
- Type:
list[DocumentNote]
- original_filename
Original filename when uploaded.
- Type:
str | None
- owner
ID of the document owner.
- Type:
int | None
- page_count
Number of pages in the document.
- Type:
int | None
- storage_path_id
ID of the storage path.
- Type:
int | None
- storage_type
Type of storage used.
- Type:
DocumentStorageType | None
- tag_ids
List of tag IDs associated with this document.
- Type:
list[int]
- user_can_change
Whether the current user can modify this document.
- Type:
bool | None
Examples
>>> document = client.documents().get(pk=1)
>>> document.title = 'Example Document'
>>> document.save()
>>> document.title
'Example Document'
# Get document metadata
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf
# Download document
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)
# Get document suggestions
>>> suggestions = document.get_suggestions()
>>> print(suggestions.tags)
['Invoice', 'Tax', '2023']
- Parameters:
data (
Any
)
- added: datetime | None
- archive_checksum: str | None
- archive_filename: str | None
- archive_serial_number: int | None
- archived_file_name: str | None
- checksum: str | None
- content: str
- correspondent_id: int | None
- created: datetime | None
- created_date: str | None
- custom_field_dicts: Annotated[list[CustomFieldValues], Field(default_factory=list)]
- deleted_at: datetime | None
- document_type_id: int | None
- filename: str | None
- is_shared_by_requester: bool
- notes: list[DocumentNote]
- original_filename: str | None
- owner: int | None
- page_count: int | None
- storage_path_id: int | None
- storage_type: DocumentStorageType | None
- tag_ids: Annotated[list[int], Field(default_factory=list)]
- title: str
- user_can_change: bool | None
- class Meta(model)[source]
Bases:
Meta
- Parameters:
model (type[_Self])
- read_only_fields: ClassVar[set[str]] = {'archived_file_name', 'deleted_at', 'id', 'is_shared_by_requester', 'page_count'}
- filtering_disabled: ClassVar[set[str]] = {'deleted_at', 'is_shared_by_requester', 'page_count'}
- filtering_strategies: ClassVar[set[FilteringStrategies]] = {FilteringStrategies.WHITELIST}
- field_map: dict[str, str] = {'correspondent': 'correspondent_id', 'custom_fields': 'custom_field_dicts', 'document_type': 'document_type_id', 'storage_path': 'storage_path_id', 'tags': 'tag_ids'}
- supported_filtering_params: ClassVar[set[str]] = {'added__date__gt', 'added__date__lt', 'added__day', 'added__gt', 'added__lt', 'added__month', 'added__year', 'archive_serial_number', 'archive_serial_number__gt', 'archive_serial_number__gte', 'archive_serial_number__isnull', 'archive_serial_number__lt', 'archive_serial_number__lte', 'checksum__icontains', 'checksum__iendswith', 'checksum__iexact', 'checksum__istartswith', 'content__contains', 'content__icontains', 'content__iendswith', 'content__iexact', 'content__istartswith', 'correspondent__id', 'correspondent__id__in', 'correspondent__id__none', 'correspondent__isnull', 'correspondent__name__icontains', 'correspondent__name__iendswith', 'correspondent__name__iexact', 'correspondent__name__istartswith', 'correspondent__slug__iexact', 'created__date__gt', 'created__date__lt', 'created__day', 'created__gt', 'created__lt', 'created__month', 'created__year', 'custom_field_query', 'custom_fields__icontains', 'custom_fields__id__all', 'custom_fields__id__in', 'custom_fields__id__none', 'document_type__id', 'document_type__id__in', 'document_type__id__none', 'document_type__isnull', 'document_type__name__icontains', 'document_type__name__iendswith', 'document_type__name__iexact', 'document_type__name__istartswith', 'has_custom_fields', 'id', 'id__in', 'is_in_inbox', 'is_tagged', 'limit', 'original_filename__icontains', 'original_filename__iendswith', 'original_filename__iexact', 'original_filename__istartswith', 'owner__id', 'owner__id__in', 'owner__id__none', 'owner__isnull', 'shared_by__id', 'shared_by__id__in', 'storage_path__id', 'storage_path__id__in', 'storage_path__id__none', 'storage_path__isnull', 'storage_path__name__icontains', 'storage_path__name__iendswith', 'storage_path__name__iexact', 'storage_path__name__istartswith', 'tags__id', 'tags__id__all', 'tags__id__in', 'tags__id__none', 'tags__name__icontains', 'tags__name__iendswith', 'tags__name__iexact', 'tags__name__istartswith', 'title__icontains', 
'title__iendswith', 'title__iexact', 'title__istartswith', 'title_content'}
- blacklist_filtering_params: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'__search_hit__', '_correspondent', '_document_type', '_resource', '_storage_path', 'added', 'archive_checksum', 'archive_filename', 'archive_serial_number', 'archived_file_name', 'checksum', 'content', 'correspondent_id', 'created', 'created_date', 'custom_field_dicts', 'document_type_id', 'filename', 'id', 'notes', 'original_filename', 'owner', 'storage_path_id', 'storage_type', 'tag_ids', 'title', 'user_can_change'}
- model: type[_Self]
- name: str
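As a rough illustration of how these Meta settings could be applied, the sketch below renames API keys through field_map and rejects filter parameters that are not whitelisted. Both helper functions are hypothetical and are not part of paperap. The direction of the mapping (API name to model attribute) is an assumption, and SUPPORTED_PARAMS is only a small subset of the set listed above.

```python
# Values copied from Document.Meta above; the helper functions are hypothetical.
FIELD_MAP = {
    "correspondent": "correspondent_id",
    "custom_fields": "custom_field_dicts",
    "document_type": "document_type_id",
    "storage_path": "storage_path_id",
    "tags": "tag_ids",
}
# Small subset of supported_filtering_params, for illustration only.
SUPPORTED_PARAMS = {"id", "id__in", "limit", "tags__id__in", "title__icontains"}


def rename_api_keys(payload: dict, field_map: dict) -> dict:
    """Rename API keys to model attribute names; unmapped keys pass through."""
    return {field_map.get(key, key): value for key, value in payload.items()}


def check_filter_params(params: dict, supported: set) -> None:
    """Whitelist check: raise ValueError for any parameter not in `supported`."""
    unsupported = sorted(set(params) - supported)
    if unsupported:
        raise ValueError(f"Unsupported filter parameters: {unsupported}")


renamed = rename_api_keys({"title": "Invoice", "tags": [1, 2]}, FIELD_MAP)
check_filter_params({"title__icontains": "invoice", "limit": 10}, SUPPORTED_PARAMS)
```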
- serialize_datetime(value)[source]
Serialize datetime fields to ISO format.
Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.
- Parameters:
value (
datetime | None
) – The datetime value to serialize.
- Returns:
The serialized datetime value as an ISO 8601 string, or None.
- Return type:
str | None
- serialize_notes(value)[source]
Serialize notes to a list of dictionaries.
Converts DocumentNote objects to dictionaries for JSON serialization. Returns an empty list if the input value is None or empty.
- classmethod validate_tags(value)[source]
Validate and convert tag IDs to a list of integers.
Ensures tag IDs are properly formatted as a list of integers. Handles various input formats including None, single integers, and lists.
- Parameters:
value (
Any
) – The tag IDs to validate, which can be None, an integer, or a list.
- Returns:
A list of validated tag IDs.
- Return type:
- Raises:
TypeError – If the input value is not None, an integer, or a list.
Examples
>>> Document.validate_tags(None)
[]
>>> Document.validate_tags(5)
[5]
>>> Document.validate_tags([1, 2, 3])
[1, 2, 3]
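The normalization rules described for validate_tags can be sketched as a plain function. This is a stand-alone illustration with a hypothetical name (normalize_tag_ids), not the validator paperap registers. How the real validator treats non-integer list items is not stated above, so the int() conversion is an assumption.

```python
from typing import Any, List


def normalize_tag_ids(value: Any) -> List[int]:
    """Coerce None, a single int, or a list into a list of integer tag IDs."""
    if value is None:
        return []
    if isinstance(value, int):
        return [value]
    if isinstance(value, list):
        # Assumption: list items are converted with int(); the source does
        # not say how non-integer items are handled.
        return [int(item) for item in value]
    raise TypeError(f"Expected None, int, or list, got {type(value).__name__}")
```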
- classmethod validate_custom_fields(value)[source]
Validate and return custom field dictionaries.
Ensures custom fields are properly formatted as a list of CustomFieldValues. Returns an empty list if the input value is None.
- Parameters:
value (
Any
) – The list of custom field dictionaries to validate.
- Returns:
A list of validated custom field dictionaries.
- Return type:
list[CustomFieldValues]
- Raises:
TypeError – If the input value is not None or a list.
- classmethod validate_text(value)[source]
Validate and return a text field.
Ensures text fields are properly formatted as strings. Converts integers to strings and returns an empty string if the input value is None.
- Parameters:
value (
Any
) – The value of the text field to validate.
- Returns:
The validated text value.
- Return type:
str
- Raises:
TypeError – If the input value is not None, a string, or an integer.
Examples
>>> Document.validate_text(None)
''
>>> Document.validate_text("Hello")
'Hello'
>>> Document.validate_text(123)
'123'
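The text coercion described above (None becomes an empty string, integers are stringified, anything else is rejected) can be sketched as follows. The function name normalize_text is hypothetical and the code only mirrors the documented behavior.

```python
from typing import Any


def normalize_text(value: Any) -> str:
    """Coerce None, str, or int into a string, mirroring the documented rules."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"Expected None, str, or int, got {type(value).__name__}")
```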
- classmethod validate_notes(value)[source]
Validate and return the list of notes.
Ensures notes are properly formatted as a list of DocumentNote objects. Handles various input formats including None, single DocumentNote objects, and lists.
Validate and return the is_shared_by_requester flag.
Ensures the is_shared_by_requester flag is properly formatted as a boolean. Returns False if the input value is None.
- property custom_field_ids: list[int]
Get the IDs of the custom fields for this document.
Example
>>> document = client.documents().get(1)
>>> field_ids = document.custom_field_ids
>>> print(field_ids)
[1, 3, 5]
- property custom_field_values: list[Any]
Get the values of the custom fields for this document.
- Returns:
A list of values for the custom fields associated with this document.
- Return type:
list[Any]
Example
>>> document = client.documents().get(1)
>>> values = document.custom_field_values
>>> print(values)
['2023-01-15', 'INV-12345', True]
- property tag_names: list[str]
Get the names of the tags for this document.
Example
>>> document = client.documents().get(1)
>>> names = document.tag_names
>>> print(names)
['Invoice', 'Tax', 'Important']
- property tags: TagQuerySet
Get the tags for this document.
Returns a QuerySet of Tag objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the tags are actually accessed.
- Returns:
QuerySet of tags associated with this document.
- Return type:
Examples
>>> document = client.documents().get(pk=1)
>>> for tag in document.tags:
...     print(f'{tag.name} # {tag.id}')
Tag 1 # 1
Tag 2 # 2
Tag 3 # 3
>>> if 5 in document.tags:
...     print('Tag ID #5 is associated with this document')
>>> tag = client.tags().get(pk=1)
>>> if tag in document.tags:
...     print('Tag ID #1 is associated with this document')
>>> filtered_tags = document.tags.filter(name__icontains='example')
>>> for tag in filtered_tags:
...     print(f'{tag.name} # {tag.id}')
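The lazy loading and membership tests shown in the examples can be pictured with a small stand-in container. Tag and LazyTagCollection below are hypothetical classes written for this sketch. They do not reproduce TagQuerySet, and the fetch callable stands in for whatever API request the real QuerySet performs.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a tag model."""
    id: int
    name: str


class LazyTagCollection:
    """Fetch tags on first access only, then serve them from a local cache."""

    def __init__(self, fetch: Callable[[], List[Tag]]) -> None:
        self._fetch = fetch
        self._cache: Optional[List[Tag]] = None

    def _load(self) -> List[Tag]:
        if self._cache is None:
            self._cache = self._fetch()  # the only place a request would happen
        return self._cache

    def __iter__(self) -> Iterator[Tag]:
        return iter(self._load())

    def __contains__(self, item: object) -> bool:
        # Accept either a Tag object or a bare tag ID, as in the examples above.
        wanted = item.id if isinstance(item, Tag) else item
        return any(tag.id == wanted for tag in self._load())
```

Nothing is fetched when the collection is constructed; the first iteration or membership test triggers the single call to fetch.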
- property correspondent: Correspondent | None
Get the correspondent for this document.
Retrieves the Correspondent object associated with this document. Uses caching to minimize API requests when accessing the same correspondent multiple times.
- Returns:
The correspondent object or None if not set.
- Return type:
Correspondent | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.correspondent:
...     print(document.correspondent.name)
Example Correspondent
- property document_type: DocumentType | None
Get the document type for this document.
Retrieves the DocumentType object associated with this document. Uses caching to minimize API requests when accessing the same document type multiple times.
- Returns:
The document type object or None if not set.
- Return type:
DocumentType | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.document_type:
...     print(document.document_type.name)
Example Document Type
- property storage_path: StoragePath | None
Get the storage path for this document.
Retrieves the StoragePath object associated with this document. Uses caching to minimize API requests when accessing the same storage path multiple times.
- Returns:
The storage path object or None if not set.
- Return type:
StoragePath | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.storage_path:
...     print(document.storage_path.name)
Example Storage Path
- property custom_fields: CustomFieldQuerySet
Get the custom fields for this document.
Returns a QuerySet of CustomField objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the custom fields are actually accessed.
- Returns:
QuerySet of custom fields associated with this document.
- Return type:
CustomFieldQuerySet
Example
>>> document = client.documents().get(1)
>>> for field in document.custom_fields:
...     print(f'{field.name}: {field.value}')
Due Date: 2023-04-15
Reference: INV-12345
- property has_search_hit: bool
Check if this document has search hit information.
- Returns:
True if this document was returned as part of a search result and has search hit information, False otherwise.
- Return type:
bool
- property search_hit: dict[str, Any] | None
Get the search hit information for this document.
When a document is returned as part of a search result, this property contains additional information about the search match.
- custom_field_value(field_id, default=None, *, raise_errors=False)[source]
Get the value of a custom field by ID.
Retrieves the value of a specific custom field associated with this document.
- Parameters:
- Returns:
The value of the custom field or the default value if not found.
- Return type:
Any
- Raises:
ValueError – If raise_errors is True and the field is not found.
Example
>>> document = client.documents().get(1)
>>> # Get value with default
>>> due_date = document.custom_field_value(3, default="Not set")
>>> # Get value with error handling
>>> try:
...     reference = document.custom_field_value(5, raise_errors=True)
... except ValueError:
...     print("Reference field not found")
Reference field not found
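The lookup semantics (return the value, fall back to a default, or raise when raise_errors is set) can be sketched over a plain list of dictionaries. The "field" and "value" keys are an assumption about how custom field entries are shaped, and the function is a stand-alone illustration rather than the method's actual body.

```python
from typing import Any, Dict, List


def custom_field_value(
    custom_fields: List[Dict[str, Any]],
    field_id: int,
    default: Any = None,
    *,
    raise_errors: bool = False,
) -> Any:
    """Return the value stored for field_id, or default when it is absent."""
    for entry in custom_fields:
        # Assumed entry shape: {"field": <custom field ID>, "value": <value>}
        if entry["field"] == field_id:
            return entry["value"]
    if raise_errors:
        raise ValueError(f"Custom field {field_id} not found")
    return default


fields = [{"field": 1, "value": "2023-01-15"}, {"field": 3, "value": "INV-12345"}]
```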
- add_tag(tag)[source]
Add a tag to the document.
Adds a tag to the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.
- Parameters:
tag (
Tag | int | str
) – The tag to add. Can be a Tag object, a tag ID, or a tag name.
- Raises:
TypeError – If the input value is not a Tag object, an integer, or a string.
ResourceNotFoundError – If a tag name is provided but no matching tag is found.
- Return type:
Example
>>> document = client.documents().get(1)
>>> # Add tag by ID
>>> document.add_tag(5)
>>> # Add tag by object
>>> tag = client.tags().get(3)
>>> document.add_tag(tag)
>>> # Add tag by name
>>> document.add_tag("Invoice")
- remove_tag(tag)[source]
Remove a tag from the document.
Removes a tag from the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.
- Parameters:
tag (
Tag | int | str
) – The tag to remove. Can be a Tag object, a tag ID, or a tag name.
- Raises:
TypeError – If the input value is not a Tag object, an integer, or a string.
ResourceNotFoundError – If a tag name is provided but no matching tag is found.
ValueError – If the tag is not associated with this document.
- Return type:
Example
>>> document = client.documents().get(1)
>>> # Remove tag by ID
>>> document.remove_tag(5)
>>> # Remove tag by object
>>> tag = client.tags().get(3)
>>> document.remove_tag(tag)
>>> # Remove tag by name
>>> document.remove_tag("Invoice")
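Both add_tag and remove_tag accept a Tag object, an ID, or a name, so each first has to resolve its argument to an ID. The sketch below shows one way that dispatch could look. Tag, resolve_tag_id, and the names_to_ids dictionary are hypothetical stand-ins, with the dictionary replacing the API lookup by name. LookupError stands in for ResourceNotFoundError, and whether paperap skips duplicate IDs on add is not stated above (this sketch does not).

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Tag:
    """Hypothetical stand-in for a tag model."""
    id: int
    name: str


def resolve_tag_id(tag: Union[Tag, int, str], names_to_ids: Dict[str, int]) -> int:
    """Resolve a Tag object, tag ID, or tag name to an integer ID."""
    if isinstance(tag, Tag):
        return tag.id
    if isinstance(tag, int):
        return tag
    if isinstance(tag, str):
        try:
            return names_to_ids[tag]  # stands in for a lookup by name
        except KeyError:
            # Stand-in for ResourceNotFoundError in the real library.
            raise LookupError(f"No tag named {tag!r}") from None
    raise TypeError(f"Expected Tag, int, or str, got {type(tag).__name__}")


def add_tag(tag_ids: List[int], tag: Union[Tag, int, str], names_to_ids: Dict[str, int]) -> None:
    """Append the resolved ID to tag_ids (no de-duplication in this sketch)."""
    tag_ids.append(resolve_tag_id(tag, names_to_ids))


def remove_tag(tag_ids: List[int], tag: Union[Tag, int, str], names_to_ids: Dict[str, int]) -> None:
    """Remove the resolved ID; list.remove raises ValueError when it is absent."""
    tag_ids.remove(resolve_tag_id(tag, names_to_ids))
```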
- get_metadata()[source]
Get the metadata for this document.
Retrieves detailed metadata about the document from the Paperless-ngx API. This includes information like the original file format, creation date, modification date, and other technical details.
- Returns:
The document metadata object.
- Return type:
Examples
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf
>>> print(metadata.media_filename)
document.pdf
- download(original=False)[source]
Download the document file.
Downloads either the archived version (default) or the original version of the document from the Paperless-ngx server.
- Parameters:
original (
bool
, optional) – Whether to download the original file instead of the archived version. Defaults to False (download the archived version).
- Returns:
An object containing the downloaded document content and metadata.
- Return type:
Examples
>>> # Download archived version
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)
>>> # Download original version
>>> original = document.download(original=True)
>>> print(f"Downloaded {len(original.content)} bytes")
Downloaded 245367 bytes
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_post_init(context: Any, /) → None
We need to both initialize private attributes and call the user-defined model_post_init method.
- preview(original=False)[source]
Get a preview of the document.
Retrieves a preview version of the document from the Paperless-ngx server. This is typically a web-friendly version (e.g., PDF) that can be displayed in a browser.
- Parameters:
original (
bool
, optional) – Whether to preview the original file instead of the archived version. Defaults to False (preview the archived version).
- Returns:
An object containing the preview document content and metadata.
- Return type:
Example
>>> preview = document.preview()
>>> with open('preview.pdf', 'wb') as f:
...     f.write(preview.content)
- id: int
- thumbnail(original=False)[source]
Get the document thumbnail.
Retrieves a thumbnail image of the document from the Paperless-ngx server. This is typically a small image representation of the first page.
- Parameters:
original (
bool
, optional) – Whether to get the thumbnail of the original file instead of the archived version. Defaults to False (get thumbnail of archived version).
- Returns:
An object containing the thumbnail image content and metadata.
- Return type:
Example
>>> thumbnail = document.thumbnail()
>>> with open('thumbnail.png', 'wb') as f:
...     f.write(thumbnail.content)
- get_suggestions()[source]
Get suggestions for this document.
Retrieves AI-generated suggestions for document metadata from the Paperless-ngx server. This can include suggested tags, correspondent, document type, and other metadata based on the document’s content.
- Returns:
An object containing suggested metadata for the document.
- Return type:
Examples
>>> suggestions = document.get_suggestions()
>>> print(f"Suggested tags: {suggestions.tags}")
Suggested tags: [{'name': 'Invoice', 'score': 0.95}, {'name': 'Utility', 'score': 0.87}]
>>> print(f"Suggested correspondent: {suggestions.correspondent}")
Suggested correspondent: {'name': 'Electric Company', 'score': 0.92}
>>> print(f"Suggested document type: {suggestions.document_type}")
Suggested document type: {'name': 'Bill', 'score': 0.89}
- append_content(value)[source]
Append content to the document.
Adds the specified text to the end of the document’s content, separated by a newline.
Example
>>> document = client.documents().get(1)
>>> document.append_content("Additional notes about this document")
>>> document.save()
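The append step amounts to joining the existing content and the new text with a newline. The stand-alone function below reuses the documented name for illustration only. How the real method treats empty content is not stated above, so this version always inserts the separator.

```python
def append_content(content: str, value: str) -> str:
    """Return content with value appended after a newline separator."""
    # Assumption: the separator is added even when content is empty.
    return f"{content}\n{value}"


updated = append_content("Invoice for January", "Additional notes about this document")
```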
- update_locally(from_db=None, **kwargs)[source]
Update the document locally with the provided data.
Updates the document’s attributes with the provided data without sending an API request. Handles special cases for notes and tags, which cannot be set to None in Paperless-ngx if they already have values.
- Parameters:
from_db (
bool | None
, optional) – Whether the update is coming from the database. If True, bypasses certain validation checks. Defaults to None.
**kwargs (
Any
) – Additional data to update the document with.
- Raises:
NotImplementedError – If attempting to set notes or tags to None when they are not already None and from_db is False.
- Return type:
Example
>>> document = client.documents().get(1)
>>> document.update_locally(title="New Title", correspondent_id=5)
>>> document.save()
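The guard described under Raises can be sketched as a check that runs before any attributes are changed. The function name check_local_update, the PROTECTED_FIELDS tuple, and the use of plain dictionaries for current and updated values are all hypothetical, and the keyword names the real method inspects are an assumption.

```python
from typing import Any, Dict, Optional

# Assumed keyword names for the fields that cannot be reset to None.
PROTECTED_FIELDS = ("notes", "tags")


def check_local_update(
    current: Dict[str, Any],
    updates: Dict[str, Any],
    from_db: Optional[bool] = None,
) -> None:
    """Raise NotImplementedError when a populated protected field is set to None."""
    if from_db:
        return  # data coming from the server bypasses the check
    for name in PROTECTED_FIELDS:
        if name in updates and updates[name] is None and current.get(name) is not None:
            raise NotImplementedError(f"Cannot set {name} to None once it has a value")
```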