paperap.models.document.model module

Provide document models for interacting with Paperless-ngx documents.

This module contains the Document and DocumentNote models, which represent documents and their associated notes in the Paperless-ngx system. These models provide methods for retrieving, updating, and managing document metadata, content, and relationships with other entities like tags, correspondents, and custom fields.

class paperap.models.document.model.DocumentNote(**data)[source]

Bases: StandardModel

Represent a note on a Paperless-ngx document.

This class models user-created notes that can be attached to documents in the Paperless-ngx system. Notes include information about when they were created, who created them, and their content.

deleted_at

Timestamp when the note was deleted, or None if not deleted.

Type:

datetime | None

restored_at

Timestamp when the note was restored after deletion, or None.

Type:

datetime | None

transaction_id

ID of the transaction that created or modified this note.

Type:

int | None

note

The text content of the note.

Type:

str

created

Timestamp when the note was created.

Type:

datetime

document

ID of the document this note is attached to.

Type:

int

user

ID of the user who created this note.

Type:

int

Examples

>>> note = client.document_notes().get(1)
>>> print(note.note)
This is an important document
>>> print(note.created)
2023-01-15 14:30:22
Parameters:

data (Any)

deleted_at: datetime | None
restored_at: datetime | None
transaction_id: int | None
note: str
created: datetime
document: int
user: int
class Meta(model)[source]

Bases: Meta

Parameters:

model (type[_Self])

read_only_fields: ClassVar[set[str]] = {'created', 'deleted_at', 'id', 'restored_at', 'transaction_id'}
blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {}
filtering_disabled: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'_resource', 'created', 'deleted_at', 'document', 'id', 'note', 'restored_at', 'transaction_id', 'user'}
supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
model: type[_Self]
name: str
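As a rough illustration of how a whitelist such as supported_filtering_params might be applied, the sketch below rejects filter keys that are not in the set shown above. The helper name check_filters is hypothetical and is not part of paperap; the library's own validation may behave differently.

```python
# Hypothetical helper, not part of paperap: reject filter keys that are
# not in a whitelist such as DocumentNote.Meta.supported_filtering_params.
SUPPORTED = {"id", "id__in", "limit"}


def check_filters(filters: dict, supported: set = SUPPORTED) -> dict:
    """Return the filters unchanged, or raise ValueError on unknown keys."""
    unknown = sorted(set(filters) - supported)
    if unknown:
        raise ValueError(f"Unsupported filter parameters: {unknown}")
    return filters


# Both keys are in the whitelist, so the filters pass through untouched.
accepted = check_filters({"id__in": [1, 2, 3], "limit": 10})
```

A key such as note__icontains would raise ValueError here, because it does not appear in the whitelist.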
serialize_datetime(value)[source]

Serialize datetime fields to ISO format.

Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.

Parameters:

value (datetime | None) – The datetime value to serialize.

Returns:

The serialized datetime value as an ISO 8601 string, or None if the value is None.

Return type:

str | None
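The behaviour described above can be sketched with the standard library alone. The function name to_iso is an illustrative stand-in, not the method itself, and the real serializer may apply additional formatting.

```python
from datetime import datetime
from typing import Optional


def to_iso(value: Optional[datetime]) -> Optional[str]:
    """Return an ISO 8601 string, or None when no value is set."""
    return value.isoformat() if value is not None else None


print(to_iso(datetime(2023, 1, 15, 14, 30, 22)))  # 2023-01-15T14:30:22
print(to_iso(None))  # None
```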

get_document()[source]

Get the document associated with this note.

Retrieves the full Document object that this note is attached to by making an API request using the document ID.

Returns:

The document associated with this note.

Return type:

Document

Example

>>> note = client.document_notes().get(1)
>>> document = note.get_document()
>>> print(document.title)
Invoice #12345
get_user()[source]

Get the user who created this note.

Retrieves the full User object for the user who created this note by making an API request using the user ID.

Returns:

The user who created this note.

Return type:

User

Example

>>> note = client.document_notes().get(1)
>>> user = note.get_user()
>>> print(user.username)
admin
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) -> None

We need to both initialize private attributes and call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

id: int
class paperap.models.document.model.Document(**data)[source]

Bases: StandardModel

Represent a Paperless-ngx document.

This class models documents stored in the Paperless-ngx system, providing access to document metadata, content, and related objects. It supports operations like downloading, updating metadata, and managing tags and custom fields.

added

Timestamp when the document was added to the system.

Type:

datetime | None

archive_checksum

Checksum of the archived version of the document.

Type:

str | None

archive_filename

Filename of the archived version.

Type:

str | None

archive_serial_number

Serial number in the archive system.

Type:

int | None

archived_file_name

Original name of the archived file.

Type:

str | None

checksum

Checksum of the original document.

Type:

str | None

content

Full text content of the document.

Type:

str

correspondent_id

ID of the associated correspondent.

Type:

int | None

created

Timestamp when the document was created.

Type:

datetime | None

created_date

Creation date as a string.

Type:

str | None

custom_field_dicts

Custom fields associated with the document.

Type:

list[CustomFieldValues]

deleted_at

Timestamp when the document was deleted, or None.

Type:

datetime | None

document_type_id

ID of the document type.

Type:

int | None

filename

Current filename in the system.

Type:

str | None

is_shared_by_requester

Whether the document is shared by the requester.

Type:

bool

notes

Notes attached to this document.

Type:

list[DocumentNote]

original_filename

Original filename when uploaded.

Type:

str | None

owner

ID of the document owner.

Type:

int | None

page_count

Number of pages in the document.

Type:

int | None

storage_path_id

ID of the storage path.

Type:

int | None

storage_type

Type of storage used.

Type:

DocumentStorageType | None

tag_ids

List of tag IDs associated with this document.

Type:

list[int]

title

Title of the document.

Type:

str

user_can_change

Whether the current user can modify this document.

Type:

bool | None

Examples

>>> document = client.documents().get(pk=1)
>>> document.title = 'Example Document'
>>> document.save()
>>> document.title
'Example Document'

>>> # Get document metadata
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf

>>> # Download document
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)

>>> # Get document suggestions
>>> suggestions = document.get_suggestions()
>>> print(suggestions.tags)
['Invoice', 'Tax', '2023']

Parameters:

data (Any)

added: datetime | None
archive_checksum: str | None
archive_filename: str | None
archive_serial_number: int | None
archived_file_name: str | None
checksum: str | None
content: str
correspondent_id: int | None
created: datetime | None
created_date: str | None
custom_field_dicts: Annotated[list[CustomFieldValues], Field(default_factory=list)]
deleted_at: datetime | None
document_type_id: int | None
filename: str | None
is_shared_by_requester: bool
notes: list[DocumentNote]
original_filename: str | None
owner: int | None
page_count: int | None
storage_path_id: int | None
storage_type: DocumentStorageType | None
tag_ids: Annotated[list[int], Field(default_factory=list)]
title: str
user_can_change: bool | None
class Meta(model)[source]

Bases: Meta

Parameters:

model (type[_Self])

read_only_fields: ClassVar[set[str]] = {'archived_file_name', 'deleted_at', 'id', 'is_shared_by_requester', 'page_count'}
filtering_disabled: ClassVar[set[str]] = {'deleted_at', 'is_shared_by_requester', 'page_count'}
filtering_strategies: ClassVar[set[FilteringStrategies]] = {FilteringStrategies.WHITELIST}
field_map: dict[str, str] = {'correspondent': 'correspondent_id', 'custom_fields': 'custom_field_dicts', 'document_type': 'document_type_id', 'storage_path': 'storage_path_id', 'tags': 'tag_ids'}
supported_filtering_params: ClassVar[set[str]] = {'added__date__gt', 'added__date__lt', 'added__day', 'added__gt', 'added__lt', 'added__month', 'added__year', 'archive_serial_number', 'archive_serial_number__gt', 'archive_serial_number__gte', 'archive_serial_number__isnull', 'archive_serial_number__lt', 'archive_serial_number__lte', 'checksum__icontains', 'checksum__iendswith', 'checksum__iexact', 'checksum__istartswith', 'content__contains', 'content__icontains', 'content__iendswith', 'content__iexact', 'content__istartswith', 'correspondent__id', 'correspondent__id__in', 'correspondent__id__none', 'correspondent__isnull', 'correspondent__name__icontains', 'correspondent__name__iendswith', 'correspondent__name__iexact', 'correspondent__name__istartswith', 'correspondent__slug__iexact', 'created__date__gt', 'created__date__lt', 'created__day', 'created__gt', 'created__lt', 'created__month', 'created__year', 'custom_field_query', 'custom_fields__icontains', 'custom_fields__id__all', 'custom_fields__id__in', 'custom_fields__id__none', 'document_type__id', 'document_type__id__in', 'document_type__id__none', 'document_type__isnull', 'document_type__name__icontains', 'document_type__name__iendswith', 'document_type__name__iexact', 'document_type__name__istartswith', 'has_custom_fields', 'id', 'id__in', 'is_in_inbox', 'is_tagged', 'limit', 'original_filename__icontains', 'original_filename__iendswith', 'original_filename__iexact', 'original_filename__istartswith', 'owner__id', 'owner__id__in', 'owner__id__none', 'owner__isnull', 'shared_by__id', 'shared_by__id__in', 'storage_path__id', 'storage_path__id__in', 'storage_path__id__none', 'storage_path__isnull', 'storage_path__name__icontains', 'storage_path__name__iendswith', 'storage_path__name__iexact', 'storage_path__name__istartswith', 'tags__id', 'tags__id__all', 'tags__id__in', 'tags__id__none', 'tags__name__icontains', 'tags__name__iendswith', 'tags__name__iexact', 'tags__name__istartswith', 'title__icontains', 
'title__iendswith', 'title__iexact', 'title__istartswith', 'title_content'}
blacklist_filtering_params: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'__search_hit__', '_correspondent', '_document_type', '_resource', '_storage_path', 'added', 'archive_checksum', 'archive_filename', 'archive_serial_number', 'archived_file_name', 'checksum', 'content', 'correspondent_id', 'created', 'created_date', 'custom_field_dicts', 'document_type_id', 'filename', 'id', 'notes', 'original_filename', 'owner', 'storage_path_id', 'storage_type', 'tag_ids', 'title', 'user_can_change'}
model: type[_Self]
name: str
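The field_map values match the attribute names documented for Document, which suggests the map translates API field names into model attribute names. The sketch below works under that assumption; rename_keys is a hypothetical helper, not a paperap function.

```python
# Copy of the field_map shown above. The direction (API name -> attribute
# name) is an assumption inferred from the documented attribute names.
FIELD_MAP = {
    "correspondent": "correspondent_id",
    "custom_fields": "custom_field_dicts",
    "document_type": "document_type_id",
    "storage_path": "storage_path_id",
    "tags": "tag_ids",
}


def rename_keys(payload: dict, field_map: dict = FIELD_MAP) -> dict:
    """Rename keys found in field_map; leave all other keys untouched."""
    return {field_map.get(key, key): value for key, value in payload.items()}


api_payload = {"id": 1, "title": "Invoice", "correspondent": 4, "tags": [2, 7]}
model_data = rename_keys(api_payload)
print(model_data)
# {'id': 1, 'title': 'Invoice', 'correspondent_id': 4, 'tag_ids': [2, 7]}
```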
serialize_datetime(value)[source]

Serialize datetime fields to ISO format.

Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.

Parameters:

value (datetime | None) – The datetime value to serialize.

Returns:

The serialized datetime value as an ISO 8601 string, or None.

Return type:

str | None

serialize_notes(value)[source]

Serialize notes to a list of dictionaries.

Converts DocumentNote objects to dictionaries for JSON serialization. Returns an empty list if the input value is None or empty.

Parameters:

value (list[DocumentNote]) – The list of DocumentNote objects to serialize.

Returns:

A list of dictionaries representing the notes.

Return type:

list[dict[str, Any]]

classmethod validate_tags(value)[source]

Validate and convert tag IDs to a list of integers.

Ensures tag IDs are properly formatted as a list of integers. Handles various input formats including None, single integers, and lists.

Parameters:

value (Any) – The tag IDs to validate, which can be None, an integer, or a list.

Returns:

A list of validated tag IDs.

Return type:

list[int]

Raises:

TypeError – If the input value is not None, an integer, or a list.

Examples

>>> Document.validate_tags(None)
[]
>>> Document.validate_tags(5)
[5]
>>> Document.validate_tags([1, 2, 3])
[1, 2, 3]
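The normalization rules listed above can be approximated as follows. This is a minimal sketch, not the library's validator: the name normalize_tag_ids is hypothetical, and converting each list element with int() is an assumption about how list input is handled.

```python
from typing import Any, List


def normalize_tag_ids(value: Any) -> List[int]:
    """Coerce None, a single int, or a list into a list of ints."""
    if value is None:
        return []
    if isinstance(value, int):
        return [value]
    if isinstance(value, list):
        # Assumption: each element is converted with int().
        return [int(item) for item in value]
    raise TypeError(f"Expected None, int, or list, got {type(value).__name__}")
```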
classmethod validate_custom_fields(value)[source]

Validate and return custom field dictionaries.

Ensures custom fields are properly formatted as a list of CustomFieldValues. Returns an empty list if the input value is None.

Parameters:

value (Any) – The list of custom field dictionaries to validate.

Returns:

A list of validated custom field dictionaries.

Return type:

list[CustomFieldValues]

Raises:

TypeError – If the input value is not None or a list.

classmethod validate_text(value)[source]

Validate and return a text field.

Ensures text fields are properly formatted as strings. Converts integers to strings and returns an empty string if the input value is None.

Parameters:

value (Any) – The value of the text field to validate.

Returns:

The validated text value.

Return type:

str

Raises:

TypeError – If the input value is not None, a string, or an integer.

Examples

>>> Document.validate_text(None)
''
>>> Document.validate_text("Hello")
'Hello'
>>> Document.validate_text(123)
'123'
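A comparable sketch for text fields, again using a hypothetical function name (normalize_text) rather than the validator itself:

```python
from typing import Any


def normalize_text(value: Any) -> str:
    """Coerce None, a str, or an int into a str."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"Expected None, str, or int, got {type(value).__name__}")
```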
classmethod validate_notes(value)[source]

Validate and return the list of notes.

Ensures notes are properly formatted as a list of DocumentNote objects. Handles various input formats including None, single DocumentNote objects, and lists.

Parameters:

value (Any) – The list of notes to validate.

Returns:

The validated list of notes.

Return type:

list[Any]

Raises:

TypeError – If the input value is not None, a DocumentNote, or a list.

classmethod validate_is_shared_by_requester(value)[source]

Validate and return the is_shared_by_requester flag.

Ensures the is_shared_by_requester flag is properly formatted as a boolean. Returns False if the input value is None.

Parameters:

value (Any) – The flag to validate.

Returns:

The validated flag.

Return type:

bool

Raises:

TypeError – If the input value is not None or a boolean.

property custom_field_ids: list[int]

Get the IDs of the custom fields for this document.

Returns:

A list of custom field IDs associated with this document.

Return type:

list[int]

Example

>>> document = client.documents().get(1)
>>> field_ids = document.custom_field_ids
>>> print(field_ids)
[1, 3, 5]
property custom_field_values: list[Any]

Get the values of the custom fields for this document.

Returns:

A list of values for the custom fields associated with this document.

Return type:

list[Any]

Example

>>> document = client.documents().get(1)
>>> values = document.custom_field_values
>>> print(values)
['2023-01-15', 'INV-12345', True]
property tag_names: list[str]

Get the names of the tags for this document.

Returns:

A list of tag names associated with this document.

Return type:

list[str]

Example

>>> document = client.documents().get(1)
>>> names = document.tag_names
>>> print(names)
['Invoice', 'Tax', 'Important']
property tags: TagQuerySet

Get the tags for this document.

Returns a QuerySet of Tag objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the tags are actually accessed.

Returns:

QuerySet of tags associated with this document.

Return type:

TagQuerySet

Examples

>>> document = client.documents().get(pk=1)
>>> for tag in document.tags:
...     print(f'{tag.name} # {tag.id}')
Tag 1 # 1
Tag 2 # 2
Tag 3 # 3
>>> if 5 in document.tags:
...     print('Tag ID #5 is associated with this document')
>>> tag = client.tags().get(pk=1)
>>> if tag in document.tags:
...     print('Tag ID #1 is associated with this document')
>>> filtered_tags = document.tags.filter(name__icontains='example')
>>> for tag in filtered_tags:
...     print(f'{tag.name} # {tag.id}')
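To make the lazy-loading behaviour concrete, the sketch below defers fetching until the collection is first iterated or tested for membership, and accepts either an ID or a tag object in membership tests, as in the examples above. Tag and LazyTags are simplified stand-ins written for this illustration; they are not the paperap classes, and caching the fetched result is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional, Union


@dataclass(frozen=True)
class Tag:
    """Simplified stand-in for a tag model."""
    id: int
    name: str


class LazyTags:
    """Hypothetical stand-in for a lazily loaded tag collection."""

    def __init__(self, fetch: Callable[[], list]) -> None:
        self._fetch = fetch
        self._cache: Optional[list] = None
        self.fetch_count = 0  # counts how often the fetch callable runs

    def _load(self) -> list:
        # Fetch on first access only; later accesses reuse the cached list.
        if self._cache is None:
            self.fetch_count += 1
            self._cache = self._fetch()
        return self._cache

    def __iter__(self) -> Iterator[Tag]:
        return iter(self._load())

    def __contains__(self, item: Union[int, Tag]) -> bool:
        wanted = item.id if isinstance(item, Tag) else item
        return any(tag.id == wanted for tag in self._load())


tags = LazyTags(lambda: [Tag(1, "Tag 1"), Tag(2, "Tag 2")])
before = tags.fetch_count              # nothing fetched yet
has_two = 2 in tags                    # first access triggers the fetch
has_tag_one = Tag(1, "Tag 1") in tags  # reuses the cached result
after = tags.fetch_count
```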
property correspondent: Correspondent | None

Get the correspondent for this document.

Retrieves the Correspondent object associated with this document. Uses caching to minimize API requests when accessing the same correspondent multiple times.

Returns:

The correspondent object or None if not set.

Return type:

Correspondent | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.correspondent:
...     print(document.correspondent.name)
Example Correspondent
property document_type: DocumentType | None

Get the document type for this document.

Retrieves the DocumentType object associated with this document. Uses caching to minimize API requests when accessing the same document type multiple times.

Returns:

The document type object or None if not set.

Return type:

DocumentType | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.document_type:
...     print(document.document_type.name)
Example Document Type
property storage_path: StoragePath | None

Get the storage path for this document.

Retrieves the StoragePath object associated with this document. Uses caching to minimize API requests when accessing the same storage path multiple times.

Returns:

The storage path object or None if not set.

Return type:

StoragePath | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.storage_path:
...     print(document.storage_path.name)
Example Storage Path
property custom_fields: CustomFieldQuerySet

Get the custom fields for this document.

Returns a QuerySet of CustomField objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the custom fields are actually accessed.

Returns:

QuerySet of custom fields associated with this document.

Return type:

CustomFieldQuerySet

Example

>>> document = client.documents().get(1)
>>> for field in document.custom_fields:
...     print(f'{field.name}: {field.value}')
Due Date: 2023-04-15
Reference: INV-12345
property has_search_hit: bool

Check if this document has search hit information.

Returns:

True if this document was returned as part of a search result and has search hit information, False otherwise.

Return type:

bool

property search_hit: dict[str, Any] | None

Get the search hit information for this document.

When a document is returned as part of a search result, this property contains additional information about the search match.

Returns:

Dictionary with search hit information, or None if this document was not part of a search result.

Return type:

dict[str, Any] | None

custom_field_value(field_id, default=None, *, raise_errors=False)[source]

Get the value of a custom field by ID.

Retrieves the value of a specific custom field associated with this document.

Parameters:
  • field_id (int) – The ID of the custom field to retrieve.

  • default (Any, optional) – The value to return if the field is not found. Defaults to None.

  • raise_errors (bool, optional) – Whether to raise an error if the field is not found. Defaults to False.

Returns:

The value of the custom field or the default value if not found.

Return type:

Any

Raises:

ValueError – If raise_errors is True and the field is not found.

Example

>>> document = client.documents().get(1)
>>> # Get value with default
>>> due_date = document.custom_field_value(3, default="Not set")
>>> # Get value with error handling
>>> try:
...     reference = document.custom_field_value(5, raise_errors=True)
... except ValueError:
...     print("Reference field not found")
Reference field not found
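The interplay between default and raise_errors can be sketched over a plain list of dictionaries. The entry shape used here ("field" and "value" keys) is an assumption about how custom field values are stored, and lookup_custom_field is a hypothetical name, not the method itself.

```python
from typing import Any, Dict, List


def lookup_custom_field(
    entries: List[Dict[str, Any]],
    field_id: int,
    default: Any = None,
    *,
    raise_errors: bool = False,
) -> Any:
    """Return the value stored for field_id, the default, or raise."""
    for entry in entries:
        if entry["field"] == field_id:
            return entry["value"]
    if raise_errors:
        raise ValueError(f"Custom field {field_id} not found")
    return default


# Assumed entry shape: one dict per custom field with "field" and "value".
entries = [{"field": 1, "value": "2023-01-15"}, {"field": 3, "value": "INV-12345"}]
print(lookup_custom_field(entries, 3))                     # INV-12345
print(lookup_custom_field(entries, 5, default="Not set"))  # Not set
```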
add_tag(tag)[source]

Add a tag to the document.

Adds a tag to the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.

Parameters:

tag (Tag | int | str) – The tag to add. Can be a Tag object, a tag ID, or a tag name.

Raises:
  • TypeError – If the input value is not a Tag object, an integer, or a string.

  • ResourceNotFoundError – If a tag name is provided but no matching tag is found.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> # Add tag by ID
>>> document.add_tag(5)
>>> # Add tag by object
>>> tag = client.tags().get(3)
>>> document.add_tag(tag)
>>> # Add tag by name
>>> document.add_tag("Invoice")
remove_tag(tag)[source]

Remove a tag from the document.

Removes a tag from the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.

Parameters:

tag (Tag | int | str) – The tag to remove. Can be a Tag object, a tag ID, or a tag name.

Raises:
  • TypeError – If the input value is not a Tag object, an integer, or a string.

  • ResourceNotFoundError – If a tag name is provided but no matching tag is found.

  • ValueError – If the tag is not associated with this document.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> # Remove tag by ID
>>> document.remove_tag(5)
>>> # Remove tag by object
>>> tag = client.tags().get(3)
>>> document.remove_tag(tag)
>>> # Remove tag by name
>>> document.remove_tag("Invoice")
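The tag resolution described for add_tag and remove_tag can be modelled without a server. In this sketch, TagList is a hypothetical stand-in for the document's tag_ids handling, a plain dictionary replaces the name lookup, and LookupError stands in for ResourceNotFoundError. Tag objects are omitted for brevity, and skipping duplicates in add is an assumption.

```python
from typing import Dict, List, Union


class TagList:
    """Hypothetical stand-in for the tag_ids handling on a document."""

    def __init__(self, tag_ids: List[int], names_to_ids: Dict[str, int]) -> None:
        self.tag_ids = list(tag_ids)
        self._names_to_ids = names_to_ids

    def _resolve(self, tag: Union[int, str]) -> int:
        # An int is taken as the ID; a str is looked up by name.
        if isinstance(tag, int):
            return tag
        if isinstance(tag, str):
            if tag not in self._names_to_ids:
                raise LookupError(f"No tag named {tag!r}")
            return self._names_to_ids[tag]
        raise TypeError(f"Expected int or str, got {type(tag).__name__}")

    def add(self, tag: Union[int, str]) -> None:
        tag_id = self._resolve(tag)
        if tag_id not in self.tag_ids:  # assumption: duplicates are skipped
            self.tag_ids.append(tag_id)

    def remove(self, tag: Union[int, str]) -> None:
        tag_id = self._resolve(tag)
        if tag_id not in self.tag_ids:
            raise ValueError(f"Tag {tag_id} is not on this document")
        self.tag_ids.remove(tag_id)


doc_tags = TagList([5], {"Invoice": 3})
doc_tags.add("Invoice")  # resolved by name to ID 3
doc_tags.remove(5)       # removed by ID
print(doc_tags.tag_ids)  # [3]
```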
get_metadata()[source]

Get the metadata for this document.

Retrieves detailed metadata about the document from the Paperless-ngx API. This includes information like the original file format, creation date, modification date, and other technical details.

Returns:

The document metadata object.

Return type:

DocumentMetadata

Examples

>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf
>>> print(metadata.media_filename)
document.pdf
download(original=False)[source]

Download the document file.

Downloads either the archived version (default) or the original version of the document from the Paperless-ngx server.

Parameters:

original (bool, optional) – Whether to download the original file instead of the archived version. Defaults to False (download the archived version).

Returns:

An object containing the downloaded document content and metadata.

Return type:

DownloadedDocument

Examples

>>> # Download archived version
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)
>>> # Download original version
>>> original = document.download(original=True)
>>> print(f"Downloaded {len(original.content)} bytes")
Downloaded 245367 bytes
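When writing a download to disk, it can be safer not to pass the server-supplied filename straight to open(). The sketch below keeps only the final path component and falls back to a default name. FakeDownload is a stand-in exposing the content and disposition_filename attributes used in the examples above; that disposition_filename may be None is an assumption, and save_download is a hypothetical helper.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FakeDownload:
    """Stand-in with the two attributes used in the examples above."""
    content: bytes
    disposition_filename: Optional[str]  # assumption: may be missing


def save_download(download, directory: Path, fallback: str = "document.bin") -> Path:
    """Write download.content into directory and return the file path."""
    # Keep only the last path component so the name cannot leave directory.
    name = Path(download.disposition_filename or fallback).name
    target = directory / name
    target.write_bytes(download.content)
    return target


with tempfile.TemporaryDirectory() as tmp:
    saved = save_download(FakeDownload(b"%PDF-1.7", "../invoice.pdf"), Path(tmp))
    saved_name = saved.name            # "invoice.pdf"
    saved_bytes = saved.read_bytes()   # the bytes that were written
```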
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) -> None

We need to both initialize private attributes and call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

preview(original=False)[source]

Get a preview of the document.

Retrieves a preview version of the document from the Paperless-ngx server. This is typically a web-friendly version (e.g., PDF) that can be displayed in a browser.

Parameters:

original (bool, optional) – Whether to preview the original file instead of the archived version. Defaults to False (preview the archived version).

Returns:

An object containing the preview document content and metadata.

Return type:

DownloadedDocument

Example

>>> preview = document.preview()
>>> with open('preview.pdf', 'wb') as f:
...     f.write(preview.content)
id: int
thumbnail(original=False)[source]

Get the document thumbnail.

Retrieves a thumbnail image of the document from the Paperless-ngx server. This is typically a small image representation of the first page.

Parameters:

original (bool, optional) – Whether to get the thumbnail of the original file instead of the archived version. Defaults to False (get thumbnail of archived version).

Returns:

An object containing the thumbnail image content and metadata.

Return type:

DownloadedDocument

Example

>>> thumbnail = document.thumbnail()
>>> with open('thumbnail.png', 'wb') as f:
...     f.write(thumbnail.content)
get_suggestions()[source]

Get suggestions for this document.

Retrieves AI-generated suggestions for document metadata from the Paperless-ngx server. This can include suggested tags, correspondent, document type, and other metadata based on the document’s content.

Returns:

An object containing suggested metadata for the document.

Return type:

DocumentSuggestions

Examples

>>> suggestions = document.get_suggestions()
>>> print(f"Suggested tags: {suggestions.tags}")
Suggested tags: [{'name': 'Invoice', 'score': 0.95}, {'name': 'Utility', 'score': 0.87}]
>>> print(f"Suggested correspondent: {suggestions.correspondent}")
Suggested correspondent: {'name': 'Electric Company', 'score': 0.92}
>>> print(f"Suggested document type: {suggestions.document_type}")
Suggested document type: {'name': 'Bill', 'score': 0.89}
append_content(value)[source]

Append content to the document.

Adds the specified text to the end of the document’s content, separated by a newline.

Parameters:

value (str) – The content to append.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> document.append_content("Additional notes about this document")
>>> document.save()
update_locally(from_db=None, **kwargs)[source]

Update the document locally with the provided data.

Updates the document’s attributes with the provided data without sending an API request. Handles special cases for notes and tags, which cannot be set to None in Paperless-ngx if they already have values.

Parameters:
  • from_db (bool | None, optional) – Whether the update is coming from the database. If True, bypasses certain validation checks. Defaults to None.

  • **kwargs (Any) – Additional data to update the document with.

Raises:

NotImplementedError – If attempting to set notes or tags to None when they are not already None and from_db is False.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> document.update_locally(title="New Title", correspondent_id=5)
>>> document.save()
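The guard described under Raises can be sketched over plain dictionaries. The function check_local_update is hypothetical and only mirrors the documented rule; treating from_db=None the same as False is an assumption.

```python
from typing import Any, Dict, Optional


def check_local_update(
    current: Dict[str, Any],
    changes: Dict[str, Any],
    from_db: Optional[bool] = None,
) -> Dict[str, Any]:
    """Merge changes into current, refusing to clear notes or tags locally."""
    if not from_db:  # assumption: None is treated like False
        for name in ("notes", "tags"):
            clearing = name in changes and changes[name] is None
            if clearing and current.get(name) is not None:
                raise NotImplementedError(f"Cannot set {name} to None")
    return {**current, **changes}


state = {"title": "Old Title", "tags": [1, 2]}
merged = check_local_update(state, {"title": "New Title"})
print(merged)  # {'title': 'New Title', 'tags': [1, 2]}
```

Passing {"tags": None} without from_db=True raises NotImplementedError in this sketch, while the same change with from_db=True is accepted.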