paperap.models.document package

class paperap.models.document.Document(**data)[source]

Bases: StandardModel

Represent a Paperless-ngx document.

This class models documents stored in the Paperless-ngx system, providing access to document metadata, content, and related objects. It supports operations like downloading, updating metadata, and managing tags and custom fields.

added

Timestamp when the document was added to the system.

Type:

datetime | None

archive_checksum

Checksum of the archived version of the document.

Type:

str | None

archive_filename

Filename of the archived version.

Type:

str | None

archive_serial_number

Serial number in the archive system.

Type:

int | None

archived_file_name

Original name of the archived file.

Type:

str | None

checksum

Checksum of the original document.

Type:

str | None

content

Full text content of the document.

Type:

str

correspondent_id

ID of the associated correspondent.

Type:

int | None

created

Timestamp when the document was created.

Type:

datetime | None

created_date

Creation date as a string.

Type:

str | None

custom_field_dicts

Custom fields associated with the document.

Type:

list[CustomFieldValues]

deleted_at

Timestamp when the document was deleted, or None.

Type:

datetime | None

document_type_id

ID of the document type.

Type:

int | None

filename

Current filename in the system.

Type:

str | None

is_shared_by_requester

Whether the document is shared by the requester.

Type:

bool

notes

Notes attached to this document.

Type:

list[DocumentNote]

original_filename

Original filename when uploaded.

Type:

str | None

owner

ID of the document owner.

Type:

int | None

page_count

Number of pages in the document.

Type:

int | None

storage_path_id

ID of the storage path.

Type:

int | None

storage_type

Type of storage used.

Type:

DocumentStorageType | None

tag_ids

List of tag IDs associated with this document.

Type:

list[int]

title

Title of the document.

Type:

str

user_can_change

Whether the current user can modify this document.

Type:

bool | None

Examples

>>> document = client.documents().get(pk=1)
>>> document.title = 'Example Document'
>>> document.save()
>>> document.title
'Example Document'

>>> # Get document metadata
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf

>>> # Download document
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)

>>> # Get document suggestions
>>> suggestions = document.get_suggestions()
>>> print(suggestions.tags)
['Invoice', 'Tax', '2023']

Parameters:

data (Any)

class Meta(model)[source]

Bases: Meta

Parameters:

model (type[_Self])

blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {'correspondent': 'correspondent_id', 'custom_fields': 'custom_field_dicts', 'document_type': 'document_type_id', 'storage_path': 'storage_path_id', 'tags': 'tag_ids'}
filtering_disabled: ClassVar[set[str]] = {'deleted_at', 'is_shared_by_requester', 'page_count'}
filtering_fields: ClassVar[set[str]] = {'__search_hit__', '_correspondent', '_document_type', '_resource', '_storage_path', 'added', 'archive_checksum', 'archive_filename', 'archive_serial_number', 'archived_file_name', 'checksum', 'content', 'correspondent_id', 'created', 'created_date', 'custom_field_dicts', 'document_type_id', 'filename', 'id', 'notes', 'original_filename', 'owner', 'storage_path_id', 'storage_type', 'tag_ids', 'title', 'user_can_change'}
filtering_strategies: ClassVar[set[FilteringStrategies]] = {FilteringStrategies.WHITELIST}
read_only_fields: ClassVar[set[str]] = {'archived_file_name', 'deleted_at', 'id', 'is_shared_by_requester', 'page_count'}
supported_filtering_params: ClassVar[set[str]] = {'added__date__gt', 'added__date__lt', 'added__day', 'added__gt', 'added__lt', 'added__month', 'added__year', 'archive_serial_number', 'archive_serial_number__gt', 'archive_serial_number__gte', 'archive_serial_number__isnull', 'archive_serial_number__lt', 'archive_serial_number__lte', 'checksum__icontains', 'checksum__iendswith', 'checksum__iexact', 'checksum__istartswith', 'content__contains', 'content__icontains', 'content__iendswith', 'content__iexact', 'content__istartswith', 'correspondent__id', 'correspondent__id__in', 'correspondent__id__none', 'correspondent__isnull', 'correspondent__name__icontains', 'correspondent__name__iendswith', 'correspondent__name__iexact', 'correspondent__name__istartswith', 'correspondent__slug__iexact', 'created__date__gt', 'created__date__lt', 'created__day', 'created__gt', 'created__lt', 'created__month', 'created__year', 'custom_field_query', 'custom_fields__icontains', 'custom_fields__id__all', 'custom_fields__id__in', 'custom_fields__id__none', 'document_type__id', 'document_type__id__in', 'document_type__id__none', 'document_type__isnull', 'document_type__name__icontains', 'document_type__name__iendswith', 'document_type__name__iexact', 'document_type__name__istartswith', 'has_custom_fields', 'id', 'id__in', 'is_in_inbox', 'is_tagged', 'limit', 'original_filename__icontains', 'original_filename__iendswith', 'original_filename__iexact', 'original_filename__istartswith', 'owner__id', 'owner__id__in', 'owner__id__none', 'owner__isnull', 'shared_by__id', 'shared_by__id__in', 'storage_path__id', 'storage_path__id__in', 'storage_path__id__none', 'storage_path__isnull', 'storage_path__name__icontains', 'storage_path__name__iendswith', 'storage_path__name__iexact', 'storage_path__name__istartswith', 'tags__id', 'tags__id__all', 'tags__id__in', 'tags__id__none', 'tags__name__icontains', 'tags__name__iendswith', 'tags__name__iexact', 'tags__name__istartswith', 'title__icontains', 
'title__iendswith', 'title__iexact', 'title__istartswith', 'title_content'}
model: type[_Self]
name: str
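
The field_map entry above pairs API field names with the attribute names used on the model. The following is a minimal sketch of how such a mapping could be applied to an API payload. The helper apply_field_map is hypothetical and is not part of paperap; only the mapping itself is copied from Document.Meta.

```python
# Mapping copied from Document.Meta.field_map: API key -> model attribute.
FIELD_MAP = {
    "correspondent": "correspondent_id",
    "custom_fields": "custom_field_dicts",
    "document_type": "document_type_id",
    "storage_path": "storage_path_id",
    "tags": "tag_ids",
}


def apply_field_map(payload: dict, field_map: dict) -> dict:
    """Hypothetical helper: rename API keys to model attribute names.

    Keys that are not in the mapping are passed through unchanged.
    """
    return {field_map.get(key, key): value for key, value in payload.items()}


api_payload = {"id": 1, "title": "Example", "correspondent": 5, "tags": [1, 2]}
model_data = apply_field_map(api_payload, FIELD_MAP)
print(model_data)
```

With this payload, the correspondent and tags keys are renamed to correspondent_id and tag_ids, while id and title are left as they are.
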
add_tag(tag)[source]

Add a tag to the document.

Adds a tag to the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.

Parameters:

tag (Tag | int | str) – The tag to add. Can be a Tag object, a tag ID, or a tag name.

Raises:
  • TypeError – If the input value is not a Tag object, an integer, or a string.

  • ResourceNotFoundError – If a tag name is provided but no matching tag is found.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> # Add tag by ID
>>> document.add_tag(5)
>>> # Add tag by object
>>> tag = client.tags().get(3)
>>> document.add_tag(tag)
>>> # Add tag by name
>>> document.add_tag("Invoice")
append_content(value)[source]

Append content to the document.

Adds the specified text to the end of the document’s content, separated by a newline.

Parameters:

value (str) – The content to append.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> document.append_content("Additional notes about this document")
>>> document.save()
property correspondent: Correspondent | None

Get the correspondent for this document.

Retrieves the Correspondent object associated with this document. Uses caching to minimize API requests when accessing the same correspondent multiple times.

Returns:

The correspondent object or None if not set.

Return type:

Correspondent | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.correspondent:
...     print(document.correspondent.name)
Example Correspondent
property custom_field_ids: list[int]

Get the IDs of the custom fields for this document.

Returns:

A list of custom field IDs associated with this document.

Return type:

list[int]

Example

>>> document = client.documents().get(1)
>>> field_ids = document.custom_field_ids
>>> print(field_ids)
[1, 3, 5]
custom_field_value(field_id, default=None, *, raise_errors=False)[source]

Get the value of a custom field by ID.

Retrieves the value of a specific custom field associated with this document.

Parameters:
  • field_id (int) – The ID of the custom field to retrieve.

  • default (Any, optional) – The value to return if the field is not found. Defaults to None.

  • raise_errors (bool, optional) – Whether to raise an error if the field is not found. Defaults to False.

Returns:

The value of the custom field or the default value if not found.

Return type:

Any

Raises:

ValueError – If raise_errors is True and the field is not found.

Example

>>> document = client.documents().get(1)
>>> # Get value with default
>>> due_date = document.custom_field_value(3, default="Not set")
>>> # Get value with error handling
>>> try:
...     reference = document.custom_field_value(5, raise_errors=True)
... except ValueError:
...     print("Reference field not found")
Reference field not found
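
The interaction between default and raise_errors can be sketched without a server. This sketch assumes each custom field entry is a dictionary with "field" (the field ID) and "value" keys. That shape and the function name lookup_custom_field are assumptions made for illustration, not paperap's implementation.

```python
from typing import Any


def lookup_custom_field(
    entries: list,
    field_id: int,
    default: Any = None,
    *,
    raise_errors: bool = False,
) -> Any:
    """Hypothetical lookup: return the value stored for field_id."""
    for entry in entries:
        if entry.get("field") == field_id:
            return entry.get("value")
    # Field not present: either fail loudly or fall back to the default.
    if raise_errors:
        raise ValueError(f"Custom field {field_id} not found")
    return default


# Assumed entry shape: {"field": <custom field ID>, "value": <stored value>}
entries = [
    {"field": 3, "value": "2023-04-15"},
    {"field": 7, "value": "INV-12345"},
]

print(lookup_custom_field(entries, 3))
print(lookup_custom_field(entries, 5, default="Not set"))
```
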
property custom_field_values: list[Any]

Get the values of the custom fields for this document.

Returns:

A list of values for the custom fields associated with this document.

Return type:

list[Any]

Example

>>> document = client.documents().get(1)
>>> values = document.custom_field_values
>>> print(values)
['2023-01-15', 'INV-12345', True]
property custom_fields: CustomFieldQuerySet

Get the custom fields for this document.

Returns a QuerySet of CustomField objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the custom fields are actually accessed.

Returns:

QuerySet of custom fields associated with this document.

Return type:

CustomFieldQuerySet

Example

>>> document = client.documents().get(1)
>>> for field in document.custom_fields:
...     print(f'{field.name}: {field.value}')
Due Date: 2023-04-15
Reference: INV-12345
property document_type: DocumentType | None

Get the document type for this document.

Retrieves the DocumentType object associated with this document. Uses caching to minimize API requests when accessing the same document type multiple times.

Returns:

The document type object or None if not set.

Return type:

DocumentType | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.document_type:
...     print(document.document_type.name)
Example Document Type
download(original=False)[source]

Download the document file.

Downloads either the archived version (default) or the original version of the document from the Paperless-ngx server.

Parameters:

original (bool, optional) – Whether to download the original file instead of the archived version. Defaults to False (download the archived version).

Returns:

An object containing the downloaded document content and metadata.

Return type:

DownloadedDocument

Examples

>>> # Download archived version
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)
>>> # Download original version
>>> original = document.download(original=True)
>>> print(f"Downloaded {len(original.content)} bytes")
Downloaded 245367 bytes
get_metadata()[source]

Get the metadata for this document.

Retrieves detailed metadata about the document from the Paperless-ngx API. This includes information like the original file format, creation date, modification date, and other technical details.

Returns:

The document metadata object.

Return type:

DocumentMetadata

Examples

>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf
>>> print(metadata.media_filename)
document.pdf
get_suggestions()[source]

Get suggestions for this document.

Retrieves AI-generated suggestions for document metadata from the Paperless-ngx server. This can include suggested tags, correspondent, document type, and other metadata based on the document’s content.

Returns:

An object containing suggested metadata for the document.

Return type:

DocumentSuggestions

Examples

>>> suggestions = document.get_suggestions()
>>> print(f"Suggested tags: {suggestions.tags}")
Suggested tags: [{'name': 'Invoice', 'score': 0.95}, {'name': 'Utility', 'score': 0.87}]
>>> print(f"Suggested correspondent: {suggestions.correspondent}")
Suggested correspondent: {'name': 'Electric Company', 'score': 0.92}
>>> print(f"Suggested document type: {suggestions.document_type}")
Suggested document type: {'name': 'Bill', 'score': 0.89}
property has_search_hit: bool

Check if this document has search hit information.

Returns:

True if this document was returned as part of a search result and has search hit information, False otherwise.

Return type:

bool

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

We need to both initialize private attributes and call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

preview(original=False)[source]

Get a preview of the document.

Retrieves a preview version of the document from the Paperless-ngx server. This is typically a web-friendly version (e.g., PDF) that can be displayed in a browser.

Parameters:

original (bool, optional) – Whether to preview the original file instead of the archived version. Defaults to False (preview the archived version).

Returns:

An object containing the preview document content and metadata.

Return type:

DownloadedDocument

Example

>>> preview = document.preview()
>>> with open('preview.pdf', 'wb') as f:
...     f.write(preview.content)
remove_tag(tag)[source]

Remove a tag from the document.

Removes a tag from the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.

Parameters:

tag (Tag | int | str) – The tag to remove. Can be a Tag object, a tag ID, or a tag name.

Raises:
  • TypeError – If the input value is not a Tag object, an integer, or a string.

  • ResourceNotFoundError – If a tag name is provided but no matching tag is found.

  • ValueError – If the tag is not associated with this document.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> # Remove tag by ID
>>> document.remove_tag(5)
>>> # Remove tag by object
>>> tag = client.tags().get(3)
>>> document.remove_tag(tag)
>>> # Remove tag by name
>>> document.remove_tag("Invoice")
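
The three accepted input types and the documented errors can be illustrated with a standalone sketch. FakeTag, resolve_tag_id, and remove_tag_id are hypothetical names, and LookupError stands in for ResourceNotFoundError. The real method resolves tag names through the Paperless-ngx API rather than a local list.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeTag:
    """Hypothetical stand-in for paperap's Tag model."""

    id: int
    name: str


def resolve_tag_id(tag: Union[FakeTag, int, str], known_tags: list) -> int:
    """Turn a tag object, tag ID, or tag name into a tag ID."""
    if isinstance(tag, FakeTag):
        return tag.id
    if isinstance(tag, int):
        return tag
    if isinstance(tag, str):
        for candidate in known_tags:
            if candidate.name == tag:
                return candidate.id
        # Stand-in for ResourceNotFoundError in this sketch.
        raise LookupError(f"No tag named {tag!r}")
    raise TypeError(f"Expected a tag object, int, or str, got {type(tag).__name__}")


def remove_tag_id(tag_ids: list, tag: Union[FakeTag, int, str], known_tags: list) -> None:
    """Remove the resolved ID in place; list.remove raises ValueError if absent."""
    tag_ids.remove(resolve_tag_id(tag, known_tags))


known = [FakeTag(3, "Receipt"), FakeTag(5, "Invoice")]
tag_ids = [3, 5, 8]
remove_tag_id(tag_ids, "Invoice", known)  # by name
remove_tag_id(tag_ids, known[0], known)   # by object
print(tag_ids)
```
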
property search_hit: dict[str, Any] | None

Get the search hit information for this document.

When a document is returned as part of a search result, this property contains additional information about the search match.

Returns:

Dictionary with search hit information or None if this document was not part of a search result.

Return type:

dict[str, Any] | None

serialize_datetime(value)[source]

Serialize datetime fields to ISO format.

Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.

Parameters:

value (datetime | None) – The datetime value to serialize.

Returns:

The serialized datetime value as an ISO 8601 string, or None.

Return type:

str | None
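
As a rough illustration of the conversion described above, the standard library's datetime.isoformat() produces an ISO 8601 string. The function name to_iso is hypothetical, and this sketch does not claim to match the serializer's exact output format.

```python
from datetime import datetime, timezone
from typing import Optional


def to_iso(value: Optional[datetime]) -> Optional[str]:
    """Hypothetical equivalent: ISO 8601 string for a datetime, None for None."""
    return value.isoformat() if value is not None else None


stamp = datetime(2023, 1, 15, 14, 30, 22, tzinfo=timezone.utc)
print(to_iso(stamp))  # 2023-01-15T14:30:22+00:00
print(to_iso(None))   # None
```
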

serialize_notes(value)[source]

Serialize notes to a list of dictionaries.

Converts DocumentNote objects to dictionaries for JSON serialization. Returns an empty list if the input value is None or empty.

Parameters:

value (list[DocumentNote]) – The list of DocumentNote objects to serialize.

Returns:

A list of dictionaries representing the notes.

Return type:

list[dict[str, Any]]

property storage_path: StoragePath | None

Get the storage path for this document.

Retrieves the StoragePath object associated with this document. Uses caching to minimize API requests when accessing the same storage path multiple times.

Returns:

The storage path object or None if not set.

Return type:

StoragePath | None

Examples

>>> document = client.documents().get(pk=1)
>>> if document.storage_path:
...     print(document.storage_path.name)
Example Storage Path
property tag_names: list[str]

Get the names of the tags for this document.

Returns:

A list of tag names associated with this document.

Return type:

list[str]

Example

>>> document = client.documents().get(1)
>>> names = document.tag_names
>>> print(names)
['Invoice', 'Tax', 'Important']
property tags: TagQuerySet

Get the tags for this document.

Returns a QuerySet of Tag objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the tags are actually accessed.

Returns:

QuerySet of tags associated with this document.

Return type:

TagQuerySet

Examples

>>> document = client.documents().get(pk=1)
>>> for tag in document.tags:
...     print(f'{tag.name} # {tag.id}')
Tag 1 # 1
Tag 2 # 2
Tag 3 # 3
>>> if 5 in document.tags:
...     print('Tag ID #5 is associated with this document')
>>> tag = client.tags().get(pk=1)
>>> if tag in document.tags:
...     print('Tag ID #1 is associated with this document')
>>> filtered_tags = document.tags.filter(name__icontains='example')
>>> for tag in filtered_tags:
...     print(f'{tag.name} # {tag.id}')
thumbnail(original=False)[source]

Get the document thumbnail.

Retrieves a thumbnail image of the document from the Paperless-ngx server. This is typically a small image representation of the first page.

Parameters:

original (bool, optional) – Whether to get the thumbnail of the original file instead of the archived version. Defaults to False (get thumbnail of archived version).

Returns:

An object containing the thumbnail image content and metadata.

Return type:

DownloadedDocument

Example

>>> thumbnail = document.thumbnail()
>>> with open('thumbnail.png', 'wb') as f:
...     f.write(thumbnail.content)
update_locally(from_db=None, **kwargs)[source]

Update the document locally with the provided data.

Updates the document’s attributes with the provided data without sending an API request. Handles special cases for notes and tags, which cannot be set to None in Paperless-ngx if they already have values.

Parameters:
  • from_db (bool | None, optional) – Whether the update is coming from the database. If True, bypasses certain validation checks. Defaults to None.

  • **kwargs (Any) – Additional data to update the document with.

Raises:

NotImplementedError – If attempting to set notes or tags to None when they are not already None and from_db is False.

Return type:

None

Example

>>> document = client.documents().get(1)
>>> document.update_locally(title="New Title", correspondent_id=5)
>>> document.save()
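
The guard described under Raises can be sketched as a standalone check. The key names "notes" and "tags" and the function check_local_update are assumptions for illustration. The actual method may use different attribute names and also applies the update itself.

```python
from typing import Any, Optional

GUARDED_FIELDS = ("notes", "tags")  # assumed key names for this sketch


def check_local_update(
    current: dict,
    updates: dict,
    from_db: Optional[bool] = None,
) -> None:
    """Reject clearing a guarded field unless the data comes from the server."""
    if from_db:
        return
    for name in GUARDED_FIELDS:
        clearing = name in updates and updates[name] is None
        if clearing and current.get(name) is not None:
            raise NotImplementedError(
                f"Cannot set {name} to None once it has a value"
            )


current: dict = {"title": "Old", "tags": [1, 2], "notes": None}
check_local_update(current, {"title": "New Title"})        # allowed
check_local_update(current, {"notes": None})               # allowed: already None
check_local_update(current, {"tags": None}, from_db=True)  # allowed: server data
```

Calling check_local_update(current, {"tags": None}) without from_db=True raises NotImplementedError in this sketch, mirroring the documented behaviour.
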
classmethod validate_custom_fields(value)[source]

Validate and return custom field dictionaries.

Ensures custom fields are properly formatted as a list of CustomFieldValues. Returns an empty list if the input value is None.

Parameters:

value (Any) – The list of custom field dictionaries to validate.

Returns:

A list of validated custom field dictionaries.

Return type:

list[CustomFieldValues]

Raises:

TypeError – If the input value is not None or a list.

classmethod validate_is_shared_by_requester(value)[source]

Validate and return the is_shared_by_requester flag.

Ensures the is_shared_by_requester flag is properly formatted as a boolean. Returns False if the input value is None.

Parameters:

value (Any) – The flag to validate.

Returns:

The validated flag.

Return type:

bool

Raises:

TypeError – If the input value is not None or a boolean.

classmethod validate_notes(value)[source]

Validate and return the list of notes.

Ensures notes are properly formatted as a list of DocumentNote objects. Handles various input formats including None, single DocumentNote objects, and lists.

Parameters:

value (Any) – The list of notes to validate.

Returns:

The validated list of notes.

Return type:

list[Any]

Raises:

TypeError – If the input value is not None, a DocumentNote, or a list.

classmethod validate_tags(value)[source]

Validate and convert tag IDs to a list of integers.

Ensures tag IDs are properly formatted as a list of integers. Handles various input formats including None, single integers, and lists.

Parameters:

value (Any) – The tag IDs to validate, which can be None, an integer, or a list.

Returns:

A list of validated tag IDs.

Return type:

list[int]

Raises:

TypeError – If the input value is not None, an integer, or a list.

Examples

>>> Document.validate_tags(None)
[]
>>> Document.validate_tags(5)
[5]
>>> Document.validate_tags([1, 2, 3])
[1, 2, 3]
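
The normalization rules above (None becomes an empty list, a single integer becomes a one-element list, anything else that is not a list is rejected) can be written as a plain function. The name normalize_tag_ids is hypothetical. The real validator runs inside the model's validation and may handle further cases.

```python
from typing import Any


def normalize_tag_ids(value: Any) -> list:
    """Hypothetical equivalent of the documented tag ID normalization."""
    if value is None:
        return []
    if isinstance(value, int):
        return [value]
    if isinstance(value, list):
        return [int(item) for item in value]
    raise TypeError(f"Expected None, int, or list, got {type(value).__name__}")


print(normalize_tag_ids(None))       # []
print(normalize_tag_ids(5))          # [5]
print(normalize_tag_ids([1, 2, 3]))  # [1, 2, 3]
```
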
classmethod validate_text(value)[source]

Validate and return a text field.

Ensures text fields are properly formatted as strings. Converts integers to strings and returns an empty string if the input value is None.

Parameters:

value (Any) – The value of the text field to validate.

Returns:

The validated text value.

Return type:

str

Raises:

TypeError – If the input value is not None, a string, or an integer.

Examples

>>> Document.validate_text(None)
''
>>> Document.validate_text("Hello")
'Hello'
>>> Document.validate_text(123)
'123'
added: datetime | None
archive_checksum: str | None
archive_filename: str | None
archive_serial_number: int | None
archived_file_name: str | None
checksum: str | None
content: str
correspondent_id: int | None
created: datetime | None
created_date: str | None
custom_field_dicts: Annotated[list[CustomFieldValues], Field(default_factory=list)]
deleted_at: datetime | None
document_type_id: int | None
filename: str | None
is_shared_by_requester: bool
notes: list[DocumentNote]
original_filename: str | None
owner: int | None
page_count: int | None
storage_path_id: int | None
storage_type: DocumentStorageType | None
tag_ids: Annotated[list[int], Field(default_factory=list)]
title: str
user_can_change: bool | None
id: int
class paperap.models.document.DocumentNote(**data)[source]

Bases: StandardModel

Represent a note on a Paperless-ngx document.

This class models user-created notes that can be attached to documents in the Paperless-ngx system. Notes include information about when they were created, who created them, and their content.

deleted_at

Timestamp when the note was deleted, or None if not deleted.

Type:

datetime | None

restored_at

Timestamp when the note was restored after deletion, or None.

Type:

datetime | None

transaction_id

ID of the transaction that created or modified this note.

Type:

int | None

note

The text content of the note.

Type:

str

created

Timestamp when the note was created.

Type:

datetime

document

ID of the document this note is attached to.

Type:

int

user

ID of the user who created this note.

Type:

int

Examples

>>> note = client.document_notes().get(1)
>>> print(note.note)
This is an important document
>>> print(note.created)
2023-01-15 14:30:22
Parameters:

data (Any)

class Meta(model)[source]

Bases: Meta

Parameters:

model (type[_Self])

blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {}
filtering_disabled: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'_resource', 'created', 'deleted_at', 'document', 'id', 'note', 'restored_at', 'transaction_id', 'user'}
read_only_fields: ClassVar[set[str]] = {'created', 'deleted_at', 'id', 'restored_at', 'transaction_id'}
supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
model: type[_Self]
name: str
get_document()[source]

Get the document associated with this note.

Retrieves the full Document object that this note is attached to by making an API request using the document ID.

Returns:

The document associated with this note.

Return type:

Document

Example

>>> note = client.document_notes().get(1)
>>> document = note.get_document()
>>> print(document.title)
Invoice #12345
get_user()[source]

Get the user who created this note.

Retrieves the full User object for the user who created this note by making an API request using the user ID.

Returns:

The user who created this note.

Return type:

User

Example

>>> note = client.document_notes().get(1)
>>> user = note.get_user()
>>> print(user.username)
admin
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(context: Any, /) → None

We need to both initialize private attributes and call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

serialize_datetime(value)[source]

Serialize datetime fields to ISO format.

Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.

Parameters:

value (datetime | None) – The datetime value to serialize.

Returns:

The serialized datetime value as an ISO 8601 string, or None if the value is None.

Return type:

str | None

deleted_at: datetime | None
restored_at: datetime | None
transaction_id: int | None
note: str
created: datetime
document: int
user: int
id: int
class paperap.models.document.DocumentQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]

Bases: StandardQuerySet[Document], HasOwner

QuerySet for Paperless-ngx documents with specialized filtering methods.

This class extends StandardQuerySet to provide document-specific filtering, searching, and bulk operations. It includes methods for filtering by document metadata, content, custom fields, and more, as well as bulk operations like merging, rotating, and updating document properties.

Examples

>>> # Search for documents
>>> docs = client.documents().search("invoice")
>>> for doc in docs:
...     print(doc.title)
>>> # Find documents similar to a specific document
>>> similar_docs = client.documents().more_like(42)
>>> for doc in similar_docs:
...     print(doc.title)
>>> # Filter by correspondent and document type
>>> filtered_docs = client.documents().correspondent(5).document_type("Invoice")
>>> for doc in filtered_docs:
...     print(f"{doc.title} - {doc.created}")
Parameters:
  • resource (BaseResource[_Model, Self])

  • filters (dict[str, Any] | None)

  • _cache (list[_Model] | None)

  • _fetch_all (bool)

  • _next_url (str | None)

  • _last_response (ClientResponse)

  • _iter (Iterator[_Model] | None)

  • _urls_fetched (list[str] | None)

add_tag(tag_id)[source]

Add a tag to all documents in the current queryset.

This is a convenience method that calls modify_tags with a single tag ID to add.

Parameters:

tag_id (int) – Tag ID to add to all documents in the queryset.

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Add tag 3 to all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).add_tag(3)
>>>
>>> # Add tag to documents from a specific correspondent
>>> client.documents().correspondent_name("Electric Company").add_tag(5)
added_after(date_str)[source]

Filter documents added after the specified date.

Parameters:

date_str (str) – ISO format date string (YYYY-MM-DD).

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents added after January 1, 2023
>>> docs = client.documents().added_after("2023-01-01")
added_before(date_str)[source]

Filter documents added before the specified date.

Parameters:

date_str (str) – ISO format date string (YYYY-MM-DD).

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents added before December 31, 2022
>>> docs = client.documents().added_before("2022-12-31")
asn(value, *, exact=True, case_insensitive=True)[source]

Filter documents by archive serial number (ASN).

The archive serial number is a unique identifier assigned to documents in Paperless-ngx, often used for referencing physical documents.

Parameters:
  • value (str) – The archive serial number to filter by.

  • exact (bool) – If True, match the exact value, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().asn("2023-0042")
>>>
>>> # Contains match
>>> docs = client.documents().asn("2023", exact=False)
content(text)[source]

Filter documents whose content contains the specified text.

This method searches the OCR-extracted text content of documents.

Parameters:

text (str) – The text to search for in document content.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents containing "invoice amount"
>>> docs = client.documents().content("invoice amount")
>>>
>>> # Chain with other filters
>>> recent_with_text = client.documents().content("tax").created_after("2023-01-01")
correspondent(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]

Filter documents by correspondent.

This method provides a flexible interface for filtering documents by correspondent, supporting filtering by ID, name, or slug, with options for exact or partial matching.

Any number of filter arguments can be provided, but at least one must be specified.

Parameters:
  • value (int | str | None) – The correspondent ID or name to filter by.

  • exact (bool) – If True, match the exact value, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching for string values.

  • **kwargs (Any) – Additional filters (slug, id, name).

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Raises:
  • ValueError – If no valid filters are provided.

  • TypeError – If value is not an int or str.

Examples

>>> # Filter by ID
>>> client.documents().correspondent(1)
>>> client.documents().correspondent(id=1)
>>>
>>> # Filter by name
>>> client.documents().correspondent("John Doe")
>>> client.documents().correspondent(name="John Doe")
>>>
>>> # Filter by name (partial match)
>>> client.documents().correspondent("John", exact=False)
>>>
>>> # Filter by slug
>>> client.documents().correspondent(slug="john-doe")
>>>
>>> # Filter by ID and name
>>> client.documents().correspondent(1, name="John Doe")
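
One way to picture how these arguments might translate into filter parameters is the sketch below. It builds a dictionary of lookup keys taken from Document.Meta.supported_filtering_params. The function correspondent_lookups and the exact mapping are assumptions, not paperap's implementation. Only case-insensitive lookups are covered, because those are the only correspondent name lookups listed there.

```python
from typing import Any, Union


def correspondent_lookups(
    value: Union[int, str, None] = None,
    *,
    exact: bool = True,
    **kwargs: Any,
) -> dict:
    """Hypothetical mapping from call arguments to filter parameters."""
    lookups: dict = {}
    name_suffix = "iexact" if exact else "icontains"

    if value is not None:
        if isinstance(value, int):
            lookups["correspondent__id"] = value
        elif isinstance(value, str):
            lookups[f"correspondent__name__{name_suffix}"] = value
        else:
            raise TypeError(f"Expected int or str, got {type(value).__name__}")

    # Keyword filters named in the parameter list: id, name, slug.
    if "id" in kwargs:
        lookups["correspondent__id"] = kwargs["id"]
    if "name" in kwargs:
        lookups[f"correspondent__name__{name_suffix}"] = kwargs["name"]
    if "slug" in kwargs:
        lookups["correspondent__slug__iexact"] = kwargs["slug"]

    if not lookups:
        raise ValueError("At least one correspondent filter is required")
    return lookups


print(correspondent_lookups(1))
print(correspondent_lookups("John", exact=False))
print(correspondent_lookups(1, name="John Doe"))
```
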
correspondent_id(correspondent_id)[source]

Filter documents by correspondent ID.

Parameters:

correspondent_id (int) – The correspondent ID to filter by.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Filter documents with correspondent ID 5
>>> docs = client.documents().correspondent_id(5)
correspondent_name(name, *, exact=True, case_insensitive=True)[source]

Filter documents by correspondent name.

Parameters:
  • name (str) – The correspondent name to filter by.

  • exact (bool) – If True, match the exact name, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().correspondent_name("Electric Company")
>>>
>>> # Contains match
>>> docs = client.documents().correspondent_name("Electric", exact=False)
correspondent_slug(slug, *, exact=True, case_insensitive=True)[source]

Filter documents by correspondent slug.

Parameters:
  • slug (str) – The correspondent slug to filter by.

  • exact (bool) – If True, match the exact slug, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().correspondent_slug("electric-company")
>>>
>>> # Contains match
>>> docs = client.documents().correspondent_slug("electric", exact=False)
created_after(date)[source]

Filter documents created after a given date.

This method filters documents based on their creation date in Paperless-ngx.

Parameters:

date (datetime | str) – The date to filter by. Can be a datetime object or an ISO format string.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Using string date
>>> docs = client.documents().created_after("2023-01-01")
>>>
>>> # Using datetime object
>>> from datetime import datetime
>>> date = datetime(2023, 1, 1)
>>> docs = client.documents().created_after(date)
created_before(date)[source]

Filter documents created before a given date.

This method filters documents based on their creation date in Paperless-ngx.

Parameters:

date (datetime | str) – The date to filter by. Can be a datetime object or an ISO format string.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Using string date
>>> docs = client.documents().created_before("2023-01-01")
>>>
>>> # Using datetime object
>>> from datetime import datetime
>>> date = datetime(2023, 1, 1)
>>> docs = client.documents().created_before(date)
created_between(start, end)[source]

Filter documents created between two dates.

This method filters documents with creation dates falling within the specified range.

Parameters:
  • start (datetime | str) – The start date to filter by. Can be a datetime object or an ISO format string.

  • end (datetime | str) – The end date to filter by. Can be a datetime object or an ISO format string.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Using string dates
>>> docs = client.documents().created_between("2023-01-01", "2023-12-31")
>>>
>>> # Using datetime objects
>>> from datetime import datetime
>>> start = datetime(2023, 1, 1)
>>> end = datetime(2023, 12, 31)
>>> docs = client.documents().created_between(start, end)
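Because the date filters accept either a datetime or an ISO-format string, it can help to normalize both bounds before building a range. The following is a minimal sketch of such a normalization step, not the library's own code; to_iso_date and date_range are hypothetical helpers, and rejecting an inverted range is a choice made here for illustration.

```python
from datetime import datetime
from typing import Tuple, Union


def to_iso_date(value: Union[datetime, str]) -> str:
    """Return a YYYY-MM-DD string for a datetime or an ISO-format string."""
    if isinstance(value, datetime):
        return value.date().isoformat()
    # Parsing validates the string; a malformed date raises ValueError here.
    return datetime.fromisoformat(value).date().isoformat()


def date_range(start: Union[datetime, str],
               end: Union[datetime, str]) -> Tuple[str, str]:
    """Normalize both bounds and reject a range whose start is after its end."""
    lo, hi = to_iso_date(start), to_iso_date(end)
    if lo > hi:  # ISO dates sort correctly as plain strings
        raise ValueError(f"start {lo} is after end {hi}")
    return lo, hi


# Mixed input types are fine once both sides are normalized.
bounds = date_range("2023-01-01", datetime(2023, 12, 31))
print(bounds)
```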
custom_field(field, value, *, exact=False, case_insensitive=True)[source]

Filter documents by custom field.

This method filters documents based on a specific custom field’s value.

Parameters:
  • field (str) – The name of the custom field to filter by.

  • value (Any) – The value to filter by.

  • exact (bool) – If True, match the exact value, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching for string values.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Contains match (default)
>>> docs = client.documents().custom_field("Reference Number", "INV")
>>>
>>> # Exact match
>>> docs = client.documents().custom_field("Status", "Paid", exact=True)
>>>
>>> # Case-sensitive contains match
>>> docs = client.documents().custom_field("Notes", "Important",
...                                      case_insensitive=False)
custom_field_contains(field, values)[source]

Filter documents with a custom field that contains all specified values.

This method is useful for custom fields that can hold multiple values, such as array or list-type fields.

Parameters:
  • field (str) – The name of the custom field.

  • values (list[Any]) – The list of values that the field should contain.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents with tags field containing both "important" and "tax"
>>> docs = client.documents().custom_field_contains("Tags", ["important", "tax"])
>>>
>>> # Find documents with categories containing specific values
>>> docs = client.documents().custom_field_contains("Categories", ["Finance", "Tax"])
custom_field_exact(field, value)[source]

Filter documents with a custom field value that matches exactly.

This method is a shorthand for custom_field_query with the “exact” operation.

Parameters:
  • field (str) – The name of the custom field.

  • value (Any) – The exact value to match.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Match exact status
>>> docs = client.documents().custom_field_exact("Status", "Paid")
>>>
>>> # Match exact date
>>> docs = client.documents().custom_field_exact("Due Date", "2023-04-15")
custom_field_exists(field, exists=True)[source]

Filter documents based on the existence of a custom field.

This method is a shorthand for custom_field_query with the “exists” operation.

Parameters:
  • field (str) – The name of the custom field.

  • exists (bool) – True to filter documents where the field exists, False otherwise.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents that have the "Priority" field
>>> docs = client.documents().custom_field_exists("Priority")
>>>
>>> # Find documents that don't have the "Review Date" field
>>> docs = client.documents().custom_field_exists("Review Date", exists=False)
custom_field_fullsearch(value, *, case_insensitive=True)[source]

Filter documents by searching through both custom field name and value.

This method searches across all custom fields (both names and values) for the specified text.

Parameters:
  • value (str) – The search string to look for in custom fields.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Raises:

NotImplementedError – If case_insensitive is False, as Paperless-ngx doesn’t support case-sensitive custom field search.

Examples

>>> # Find documents with custom fields containing "reference"
>>> docs = client.documents().custom_field_fullsearch("reference")
custom_field_in(field, values)[source]

Filter documents with a custom field value in a list of values.

This method is a shorthand for custom_field_query with the “in” operation.

Parameters:
  • field (str) – The name of the custom field.

  • values (list[Any]) – The list of values to match against.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Match documents with status in a list
>>> docs = client.documents().custom_field_in("Status", ["Paid", "Pending"])
>>>
>>> # Match documents with specific reference numbers
>>> docs = client.documents().custom_field_in("Reference", ["INV-001", "INV-002", "INV-003"])
custom_field_isnull(field)[source]

Filter documents with a custom field that is null or empty.

This method finds documents where the specified custom field either doesn’t exist or has an empty value.

Parameters:

field (str) – The name of the custom field.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents with missing or empty "Status" field
>>> docs = client.documents().custom_field_isnull("Status")
>>>
>>> # Chain with other filters
>>> recent_missing_field = client.documents().custom_field_isnull("Priority").created_after("2023-01-01")
custom_field_query(*args, **kwargs)[source]

Filter documents by custom field query.

Return type:

Self
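
Several methods earlier in this class (custom_field_exact, custom_field_exists, custom_field_in) are described as shorthand for custom_field_query with a fixed operation. A minimal sketch of that relationship follows, assuming the query is a [field, operation, value] triple serialized as JSON. The triple layout and all helper names are assumptions for illustration, since the signature here only shows *args and **kwargs.

```python
import json
from typing import Any, Iterable, List


def field_query(field: str, operation: str, value: Any) -> List[Any]:
    """Build one [field, operation, value] triple (assumed layout)."""
    return [field, operation, value]


def exact(field: str, value: Any) -> List[Any]:
    """Shorthand with the operation fixed to "exact"."""
    return field_query(field, "exact", value)


def exists(field: str, present: bool = True) -> List[Any]:
    """Shorthand with the operation fixed to "exists"."""
    return field_query(field, "exists", present)


def in_(field: str, values: Iterable[Any]) -> List[Any]:
    """Shorthand with the operation fixed to "in"."""
    return field_query(field, "in", list(values))


# Serialize the triple the way a query-string parameter might carry it.
encoded = json.dumps(in_("Status", ["Paid", "Pending"]))
print(encoded)
```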

custom_field_range(field, start, end)[source]

Filter documents with a custom field value within a specified range.

This is particularly useful for date or numeric custom fields.

Parameters:
  • field (str) – The name of the custom field.

  • start (str) – The start value of the range.

  • end (str) – The end value of the range.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Date range
>>> docs = client.documents().custom_field_range("Invoice Date", "2023-01-01", "2023-12-31")
>>>
>>> # Numeric range
>>> docs = client.documents().custom_field_range("Amount", "100", "500")
delete()[source]

Delete all documents in the current queryset.

This method performs a bulk delete operation on all documents matching the current queryset filters.

Returns:

The API response from the bulk delete operation. None if there are no documents to delete.

Return type:

ClientResponse

Examples

>>> # Delete all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).delete()
>>>
>>> # Delete old documents
>>> client.documents().created_before("2020-01-01").delete()
document_type(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]

Filter documents by document type.

This method provides a flexible interface for filtering documents by document type, supporting filtering by ID or name, with options for exact or partial matching.

Any number of filter arguments can be provided, but at least one must be specified.

Parameters:
  • value (int | str | None) – The document type ID or name to filter by.

  • exact (bool) – If True, match the exact value, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching for string values.

  • **kwargs (Any) – Additional filters (id, name).

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Raises:
  • ValueError – If no valid filters are provided.

  • TypeError – If value is not an int or str.

Examples

>>> # Filter by ID
>>> client.documents().document_type(1)
>>> client.documents().document_type(id=1)
>>>
>>> # Filter by name
>>> client.documents().document_type("Invoice")
>>> client.documents().document_type(name="Invoice")
>>>
>>> # Filter by name (partial match)
>>> client.documents().document_type("Inv", exact=False)
>>>
>>> # Filter by ID and name
>>> client.documents().document_type(1, name="Invoice")
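The positional value can be an ID or a name, and keyword filters can be combined with it. The sketch below shows one way such dispatch could work, including the documented ValueError and TypeError cases. resolve_filters is a hypothetical helper written for this illustration and is not the library's implementation.

```python
from typing import Any, Dict, Union


def resolve_filters(value: Union[int, str, None] = None,
                    **kwargs: Any) -> Dict[str, Any]:
    """Turn a positional ID or name plus keyword filters into one dict."""
    filters: Dict[str, Any] = {}
    if value is not None:
        if not isinstance(value, (int, str)):
            raise TypeError(f"expected int or str, got {type(value).__name__}")
        # An int is treated as an ID, a str as a name.
        filters["id" if isinstance(value, int) else "name"] = value
    # Only the documented keyword filters (id, name) are accepted here.
    for key in ("id", "name"):
        if kwargs.get(key) is not None:
            filters[key] = kwargs[key]
    if not filters:
        raise ValueError("at least one filter is required")
    return filters


combined = resolve_filters(1, name="Invoice")
print(combined)
```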
document_type_id(document_type_id)[source]

Filter documents by document type ID.

Parameters:

document_type_id (int) – The document type ID to filter by.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Filter documents with document type ID 3
>>> docs = client.documents().document_type_id(3)
document_type_name(name, *, exact=True, case_insensitive=True)[source]

Filter documents by document type name.

Parameters:
  • name (str) – The document type name to filter by.

  • exact (bool) – If True, match the exact name, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().document_type_name("Invoice")
>>>
>>> # Contains match
>>> docs = client.documents().document_type_name("bill", exact=False)
has_custom_field_id(pk, *, exact=False)[source]

Filter documents that have a custom field with the specified ID(s).

This method filters documents based on the presence of specific custom fields, regardless of their values.

Parameters:
  • pk (int | list[int]) – A single custom field ID or list of custom field IDs.

  • exact (bool) – If True, return results that have exactly these IDs and no others. If False, return results that have at least these IDs.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Documents with custom field ID 5
>>> docs = client.documents().has_custom_field_id(5)
>>>
>>> # Documents with custom field IDs 5 and 7
>>> docs = client.documents().has_custom_field_id([5, 7])
>>>
>>> # Documents with exactly custom field IDs 5 and 7 and no others
>>> docs = client.documents().has_custom_field_id([5, 7], exact=True)
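The difference between exact=True and exact=False can be stated with set operations: "at least these IDs" is a subset test, while "exactly these IDs" is set equality. The helper below is a hypothetical client-side illustration of those semantics; the actual filtering is performed by the server.

```python
from typing import Iterable, List, Union


def has_fields(document_field_ids: Iterable[int],
               pk: Union[int, List[int]],
               *, exact: bool = False) -> bool:
    """Check a document's custom field IDs against the requested ID(s)."""
    wanted = {pk} if isinstance(pk, int) else set(pk)
    present = set(document_field_ids)
    # exact: the sets must be identical; otherwise wanted must be a subset.
    return present == wanted if exact else wanted <= present


# A document carrying fields 5, 7 and 9 has "at least" 5 and 7 ...
at_least = has_fields([5, 7, 9], [5, 7])
# ... but not "exactly" 5 and 7, because 9 is also present.
exactly = has_fields([5, 7, 9], [5, 7], exact=True)
print(at_least, exactly)
```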
has_custom_fields()[source]

Filter documents that have any custom fields.

This method returns documents that have at least one custom field defined, regardless of the field name or value.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find all documents with any custom fields
>>> docs = client.documents().has_custom_fields()
merge(metadata_document_id=None, delete_originals=False)[source]

Merge all documents in the current queryset into a single document.

This method combines multiple documents into a single PDF document.

Parameters:
  • metadata_document_id (int | None) – Apply metadata from this document to the merged document. If None, metadata from the first document will be used.

  • delete_originals (bool) – Whether to delete the original documents after merging.

Returns:

True if submitting the merge succeeded, False if there are no documents to merge.

Return type:

bool

Examples

>>> # Merge all documents with tag "merge_me"
>>> client.documents().tag_name("merge_me").merge(delete_originals=True)
>>>
>>> # Merge documents and use metadata from document ID 42
>>> client.documents().correspondent_name("Electric Company").merge(
...     metadata_document_id=42, delete_originals=False
... )
modify_custom_fields(add_custom_fields=None, remove_custom_fields=None)[source]

Modify custom fields on all documents in the current queryset.

This method allows for adding, updating, and removing custom fields in bulk for all documents matching the current queryset filters.

Parameters:
  • add_custom_fields (dict[int, Any] | None) – Dictionary of custom field ID to value pairs to add or update.

  • remove_custom_fields (list[int] | None) – List of custom field IDs to remove.

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Add a custom field to documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).modify_custom_fields(
...     add_custom_fields={5: "Processed"}
... )
>>>
>>> # Add one field and remove another
>>> client.documents().correspondent_id(3).modify_custom_fields(
...     add_custom_fields={7: "2023-04-15"},
...     remove_custom_fields=[9]
... )
modify_tags(add_tags=None, remove_tags=None)[source]

Modify tags on all documents in the current queryset.

This method allows for adding and removing tags in bulk for all documents matching the current queryset filters.

Parameters:
  • add_tags (list[int] | None) – List of tag IDs to add to the documents.

  • remove_tags (list[int] | None) – List of tag IDs to remove from the documents.

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Add tag 3 and remove tag 4 from all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).modify_tags(
...     add_tags=[3], remove_tags=[4]
... )
>>>
>>> # Add multiple tags to recent documents
>>> from datetime import datetime, timedelta
>>> month_ago = datetime.now() - timedelta(days=30)
>>> client.documents().created_after(month_ago.strftime("%Y-%m-%d")).modify_tags(
...     add_tags=[5, 7, 9]
... )
more_like(document_id)[source]

Find documents similar to the specified document.

Uses Paperless-ngx’s similarity algorithm to find documents with content or metadata similar to the specified document.

Parameters:

document_id (int) – The ID of the document to find similar documents for.

Return type:

DocumentQuerySet

Returns:

A queryset with similar documents.

Examples

>>> # Find documents similar to document with ID 42
>>> similar_docs = client.documents().more_like(42)
>>> for doc in similar_docs:
...     print(doc.title)
>>>
>>> # Chain with other filters
>>> recent_similar = client.documents().more_like(42).created_after("2023-01-01")
no_custom_fields()[source]

Filter documents that do not have any custom fields.

This method returns documents that have no custom fields defined.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find all documents without any custom fields
>>> docs = client.documents().no_custom_fields()
notes(text)[source]

Filter documents whose notes contain the specified text.

This method searches through the document notes for the specified text.

Parameters:

text (str) – The text to search for in document notes.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents with "follow up" in notes
>>> docs = client.documents().notes("follow up")
>>>
>>> # Chain with other filters
>>> important_notes = client.documents().notes("important").tag_name("tax")
original_filename(name, *, exact=True, case_insensitive=True)[source]

Filter documents by original file name.

This filters based on the original filename of the document when it was uploaded.

Parameters:
  • name (str) – The original file name to filter by.

  • exact (bool) – If True, match the exact name, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().original_filename("scan_001.pdf")
>>>
>>> # Contains match
>>> docs = client.documents().original_filename("invoice", exact=False)
remove_tag(tag_id)[source]

Remove a tag from all documents in the current queryset.

This is a convenience method that calls modify_tags with a single tag ID to remove.

Parameters:

tag_id (int) – Tag ID to remove from all documents in the queryset.

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Remove tag 4 from all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).remove_tag(4)
>>>
>>> # Remove tag from old documents
>>> client.documents().created_before("2020-01-01").remove_tag(7)
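The relationship between remove_tag and modify_tags, and the way returning the queryset enables chaining, can be sketched with a small stand-in class. TagEditor is hypothetical and records calls instead of contacting a server; it only mirrors the documented behavior.

```python
from typing import Dict, List, Optional


class TagEditor:
    """Stand-in for a queryset; records calls instead of sending requests."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, List[int]]] = []

    def modify_tags(self, add_tags: Optional[List[int]] = None,
                    remove_tags: Optional[List[int]] = None) -> "TagEditor":
        self.calls.append({
            "add_tags": add_tags or [],
            "remove_tags": remove_tags or [],
        })
        return self  # returning self is what makes chaining possible

    def remove_tag(self, tag_id: int) -> "TagEditor":
        # Convenience wrapper: one ID becomes a one-element list.
        return self.modify_tags(remove_tags=[tag_id])


editor = TagEditor().remove_tag(4).modify_tags(add_tags=[3])
print(editor.calls)
```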
reprocess()[source]

Reprocess all documents in the current queryset.

This method triggers Paperless-ngx to re-run OCR and classification on all documents matching the current queryset filters.

Returns:

The API response from the bulk reprocess operation. None if there are no documents to reprocess.

Return type:

ClientResponse

Examples

>>> # Reprocess documents added in the last week
>>> from datetime import datetime, timedelta
>>> week_ago = datetime.now() - timedelta(days=7)
>>> client.documents().added_after(week_ago.strftime("%Y-%m-%d")).reprocess()
>>>
>>> # Reprocess documents with empty content
>>> client.documents().filter(content="").reprocess()
rotate(degrees)[source]

Rotate all documents in the current queryset.

This method rotates the PDF files of all documents matching the current queryset filters.

Parameters:

degrees (int) – Degrees to rotate (must be 90, 180, or 270).

Returns:

The API response from the bulk rotate operation. None if there are no documents to rotate.

Return type:

ClientResponse

Examples

>>> # Rotate all documents with "sideways" in title by 90 degrees
>>> client.documents().title("sideways", exact=False).rotate(90)
>>>
>>> # Rotate upside-down documents by 180 degrees
>>> client.documents().tag_name("upside-down").rotate(180)
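Since degrees must be 90, 180, or 270, callers may want to validate or fold an angle before submitting a bulk rotate. The sketch below is one way to do that on the client side. Both helpers are hypothetical, and raising ValueError is an assumption of this sketch; the documentation above does not say how an invalid value is handled.

```python
ALLOWED_ROTATIONS = (90, 180, 270)


def check_rotation(degrees: int) -> int:
    """Return degrees unchanged if it is an allowed rotation."""
    if degrees not in ALLOWED_ROTATIONS:
        raise ValueError(f"degrees must be one of {ALLOWED_ROTATIONS}, got {degrees}")
    return degrees


def normalize_rotation(degrees: int) -> int:
    """Fold angles such as -90 or 450 into the allowed set, then validate."""
    # Python's % always yields a non-negative result for a positive modulus.
    return check_rotation(degrees % 360)


counter_clockwise = normalize_rotation(-90)
print(counter_clockwise)
```

A full turn (0 or 360) is rejected here because it folds to 0, which is not in the allowed set.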
search(query)[source]

Search for documents using a query string.

This method uses the Paperless-ngx full-text search functionality to find documents matching the query string in their content or metadata.

Parameters:

query (str) – The search query string.

Return type:

DocumentQuerySet

Returns:

A queryset with the search results.

Examples

>>> # Search for documents containing "invoice"
>>> docs = client.documents().search("invoice")
>>> for doc in docs:
...     print(doc.title)
>>>
>>> # Search with multiple terms
>>> docs = client.documents().search("electric bill 2023")
set_permissions(permissions=None, owner_id=None, merge=False)[source]

Set permissions for all documents in the current queryset.

This method allows for updating document permissions in bulk for all documents matching the current queryset filters.

Parameters:
  • permissions (dict[str, Any] | None) – Permissions object defining user and group permissions.

  • owner_id (int | None) – Owner ID to assign to the documents.

  • merge (bool) – Whether to merge with existing permissions (True) or replace them (False).

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Set owner to user 2 for all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).set_permissions(owner_id=2)
>>>
>>> # Set complex permissions
>>> permissions = {
...     "view": {"users": [1, 2], "groups": [1]},
...     "change": {"users": [1], "groups": []}
... }
>>> client.documents().tag_name("confidential").set_permissions(
...     permissions=permissions,
...     owner_id=1,
...     merge=True
... )
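The merge flag chooses between combining with existing permissions and replacing them. The sketch below illustrates that distinction on the dictionary shape used in the example above. combine_permissions is a hypothetical helper; the real merge is carried out by the server, and its exact rules may differ from this union-based approach.

```python
from typing import Dict, List

Permissions = Dict[str, Dict[str, List[int]]]


def combine_permissions(existing: Permissions, new: Permissions,
                        *, merge: bool = False) -> Permissions:
    """Replace existing permissions, or union them per action when merge=True."""
    if not merge:
        return dict(new)
    combined: Permissions = {}
    for action in set(existing) | set(new):
        combined[action] = {
            kind: sorted(
                set(existing.get(action, {}).get(kind, []))
                | set(new.get(action, {}).get(kind, []))
            )
            for kind in ("users", "groups")
        }
    return combined


existing = {"view": {"users": [3], "groups": []}}
new = {
    "view": {"users": [1, 2], "groups": [1]},
    "change": {"users": [1], "groups": []},
}
merged = combine_permissions(existing, new, merge=True)
replaced = combine_permissions(existing, new, merge=False)
print(merged)
```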
storage_path(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]

Filter documents by storage path.

This method provides a flexible interface for filtering documents by storage path, supporting filtering by ID or name, with options for exact or partial matching.

Any number of filter arguments can be provided, but at least one must be specified.

Parameters:
  • value (int | str | None) – The storage path ID or name to filter by.

  • exact (bool) – If True, match the exact value, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching for string values.

  • **kwargs (Any) – Additional filters (id, name).

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Raises:
  • ValueError – If no valid filters are provided.

  • TypeError – If value is not an int or str.

Examples

>>> # Filter by ID
>>> client.documents().storage_path(1)
>>> client.documents().storage_path(id=1)
>>>
>>> # Filter by name
>>> client.documents().storage_path("Invoices")
>>> client.documents().storage_path(name="Invoices")
>>>
>>> # Filter by name (partial match)
>>> client.documents().storage_path("Tax", exact=False)
>>>
>>> # Filter by ID and name
>>> client.documents().storage_path(1, name="Invoices")
storage_path_id(storage_path_id)[source]

Filter documents by storage path ID.

Parameters:

storage_path_id (int) – The storage path ID to filter by.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Filter documents with storage path ID 2
>>> docs = client.documents().storage_path_id(2)
storage_path_name(name, *, exact=True, case_insensitive=True)[source]

Filter documents by storage path name.

Parameters:
  • name (str) – The storage path name to filter by.

  • exact (bool) – If True, match the exact name, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().storage_path_name("Tax Documents")
>>>
>>> # Contains match
>>> docs = client.documents().storage_path_name("Tax", exact=False)
tag_id(tag_id)[source]

Filter documents that have the specified tag ID(s).

Parameters:

tag_id (int | list[int]) – A single tag ID or list of tag IDs.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Filter by a single tag ID
>>> docs = client.documents().tag_id(5)
>>>
>>> # Filter by multiple tag IDs
>>> docs = client.documents().tag_id([5, 7, 9])
tag_name(tag_name, *, exact=True, case_insensitive=True)[source]

Filter documents that have a tag with the specified name.

Parameters:
  • tag_name (str) – The name of the tag to filter by.

  • exact (bool) – If True, match the exact tag name, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().tag_name("Tax")
>>>
>>> # Contains match
>>> docs = client.documents().tag_name("invoice", exact=False)
>>>
>>> # Case-sensitive match
>>> docs = client.documents().tag_name("Receipt", case_insensitive=False)
title(title, *, exact=True, case_insensitive=True)[source]

Filter documents by title.

Parameters:
  • title (str) – The document title to filter by.

  • exact (bool) – If True, match the exact title, otherwise use contains.

  • case_insensitive (bool) – If True, perform case-insensitive matching.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Exact match (default)
>>> docs = client.documents().title("Electric Bill March 2023")
>>>
>>> # Contains match
>>> docs = client.documents().title("invoice", exact=False)
update(*, correspondent=None, document_type=None, storage_path=None, owner=None)[source]

Perform bulk updates on all documents in the current queryset.

This method allows for updating multiple document metadata fields in a single API call for all documents matching the current queryset filters.

Parameters:
  • correspondent (Correspondent | int | None) – Set correspondent for all documents. Can be a Correspondent object or ID.

  • document_type (DocumentType | int | None) – Set document type for all documents. Can be a DocumentType object or ID.

  • storage_path (StoragePath | int | None) – Set storage path for all documents. Can be a StoragePath object or ID.

  • owner (int | None) – Owner ID to assign to all documents.

Returns:

The current queryset for method chaining.

Return type:

Self

Examples

>>> # Update correspondent for all invoices
>>> client.documents().title("invoice", exact=False).update(
...     correspondent=5,
...     document_type=3
... )
>>>
>>> # Set owner for documents without an owner
>>> client.documents().filter(owner__isnull=True).update(owner=1)
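Because correspondent, document_type, and storage_path each accept either a model object or an ID, an implementation needs to reduce both forms to an ID. A minimal sketch of that step follows. The Correspondent class here is a stand-in with only an id and a name, and to_id and update_payload are hypothetical helpers, not the library's code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Correspondent:
    """Stand-in model carrying only the attributes this sketch needs."""
    id: int
    name: str = ""


def to_id(value: Any) -> Optional[int]:
    """Accept None, an int ID, or any object exposing an integer .id."""
    if value is None or isinstance(value, int):
        return value
    return value.id


def update_payload(**fields: Any) -> Dict[str, int]:
    """Keep only the fields the caller set, each reduced to an ID."""
    return {name: to_id(value) for name, value in fields.items()
            if value is not None}


payload = update_payload(
    correspondent=Correspondent(5, "Electric Company"),
    document_type=3,
    owner=None,
)
print(payload)
```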
user_can_change(value)[source]

Filter documents by user change permission.

This filter is useful for finding documents that the current authenticated user has permission to modify.

Parameters:

value (bool) – True to filter documents the user can change, False for those they cannot.

Return type:

Self

Returns:

Filtered DocumentQuerySet.

Examples

>>> # Find documents the current user can change
>>> docs = client.documents().user_can_change(True)
>>>
>>> # Find documents the current user cannot change
>>> docs = client.documents().user_can_change(False)
resource: DocumentResource
filters: dict[str, Any]
class paperap.models.document.DocumentNoteQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]

Bases: StandardQuerySet[DocumentNote]

QuerySet for document notes.

Provides standard querying capabilities for DocumentNote objects. Inherits all functionality from StandardQuerySet without additional specialization.

Parameters:
  • resource (BaseResource[_Model, Self])

  • filters (dict[str, Any] | None)

  • _cache (list[_Model] | None)

  • _fetch_all (bool)

  • _next_url (str | None)

  • _last_response (ClientResponse)

  • _iter (Iterator[_Model] | None)

  • _urls_fetched (list[str] | None)

class paperap.models.document.DownloadedDocument(**data)[source]

Bases: StandardModel

Represents a downloaded Paperless-ngx document file.

This model stores both the binary content of a downloaded document file and metadata about the file, such as its content type and suggested filename. It is typically used as a return value from document download operations.

mode

The retrieval mode used (download, preview, or thumbnail). Determines which endpoint was used to retrieve the file.

Type:

RetrieveFileMode | None

original

Whether to retrieve the original file (True) or the archived version (False). Only applicable for DOWNLOAD mode.

Type:

bool

content

The binary content of the downloaded file.

Type:

bytes | None

content_type

The MIME type of the file (e.g., “application/pdf”).

Type:

str | None

disposition_filename

The suggested filename from the Content-Disposition header.

Type:

str | None

disposition_type

The disposition type from the Content-Disposition header (typically “attachment” or “inline”).

Type:

str | None

Examples

>>> # Download a document
>>> doc = client.documents.get(123)
>>> downloaded = doc.download_content()
>>> print(f"Downloaded {len(downloaded.content)} bytes")
>>> print(f"File type: {downloaded.content_type}")
>>> print(f"Filename: {downloaded.disposition_filename}")
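The disposition_type and disposition_filename attributes both come from the Content-Disposition response header. As an illustration of how such a header splits into those two parts, the sketch below parses one with the standard library's email.message.Message. parse_disposition is a hypothetical helper and the header value is made up; this is not necessarily how the library parses it.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_disposition(header_value: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a Content-Disposition value into (type, filename)."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_content_disposition() returns the lowercased type;
    # get_filename() returns the unquoted filename parameter, if any.
    return msg.get_content_disposition(), msg.get_filename()


disposition_type, disposition_filename = parse_disposition(
    'attachment; filename="scan_001.pdf"'
)
print(disposition_type, disposition_filename)
```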
Parameters:

data (Any)

class Meta(model)[source]

Bases: Meta

Metadata for the DownloadedDocument model.

Defines which fields are read-only and should not be modified by the client.

Parameters:

model (type[_Self])

blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {}
filtering_disabled: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'_resource', 'content', 'content_type', 'disposition_filename', 'disposition_type', 'id', 'mode', 'original'}
read_only_fields: ClassVar[set[str]] = {'content', 'content_type', 'disposition_filename', 'disposition_type', 'id'}
supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

We need to both initialize private attributes and call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

mode: RetrieveFileMode | None
original: bool
content: bytes | None
content_type: str | None
disposition_filename: str | None
disposition_type: str | None
class paperap.models.document.DownloadedDocumentQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]

Bases: StandardQuerySet[DownloadedDocument]

A specialized queryset for handling downloaded document operations.

This queryset extends StandardQuerySet to provide functionality specific to downloaded documents from Paperless-ngx. It enables efficient querying, filtering, and manipulation of document download operations.

The queryset is lazy-loaded, meaning API requests are only made when data is actually needed, improving performance when working with large document collections.

Examples

>>> # Download original documents
>>> client.documents.filter(title__contains="invoice").download("invoices/")
>>>
>>> # Download archive versions
>>> client.documents.filter(archived=True).download(
...     "archives/", archive_version=True
... )
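The lazy loading described above can be pictured with a small stand-in: constructing the object sends nothing, the first iteration fetches pages until an empty one is returned, and later iterations reuse the cache. LazyResults and fake_fetch are hypothetical and simplified; the real queryset tracks more state (see the _cache, _fetch_all, and _next_url parameters).

```python
from typing import Callable, Iterator, List, Optional


class LazyResults:
    """Defer all requests until first iteration, then serve from a cache."""

    def __init__(self, fetch_page: Callable[[int], List[str]]) -> None:
        self._fetch_page = fetch_page
        self._cache: Optional[List[str]] = None

    def __iter__(self) -> Iterator[str]:
        if self._cache is None:
            self._cache = []
            page = 0
            while True:
                items = self._fetch_page(page)
                if not items:  # an empty page marks the end
                    break
                self._cache.extend(items)
                page += 1
        return iter(self._cache)


requests_made: List[int] = []


def fake_fetch(page: int) -> List[str]:
    """Pretend API: two pages of titles, recording each request."""
    requests_made.append(page)
    pages = [["a.pdf", "b.pdf"], ["c.pdf"]]
    return pages[page] if page < len(pages) else []


results = LazyResults(fake_fetch)
requests_before = list(requests_made)  # nothing has been fetched yet
titles = list(results)                 # first iteration triggers the fetches
titles_again = list(results)           # second iteration uses the cache
print(titles, requests_made)
```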
Parameters:
  • resource (BaseResource[_Model, Self])

  • filters (dict[str, Any] | None)

  • _cache (list[_Model] | None)

  • _fetch_all (bool)

  • _next_url (str | None)

  • _last_response (ClientResponse)

  • _iter (Iterator[_Model] | None)

  • _urls_fetched (list[str] | None)

class paperap.models.document.DocumentMetadata(**data)[source]

Bases: StandardModel

Represents comprehensive metadata for a Paperless-ngx document.

This model encapsulates all metadata associated with a document in Paperless-ngx, including information about both the original document and its archived version (if available). It provides access to file properties such as checksums, sizes, MIME types, and extracted metadata elements.

The metadata is primarily read-only as it is generated by the Paperless-ngx system during document processing.

original_checksum

The SHA256 checksum of the original document file.

original_size

The size of the original document in bytes.

original_mime_type

The MIME type of the original document (e.g., “application/pdf”).

media_filename

The filename of the document in the Paperless-ngx media storage.

has_archive_version

Whether the document has an archived version (typically a PDF/A).

original_metadata

List of metadata elements extracted from the original document.

archive_checksum

The SHA256 checksum of the archived document version.

archive_media_filename

The filename of the archived version in media storage.

original_filename

The original filename of the document when it was uploaded.

lang

The detected language code of the document content.

archive_size

The size of the archived document version in bytes.

archive_metadata

List of metadata elements extracted from the archived version.

Examples

>>> # Access document metadata
>>> metadata = client.documents.get(123).metadata
>>> print(f"Original file: {metadata.original_filename}")
>>> print(f"Size: {metadata.original_size} bytes")
>>> print(f"MIME type: {metadata.original_mime_type}")
>>>
>>> # Iterate through extracted metadata elements
>>> for element in metadata.original_metadata:
...     print(f"{element.key}: {element.value}")
Parameters:

data (Any)

class Meta(model)[source]

Bases: Meta

Metadata configuration for the DocumentMetadata model.

This class defines metadata properties for the DocumentMetadata model, particularly specifying which fields are read-only.

Parameters:

model (type[_Self])

blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {}
filtering_disabled: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'_resource', 'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
read_only_fields: ClassVar[set[str]] = {'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

Initialize private attributes, then call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

original_checksum: str | None
original_size: int | None
original_mime_type: str | None
media_filename: str | None
has_archive_version: bool | None
original_metadata: list[MetadataElement]
archive_checksum: str | None
archive_media_filename: str | None
original_filename: str | None
lang: str | None
archive_size: int | None
archive_metadata: list[MetadataElement]
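
Every DocumentMetadata field appears in read_only_fields, so none of them should be sent back to the server. The sketch below shows one way a caller could drop such keys from an update payload. It is illustrative only: strip_read_only and the "note" key are hypothetical, not part of paperap, and the set is copied from the Meta listing above.

```python
# Field names copied from the read_only_fields set in DocumentMetadata.Meta.
READ_ONLY_FIELDS = {
    "archive_checksum", "archive_media_filename", "archive_metadata",
    "archive_size", "has_archive_version", "id", "lang", "media_filename",
    "original_checksum", "original_filename", "original_metadata",
    "original_mime_type", "original_size",
}


def strip_read_only(payload: dict, read_only: set = READ_ONLY_FIELDS) -> dict:
    """Return a copy of payload without keys listed as read-only.

    Hypothetical helper for illustration; paperap may handle this differently.
    """
    return {key: value for key, value in payload.items() if key not in read_only}


# "note" is a made-up writable key used only to show what survives filtering.
payload = {"lang": "en", "original_size": 1024, "note": "hypothetical writable key"}
writable = strip_read_only(payload)
print(writable)
```
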
class paperap.models.document.DocumentMetadataQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]

Bases: StandardQuerySet[DocumentMetadata]

A specialized queryset for interacting with Paperless-ngx document metadata.

This queryset extends StandardQuerySet to provide document metadata-specific filtering methods, making it easier to query metadata by its attributes.

Document metadata contains information about documents such as original filename, media information, archive metadata, and other system-level properties that aren’t part of the document’s content or user-assigned metadata.

The queryset is lazy-loaded, meaning API requests are only made when data is actually needed (when iterating, counting, or accessing specific items).

Examples

>>> # Get metadata for a specific document
>>> metadata = client.document_metadata.filter(document=123).first()
>>> print(f"Original filename: {metadata.original_filename}")
>>>
>>> # Get metadata for documents with specific archive information
>>> archived = client.document_metadata.filter(archive_checksum__isnull=False)
Parameters:
  • resource (BaseResource[_Model, Self])

  • filters (dict[str, Any] | None)

  • _cache (list[_Model] | None)

  • _fetch_all (bool)

  • _next_url (str | None)

  • _last_response (ClientResponse)

  • _iter (Iterator[_Model] | None)

  • _urls_fetched (list[str] | None)
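
The lazy-loading behaviour described above can be pictured with a small stand-in class. This is a minimal sketch of the general pattern, not paperap's implementation: LazyResults, fake_api_call, and fetch_count are hypothetical names, and the real queryset also handles pagination through _next_url.

```python
from __future__ import annotations

from typing import Callable, Iterator


class LazyResults:
    """Hypothetical stand-in: defers the fetch until results are needed."""

    def __init__(self, fetch: Callable[[], list[dict]]) -> None:
        self._fetch = fetch
        self._cache: list[dict] | None = None
        self.fetch_count = 0  # counts how often the backend was called

    def _load(self) -> list[dict]:
        # Fetch once, then serve every later access from the cache.
        if self._cache is None:
            self.fetch_count += 1
            self._cache = self._fetch()
        return self._cache

    def __iter__(self) -> Iterator[dict]:
        return iter(self._load())

    def __len__(self) -> int:
        return len(self._load())


def fake_api_call() -> list[dict]:
    # Stands in for an HTTP request; the data is made up.
    return [{"id": 1, "original_filename": "invoice.pdf"}]


results = LazyResults(fake_api_call)
before = results.fetch_count  # creating the object has not fetched anything
names = [row["original_filename"] for row in results]  # first access fetches
count = len(results)  # answered from the cache, no second fetch
```
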

class paperap.models.document.MetadataElement(**data)[source]

Bases: BaseModel

Represents a key-value pair of document metadata in Paperless-ngx.

This model represents individual metadata elements extracted from document files, such as author, creation date, or other file-specific properties. Each element consists of a key and its corresponding value.

key

The metadata field name or identifier.

value

The value associated with the metadata field.

Examples

>>> metadata = MetadataElement(key="Author", value="John Doe")
>>> print(f"{metadata.key}: {metadata.value}")
Author: John Doe
Parameters:

data (Any)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

key: str
value: str
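
Because original_metadata and archive_metadata are lists of key-value elements, looking up a single key means scanning the list. A minimal sketch of turning such a list into a dictionary follows. Element is a dataclass stand-in for MetadataElement so the snippet runs without paperap installed, and to_lookup is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Stand-in for MetadataElement: a string key and a string value."""

    key: str
    value: str


def to_lookup(elements: list) -> dict:
    """Map each key to its value; a later duplicate key overwrites an earlier one."""
    return {element.key: element.value for element in elements}


# Sample data is made up for illustration.
elements = [Element("Author", "John Doe"), Element("Producer", "hypothetical scanner")]
lookup = to_lookup(elements)
author = lookup.get("Author", "unknown")
print(author)
```
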
class paperap.models.document.CustomFieldValues(**data)[source]

Bases: ConstModel

Model for custom field values associated with a document.

field

The ID of the custom field.

value

The value of the custom field, which can be of any type depending on the field’s data_type.

Parameters:

data (Any)

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'from_attributes': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

field: int
value: Any
class paperap.models.document.DocumentSuggestions(**data)[source]

Bases: StandardModel

Represents AI-generated suggestions for a Paperless-ngx document.

The DocumentSuggestions model contains lists of suggested metadata IDs that Paperless-ngx’s AI has determined might be appropriate for a document based on its content analysis. These suggestions can be used to quickly apply metadata to documents during processing.

All fields in this model are read-only as they are generated by the Paperless-ngx server and cannot be modified by clients.

correspondents

List of suggested correspondent IDs that might be associated with this document.

Type:

list[int]

tags

List of suggested tag IDs that might be relevant to this document’s content.

Type:

list[int]

document_types

List of suggested document type IDs that might categorize this document.

Type:

list[int]

storage_paths

List of suggested storage path IDs where this document might be stored.

Type:

list[int]

dates

List of suggested relevant dates extracted from the document content.

Type:

list[date]

Examples

>>> # Get suggestions for a document
>>> doc = client.documents.get(123)
>>> suggestions = client.document_suggestions.get(doc.id)
>>>
>>> # Apply suggested tags to the document
>>> if suggestions.tags:
...     doc.tag_ids.extend(suggestions.tags)
...     doc.save()
Parameters:

data (Any)

class Meta(model)[source]

Bases: Meta

Metadata for the DocumentSuggestions model.

This class defines metadata for the DocumentSuggestions model, including which fields are read-only.

Parameters:

model (type[_Self])

blacklist_filtering_params: ClassVar[set[str]] = {}
field_map: dict[str, str] = {}
filtering_disabled: ClassVar[set[str]] = {}
filtering_fields: ClassVar[set[str]] = {'_resource', 'correspondents', 'dates', 'document_types', 'id', 'storage_paths', 'tags'}
read_only_fields: ClassVar[set[str]] = {'correspondents', 'dates', 'document_types', 'id', 'storage_paths', 'tags'}
supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

Initialize private attributes, then call the user-defined model_post_init method.

Parameters:

context (Any)

Return type:

None

correspondents: list[int]
tags: list[int]
document_types: list[int]
storage_paths: list[int]
dates: list[date]
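
Suggested IDs may overlap with IDs a document already carries, so applying them blindly can introduce duplicates. The sketch below shows one way to merge two ID lists while preserving order. merge_ids is a hypothetical helper, not part of paperap, and the ID values are made up.

```python
def merge_ids(current: list, suggested: list) -> list:
    """Append suggested IDs that are not already present, preserving order."""
    seen = set(current)
    merged = list(current)  # copy so the caller's list is left untouched
    for item in suggested:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


existing_tag_ids = [3, 7]
suggested_tag_ids = [7, 12, 3, 5]
merged = merge_ids(existing_tag_ids, suggested_tag_ids)
print(merged)
```
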
class paperap.models.document.DocumentSuggestionsQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]

Bases: StandardQuerySet[DocumentSuggestions]

QuerySet for interacting with document suggestions in Paperless-ngx.

This class extends StandardQuerySet to provide specialized functionality for retrieving and filtering document suggestions. Document suggestions are recommendations for metadata (correspondents, document types, tags, storage paths, and dates) that Paperless-ngx generates based on document content analysis.

The queryset is lazy-loaded, meaning API requests are only made when data is actually accessed, improving performance when working with large datasets.

Examples

>>> # Get all suggestions for a document
>>> suggestions = client.document_suggestions.filter(document=123)
>>>
>>> # Get suggestions for several documents by ID
>>> selected = client.document_suggestions.filter(id__in=[123, 456])
Parameters:
  • resource (BaseResource[_Model, Self])

  • filters (dict[str, Any] | None)

  • _cache (list[_Model] | None)

  • _fetch_all (bool)

  • _next_url (str | None)

  • _last_response (ClientResponse)

  • _iter (Iterator[_Model] | None)

  • _urls_fetched (list[str] | None)

Subpackages

Submodules