paperap.models.document package
- class paperap.models.document.Document(**data)[source]
Bases:
StandardModel
Represent a Paperless-ngx document.
This class models documents stored in the Paperless-ngx system, providing access to document metadata, content, and related objects. It supports operations like downloading, updating metadata, and managing tags and custom fields.
- added
Timestamp when the document was added to the system.
- Type:
datetime | None
- archive_checksum
Checksum of the archived version of the document.
- Type:
str | None
- archive_filename
Filename of the archived version.
- Type:
str | None
- archive_serial_number
Serial number in the archive system.
- Type:
int | None
- archived_file_name
Original name of the archived file.
- Type:
str | None
- checksum
Checksum of the original document.
- Type:
str | None
- correspondent_id
ID of the associated correspondent.
- Type:
int | None
- created
Timestamp when the document was created.
- Type:
datetime | None
- created_date
Creation date as a string.
- Type:
str | None
- custom_field_dicts
Custom fields associated with the document.
- Type:
list[CustomFieldValues]
- deleted_at
Timestamp when the document was deleted, or None.
- Type:
datetime | None
- document_type_id
ID of the document type.
- Type:
int | None
- filename
Current filename in the system.
- Type:
str | None
- is_shared_by_requester
Whether the document is shared by the requester.
- Type:
bool
- notes
Notes attached to this document.
- Type:
list[DocumentNote]
- original_filename
Original filename when uploaded.
- Type:
str | None
- owner
ID of the document owner.
- Type:
int | None
- page_count
Number of pages in the document.
- Type:
int | None
- storage_path_id
ID of the storage path.
- Type:
int | None
- storage_type
Type of storage used.
- Type:
DocumentStorageType | None
- tag_ids
List of tag IDs associated with this document.
- Type:
list[int]
- user_can_change
Whether the current user can modify this document.
- Type:
bool | None
Examples
>>> document = client.documents().get(pk=1)
>>> document.title = 'Example Document'
>>> document.save()
>>> document.title
'Example Document'

>>> # Get document metadata
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf

>>> # Download document
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)

>>> # Get document suggestions
>>> suggestions = document.get_suggestions()
>>> print(suggestions.tags)
['Invoice', 'Tax', '2023']
- Parameters:
data (
Any
)
- class Meta(model)[source]
Bases:
Meta
- Parameters:
model (type[_Self])
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {'correspondent': 'correspondent_id', 'custom_fields': 'custom_field_dicts', 'document_type': 'document_type_id', 'storage_path': 'storage_path_id', 'tags': 'tag_ids'}
- filtering_disabled: ClassVar[set[str]] = {'deleted_at', 'is_shared_by_requester', 'page_count'}
- filtering_fields: ClassVar[set[str]] = {'__search_hit__', '_correspondent', '_document_type', '_resource', '_storage_path', 'added', 'archive_checksum', 'archive_filename', 'archive_serial_number', 'archived_file_name', 'checksum', 'content', 'correspondent_id', 'created', 'created_date', 'custom_field_dicts', 'document_type_id', 'filename', 'id', 'notes', 'original_filename', 'owner', 'storage_path_id', 'storage_type', 'tag_ids', 'title', 'user_can_change'}
- filtering_strategies: ClassVar[set[FilteringStrategies]] = {FilteringStrategies.WHITELIST}
- read_only_fields: ClassVar[set[str]] = {'archived_file_name', 'deleted_at', 'id', 'is_shared_by_requester', 'page_count'}
- supported_filtering_params: ClassVar[set[str]] = {'added__date__gt', 'added__date__lt', 'added__day', 'added__gt', 'added__lt', 'added__month', 'added__year', 'archive_serial_number', 'archive_serial_number__gt', 'archive_serial_number__gte', 'archive_serial_number__isnull', 'archive_serial_number__lt', 'archive_serial_number__lte', 'checksum__icontains', 'checksum__iendswith', 'checksum__iexact', 'checksum__istartswith', 'content__contains', 'content__icontains', 'content__iendswith', 'content__iexact', 'content__istartswith', 'correspondent__id', 'correspondent__id__in', 'correspondent__id__none', 'correspondent__isnull', 'correspondent__name__icontains', 'correspondent__name__iendswith', 'correspondent__name__iexact', 'correspondent__name__istartswith', 'correspondent__slug__iexact', 'created__date__gt', 'created__date__lt', 'created__day', 'created__gt', 'created__lt', 'created__month', 'created__year', 'custom_field_query', 'custom_fields__icontains', 'custom_fields__id__all', 'custom_fields__id__in', 'custom_fields__id__none', 'document_type__id', 'document_type__id__in', 'document_type__id__none', 'document_type__isnull', 'document_type__name__icontains', 'document_type__name__iendswith', 'document_type__name__iexact', 'document_type__name__istartswith', 'has_custom_fields', 'id', 'id__in', 'is_in_inbox', 'is_tagged', 'limit', 'original_filename__icontains', 'original_filename__iendswith', 'original_filename__iexact', 'original_filename__istartswith', 'owner__id', 'owner__id__in', 'owner__id__none', 'owner__isnull', 'shared_by__id', 'shared_by__id__in', 'storage_path__id', 'storage_path__id__in', 'storage_path__id__none', 'storage_path__isnull', 'storage_path__name__icontains', 'storage_path__name__iendswith', 'storage_path__name__iexact', 'storage_path__name__istartswith', 'tags__id', 'tags__id__all', 'tags__id__in', 'tags__id__none', 'tags__name__icontains', 'tags__name__iendswith', 'tags__name__iexact', 'tags__name__istartswith', 'title__icontains', 
'title__iendswith', 'title__iexact', 'title__istartswith', 'title_content'}
- model: type[_Self]
- name: str
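The field_map above pairs API field names with model attribute names. The following is a minimal stand-alone sketch of how such a mapping could be applied to an incoming payload; `apply_field_map` is a hypothetical helper, and the direction of the mapping (API key to model attribute) is an assumption, not paperap's confirmed behaviour.

```python
from __future__ import annotations

from typing import Any

# Copied from the documented Document.Meta.field_map.
FIELD_MAP = {
    "correspondent": "correspondent_id",
    "custom_fields": "custom_field_dicts",
    "document_type": "document_type_id",
    "storage_path": "storage_path_id",
    "tags": "tag_ids",
}


def apply_field_map(payload: dict[str, Any], field_map: dict[str, str]) -> dict[str, Any]:
    """Rename mapped keys and leave every other key untouched (hypothetical helper)."""
    return {field_map.get(key, key): value for key, value in payload.items()}


api_payload = {"id": 1, "title": "Invoice", "tags": [2, 3], "correspondent": 5}
model_data = apply_field_map(api_payload, FIELD_MAP)
print(model_data)
```

Keys without an entry in the map, such as id and title, pass through unchanged.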
- add_tag(tag)[source]
Add a tag to the document.
Adds a tag to the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.
- Parameters:
tag (
Tag | int | str
) – The tag to add. Can be a Tag object, a tag ID, or a tag name.
- Raises:
TypeError – If the input value is not a Tag object, an integer, or a string.
ResourceNotFoundError – If a tag name is provided but no matching tag is found.
- Return type:
Example
>>> document = client.documents().get(1)
>>> # Add tag by ID
>>> document.add_tag(5)
>>> # Add tag by object
>>> tag = client.tags().get(3)
>>> document.add_tag(tag)
>>> # Add tag by name
>>> document.add_tag("Invoice")
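The three accepted input forms (Tag object, ID, name) can be illustrated with a small stand-alone sketch. `FakeTag`, `resolve_tag_id`, and the `name_to_id` dictionary are hypothetical stand-ins: the real method looks names up through the API and raises ResourceNotFoundError, for which this sketch substitutes the built-in LookupError.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeTag:
    """Hypothetical stand-in for a Tag object; only the ID matters here."""

    id: int
    name: str


def resolve_tag_id(tag: FakeTag | int | str, name_to_id: dict[str, int]) -> int:
    """Turn a tag object, tag ID, or tag name into a tag ID."""
    if isinstance(tag, FakeTag):
        return tag.id
    if isinstance(tag, int):
        return tag
    if isinstance(tag, str):
        if tag not in name_to_id:
            # Stand-in for ResourceNotFoundError in the documented API.
            raise LookupError(f"No tag named {tag!r}")
        return name_to_id[tag]
    raise TypeError(f"Expected a tag object, int, or str, got {type(tag).__name__}")


known_tags = {"Invoice": 7}
tag_ids: list[int] = []
for candidate in (5, FakeTag(id=3, name="Tax"), "Invoice"):
    tag_ids.append(resolve_tag_id(candidate, known_tags))
print(tag_ids)
```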
- append_content(value)[source]
Append content to the document.
Adds the specified text to the end of the document’s content, separated by a newline.
Example
>>> document = client.documents().get(1)
>>> document.append_content("Additional notes about this document")
>>> document.save()
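As a rough illustration of the newline-separated append described above, here is a stand-alone sketch. `FakeDocument` is a hypothetical stand-in, not the paperap class, and how the real method treats empty existing content is not covered here.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in holding only the text content."""

    content: str = ""

    def append_content(self, value: str) -> None:
        # Join the existing content and the new text with a single newline.
        self.content = f"{self.content}\n{value}"


doc = FakeDocument(content="Scanned page text")
doc.append_content("Additional notes about this document")
print(doc.content)
```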
- property correspondent: Correspondent | None
Get the correspondent for this document.
Retrieves the Correspondent object associated with this document. Uses caching to minimize API requests when accessing the same correspondent multiple times.
- Returns:
The correspondent object or None if not set.
- Return type:
Correspondent | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.correspondent:
...     print(document.correspondent.name)
Example Correspondent
- property custom_field_ids: list[int]
Get the IDs of the custom fields for this document.
Example
>>> document = client.documents().get(1)
>>> field_ids = document.custom_field_ids
>>> print(field_ids)
[1, 3, 5]
- custom_field_value(field_id, default=None, *, raise_errors=False)[source]
Get the value of a custom field by ID.
Retrieves the value of a specific custom field associated with this document.
- Parameters:
- Returns:
The value of the custom field or the default value if not found.
- Return type:
Any
- Raises:
ValueError – If raise_errors is True and the field is not found.
Example
>>> document = client.documents().get(1)
>>> # Get value with default
>>> due_date = document.custom_field_value(3, default="Not set")
>>> # Get value with error handling
>>> try:
...     reference = document.custom_field_value(5, raise_errors=True)
... except ValueError:
...     print("Reference field not found")
Reference field not found
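The default and raise_errors behaviour can be sketched without the library. This is an illustration only: `lookup_custom_field` is a hypothetical function, and the entry shape (dictionaries with "field" and "value" keys) is an assumption modelled on how Paperless-ngx returns custom fields.

```python
from __future__ import annotations

from typing import Any


def lookup_custom_field(
    entries: list[dict[str, Any]],
    field_id: int,
    default: Any = None,
    *,
    raise_errors: bool = False,
) -> Any:
    """Return the value stored for field_id, or fall back as documented."""
    for entry in entries:
        if entry["field"] == field_id:
            return entry["value"]
    if raise_errors:
        raise ValueError(f"Custom field {field_id} not found")
    return default


# Assumed entry shape: one dict per custom field attached to the document.
entries = [{"field": 1, "value": "2023-01-15"}, {"field": 3, "value": "INV-12345"}]
print(lookup_custom_field(entries, 3))
print(lookup_custom_field(entries, 5, default="Not set"))
```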
- property custom_field_values: list[Any]
Get the values of the custom fields for this document.
- Returns:
A list of values for the custom fields associated with this document.
- Return type:
list[Any]
Example
>>> document = client.documents().get(1)
>>> values = document.custom_field_values
>>> print(values)
['2023-01-15', 'INV-12345', True]
- property custom_fields: CustomFieldQuerySet
Get the custom fields for this document.
Returns a QuerySet of CustomField objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the custom fields are actually accessed.
- Returns:
QuerySet of custom fields associated with this document.
- Return type:
Example
>>> document = client.documents().get(1)
>>> for field in document.custom_fields:
...     print(f'{field.name}: {field.value}')
Due Date: 2023-04-15
Reference: INV-12345
- property document_type: DocumentType | None
Get the document type for this document.
Retrieves the DocumentType object associated with this document. Uses caching to minimize API requests when accessing the same document type multiple times.
- Returns:
The document type object or None if not set.
- Return type:
DocumentType | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.document_type:
...     print(document.document_type.name)
Example Document Type
- download(original=False)[source]
Download the document file.
Downloads either the archived version (default) or the original version of the document from the Paperless-ngx server.
- Parameters:
original (
bool
, optional) – Whether to download the original file instead of the archived version. Defaults to False (download the archived version).
- Returns:
An object containing the downloaded document content and metadata.
- Return type:
Examples
>>> # Download archived version
>>> download = document.download()
>>> with open(download.disposition_filename, 'wb') as f:
...     f.write(download.content)

>>> # Download original version
>>> original = document.download(original=True)
>>> print(f"Downloaded {len(original.content)} bytes")
Downloaded 245367 bytes
- get_metadata()[source]
Get the metadata for this document.
Retrieves detailed metadata about the document from the Paperless-ngx API. This includes information like the original file format, creation date, modification date, and other technical details.
- Returns:
The document metadata object.
- Return type:
Examples
>>> metadata = document.get_metadata()
>>> print(metadata.original_mime_type)
application/pdf
>>> print(metadata.media_filename)
document.pdf
- get_suggestions()[source]
Get suggestions for this document.
Retrieves AI-generated suggestions for document metadata from the Paperless-ngx server. This can include suggested tags, correspondent, document type, and other metadata based on the document’s content.
- Returns:
An object containing suggested metadata for the document.
- Return type:
Examples
>>> suggestions = document.get_suggestions()
>>> print(f"Suggested tags: {suggestions.tags}")
Suggested tags: [{'name': 'Invoice', 'score': 0.95}, {'name': 'Utility', 'score': 0.87}]
>>> print(f"Suggested correspondent: {suggestions.correspondent}")
Suggested correspondent: {'name': 'Electric Company', 'score': 0.92}
>>> print(f"Suggested document type: {suggestions.document_type}")
Suggested document type: {'name': 'Bill', 'score': 0.89}
- property has_search_hit: bool
Check if this document has search hit information.
- Returns:
True if this document was returned as part of a search result and has search hit information, False otherwise.
- Return type:
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_post_init(context: Any, /) → None
We need to both initialize private attributes and call the user-defined model_post_init method.
- preview(original=False)[source]
Get a preview of the document.
Retrieves a preview version of the document from the Paperless-ngx server. This is typically a web-friendly version (e.g., PDF) that can be displayed in a browser.
- Parameters:
original (
bool
, optional) – Whether to preview the original file instead of the archived version. Defaults to False (preview the archived version).
- Returns:
An object containing the preview document content and metadata.
- Return type:
Example
>>> preview = document.preview()
>>> with open('preview.pdf', 'wb') as f:
...     f.write(preview.content)
- remove_tag(tag)[source]
Remove a tag from the document.
Removes a tag from the document’s tag_ids list. The tag can be specified as a Tag object, a tag ID, or a tag name. If a tag name is provided, the method will look up the corresponding tag ID.
- Parameters:
tag (
Tag | int | str
) – The tag to remove. Can be a Tag object, a tag ID, or a tag name.
- Raises:
TypeError – If the input value is not a Tag object, an integer, or a string.
ResourceNotFoundError – If a tag name is provided but no matching tag is found.
ValueError – If the tag is not associated with this document.
- Return type:
Example
>>> document = client.documents().get(1)
>>> # Remove tag by ID
>>> document.remove_tag(5)
>>> # Remove tag by object
>>> tag = client.tags().get(3)
>>> document.remove_tag(tag)
>>> # Remove tag by name
>>> document.remove_tag("Invoice")
- property search_hit: dict[str, Any] | None
Get the search hit information for this document.
When a document is returned as part of a search result, this property contains additional information about the search match.
- serialize_datetime(value)[source]
Serialize datetime fields to ISO format.
Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.
- Parameters:
value (
datetime | None
) – The datetime value to serialize.
- Returns:
The serialized datetime value as an ISO 8601 string, or None.
- Return type:
str | None
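The conversion described here matches what the standard library's datetime.isoformat() produces. A minimal sketch, assuming that is the intended format; `to_iso8601` is a hypothetical name and not the library's serializer.

```python
from __future__ import annotations

from datetime import datetime, timezone


def to_iso8601(value: datetime | None) -> str | None:
    """Return an ISO 8601 string for a datetime, or None when no value is set."""
    return value.isoformat() if value is not None else None


stamp = datetime(2023, 1, 15, 14, 30, 22, tzinfo=timezone.utc)
print(to_iso8601(stamp))  # prints 2023-01-15T14:30:22+00:00
print(to_iso8601(None))   # prints None
```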
- serialize_notes(value)[source]
Serialize notes to a list of dictionaries.
Converts DocumentNote objects to dictionaries for JSON serialization. Returns an empty list if the input value is None or empty.
- property storage_path: StoragePath | None
Get the storage path for this document.
Retrieves the StoragePath object associated with this document. Uses caching to minimize API requests when accessing the same storage path multiple times.
- Returns:
The storage path object or None if not set.
- Return type:
StoragePath | None
Examples
>>> document = client.documents().get(pk=1)
>>> if document.storage_path:
...     print(document.storage_path.name)
Example Storage Path
- property tag_names: list[str]
Get the names of the tags for this document.
Example
>>> document = client.documents().get(1)
>>> names = document.tag_names
>>> print(names)
['Invoice', 'Tax', 'Important']
- property tags: TagQuerySet
Get the tags for this document.
Returns a QuerySet of Tag objects associated with this document. The QuerySet is lazily loaded, so API requests are only made when the tags are actually accessed.
- Returns:
QuerySet of tags associated with this document.
- Return type:
Examples
>>> document = client.documents().get(pk=1)
>>> for tag in document.tags:
...     print(f'{tag.name} # {tag.id}')
Tag 1 # 1
Tag 2 # 2
Tag 3 # 3

>>> if 5 in document.tags:
...     print('Tag ID #5 is associated with this document')

>>> tag = client.tags().get(pk=1)
>>> if tag in document.tags:
...     print('Tag ID #1 is associated with this document')

>>> filtered_tags = document.tags.filter(name__icontains='example')
>>> for tag in filtered_tags:
...     print(f'{tag.name} # {tag.id}')
- thumbnail(original=False)[source]
Get the document thumbnail.
Retrieves a thumbnail image of the document from the Paperless-ngx server. This is typically a small image representation of the first page.
- Parameters:
original (
bool
, optional) – Whether to get the thumbnail of the original file instead of the archived version. Defaults to False (get thumbnail of archived version).
- Returns:
An object containing the thumbnail image content and metadata.
- Return type:
Example
>>> thumbnail = document.thumbnail()
>>> with open('thumbnail.png', 'wb') as f:
...     f.write(thumbnail.content)
- update_locally(from_db=None, **kwargs)[source]
Update the document locally with the provided data.
Updates the document’s attributes with the provided data without sending an API request. Handles special cases for notes and tags, which cannot be set to None in Paperless-ngx if they already have values.
- Parameters:
from_db (
bool | None
, optional) – Whether the update is coming from the database. If True, bypasses certain validation checks. Defaults to None.
**kwargs (
Any
) – Additional data to update the document with.
- Raises:
NotImplementedError – If attempting to set notes or tags to None when they are not already None and from_db is False.
- Return type:
Example
>>> document = client.documents().get(1)
>>> document.update_locally(title="New Title", correspondent_id=5)
>>> document.save()
- classmethod validate_custom_fields(value)[source]
Validate and return custom field dictionaries.
Ensures custom fields are properly formatted as a list of CustomFieldValues. Returns an empty list if the input value is None.
- Parameters:
value (
Any
) – The list of custom field dictionaries to validate.
- Returns:
A list of validated custom field dictionaries.
- Return type:
- Raises:
TypeError – If the input value is not None or a list.
Validate and return the is_shared_by_requester flag.
Ensures the is_shared_by_requester flag is properly formatted as a boolean. Returns False if the input value is None.
- classmethod validate_notes(value)[source]
Validate and return the list of notes.
Ensures notes are properly formatted as a list of DocumentNote objects. Handles various input formats including None, single DocumentNote objects, and lists.
- classmethod validate_tags(value)[source]
Validate and convert tag IDs to a list of integers.
Ensures tag IDs are properly formatted as a list of integers. Handles various input formats including None, single integers, and lists.
- Parameters:
value (
Any
) – The tag IDs to validate, which can be None, an integer, or a list.
- Returns:
A list of validated tag IDs.
- Return type:
- Raises:
TypeError – If the input value is not None, an integer, or a list.
Examples
>>> Document.validate_tags(None)
[]
>>> Document.validate_tags(5)
[5]
>>> Document.validate_tags([1, 2, 3])
[1, 2, 3]
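The normalization rules listed for this validator can be restated as a plain function. This is a sketch of the documented behaviour only: `normalize_tag_ids` is a hypothetical name, and coercing list items with int() is an assumption rather than something the documentation states.

```python
from __future__ import annotations

from typing import Any


def normalize_tag_ids(value: Any) -> list[int]:
    """Return a list of tag IDs for None, a single int, or a list."""
    if value is None:
        return []
    if isinstance(value, int):
        return [value]
    if isinstance(value, list):
        # Assumption: each item is coerced to int; the real validator may differ.
        return [int(item) for item in value]
    raise TypeError(f"Expected None, int, or list, got {type(value).__name__}")


print(normalize_tag_ids(None))
print(normalize_tag_ids(5))
print(normalize_tag_ids([1, 2, 3]))
```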
- classmethod validate_text(value)[source]
Validate and return a text field.
Ensures text fields are properly formatted as strings. Converts integers to strings and returns an empty string if the input value is None.
- Parameters:
value (
Any
) – The value of the text field to validate.
- Returns:
The validated text value.
- Return type:
- Raises:
TypeError – If the input value is not None, a string, or an integer.
Examples
>>> Document.validate_text(None)
''
>>> Document.validate_text("Hello")
'Hello'
>>> Document.validate_text(123)
'123'
- added: datetime | None
- archive_checksum: str | None
- archive_filename: str | None
- archive_serial_number: int | None
- archived_file_name: str | None
- checksum: str | None
- content: str
- correspondent_id: int | None
- created: datetime | None
- created_date: str | None
- custom_field_dicts: Annotated[list[CustomFieldValues], Field(default_factory=list)]
- deleted_at: datetime | None
- document_type_id: int | None
- filename: str | None
- is_shared_by_requester: bool
- notes: list[DocumentNote]
- original_filename: str | None
- owner: int | None
- page_count: int | None
- storage_path_id: int | None
- storage_type: DocumentStorageType | None
- tag_ids: Annotated[list[int], Field(default_factory=list)]
- title: str
- user_can_change: bool | None
- id: int
- class paperap.models.document.DocumentNote(**data)[source]
Bases:
StandardModel
Represent a note on a Paperless-ngx document.
This class models user-created notes that can be attached to documents in the Paperless-ngx system. Notes include information about when they were created, who created them, and their content.
- deleted_at
Timestamp when the note was deleted, or None if not deleted.
- Type:
datetime | None
- restored_at
Timestamp when the note was restored after deletion, or None.
- Type:
datetime | None
- transaction_id
ID of the transaction that created or modified this note.
- Type:
int | None
- created
Timestamp when the note was created.
- Type:
datetime
Examples
>>> note = client.document_notes().get(1)
>>> print(note.note)
This is an important document
>>> print(note.created)
2023-01-15 14:30:22
- Parameters:
data (
Any
)
- class Meta(model)[source]
Bases:
Meta
- Parameters:
model (type[_Self])
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'created', 'deleted_at', 'document', 'id', 'note', 'restored_at', 'transaction_id', 'user'}
- read_only_fields: ClassVar[set[str]] = {'created', 'deleted_at', 'id', 'restored_at', 'transaction_id'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model: type[_Self]
- name: str
- get_document()[source]
Get the document associated with this note.
Retrieves the full Document object that this note is attached to by making an API request using the document ID.
- Returns:
The document associated with this note.
- Return type:
Example
>>> note = client.document_notes().get(1)
>>> document = note.get_document()
>>> print(document.title)
Invoice #12345
- get_user()[source]
Get the user who created this note.
Retrieves the full User object for the user who created this note by making an API request using the user ID.
- Returns:
The user who created this note.
- Return type:
Example
>>> note = client.document_notes().get(1)
>>> user = note.get_user()
>>> print(user.username)
admin
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_post_init(context: Any, /) → None
We need to both initialize private attributes and call the user-defined model_post_init method.
- serialize_datetime(value)[source]
Serialize datetime fields to ISO format.
Converts datetime objects to ISO 8601 formatted strings for JSON serialization. Returns None if the input value is None.
- Parameters:
value (
datetime | None
) – The datetime value to serialize.
- Returns:
The serialized datetime value as an ISO 8601 string, or None if the value is None.
- Return type:
str | None
- deleted_at: datetime | None
- restored_at: datetime | None
- transaction_id: int | None
- note: str
- created: datetime
- document: int
- user: int
- id: int
- class paperap.models.document.DocumentQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]
Bases:
StandardQuerySet[Document], HasOwner
QuerySet for Paperless-ngx documents with specialized filtering methods.
This class extends StandardQuerySet to provide document-specific filtering, searching, and bulk operations. It includes methods for filtering by document metadata, content, custom fields, and more, as well as bulk operations like merging, rotating, and updating document properties.
Examples
>>> # Search for documents
>>> docs = client.documents().search("invoice")
>>> for doc in docs:
...     print(doc.title)

>>> # Find documents similar to a specific document
>>> similar_docs = client.documents().more_like(42)
>>> for doc in similar_docs:
...     print(doc.title)

>>> # Filter by correspondent and document type
>>> filtered_docs = client.documents().correspondent(5).document_type("Invoice")
>>> for doc in filtered_docs:
...     print(f"{doc.title} - {doc.created}")
- Parameters:
resource (BaseResource[_Model, Self])
filters (dict[str, Any] | None)
_cache (list[_Model] | None)
_fetch_all (bool)
_next_url (str | None)
_last_response (ClientResponse)
_iter (Iterator[_Model] | None)
_urls_fetched (list[str] | None)
- add_tag(tag_id)[source]
Add a tag to all documents in the current queryset.
This is a convenience method that calls modify_tags with a single tag ID to add.
- Parameters:
tag_id (
int
) – Tag ID to add to all documents in the queryset.
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Add tag 3 to all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).add_tag(3)
>>>
>>> # Add tag to documents from a specific correspondent
>>> client.documents().correspondent_name("Electric Company").add_tag(5)
- added_after(date_str)[source]
Filter documents added after the specified date.
- Parameters:
date_str (
str
) – ISO format date string (YYYY-MM-DD).
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents added after January 1, 2023
>>> docs = client.documents().added_after("2023-01-01")
- added_before(date_str)[source]
Filter documents added before the specified date.
- Parameters:
date_str (
str
) – ISO format date string (YYYY-MM-DD).
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents added before December 31, 2022
>>> docs = client.documents().added_before("2022-12-31")
- asn(value, *, exact=True, case_insensitive=True)[source]
Filter documents by archive serial number (ASN).
The archive serial number is a unique identifier assigned to documents in Paperless-ngx, often used for referencing physical documents.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().asn("2023-0042")
>>>
>>> # Contains match
>>> docs = client.documents().asn("2023", exact=False)
- content(text)[source]
Filter documents whose content contains the specified text.
This method searches the OCR-extracted text content of documents.
- Parameters:
text (
str
) – The text to search for in document content.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents containing "invoice amount"
>>> docs = client.documents().content("invoice amount")
>>>
>>> # Chain with other filters
>>> recent_with_text = client.documents().content("tax").created_after("2023-01-01")
- correspondent(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]
Filter documents by correspondent.
This method provides a flexible interface for filtering documents by correspondent, supporting filtering by ID, name, or slug, with options for exact or partial matching.
Any number of filter arguments can be provided, but at least one must be specified.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
- Raises:
ValueError – If no valid filters are provided.
TypeError – If value is not an int or str.
Examples
>>> # Filter by ID
>>> client.documents().correspondent(1)
>>> client.documents().correspondent(id=1)
>>>
>>> # Filter by name
>>> client.documents().correspondent("John Doe")
>>> client.documents().correspondent(name="John Doe")
>>>
>>> # Filter by name (partial match)
>>> client.documents().correspondent("John", exact=False)
>>>
>>> # Filter by slug
>>> client.documents().correspondent(slug="john-doe")
>>>
>>> # Filter by ID and name
>>> client.documents().correspondent(1, name="John Doe")
- correspondent_id(correspondent_id)[source]
Filter documents by correspondent ID.
- Parameters:
correspondent_id (
int
) – The correspondent ID to filter by.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Filter documents with correspondent ID 5
>>> docs = client.documents().correspondent_id(5)
- correspondent_name(name, *, exact=True, case_insensitive=True)[source]
Filter documents by correspondent name.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().correspondent_name("Electric Company")
>>>
>>> # Contains match
>>> docs = client.documents().correspondent_name("Electric", exact=False)
- correspondent_slug(slug, *, exact=True, case_insensitive=True)[source]
Filter documents by correspondent slug.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().correspondent_slug("electric-company")
>>>
>>> # Contains match
>>> docs = client.documents().correspondent_slug("electric", exact=False)
- created_after(date)[source]
Filter models created after a given date.
This method filters documents based on their creation date in Paperless-ngx.
- Parameters:
date (
datetime
|str
) – The date to filter by. Can be a datetime object or an ISO format string.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Using string date
>>> docs = client.documents().created_after("2023-01-01")
>>>
>>> # Using datetime object
>>> from datetime import datetime
>>> date = datetime(2023, 1, 1)
>>> docs = client.documents().created_after(date)
- created_before(date)[source]
Filter models created before a given date.
This method filters documents based on their creation date in Paperless-ngx.
- Parameters:
date (
datetime
|str
) – The date to filter by. Can be a datetime object or an ISO format string.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Using string date
>>> docs = client.documents().created_before("2023-01-01")
>>>
>>> # Using datetime object
>>> from datetime import datetime
>>> date = datetime(2023, 1, 1)
>>> docs = client.documents().created_before(date)
- created_between(start, end)[source]
Filter models created between two dates.
This method filters documents with creation dates falling within the specified range.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Using string dates
>>> docs = client.documents().created_between("2023-01-01", "2023-12-31")
>>>
>>> # Using datetime objects
>>> from datetime import datetime
>>> start = datetime(2023, 1, 1)
>>> end = datetime(2023, 12, 31)
>>> docs = client.documents().created_between(start, end)
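To make the date-range idea concrete, the sketch below builds a query string from two of the parameters listed in Document.Meta.supported_filtering_params (created__date__gt and created__date__lt). Whether created_between uses exactly these parameters, and whether its bounds are inclusive, is not stated in this reference; `created_between_params` is a hypothetical helper.

```python
from __future__ import annotations

from datetime import date, datetime
from urllib.parse import urlencode


def _as_date_string(value: datetime | date | str) -> str:
    """Reduce a datetime, date, or ISO string to a YYYY-MM-DD string."""
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    return value


def created_between_params(start: datetime | date | str, end: datetime | date | str) -> dict[str, str]:
    """Build filter parameters for a creation-date range (assumed mapping)."""
    return {
        "created__date__gt": _as_date_string(start),
        "created__date__lt": _as_date_string(end),
    }


query = urlencode(created_between_params("2023-01-01", datetime(2023, 12, 31)))
print(query)  # prints created__date__gt=2023-01-01&created__date__lt=2023-12-31
```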
- custom_field(field, value, *, exact=False, case_insensitive=True)[source]
Filter documents by custom field.
This method filters documents based on a specific custom field’s value.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Contains match (default)
>>> docs = client.documents().custom_field("Reference Number", "INV")
>>>
>>> # Exact match
>>> docs = client.documents().custom_field("Status", "Paid", exact=True)
>>>
>>> # Case-sensitive contains match
>>> docs = client.documents().custom_field("Notes", "Important",
...                                        case_insensitive=False)
- custom_field_contains(field, values)[source]
Filter documents with a custom field that contains all specified values.
This method is useful for custom fields that can hold multiple values, such as array or list-type fields.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents with tags field containing both "important" and "tax"
>>> docs = client.documents().custom_field_contains("Tags", ["important", "tax"])
>>>
>>> # Find documents with categories containing specific values
>>> docs = client.documents().custom_field_contains("Categories", ["Finance", "Tax"])
- custom_field_exact(field, value)[source]
Filter documents with a custom field value that matches exactly.
This method is a shorthand for custom_field_query with the “exact” operation.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Match exact status
>>> docs = client.documents().custom_field_exact("Status", "Paid")
>>>
>>> # Match exact date
>>> docs = client.documents().custom_field_exact("Due Date", "2023-04-15")
- custom_field_exists(field, exists=True)[source]
Filter documents based on the existence of a custom field.
This method is a shorthand for custom_field_query with the “exists” operation.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents that have the "Priority" field
>>> docs = client.documents().custom_field_exists("Priority")
>>>
>>> # Find documents that don't have the "Review Date" field
>>> docs = client.documents().custom_field_exists("Review Date", exists=False)
- custom_field_fullsearch(value, *, case_insensitive=True)[source]
Filter documents by searching through both custom field name and value.
This method searches across all custom fields (both names and values) for the specified text.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
- Raises:
NotImplementedError – If case_insensitive is False, as Paperless-ngx doesn’t support case-sensitive custom field search.
Examples
>>> # Find documents with custom fields containing "reference"
>>> docs = client.documents().custom_field_fullsearch("reference")
- custom_field_in(field, values)[source]
Filter documents with a custom field value in a list of values.
This method is a shorthand for custom_field_query with the “in” operation.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Match documents with status in a list
>>> docs = client.documents().custom_field_in("Status", ["Paid", "Pending"])
>>>
>>> # Match documents with specific reference numbers
>>> docs = client.documents().custom_field_in("Reference", ["INV-001", "INV-002", "INV-003"])
- custom_field_isnull(field)[source]
Filter documents with a custom field that is null or empty.
This method finds documents where the specified custom field either doesn’t exist or has an empty value.
- Parameters:
field (str) – The name of the custom field.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents with missing or empty "Status" field
>>> docs = client.documents().custom_field_isnull("Status")
>>>
>>> # Chain with other filters
>>> recent_missing_field = client.documents().custom_field_isnull("Priority").created_after("2023-01-01")
- custom_field_range(field, start, end)[source]
Filter documents with a custom field value within a specified range.
This is particularly useful for date or numeric custom fields.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Date range
>>> docs = client.documents().custom_field_range("Invoice Date", "2023-01-01", "2023-12-31")
>>>
>>> # Numeric range
>>> docs = client.documents().custom_field_range("Amount", "100", "500")
- delete()[source]
Delete all documents in the current queryset.
This method performs a bulk delete operation on all documents matching the current queryset filters.
- Returns:
The API response from the bulk delete operation. None if there are no documents to delete.
- Return type:
ClientResponse
Examples
>>> # Delete all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).delete()
>>>
>>> # Delete old documents
>>> client.documents().created_before("2020-01-01").delete()
- document_type(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]
Filter documents by document type.
This method provides a flexible interface for filtering documents by document type, supporting filtering by ID or name, with options for exact or partial matching.
Any number of filter arguments can be provided, but at least one must be specified.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
- Raises:
ValueError – If no valid filters are provided.
TypeError – If value is not an int or str.
Examples
>>> # Filter by ID
>>> client.documents().document_type(1)
>>> client.documents().document_type(id=1)
>>>
>>> # Filter by name
>>> client.documents().document_type("Invoice")
>>> client.documents().document_type(name="Invoice")
>>>
>>> # Filter by name (partial match)
>>> client.documents().document_type("Inv", exact=False)
>>>
>>> # Filter by ID and name
>>> client.documents().document_type(1, name="Invoice")
- document_type_id(document_type_id)[source]
Filter documents by document type ID.
- Parameters:
document_type_id (int) – The document type ID to filter by.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Filter documents with document type ID 3
>>> docs = client.documents().document_type_id(3)
- document_type_name(name, *, exact=True, case_insensitive=True)[source]
Filter documents by document type name.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().document_type_name("Invoice")
>>>
>>> # Contains match
>>> docs = client.documents().document_type_name("bill", exact=False)
- has_custom_field_id(pk, *, exact=False)[source]
Filter documents that have a custom field with the specified ID(s).
This method filters documents based on the presence of specific custom fields, regardless of their values.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Documents with custom field ID 5
>>> docs = client.documents().has_custom_field_id(5)
>>>
>>> # Documents with custom field IDs 5 and 7
>>> docs = client.documents().has_custom_field_id([5, 7])
>>>
>>> # Documents with exactly custom field IDs 5 and 7 and no others
>>> docs = client.documents().has_custom_field_id([5, 7], exact=True)
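The difference between the default and `exact=True` in the examples above can be read as a subset test versus a set-equality test. A small local sketch of that reading, assuming the example comments describe the behavior accurately; `has_field_ids` is a hypothetical helper, not a paperap function:

```python
def has_field_ids(doc_field_ids: set[int], wanted: set[int], *,
                  exact: bool = False) -> bool:
    """Check a document's custom field IDs against the requested IDs."""
    if exact:
        # exact: the document has these fields and no others
        return doc_field_ids == wanted
    # default: the document has at least these fields
    return wanted <= doc_field_ids


print(has_field_ids({5, 7, 9}, {5, 7}))
print(has_field_ids({5, 7, 9}, {5, 7}, exact=True))
```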
- has_custom_fields()[source]
Filter documents that have any custom fields.
This method returns documents that have at least one custom field defined, regardless of the field name or value.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find all documents with any custom fields
>>> docs = client.documents().has_custom_fields()
- merge(metadata_document_id=None, delete_originals=False)[source]
Merge all documents in the current queryset into a single document.
This method combines multiple documents into a single PDF document.
- Parameters:
- Returns:
True if submitting the merge succeeded, False if there are no documents to merge.
- Return type:
bool
- Raises:
BadResponseError – If the merge action returns an unexpected response.
APIError – If the merge action fails.
Examples
>>> # Merge all documents with tag "merge_me"
>>> client.documents().tag_name("merge_me").merge(delete_originals=True)
>>>
>>> # Merge documents and use metadata from document ID 42
>>> client.documents().correspondent_name("Electric Company").merge(
...     metadata_document_id=42, delete_originals=False
... )
- modify_custom_fields(add_custom_fields=None, remove_custom_fields=None)[source]
Modify custom fields on all documents in the current queryset.
This method allows for adding, updating, and removing custom fields in bulk for all documents matching the current queryset filters.
- Parameters:
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Add a custom field to documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).modify_custom_fields(
...     add_custom_fields={5: "Processed"}
... )
>>>
>>> # Add one field and remove another
>>> client.documents().correspondent_id(3).modify_custom_fields(
...     add_custom_fields={7: "2023-04-15"},
...     remove_custom_fields=[9]
... )
- modify_tags(add_tags=None, remove_tags=None)[source]
Modify tags on all documents in the current queryset.
This method allows for adding and removing tags in bulk for all documents matching the current queryset filters.
- Parameters:
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Add tag 3 and remove tag 4 from all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).modify_tags(
...     add_tags=[3], remove_tags=[4]
... )
>>>
>>> # Add multiple tags to recent documents
>>> from datetime import datetime, timedelta
>>> month_ago = datetime.now() - timedelta(days=30)
>>> client.documents().created_after(month_ago.strftime("%Y-%m-%d")).modify_tags(
...     add_tags=[5, 7, 9]
... )
- more_like(document_id)[source]
Find documents similar to the specified document.
Uses Paperless-ngx’s similarity algorithm to find documents with content or metadata similar to the specified document.
- Parameters:
document_id (int) – The ID of the document to find similar documents for.
- Return type:
- Returns:
A queryset with similar documents.
Examples
>>> # Find documents similar to document with ID 42
>>> similar_docs = client.documents().more_like(42)
>>> for doc in similar_docs:
...     print(doc.title)
>>>
>>> # Chain with other filters
>>> recent_similar = client.documents().more_like(42).created_after("2023-01-01")
- no_custom_fields()[source]
Filter documents that do not have any custom fields.
This method returns documents that have no custom fields defined.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find all documents without any custom fields
>>> docs = client.documents().no_custom_fields()
- notes(text)[source]
Filter documents whose notes contain the specified text.
This method searches through the document notes for the specified text.
- Parameters:
text (str) – The text to search for in document notes.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents with "follow up" in notes
>>> docs = client.documents().notes("follow up")
>>>
>>> # Chain with other filters
>>> important_notes = client.documents().notes("important").tag_name("tax")
- original_filename(name, *, exact=True, case_insensitive=True)[source]
Filter documents by original file name.
This filters based on the original filename of the document when it was uploaded.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().original_filename("scan_001.pdf")
>>>
>>> # Contains match
>>> docs = client.documents().original_filename("invoice", exact=False)
- remove_tag(tag_id)[source]
Remove a tag from all documents in the current queryset.
This is a convenience method that calls modify_tags with a single tag ID to remove.
- Parameters:
tag_id (int) – Tag ID to remove from all documents in the queryset.
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Remove tag 4 from all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).remove_tag(4)
>>>
>>> # Remove tag from old documents
>>> client.documents().created_before("2020-01-01").remove_tag(7)
- reprocess()[source]
Reprocess all documents in the current queryset.
This method triggers Paperless-ngx to re-run OCR and classification on all documents matching the current queryset filters.
- Returns:
The API response from the bulk reprocess operation. None if there are no documents to reprocess.
- Return type:
ClientResponse
Examples
>>> # Reprocess documents added in the last week
>>> from datetime import datetime, timedelta
>>> week_ago = datetime.now() - timedelta(days=7)
>>> client.documents().added_after(week_ago.strftime("%Y-%m-%d")).reprocess()
>>>
>>> # Reprocess documents with empty content
>>> client.documents().filter(content="").reprocess()
- rotate(degrees)[source]
Rotate all documents in the current queryset.
This method rotates the PDF files of all documents matching the current queryset filters.
- Parameters:
degrees (int) – Degrees to rotate (must be 90, 180, or 270).
- Returns:
The API response from the bulk rotate operation. None if there are no documents to rotate.
- Return type:
ClientResponse
Examples
>>> # Rotate all documents with "sideways" in title by 90 degrees
>>> client.documents().title("sideways", exact=False).rotate(90)
>>>
>>> # Rotate upside-down documents by 180 degrees
>>> client.documents().tag_name("upside-down").rotate(180)
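Because `degrees` must be 90, 180, or 270, callers that compute an angle (for example a counter-clockwise correction of -90) may want to normalize it first. A minimal sketch under that assumption; `normalize_rotation` is a hypothetical helper and not part of paperap:

```python
def normalize_rotation(degrees: int) -> int:
    """Map any non-zero multiple of 90 onto 90, 180, or 270."""
    if degrees % 90 != 0:
        raise ValueError(f"rotation must be a multiple of 90, got {degrees}")
    # Python's % always returns a non-negative result for a positive modulus
    normalized = degrees % 360
    if normalized == 0:
        raise ValueError("a rotation of 0 degrees would change nothing")
    return normalized


print(normalize_rotation(-90))
print(normalize_rotation(450))
```

The normalized value could then be passed to `rotate`.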
- search(query)[source]
Search for documents using a query string.
This method uses the Paperless-ngx full-text search functionality to find documents matching the query string in their content or metadata.
- Parameters:
query (str) – The search query string.
- Return type:
- Returns:
A queryset with the search results.
Examples
>>> # Search for documents containing "invoice"
>>> docs = client.documents().search("invoice")
>>> for doc in docs:
...     print(doc.title)
>>>
>>> # Search with multiple terms
>>> docs = client.documents().search("electric bill 2023")
- set_permissions(permissions=None, owner_id=None, merge=False)[source]
Set permissions for all documents in the current queryset.
This method allows for updating document permissions in bulk for all documents matching the current queryset filters.
- Parameters:
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Set owner to user 2 for all documents with "invoice" in title
>>> client.documents().title("invoice", exact=False).set_permissions(owner_id=2)
>>>
>>> # Set complex permissions
>>> permissions = {
...     "view": {"users": [1, 2], "groups": [1]},
...     "change": {"users": [1], "groups": []}
... }
>>> client.documents().tag_name("confidential").set_permissions(
...     permissions=permissions,
...     owner_id=1,
...     merge=True
... )
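The permissions dictionary in the example above has a fixed two-level shape: `view` and `change`, each with `users` and `groups` lists. A small sketch of a builder that always produces that shape, so that a missing key cannot slip through; `build_permissions` and its parameter names are hypothetical and not part of paperap:

```python
from typing import Iterable


def build_permissions(
    view_users: Iterable[int] = (),
    view_groups: Iterable[int] = (),
    change_users: Iterable[int] = (),
    change_groups: Iterable[int] = (),
) -> dict[str, dict[str, list[int]]]:
    """Build a permissions dict in the shape used by the example above."""
    def ids(values: Iterable[int]) -> list[int]:
        # De-duplicate and sort so the result is deterministic
        return sorted(set(values))

    return {
        "view": {"users": ids(view_users), "groups": ids(view_groups)},
        "change": {"users": ids(change_users), "groups": ids(change_groups)},
    }


permissions = build_permissions(view_users=[1, 2], view_groups=[1], change_users=[1])
print(permissions)
```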
- storage_path(value=None, *, exact=True, case_insensitive=True, **kwargs)[source]
Filter documents by storage path.
This method provides a flexible interface for filtering documents by storage path, supporting filtering by ID or name, with options for exact or partial matching.
Any number of filter arguments can be provided, but at least one must be specified.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
- Raises:
ValueError – If no valid filters are provided.
TypeError – If value is not an int or str.
Examples
>>> # Filter by ID
>>> client.documents().storage_path(1)
>>> client.documents().storage_path(id=1)
>>>
>>> # Filter by name
>>> client.documents().storage_path("Invoices")
>>> client.documents().storage_path(name="Invoices")
>>>
>>> # Filter by name (partial match)
>>> client.documents().storage_path("Tax", exact=False)
>>>
>>> # Filter by ID and name
>>> client.documents().storage_path(1, name="Invoices")
- storage_path_id(storage_path_id)[source]
Filter documents by storage path ID.
- Parameters:
storage_path_id (int) – The storage path ID to filter by.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Filter documents with storage path ID 2
>>> docs = client.documents().storage_path_id(2)
- storage_path_name(name, *, exact=True, case_insensitive=True)[source]
Filter documents by storage path name.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().storage_path_name("Tax Documents")
>>>
>>> # Contains match
>>> docs = client.documents().storage_path_name("Tax", exact=False)
- tag_id(tag_id)[source]
Filter documents that have the specified tag ID(s).
- Parameters:
tag_id (int | list[int]) – A single tag ID or list of tag IDs.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Filter by a single tag ID
>>> docs = client.documents().tag_id(5)
>>>
>>> # Filter by multiple tag IDs
>>> docs = client.documents().tag_id([5, 7, 9])
- tag_name(tag_name, *, exact=True, case_insensitive=True)[source]
Filter documents that have a tag with the specified name.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().tag_name("Tax")
>>>
>>> # Contains match
>>> docs = client.documents().tag_name("invoice", exact=False)
>>>
>>> # Case-sensitive match
>>> docs = client.documents().tag_name("Receipt", case_insensitive=False)
- title(title, *, exact=True, case_insensitive=True)[source]
Filter documents by title.
- Parameters:
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Exact match (default)
>>> docs = client.documents().title("Electric Bill March 2023")
>>>
>>> # Contains match
>>> docs = client.documents().title("invoice", exact=False)
- update(*, correspondent=None, document_type=None, storage_path=None, owner=None)[source]
Perform bulk updates on all documents in the current queryset.
This method allows for updating multiple document metadata fields in a single API call for all documents matching the current queryset filters.
- Parameters:
correspondent (Correspondent | int | None) – Set correspondent for all documents. Can be a Correspondent object or ID.
document_type (DocumentType | int | None) – Set document type for all documents. Can be a DocumentType object or ID.
storage_path (StoragePath | int | None) – Set storage path for all documents. Can be a StoragePath object or ID.
- Returns:
The current queryset for method chaining.
- Return type:
Self
Examples
>>> # Update correspondent and document type for all invoices
>>> client.documents().title("invoice", exact=False).update(
...     correspondent=5,
...     document_type=3
... )
>>>
>>> # Set owner for documents without an owner
>>> client.documents().filter(owner__isnull=True).update(owner=1)
- user_can_change(value)[source]
Filter documents by user change permission.
This filter is useful for finding documents that the current authenticated user has permission to modify.
- Parameters:
value (bool) – True to filter documents the user can change, False for those they cannot.
- Return type:
Self
- Returns:
Filtered DocumentQuerySet.
Examples
>>> # Find documents the current user can change
>>> docs = client.documents().user_can_change(True)
>>>
>>> # Find documents the current user cannot change
>>> docs = client.documents().user_can_change(False)
- resource: DocumentResource
- filters: dict[str, Any]
- class paperap.models.document.DocumentNoteQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]
Bases:
StandardQuerySet[DocumentNote]
QuerySet for document notes.
Provides standard querying capabilities for DocumentNote objects. Inherits all functionality from StandardQuerySet without additional specialization.
- Parameters:
resource (BaseResource[_Model, Self])
filters (dict[str, Any] | None)
_cache (list[_Model] | None)
_fetch_all (bool)
_next_url (str | None)
_last_response (ClientResponse)
_iter (Iterator[_Model] | None)
_urls_fetched (list[str] | None)
- class paperap.models.document.DownloadedDocument(**data)[source]
Bases:
StandardModel
Represents a downloaded Paperless-ngx document file.
This model stores both the binary content of a downloaded document file and metadata about the file, such as its content type and suggested filename. It is typically used as a return value from document download operations.
- mode
The retrieval mode used (download, preview, or thumbnail). Determines which endpoint was used to retrieve the file.
- Type:
RetrieveFileMode | None
- original
Whether to retrieve the original file (True) or the archived version (False). Only applicable for DOWNLOAD mode.
- Type:
bool
- content
The binary content of the downloaded file.
- Type:
bytes | None
- content_type
The MIME type of the file (e.g., “application/pdf”).
- Type:
str | None
- disposition_filename
The suggested filename from the Content-Disposition header.
- Type:
str | None
- disposition_type
The disposition type from the Content-Disposition header (typically “attachment” or “inline”).
- Type:
str | None
Examples
>>> # Download a document
>>> doc = client.documents.get(123)
>>> downloaded = doc.download_content()
>>> print(f"Downloaded {len(downloaded.content)} bytes")
>>> print(f"File type: {downloaded.content_type}")
>>> print(f"Filename: {downloaded.disposition_filename}")
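The `disposition_type` and `disposition_filename` attributes both come from a single Content-Disposition header. The sketch below shows how such a header splits into those two parts using the standard library; it illustrates the header format only and is not necessarily how paperap parses it. `parse_content_disposition` is a hypothetical helper.

```python
from __future__ import annotations

from email.message import Message


def parse_content_disposition(header: str) -> tuple[str | None, str | None]:
    """Split a Content-Disposition header into (type, filename)."""
    msg = Message()
    msg["Content-Disposition"] = header
    # get_content_disposition() returns the lower-cased disposition type,
    # get_filename() returns the unquoted filename parameter (or None)
    return msg.get_content_disposition(), msg.get_filename()


disposition_type, filename = parse_content_disposition(
    'attachment; filename="scan_001.pdf"'
)
print(disposition_type, filename)
```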
- Parameters:
data (
Any
)
- class Meta(model)[source]
Bases:
Meta
Metadata for the DownloadedDocument model.
Defines which fields are read-only and should not be modified by the client.
- Parameters:
model (type[_Self])
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'content', 'content_type', 'disposition_filename', 'disposition_type', 'id', 'mode', 'original'}
- read_only_fields: ClassVar[set[str]] = {'content', 'content_type', 'disposition_filename', 'disposition_type', 'id'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_post_init(context: Any, /) -> None
We need to both initialize private attributes and call the user-defined model_post_init method.
- mode: RetrieveFileMode | None
- original: bool
- content: bytes | None
- content_type: str | None
- disposition_filename: str | None
- disposition_type: str | None
- class paperap.models.document.DownloadedDocumentQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]
Bases:
StandardQuerySet[DownloadedDocument]
A specialized queryset for handling downloaded document operations.
This queryset extends StandardQuerySet to provide functionality specific to downloaded documents from Paperless-ngx. It enables efficient querying, filtering, and manipulation of document download operations.
The queryset is lazy-loaded, meaning API requests are only made when data is actually needed, improving performance when working with large document collections.
Examples
>>> # Download original documents
>>> client.documents.filter(title__contains="invoice").download("invoices/")
>>>
>>> # Download archive versions
>>> client.documents.filter(archived=True).download(
...     "archives/", archive_version=True
... )
- Parameters:
resource (BaseResource[_Model, Self])
filters (dict[str, Any] | None)
_cache (list[_Model] | None)
_fetch_all (bool)
_next_url (str | None)
_last_response (ClientResponse)
_iter (Iterator[_Model] | None)
_urls_fetched (list[str] | None)
- class paperap.models.document.DocumentMetadata(**data)[source]
Bases:
StandardModel
Represents comprehensive metadata for a Paperless-ngx document.
This model encapsulates all metadata associated with a document in Paperless-ngx, including information about both the original document and its archived version (if available). It provides access to file properties such as checksums, sizes, MIME types, and extracted metadata elements.
The metadata is primarily read-only as it is generated by the Paperless-ngx system during document processing.
- original_checksum
The SHA256 checksum of the original document file.
- original_size
The size of the original document in bytes.
- original_mime_type
The MIME type of the original document (e.g., “application/pdf”).
- media_filename
The filename of the document in the Paperless-ngx media storage.
- has_archive_version
Whether the document has an archived version (typically a PDF/A).
- original_metadata
List of metadata elements extracted from the original document.
- archive_checksum
The SHA256 checksum of the archived document version.
- archive_media_filename
The filename of the archived version in media storage.
- original_filename
The original filename of the document when it was uploaded.
- lang
The detected language code of the document content.
- archive_size
The size of the archived document version in bytes.
- archive_metadata
List of metadata elements extracted from the archived version.
Examples
>>> # Access document metadata
>>> metadata = client.documents.get(123).metadata
>>> print(f"Original file: {metadata.original_filename}")
>>> print(f"Size: {metadata.original_size} bytes")
>>> print(f"MIME type: {metadata.original_mime_type}")
>>>
>>> # Iterate through extracted metadata elements
>>> for element in metadata.original_metadata:
...     print(f"{element.key}: {element.value}")
- Parameters:
data (
Any
)
- class Meta(model)[source]
Bases:
Meta
Metadata configuration for the DocumentMetadata model.
This class defines metadata properties for the DocumentMetadata model, particularly specifying which fields are read-only.
- Parameters:
model (type[_Self])
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
- read_only_fields: ClassVar[set[str]] = {'archive_checksum', 'archive_media_filename', 'archive_metadata', 'archive_size', 'has_archive_version', 'id', 'lang', 'media_filename', 'original_checksum', 'original_filename', 'original_metadata', 'original_mime_type', 'original_size'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_post_init(context: Any, /) -> None
We need to both initialize private attributes and call the user-defined model_post_init method.
- original_checksum: str | None
- original_size: int | None
- original_mime_type: str | None
- media_filename: str | None
- has_archive_version: bool | None
- original_metadata: list[MetadataElement]
- archive_checksum: str | None
- archive_media_filename: str | None
- original_filename: str | None
- lang: str | None
- archive_size: int | None
- archive_metadata: list[MetadataElement]
- class paperap.models.document.DocumentMetadataQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]
Bases:
StandardQuerySet[DocumentMetadata]
A specialized queryset for interacting with Paperless-ngx document metadata.
This queryset extends StandardQuerySet to provide document metadata-specific filtering methods, making it easier to query metadata by their attributes.
Document metadata contains information about documents such as original filename, media information, archive metadata, and other system-level properties that aren’t part of the document’s content or user-assigned metadata.
The queryset is lazy-loaded, meaning API requests are only made when data is actually needed (when iterating, counting, or accessing specific items).
Examples
>>> # Get metadata for a specific document
>>> metadata = client.document_metadata.filter(document=123).first()
>>> print(f"Original filename: {metadata.original_filename}")
>>>
>>> # Get metadata for documents with specific archive information
>>> archived = client.document_metadata.filter(archive_checksum__isnull=False)
- Parameters:
resource (BaseResource[_Model, Self])
filters (dict[str, Any] | None)
_cache (list[_Model] | None)
_fetch_all (bool)
_next_url (str | None)
_last_response (ClientResponse)
_iter (Iterator[_Model] | None)
_urls_fetched (list[str] | None)
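Lazy loading, as described for this queryset, means that building and chaining filters costs nothing until the results are iterated. The toy class below sketches that general pattern with a fake fetch function in place of an API request. It is an illustration only and not paperap's implementation; `LazyQuerySet` and `fake_fetch` are hypothetical names.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator


class LazyQuerySet:
    """Toy queryset: collects filters and defers fetching until iteration."""

    def __init__(self, fetch: Callable[[dict[str, Any]], list[Any]],
                 filters: dict[str, Any] | None = None) -> None:
        self._fetch = fetch
        self.filters = dict(filters or {})
        self._cache: list[Any] | None = None

    def filter(self, **kwargs: Any) -> LazyQuerySet:
        # Return a new queryset with merged filters; nothing is fetched here
        return LazyQuerySet(self._fetch, {**self.filters, **kwargs})

    def __iter__(self) -> Iterator[Any]:
        # Fetch once on first iteration, then serve from the cache
        if self._cache is None:
            self._cache = self._fetch(self.filters)
        return iter(self._cache)


calls: list[dict[str, Any]] = []


def fake_fetch(filters: dict[str, Any]) -> list[str]:
    """Stand-in for an API request; records each call."""
    calls.append(filters)
    return ["doc-1", "doc-2"]


qs = LazyQuerySet(fake_fetch).filter(document=123)
print(len(calls))  # number of fetches before iteration
results = list(qs)
print(len(calls))  # number of fetches after iteration
```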
- class paperap.models.document.MetadataElement(**data)[source]
Bases:
BaseModel
Represents a key-value pair of document metadata in Paperless-ngx.
This model represents individual metadata elements extracted from document files, such as author, creation date, or other file-specific properties. Each element consists of a key and its corresponding value.
- key
The metadata field name or identifier.
- value
The value associated with the metadata field.
Examples
>>> metadata = MetadataElement(key="Author", value="John Doe")
>>> print(f"{metadata.key}: {metadata.value}")
Author: John Doe
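A list of key-value elements is often easier to work with as a dictionary. A minimal sketch of that conversion, using a stand-in dataclass with the same `key` and `value` string fields so that it runs without paperap installed; `Element` and `to_mapping` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Stand-in for MetadataElement with the same two string fields."""
    key: str
    value: str


def to_mapping(elements: list[Element]) -> dict[str, str]:
    """Collapse key-value elements into a dict."""
    # If a key occurs more than once, the last occurrence wins
    return {element.key: element.value for element in elements}


elements = [Element("Author", "John Doe"), Element("Producer", "Scanner 1.0")]
print(to_mapping(elements)["Author"])
```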
- Parameters:
data (
Any
)
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- key: str
- value: str
- class paperap.models.document.CustomFieldValues(**data)[source]
Bases:
ConstModel
Model for custom field values associated with a document.
- field
The ID of the custom field.
- value
The value of the custom field, which can be of any type depending on the field’s data_type.
- Parameters:
data (
Any
)
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'from_attributes': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- field: int
- value: Any
- class paperap.models.document.DocumentSuggestions(**data)[source]
Bases:
StandardModel
Represents AI-generated suggestions for a Paperless-ngx document.
The DocumentSuggestions model contains lists of suggested metadata IDs that Paperless-ngx’s AI has determined might be appropriate for a document based on its content analysis. These suggestions can be used to quickly apply metadata to documents during processing.
All fields in this model are read-only as they are generated by the Paperless-ngx server and cannot be modified by clients.
- correspondents
List of suggested correspondent IDs that might be associated with this document.
- Type:
list[int]
- tags
List of suggested tag IDs that might be relevant to this document’s content.
- Type:
list[int]
- document_types
List of suggested document type IDs that might categorize this document.
- Type:
list[int]
- storage_paths
List of suggested storage path IDs where this document might be stored.
- Type:
list[int]
- dates
List of suggested relevant dates extracted from the document content.
- Type:
list[date]
Examples
>>> # Get suggestions for a document
>>> doc = client.documents.get(123)
>>> suggestions = client.document_suggestions.get(doc.id)
>>>
>>> # Apply suggested tags to the document
>>> if suggestions.tags:
...     doc.tags.extend(suggestions.tags)
...     doc.save()
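Extending a list with suggested IDs, as in the example above, can introduce duplicates when a suggested tag is already applied. A small sketch of an order-preserving merge on plain lists of integer IDs; `merge_ids` is a hypothetical helper, not part of paperap:

```python
def merge_ids(current: list[int], suggested: list[int]) -> list[int]:
    """Append suggested IDs that are not already present, keeping order."""
    seen = set(current)
    merged = list(current)
    for item_id in suggested:
        if item_id not in seen:
            seen.add(item_id)
            merged.append(item_id)
    return merged


print(merge_ids([1, 2], [2, 3, 3]))
```

The input lists are left unchanged; the function returns a new list.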
- Parameters:
data (
Any
)
- class Meta(model)[source]
Bases:
Meta
Metadata for the DocumentSuggestions model.
This class defines metadata for the DocumentSuggestions model, including which fields are read-only.
- Parameters:
model (type[_Self])
- blacklist_filtering_params: ClassVar[set[str]] = {}
- field_map: dict[str, str] = {}
- filtering_disabled: ClassVar[set[str]] = {}
- filtering_fields: ClassVar[set[str]] = {'_resource', 'correspondents', 'dates', 'document_types', 'id', 'storage_paths', 'tags'}
- read_only_fields: ClassVar[set[str]] = {'correspondents', 'dates', 'document_types', 'id', 'storage_paths', 'tags'}
- supported_filtering_params: ClassVar[set[str]] = {'id', 'id__in', 'limit'}
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'populate_by_name': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_post_init(context: Any, /) -> None
We need to both initialize private attributes and call the user-defined model_post_init method.
- correspondents: list[int]
- tags: list[int]
- document_types: list[int]
- storage_paths: list[int]
- dates: list[date]
- class paperap.models.document.DocumentSuggestionsQuerySet(resource, filters=None, _cache=None, _fetch_all=False, _next_url=None, _last_response=None, _iter=None, _urls_fetched=None)[source]
Bases:
StandardQuerySet[DocumentSuggestions]
QuerySet for interacting with document suggestions in Paperless-ngx.
This class extends StandardQuerySet to provide specialized functionality for retrieving and filtering document suggestions. Document suggestions are recommendations for metadata (correspondents, document types, tags) that Paperless-ngx generates based on document content analysis.
The queryset is lazy-loaded, meaning API requests are only made when data is actually accessed, improving performance when working with large datasets.
Examples
>>> # Get all suggestions for a document
>>> suggestions = client.document_suggestions.filter(document=123)
>>>
>>> # Get suggestions with high confidence scores
>>> high_confidence = client.document_suggestions.filter(
...     document=123,
...     confidence__gte=0.8
... )
- Parameters:
resource (BaseResource[_Model, Self])
filters (dict[str, Any] | None)
_cache (list[_Model] | None)
_fetch_all (bool)
_next_url (str | None)
_last_response (ClientResponse)
_iter (Iterator[_Model] | None)
_urls_fetched (list[str] | None)
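The lazy loading described for DocumentSuggestionsQuerySet can be illustrated with a small standalone sketch. It does not reproduce paperap's implementation: LazyResults and fake_fetch are hypothetical names, and the fetch function stands in for a real API request. It only shows the general pattern, in which chaining filters builds up state and the request happens on first access.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator


class LazyResults:
    """Hypothetical stand-in for a lazy queryset (not paperap's code)."""

    def __init__(
        self,
        fetch: Callable[[dict[str, Any]], list[dict[str, Any]]],
        filters: dict[str, Any] | None = None,
    ) -> None:
        self._fetch = fetch
        self._filters = dict(filters or {})
        self._cache: list[dict[str, Any]] | None = None  # filled on first access

    def filter(self, **kwargs: Any) -> "LazyResults":
        # Chaining merges filters into a new object; nothing is fetched here.
        return LazyResults(self._fetch, {**self._filters, **kwargs})

    def __iter__(self) -> Iterator[dict[str, Any]]:
        # The fetch runs only the first time the results are iterated.
        if self._cache is None:
            self._cache = self._fetch(self._filters)
        return iter(self._cache)


calls: list[dict[str, Any]] = []


def fake_fetch(filters: dict[str, Any]) -> list[dict[str, Any]]:
    """Pretend API call that records the filters it received."""
    calls.append(filters)
    return [{"id": filters.get("id"), "tags": [1, 2]}]


qs = LazyResults(fake_fetch).filter(id=123).filter(limit=1)
requests_before = len(calls)  # still zero: building the query made no request
rows = list(qs)               # first access triggers the single fetch
rows_again = list(qs)         # second access is served from the cache
```

Caching the first result is one design choice among several; a real client may also page through results or refetch, which this sketch leaves out.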
Subpackages
Submodules
- paperap.models.document.meta module
- paperap.models.document.model module
DocumentNote
DocumentNote.deleted_at
DocumentNote.restored_at
DocumentNote.transaction_id
DocumentNote.note
DocumentNote.created
DocumentNote.document
DocumentNote.user
DocumentNote.Meta
DocumentNote.serialize_datetime()
DocumentNote.get_document()
DocumentNote.get_user()
DocumentNote.model_config
DocumentNote.model_post_init()
DocumentNote.id
Document
Document.added
Document.archive_checksum
Document.archive_filename
Document.archive_serial_number
Document.archived_file_name
Document.checksum
Document.content
Document.correspondent_id
Document.created
Document.created_date
Document.custom_field_dicts
Document.deleted_at
Document.document_type_id
Document.filename
Document.is_shared_by_requester
Document.notes
Document.original_filename
Document.owner
Document.page_count
Document.storage_path_id
Document.storage_type
Document.tag_ids
Document.title
Document.user_can_change
Document.Meta
Document.serialize_datetime()
Document.serialize_notes()
Document.validate_tags()
Document.validate_custom_fields()
Document.validate_text()
Document.validate_notes()
Document.validate_is_shared_by_requester()
Document.custom_field_ids
Document.custom_field_values
Document.tag_names
Document.tags
Document.correspondent
Document.document_type
Document.storage_path
Document.custom_fields
Document.has_search_hit
Document.search_hit
Document.custom_field_value()
Document.add_tag()
Document.remove_tag()
Document.get_metadata()
Document.download()
Document.model_config
Document.model_post_init()
Document.preview()
Document.id
Document.thumbnail()
Document.get_suggestions()
Document.append_content()
Document.update_locally()
- paperap.models.document.queryset module
CustomFieldQuery
DocumentNoteQuerySet
DocumentQuerySet
DocumentQuerySet.resource
DocumentQuerySet.tag_id()
DocumentQuerySet.tag_name()
DocumentQuerySet.title()
DocumentQuerySet.search()
DocumentQuerySet.more_like()
DocumentQuerySet.correspondent()
DocumentQuerySet.correspondent_id()
DocumentQuerySet.correspondent_name()
DocumentQuerySet.correspondent_slug()
DocumentQuerySet.document_type()
DocumentQuerySet.document_type_id()
DocumentQuerySet.document_type_name()
DocumentQuerySet.storage_path()
DocumentQuerySet.storage_path_id()
DocumentQuerySet.storage_path_name()
DocumentQuerySet.content()
DocumentQuerySet.added_after()
DocumentQuerySet.added_before()
DocumentQuerySet.asn()
DocumentQuerySet.original_filename()
DocumentQuerySet.user_can_change()
DocumentQuerySet.custom_field_fullsearch()
DocumentQuerySet.custom_field()
DocumentQuerySet.has_custom_field_id()
DocumentQuerySet.custom_field_query()
DocumentQuerySet.custom_field_range()
DocumentQuerySet.custom_field_exact()
DocumentQuerySet.custom_field_in()
DocumentQuerySet.custom_field_isnull()
DocumentQuerySet.custom_field_exists()
DocumentQuerySet.custom_field_contains()
DocumentQuerySet.has_custom_fields()
DocumentQuerySet.no_custom_fields()
DocumentQuerySet.notes()
DocumentQuerySet.created_before()
DocumentQuerySet.created_after()
DocumentQuerySet.created_between()
DocumentQuerySet.delete()
DocumentQuerySet.reprocess()
DocumentQuerySet.merge()
DocumentQuerySet.rotate()
DocumentQuerySet.update()
DocumentQuerySet.modify_custom_fields()
DocumentQuerySet.modify_tags()
DocumentQuerySet.add_tag()
DocumentQuerySet.filters
DocumentQuerySet.remove_tag()
DocumentQuerySet.set_permissions()