abacusai.upload
Module Contents
Classes
Upload: An Upload reference for uploading file parts
- class abacusai.upload.Upload(client, uploadId=None, datasetUploadId=None, status=None, datasetId=None, datasetVersion=None, modelVersion=None, batchPredictionId=None, parts=None, createdAt=None)
Bases: abacusai.return_class.AbstractApiClass
An Upload reference for uploading file parts
- Parameters
client (ApiClient) – An authenticated API Client instance
uploadId (str) – The unique ID generated when the upload process of the full large file in smaller parts is initiated.
datasetUploadId (str) – Same as upload_id. It is kept for backwards compatibility purposes.
status (str) – The current status of the upload.
datasetId (str) – A reference to the dataset this upload is adding data to.
datasetVersion (str) – A reference to the dataset version the upload is adding data to.
modelVersion (str) – A reference to the model version the upload is creating.
batchPredictionId (str) – A reference to the batch prediction the upload is creating.
parts (list of json objects) – A list containing the order of the file parts that have been uploaded.
createdAt (str) – The timestamp at which the upload was created.
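A minimal sketch of how an Upload instance is typically obtained and inspected. The originating call is an assumption here (any API call that initiates a multipart upload returns this class); the attribute names follow the parameters above.

```python
from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')

# Hypothetical originating call, shown for illustration only; substitute
# whichever API call starts your multipart upload (dataset, model version,
# or batch prediction).
upload = client.create_dataset_from_upload(table_name='my_table', file_format='CSV')

print(upload.upload_id)   # unique ID for this multipart upload
print(upload.status)      # current upload status
print(upload.dataset_id)  # dataset the upload is adding data to
```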
- __repr__(self)
Return repr(self).
- to_dict(self)
Get a dict representation of the parameters in this class
- Returns
The dict value representation of the class parameters
- Return type
dict
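A small usage sketch, assuming `upload` is an existing Upload instance: to_dict() returns the parameters above as a plain dict, which is convenient for logging or serialization.

```python
import json

info = upload.to_dict()
print(json.dumps(info, indent=2, default=str))
```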
- part(self, part_number, part_data)
Uploads a part of a large dataset file from your bucket to our system. Our system currently supports a size of up to 5GB for a part of a full file and a size of up to 5TB for the full file. Note that each part must be >=5MB in size, unless it is the last part in the sequence of parts for the full file.
- Parameters
part_number (int) – The 1-indexed number denoting the position of the file part in the sequence of parts for the full file.
part_data (io.TextIOBase) – The multipart/form-data for the current part of the full file.
- Returns
The object ‘UploadPart’ which encapsulates the hash and the etag for the part that got uploaded.
- Return type
UploadPart
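A hedged sketch of driving part() manually, assuming `upload` was obtained as above and the data comes from a local binary file. The chunk size, file name, and use of BytesIO buffers are illustrative; each part except the last must be at least 5MB, and no part may exceed 5GB.

```python
import io

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB per part; >= 5 MB except for the last part

with open('large_file.csv', 'rb') as f:    # illustrative file name
    part_number = 1                        # part numbers are 1-indexed
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        # Returns an UploadPart carrying the hash and etag of the uploaded part.
        upload.part(part_number, io.BytesIO(chunk))
        part_number += 1
```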
- mark_complete(self)
Marks an upload process as complete.
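Once every part has been sent, the upload must be finalized so the parts can be joined server-side; a one-line continuation of the loop above:

```python
upload.mark_complete()
```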
- refresh(self)
Calls describe and refreshes the current object’s fields
- Returns
The current object
- Return type
Upload
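A hedged polling sketch built on refresh(). The exact status strings are an assumption (the describe() documentation below mentions "complete" and "inspecting"); adjust the condition to the values your account actually returns.

```python
import time

# Re-fetch the upload every few seconds until it reports a terminal status.
while (upload.refresh().status or '').upper() not in ('COMPLETE', 'FAILED'):
    time.sleep(5)
print('final status:', upload.status)
```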
- describe(self)
Retrieves the current upload status (complete or inspecting) and the list of file parts uploaded for a specified dataset upload.
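A short sketch of describe(), assuming it returns the refreshed Upload (as refresh(), which calls it, implies):

```python
current = upload.describe()
print('status:', current.status)
print('parts uploaded so far:', len(current.parts or []))
```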
- upload_part(self, upload_args)
Uploads a file part. If the upload fails, it will retry up to 3 times with a short backoff before raising an exception.
- Returns
The object ‘UploadPart’ that encapsulates the hash and the etag for the part that got uploaded.
- Return type
UploadPart
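The shape of upload_args is not documented here; the sketch below assumes a (part_number, part_data) pair matching what _yield_upload_part produces, which is an assumption. In normal use this method is driven by upload_file() rather than called directly.

```python
import io

# Assumed argument shape: (1-indexed part number, file-like part data).
upload.upload_part((1, io.BytesIO(b'col_a,col_b\n1,2\n')))
```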
- upload_file(self, file, threads=10, chunksize=1024 * 1024 * 10, wait_timeout=600)
Uploads the file in the specified chunk size using the specified number of workers.
- Parameters
file (IOBase) – A BytesIO or StringIO object to upload to Abacus.AI
threads (int, optional) – The max number of workers to use while uploading the file
chunksize (int, optional) – The number of bytes to use for each chunk while uploading the file. Defaults to 10 MB
wait_timeout (int, optional) – The max number of seconds to wait for the file parts to be joined on Abacus.AI. Defaults to 600.
- Returns
The upload file object.
- Return type
Upload
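The typical end-to-end path: hand upload_file() a file object and let it chunk, parallelize, retry, and wait for the server-side join. The file name is illustrative; an open binary file handle or an io.BytesIO/io.StringIO buffer both satisfy the IOBase requirement.

```python
with open('large_file.csv', 'rb') as f:    # illustrative file name
    completed = upload.upload_file(
        f,
        threads=10,                   # parallel upload workers
        chunksize=10 * 1024 * 1024,   # 10 MB parts
        wait_timeout=600,             # seconds to wait for the parts to be joined
    )
print(completed.status)
```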
- _yield_upload_part(self, file, chunksize)
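An illustrative re-implementation of what a chunking generator like _yield_upload_part might look like (not the library's actual code): it splits a file object into 1-indexed (part_number, chunk) pairs of at most chunksize bytes.

```python
import io

def yield_upload_parts(file, chunksize):
    """Illustrative only: yield (part_number, BytesIO chunk) pairs."""
    part_number = 1
    while True:
        chunk = file.read(chunksize)
        if not chunk:
            break
        data = chunk if isinstance(chunk, bytes) else chunk.encode()
        yield part_number, io.BytesIO(data)
        part_number += 1
```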