steps.base.dataset.uploader

uploader

Functions:

| Name | Description |
| --- | --- |
| upload_full_dataset | Upload both images and annotations for a COCO dataset. |
| upload_dataset_images | Upload only the image files from a COCO dataset. |
| upload_dataset_annotations | Upload only the annotations from a COCO dataset. |

upload_full_dataset(dataset, datalake=None, data_tag=None, use_id=True, fail_on_asset_not_found=True, replace_annotations=None)

Upload both images and annotations for a COCO dataset.

This step manages the complete dataset upload workflow. It configures the dataset type based on its annotations and handles image and annotation upload according to the dataset's inference type (classification, detection, etc.).

If annotations are present:
- The dataset type is automatically inferred.
- Both images and annotations are uploaded.
- If replace_annotations is not explicitly provided, it will be determined from the processing context.

If annotations are missing:
- Only images are uploaded.
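The branching described above can be sketched roughly as follows. This is a minimal illustration with stand-in types, not the library's implementation: `FakeDataset`, `plan_upload`, the phase names, and the ordering of the phases are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a CocoDataset; not the real class."""
    images: list
    annotations: list = field(default_factory=list)


def plan_upload(dataset: FakeDataset) -> list:
    """Return the upload phases implied by the documented behavior."""
    if dataset.annotations:
        # Annotations present: the dataset type is inferred, then both
        # images and annotations are uploaded (phase order is assumed).
        return ["infer_type", "images", "annotations"]
    # Annotations missing: only the images are uploaded.
    return ["images"]


labelled = FakeDataset(images=["a.jpg"], annotations=[{"asset": "a.jpg", "label": "cat"}])
unlabelled = FakeDataset(images=["b.jpg"])
print(plan_upload(labelled))
print(plan_upload(unlabelled))
```

The point of the sketch is only that the presence of annotations, not a separate flag, decides which phases run.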

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset | CocoDataset | The dataset to upload (including images and optionally annotations). | required |
| datalake | Optional[Datalake] | The target datalake. If not provided, it is inferred from the processing context. | None |
| data_tag | Optional[str] | The tag used to associate the upload in the datalake. Defaults to the one in the context. | None |
| use_id | bool | Whether to use asset IDs for the upload (defaults to True). | True |
| fail_on_asset_not_found | bool | If True, raises an error when a corresponding asset is not found. | True |
| replace_annotations | Optional[bool] | Whether to overwrite existing annotations. Fetched from context if None. | None |

upload_dataset_images(dataset, datalake=None, data_tag=None)

Upload only the image files from a COCO dataset.

This step uploads all image assets associated with the provided dataset to the datalake. Annotation data, if present, is ignored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset | CocoDataset | The dataset whose image files should be uploaded. | required |
| datalake | Optional[Datalake] | The target datalake. Inferred from the context if not provided. | None |
| data_tag | Optional[str] | Optional tag to associate with the uploaded data. Inferred from the context if not provided. | None |
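As a rough illustration of the images-only step, the sketch below resolves the target datalake and tag from a context when they are not passed, and ignores any annotations on the dataset. `FakeContext`, `FakeDataset`, and `plan_image_upload` are hypothetical names for this example; the real context object and its attributes are not described in this reference.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeContext:
    """Hypothetical processing context holding default upload targets."""
    datalake: str
    data_tag: str


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a CocoDataset."""
    images: list
    annotations: list = field(default_factory=list)


def plan_image_upload(
    dataset: FakeDataset,
    context: FakeContext,
    datalake: Optional[str] = None,
    data_tag: Optional[str] = None,
) -> dict:
    """Describe an images-only upload; annotations are deliberately ignored."""
    return {
        # Fall back to the context only when the argument was not provided.
        "datalake": context.datalake if datalake is None else datalake,
        "data_tag": context.data_tag if data_tag is None else data_tag,
        "files": list(dataset.images),
    }


ctx = FakeContext(datalake="default-lake", data_tag="run-1")
ds = FakeDataset(images=["a.jpg", "b.jpg"], annotations=[{"asset": "a.jpg"}])
print(plan_image_upload(ds, ctx))
print(plan_image_upload(ds, ctx, data_tag="manual-tag"))
```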

upload_dataset_annotations(dataset, use_id=True, fail_on_asset_not_found=True, replace_annotations=None)

Upload only the annotations from a COCO dataset.

This step uploads only the annotations of a dataset, according to its inference type. It configures the dataset type (e.g., classification or detection) based on the annotations present.

If replace_annotations is not explicitly provided, the value is taken from the processing parameters context.
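Because replace_annotations is an Optional[bool], it has three states: True, False, and "not provided". The sketch below shows one way such a fallback could be written so that an explicit False is respected. `resolve_replace_annotations` and `context_value` are hypothetical names; how the library reads the value from its processing parameters is not shown in this reference.

```python
from typing import Optional


def resolve_replace_annotations(explicit: Optional[bool], context_value: bool) -> bool:
    """Pick the explicit argument when given, otherwise the context value."""
    # Compare against None instead of writing `explicit or context_value`:
    # with `or`, an explicit False would be overridden by a True context value.
    if explicit is None:
        return context_value
    return explicit


print(resolve_replace_annotations(None, True))
print(resolve_replace_annotations(False, True))
print(resolve_replace_annotations(True, False))
```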

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset | CocoDataset | The dataset containing annotations to upload. | required |
| use_id | bool | Whether to use asset IDs for the upload. Defaults to True. | True |
| fail_on_asset_not_found | bool | Whether to fail if an asset referenced in the annotations is missing. Defaults to True. | True |
| replace_annotations | Optional[bool] | Whether to overwrite existing annotations. Fetched from context if not provided. | None |
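The effect of fail_on_asset_not_found can be pictured with a small stand-alone sketch. Everything here is hypothetical: `AssetNotFoundError`, `match_annotations`, and the `"asset"` key are invented for illustration, and skipping unmatched annotations when the flag is False is an assumption, since this reference only describes the failing case.

```python
class AssetNotFoundError(LookupError):
    """Hypothetical error type used only in this sketch."""


def match_annotations(annotations, known_assets, fail_on_asset_not_found=True):
    """Keep annotations whose asset exists; fail or skip on the others."""
    matched = []
    for annotation in annotations:
        asset_key = annotation["asset"]
        if asset_key not in known_assets:
            if fail_on_asset_not_found:
                raise AssetNotFoundError(f"No asset found for {asset_key!r}")
            # Assumed lenient behavior: drop the annotation and continue.
            continue
        matched.append(annotation)
    return matched


annotations = [{"asset": "a.jpg", "label": "cat"}, {"asset": "missing.jpg", "label": "dog"}]
print(match_annotations(annotations, {"a.jpg"}, fail_on_asset_not_found=False))
```

With the default of True, the same call would raise on `missing.jpg` instead of returning a partial result, which makes mismatches between annotations and uploaded assets visible early.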