core.data.dataset.dataset_collection
dataset_collection
Classes:
Name | Description |
---|---|
DatasetCollection | A collection of datasets for different splits of a dataset. |
DatasetCollection(datasets)
Bases: ABC, Generic[TBaseDataset]
A collection of datasets for different splits of a dataset.
This class aggregates datasets for the common splits used in machine learning projects: training, validation, and testing. It provides a convenient way to access and manipulate these datasets as a unified object. The class supports direct access to individual dataset contexts, iteration over all contexts, and collective operations on all contexts, such as downloading assets.
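The access patterns described above can be sketched with a small stand-in. `SketchDataset` and `SketchCollection` are hypothetical names invented for this illustration, and the `__getitem__` and `__iter__` methods are assumptions about how direct access and iteration might be exposed; the real class may implement them differently.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Dict, Generic, Iterator, List, Optional, TypeVar


@dataclass
class SketchDataset:
    """Hypothetical stand-in for one dataset split; only `name` is used."""

    name: str


TSketchDataset = TypeVar("TSketchDataset", bound=SketchDataset)


class SketchCollection(ABC, Generic[TSketchDataset]):
    """Hypothetical stand-in that mirrors the documented constructor."""

    def __init__(self, datasets: List[TSketchDataset]) -> None:
        # Same shape as the documented `datasets` attribute: name -> dataset.
        self.datasets: Dict[str, TSketchDataset] = {d.name: d for d in datasets}
        self.dataset_path: Optional[str] = None

    def __getitem__(self, name: str) -> TSketchDataset:
        # Assumed form of "direct access" to a single split.
        return self.datasets[name]

    def __iter__(self) -> Iterator[TSketchDataset]:
        # Assumed form of "iteration over all contexts".
        return iter(self.datasets.values())


collection = SketchCollection(
    [SketchDataset("train"), SketchDataset("val"), SketchDataset("test")]
)
split_names = [dataset.name for dataset in collection]
val_split = collection["val"]
```

Keying the splits by name keeps lookups explicit (`collection["val"]`) rather than relying on the position of a split in the input list.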
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datasets | List[TDataset] | A list of datasets for different splits (train, val, test). | required |
Methods:
Name | Description |
---|---|
download_all | Downloads all assets and annotations for every dataset in the collection. |
Attributes:
Name | Type | Description |
---|---|---|
datasets | | A dictionary of datasets, indexed by their names. |
dataset_path | str \| None | The path to the dataset directory. |
datasets = {dataset.name: dataset for dataset in datasets}
instance-attribute
A dictionary of datasets, indexed by their names.
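Because the dictionary is keyed by each dataset's name, two datasets that share a name collapse into a single entry, and the later one replaces the earlier. The sketch below illustrates this standard dictionary behaviour with a hypothetical `NamedSplit` tuple; it does not use the library's dataset classes, and the `size` field exists only to tell the entries apart.

```python
from typing import NamedTuple


class NamedSplit(NamedTuple):
    """Hypothetical stand-in for a dataset; not part of the library."""

    name: str
    size: int


splits = [NamedSplit("train", 100), NamedSplit("val", 20), NamedSplit("train", 5)]

# Same comprehension shape as the documented attribute.
by_name = {split.name: split for split in splits}

# Names that occur more than once in the input list.
names = [split.name for split in splits]
duplicates = sorted({name for name in names if names.count(name) > 1})
```

Checking for duplicate names before building a collection is one way to avoid silently dropping a split.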
dataset_path = None
instance-attribute
The path to the dataset directory.
download_all(images_destination_dir, annotations_destination_dir, use_id=True, skip_asset_listing=False)
Downloads all assets and annotations for every dataset in the collection.
For each dataset, this method:
1. Downloads the assets (images) to the corresponding image directory.
2. Downloads and builds the COCO annotation file for each dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
images_destination_dir | str | The directory where images will be saved. | required |
annotations_destination_dir | str | The directory where annotations will be saved. | required |
use_id | Optional[bool] | Whether to use asset IDs in the file paths. If None, the internal logic of each dataset will handle it. | True |
skip_asset_listing | bool | If True, skips listing the assets when downloading. Defaults to False. | False |
Example
If you download assets and annotations for both the train and validation datasets, this method creates two directories for each dataset (e.g., train/images, train/annotations, val/images, val/annotations) under the specified destination_path.
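The directory layout described in this example can be sketched as follows. This is only an illustration of the resulting folder structure: `build_split_layout` is a hypothetical helper, nothing is downloaded, and the source does not state how `images_destination_dir` and `annotations_destination_dir` map onto this layout, so that mapping is left out.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable


def build_split_layout(
    destination_path: Path, split_names: Iterable[str]
) -> Dict[str, Dict[str, Path]]:
    """Create <destination>/<split>/images and <destination>/<split>/annotations.

    Hypothetical helper: it only reproduces the layout from the example above.
    """
    layout: Dict[str, Dict[str, Path]] = {}
    for split in split_names:
        images_dir = destination_path / split / "images"
        annotations_dir = destination_path / split / "annotations"
        images_dir.mkdir(parents=True, exist_ok=True)
        annotations_dir.mkdir(parents=True, exist_ok=True)
        layout[split] = {"images": images_dir, "annotations": annotations_dir}
    return layout


with tempfile.TemporaryDirectory() as tmp:
    layout = build_split_layout(Path(tmp), ["train", "val"])
    # Collect every directory that was created, relative to the temp root.
    created = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_dir()
    )
```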