core.contexts.processing.datalake.picsellia_datalake_processing_context

picsellia_datalake_processing_context

Classes:

Name                                Description
PicselliaDatalakeProcessingContext  Context for running Picsellia datalake processing jobs.

PicselliaDatalakeProcessingContext(processing_parameters_cls, api_token=None, host=None, organization_id=None, job_id=None, use_id=True, download_annotations=True)

Bases: PicselliaContext, Generic[TParameters]

Context for running Picsellia datalake processing jobs.

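On construction, the context initializes the job and its job context, resolves the input datalake, resolves the output datalake and model version when they are configured, retrieves the data IDs, and builds the job parameters with processing_parameters_cls.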

Methods:

Name               Description
to_dict            Convert the context to a dictionary representation.
get_datalake       Fetch a datalake by ID.
get_model_version  Fetch the model version by ID.
get_data_ids       Retrieve data IDs from the job payload.

Attributes:

Name                   Type        Description
job_id
job
job_type
job_context
input_datalake
output_datalake
model_version
data_ids
use_id
download_annotations
processing_parameters
model_version_id       str | None  Get the model version ID, validating presence if required.
api_token
host
organization_id
organization_name
client
working_dir            str         Abstract property to define the working directory path.

job_id = job_id or os.environ.get('job_id') instance-attribute

job = self._initialize_job() instance-attribute

job_type = self.job.sync()['type'] instance-attribute

job_context = self._initialize_job_context() instance-attribute

input_datalake = self.get_datalake(self._input_datalake_id) instance-attribute

output_datalake = self.get_datalake(self._output_datalake_id) if self._output_datalake_id else None instance-attribute

model_version = self.get_model_version() if self._model_version_id else None instance-attribute
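
Because output_datalake and model_version are set to None when their IDs are not configured, calling code should guard against None before using them. The sketch below illustrates one possible guard. ContextStandIn, pick_target_datalake and require_model_version are hypothetical names invented for this example and are not part of the library; plain strings stand in for the real datalake and model version objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextStandIn:
    """Hypothetical stand-in holding only the attributes relevant here."""

    input_datalake: str
    output_datalake: Optional[str] = None
    model_version: Optional[str] = None


def pick_target_datalake(context: ContextStandIn) -> str:
    # One possible policy (an assumption): fall back to the input datalake
    # when no output datalake was configured.
    if context.output_datalake is not None:
        return context.output_datalake
    return context.input_datalake


def require_model_version(context: ContextStandIn) -> str:
    # Fail early with a clear message instead of hitting an AttributeError later.
    if context.model_version is None:
        raise ValueError("This processing needs a model version, but none is configured.")
    return context.model_version
```

A processing step that cannot run without a model would call require_model_version first, while a step that only writes data can rely on pick_target_datalake.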

data_ids = self.get_data_ids() instance-attribute

use_id = use_id instance-attribute

download_annotations = download_annotations instance-attribute

processing_parameters = processing_parameters_cls(log_data=self.job_context['parameters']) instance-attribute
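
The assignment above implies that processing_parameters_cls must accept a log_data keyword argument holding the job's parameters. A minimal sketch of a compatible class follows. ExampleProcessingParameters, the batch_size and threshold keys, and their defaults are all invented for illustration; the library may provide its own base class for this purpose, which is not shown here.

```python
from typing import Any, Mapping


class ExampleProcessingParameters:
    """Hypothetical parameters class; only the `log_data` keyword comes from the documented call."""

    def __init__(self, log_data: Mapping[str, Any]):
        # Read each expected parameter, falling back to a default when it is absent.
        # Values are cast because job parameters may arrive as strings (an assumption).
        self.batch_size = int(log_data.get("batch_size", 8))
        self.threshold = float(log_data.get("threshold", 0.5))


# Stand-in for self.job_context['parameters']; the keys are invented for illustration.
job_context = {"parameters": {"batch_size": "16"}}
processing_parameters = ExampleProcessingParameters(log_data=job_context["parameters"])
print(processing_parameters.batch_size, processing_parameters.threshold)
```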

model_version_id property

Get the model version ID, validating presence if required.

api_token = api_token or os.getenv('api_token') instance-attribute

host = host or os.getenv('host', 'https://app.picsellia.com') instance-attribute

organization_id = organization_id or os.getenv('organization_id') instance-attribute

organization_name = organization_name or os.getenv('organization_name') instance-attribute
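
The connection attributes above share one pattern: an explicit constructor argument takes precedence, otherwise the value is read from an environment variable of the same lowercase name, and host additionally has a default. The sketch below reproduces that resolution order in isolation. resolve_setting is a hypothetical helper written for this example, not a library function.

```python
import os


def resolve_setting(explicit, env_name, default=None):
    """Return the explicit value if truthy, else the environment variable, else the default."""
    # `or` treats None and empty strings alike, mirroring `value or os.getenv(...)`.
    return explicit or os.getenv(env_name, default)


# An explicit argument wins over the environment.
os.environ["host"] = "https://example.invalid"
assert resolve_setting("https://custom.example", "host") == "https://custom.example"

# With no explicit value, the environment variable is used.
assert resolve_setting(None, "host") == "https://example.invalid"

# With neither, the default applies.
del os.environ["host"]
assert resolve_setting(None, "host", "https://app.picsellia.com") == "https://app.picsellia.com"
```

One consequence of using `or` is that an empty string passed explicitly is ignored in favor of the environment variable.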

client = self._initialize_client() instance-attribute

working_dir abstractmethod property

Abstract property to define the working directory path.

This should be implemented by subclasses to specify where files such as datasets, weights, and logs are stored locally.

Returns:

Type  Description
str   Path to the working directory.
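
Since working_dir is an abstract property, a subclass has to provide it before the class can be instantiated. The sketch below shows the mechanism with Python's abc module. BaseContextStandIn and JobContextStandIn are hypothetical stand-ins rather than the library's classes, and the per-job folder layout is only one possible choice.

```python
import os
from abc import ABC, abstractmethod


class BaseContextStandIn(ABC):
    """Hypothetical stand-in for the base context; only `working_dir` is modelled."""

    @property
    @abstractmethod
    def working_dir(self) -> str:
        """Path where datasets, weights, and logs are stored locally."""


class JobContextStandIn(BaseContextStandIn):
    def __init__(self, job_id: str):
        self.job_id = job_id

    @property
    def working_dir(self) -> str:
        # One possible layout (an assumption): a per-job folder under the current directory.
        return os.path.join(os.getcwd(), f"job_{self.job_id}")


context = JobContextStandIn("42")
print(context.working_dir)
```

Trying to instantiate BaseContextStandIn directly raises TypeError, which is how a missing working_dir implementation surfaces.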

to_dict()

Convert the context to a dictionary representation.

get_datalake(datalake_id)

Fetch a datalake by ID.

get_model_version()

Fetch the model version by ID.

get_data_ids()

Retrieve data IDs from the job payload.