core.contexts.processing.datalake.local_datalake_processing_context

local_datalake_processing_context

Classes:

Name                            Description
LocalDatalakeProcessingContext  Context for local testing of processing jobs without real job execution on Picsellia.

Functions:

Name               Description
create_processing  Create a processing configuration in Picsellia.
get_processing     Get the ID of a processing by name.
launch_processing  Launch a processing job on a datalake.

LocalDatalakeProcessingContext(api_token=None, host=None, organization_id=None, job_id=None, job_type=None, input_datalake_id=None, output_datalake_id=None, model_version_id=None, offset=0, limit=100, use_id=True, processing_parameters=None)

Bases: PicselliaContext

Context for local testing of processing jobs without real job execution on Picsellia.

Methods:

Name               Description
get_datalake       Retrieve a datalake by its ID.
get_model_version  Retrieve a model version by its ID.
get_data_ids       List data IDs from a datalake with offset and limit.
to_dict            Convert the context to a dictionary.

Attributes:

Name                   Type  Description
job_id
job_type
input_datalake_id
output_datalake_id
model_version_id
input_datalake
output_datalake
model_version
offset
limit
data_ids
use_id
processing_parameters
api_token
host
organization_id
organization_name
client
working_dir            str   Abstract property to define the working directory path.

job_id = job_id instance-attribute

job_type = job_type instance-attribute

input_datalake_id = input_datalake_id instance-attribute

output_datalake_id = output_datalake_id instance-attribute

model_version_id = model_version_id instance-attribute

input_datalake = self.get_datalake(input_datalake_id) instance-attribute

output_datalake = self.get_datalake(output_datalake_id) if output_datalake_id else None instance-attribute

model_version = self.get_model_version(model_version_id=model_version_id) instance-attribute

offset = offset instance-attribute

limit = limit instance-attribute

data_ids = self.get_data_ids(datalake=self.input_datalake, offset=self.offset, limit=self.limit) instance-attribute

use_id = use_id instance-attribute

processing_parameters = processing_parameters instance-attribute

api_token = api_token or os.getenv('api_token') instance-attribute

host = host or os.getenv('host', 'https://app.picsellia.com') instance-attribute

organization_id = organization_id or os.getenv('organization_id') instance-attribute

organization_name = organization_name or os.getenv('organization_name') instance-attribute

client = self._initialize_client() instance-attribute
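
The api_token, host, organization_id and organization_name attributes above all follow the same pattern: an explicit argument wins, otherwise an environment variable is read, and host additionally falls back to a default URL. A minimal sketch of that resolution order follows. The helper name resolve_setting is hypothetical and not part of the library; the environment variable names and the default host are copied from the attribute definitions above.

```python
import os


def resolve_setting(explicit, env_name, default=None):
    """Return the explicit value if truthy, else the environment variable, else the default.

    Hypothetical helper mirroring the `value or os.getenv(name, default)`
    expressions shown above; it is not part of the library.
    """
    return explicit or os.getenv(env_name, default)


# An explicit argument wins over the environment.
os.environ["api_token"] = "token-from-env"
print(resolve_setting("token-from-arg", "api_token"))

# With no explicit value, the environment variable is used.
print(resolve_setting(None, "api_token"))

# With neither set, the default applies (the default host listed above).
os.environ.pop("host", None)
print(resolve_setting(None, "host", "https://app.picsellia.com"))
```

Because the fallback uses `or`, any falsy explicit value (for example an empty string) is treated like a missing argument and the environment variable is consulted instead.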

working_dir abstractmethod property

Abstract property to define the working directory path.

This should be implemented by subclasses to specify where files such as datasets, weights, and logs are stored locally.

Returns:

Name  Type  Description
str   str   Path to the working directory.
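
As an illustration of how an abstract working_dir property can be satisfied by a subclass, here is a sketch using Python's abc module. BaseContext and JobScopedContext are hypothetical stand-ins written for this example, and the directory layout (a job_<id> folder under the current directory) is an assumption, not the library's actual behavior.

```python
import os
from abc import ABC, abstractmethod


class BaseContext(ABC):
    """Hypothetical stand-in for a base class that declares working_dir."""

    @property
    @abstractmethod
    def working_dir(self) -> str:
        """Path where datasets, weights, and logs are stored locally."""


class JobScopedContext(BaseContext):
    """Hypothetical subclass that derives the directory from a job ID."""

    def __init__(self, job_id: str):
        self.job_id = job_id

    @property
    def working_dir(self) -> str:
        # Assumed layout: one folder per job under the current directory.
        return os.path.join(os.getcwd(), f"job_{self.job_id}")


ctx = JobScopedContext(job_id="local-test")
print(ctx.working_dir)
```

A class that leaves working_dir unimplemented cannot be instantiated; Python raises TypeError when an abstract member is still missing.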

get_datalake(datalake_id)

Retrieve a datalake by its ID.
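
The attribute definitions earlier on this page show that input_datalake is always fetched through get_datalake, while output_datalake is fetched only when output_datalake_id is set and is None otherwise. The sketch below reproduces that logic with a plain dictionary standing in for the Picsellia lookup. resolve_datalakes and fake_store are hypothetical names introduced for this example.

```python
def resolve_datalakes(get_datalake, input_datalake_id, output_datalake_id=None):
    """Fetch the input datalake, and the output datalake only if an ID is given.

    `get_datalake` is any callable mapping an ID to a datalake object.
    Hypothetical helper; it mirrors the two attribute expressions above.
    """
    input_datalake = get_datalake(input_datalake_id)
    output_datalake = get_datalake(output_datalake_id) if output_datalake_id else None
    return input_datalake, output_datalake


# Stand-in lookup: a dict instead of a call to the Picsellia API.
fake_store = {"in-1": "input datalake", "out-1": "output datalake"}

print(resolve_datalakes(fake_store.__getitem__, "in-1"))
print(resolve_datalakes(fake_store.__getitem__, "in-1", "out-1"))
```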

get_model_version(model_version_id)

Retrieve a model version by its ID.

get_data_ids(datalake, offset, limit)

List data IDs from a datalake with offset and limit.
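
To make the offset and limit semantics concrete, the sketch below applies them to an in-memory list. It assumes the two values select a contiguous window, as a slice would. The real method queries the datalake rather than a Python list, and page_of_ids is a hypothetical name. The defaults of 0 and 100 are taken from the constructor signature above.

```python
def page_of_ids(all_ids, offset=0, limit=100):
    """Return at most `limit` IDs, starting at position `offset`.

    Hypothetical helper; assumes offset/limit behave like a slice window.
    """
    return all_ids[offset:offset + limit]


all_ids = [f"data-{i}" for i in range(250)]

first_page = page_of_ids(all_ids)                        # defaults: offset=0, limit=100
last_page = page_of_ids(all_ids, offset=200, limit=100)  # only 50 IDs remain

print(len(first_page), first_page[0])   # 100 data-0
print(len(last_page), last_page[-1])    # 50 data-249
```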

to_dict()

Convert the context to a dictionary.

create_processing(client, name, type, default_cpu, default_gpu, default_parameters, docker_image, docker_tag, docker_flags=None)

Create a processing configuration in Picsellia.

Returns:

Name  Type  Description
str   str   ID of the created processing.

get_processing(client, name)

Get the ID of a processing by name.

Returns:

Name  Type  Description
str   str   ID of the found processing.
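
Since create_processing and get_processing both return a processing ID, a local test script may want to reuse an existing processing and create one only when the name is not found. The sketch below shows that get-or-create pattern against an in-memory registry. Everything in it is a stand-in: get_or_create_processing, fake_lookup and fake_create are hypothetical, and the assumption that a failed lookup raises LookupError is not confirmed by this page, so the real get_processing may signal a miss differently.

```python
def get_or_create_processing(lookup, create, name):
    """Return the ID for `name`, creating the processing if the lookup fails.

    Assumption: `lookup` raises LookupError when no processing has that name.
    """
    try:
        return lookup(name)
    except LookupError:
        return create(name)


registry = {}  # in-memory stand-in mapping processing name -> ID


def fake_lookup(name):
    # A missing key raises KeyError, which is a subclass of LookupError.
    return registry[name]


def fake_create(name):
    registry[name] = f"id-{len(registry) + 1}"
    return registry[name]


first = get_or_create_processing(fake_lookup, fake_create, "resize-images")
second = get_or_create_processing(fake_lookup, fake_create, "resize-images")
print(first, second)  # id-1 id-1
```

The second call returns the same ID without creating a duplicate entry, which is the property that matters when a test script is run repeatedly.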

launch_processing(client, datalake, data_ids, model_version_id, processing_id, parameters, cpu, gpu, target_datalake_name=None)

Launch a processing job on a datalake.

Returns:

Name  Type  Description
Job   Job   The launched job object.