frameworks.clip.services.predictor

predictor

Classes:

Name Description
PicselliaCLIPEmbeddingPrediction

Dataclass representing a CLIP prediction for an image, optionally including a text embedding.

CLIPModelPredictor

Predictor class for CLIP-based inference on image and text data.

PicselliaCLIPEmbeddingPrediction(asset, image_embedding, text_embedding) dataclass

Dataclass representing a CLIP prediction for an image, optionally including a text embedding.

Attributes:

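Name Type Description
asset Asset

The asset the prediction refers to.

image_embedding list[float]

CLIP embedding of the image.

text_embedding list[float]

CLIP embedding of the paired text; empty for image-only predictions.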

asset instance-attribute

image_embedding instance-attribute

text_embedding instance-attribute
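
To make the shape of this dataclass concrete, here is a minimal sketch that mirrors the three documented fields. `EmbeddingPredictionSketch` is a hypothetical stand-in, not the library class: the real `asset` field holds a Picsellia `Asset`, and representing an absent text embedding as an empty list is an assumption.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class EmbeddingPredictionSketch:
    """Hypothetical stand-in mirroring the three documented fields."""

    asset: Any  # the real class stores a Picsellia Asset here
    image_embedding: list[float]
    text_embedding: list[float]  # assumed to be [] when no text was embedded


# An image-only prediction: the text embedding is passed as an empty list.
pred = EmbeddingPredictionSketch(
    asset="asset-001",  # placeholder; a real Asset object would go here
    image_embedding=[0.1, 0.2, 0.3],
    text_embedding=[],
)
has_text = bool(pred.text_embedding)
```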

CLIPModelPredictor(model, device)

Bases: ModelPredictor

Predictor class for CLIP-based inference on image and text data.

Parameters:

Name Type Description Default

model

CLIPModel

The CLIP model instance.

required

device

str

Target device ("cuda" or "cpu").

required

Methods:

Name Description
embed_image

Encode an image into a CLIP embedding.

embed_text

Encode a text string into a CLIP embedding.

run_image_inference_on_batches

Perform inference on batches of images.

run_inference_on_batches

Perform inference on batches of image-text pairs.

post_process_batches

Convert image-text batch results into Picsellia prediction objects.

post_process_image_batches

Convert image-only batch results into Picsellia prediction objects.

pre_process_dataset

Extract all image paths from the dataset's image directory.

prepare_batches

get_picsellia_label

Get or create a PicselliaLabel from a dataset category name.

get_picsellia_confidence

Wrap a confidence score in a PicselliaConfidence object.

get_picsellia_rectangle

Create a PicselliaRectangle from bounding box coordinates.

Attributes:

Name Type Description
model CLIPModel

The CLIP model instance.

device str

Target device ("cuda" or "cpu").

model = model instance-attribute

device = device instance-attribute

embed_image(image_path)

Encode an image into a CLIP embedding.

Parameters:

Name Type Description Default
image_path
str

Path to the input image.

required

Returns:

Type Description
list[float]

A list of float values representing the image embedding.

embed_text(text)

Encode a text string into a CLIP embedding.

Parameters:

Name Type Description Default
text
str

Input text string.

required

Returns:

Type Description
list[float]

A list of float values representing the text embedding.
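
Because `embed_image` and `embed_text` both return plain `list[float]` values, the two embeddings can be compared directly. CLIP embeddings are commonly compared with cosine similarity; the helper below is a standard-library sketch of that comparison and is not part of the predictor API. The name `cosine_similarity` and the choice to return 0.0 for zero-length vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for degenerate vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the outputs of embed_image and embed_text.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```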

run_image_inference_on_batches(image_batches)

Perform inference on batches of images.

Parameters:

Name Type Description Default
image_batches
list[list[str]]

List of batches; each batch is a list of image paths.

required

Returns:

Type Description
list[list[dict]]

Nested list of dictionaries containing image embeddings.

run_inference_on_batches(image_text_batches)

Perform inference on batches of image-text pairs.

Parameters:

Name Type Description Default
image_text_batches
list[list[tuple[str, str]]]

List of batches containing (image_path, text) tuples.

required

Returns:

Type Description
list[list[dict]]

Nested list of dictionaries with image and text embeddings.
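
The nested return shape (one inner list per batch, one dictionary per pair) can be illustrated with a small sketch. This is not the library implementation: the dictionary keys `"image_embedding"` and `"text_embedding"` are assumed names, and `fake_embed` is a hypothetical stub that replaces the real CLIP encoders so the example runs without a model.

```python
from typing import Callable

Embedder = Callable[[str], list[float]]


def run_on_batches_sketch(
    image_text_batches: list[list[tuple[str, str]]],
    embed_image: Embedder,
    embed_text: Embedder,
) -> list[list[dict]]:
    """Embed every (image_path, text) pair, preserving the batch structure."""
    all_results = []
    for batch in image_text_batches:
        batch_results = []
        for image_path, text in batch:
            batch_results.append(
                {
                    "image_embedding": embed_image(image_path),  # assumed key name
                    "text_embedding": embed_text(text),  # assumed key name
                }
            )
        all_results.append(batch_results)
    return all_results


def fake_embed(value: str) -> list[float]:
    """Hypothetical stub: a one-dimensional 'embedding' based on string length."""
    return [float(len(value))]


batches = [[("a.jpg", "cat"), ("bb.jpg", "dog")], [("c.jpg", "bird")]]
results = run_on_batches_sketch(batches, fake_embed, fake_embed)
```

The output has the same outer and inner lengths as the input, which is what lets a later post-processing step pair each result with its source path.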

post_process_batches(image_text_batches, batch_results, dataset)

Convert image-text batch results into Picsellia prediction objects.

Parameters:

Name Type Description Default
image_text_batches
list[list[tuple[str, str]]]

Input image-text batches.

required
batch_results
list[list[dict]]

Corresponding results from inference.

required
dataset
TBaseDataset

Dataset object to resolve asset references.

required

Returns:

Type Description
list[PicselliaCLIPEmbeddingPrediction]

List of PicselliaCLIPEmbeddingPrediction objects.

post_process_image_batches(image_batches, batch_results, dataset)

Convert image-only batch results into Picsellia prediction objects.

Parameters:

Name Type Description Default
image_batches
list[list[str]]

List of image batches.

required
batch_results
list[list[dict]]

Corresponding image-only inference results.

required
dataset
TBaseDataset

Dataset object to resolve asset references.

required

Returns:

Type Description
list[PicselliaCLIPEmbeddingPrediction]

List of PicselliaCLIPEmbeddingPrediction objects with empty text embeddings.
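
As a rough illustration of the post-processing step, the sketch below flattens batched results into a single list of prediction objects. Everything here is a stand-in: `PredictionSketch` mimics the documented dataclass, the `"image_embedding"` key is an assumed name, and resolving assets through a filename-keyed dictionary is an assumption about how the dataset lookup might work, not the library's actual mechanism.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class PredictionSketch:
    """Hypothetical stand-in for PicselliaCLIPEmbeddingPrediction."""

    asset: Any
    image_embedding: list[float]
    text_embedding: list[float]


def post_process_image_batches_sketch(
    image_batches: list[list[str]],
    batch_results: list[list[dict]],
    assets_by_filename: dict[str, Any],  # assumed stand-in for the dataset lookup
) -> list[PredictionSketch]:
    """Pair each image path with its result and flatten into one list."""
    if len(image_batches) != len(batch_results):
        raise ValueError("image_batches and batch_results must align")
    predictions = []
    for paths, results in zip(image_batches, batch_results):
        for path, result in zip(paths, results):
            predictions.append(
                PredictionSketch(
                    asset=assets_by_filename[Path(path).name],
                    image_embedding=result["image_embedding"],
                    text_embedding=[],  # image-only: no text embedding
                )
            )
    return predictions


preds = post_process_image_batches_sketch(
    [["images/a.jpg"], ["images/b.jpg"]],
    [[{"image_embedding": [0.1]}], [{"image_embedding": [0.2]}]],
    {"a.jpg": "asset-a", "b.jpg": "asset-b"},
)
```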

pre_process_dataset(dataset)

Extract all image paths from the dataset's image directory.

Parameters:

Name Type Description Default
dataset
TBaseDataset

The dataset object containing the image directory.

required

Returns:

Type Description
list[str]

A list of file paths to the dataset images.
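
Collecting image paths from a directory might look like the following sketch. It works on a plain directory path rather than a `TBaseDataset`, and the extension filter, the non-recursive scan, and the sorted output are all assumptions; the library method may behave differently on each point.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; the real method may not filter at all.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_image_paths(image_dir: str) -> list[str]:
    """Return sorted paths of image files directly inside image_dir."""
    directory = Path(image_dir)
    return sorted(
        str(p)
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Demonstration on a throwaway directory with two images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.png", "a.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in list_image_paths(tmp)]
```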

prepare_batches(image_paths, batch_size)
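
This method has no description in the reference. Its signature suggests that it splits a flat list of image paths into fixed-size batches, which is the `list[list[str]]` shape that `run_image_inference_on_batches` accepts. The sketch below shows that reading; `prepare_batches_sketch` is a hypothetical name, and keeping a shorter final batch is an assumption.

```python
def prepare_batches_sketch(image_paths: list[str], batch_size: int) -> list[list[str]]:
    """Split image_paths into consecutive chunks of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return [
        image_paths[start : start + batch_size]
        for start in range(0, len(image_paths), batch_size)
    ]


batches = prepare_batches_sketch(
    ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], batch_size=2
)
```

With five paths and a batch size of two, the last batch holds the single leftover path.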

get_picsellia_label(category_name, dataset)

Get or create a PicselliaLabel from a dataset category name.

Parameters:

Name Type Description Default
category_name
str

The name of the label category.

required
dataset
TBaseDataset

Dataset that provides label access.

required

Returns:

Name Type Description
PicselliaLabel PicselliaLabel

Wrapped label object.

get_picsellia_confidence(confidence)

Wrap a confidence score in a PicselliaConfidence object.

Parameters:

Name Type Description Default
confidence
float

Prediction confidence score.

required

Returns:

Name Type Description
PicselliaConfidence PicselliaConfidence

Wrapped confidence object.

get_picsellia_rectangle(x, y, w, h)

Create a PicselliaRectangle from bounding box coordinates.

Parameters:

Name Type Description Default
x
int

Top-left x-coordinate.

required
y
int

Top-left y-coordinate.

required
w
int

Width of the box.

required
h
int

Height of the box.

required

Returns:

Name Type Description
PicselliaRectangle PicselliaRectangle

Rectangle wrapper for object detection.
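
The rectangle takes a top-left corner plus width and height, all as integers. Detection models often produce corner coordinates instead, so a conversion such as the one sketched below may be needed before calling `get_picsellia_rectangle`. The helper `xyxy_to_xywh` is hypothetical and not part of the library; rounding to the nearest integer is an assumption.

```python
def xyxy_to_xywh(
    x_min: float, y_min: float, x_max: float, y_max: float
) -> tuple[int, int, int, int]:
    """Convert corner coordinates to (x, y, w, h) with a top-left origin."""
    if x_max < x_min or y_max < y_min:
        raise ValueError("max coordinates must not be smaller than min coordinates")
    x = int(round(x_min))
    y = int(round(y_min))
    w = int(round(x_max - x_min))
    h = int(round(y_max - y_min))
    return x, y, w, h


box = xyxy_to_xywh(10.0, 20.0, 110.0, 70.0)
```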