# frameworks.clip.services.predictor

Classes:

| Name | Description |
|---|---|
| `PicselliaCLIPEmbeddingPrediction` | Dataclass representing a CLIP prediction for an image, optionally including a text embedding. |
| `CLIPModelPredictor` | Predictor class for CLIP-based inference on image and text data. |

## PicselliaCLIPEmbeddingPrediction(asset, image_embedding, text_embedding)

`dataclass`

Dataclass representing a CLIP prediction for an image, optionally including a text embedding.

Attributes:

| Name | Type | Description |
|---|---|---|
| `asset` | `Asset` | The asset this prediction refers to. |
| `image_embedding` | `list[float]` | CLIP embedding of the image. |
| `text_embedding` | `list[float]` | CLIP embedding of the associated text, if any. |

## CLIPModelPredictor(model, device)

Bases: `ModelPredictor`

Predictor class for CLIP-based inference on image and text data.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `CLIPModel` | The CLIP model instance. | required |
| `device` | `str` | Target device (`"cuda"` or `"cpu"`). | required |
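
A minimal construction sketch. Only the `(model, device)` signature above and the module path in the page title are taken from this reference; how the `CLIPModel` instance itself is built is framework-specific and left as an assumption here.

```python
import torch

# Module path taken from this page; adjust if the package is laid out differently.
from frameworks.clip.services.predictor import CLIPModelPredictor

# Common device-selection pattern; "cuda" or "cpu" as documented above.
device = "cuda" if torch.cuda.is_available() else "cpu"

# `model` must be a CLIPModel instance; its construction (weights, checkpoint,
# preprocessing config) is not covered by this reference and is elided here.
model = ...

predictor = CLIPModelPredictor(model=model, device=device)
```
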

Methods:

| Name | Description |
|---|---|
| `embed_image` | Encode an image into a CLIP embedding. |
| `embed_text` | Encode a text string into a CLIP embedding. |
| `run_image_inference_on_batches` | Perform inference on batches of images. |
| `run_inference_on_batches` | Perform inference on batches of image-text pairs. |
| `post_process_batches` | Convert image-text batch results into Picsellia prediction objects. |
| `post_process_image_batches` | Convert image-only batch results into Picsellia prediction objects. |
| `pre_process_dataset` | Extracts all image paths from the dataset's image directory. |
| `prepare_batches` | |
| `get_picsellia_label` | Get or create a PicselliaLabel from a dataset category name. |
| `get_picsellia_confidence` | Wrap a confidence score in a PicselliaConfidence object. |
| `get_picsellia_rectangle` | Create a PicselliaRectangle from bounding box coordinates. |

Attributes:

| Name | Type | Description |
|---|---|---|
| `model` | | The CLIP model instance used for inference. |
| `device` | | Target device (`"cuda"` or `"cpu"`). |

### model = model  `instance-attribute`

### device = device  `instance-attribute`

### embed_image(image_path)

Encode an image into a CLIP embedding.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_path` | `str` | Path to the input image. | required |

Returns:

| Type | Description |
|---|---|
| `list[float]` | A list of float values representing the image embedding. |
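
A short usage sketch, assuming the `predictor` built earlier; the image path is purely illustrative.

```python
# Hypothetical path used for illustration only.
embedding = predictor.embed_image(image_path="images/cat.jpg")

print(len(embedding))   # dimensionality of the CLIP image embedding
print(embedding[:5])    # a plain list of floats, as documented above
```
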

### embed_text(text)

Encode a text string into a CLIP embedding.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Input text string. | required |

Returns:

| Type | Description |
|---|---|
| `list[float]` | A list of float values representing the text embedding. |
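
Because both methods return plain lists of floats, comparing an image and a text prompt reduces to a cosine similarity over the two lists. The sketch below assumes the `predictor` from earlier and that image and text embeddings share the same dimensionality, which is the usual CLIP setup.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Plain-Python cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

image_emb = predictor.embed_image(image_path="images/cat.jpg")   # illustrative path
text_emb = predictor.embed_text(text="a photo of a cat")         # illustrative prompt

print(cosine_similarity(image_emb, text_emb))
```
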

### run_image_inference_on_batches(image_batches)

Perform inference on batches of images.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_batches` | `list[list[str]]` | List of batches, where each batch is a list of image paths. | required |

Returns:

| Type | Description |
|---|---|
| `list[list[dict]]` | Nested list of dictionaries containing image embeddings. |
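
A sketch of image-only batch inference, assuming the `predictor` from earlier and assuming `prepare_batches` (documented further below) returns the `list[list[str]]` structure this method expects. The exact keys inside each result dictionary are not specified in this reference, so they are not relied on here.

```python
# Illustrative image paths.
image_paths = ["images/img_001.jpg", "images/img_002.jpg", "images/img_003.jpg"]

# List of batches, where each batch is a list of image paths.
image_batches = predictor.prepare_batches(image_paths, batch_size=2)

batch_results = predictor.run_image_inference_on_batches(image_batches)

# Results mirror the batch structure: one list of dicts per input batch.
for batch in batch_results:
    for result in batch:
        print(result.keys())  # embedding dict; exact keys depend on the implementation
```
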

### run_inference_on_batches(image_text_batches)

Perform inference on batches of image-text pairs.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_text_batches` | `list[list[tuple[str, str]]]` | List of batches containing (image_path, text) tuples. | required |

Returns:

| Type | Description |
|---|---|
| `list[list[dict]]` | Nested list of dictionaries with image and text embeddings. |
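
The image-text variant takes the same nested structure, except that each element is an `(image_path, text)` tuple. Paths and prompts below are illustrative, and `predictor` is assumed from earlier.

```python
# List of batches, each containing (image_path, text) tuples.
image_text_batches = [
    [("images/img_001.jpg", "a photo of a cat"),
     ("images/img_002.jpg", "a photo of a dog")],
    [("images/img_003.jpg", "an aerial view of a city")],
]

batch_results = predictor.run_inference_on_batches(image_text_batches)

# One list of result dicts per batch, each holding image and text embeddings.
for batch in batch_results:
    print(len(batch))
```
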

### post_process_batches(image_text_batches, batch_results, dataset)

Convert image-text batch results into Picsellia prediction objects.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_text_batches` | `list[list[tuple[str, str]]]` | Input image-text batches. | required |
| `batch_results` | `list[list[dict]]` | Corresponding results from inference. | required |
| `dataset` | `TBaseDataset` | Dataset object to resolve asset references. | required |

Returns:

| Type | Description |
|---|---|
| `list[PicselliaCLIPEmbeddingPrediction]` | List of PicselliaCLIPEmbeddingPrediction. |
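
Putting the pieces together for image-text pairs: run inference, then convert the raw dicts into `PicselliaCLIPEmbeddingPrediction` objects. This assumes the `image_text_batches` and `predictor` from the previous sketches and a `dataset` (a `TBaseDataset`) that already contains the assets the paths refer to.

```python
batch_results = predictor.run_inference_on_batches(image_text_batches)

predictions = predictor.post_process_batches(
    image_text_batches=image_text_batches,
    batch_results=batch_results,
    dataset=dataset,  # assumed TBaseDataset used to resolve asset references
)

for prediction in predictions:
    # Each prediction carries the asset plus its image and text embeddings.
    print(prediction.asset, len(prediction.image_embedding), len(prediction.text_embedding))
```
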

### post_process_image_batches(image_batches, batch_results, dataset)

Convert image-only batch results into Picsellia prediction objects.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_batches` | `list[list[str]]` | List of image batches. | required |
| `batch_results` | `list[list[dict]]` | Corresponding image-only inference results. | required |
| `dataset` | `TBaseDataset` | Dataset object to resolve asset references. | required |

Returns:

| Type | Description |
|---|---|
| `list[PicselliaCLIPEmbeddingPrediction]` | List of PicselliaCLIPEmbeddingPrediction with empty text embeddings. |
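
The image-only counterpart follows the same pattern; per the return description above, the resulting predictions carry empty text embeddings. As before, `predictor` and `dataset` are assumed to exist, and the paths are illustrative.

```python
image_paths = ["images/img_001.jpg", "images/img_002.jpg"]  # illustrative

image_batches = predictor.prepare_batches(image_paths, batch_size=8)
batch_results = predictor.run_image_inference_on_batches(image_batches)

predictions = predictor.post_process_image_batches(
    image_batches=image_batches,
    batch_results=batch_results,
    dataset=dataset,  # assumed TBaseDataset
)

for prediction in predictions:
    print(prediction.text_embedding)  # expected to be empty in the image-only flow
```
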

### pre_process_dataset(dataset)

Extracts all image paths from the dataset's image directory.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `TBaseDataset` | The dataset object containing the image directory. | required |

Returns:

| Type | Description |
|---|---|
| `list[str]` | A list of file paths to the dataset images. |
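
A typical entry point for the batch workflow: collect the image paths directly from the dataset, then batch them. `dataset` is assumed to be an already-prepared `TBaseDataset` with its image directory populated.

```python
image_paths = predictor.pre_process_dataset(dataset)
print(f"{len(image_paths)} images found")

# Feed the paths into the batching helper documented below.
image_batches = predictor.prepare_batches(image_paths, batch_size=8)
```
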

### prepare_batches(image_paths, batch_size)
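
`prepare_batches` carries no description in this reference. Based on its signature and on how its output feeds the batch-inference methods above, it presumably chunks the path list into batches of at most `batch_size` items. Below is a plain-Python sketch of that presumed behaviour, not the library's actual implementation.

```python
def prepare_batches_sketch(image_paths: list[str], batch_size: int) -> list[list[str]]:
    """Illustrative chunking equivalent -- not the library's own code."""
    return [
        image_paths[i : i + batch_size]
        for i in range(0, len(image_paths), batch_size)
    ]

# prepare_batches_sketch(["a.jpg", "b.jpg", "c.jpg"], batch_size=2)
# -> [["a.jpg", "b.jpg"], ["c.jpg"]]
```
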

### get_picsellia_label(category_name, dataset)

Get or create a PicselliaLabel from a dataset category name.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `category_name` | `str` | The name of the label category. | required |
| `dataset` | `TBaseDataset` | Dataset that provides label access. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `PicselliaLabel` | `PicselliaLabel` | Wrapped label object. |

### get_picsellia_confidence(confidence)

Wrap a confidence score in a PicselliaConfidence object.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `confidence` | `float` | Prediction confidence score. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `PicselliaConfidence` | `PicselliaConfidence` | Wrapped confidence object. |

### get_picsellia_rectangle(x, y, w, h)

Create a PicselliaRectangle from bounding box coordinates.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `int` | Top-left x-coordinate. | required |
| `y` | `int` | Top-left y-coordinate. | required |
| `w` | `int` | Width of the box. | required |
| `h` | `int` | Height of the box. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `PicselliaRectangle` | `PicselliaRectangle` | Rectangle wrapper for object detection. |
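
The three `get_picsellia_*` helpers above simply wrap raw values into Picsellia types. A short combined sketch, assuming the `predictor` and `dataset` from earlier; the category name, score, and box values are illustrative.

```python
label = predictor.get_picsellia_label(category_name="cat", dataset=dataset)
confidence = predictor.get_picsellia_confidence(confidence=0.87)
rectangle = predictor.get_picsellia_rectangle(x=10, y=20, w=120, h=80)

# Wrapped objects as described above: label, confidence score, and a
# rectangle wrapper for object detection.
print(label, confidence, rectangle)
```
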