anchor_python_visualization.embeddings

Methods for loading embeddings and determining labels.

Package Contents

Classes

LabelledFeatures

Maintains separate data-frames for embeddings and labels, but linked in order and count.

Functions

load_features(...)

Loads the embeddings from a CSV file and determines identifiers and labels.

Attributes

COLUMN_NAME_IDENTIFIER

Name for index column.

PLACEHOLDER_FOR_SUBSTITUTION

Optional placeholder used in image_dir argument.

exception anchor_python_visualization.embeddings.InsufficientRowsException

Bases: Exception

Raised when a data-frame has too few rows to perform an operation.

anchor_python_visualization.embeddings.COLUMN_NAME_IDENTIFIER: str = 'identifier'

Name for index column.

anchor_python_visualization.embeddings.PLACEHOLDER_FOR_SUBSTITUTION: str = '<IMAGE>'

Optional placeholder used in image_dir argument.
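This page does not say how the placeholder is consumed. As a minimal sketch, assuming the placeholder is replaced by each item's identifier to form an image path, the substitution could look like the following. The helper name resolve_image_path and the fallback behaviour are hypothetical and not part of the package.

```python
from pathlib import Path

# Mirrors the documented constant value.
PLACEHOLDER_FOR_SUBSTITUTION = "<IMAGE>"


def resolve_image_path(image_dir: str, identifier: str) -> Path:
    """Hypothetical helper: derive an image path from image_dir and an identifier.

    Assumption: when the placeholder is present it is replaced by the
    identifier; otherwise the identifier is treated as relative to image_dir.
    """
    if PLACEHOLDER_FOR_SUBSTITUTION in image_dir:
        return Path(image_dir.replace(PLACEHOLDER_FOR_SUBSTITUTION, identifier))
    return Path(image_dir) / identifier


# With the placeholder, the identifier is substituted into the pattern.
with_placeholder = resolve_image_path("images/<IMAGE>/thumbnail.png", "cat_001")

# Without it, the identifier is appended to the directory.
without_placeholder = resolve_image_path("images", "cat_001.png")
```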

anchor_python_visualization.embeddings.load_features(args: argparse.Namespace) -> anchor_python_visualization.embeddings.label.LabelledFeatures

Loads the embeddings from a CSV file and determines identifiers and labels.

The command-line arguments determine how identifiers and labels are derived.

Parameters:

args – the command-line arguments.

Returns:

a newly created instance of the features, after they have been loaded.
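The attributes expected on args are not listed on this page, so the following is only a sketch of the general idea: read a CSV whose first column holds identifiers, then derive a label per row. The function load_features_sketch and the attributes csv_source and label_from_directory are hypothetical stand-ins, not the package's real options, and the sketch returns a plain tuple rather than a LabelledFeatures.

```python
import argparse
import io

import pandas as pd

COLUMN_NAME_IDENTIFIER = "identifier"  # mirrors the documented constant

# Hypothetical CSV: first column holds identifiers, the rest are numeric features.
CSV_TEXT = """name,f0,f1
cats/tabby_01.png,0.1,0.9
cats/siamese_02.png,0.2,0.8
dogs/collie_03.png,0.7,0.3
"""


def load_features_sketch(args: argparse.Namespace):
    """Stand-in for load_features; NOT the library's implementation."""
    # Use the first CSV column as the index and give it the documented name.
    features = pd.read_csv(args.csv_source, index_col=0)
    features.index.name = COLUMN_NAME_IDENTIFIER

    # Assumption: labels come from the first path component of each identifier
    # when the (hypothetical) label_from_directory flag is set.
    if args.label_from_directory:
        labels = features.index.to_series().str.split("/").str[0]
    else:
        labels = pd.Series("unlabelled", index=features.index)
    return features, labels


args = argparse.Namespace(
    csv_source=io.StringIO(CSV_TEXT),  # a file path would also work here
    label_from_directory=True,
)
features, labels = load_features_sketch(args)
```

Because the labels are built from the features index, both objects share the same identifiers in the same order.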

class anchor_python_visualization.embeddings.LabelledFeatures

Maintains separate data-frames for embeddings and labels, but linked in order and count.

Both data-frames must have the same number of rows, ordered identically.

features: pandas.DataFrame

Data-frame containing only feature-values (numeric) with each row assigned an identifier.

labels: pandas.Series

Series with labels for each item in features.

The series must have the same size and order as features.

image_paths: pandas.Series | None

Optional series with a path to an image for each item.

The series must have the same size and order as features.

number_items() -> int

Returns the number of items (i.e. rows) in the data-frames/series.
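To illustrate the "same size and order" requirement, here is a minimal stand-in that mirrors the documented fields. The class name LabelledFeaturesSketch and the index-based check in __post_init__ are assumptions for illustration; this page does not state whether or how the real class validates its inputs.

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass
class LabelledFeaturesSketch:
    """Stand-in mirroring the documented fields; not the library class."""

    features: pd.DataFrame
    labels: pd.Series
    image_paths: Optional[pd.Series] = None

    def __post_init__(self) -> None:
        # Assumption: "same size and order" is checked by comparing indexes,
        # which requires identical identifiers in an identical order.
        if not self.features.index.equals(self.labels.index):
            raise ValueError("labels must match features in size and order")
        if self.image_paths is not None and not self.features.index.equals(
            self.image_paths.index
        ):
            raise ValueError("image_paths must match features in size and order")

    def number_items(self) -> int:
        """Number of items, i.e. rows shared by every member."""
        return len(self.features.index)


identifiers = pd.Index(["a", "b", "c"], name="identifier")
item = LabelledFeaturesSketch(
    features=pd.DataFrame({"f0": [0.1, 0.2, 0.7]}, index=identifiers),
    labels=pd.Series(["cat", "cat", "dog"], index=identifiers),
)
```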

sample_without_replacement(sample_size: int) -> LabelledFeatures

Samples without replacement (taking identical rows from each member data-frame/series).

Parameters:

sample_size – number of items to sample

Returns:

a newly created LabelledFeatures containing the sample.

Raises:

InsufficientRowsException – if there are fewer rows available than sample_size.
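One way to take identical rows from each member is to draw row positions once and apply them to every data-frame or series. The sketch below shows that idea under stated assumptions: sample_rows and its seed parameter are hypothetical, the exception is a local stand-in, and the package's actual implementation may differ.

```python
import numpy as np
import pandas as pd


class InsufficientRowsException(Exception):
    """Local stand-in for the documented exception."""


def sample_rows(features: pd.DataFrame, labels: pd.Series, sample_size: int, seed=None):
    """Hypothetical helper: sample the same rows from features and labels."""
    number_rows = len(features.index)
    if number_rows < sample_size:
        raise InsufficientRowsException(
            f"Requested {sample_size} rows, but only {number_rows} are available."
        )
    # Draw distinct row positions once, then reuse them for every member
    # so that the sampled features and labels stay aligned.
    rng = np.random.default_rng(seed)
    positions = rng.choice(number_rows, size=sample_size, replace=False)
    return features.iloc[positions], labels.iloc[positions]


identifiers = pd.Index(["a", "b", "c", "d"], name="identifier")
features = pd.DataFrame({"f0": [0.1, 0.2, 0.7, 0.9]}, index=identifiers)
labels = pd.Series(["cat", "cat", "dog", "dog"], index=identifiers)

sampled_features, sampled_labels = sample_rows(features, labels, sample_size=2, seed=0)
```

Requesting more rows than exist, for example sample_size=5 here, raises the exception rather than returning a smaller sample.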