anchor_python_visualization.embeddings.label

Routines for loading embeddings from CSV and adding identifiers and labels.

Module Contents

Classes

LabelledFeatures

Maintains separate data-frames for embeddings and labels, but linked in order and count.

class anchor_python_visualization.embeddings.label.LabelledFeatures

Maintains separate data-frames for embeddings and labels, but linked in order and count.

Both data-frames must have the same number of rows, ordered identically.

features: pandas.DataFrame

Data-frame containing only feature-values (numeric) with each row assigned an identifier.

labels: pandas.Series

Series with a label for each item in features.

The series must have the same size and order as features.

image_paths: pandas.Series | None

Optional series with a path to an image for each item.

The series must have the same size and order as features.
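
The size-and-order requirement on the three members can be illustrated with plain pandas. This is only a sketch and does not use the library's class: the helper check_aligned and the sample data are hypothetical, and comparing indexes is just one possible reading of "same size and order".

```python
import pandas as pd


def check_aligned(features, labels, image_paths=None):
    """Hypothetical helper, not part of the library.

    Returns True if every series has the same length as `features`
    and the same index in the same order.
    """
    members = [labels]
    if image_paths is not None:
        members.append(image_paths)
    return all(
        len(member) == len(features) and member.index.equals(features.index)
        for member in members
    )


# Illustrative data: three items, each row carrying an identifier as its index.
features = pd.DataFrame(
    {"f0": [0.1, 0.4, 0.9], "f1": [1.2, 0.7, 0.3]},
    index=["a", "b", "c"],
)
labels = pd.Series(["cat", "dog", "cat"], index=features.index)
image_paths = pd.Series(["a.png", "b.png", "c.png"], index=features.index)

aligned = check_aligned(features, labels, image_paths)

# Reversing the labels keeps the size but breaks the order.
misaligned = check_aligned(features, labels.iloc[::-1])
```

Here aligned is True and misaligned is False, because Index.equals compares the index elements position by position.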

number_items() → int

Returns the number of items (i.e. rows) in the data-frames/series.

sample_without_replacement(sample_size: int) → LabelledFeatures

Samples without replacement (taking identical rows from each member data-frame/series).

Parameters:

sample_size – number of items to sample

Returns:

a newly created LabelledFeatures containing the sample.

Raises:

InsufficientRowsException – if there are fewer rows available than sample_size.
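
The idea of taking identical rows from each member can be sketched as follows. This is not the library's implementation: the function sample_aligned, the exception InsufficientRowsError (a stand-in for InsufficientRowsException), the seed parameter, and the sample data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


class InsufficientRowsError(Exception):
    """Hypothetical stand-in for the library's InsufficientRowsException."""


def sample_aligned(features, labels, sample_size, seed=0):
    """Hypothetical sketch: sample the same rows from both members."""
    if sample_size > len(features):
        raise InsufficientRowsError(
            f"Requested {sample_size} rows but only {len(features)} are available."
        )
    rng = np.random.default_rng(seed)
    # Draw distinct row positions once, then apply them to every member,
    # so the sampled frame and series stay linked in order and count.
    positions = rng.choice(len(features), size=sample_size, replace=False)
    return features.iloc[positions], labels.iloc[positions]


# Illustrative data: four items identified by their index.
features = pd.DataFrame(
    {"f0": [0.1, 0.4, 0.9, 0.5], "f1": [1.2, 0.7, 0.3, 0.8]},
    index=["a", "b", "c", "d"],
)
labels = pd.Series(["cat", "dog", "cat", "bird"], index=features.index)

sampled_features, sampled_labels = sample_aligned(features, labels, sample_size=2)
```

Drawing the positions once and reusing them for each member is what keeps the sample consistent. Sampling each member independently would break the pairing between features and labels.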