orca_sdk.classification_model#

ClassificationModel #

A handle to a classification model in the OrcaCloud

Attributes:

  • id (str) –

    Unique identifier for the model

  • name (str) –

    Unique name of the model

  • memoryset (LabeledMemoryset) –

    Memoryset that the model uses

  • head_type (RACHeadType) –

    Classification head type of the model

  • num_classes (int) –

    Number of distinct classes the model can predict

  • memory_lookup_count (int) –

    Number of memories the model uses for each prediction

  • weigh_memories (bool | None) –

    If using a KNN head, whether the model weighs memories by their lookup score

  • min_memory_weight (float | None) –

    If using a KNN head, minimum lookup score a memory must exceed to be included in the prediction

  • created_at (datetime) –

    When the model was created
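
For illustration, a minimal sketch of inspecting a few of these attributes on an existing handle (the model name and values shown are hypothetical):

>>> model = ClassificationModel.open("my_model")
>>> model.name
'my_model'
>>> model.num_classes
2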

last_prediction property #

last_prediction

Last prediction made by the model

Note

If the last prediction was part of a batch prediction, the last prediction from the batch is returned. If no prediction has been made yet, a LookupError is raised.
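
Examples:

Access the most recent prediction after calling predict (the output shown is illustrative):

>>> prediction = model.predict("I am happy")
>>> model.last_prediction
LabelPrediction({label: <positive: 1>, confidence: 0.95, input_value: 'I am happy'})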

create classmethod #

create(
    name,
    memoryset,
    head_type="KNN",
    *,
    num_classes=None,
    memory_lookup_count=None,
    weigh_memories=True,
    min_memory_weight=None,
    if_exists="error"
)

Create a new classification model

Parameters:

  • name (str) –

    Name for the new model (must be unique)

  • memoryset (LabeledMemoryset) –

    Memoryset to attach the model to

  • head_type (Literal['BMMOE', 'FF', 'KNN', 'MMOE'], default: 'KNN' ) –

    Type of model head to use

  • num_classes (int | None, default: None ) –

    Number of classes this model can predict; inferred from the memoryset if not specified

  • memory_lookup_count (int | None, default: None ) –

    Number of memories to look up for each prediction; by default, a simple heuristic chooses a number that works well in most cases

  • weigh_memories (bool, default: True ) –

    If using a KNN head, whether the model weighs memories by their lookup score

  • min_memory_weight (float | None, default: None ) –

    If using a KNN head, minimum lookup score a memory must exceed to be included in the prediction

  • if_exists (CreateMode, default: 'error' ) –

    What to do if a model with the same name already exists. Defaults to "error"; the other option is "open", which opens the existing model instead.

Returns:

  • ClassificationModel –

    Handle to the new classification model in the OrcaCloud

Raises:

  • ValueError

    If a model with the same name already exists and if_exists is "error", or if if_exists is "open" and the existing model has different attributes.

Examples:

Create a new model using default options:

>>> model = ClassificationModel.create(
...     "my_model",
...     LabeledMemoryset.open("my_memoryset"),
... )

Create a new model with non-default model head and options:

>>> model = ClassificationModel.create(
...     name="my_model",
...     memoryset=LabeledMemoryset.open("my_memoryset"),
...     head_type=RACHeadType.MMOE,
...     num_classes=5,
...     memory_lookup_count=20,
... )

open classmethod #

open(name)

Get a handle to a classification model in the OrcaCloud

Parameters:

  • name (str) –

    Name or unique identifier of the classification model

Returns:

  • ClassificationModel –

    Handle to the classification model in the OrcaCloud

Raises:

  • LookupError

    If the classification model does not exist
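
Examples:

Open an existing model by name (assuming a model named "my_model" exists):

>>> model = ClassificationModel.open("my_model")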

exists classmethod #

exists(name_or_id)

Check if a classification model exists in the OrcaCloud

Parameters:

  • name_or_id (str) –

    Name or id of the classification model

Returns:

  • bool

    True if the classification model exists, False otherwise
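
Examples:

Check whether a model exists before opening it (the model name is hypothetical):

>>> ClassificationModel.exists("my_model")
True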

all classmethod #

all()

Get a list of handles to all classification models in the OrcaCloud

Returns:

  • list[ClassificationModel] –

    List of handles to all classification models in the OrcaCloud
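
Examples:

List the names of all models in the OrcaCloud (the names shown are illustrative):

>>> [model.name for model in ClassificationModel.all()]
['my_model', 'my_other_model']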

drop classmethod #

drop(name_or_id, if_not_exists='error')

Delete a classification model from the OrcaCloud

Warning

This will delete the model and all associated data, including predictions, evaluations, and feedback.

Parameters:

  • name_or_id (str) –

    Name or id of the classification model

  • if_not_exists (DropMode, default: 'error' ) –

    What to do if the classification model does not exist. Defaults to "error"; the other option is "ignore", which does nothing if the model does not exist.

Raises:

  • LookupError

    If the classification model does not exist and if_not_exists is "error"
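
Examples:

Delete a model, ignoring the case where it does not exist (the model name is hypothetical):

>>> ClassificationModel.drop("my_model", if_not_exists="ignore")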

predict #

predict(
    value: list[str],
    expected_labels: list[int] | None = None,
    tags: set[str] = set(),
) -> list[LabelPrediction]
predict(
    value: str,
    expected_labels: int | None = None,
    tags: set[str] = set(),
) -> LabelPrediction
predict(value, expected_labels=None, tags=set())

Predict label(s) for the given input value(s) grounded in similar memories

Parameters:

  • value (list[str] | str) –

    Value(s) to predict labels for

  • expected_labels (list[int] | int | None, default: None ) –

    Expected label(s) for the given input to record for model evaluation

  • tags (set[str], default: set() ) –

    Tags to add to the prediction(s)

Returns:

  • LabelPrediction | list[LabelPrediction] –

    Label prediction for a single input value, or list of label predictions for a list of input values

Examples:

Predict the label for a single value:

>>> prediction = model.predict("I am happy", tags={"test"})
LabelPrediction({label: <positive: 1>, confidence: 0.95, input_value: 'I am happy'})

Predict the labels for a list of values:

>>> predictions = model.predict(["I am happy", "I am sad"], expected_labels=[1, 0])
[
    LabelPrediction({label: <positive: 1>, confidence: 0.95, input_value: 'I am happy'}),
    LabelPrediction({label: <negative: 0>, confidence: 0.05, input_value: 'I am sad'}),
]

predictions #

predictions(limit=100, offset=0, tag=None, sort=[])

Get a list of predictions made by this model

Parameters:

  • limit (int, default: 100 ) –

    Optional maximum number of predictions to return

  • offset (int, default: 0 ) –

    Optional offset of the first prediction to return

  • tag (str | None, default: None ) –

    Optional tag to filter predictions by

  • sort (list[tuple[ListPredictionsRequestSortItemItemType0, ListPredictionsRequestSortItemItemType1]], default: [] ) –

    Optional list of columns and directions to sort the predictions by. Predictions can be sorted by timestamp or confidence.

Returns:

  • list[LabelPrediction] –

    List of predictions made by this model that match the given filters

Examples:

Get the last 3 predictions:

>>> predictions = model.predictions(limit=3, sort=[("timestamp", "desc")])
[
    LabelPrediction({label: <positive: 1>, confidence: 0.95, input_value: 'I am happy'}),
    LabelPrediction({label: <negative: 0>, confidence: 0.05, input_value: 'I am sad'}),
    LabelPrediction({label: <positive: 1>, confidence: 0.90, input_value: 'I am ecstatic'}),
]

Get the second most confident prediction:

>>> predictions = model.predictions(sort=[("confidence", "desc")], offset=1, limit=1)
[LabelPrediction({label: <positive: 1>, confidence: 0.90, input_value: 'I am having a good day'})]

evaluate #

evaluate(
    datasource,
    value_column="value",
    label_column="label",
    record_predictions=False,
    tags=None,
)

Evaluate the classification model on a given datasource

Parameters:

  • datasource (Datasource) –

    Datasource to evaluate the model on

  • value_column (str, default: 'value' ) –

    Name of the column that contains the input values to the model

  • label_column (str, default: 'label' ) –

    Name of the column containing the expected labels

  • record_predictions (bool, default: False ) –

    Whether to record LabelPredictions for analysis

  • tags (set[str] | None, default: None ) –

    Optional tags to add to the recorded LabelPredictions

Returns:

  • dict –

    Evaluation metrics for the model on the datasource, including F1 score, ROC AUC, PR AUC, accuracy, and loss

Examples:

>>> model.evaluate(datasource, value_column="text", label_column="airline_sentiment")
{ "f1_score": 0.85, "roc_auc": 0.85, "pr_auc": 0.85, "accuracy": 0.85, "loss": 0.35 }

use_memoryset #

use_memoryset(memoryset_override)

Temporarily override the memoryset used by the model for predictions

Parameters:

  • memoryset_override (LabeledMemoryset) –

    Memoryset to override the default memoryset with

Examples:

>>> with model.use_memoryset(LabeledMemoryset.open("my_other_memoryset")):
...     predictions = model.predict("I am happy")

record_feedback #

record_feedback(feedback: dict[str, Any]) -> None
record_feedback(feedback: Iterable[dict[str, Any]]) -> None
record_feedback(feedback)

Record feedback for a list of predictions.

We support recording feedback in several categories for each prediction. A FeedbackCategory is created automatically the first time feedback with a new name is recorded. Categories are global across models. The value type of a category is inferred from the first recorded value, and subsequent feedback for the same category must be of the same type.

Parameters:

  • feedback (Iterable[dict[str, Any]] | dict[str, Any]) –

    Feedback to record. Each item should be a dictionary with the following keys:

    • prediction: ID of the prediction that the feedback is for.
    • category: Name of the category under which to record the feedback.
    • value: Feedback value to record. Use True for positive feedback and False for negative feedback, or a float between -1.0 and +1.0 where negative values indicate negative feedback and positive values indicate positive feedback.
    • comment: Optional comment to record with the feedback.

Examples:

Record whether predictions were correct or incorrect:

>>> model.record_feedback({
...     "prediction": p.prediction_id,
...     "category": "correct",
...     "value": p.label == p.expected_label,
... } for p in predictions)

Record star rating as normalized continuous score between -1.0 and +1.0:

>>> model.record_feedback({
...     "prediction": "123e4567-e89b-12d3-a456-426614174000",
...     "category": "rating",
...     "value": -0.5,
...     "comment": "2 stars"
... })

Raises:

  • ValueError

    If the value does not match previous value types for the category, or is a float that is not between -1.0 and +1.0.