orca_sdk.regression_model#

RegressionModel #

A handle to a regression model in OrcaCloud

Attributes:

  • id (str) –

    Unique identifier for the model

  • name (str) –

    Unique name of the model

  • description (str | None) –

    Optional description of the model

  • memoryset (ScoredMemoryset) –

    Memoryset that the model uses

  • head_type (RARHeadType) –

    Regression head type of the model

  • memory_lookup_count (int) –

    Number of memories the model uses for each prediction

  • locked (bool) –

    Whether the model is locked to prevent accidental deletion

  • created_at (datetime) –

    When the model was created

  • updated_at (datetime) –

    When the model was last updated

last_prediction property #

last_prediction

Last prediction made by the model

Note

If the last prediction was part of a batch prediction, the last prediction from the batch is returned. If no prediction has been made yet, a LookupError is raised.
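
Examples:

Make a prediction and then inspect it via last_prediction (the output values shown are illustrative):

>>> prediction = model.predict("Great service")
>>> model.last_prediction
RegressionPrediction({score: 4.5, confidence: 0.95, anomaly_score: 0.1, input_value: 'Great service'})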

create classmethod #

create(
    name,
    memoryset,
    memory_lookup_count=None,
    description=None,
    if_exists="error",
)

Create a regression model.

Parameters:

  • name (str) –

    Name of the model

  • memoryset (ScoredMemoryset) –

    The scored memoryset to use for prediction

  • memory_lookup_count (int | None, default: None ) –

    Number of memories to retrieve for prediction. Defaults to 10.

  • description (str | None, default: None ) –

    Description of the model

  • if_exists (CreateMode, default: 'error' ) –

    How to handle existing models with the same name

Returns:

  • RegressionModel

    Handle to the newly created regression model in the OrcaCloud

Raises:

  • ValueError

    If a model with the same name already exists and if_exists is "error"

  • ValueError

    If the memoryset is empty

  • ValueError

    If memory_lookup_count exceeds the number of memories in the memoryset
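
Examples:

Create a model from an existing scored memoryset (the memoryset and model names are illustrative):

>>> memoryset = ScoredMemoryset.open("product_reviews")
>>> model = RegressionModel.create(
...     "review_rating_model",
...     memoryset,
...     description="Predicts review ratings from review text",
... )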

open classmethod #

open(name)

Get a handle to a regression model in the OrcaCloud

Parameters:

  • name (str) –

    Name or unique identifier of the regression model

Returns:

  • RegressionModel

    Handle to the existing regression model in the OrcaCloud

Raises:

  • LookupError

    If the regression model does not exist
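
Examples:

Open an existing model by name (the name is illustrative):

>>> model = RegressionModel.open("review_rating_model")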

exists classmethod #

exists(name_or_id)

Check if a regression model exists in the OrcaCloud

Parameters:

  • name_or_id (str) –

    Name or id of the regression model

Returns:

  • bool

    True if the regression model exists, False otherwise
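
Examples:

Check whether a model exists before opening it (the name is illustrative):

>>> if RegressionModel.exists("review_rating_model"):
...     model = RegressionModel.open("review_rating_model")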

all classmethod #

all()

Get a list of handles to all regression models in the OrcaCloud

Returns:

  • list[RegressionModel]

    List of handles to all regression models in the OrcaCloud
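
Examples:

List the names of all regression models in the OrcaCloud:

>>> for model in RegressionModel.all():
...     print(model.name)
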
drop classmethod #

drop(name_or_id, if_not_exists='error')

Delete a regression model from the OrcaCloud

Warning

This will delete the model and all associated data, including predictions, evaluations, and feedback.

Parameters:

  • name_or_id (str) –

    Name or id of the regression model

  • if_not_exists (DropMode, default: 'error' ) –

    What to do if the regression model does not exist. Defaults to "error", which raises a LookupError; pass "ignore" to do nothing instead.

Raises:

  • LookupError

    If the regression model does not exist and if_not_exists is "error"
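
Examples:

Delete a model, ignoring the case where it has already been removed (the name is illustrative):

>>> RegressionModel.drop("review_rating_model", if_not_exists="ignore")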

refresh #

refresh()

Refresh the model data from the OrcaCloud
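
Examples:

>>> model.refresh()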

set #

set(*, description=UNSET, locked=UNSET)

Update editable attributes of the model.

Note

If a field is not provided, it will default to UNSET and not be updated.

Parameters:

  • description (str | None, default: UNSET ) –

    Value to set for the description

  • locked (bool, default: UNSET ) –

    Value to set for the locked status

Examples:

Update the description:

>>> model.set(description="New description")

Remove description:

>>> model.set(description=None)

Lock the model:

>>> model.set(locked=True)

lock #

lock()

Lock the model to prevent accidental deletion

unlock #

unlock()

Unlock the model to allow deletion
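
Examples:

Lock the model, check its status, and unlock it again:

>>> model.lock()
>>> model.locked
True
>>> model.unlock()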

predict #

predict(
    value: str,
    expected_scores: float | None = None,
    tags: set[str] | None = None,
    save_telemetry: Literal[
        "off", "on", "sync", "async"
    ] = "on",
) -> RegressionPrediction
predict(
    value: list[str],
    expected_scores: list[float] | None = None,
    tags: set[str] | None = None,
    save_telemetry: Literal[
        "off", "on", "sync", "async"
    ] = "on",
) -> list[RegressionPrediction]
predict(
    value,
    expected_scores=None,
    tags=None,
    save_telemetry="on",
)

Make predictions using the regression model.

Parameters:

  • value (str | list[str]) –

    Input text(s) to predict scores for

  • expected_scores (float | list[float] | None, default: None ) –

    Expected score(s) for telemetry tracking

  • tags (set[str] | None, default: None ) –

    Tags to associate with the prediction(s)

  • save_telemetry (Literal['off', 'on', 'sync', 'async'], default: 'on' ) –

    Whether to save telemetry for the prediction(s). Defaults to "on", which saves telemetry asynchronously unless the ORCA_SAVE_TELEMETRY_SYNCHRONOUSLY environment variable is set to "1". Pass "sync" or "async" to set the save mode explicitly, or "off" to disable telemetry.

Returns:

  • RegressionPrediction | list[RegressionPrediction]

    Prediction or list of predictions with score, confidence, and anomaly score for the input value(s)

Raises:

  • ValueError

    If expected_scores length doesn’t match value length for batch predictions
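
Examples:

Make a single prediction (the input text and output value are illustrative):

>>> prediction = model.predict("Great service, highly recommend!")
>>> prediction.score
4.5

Make a batch prediction with expected scores for telemetry:

>>> predictions = model.predict(
...     ["Great service", "Poor experience"],
...     expected_scores=[5.0, 1.0],
... )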

predictions #

predictions(limit=100, offset=0, tag=None, sort=[])

Get a list of predictions made by this model

Parameters:

  • limit (int, default: 100 ) –

    Optional maximum number of predictions to return

  • offset (int, default: 0 ) –

    Optional offset of the first prediction to return

  • tag (str | None, default: None ) –

    Optional tag to filter predictions by

  • sort (list[tuple[PredictionSortItemItemType0, PredictionSortItemItemType1]], default: [] ) –

    Optional list of columns and directions to sort the predictions by. Predictions can be sorted by created_at, confidence, anomaly_score, or score.

Returns:

  • list[RegressionPrediction]

    List of predictions made by this model

Examples:

Get the last 3 predictions:

>>> predictions = model.predictions(limit=3, sort=[("created_at", "desc")])
[
    RegressionPrediction({score: 4.5, confidence: 0.95, anomaly_score: 0.1, input_value: 'Great service'}),
    RegressionPrediction({score: 2.0, confidence: 0.90, anomaly_score: 0.1, input_value: 'Poor experience'}),
    RegressionPrediction({score: 3.5, confidence: 0.85, anomaly_score: 0.1, input_value: 'Average'}),
]

Get second most confident prediction:

>>> predictions = model.predictions(sort=[("confidence", "desc")], offset=1, limit=1)
[RegressionPrediction({score: 4.2, confidence: 0.90, anomaly_score: 0.1, input_value: 'Good service'})]

evaluate #

evaluate(
    data: Datasource | Dataset,
    *,
    value_column: str = "value",
    score_column: str = "score",
    record_predictions: bool = False,
    tags: set[str] = {"evaluation"},
    batch_size: int = 100,
    background: Literal[True]
) -> Job[RegressionMetrics]
evaluate(
    data: Datasource | Dataset,
    *,
    value_column: str = "value",
    score_column: str = "score",
    record_predictions: bool = False,
    tags: set[str] = {"evaluation"},
    batch_size: int = 100,
    background: Literal[False] = False
) -> RegressionMetrics
evaluate(
    data,
    *,
    value_column="value",
    score_column="score",
    record_predictions=False,
    tags={"evaluation"},
    batch_size=100,
    background=False
)

Evaluate the regression model on a given dataset or datasource

Parameters:

  • data (Datasource | Dataset) –

    Dataset or Datasource to evaluate the model on

  • value_column (str, default: 'value' ) –

    Name of the column that contains the input values to the model

  • score_column (str, default: 'score' ) –

    Name of the column containing the expected scores

  • record_predictions (bool, default: False ) –

    Whether to record RegressionPredictions for analysis

  • tags (set[str], default: {'evaluation'} ) –

    Optional tags to add to the recorded RegressionPredictions

  • batch_size (int, default: 100 ) –

    Batch size for processing Dataset inputs (only used when input is a Dataset)

  • background (bool, default: False ) –

    Whether to run the operation in the background and return a job handle

Returns:

  • RegressionMetrics | Job[RegressionMetrics]

    Metrics including MAE, MSE, RMSE, R2, and anomaly score statistics, or a handle to the evaluation Job if background is True

Examples:

>>> model.evaluate(datasource, value_column="text", score_column="rating")
RegressionMetrics({
    mae: 0.2500,
    rmse: 0.3536,
    r2: 0.8500,
    anomaly_score: 0.3500 ± 0.0500,
})
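
Run the evaluation in the background and receive a job handle instead of waiting for the metrics:

>>> job = model.evaluate(
...     datasource, value_column="text", score_column="rating", background=True
... )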

use_memoryset #

use_memoryset(memoryset_override)

Temporarily override the memoryset used by the model for predictions

Parameters:

  • memoryset_override (ScoredMemoryset) –

    Memoryset to override the default memoryset with

Examples:

>>> with model.use_memoryset(ScoredMemoryset.open("my_other_memoryset")):
...     predictions = model.predict("Rate your experience")

record_feedback #

record_feedback(feedback: dict[str, Any]) -> None
record_feedback(feedback: Iterable[dict[str, Any]]) -> None
record_feedback(feedback)

Record feedback for a list of predictions.

We support recording feedback in several categories for each prediction. A FeedbackCategory is created automatically the first time feedback with a new name is recorded. Categories are global across models. The value type of a category is inferred from the first recorded value; subsequent feedback for the same category must be of the same type.

Parameters:

  • feedback (Iterable[dict[str, Any]] | dict[str, Any]) –

    Feedback to record. This should be a dictionary (or an iterable of dictionaries) with the following keys:

    • prediction: ID of the prediction the feedback refers to.
    • category: Name of the category under which to record the feedback.
    • value: Feedback value to record. Use True for positive feedback and False for negative feedback, or a float between -1.0 and +1.0, where negative values indicate negative feedback and positive values indicate positive feedback.
    • comment: Optional comment to record with the feedback.

Examples:

Record whether predictions were accurate:

>>> model.record_feedback({
...     "prediction": p.prediction_id,
...     "category": "accurate",
...     "value": abs(p.score - p.expected_score) < 0.5,
... } for p in predictions)

Record star rating as normalized continuous score between -1.0 and +1.0:

>>> model.record_feedback({
...     "prediction": "123e4567-e89b-12d3-a456-426614174000",
...     "category": "rating",
...     "value": -0.5,
...     "comment": "2 stars"
... })

Raises:

  • ValueError

    If the value does not match previous value types for the category, or is a float that is not between -1.0 and +1.0.