OrcaSDK Release Notes

This document tracks notable changes to the OrcaSDK.

v0.1.13

  • Added methods to retrieve the currently used org ID, scopes, API key name, and config.
  • Added progress bars for batch operations in predict, insert, update, and delete methods.
  • Added a "replace" option to the if_exists parameter in model creation methods to allow replacing existing models.
  • Enhanced explanation output to include input and prediction information.
  • Refactored memory suggestion API to return a list of LabeledMemorySuggestion objects that can be directly inserted into a memoryset.
  • Replaced head_type with balance_classes parameter in classification model create method.
  • Fixed prediction explanation timeout issue.
  • Aligned and fixed __repr__ methods across the SDK.
  • Fixed predictions not storing the correct memoryset when memoryset override is used.
  • Fixed API health check in OrcaSDK to properly validate responses and fail faster.
  • Removed default file type from datasource download.
  • Added ability for root users to upload locally finetuned embedding models.
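The new if_exists="replace" option can be pictured with this minimal sketch, using a hypothetical in-memory registry in place of the server-side model store (the function name and signature are illustrative, not the actual SDK API):

```python
# Conceptual sketch of if_exists semantics for model creation.
# `registry` stands in for the server-side model store; everything
# here is illustrative, not the orca_sdk implementation.

registry: dict[str, dict] = {}

def create_model(name: str, config: dict, if_exists: str = "error") -> dict:
    if name in registry:
        if if_exists == "error":
            raise ValueError(f"model {name!r} already exists")
        if if_exists == "open":
            return registry[name]  # reuse the existing model as-is
        if if_exists == "replace":
            del registry[name]     # drop the old model, recreate below
    model = {"name": name, "config": config}
    registry[name] = model
    return model
```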

v0.1.12

  • Added a partitioned property on memorysets to check whether a memoryset uses partitioning.
  • Added a partitioned parameter to memoryset create to create empty partitioned memorysets.
  • Added ability to change whether a memoryset is partitioned during clone via the partitioned parameter.
  • Added a static compute method to classification and regression metrics to calculate metrics from a list of predictions.
  • Added shared logger to be used throughout the SDK that can be customized.
  • Added a consistency parameter to get, query, search, and predict methods.
  • Added support for model evaluate with pandas DataFrames and generic iterables of dictionaries.
  • Fixed a bug where expected labels/scores were not being saved on predictions when telemetry was disabled.
  • Removed scikit-learn and numpy dependencies (metrics calculation now happens on the API server).
  • Removed datasets dependency (was only needed for types and torch data parsing which was refactored to not need it anymore).
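The idea behind the static compute method, deriving metrics purely from a list of predictions, can be sketched roughly as follows; the field names (expected_label, label) are assumptions for illustration, not the SDK's actual prediction schema:

```python
from collections import Counter

def compute_classification_metrics(predictions: list[dict]) -> dict:
    """Compute simple aggregate metrics from expected/predicted label pairs.

    Each prediction dict is assumed to carry `expected_label` and `label`;
    this mirrors the idea of a static compute(...) over a prediction list,
    not the actual orca_sdk implementation.
    """
    total = len(predictions)
    correct = sum(1 for p in predictions if p["label"] == p["expected_label"])
    per_class = Counter(p["expected_label"] for p in predictions)
    return {
        "accuracy": correct / total if total else 0.0,
        "support": dict(per_class),
    }
```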

v0.1.11

  • Added a cascade parameter to drop method on memorysets and finetuned embedding models to allow deleting related resources in one call (avoids foreign-key errors).
  • Added classification_models and regression_models properties on memorysets to list the models associated with a memoryset.
  • Added support for updating all memories that match a filter via the memoryset update method.
  • Changed the memoryset update method to return the number of updated memories (instead of the updated objects) to reduce network usage.
  • Added support for deleting all memories that match a filter via the memoryset delete method.
  • Added a truncate method on memorysets to delete all memories, or only those in a specific partition (partition_id defaults to UNSET; passing None truncates the global partition).
  • Removed partition parameters from the memoryset delete and query methods; use filter to target one or more partitions, or use truncate to clear a partition.
  • Fixed a bug where updating non-metadata fields on a memory could clear its metadata.
  • Removed torch, pandas, and pyarrow dependencies (they were only needed for typing).
  • Made gradio optional; install the notebook UI extras via orca_sdk[ui].
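The filter-based update semantics above, including the change to return a count of updated memories rather than the objects themselves, can be sketched against a plain list of dicts (the real SDK applies the filter server-side; update_matching and its arguments are hypothetical):

```python
def update_matching(memories: list[dict], filter: dict, patch: dict) -> int:
    """Update every memory whose fields match `filter`; return the count.

    Illustrates why returning a number instead of the updated objects
    saves network usage for bulk updates. Purely a client-side sketch,
    not the orca_sdk implementation.
    """
    updated = 0
    for memory in memories:
        if all(memory.get(key) == value for key, value in filter.items()):
            memory.update(patch)
            updated += 1
    return updated
```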

v0.1.10

  • Removed deprecation warning from OrcaCredentials.set_api_key.

v0.1.9

  • Added support for Python 3.14 including updating datasets to 4.4.2, pyarrow to 22.0.0, gradio to 6.3.0, and fixing several incompatibility issues.
  • Changed predictions to return all predictions by default when limit is None.
  • Changed predict and apredict to automatically batch requests to reduce network overhead.
  • Fixed evaluate to also include the confusion matrix when running with a local dataset.
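The automatic request batching in predict and apredict can be pictured with a generic chunking helper; predict_all and predict_batch are illustrative stand-ins, not SDK functions:

```python
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")

def batched(items: list[T], batch_size: int) -> Iterator[list[T]]:
    """Yield fixed-size chunks; the last chunk may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def predict_all(
    inputs: list[str],
    predict_batch: Callable[[list[str]], list],
    batch_size: int = 100,
) -> list:
    """Send inputs in batches to cut per-request network overhead.

    A sketch of the batching idea only; the SDK chooses its own batch
    size and transport.
    """
    results: list = []
    for batch in batched(inputs, batch_size):
        results.extend(predict_batch(batch))
    return results
```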

v0.1.7

  • Added confusion matrix to classification metrics.
  • Added ability to create empty memorysets.
  • Added stricter checks for if_exists="open" during memoryset creation.
  • Added support for running distribution, duplicate, cluster, and projection analyses on ScoredMemoryset.
  • Tweaked representation of predictive models and embedding models.
  • Fixed classification metrics calculation when test set classes don’t match model’s predicted classes.
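A confusion matrix of the kind added to classification metrics can be sketched as a nested count dictionary; the extra_labels parameter covers the mismatch case fixed above, where test-set classes and the model's predicted classes differ (this is an illustration, not the SDK's metrics object):

```python
def confusion_matrix(pairs: list[tuple[str, str]], extra_labels=()) -> dict:
    """Build a {expected: {predicted: count}} matrix from label pairs.

    Classes passed via `extra_labels` get zero-filled rows and columns
    even when they never appear in `pairs`, so the matrix stays square
    when test-set and predicted classes don't match.
    """
    labels = sorted(
        {e for e, _ in pairs} | {p for _, p in pairs} | set(extra_labels)
    )
    matrix = {e: {p: 0 for p in labels} for e in labels}
    for expected, predicted in pairs:
        matrix[expected][predicted] += 1
    return matrix
```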

v0.1.6

  • Fixed a bug that could lead to division by zero during metrics calculation.
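The class of bug fixed here is a zero denominator in ratio metrics such as precision and recall; a typical guard looks like this (illustrative only, not the SDK's internal code):

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or 0.0 when the denominator is zero.

    Guards metrics like precision and recall, whose denominators can be
    zero when a class never appears in predictions or in the test set.
    """
    return numerator / denominator if denominator else 0.0
```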

v0.1.5

  • Added support for partitioned memorysets and models.

v0.1.4

  • Added use_gpu parameter to prediction methods to allow CPU-based predictions.
  • Added support for using string columns as label columns.
  • Added sample parameter to memoryset creation methods and model evaluate methods to allow sampling of rows.
  • Added ignore_unlabeled parameter to prediction and evaluate methods.
  • Added method to query datasource rows.
  • Added support to finetune embedding models for regression tasks.
  • Added support for querying prediction telemetry on memories.
  • Updated SDK to use new job endpoints.
  • Improved prediction caching.
  • Fixed a dependency vulnerability.

v0.1.3

  • Added async ClassificationModel.apredict and Memoryset.ainsert methods.
  • Added batching to Memoryset.insert, Memoryset.update, and Memoryset.delete methods to reduce network issues.
  • Renamed "neighbor" analysis to "distribution" analysis.
  • Allowed injecting custom httpx clients via context to cleanly override API keys and control client lifecycle.
  • Fixed creation of orphaned datasources when using the if_exists="open" option during memoryset creation.
  • Removed the deprecated Memoryset.run_embedding_evaluation method; use EmbeddingModel.evaluate instead.

v0.1.2

  • Added support for None labels and scores to memorysets and models.
  • Added automatic retrying of requests to mitigate transient network and service issues.
  • Fixed a bug when receiving additional fields in API responses for metrics.
  • Updated dependencies to resolve vulnerabilities.
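The automatic retrying added in this release can be pictured as a generic exponential-backoff wrapper; which errors count as transient is the SDK's decision, and with_retries here is purely illustrative:

```python
import time

def with_retries(request, max_attempts: int = 3, base_delay: float = 0.5):
    """Call `request()` and retry transient failures with exponential backoff.

    ConnectionError stands in for whatever the SDK treats as transient
    (e.g. network drops, 5xx responses); the last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```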