orca_sdk.async_client#
ColumnType
module-attribute
#
The type of a column in a datasource
JobStatus
module-attribute
#
JobStatus = Literal[
"INITIALIZED",
"DISPATCHED",
"WAITING",
"PROCESSING",
"COMPLETED",
"FAILED",
"ABORTING",
"ABORTED",
]
Status of a job in the job queue
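As a sketch of working with these status values: since `JobStatus` is a `Literal`, its members can be enumerated with `typing.get_args`. The grouping into terminal states below is an illustration, not something the SDK defines.

```python
from typing import Literal, get_args

# Mirrors the JobStatus literal documented above.
JobStatus = Literal[
    "INITIALIZED",
    "DISPATCHED",
    "WAITING",
    "PROCESSING",
    "COMPLETED",
    "FAILED",
    "ABORTING",
    "ABORTED",
]

# Statuses after which a job will not change again.
# (Assumed grouping for illustration; not defined by the SDK.)
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "ABORTED"}


def is_terminal(status: str) -> bool:
    """Return True if the job has finished, successfully or not."""
    assert status in get_args(JobStatus), f"unknown status: {status}"
    return status in TERMINAL_STATUSES
```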
WorkerStatus
module-attribute
#
Status of a worker in the worker pool
ActionRecommendation
#
EmbeddingFinetuneConfig
#
Bases: TypedDict
loss
instance-attribute
#
Which loss family to train with. ‘prediction’ adds a linear head on top of embeddings. ‘contrastive’ and ‘triplet’ train embeddings directly for similarity. ‘proxy’ uses class-proxy similarity. Each loss has different defaults for batch size, evaluation, and learning rate.
task_type
instance-attribute
#
Whether the model predicts discrete labels or continuous scores. The default is classification; set regression when training against a score column.
learning_rate
instance-attribute
#
Peak learning rate after warmup. Higher values train faster but risk instability. Tuple searches log-uniformly between (min, max).
batch_size
instance-attribute
#
Total samples per training step. Larger batches give more stable gradients and better contrastive negatives but use more memory. Automatically split across GPUs and gradient accumulation steps. List to try multiple sizes in a sweep.
max_steps
instance-attribute
#
Maximum number of training steps. Overrides epochs when set. Useful for quick validation runs or capping long-running jobs.
warmup
instance-attribute
#
Learning rate warmup. int = number of steps, float = fraction of total steps (0–1). Warmup helps stabilize early training. Tuples/lists follow the same int/float convention.
weight_decay
instance-attribute
#
L2 regularization strength. Helps prevent overfitting. Typical range: 0.0 to 0.1.
learning_rate_scheduler
instance-attribute
#
How the learning rate changes after warmup. ‘linear’ decays to zero, ‘cosine’ follows a cosine curve, ‘constant’ stays flat. List to compare schedulers in a sweep.
loss_scale
instance-attribute
#
Inverse temperature (1/τ) for contrastive and proxy losses — controls how sharply the model distinguishes similar from dissimilar pairs. Higher = more discriminative. Only used by contrastive and proxy losses.
contrastive_sigma
instance-attribute
#
Gaussian kernel width for contrastive regression. None auto-tunes from the training score standard deviation. Only used by contrastive regression.
normalize_embeddings
instance-attribute
#
L2-normalize embeddings before the prediction head. Can improve stability when embedding magnitudes vary. Only used by prediction loss.
max_seq_length
instance-attribute
#
Maximum token length for input text. Longer sequences are truncated. ‘max’ fits the longest sample, ‘p95’ / ‘p99’ covers 95% / 99% of samples (saves memory), or set an int for an explicit limit.
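The percentile options can be understood with a small sketch over sample token lengths. This is an assumed interpretation for illustration, not the SDK's implementation:

```python
def resolve_max_seq_length(lengths: list[int], setting) -> int:
    """Illustrative resolution of a max_seq_length setting.

    'max' fits the longest sample, 'p95'/'p99' cover that fraction of
    samples, and an int is used as an explicit limit.
    """
    if isinstance(setting, int):
        return setting
    ordered = sorted(lengths)
    if setting == "max":
        return ordered[-1]
    fraction = {"p95": 0.95, "p99": 0.99}[setting]
    index = min(len(ordered) - 1, int(fraction * len(ordered)))
    return ordered[index]
```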
truncation_side
instance-attribute
#
Which end to truncate when text exceeds max_seq_length. ‘right’ cuts from the end, keeping the beginning of the text.
instruction
instance-attribute
#
Task instruction for instruction-tuned models (e.g. ‘Classify this text’). Formatted into ‘Instruct: {instruction}\nQuery: ’ by the embedding model. None uses the model’s built-in default prompt (if any).
bf16
instance-attribute
#
Use bfloat16 mixed precision to halve memory usage and speed up training. None auto-enables on supported GPUs (A100, H100, etc.). Set False to force full precision.
device_batch_size_limit
instance-attribute
#
Maximum samples that fit in one GPU’s memory per forward pass. None auto-estimates from your GPU’s memory and model size. Override if you hit out-of-memory errors or want tighter control.
gradient_checkpointing
instance-attribute
#
Trade compute for memory by recomputing activations during backward. Roughly halves memory at ~30% slower training. None auto-enables when it would avoid quality-degrading workarounds (mini-batching for contrastive, or fitting an otherwise impossible triplet batch).
gather_across_devices
instance-attribute
#
Share contrastive negatives across all GPUs. Improves quality when the batch is too large for a single GPU. None auto-enables when needed. Only relevant for contrastive loss with multiple GPUs.
eval_method
instance-attribute
#
How to measure model quality during training. ‘head’ reuses the prediction head (fast, prediction only). ‘neighbor’ runs nearest-neighbor search (works for all losses, slower). ‘loss’ uses the training loss on held-out data (cheapest). None picks ‘head’ for prediction, ‘neighbor’ otherwise.
eval_steps
instance-attribute
#
How often to evaluate. int = every N training steps, ‘epoch’ = once per epoch, ‘end’ = only after training finishes, ‘off’ = skip evaluation entirely. None picks a sensible default based on eval_method.
eval_batch_size
instance-attribute
#
Batch size for evaluation inference. None auto-detects from GPU memory.
max_eval_batch_size
instance-attribute
#
Cap on auto-detected eval_batch_size. Lower this if evaluation hits OOM.
neighbor_eval_count
instance-attribute
#
Number of nearest neighbors to consider for neighbor evaluation.
neighbor_eval_pool_subsample
instance-attribute
#
Reduce the neighbor search pool for faster evaluation. int = use this many train samples, float = use this fraction. None uses the full train set.
early_stopping
instance-attribute
#
Stop training when the eval metric stops improving. True = stop after 2 evaluations without improvement, int = custom patience count, False = always train for all epochs. Requires eval_steps to run during training (not ‘end’ or ‘off’).
early_stopping_threshold
instance-attribute
#
Minimum improvement to count as progress for early stopping. 0.0 means any improvement resets the patience counter.
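The interaction between patience and threshold can be sketched as a simple counter over the evaluation history. This is illustrative of the semantics described above, not the SDK's code, and assumes a higher-is-better metric:

```python
def should_stop(eval_metrics: list[float], patience: int = 2,
                threshold: float = 0.0) -> bool:
    """Illustrative early-stopping check over a history of eval metrics.

    Stops once `patience` consecutive evaluations fail to beat the best
    score seen so far by more than `threshold`.
    """
    best = float("-inf")
    stale = 0
    for metric in eval_metrics:
        if metric > best + threshold:
            best = metric
            stale = 0  # any sufficient improvement resets the counter
        else:
            stale += 1
            if stale >= patience:
                return True
    return False
```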
trial_count
instance-attribute
#
Number of hyperparameter configurations to try. 1 = single training run. Values > 1 activate sweep mode, which uses Optuna to search over any parameter specified as a range or list.
startup_trial_count
instance-attribute
#
Number of random trials before the optimizer starts making informed suggestions. 0 = start optimizing immediately (good for small trial budgets).
seed
instance-attribute
#
Random seed for reproducibility. Controls data shuffling, dropout, weight initialization, and sweep trial sampling.
logging_steps
instance-attribute
#
Print training metrics (loss, learning rate, etc.) every N steps.
accelerator_config
instance-attribute
#
Advanced HuggingFace Accelerator settings. Most users can leave this as None.
extra_training_args
instance-attribute
#
Additional HuggingFace TrainingArguments not exposed above. Merged directly into the training configuration.
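Because `EmbeddingFinetuneConfig` is a TypedDict, a configuration is a plain dict at runtime. A hypothetical sweep configuration might look like the following; the field names follow the attributes documented above, but check the SDK for the exact accepted types of each value:

```python
# Hypothetical sweep over learning rate and batch size (illustrative only).
config = {
    "loss": "contrastive",
    "task_type": "classification",
    "learning_rate": (1e-5, 1e-3),  # tuple: searched log-uniformly in (min, max)
    "batch_size": [32, 64, 128],    # list: each size tried during the sweep
    "warmup": 0.1,                  # float: fraction of total steps
    "weight_decay": 0.01,
    "eval_method": "neighbor",
    "eval_steps": "epoch",
    "early_stopping": 3,            # patience of 3 evaluations
    "trial_count": 20,              # > 1 activates sweep mode
    "seed": 42,
}
```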
MemorysetClassPatternsAnalysisConfig
#
Bases: TypedDict
min_uniformity_threshold
instance-attribute
#
Minimum uniformity score (0-1) required for a memory to be considered as a representative. A uniformity of 1.0 means all neighbors are the same class. Lower values allow more flexibility but may select fewer prototypical examples. Default is 1.0.
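The uniformity score described above can be sketched as the fraction of a memory's nearest neighbors that share its class. This is an assumed reading of the docstring for illustration, not the SDK's implementation:

```python
def uniformity(neighbor_labels: list[str], memory_label: str) -> float:
    """Illustrative uniformity: fraction of neighbors sharing the memory's
    class. 1.0 means every neighbor agrees with the memory's label."""
    if not neighbor_labels:
        return 0.0
    matches = sum(lbl == memory_label for lbl in neighbor_labels)
    return matches / len(neighbor_labels)
```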
PRCurve
#
PredictionFeedbackRequest
#
RegressionMetrics
#
Bases: TypedDict
explained_variance
instance-attribute
#
Explained variance score of the predictions
anomaly_score_mean
instance-attribute
#
Mean of anomaly scores across the dataset
anomaly_score_median
instance-attribute
#
Median of anomaly scores across the dataset
anomaly_score_variance
instance-attribute
#
Variance of anomaly scores across the dataset
GetMemorysetByNameOrIdMemoryByMemoryIdParams
#
DeleteMemorysetByNameOrIdMemoryByMemoryIdParams
#
PostEmbeddingModelUploadRequest
#
PostDatasourceUploadRequest
#
GetDatasourceByNameOrIdDownloadParams
#
GetClassificationModelParams
#
GetRegressionModelParams
#
GetPredictiveModelParams
#
GetTelemetryPredictionByPredictionIdParams
#
ClassificationMetrics
#
Bases: TypedDict
anomaly_score_mean
instance-attribute
#
Mean of anomaly scores across the dataset
anomaly_score_median
instance-attribute
#
Median of anomaly scores across the dataset
anomaly_score_variance
instance-attribute
#
Variance of anomaly scores across the dataset
pr_auc
instance-attribute
#
Average precision (area under the precision-recall curve)
confusion_matrix
instance-attribute
#
Confusion matrix where the entry at row i, column j is the count of samples with true label i predicted as label j
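The row/column convention can be made concrete with a minimal sketch that builds a confusion matrix from integer labels (illustrative, not the SDK's code):

```python
def confusion_matrix(true_labels: list[int], predicted_labels: list[int],
                     num_classes: int) -> list[list[int]]:
    """Build a confusion matrix matching the convention above: the entry at
    row i, column j counts samples with true label i predicted as label j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, pred in zip(true_labels, predicted_labels):
        matrix[truth][pred] += 1
    return matrix
```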