Welcome to Orca#

Orca enables you to build and maintain retrieval-augmented models that can adapt to changing circumstances. This documentation helps you get started building models with Orca, explains the core concepts of retrieval-augmentation, and shows you how to use Orca to maintain model performance through memory tuning.

Retrieval-Augmentation#

You might have heard of retrieval-augmented generation (RAG) in the context of LLMs. Orca takes a similar approach but applies it to other types of models, such as classification, regression, and recommendation models.

Retrieval-augmentation is a technique that enables machine learning models to adapt to new circumstances without retraining by accessing external data, which we call “memories”, that is stored separately from the model’s logic. During training, Orca injects relevant memory data based on model inputs, teaching the model to efficiently use this supplementary information alongside its inherent knowledge. At inference, the model looks up and uses memories in the same way, allowing for behavior changes without retraining or redeployment.

This lets you adapt model behavior in real time by updating the memories the model uses to make predictions. You can address issues such as data drift, customize behavior for individual users at scale, identify and combat bias, and more, all without retraining or redeployment. Check out our guide on retrieval-augmentation to learn more.
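To make the idea concrete, here is a minimal sketch in plain Python. It is not the OrcaSDK API: `MemoryStore`, `predict`, the two-dimensional embeddings, and the labels are all hypothetical, and the "model" is reduced to a majority vote over the nearest memories rather than a network trained to use them. It only illustrates how a prediction can change when a memory is added while the prediction logic stays untouched.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical in-process memory store: (embedding, label) pairs."""

    def __init__(self):
        self.memories = []

    def add(self, embedding, label):
        self.memories.append((embedding, label))

    def lookup(self, query, k=3):
        # Rank all memories by similarity to the query and keep the top k.
        ranked = sorted(
            self.memories, key=lambda m: cosine(query, m[0]), reverse=True
        )
        return ranked[:k]


def predict(store, query, k=3):
    """Stand-in for a model: majority vote over the retrieved memories."""
    votes = Counter(label for _, label in store.lookup(query, k))
    return votes.most_common(1)[0][0]


store = MemoryStore()
store.add([1.0, 0.0], "billing")
store.add([0.9, 0.1], "billing")
store.add([0.0, 1.0], "shipping")

query = [0.2, 0.8]
before = predict(store, query, k=1)

# Adding a memory changes the prediction; predict() itself is unchanged.
store.add([0.2, 0.8], "returns")
after = predict(store, query, k=1)

print(before, "->", after)
```

In a real retrieval-augmented model the retrieved memories would be fed into a trained model alongside the input instead of being counted directly, but the separation shown here is the same: behavior lives partly in data that can be edited independently of the model.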

Orca provides a fully managed cloud solution for hosting and maintaining retrieval-augmented models, storing and optimizing memories, and observing and tuning model behavior in real time.

Orca Components#

Orca consists of three main components that work together to enable building, instrumenting, and maintaining retrieval-augmented models:

OrcaCloud is a fully managed cloud solution for hosting retrieval-augmented models, storing and embedding memories, collecting telemetry data about memory usage, and observing and tuning model behavior in real time.

OrcaSDK is a Python library that allows you to easily ingest memories into OrcaCloud, quickly deploy retrieval-augmented models, collect feedback from model predictions, and analyze and optimize memory usage.

OrcaApp is a web application for managing models in OrcaCloud, browsing memory data, monitoring memory usage, and tuning memories as model usage evolves over time.

[Figure: Overview Diagram]

Benefits of Orca#

Once you deploy a retrieval-augmented model and instrument it with Orca, you can leverage Orca to optimize model performance. Orca actively records memory usage, analyzes memory relevance, and allows you to record feedback for all model predictions. You can use this data to drive ongoing model performance by:

  • Assessing which memories contribute to accurate and inaccurate results and making surgical updates to problematic memories.
  • Identifying underperforming clusters of inputs that would benefit from additional memories and easily generating new (real or synthetic) memories.
  • Understanding which memories contribute to specific outputs to ensure compliance and guarantee your model is making fair and unbiased decisions.
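As an illustration of the first point, the sketch below aggregates prediction feedback per memory to find memories that are mostly involved in inaccurate results. The log format, the memory IDs, and the `memory_stats` and `flag_problematic` helpers are assumptions made for this example; they do not reflect how Orca stores telemetry or which analysis it runs.

```python
from collections import defaultdict

# Hypothetical telemetry: which memories each prediction used, plus feedback.
prediction_log = [
    {"memory_ids": ["m1", "m2"], "correct": True},
    {"memory_ids": ["m2", "m3"], "correct": False},
    {"memory_ids": ["m3"], "correct": False},
    {"memory_ids": ["m1"], "correct": True},
]


def memory_stats(log):
    """Return {memory_id: (uses, accuracy)} over all logged predictions."""
    uses = defaultdict(int)
    hits = defaultdict(int)
    for record in log:
        for memory_id in record["memory_ids"]:
            uses[memory_id] += 1
            hits[memory_id] += int(record["correct"])
    return {m: (uses[m], hits[m] / uses[m]) for m in uses}


def flag_problematic(log, threshold=0.5, min_uses=2):
    """List memories used at least min_uses times with accuracy below threshold."""
    stats = memory_stats(log)
    return sorted(
        m for m, (n, acc) in stats.items() if n >= min_uses and acc < threshold
    )


stats = memory_stats(prediction_log)
flagged = flag_problematic(prediction_log)
print(flagged)
```

The `min_uses` cutoff is a design choice to avoid flagging a memory on the basis of a single prediction; flagged memories are candidates for review, relabeling, or removal.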

Structure of these Docs#

This documentation is structured to take you from zero to production and beyond with Orca. It consists of the following sections:

Quick Start will get you up and running with Orca in a few minutes.

How-to Guides help you use the full feature set of Orca to deploy retrieval-augmented models and tune memories to optimize model performance.

Concepts provides explanations of the core concepts of retrieval-augmentation and how to reason about memory usage of these models.

Reference contains the detailed specification for the interfaces of all public OrcaSDK modules as well as the OrcaCloud API.

Where to Start#

To get started, follow our quick start to get up and running with Orca in a few minutes:

  • Quick Start

    Set up OrcaDB, install OrcaLib, and build your first retrieval-augmented classifier in less than 10 minutes.

Do you still have questions or are unsure how Orca can fit into your specific use case? We’d love to chat with you!

Let’s talk!