# Orca Quick Start
This tutorial will give you a quick walkthrough of how to install OrcaLib, store memories in an OrcaDB, and implement a retrieval-augmented classification (RAC) model that uses those memories to guide its predictions.
## Install OrcaLib
First we need to install OrcaLib, our Python library for interacting with OrcaDB instances and building retrieval-augmented models. OrcaLib is compatible with Python 3.10 or higher and is available on PyPI. You can install it with your favorite python package manager alongside a few other standard dependencies we will need for this tutorial.
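Assuming OrcaLib is published on PyPI under the name `orcalib` (and using the Hugging Face `datasets` library for the dataset we load later), installation might look like this; adjust the package names to your environment:

```shell
# Install OrcaLib plus the dataset library used later in this tutorial.
# Package names here are assumptions based on the text above.
pip install orcalib datasets
```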
## Connect to OrcaDB
OrcaDB makes it easy to store memories that your model will look up to guide its predictions. Because we are building a classification model, we will store memories that contain labeled examples from our dataset. OrcaLib contains the `LabeledMemoryset` class, which provides a convenient way to store memories and look up similar memories for a given input. (1)
- OrcaDB generates embeddings for each memory and uses an approximate nearest neighbor (ANN) index to look up similar memories for the embedding of a given input.
For training and experimenting with your model, it is usually most convenient to use an immutable local file-based OrcaDB instance. Once you are ready to deploy your model to production, you can clone your memoryset into a hosted OrcaDB instance, which will be more performant and allow you to dynamically update memories and collect telemetry data.
To create a local database in a file called `local.db` and store memories in a table called `airline_sentiment`, we can create an immutable `LabeledMemoryset` with a file URL like this:
- We use a file URL whose path specifies the location of the local database and whose fragment specifies the table name.
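A minimal sketch of this step, assuming OrcaLib exposes `LabeledMemoryset` at the package top level (the import path and exact URL scheme are assumptions based on the description above):

```python
# Sketch: create an immutable, file-backed memoryset. The fragment after "#"
# names the table; the path before it names the local database file.
from orcalib import LabeledMemoryset  # assumed import path

memoryset = LabeledMemoryset("file:local.db#airline_sentiment")
```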
OrcaDB is currently in early access, so you will need to contact us to get an invite so you can create an account and deploy an instance on AWS or GCP (1).
- If you need to deploy to a different cloud provider or on-premises, please get in touch to discuss your requirements. Orca is designed to be cloud-agnostic.
Once you have gotten an invite, follow these steps to get access to your OrcaDB instance:
1. Head to app.orcadb.ai and log in with your work email account. You will be prompted to join your organization.
2. Once you have joined your organization, you can find a list of deployed instances on the Cloud Tab.
3. If you already have an instance deployed, proceed to step 4. Otherwise, click the “Deploy your first Instance” button; for more information, check out our guide on how to deploy a new instance. Then proceed to step 5.
4. Select the instance you want to connect to from the list and open the instance details screen by clicking on the arrow button on the right side of the list.
5. On the instance details screen, you will find information about the configuration of your instance. To connect to the instance in the next step, you will need the instance credentials. You can access those by clicking on the key icon next to the URL (see screenshot below), which will open a popup from where you can copy the endpoint, API key, and secret key.
OrcaLib automatically picks up the credentials needed to connect to your OrcaDB from environment variables. You can use a tool like dotenv to load them into your environment from a `.env` file that should contain the following values:
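The variable names below are placeholders; fill in the endpoint, API key, and secret key you copied from the instance details screen, and check the OrcaLib documentation for the exact names it expects:

```
# Hypothetical .env contents; variable names are illustrative.
ORCADB_URL=<your instance endpoint>
ORCADB_API_KEY=<your API key>
ORCADB_SECRET_KEY=<your secret key>
```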
Now we can create a `LabeledMemoryset` for storing our memories in a table called `airline_sentiment` in the OrcaDB instance we just connected to. Unlike the local file-based memoryset, this one lets us update the memories stored there.
- The first argument to the `LabeledMemoryset` constructor is the name of the table to store the memories in (or the full database URL with the table name in the fragment).
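A sketch of this step, assuming the credentials are loaded from the `.env` file via python-dotenv and that OrcaLib reads them from the environment (import paths are assumptions):

```python
# Sketch: connect to the hosted instance using credentials picked up from
# environment variables, then create a mutable memoryset in it.
from dotenv import load_dotenv
from orcalib import LabeledMemoryset  # assumed import path

load_dotenv()  # loads the .env file into the environment
memoryset = LabeledMemoryset("airline_sentiment")  # table name only; endpoint comes from env
```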
## Prepare a Memoryset
With that out of the way, let’s go ahead and load our dataset. We will use the IMDB Dataset to train our model. This dataset contains the text of 50,000 movie reviews and a label indicating whether each review has a positive or negative sentiment. The labels are evenly distributed, and the dataset is split evenly into 25,000 training and 25,000 test samples.
- The IMDB dataset is ordered by label by default, so we shuffle it to randomize the order of the samples.
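Assuming the Hugging Face `datasets` library, loading and shuffling might look like this:

```python
# Load the IMDB reviews dataset and shuffle both splits so the
# label-ordered samples are randomized.
from datasets import load_dataset

dataset = load_dataset("imdb")
train_data = dataset["train"].shuffle(seed=42)  # fixed seed for reproducibility
test_data = dataset["test"].shuffle(seed=42)
```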
To implement a retrieval-augmented model, we need to store memories that contain examples of inputs and outputs that are similar to the inputs we want the model to classify. To get started, we will simply use the training dataset as the memories. So let’s go ahead and `insert` them into our previously created memoryset.
- This will take a while, since embeddings are generated for all 25,000 training samples. But because the result is saved to an OrcaDB table, you will not have to run this again unless you `reset` the memoryset.
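Continuing from the snippets above (a memoryset and a shuffled training split), the insert step might be sketched as follows; the exact signature of `insert` is an assumption based on the tutorial text:

```python
# Sketch: store the labeled training samples as memories. OrcaDB generates
# an embedding for each memory as it is inserted.
memoryset.insert(train_data)
```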
To get a feel for how memorysets work, we can manually retrieve some sample reviews using the semantic search capabilities of OrcaDB with the `lookup` method.
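A hedged sketch of a lookup call; the method name comes from the text above, but the query string and the keyword controlling the number of results are assumptions:

```python
# Sketch: retrieve memories semantically similar to a free-text query.
memories = memoryset.lookup("A movie with a great sense of humor", count=3)
print(memories)
```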
```python
[
    LabeledMemoryLookup(
        value="I absolutely loved this movie. It met all expectations and went beyond that. I loved the humor and the way the movie wasn't just randomly silly. It also had a message. Jim Carrey makes me happy. :)",
        label=<pos: 1>,
        embedding=<array.float64(768,)>,
        memory_id='52bcf742-dc9e-4640-adaf-af956c760cd2',
        memory_version=1,
        lookup_score=0.6890271713892993, # (1)!
    ),
    ...
]
```
- The lookup score is the cosine similarity between the memory embedding and the input embedding.
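To make the score concrete, here is a small conceptual sketch (plain Python, not OrcaLib code) of cosine similarity between two embedding vectors:

```python
# Conceptual sketch: the lookup score is the cosine similarity between the
# input embedding and a memory embedding -- the dot product of the vectors
# divided by the product of their lengths.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

Identical embeddings score 1.0, unrelated ones score near 0, so higher scores mean semantically closer memories.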
As you can see, we get back a list of `LabeledMemoryLookup` objects that contain the text and label of the memories we inserted, alongside some extra information like the similarity score and the embedding that was used for the lookup.
## Build a RAC Model
Time to build our first retrieval-augmented model, which uses memories similar to the input to guide its predictions. Orca makes this really easy with the `RACModel` class, which automates the retrieval of memories and provides convenient methods for training, evaluating, and using the model.
- This tells the model how many classes we have in our dataset.
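A sketch of model creation, assuming `RACModel` is importable from the package top level and accepts the memoryset plus a class count (argument names are assumptions; `num_classes=2` matches the two sentiment labels):

```python
# Sketch: create a RAC model backed by the memoryset we prepared above.
from orcalib import RACModel  # assumed import path

model = RACModel(num_classes=2, memoryset=memoryset)
```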
By default, the RAC model uses an MMOE head with a cross-attention mechanism that blends the labels of the retrieved memories based on weights derived from the input and memory embeddings. It is initialized to return decent predictions based on the memory labels without any training. But to learn the weights for the attention mechanism, we can use the `finetune` method to train the model on our labeled dataset.
- This will take a while to train. You can `save` the model to disk after it is trained so you can `load` it back up later without having to train it again.
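Sketched out, the training and persistence steps might look like this; the method names come from the text above, but the signatures and the save path are assumptions:

```python
# Sketch: learn the attention weights on the labeled training split,
# then persist the trained model to disk.
model.finetune(train_data)
model.save("./rac_model")  # hypothetical path
# later: model = RACModel.load("./rac_model")  # load shape is an assumption
```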
Now let’s see how well our model performs on the test dataset by using the `evaluate` method, which calculates typical classification metrics.
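A minimal sketch of the evaluation call, continuing from the snippets above (the signature is an assumption):

```python
# Sketch: compute classification metrics on the held-out test split.
result = model.evaluate(test_data)
print(result)  # expect accuracy and related metrics
```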
We see that the model has an accuracy of around 93%, which is quite good for a first try. There are many ways to improve the model further: tuning our training hyperparameters, fine-tuning the embedding model, adding reranking to the lookups, or curating the contents of our memoryset.
Lastly, let’s check out the retrieval augmentation in action by calling the `predict` method on a sample input.
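A sketch of a single prediction; the method name is from the text above, and the sample input is illustrative:

```python
# Sketch: classify one review; the result includes the memories that
# were retrieved to guide the prediction.
prediction = model.predict(
    "Both the acting and the story were amazing, best film of the year!"
)
print(prediction)
```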
```python
PredictionResult(
    label=<pos: 1>,
    logits=[0.0001646280289, 0.9998353719711304],
    memories=[ # (1)!
        LabeledMemoryLookup(
            value="I absolutely loved this movie. It met all expectations and went beyond that. I loved the humor and the way the movie wasn't just randomly silly. It also had a message. Jim Carrey makes me happy. :)",
            label=<pos: 1>,
            embedding=<array.float64(768,)>,
            memory_id='52bcf742-dc9e-4640-adaf-af956c760cd2',
            memory_version=1,
            lookup_score=0.6890271713892993,
            attention_weight=0.6890273094177246, # (2)!
        ),
        ...
    ]
)
```
- The memories attribute contains a list of all the memories that were looked up to guide the prediction.
- The attention weight is the weight the model assigned to each memory to combine them to form the final prediction.
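As a conceptual sketch (plain Python, not OrcaLib code), here is one way attention weights over retrieved memories can be combined into class logits; the function and its shape are illustrative, not the model's actual head:

```python
# Conceptual sketch: normalize the attention weights and sum each memory's
# weight into the logit slot of its label.
def blend_labels(labels: list[int], weights: list[float], num_classes: int) -> list[float]:
    total = sum(weights)
    logits = [0.0] * num_classes
    for label, weight in zip(labels, weights):
        logits[label] += weight / total
    return logits

# Three retrieved memories: two positive (label 1), one negative (label 0).
print(blend_labels([1, 1, 0], [0.5, 0.25, 0.25], num_classes=2))  # -> [0.25, 0.75]
```

Memories the model attends to more strongly pull the prediction toward their label, which is why the result below reports both the lookup score and the attention weight per memory.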
As you can see, the returned `PredictionResult` object contains not only the predicted label and logits, but also the memories that were looked up to guide the prediction, including their lookup similarity scores and the attention weight the model assigned to each memory when combining them.
## Up Next
That was a lot. We have seen how to install OrcaLib, create a memoryset that stores easily retrievable memories in an OrcaDB instance, and build our first retrieval-augmented classification (RAC) model that uses those memories to guide its predictions.
To dive deeper into how to build your own retrieval-augmented models, check out our more detailed tutorials for different model types: