
Memories

This guide will walk you through what memories are, how to choose the right type of memories for your retrieval-augmented model, and how to optimize their performance.

What are Memories?

In the context of Orca, memories are additional data that your model uses to guide its predictions. The model looks up relevant memories based on the input it receives and uses them to inform its output. Memories are stored in OrcaDB tables and can therefore be updated at any time, which lets you change the model’s behavior without retraining or redeploying it.

Memories have three core properties:

  • value: The data the memory is looked up by. This can be a string, an image, or another type of data. Orca generates an embedding for the value, which is used to look the memory up; the embedding may also be used by the model to make its prediction.
  • feature(s): The memory may contain other features associated with the value that are used by the model to make a prediction, for example a label used by a classification model. Orca provides special-purpose memorysets for working with memories that have different features; see the memorysets guide for more information.
  • metadata: Additional data about the memory that is not used by the model. This can be used to store the memory’s source, creation date, or any other relevant information.
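
To make these properties concrete, here is a minimal sketch in plain Python. The `MemoryRecord` class and its field names are hypothetical illustrations of the structure described above, not classes from the Orca SDK.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    """Illustrative stand-in for a memory; not a class from the Orca SDK."""

    value: str                     # the data the memory is looked up by, e.g. a sentence
    embedding: list[float]         # embedding Orca generates for the value
    label: int                     # example feature: a class label the model consumes
    metadata: dict = field(default_factory=dict)  # extra info the model never sees


# A memory as it might appear in a sentiment-classification memoryset.
memory = MemoryRecord(
    value="The checkout flow was quick and painless.",
    embedding=[0.12, -0.04, 0.88],  # stand-in vector; real embeddings are much longer
    label=1,                        # e.g. 1 = positive sentiment
    metadata={"source": "training_set_v2", "created_at": "2024-05-01"},
)
```

In practice you would not construct embeddings by hand; Orca generates them from the value when the memory is stored.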

How Models Use Memories

During inference and training, retrieval-augmented models look up memories that are similar to the input. This is done by computing a similarity metric between the embedding of the model’s input and the memory embeddings; OrcaDB uses the HNSW algorithm to perform an approximate nearest neighbor search over these embeddings.

The model then receives the embeddings and features of the retrieved memories and uses them in its prediction. In the simplest case, a classification model might, for example, predict the label of an input to be the most common label among the memories retrieved for that input.
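
The sketch below illustrates this lookup-then-predict pattern with generic tools: hnswlib stands in for the approximate nearest neighbor search, and a simple majority vote stands in for the model. It is a conceptual illustration of the mechanism described above, using random stand-in embeddings and labels; it is not OrcaDB’s internals or the Orca API.

```python
from collections import Counter

import hnswlib
import numpy as np

rng = np.random.default_rng(0)
dim, num_memories = 64, 1_000

# Stand-in memory embeddings and labels; in Orca these live in an OrcaDB table.
memory_embeddings = rng.normal(size=(num_memories, dim)).astype(np.float32)
memory_labels = rng.integers(0, 3, size=num_memories)

# Approximate nearest neighbor index (HNSW) over the memory embeddings.
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_memories, ef_construction=200, M=16)
index.add_items(memory_embeddings, np.arange(num_memories))
index.set_ef(50)


def predict(input_embedding: np.ndarray, k: int = 10) -> int:
    """Toy retrieval-augmented classifier: majority label of the k nearest memories."""
    ids, _distances = index.knn_query(input_embedding, k=k)
    retrieved_labels = memory_labels[ids[0]]
    return int(Counter(retrieved_labels).most_common(1)[0][0])


# In a real setup the query embedding would come from embedding the model's input.
query_embedding = rng.normal(size=(1, dim)).astype(np.float32)
print(predict(query_embedding))
```

A production model would typically combine the retrieved embeddings and features in a learned way rather than a plain vote; the majority vote here mirrors the simplest case described above.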

Because the model’s predictions are guided by its memories, it also becomes easier to understand why it made a certain prediction. If the model makes a mistake, you can simply add, remove, or edit memories in the memoryset to fix it. Since the memories are stored in OrcaDB, these updates are reflected in the model’s predictions immediately, without retraining or redeploying it.

What to use as Memories

Memories can generally do one of two things:

  1. Provide relevant contextual features to the model.
  2. Provide relevant examples for similar inputs and expected outputs.

For classification models, the second type of memory is usually the most useful. For example, in a sentiment analysis model, you want to store examples of inputs and expected outputs that are similar to those the model will encounter. At inference time, the model looks up memories similar to the input to guide its output.

Often, the training data is a good place to start. From there you can tune your memories based on metrics and production feedback.
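
As a rough, illustrative sketch of that starting point, the snippet below turns a small labeled sentiment dataset into memory entries with the value / feature / metadata split described earlier. The dictionary structure is hypothetical; inserting these entries into an actual memoryset goes through the Orca SDK, whose calls are not shown here.

```python
# Labeled sentiment training data, e.g. loaded from a CSV or an existing dataset.
training_examples = [
    ("Delivery arrived two days late and the box was damaged.", "negative"),
    ("Support resolved my issue within minutes.", "positive"),
    ("The app works, but the onboarding flow is confusing.", "neutral"),
]

# Seed memories with the value / feature / metadata split described above.
# No embeddings are set here because Orca generates them when a memory is stored.
seed_memories = [
    {
        "value": text,                                   # looked up by embedding similarity
        "label": label,                                  # feature the classifier consumes
        "metadata": {"source": "training_data", "split": "train"},
    }
    for text, label in training_examples
]

print(f"Prepared {len(seed_memories)} seed memories")
```

From there, entries can be added, relabeled, or removed as metrics and production feedback reveal where the memoryset falls short.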

The data you observe during inference in production is usually a great source of memories. Orca can help you identify which production inputs will improve your model’s behavior the most and help you label them easily.

If you have further questions about how to find the right memories for your specific model, get in touch! We’d love to learn more about your use case and figure out together how best to leverage memories for your model.