Memories and Memorysets#
This guide dives into the details of how to work with memories and memorysets in OrcaLib. You will learn what memories are, how to create a memoryset, insert memories into it, look up memories that are similar to a given query, and update or delete memories.
What are Memories?#
In the context of Orca, memories are additional data that your model uses to guide its predictions. Your model looks up relevant memories based on the input it receives and uses them to inform its output. Memories are stored in OrcaDB tables and can thus be updated at any time, which allows changing the model’s behavior without retraining or redeploying it. For more information about memories, check out our memories concept guide.
The easiest way to work with memories in Orca is by using memorysets that provide a high-level interface for storing and looking up memories.
Create a Memoryset#
In this guide we will use the LabeledMemoryset, which is a memoryset that stores labels for classification tasks, as an example. The memoryset will take care of creating a table with the right schema in the database if it doesn’t exist yet (or use the existing one), and will automatically generate embeddings for your memories using the embedding model you specify.
We recommend starting out with a memoryset saved to a local DB for quick testing and prototyping.
- This is the name of the table in your database that will store the memories.
- This is the embedding model that will be used to embed the memories for semantic search.
The memoryset stores LabeledMemory objects with the following properties:
- value: the str or Image value of the memory
- label: the label of the memory
- metadata: a dict with additional information about the memory
- embedding: the embedding of the value that is generated by the embedding model attached to the memoryset
- memory_id: the ID of the memory that is generated when it is inserted into the table
- memory_version: the version of the memory that is incremented each time the memory is updated
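The properties listed above can be pictured as a small record type. The following is a minimal sketch using a hypothetical `ToyLabeledMemory` dataclass; it is not the OrcaLib `LabeledMemory` class, and the `update` helper, the UUID-style ID, and the starting version of 1 are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class ToyLabeledMemory:
    """Hypothetical stand-in that mirrors the listed properties (not the OrcaLib class)."""

    value: str  # the guide also allows an Image value; only str is modelled here
    label: int
    metadata: dict = field(default_factory=dict)
    embedding: list = field(default_factory=list)  # would be filled in by an embedding model
    memory_id: str = field(default_factory=lambda: str(uuid4()))  # assumed UUID-style ID
    memory_version: int = 1  # assumed to start at 1

    def update(self, **changes) -> None:
        """Overwrite the given fields and increment memory_version."""
        for name, new_value in changes.items():
            if name not in ("value", "label", "metadata"):
                raise ValueError(f"cannot update field: {name}")
            setattr(self, name, new_value)
        self.memory_version += 1


memory = ToyLabeledMemory(value="Example FAQ answer", label=0, metadata={"tag": "db"})
memory.update(label=1)
print(memory.label, memory.memory_version)
```

The sketch only shows how an ID assigned once and a version counter bumped on every update fit together; the real class may expose these fields differently.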
Insert and Inspect Memories#
To insert memories into the memoryset, you use the insert method.
This will insert two memories into the faq_items table and generate an embedding for each of them. The insert method accepts a wide range of data types (e.g. list[dict], Dataset, and DataFrame) that will be automatically converted into the correct format and saved.
All formats except for the list of tuples provided above require the input to contain a value key (text and image keys are also supported for convenience) as well as keys for all features of the specific memoryset (e.g. label and label_name for our LabeledMemoryset).
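To make the key requirements above concrete, here is a minimal sketch of how dictionary rows could be normalized before they are stored. The `normalize_row` helper is hypothetical and is not part of OrcaLib; treating only `label` as required and rejecting rows with more than one value alias are assumptions of this sketch.

```python
VALUE_ALIASES = ("value", "text", "image")


def normalize_row(row: dict, required_keys=("label",)) -> dict:
    """Rename the text/image alias to 'value' and check that required keys exist."""
    present = [key for key in VALUE_ALIASES if key in row]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {VALUE_ALIASES}, got {present}")
    missing = [key for key in required_keys if key not in row]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    # Copy every non-alias key as-is, then store the value under the canonical name.
    normalized = {key: item for key, item in row.items() if key not in VALUE_ALIASES}
    normalized["value"] = row[present[0]]
    return normalized


rows = [
    {"text": "First example answer", "label": 0},
    {"value": "Second example answer", "label": 1, "label_name": "sdk"},
]
normalized_rows = [normalize_row(row) for row in rows]
print(normalized_rows)
```

Validating rows up front like this surfaces a missing `label` before any embeddings are computed, which is usually cheaper than failing halfway through an insert.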
To quickly inspect the contents of the memoryset, you can call the to_pandas method:
| value | label | embedding | metadata | memory_id | memory_version |
| --- | --- | --- | --- | --- | --- |
| 'OrcaDB is a memory-augmented database that all...' | 0 | [0.005379246082156897, 0.0002812617167364806, ... | {'tag': 'db'} | '5fb9521a-d3c2-430f-b43a-f51ff92643de' | 1 |
| 'OrcaLib is a Python library that allows you to...' | 1 | [0.011897868476808071, -0.011060018092393875, ... | {'tag': 'sdk'} | '5fb9521a-d3c2-430f-b43a-f51ff92643de' | 1 |
Look up Relevant Memories#
The main purpose of a memoryset is to enable efficiently looking up memories that are similar to a given query (typically an input to a model). You use the lookup method for this:
The lookup method takes a single query, a list of queries, or an embedding and is automatically batched for efficiency.
The result is a list of LabeledMemoryLookup objects that contain the memory properties and an additional lookup_score property with a score between 0 and 1 that indicates the similarity between the query and the memory. If you have a reranker attached to the memoryset, the lookups will also contain a reranker_score property. See the Reranking Guide to learn more about how to use rerankers with your memorysets.
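The idea behind a similarity lookup can be sketched without any library. The example below is an illustration only: `toy_embed`, `toy_lookup`, and the `count` parameter are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and cosine similarity is an assumed scoring function (the guide does not say how OrcaLib computes lookup_score). With non-negative word counts the cosine score stays between 0 and 1.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Bag-of-words counts; a crude stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two count vectors, clamped to at most 1.0."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return min(1.0, dot / (norm_a * norm_b))


def toy_lookup(query: str, memories: list, count: int = 1) -> list:
    """Return the `count` most similar memories, each with a lookup_score added."""
    query_embedding = toy_embed(query)
    scored = [
        {**memory, "lookup_score": cosine_similarity(query_embedding, toy_embed(memory["value"]))}
        for memory in memories
    ]
    scored.sort(key=lambda memory: memory["lookup_score"], reverse=True)
    return scored[:count]


memories = [
    {"value": "how do I reset my password", "label": 0},
    {"value": "where can I download the invoice", "label": 1},
]
results = toy_lookup("reset password", memories, count=1)
print(results)
```

A real memoryset would use learned embeddings and an index rather than scoring every memory, but the shape of the result (memory properties plus a score) is the same idea.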
Mapping and Filtering Memories#
Memorysets are not designed for updating individual memory values, but they do provide a way to generate new memorysets from existing ones with the map and filter methods.
Let’s say you want to create a new memoryset that contains the memories of the original memoryset but with the label flipped. You can achieve this by using the map method:
- The lambda function takes in a memory and returns a dictionary containing the values to update in the memory (or an entirely new memory).
- This is the name of the table in which the new memoryset will be stored. If you leave this as None, the new memoryset will replace the original memoryset.
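The "return a dictionary of values to update" behavior described above can be sketched in plain Python. `toy_map` is a hypothetical helper operating on a list of dicts, not the OrcaLib `map` method, and it assumes binary labels 0 and 1 for the flip.

```python
def toy_map(memories: list, fn) -> list:
    """Build new memories by merging the dict returned by fn over each original."""
    return [{**memory, **fn(memory)} for memory in memories]


original = [
    {"value": "great product", "label": 1},
    {"value": "terrible support", "label": 0},
]
# Flip the binary label; all other fields are carried over unchanged.
flipped = toy_map(original, lambda memory: {"label": 1 - memory["label"]})
print(flipped)
```

Because the function returns only the fields that change, the originals are left untouched and the result can be stored separately.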
Or say you want to create a new memoryset that contains only the memories of the original memoryset that have a specific metadata value. You can achieve this by using the filter method:
- The lambda function takes in a memory and returns a boolean indicating whether the memory should be included in the new memoryset.
- This is the name of the table in which the new memoryset will be stored. If you leave this as None, the new memoryset will replace the original memoryset.
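The filter behavior, including what happens when no destination table is given, can be sketched as follows. This is an illustration under stated assumptions: `toy_filter`, the `destination` parameter name, and the dict of tables standing in for a database are all hypothetical.

```python
def toy_filter(tables: dict, source: str, keep, destination=None) -> list:
    """Keep the memories for which keep(memory) is true.

    The result is stored under `destination`; if it is None, the source table
    is replaced, mirroring the behavior described above.
    """
    kept = [dict(memory) for memory in tables[source] if keep(memory)]
    tables[destination if destination is not None else source] = kept
    return kept


tables = {
    "faq_items": [
        {"value": "first answer", "label": 0, "metadata": {"tag": "db"}},
        {"value": "second answer", "label": 1, "metadata": {"tag": "sdk"}},
    ]
}
# Write the filtered memories to a new table and leave the source untouched.
toy_filter(tables, "faq_items", lambda m: m["metadata"].get("tag") == "db", destination="faq_db_only")
# With no destination, the source table itself is replaced.
toy_filter(tables, "faq_items", lambda m: m["label"] == 1)
print(sorted(tables), tables["faq_items"])
```

Writing to a separate table first is the safer option while experimenting, since replacing the source discards the memories that did not pass the filter.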
Sometimes you may want to delete all memories from the memoryset. You can achieve this by using the reset method:
Deploy Memoryset to Hosted OrcaDB#
Once you are ready to deploy your model to production, you can deploy your memoryset to a hosted OrcaDB with the clone method:
- Ensure you followed the Installation & Setup Tutorial to set up the environment variable.
This will create a new table in your hosted OrcaDB and copy all the memories from the original memoryset into the new one. You can now adapt the behavior of the model that has this memoryset attached by updating memories in the hosted memoryset. To do so, you can use the Orca App or run SQL queries directly against the table that backs the hosted memoryset. See the DB Querying Guide to learn more about how to interact directly with tables in your hosted OrcaDB.
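As a rough picture of what updating a memory with SQL can look like, the sketch below uses Python's built-in sqlite3 module as a stand-in database. It does not connect to OrcaDB: the client, SQL dialect, and exact table schema of a hosted OrcaDB are not covered in this guide, so the column layout (taken from the property list earlier) and the manual memory_version bump are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for the table behind a memoryset.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE faq_items ("
    "memory_id TEXT PRIMARY KEY, value TEXT, label INTEGER, memory_version INTEGER)"
)
connection.executemany(
    "INSERT INTO faq_items VALUES (?, ?, ?, ?)",
    [("a1", "first example text", 0, 1), ("b2", "second example text", 1, 1)],
)

# Change the label of one memory; the version bump is done by hand in this sketch.
connection.execute(
    "UPDATE faq_items SET label = ?, memory_version = memory_version + 1 WHERE memory_id = ?",
    (1, "a1"),
)
connection.commit()

updated_row = connection.execute(
    "SELECT label, memory_version FROM faq_items WHERE memory_id = ?", ("a1",)
).fetchone()
print(updated_row)
```

Only the label is changed here, so no embedding would need to be regenerated; changing a memory's value would be a different case, since the stored embedding is derived from it.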