orcalib.orca_torch_mixins#
ProjectionMode
#
Determines how the values from the memory should be “projected” into the model’s embedding space (i.e., what serves as the V in the attention mechanism’s QKV).
Attributes:
-
LABEL
–Project the memory’s label into the model embedding space.
-
POSITIONAL
–Project the memory’s position (0…num_memories-1) into the model embedding space.
ClassificationMode
#
DropExactMatchOption
#
Determines when to drop exact matches from the results.
Attributes:
-
ALWAYS
–Always drop exact matches from the results.
-
NEVER
–Never drop exact matches from the results.
-
TRAINING_ONLY
–Drop exact matches from the results only during training.
-
INFERENCE_ONLY
–Drop exact matches from the results only during inference.
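The four options above can be read as a small decision table over the current mode. The following is a minimal sketch of that reading, not orcalib's implementation: the enum is a local stand-in (its values are assumed), and `should_drop_exact_match` is a hypothetical helper introduced only for illustration.

```python
from enum import Enum


class DropExactMatchOption(Enum):
    # Local stand-in for the enum documented above; the values are assumptions.
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"
    TRAINING_ONLY = "TRAINING_ONLY"
    INFERENCE_ONLY = "INFERENCE_ONLY"


def should_drop_exact_match(option: DropExactMatchOption, training: bool) -> bool:
    """Hypothetical helper: decide whether exact matches are dropped for this pass."""
    if option is DropExactMatchOption.ALWAYS:
        return True
    if option is DropExactMatchOption.NEVER:
        return False
    if option is DropExactMatchOption.TRAINING_ONLY:
        return training
    return not training  # INFERENCE_ONLY
```

In a PyTorch module, the `training` argument would typically come from the module's `training` flag.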
PostInitMixin
#
Bases: ABC
Mixin class that adds an (abstract) post_init() and wraps descendants’ __init__() to call it.
Note:
If PostInitMixin appears more than once in the inheritance chain, post_init() still runs only once: the outermost class calls it after all other init methods have completed.
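One way to get the "runs once, after every init" behavior is to wrap each subclass's `__init__` and track nesting depth. The sketch below illustrates that idea under stated assumptions: `PostInitSketch`, the `_init_depth` attribute, and the example classes are hypothetical names, and orcalib's actual mechanism may differ.

```python
import functools
from abc import ABC, abstractmethod


class PostInitSketch(ABC):
    """Illustrative stand-in for PostInitMixin, not orcalib's implementation."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_init = cls.__dict__.get("__init__")
        if original_init is None:
            return  # this class defines no __init__ of its own; nothing to wrap

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            # Track nesting so that super().__init__() calls do not trigger post_init.
            depth = getattr(self, "_init_depth", 0)
            self._init_depth = depth + 1
            try:
                original_init(self, *args, **kwargs)
            finally:
                self._init_depth = depth
            if depth == 0:  # only the outermost __init__ call reaches this branch
                self.post_init()

        cls.__init__ = wrapped_init

    @abstractmethod
    def post_init(self) -> None: ...


class Base(PostInitSketch):
    def __init__(self):
        self.events = ["Base.__init__"]

    def post_init(self) -> None:
        self.events.append("post_init")


class Child(Base):
    def __init__(self):
        super().__init__()
        self.events.append("Child.__init__")


child = Child()
```

Here `child.events` records both init methods first and `post_init` last, even though both `Base` and `Child` have wrapped initializers.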
PreForwardMixin
#
Bases: ABC
Mixin class that adds an (abstract) pre_forward() and wraps descendants’ forward() so that pre_forward() is called before the original forward method.
NOTE: This uses functools.wraps to wrap the forward method, so the original forward method’s signature is preserved.
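The wrapping described above can be sketched with `functools.wraps`, which copies the original method's metadata so that `inspect.signature` still reports the original parameters. This is a simplified illustration, not orcalib's code: `PreForwardSketch` and `Doubler` are hypothetical, passing the call arguments to `pre_forward` is an assumption, and the sketch does not guard against `pre_forward` running twice when a subclass calls `super().forward()`.

```python
import functools
import inspect
from abc import ABC, abstractmethod


class PreForwardSketch(ABC):
    """Illustrative stand-in for PreForwardMixin, not orcalib's implementation."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_forward = cls.__dict__.get("forward")
        if original_forward is None:
            return  # this class defines no forward of its own; nothing to wrap

        @functools.wraps(original_forward)
        def wrapped_forward(self, *args, **kwargs):
            # Assumption: pre_forward receives the same arguments as forward.
            self.pre_forward(*args, **kwargs)
            return original_forward(self, *args, **kwargs)

        cls.forward = wrapped_forward

    @abstractmethod
    def pre_forward(self, *args, **kwargs) -> None: ...


class Doubler(PreForwardSketch):
    def __init__(self):
        self.calls = []

    def pre_forward(self, *args, **kwargs) -> None:
        self.calls.append(args)

    def forward(self, x: int) -> int:
        return 2 * x


doubler = Doubler()
result = doubler.forward(3)
signature = str(inspect.signature(Doubler.forward))
```

After the call, `doubler.calls` holds the arguments seen by `pre_forward`, and `signature` still shows the `x: int` parameter of the original method.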
CurateSettingsMixin
#
Mixin that adds curate settings to a class as self.curate_settings, then provides properties to access the individual settings.
Note
This class is intended to be used with OrcaModule classes, and should not be used directly.
Parameters:
-
curate_database
(OrcaDatabase | str | None
, default:None
) –The database to use for saving curate tracking data.
-
model_id
(str | None
, default:None
) –The model id to associate with curated model runs.
-
model_version
(str | None
, default:None
) –The model version to associate with curated model runs.
-
metadata
(OrcaMetadataDict | None
, default:None
) –The metadata to attach to curated model runs.
-
curate_enabled
(bool
, default:False
) –Whether the model should collect curate tracking data during eval runs.
-
tags
(Iterable[str] | None
, default:None
) –The tags to attach to the curated model runs.
curate_database
property
writable
#
The name of the database to use for saving curate tracking data.
curate_batch_size
property
writable
#
The batch size of the model run to track curate data for, usually inferred automatically.
last_curate_run_ids
property
writable
#
The run ids of the last model run for which curate tracking data was collected.
DatabaseIndexName
dataclass
#
LookupSettingsSummary
dataclass
#
A summary of lookup settings for a collection of OrcaLookupModule instances that share the same database and index.
Note
The summary does not cover every setting: it ignores the “override” settings (e.g., lookup_result_override, lookup_query_override).
Attributes:
-
lookup_database_name
(str
) –The name of the database used for looking up memories. This is half of the key for the summary; the other half is the memory index name.
-
memory_index_name
(str
) –The name of the index used for looking up memories. This is half of the key for the summary; the other half is the lookup database name.
-
lookup_column_names
(list[str]
) –A list of lookup columns that were requested by any of the LookupSettings in this summary.
-
num_memories_range
(tuple[int, int] | None
) –The range of the number of memories to look up across all LookupSettings in this summary. This will be None if num_memories was not set in any of the LookupSettings.
-
drop_exact_match
(list[DropExactMatchOption]
) –The options for dropping exact matches from the results for all LookupSettings in this summary. This will be [] if no drop-exact-match options were set.
-
exact_match_thresholds
(list[float]
) –The exact-match thresholds for all LookupSettings in this summary. This will be [] if no exact-match thresholds were set.
-
shuffle_memories
(list[bool]
) –The shuffle-memories options for all LookupSettings in this summary. This will be [] if no shuffle-memories options were set.
Example
__or__
#
Merges a LookupSettings object into the LookupSettingsSummary object.
Parameters:
-
settings
(LookupSettings
) –The LookupSettings object to merge.
Returns:
-
LookupSettingsSummary
(LookupSettingsSummary
) –The merged LookupSettingsSummary object.
from_lookup_settings
classmethod
#
Create a dictionary of LookupSettingsSummary objects from a collection of LookupSettings objects.
This is useful for summarizing the lookup settings for a collection of OrcaLookupModule instances. The keys of the dictionary are DatabaseIndexName objects, so there is a separate summary object for each unique database–index combination.
Parameters:
-
lookup_settings
(Iterable[LookupSettings]
) –An iterable collection of LookupSettings objects to summarize.
Returns:
-
dict[DatabaseIndexName, LookupSettingsSummary]
–A dictionary mapping DatabaseIndexName objects to LookupSettingsSummary objects.
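The merge and grouping behavior described for `__or__` and `from_lookup_settings` can be sketched with plain dataclasses. Everything below is a stand-in written for illustration: `IndexKey`, `Settings`, `Summary`, and `summarize` are hypothetical names with assumed fields, covering only two of the summarized attributes, and they are not the orcalib classes.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass(frozen=True)
class IndexKey:
    """Stand-in for DatabaseIndexName (field names are assumed)."""
    database_name: str
    index_name: str


@dataclass
class Settings:
    """Stand-in for LookupSettings with only the fields used here."""
    database_name: str
    index_name: str
    lookup_column_names: Optional[list[str]] = None
    num_memories: Optional[int] = None


@dataclass
class Summary:
    """Stand-in for LookupSettingsSummary."""
    lookup_column_names: list[str] = field(default_factory=list)
    num_memories_range: Optional[tuple[int, int]] = None

    def __or__(self, settings: Settings) -> "Summary":
        # Union of requested columns, keeping first-seen order.
        columns = list(self.lookup_column_names)
        for name in settings.lookup_column_names or []:
            if name not in columns:
                columns.append(name)
        # Widen the (min, max) range only when num_memories is set.
        rng = self.num_memories_range
        n = settings.num_memories
        if n is not None:
            rng = (n, n) if rng is None else (min(rng[0], n), max(rng[1], n))
        return Summary(columns, rng)


def summarize(all_settings: Iterable[Settings]) -> dict[IndexKey, Summary]:
    """Group settings by (database, index) and merge each group into one summary."""
    summaries: dict[IndexKey, Summary] = {}
    for s in all_settings:
        key = IndexKey(s.database_name, s.index_name)
        summaries[key] = summaries.get(key, Summary()) | s
    return summaries


summaries = summarize([
    Settings("db", "idx_a", ["$embedding", "label"], 5),
    Settings("db", "idx_a", ["$embedding", "$score"], 20),
    Settings("db", "idx_b"),
])
```

The two `idx_a` entries collapse into one summary whose range spans 5 to 20, while `idx_b` gets its own summary with an empty column list and a range of `None`.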
LookupSettingsMixin
#
Mixin that adds lookup settings to a class as self.lookup_settings, then provides properties to access the individual settings.
Note
This class is intended to be used with OrcaModule classes, and should not be used directly.
Parameters:
-
lookup_database
(OrcaDatabase | str | None
, default:None
) –The database to use for looking up memories.
-
memory_index_name
(str | None
, default:None
) –The name of the index to use for looking up memories.
-
lookup_column_names
(list[str] | None
, default:None
) –The names of the columns to retrieve for each memory.
-
num_memories
(int | None
, default:None
) –The number of memories to look up.
-
drop_exact_match
(DropExactMatchOption | None
, default:None
) –Whether to drop exact matches from the results.
-
exact_match_threshold
(float | None
, default:None
) –The similarity threshold for exact matches.
-
shuffle_memories
(bool
, default:False
) –Whether to shuffle the looked up memories.
-
freeze_num_memories
(bool
, default:False
) –Whether to freeze the number of memories once set.
-
propagate_lookup_settings
(bool
, default:True
) –Whether to propagate lookup settings to child modules.
lookup_result_transforms
property
writable
#
A list of transforms to apply to the lookup result. NOTE: This will be applied even when lookup_result_override is set.
extra_lookup_column_names
property
writable
#
While set, all lookups will include these additional columns. They may include columns on the indexed table as well as index-specific columns, e.g., $score, $embedding.
lookup_query_override
property
writable
#
The query to use instead of performing a lookup. NOTE: This will be ignored if lookup_result_override is also set.
get_effective_lookup_settings
#
Returns the effective lookup settings for this module, with any inherited settings applied.
Returns:
-
LookupSettings
–The effective lookup settings for this module. In practice, these are the lookup settings set on this module; for any setting that is not set on this module, the inherited setting is used instead.
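The fallback rule described above (own value where set, inherited value otherwise) can be sketched as a field-by-field merge. This is a minimal illustration that assumes `None` means "not set"; `SettingsSketch` and `effective_settings` are hypothetical names, and the real LookupSettings class has more fields.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SettingsSketch:
    """Stand-in for LookupSettings; None is assumed to mean 'not set'."""
    memory_index_name: Optional[str] = None
    num_memories: Optional[int] = None
    shuffle_memories: Optional[bool] = None


def effective_settings(own: SettingsSketch, inherited: SettingsSketch) -> SettingsSketch:
    """Use the module's own value where set, else fall back to the inherited value."""
    fallbacks = {
        f.name: getattr(inherited, f.name)
        for f in fields(own)
        if getattr(own, f.name) is None
    }
    return replace(own, **fallbacks)


parent = SettingsSketch(memory_index_name="idx", num_memories=10, shuffle_memories=False)
child = SettingsSketch(num_memories=3)
effective = effective_settings(child, parent)
```

In this example the child keeps its own `num_memories` and picks up the index name and shuffle option from the parent.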
LabelColumnNameMixin
#
Mixin that lets the user set a label column for lookup instead of requiring them to set the lookup column names directly. It can be mixed with OrcaModel or OrcaLookupModule classes.
This is useful when the user wants the lookup columns to be ["$embedding", label_column_name]. The label_column_name property handles updates to lookup_column_names automatically.
Note
Make sure to set self.label_column_name AFTER calling super().__init__(...) in derived modules/models.
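A property setter is one straightforward way to keep the lookup columns in sync with the label column. The sketch below shows the pattern and why the ordering in the note matters; `LookupHost`, `LabelColumnSketch`, and `Classifier` are hypothetical stand-ins, not the orcalib classes.

```python
from typing import Optional


class LookupHost:
    """Stand-in for a module that owns lookup_column_names."""

    def __init__(self):
        self.lookup_column_names: Optional[list[str]] = None


class LabelColumnSketch:
    """Illustrative stand-in for LabelColumnNameMixin."""

    @property
    def label_column_name(self) -> Optional[str]:
        return getattr(self, "_label_column_name", None)

    @label_column_name.setter
    def label_column_name(self, value: Optional[str]) -> None:
        self._label_column_name = value
        # Keep the lookup columns in sync with the label column.
        self.lookup_column_names = None if value is None else ["$embedding", value]


class Classifier(LabelColumnSketch, LookupHost):
    def __init__(self, label_column_name: str):
        super().__init__()  # initializes lookup_column_names first
        self.label_column_name = label_column_name  # then the setter overwrites it


model = Classifier("label")
```

If the assignment came before `super().__init__()`, the host's initializer in this sketch would reset `lookup_column_names` to `None`, which is the kind of ordering problem the note warns about.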