
orcalib.index_query#

DefaultIndexQuery #

DefaultIndexQuery(
    db_name,
    primary_table,
    index,
    index_query,
    index_value=None,
    drop_exact_match=False,
    exact_match_threshold=EXACT_MATCH_THRESHOLD,
    **kwargs
)

Bases: TableQuery['DefaultIndexQuery']

A query on a single table (multi-table queries are not supported for now).

Use it to build up a query step by step and then execute it with .fetch() or .df().

Parameters:

  • db_name (str) –

    The name of the database to query.

  • primary_table (TableHandle) –

    The primary table to query.

  • index (IndexName) –

    The name of the index to query.

  • index_query (Any) –

    The value to query the index for.

  • index_value (ColumnName | None, default: None ) –

    The name of the column to store the index value in. If None, the index value is not stored.

  • drop_exact_match (bool, default: False ) –

    Whether or not to drop exact matches.

  • exact_match_threshold (float, default: EXACT_MATCH_THRESHOLD ) –

    The threshold at which a result is treated as an exact match and dropped. Only relevant when drop_exact_match is True.
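
The interaction between drop_exact_match and exact_match_threshold can be illustrated with a small sketch. This is not orcalib code: the function drop_exact_matches, the "score" field, and the constant ASSUMED_THRESHOLD are hypothetical, and the sketch assumes that a hit counts as an exact match when its similarity score reaches the threshold. The actual value of EXACT_MATCH_THRESHOLD and the actual comparison used by the library are not stated in this reference.

```python
from typing import Any

# Hypothetical stand-in: the real EXACT_MATCH_THRESHOLD value is not given here.
ASSUMED_THRESHOLD = 0.999


def drop_exact_matches(
    hits: list[dict[str, Any]],
    threshold: float = ASSUMED_THRESHOLD,
) -> list[dict[str, Any]]:
    """Keep only hits whose similarity score is below the threshold.

    Assumes each hit carries a "score" in [0, 1], where 1.0 means the hit
    is identical to the query value.
    """
    return [hit for hit in hits if hit["score"] < threshold]


hits = [
    {"text": "the query itself", "score": 1.0},
    {"text": "a close neighbor", "score": 0.93},
    {"text": "a distant neighbor", "score": 0.41},
]
kept = drop_exact_matches(hits)
print([hit["text"] for hit in kept])
```

Under these assumptions, the first hit is removed and the two neighbors are kept, which is the typical reason to drop exact matches when the query value itself is stored in the indexed table.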

limit #

limit(limit)

Limits the number of rows returned by the query.

Parameters:

  • limit (int) –

    The maximum number of rows to return

Returns:

  • T

    query handle for chaining

Examples:

>>> query.limit(1).fetch()
[{"column1": "value1", "column2": "value2"}]

select #

select(*columns)

Selects the given columns from the table. If no columns are specified, all columns are selected.

Parameters:

  • columns (ColumnName) –

    The columns to select. If none are given, all columns are selected.

Returns:

  • T

    query handle for chaining

Examples:

>>> query.select("column1", "column2").fetch(1)
[{"column1": "value1", "column2": "value2"}]
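
Both limit and select return a query handle, which is what allows calls to be chained before fetch. The following is a minimal in-memory sketch of that builder pattern, not the orcalib implementation: the class ToyQuery is hypothetical, and it assumes that a limit passed to fetch takes precedence over one set with limit, which this reference does not confirm.

```python
from __future__ import annotations

from typing import Any, Optional

Row = dict[str, Any]


class ToyQuery:
    """Hypothetical in-memory stand-in for a query handle; not orcalib code."""

    def __init__(self, rows: list[Row]) -> None:
        self._rows = rows
        self._columns: tuple[str, ...] = ()
        self._limit: Optional[int] = None

    def select(self, *columns: str) -> ToyQuery:
        self._columns = columns  # an empty tuple means "all columns"
        return self  # returning the handle is what makes chaining possible

    def limit(self, limit: int) -> ToyQuery:
        self._limit = limit
        return self

    def fetch(self, limit: Optional[int] = None) -> list[Row]:
        # Assumption: a limit passed to fetch() overrides one set by limit().
        effective = limit if limit is not None else self._limit
        rows = self._rows if effective is None else self._rows[:effective]
        if not self._columns:
            return [dict(row) for row in rows]
        return [{name: row[name] for name in self._columns} for row in rows]


rows = [
    {"column1": "value1", "column2": "value2", "column3": "value3"},
    {"column1": "value4", "column2": "value5", "column3": "value6"},
]
result = ToyQuery(rows).select("column1", "column2").fetch(1)
print(result)
```

This toy mutates and returns the same object for brevity. A real query builder may instead return a modified copy so that partially built queries can be reused safely.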

fetch #

fetch(limit=None)

Fetch the results of this query

Parameters:

  • limit (int | None, default: None ) –

    The maximum number of rows to return

Returns:

  • list[RowDict]

    The results of this query as a list of dictionaries mapping column names to values

df #

df(limit=None, explode=False)

Fetch the results of this query as a pandas DataFrame

Parameters:

  • limit (int | None, default: None ) –

    The maximum number of rows to return

  • explode (bool, default: False ) –

    Whether to explode the index_value column (if it exists) into multiple rows

Returns:

  • DataFrame

    The results of this query as a pandas DataFrame
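
To make the explode option concrete, here is a plain-Python sketch of what exploding a list-valued column into multiple rows means. It is similar in spirit to pandas DataFrame.explode but is not the library's implementation. The helper explode_rows and the column name "neighbors" are hypothetical, and the handling of scalars and empty lists is an assumption of this sketch.

```python
from typing import Any

Row = dict[str, Any]


def explode_rows(rows: list[Row], column: str) -> list[Row]:
    """Emit one output row per element of a list-valued column."""
    exploded: list[Row] = []
    for row in rows:
        value = row.get(column)
        if isinstance(value, list) and value:
            for item in value:
                # Copy the row, replacing the list with a single element.
                exploded.append({**row, column: item})
        else:
            # Scalars and empty lists pass through unchanged in this sketch.
            exploded.append(dict(row))
    return exploded


rows = [
    {"id": 1, "neighbors": ["a", "b"]},
    {"id": 2, "neighbors": ["c"]},
]
flat = explode_rows(rows, "neighbors")
print(flat)
```

Each input row with n list elements becomes n output rows that share the remaining column values, which is usually the more convenient shape for filtering or grouping on individual index results.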

VectorIndexQuery #

VectorIndexQuery(
    db_name,
    primary_table,
    index,
    index_query,
    drop_exact_match=False,
    exact_match_threshold=EXACT_MATCH_THRESHOLD,
    curate_run_ids=None,
    curate_layer_name=None,
    columns=None,
    filter=None,
    order_by_columns=None,
    limit=None,
    default_order=Order.ASCENDING,
)

Bases: TableQuery['VectorIndexQuery']

A query on a single table (multi-table queries are not supported for now). Use it to build up a query and then execute it with .fetch().

Parameters:

  • db_name (str) –

    The name of the database to query.

  • primary_table (TableHandle) –

    The primary table to query.

  • columns (list[ColumnName] | None, default: None ) –

    The columns to select

  • filter (OrcaExpr | None, default: None ) –

    The filter to apply to the query.

  • order_by_columns (OrderByColumns | None, default: None ) –

    The columns to order by.

  • limit (int | None, default: None ) –

    The maximum number of rows to return.

  • default_order (Order, default: ASCENDING ) –

    The default order to use with “order_by” if no order is specified.

  • index (IndexName) –

    The name of the index to query.

  • index_query (OrcaExpr) –

    The value to query the index for.

  • drop_exact_match (bool, default: False ) –

    Whether to drop the exact match from the results.

  • exact_match_threshold (float, default: EXACT_MATCH_THRESHOLD ) –

    The minimum threshold for dropping the exact match.

  • curate_run_ids (list[int] | None, default: None ) –

    The run ids to use for curate.

  • curate_layer_name (str | None, default: None ) –

    The layer name to use for curate.
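
The role of default_order can be sketched as a normalization step: any order-by entry that does not name a direction falls back to the default. The code below is an illustration only. The Order enum here is a hypothetical stand-in for the one in the signature, and the representation of order-by entries as either a column name or a (column, Order) pair is an assumption, since the structure of OrderByColumns is not described in this reference.

```python
from enum import Enum
from typing import Any, Union


class Order(Enum):
    """Hypothetical stand-in for the Order enum named in the signature."""

    ASCENDING = "asc"
    DESCENDING = "desc"


# Assumed shape of one order-by entry: "col" or ("col", Order.DESCENDING).
OrderSpec = Union[str, tuple[str, Order]]


def normalize_order_by(
    specs: list[OrderSpec], default_order: Order = Order.ASCENDING
) -> list[tuple[str, Order]]:
    """Attach default_order to every entry that lacks an explicit order."""
    normalized: list[tuple[str, Order]] = []
    for spec in specs:
        if isinstance(spec, str):
            normalized.append((spec, default_order))
        else:
            normalized.append(spec)
    return normalized


def sort_rows(
    rows: list[dict[str, Any]],
    specs: list[OrderSpec],
    default_order: Order = Order.ASCENDING,
) -> list[dict[str, Any]]:
    """Sort rows by several columns, honoring per-column directions."""
    ordered = list(rows)
    # Apply keys from last to first; list.sort is stable, so earlier keys win.
    for column, order in reversed(normalize_order_by(specs, default_order)):
        ordered.sort(key=lambda row: row[column], reverse=order is Order.DESCENDING)
    return ordered


rows = [
    {"label": "b", "score": 2},
    {"label": "a", "score": 2},
    {"label": "c", "score": 1},
]
by_score = sort_rows(rows, ["score", ("label", Order.DESCENDING)])
print([row["label"] for row in by_score])
```

Here "score" has no explicit direction and therefore uses the default (ascending), while "label" is explicitly descending and only breaks ties between rows with equal scores.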

limit #

limit(limit)

Limits the number of rows returned by the query.

Parameters:

  • limit (int) –

    The maximum number of rows to return

Returns:

  • T

    query handle for chaining

Examples:

>>> query.limit(1).fetch()
[{"column1": "value1", "column2": "value2"}]

select #

select(*columns)

Selects the given columns from the table. If no columns are specified, all columns are selected.

Parameters:

  • columns (ColumnName) –

    The columns to select. If none are given, all columns are selected.

Returns:

  • T

    query handle for chaining

Examples:

>>> query.select("column1", "column2").fetch(1)
[{"column1": "value1", "column2": "value2"}]

df #

df(limit)

Fetch rows from the table and return as a DataFrame

Parameters:

  • limit (int | None) –

    The maximum number of rows to return

Returns:

  • DataFrame

    A DataFrame containing the rows

fetch #

fetch(limit=None)

Fetch the results of this query

Parameters:

  • limit (int | None, default: None ) –

    The maximum number of rows to return

Returns:

track_with_curate #

track_with_curate(run_ids, layer_name)

Enable curate tracking for the memories in this query

Parameters:

  • run_ids (list[int]) –

    The ids of the model runs to track these memory lookups under

  • layer_name (str) –

    The name of the model layer performing the lookup

Returns: