orcalib.database#

OrcaDatabase #

OrcaDatabase(
    uri=None, api_key=None, secret_key=None, name=None
)
Note

This will create a database with the given name if it doesn’t exist yet.

Parameters:

  • uri (str | None, default: None ) –

    URL of the database instance to connect to, or the name of the database. If empty, the ORCADB_URL environment variable is used instead. If the string is not a URL, it is interpreted as the name of the database.

  • api_key (str | None, default: None ) –

    API key for the OrcaDB instance. If not provided, the ORCADB_API_KEY environment variable or the credentials encoded in the uri are used.

  • secret_key (str | None, default: None ) –

    Secret key for the OrcaDB instance. If not provided, the ORCADB_SECRET_KEY environment variable or the credentials encoded in the uri are used.

  • name (str | None, default: None ) –

    Name of the database. Do not provide this if it is already encoded in the uri.

Examples:

Infer connection details from the ORCADB_URL, ORCADB_API_KEY, and ORCADB_SECRET_KEY environment variables:

>>> import os
>>> os.environ["ORCADB_URL"] = "https://<my-api-key>:<my-secret-key>@instance.orcadb.cloud/my-db"
>>> OrcaDatabase()
OrcaDatabase(name="my-db")
>>> OrcaDatabase("my-database")
OrcaDatabase(name="my-database")

All connection details can be fully encoded in the uri:

>>> OrcaDatabase("https://<my-api-key>:<my-secret-key>@instance.orcadb.cloud/my-db")
OrcaDatabase(name="my-db")

Or they can be provided explicitly:

>>> OrcaDatabase(
...    "https://instance.orcadb.cloud",
...    api_key="my-api-key",
...    secret_key="my-secret-key",
...    name="my-other-db"
... )
OrcaDatabase(name="my-other-db")
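
To make the accepted uri forms concrete, the sketch below splits a value into the pieces the constructor documents, using only Python's standard urllib.parse. It illustrates the URI layout and is not orcalib's actual parsing code; the helper name split_orcadb_uri and the keys of the returned dictionary are hypothetical.

```python
from urllib.parse import urlsplit


def split_orcadb_uri(uri):
    """Split a uri value into base URL, credentials, and database name.

    Hypothetical helper for illustration only; orcalib may parse differently.
    """
    parts = urlsplit(uri)
    if not parts.scheme:
        # No scheme: treat the whole string as a database name.
        return {"base_url": None, "api_key": None, "secret_key": None, "name": uri}
    base_url = f"{parts.scheme}://{parts.hostname}"
    if parts.port is not None:
        base_url += f":{parts.port}"
    return {
        "base_url": base_url,
        "api_key": parts.username,    # text before ":" in the userinfo part
        "secret_key": parts.password,  # text after ":" in the userinfo part
        "name": parts.path.strip("/") or None,
    }
```

Calling the helper with "my-database" yields only a name, which mirrors the second constructor example above.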

__contains__ #

__contains__(table_name)

Check if a table exists in the database

Parameters:

  • table_name (str) –

    name of the table

__getitem__ #

__getitem__(table_name)

Get a handle to a table by name

Parameters:

  • table_name (str) –

    name of the table

Returns:

get_table #

get_table(table_name)

Get a handle to a table by name

Parameters:

  • table_name (str) –

    name of the table

Returns:

list_databases staticmethod #

list_databases()

List all databases on the server

Returns:

  • list[str]

    list of database names

is_server_up classmethod #

is_server_up()

Check if the server is up and running

Returns:

  • bool

    True if server is up, False otherwise
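
Because is_server_up returns a plain bool, it can be polled before opening a connection, for example in a startup script. The following is a minimal sketch of such a loop; wait_until_up and its timeout and interval parameters are hypothetical names, and the check is passed in as a callable so the snippet does not depend on orcalib being importable.

```python
import time


def wait_until_up(is_up, timeout=30.0, interval=1.0):
    """Poll `is_up()` until it returns True or `timeout` seconds elapse.

    Returns True if the check succeeded, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_up():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With orcalib installed, the call would presumably look like wait_until_up(OrcaDatabase.is_server_up).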

drop #

drop()

Drop the database

drop_database classmethod #

drop_database(db, ignore_db_not_found=False)

Drops a database by name or using the OrcaDatabase object

Parameters:

  • db (str | OrcaDatabase) –

    name of the database or OrcaDatabase object to drop

  • ignore_db_not_found (bool, default: False ) –

    if True, ignore error if database doesn’t exist and continue with the operation anyway

exists classmethod #

exists(db)

Checks if a database exists by name or using the OrcaDatabase object

Parameters:

  • db (str | OrcaDatabase) –

    name of the database or OrcaDatabase object to check

Returns:

  • bool

    True if database exists, False otherwise

restore staticmethod #

restore(target_db_name, backup_name, checksum=None)

Restore a backup into a target database

Careful:

This will overwrite the target database if it already exists.

Parameters:

  • target_db_name (str) –

    name of database that backup will be restored into (will be created if it doesn’t exist)

  • backup_name (str) –

    name of the backup to restore

  • checksum (str | None, default: None ) –

    optionally the checksum for the backup

Returns:

list_tables #

list_tables()

List all tables in the database

Returns:

  • list[str]

    list of table names

backup #

backup()

Create a backup of the database

Returns:

  • backup_name ( str ) –

    name of the backup

  • checksum ( str ) –

    checksum for the backup
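
Since backup reports a backup name and a checksum, and restore accepts both, the two can be chained to copy a database. The sketch below assumes backup() returns the two values as a tuple in the order listed above; clone_database is a hypothetical helper, and restore is passed in as a callable (for example OrcaDatabase.restore) so the snippet stays independent of orcalib.

```python
def clone_database(source_db, restore, target_db_name):
    """Back up `source_db` and restore the backup into `target_db_name`.

    `source_db` is any object with a `backup()` method; `restore` is a callable
    with the signature restore(target_db_name, backup_name, checksum=None).
    """
    # Assumption: backup() returns (backup_name, checksum) as a 2-tuple.
    backup_name, checksum = source_db.backup()
    # restore overwrites the target database if it already exists.
    restore(target_db_name, backup_name, checksum=checksum)
    return backup_name
```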

download_backup staticmethod #

download_backup(backup_file_name)

Downloads the backup of the database

Parameters:

  • backup_file_name (str) –

    name of the backup file

Returns:

  • Response

    backed up file

upload_backup staticmethod #

upload_backup(file_path)

Uploads tar file of the database

Parameters:

  • file_path (str) –

    path to the tar file

Returns:

  • Response

    Upload response

delete_backup staticmethod #

delete_backup(backup_file_name)

Delete backup file

Parameters:

  • backup_file_name (str) –

    name of the backup file

Returns:

  • Response

    delete response

create_table #

create_table(
    table_name,
    if_table_exists=TableCreateMode.ERROR_IF_TABLE_EXISTS,
    **columns
)

Create a table in the database

Parameters:

  • table_name (str) –

    name of the table

  • if_table_exists (TableCreateMode, default: ERROR_IF_TABLE_EXISTS ) –

    what to do if the table already exists

  • **columns (OrcaTypeHandle, default: {} ) –

    column names and types

Returns:

get_index_status #

get_index_status(index_name)

Get the status of an index

Parameters:

  • index_name (str) –

    name of the index

Returns:

get_index #

get_index(index_name)

Get a handle to an index by name

Parameters:

  • index_name (str) –

    name of the index

Returns:

create_vector_index #

create_vector_index(
    index_name,
    table_name,
    column,
    ann_index_type="hnswlib",
    error_if_exists=True,
)

Create a vector index on a table

Parameters:

  • index_name (str) –

    name of the index

  • table_name (str) –

    name of the table

  • column (str) –

    name of the column

  • error_if_exists (bool, default: True ) –

    if True, raise an error if the index already exists

Returns:

create_document_index #

create_document_index(
    index_name,
    table_name,
    column,
    ann_index_type="hnswlib",
    error_if_exists=True,
    embedding_model=EmbeddingModel.SENTENCE_TRANSFORMER,
)

Create a document index on a table

Parameters:

  • index_name (str) –

    name of the index

  • table_name (str) –

    name of the table

  • column (str) –

    name of the column

  • error_if_exists (bool, default: True ) –

    if True, raise an error if the index already exists

  • embedding_model (EmbeddingModel | None, default: SENTENCE_TRANSFORMER ) –

    embedding model to use

Returns:

create_text_index #

create_text_index(
    index_name,
    table_name,
    column,
    ann_index_type="hnswlib",
    error_if_exists=True,
    embedding_model=EmbeddingModel.SENTENCE_TRANSFORMER,
)

Create a text index on a table

Parameters:

  • index_name (str) –

    name of the index

  • table_name (str) –

    name of the table

  • column (str) –

    name of the column

  • error_if_exists (bool, default: True ) –

    if True, raise an error if the index already exists

  • embedding_model (EmbeddingModel | None, default: SENTENCE_TRANSFORMER ) –

    embedding model to use

Returns:

create_btree_index #

create_btree_index(
    index_name,
    table_name,
    column,
    ann_index_type="hnswlib",
    error_if_exists=True,
)

Create a btree index on a table

Parameters:

  • index_name (str) –

    name of the index

  • table_name (str) –

    name of the table

  • column (str) –

    name of the column

  • error_if_exists (bool, default: True ) –

    if True, raise an error if the index already exists

Returns:

drop_index #

drop_index(index_name, error_if_not_exists=True)

Drop an index from the database

Parameters:

  • index_name (str) –

    name of the index

  • error_if_not_exists (bool, default: True ) –

    if True, raise an error if the index doesn’t exist

drop_table #

drop_table(table_name, error_if_not_exists=True)

Drop a table from the database

Parameters:

  • table_name (str) –

    name of the table

  • error_if_not_exists (bool, default: True ) –

    if True, raise an error if the table doesn’t exist
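
The membership check, drop_table, and create_table can be combined to start from an empty table, which is handy in tests and notebooks. This is a minimal sketch under the assumption that db behaves like an OrcaDatabase; reset_table is a hypothetical helper, and the column types passed through **columns are whatever OrcaTypeHandle values the table needs.

```python
def reset_table(db, table_name, **columns):
    """Drop `table_name` if it exists, then create it again with `columns`.

    `db` is any object that supports `in`, `drop_table`, and `create_table`.
    """
    if table_name in db:  # dispatches to db.__contains__(table_name)
        db.drop_table(table_name)
    return db.create_table(table_name, **columns)
```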

search_memory #

search_memory(index_name, query, limit, columns)

Search a given index for memories related to a query

This is a convenience method that wraps the scan_index method to perform a quick search on a given index. For more advanced queries, use the orcalib.index_handle.IndexHandle.scan or orcalib.index_handle.IndexHandle.vector_scan methods directly.

Parameters:

  • index_name (str) –

    The name of the index to search

  • query (list[float] | str) –

    Query value for the index. This is either a vector represented as a list of floats, or a value that matches the type of the column the index is defined on, for example a string for a text index.

  • limit (int) –

    maximum number of results to return

  • columns (list[str]) –

    list of columns to return in the result

Returns:

Examples:

>>> db.search_memory(
...     "text_index",
...     query="Are Orcas really whales?",
...     limit=1,
...     columns=["id", "text"]
... )
[
    {
        'id': 1,
        'text': "Despite being commonly known as killer whales, orcas are actually the largest member of the dolphin family."
    }
]
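
A common next step is to turn the returned rows into a text block, for instance to include retrieved memories in a prompt. The sketch below assumes the result is a list of dictionaries shaped like the output above; format_memories and its text_column parameter are hypothetical.

```python
def format_memories(results, text_column="text"):
    """Join the text of each result row into a numbered, newline-separated block."""
    lines = [f"{i}. {row[text_column]}" for i, row in enumerate(results, start=1)]
    return "\n".join(lines)
```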

scan_index #

scan_index(
    index_name,
    query,
    drop_exact_match=False,
    exact_match_threshold=EXACT_MATCH_THRESHOLD,
)

Entry point for a search query on the index

Parameters:

  • index_name (str) –

    name of the index

Note

See IndexHandle.scan for details.

vector_scan_index #

vector_scan_index(
    index_name,
    query,
    drop_exact_match=False,
    exact_match_threshold=EXACT_MATCH_THRESHOLD,
)

Entry point for a vector search query on the index that returns a results batch

Parameters:

  • index_name (str) –

    name of the index

Note

See IndexHandle.scan for details.

full_vector_memory_join #

full_vector_memory_join(
    *,
    index_name,
    memory_index_name,
    num_memories,
    query_columns,
    page_index,
    page_size,
    drop_exact_match=False,
    exact_match_threshold=EXACT_MATCH_THRESHOLD,
    shuffle_memories=False
)

Join a vector index with a memory index

Parameters:

  • index_name (str) –

    name of the index

  • memory_index_name (str) –

    name of the memory index

  • num_memories (int) –

    number of memories to join

  • query_columns (list[str]) –

    list of columns to return

  • page_index (int) –

    page index

  • page_size (int) –

    page size

Returns:

  • PagedResponse

    dictionary containing the joined vectors and extra columns
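
The page_index and page_size parameters suggest that large joins are read one page at a time. The sketch below shows one way to walk all pages. It rests on several assumptions not confirmed by this reference: that pages are numbered from 0, and that the caller supplies a fetch_page function which calls full_vector_memory_join and extracts a list of rows from the PagedResponse. iter_pages, fetch_page, and first_page are hypothetical names.

```python
def iter_pages(fetch_page, page_size, first_page=0):
    """Yield rows page by page until a page comes back short or empty.

    `fetch_page(page_index, page_size)` must return a list of rows.
    """
    page_index = first_page
    while True:
        rows = fetch_page(page_index, page_size)
        yield from rows
        if len(rows) < page_size:
            # A short (or empty) page means there is nothing left to read.
            return
        page_index += 1
```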

query #

query(query, params=[])

Send a raw SQL read query to the database

This cannot be used for inserting, updating, or deleting data.

Parameters:

  • query (str) –

    SQL query to run

  • params (list[None | int | float | bytes | str], default: [] ) –

    optional values to pass to a parametrized query

Returns:

  • DataFrame

    pandas DataFrame containing the results
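
Passing values through params rather than formatting them into the SQL string keeps quoting and escaping out of the query text. The sketch below demonstrates the idea with Python's built-in sqlite3 module as a stand-in, because it runs without an OrcaDB server; the "?" placeholder is sqlite3's syntax, and the placeholder syntax that OrcaDatabase.query expects is not stated in this reference. The memories table is made up for the example.

```python
import sqlite3

# In-memory stand-in database; not an OrcaDB connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO memories VALUES (?, ?)",
    [(1, "orcas are dolphins"), (2, "it's a whale")],
)

# The value is passed separately, so the quote inside it needs no escaping.
params = ["it's a whale"]
rows = conn.execute("SELECT id FROM memories WHERE text = ?", params).fetchall()
print(rows)  # [(2,)]
```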