orcalib.orca_types#

This module defines the types of columns that can be used to declare tables in the Orca database.

Examples:

>>> TextT
text
>>> DocumentT.notnull
document NOT NULL
>>> IntT.unique
int32 UNIQUE
>>> VectorT[768]
vector[768]
>>> ImageT["PNG"]
image[PNG]
>>> class Sentiment(Enum):
...     neg = 0
...     pos = 1
>>> EnumT[Sentiment]
enum_as_int[neg=0,pos=1]

TextT module-attribute #

TextT = TextTypeHandle()

Represents a text column type

DocumentT module-attribute #

DocumentT = DocumentTypeHandle()

Represents a document column type.

A document is a long text that must be broken up into smaller pieces for indexing.

EnumT module-attribute #

EnumT = EnumTypeHandle()

Represents an Enum column type.

This must be used with the [] operator to specify the enum values.

Examples:

>>> class Sentiment(Enum):
...     neg = 0
...     pos = 1
>>> EnumT[Sentiment]
enum_as_int[neg=0,pos=1]
>>> EnumT["foo", "bar"]
enum_as_int[foo=0,bar=1]
>>> EnumT["foo=2", "bar=3"]
enum_as_int[foo=2,bar=3]
>>> EnumT["foo=2", "bar"]
enum_as_int[foo=2,bar=3]
>>> EnumT[("foo", 1), ("bar", 2)]
enum_as_int[foo=1,bar=2]
>>> EnumT[{"foo": 1, "bar": 2}]
enum_as_int[foo=1,bar=2]
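
The accepted forms above can be read as different spellings of one name-to-value mapping. The following is a minimal, standalone sketch of that normalization, not orcalib's implementation: `normalize_enum_spec` and `format_enum` are hypothetical helper names, and the rule that an entry without an explicit value continues from the previous value plus one is inferred from the `EnumT["foo=2", "bar"]` example.

```python
from enum import Enum
from typing import Any, Dict


def normalize_enum_spec(*args: Any) -> Dict[str, int]:
    """Hypothetical helper: turn the accepted spec forms into a name -> value dict."""
    # A single Enum subclass: take names and values from its members.
    if len(args) == 1 and isinstance(args[0], type) and issubclass(args[0], Enum):
        return {member.name: int(member.value) for member in args[0]}
    # A single dict: it is already a name -> value mapping.
    if len(args) == 1 and isinstance(args[0], dict):
        return {str(name): int(value) for name, value in args[0].items()}

    result: Dict[str, int] = {}
    next_value = 0  # used when an entry does not name a value explicitly
    for arg in args:
        if isinstance(arg, tuple):
            name, value = arg                # ("foo", 1)
        elif isinstance(arg, str) and "=" in arg:
            name, raw = arg.split("=", 1)    # "foo=2"
            value = int(raw)
        elif isinstance(arg, str):
            name, value = arg, next_value    # "foo"
        else:
            raise TypeError(f"Unsupported enum entry: {arg!r}")
        result[name.strip()] = int(value)
        next_value = int(value) + 1
    return result


def format_enum(name_to_value: Dict[str, int]) -> str:
    """Render the mapping in the same style as the examples above."""
    body = ",".join(f"{name}={value}" for name, value in name_to_value.items())
    return f"enum_as_int[{body}]"


print(format_enum(normalize_enum_spec("foo=2", "bar")))  # enum_as_int[foo=2,bar=3]
```

Note that Python passes `EnumT["foo", "bar"]` to `__getitem__` as a single tuple, so a real implementation has to unpack that tuple before applying rules like these.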

IntT module-attribute #

IntT = NumericTypeHandle(dtype=int64)

Represents an integer column type

FloatT module-attribute #

FloatT = NumericTypeHandle(dtype=float32)

Represents a float column type

VectorT module-attribute #

VectorT = Float32T

Represents a vector column type.

This must be used with the [] operator to specify the vector length.

Examples:

>>> VectorT[768]
vector[768]

ImageT module-attribute #

ImageT = ImageTypeHandle(None)

Represents an image column type.

This must be used with the [] operator to specify the image format.

Examples:

>>> ImageT["JPG"]
image[JPG]

OrcaTypeHandle #

OrcaTypeHandle(t_name, notnull=False, unique=False)

Bases: ABC

Base class for all Orca types. Derived classes represent the types of columns in a table.

Parameters:

  • t_name (str) –

    The name of the type, e.g., “vector” or “image”

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

Examples:

>>> VectorT[768].vector_shape
768
>>> IntT.vector_shape
None

torch_dtype property #

torch_dtype

Corresponding torch type, or None if it doesn’t exist.

numpy_dtype property #

numpy_dtype

Corresponding numpy type, or None if it doesn’t exist.

parameters property #

parameters

The type parameters as a tuple.

Note

These should be in the same order as the parameters in the constructor

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.
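
The `notnull` and `unique` properties are chained in the examples on this page (for example `VectorT[768].notnull.unique`). One way such chaining can work without altering the shared module-level handles is for each property to return a modified copy. The sketch below illustrates that pattern with a frozen dataclass; `TypeSketch` and its field names are hypothetical, and the copy-on-access behavior is an assumption rather than something this page states for these two properties.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TypeSketch:
    """Hypothetical stand-in for a type handle; not the orcalib class."""

    t_name: str
    is_notnull: bool = False
    is_unique: bool = False

    @property
    def notnull(self) -> "TypeSketch":
        # Return a copy with the constraint set; the original is unchanged.
        return replace(self, is_notnull=True)

    @property
    def unique(self) -> "TypeSketch":
        return replace(self, is_unique=True)

    def __str__(self) -> str:
        parts = [self.t_name]
        if self.is_notnull:
            parts.append("NOT NULL")
        if self.is_unique:
            parts.append("UNIQUE")
        return " ".join(parts)


text = TypeSketch("text")
constrained = text.notnull.unique
print(constrained)  # text NOT NULL UNIQUE
print(text)         # text
```

Because the dataclass is frozen, it also gets value-based `__eq__` and `__hash__`, so two handles built in different ways compare equal and can be used as dictionary keys.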

full_name property #

full_name

The full name of the type, including type parameters.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'
>>> ImageT["PNG"].full_name
'image[PNG]'

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True
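
To make the string format concrete, here is a small standalone sketch that splits a type string into its base name, bracketed parameters, and constraints. It is not how `from_string` is implemented: `parse_type_string` and `ParsedType` are hypothetical names, and the grammar (a name, an optional `[...]` part, then any of `NOT NULL` and `UNIQUE`) is inferred from the examples on this page.

```python
import re
from typing import NamedTuple, Optional


class ParsedType(NamedTuple):
    name: str
    parameters: Optional[str]  # raw text inside the brackets, or None
    notnull: bool
    unique: bool


_TYPE_RE = re.compile(
    r"^\s*(?P<name>\w+)"                                   # base name, e.g. "vector"
    r"(?:\[(?P<params>[^\]]*)\])?"                         # optional "[...]" parameters
    r"(?P<constraints>(?:\s+NOT\s+NULL|\s+UNIQUE)*)\s*$",  # trailing constraints
    re.IGNORECASE,
)


def parse_type_string(type_str: str) -> ParsedType:
    """Split a type string such as 'vector[128] NOT NULL UNIQUE' into parts."""
    match = _TYPE_RE.match(type_str)
    if match is None:
        raise ValueError(f"Not a valid type string: {type_str!r}")
    constraints = match.group("constraints").upper().split()
    return ParsedType(
        name=match.group("name"),
        parameters=match.group("params"),
        notnull="NOT" in constraints,
        unique="UNIQUE" in constraints,
    )


print(parse_type_string("vector[128] NOT NULL UNIQUE"))
print(parse_type_string("image[PNG] UNIQUE"))
```

A real parser would go one step further and map the name and parameters back to the matching handle class.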

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

NumericTypeHandle #

NumericTypeHandle(
    dtype, length=None, notnull=False, unique=False
)

Bases: OrcaTypeHandle

Represents a numeric column type, such as an integer or a float, that has a specific data type, e.g., float16 or int32.

Parameters:

  • dtype (NumericType | NumericTypeAlternative) –

    The numeric type, e.g., int32 or float64

  • length (int | None, default: None ) –

    The length of the vector, or None if this is not a vector type

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

is_scalar property #

is_scalar

True if this is a scalar type, False otherwise.

torch_dtype property #

torch_dtype

The corresponding torch type, or None if it doesn’t exist.

numpy_dtype property #

numpy_dtype

The corresponding numpy type, or None if it doesn’t exist.

full_name property #

full_name

The full name of the type.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

__getitem__ #

__getitem__(length)

Returns a copy of this type with the specified vector length.

Parameters:

  • length (int) –

    The length of the vector

Examples:

>>> VectorT[768].vector_shape
768
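
The relationship between `__getitem__`, `is_vector`, `is_scalar`, and `vector_shape` can be sketched as follows. This is an illustration only: `NumericSketch` is a hypothetical class, the dtype is stored as a plain string for simplicity, and the validation of `length` is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class NumericSketch:
    """Hypothetical stand-in for a numeric type handle; not the orcalib class."""

    dtype: str                    # e.g. "float32" or "int32"
    length: Optional[int] = None  # None means the type is a scalar

    def __getitem__(self, length: int) -> "NumericSketch":
        # Return a copy with the vector length set; the original stays scalar.
        if not isinstance(length, int) or length <= 0:
            raise ValueError(f"Vector length must be a positive int, got {length!r}")
        return replace(self, length=length)

    @property
    def is_vector(self) -> bool:
        return self.length is not None

    @property
    def is_scalar(self) -> bool:
        return self.length is None

    @property
    def vector_shape(self) -> Optional[int]:
        return self.length


float32 = NumericSketch("float32")
embedding = float32[768]
print(embedding.vector_shape)  # 768
print(float32.vector_shape)    # None
```

Returning a copy means a shared scalar handle can be reused to declare vectors of several lengths.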

CustomSerializable #

Bases: Protocol[T]

Protocol for column types that should be transferred as a file instead of a value.

binary_serialize abstractmethod #

binary_serialize(value)

Serializes the value as a binary stream, so we can send it to the server.

msgpack_deserialize abstractmethod #

msgpack_deserialize(value)

Deserializes the value from a msgpack-compatible dictionary.
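
A structural protocol with these two methods might look like the sketch below. The names `FileSerializable` and `Utf8TextBlob` are hypothetical, as are the `BinaryIO` return type and the `"data"` dictionary key; the page does not specify the stream type or the layout of the msgpack dictionary.

```python
import io
from typing import Any, BinaryIO, Dict, Protocol, TypeVar

T = TypeVar("T")


class FileSerializable(Protocol[T]):
    """Hypothetical protocol mirroring the two methods described above."""

    def binary_serialize(self, value: T) -> BinaryIO: ...

    def msgpack_deserialize(self, value: Dict[str, Any]) -> T: ...


class Utf8TextBlob:
    """Example implementer that transfers a string as a UTF-8 byte stream."""

    def binary_serialize(self, value: str) -> BinaryIO:
        # Wrap the encoded bytes in an in-memory binary stream.
        return io.BytesIO(value.encode("utf-8"))

    def msgpack_deserialize(self, value: Dict[str, Any]) -> str:
        # Assumes the payload bytes are stored under a "data" key.
        return value["data"].decode("utf-8")


blob: FileSerializable[str] = Utf8TextBlob()
stream = blob.binary_serialize("hello")
print(blob.msgpack_deserialize({"data": stream.read()}))  # hello
```

Because `Protocol` uses structural typing, `Utf8TextBlob` satisfies the protocol without inheriting from it.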

TextTypeHandle #

TextTypeHandle(notnull=False, unique=False)

Bases: OrcaTypeHandle

Represents a text column type, such as text or text NOT NULL UNIQUE.

Parameters:

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

Examples:

>>> VectorT[768].vector_shape
768
>>> IntT.vector_shape
None

torch_dtype property #

torch_dtype

Corresponding torch type, or None if it doesn’t exist.

parameters property #

parameters

The type parameters as a tuple.

Note

These should be in the same order as the parameters in the constructor

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.

full_name property #

full_name

The full name of the type, including type parameters.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'
>>> ImageT["PNG"].full_name
'image[PNG]'

numpy_dtype property #

numpy_dtype

The corresponding numpy type: np.str_

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

EnumTypeHandle #

EnumTypeHandle(
    store_as_string=False,
    *args,
    notnull=False,
    unique=False,
    name_to_value=None
)

Bases: OrcaTypeHandle

Represents an enum column type, such as enum or enum NOT NULL UNIQUE.

Parameters:

  • store_as_string (bool, default: False ) –

    Whether the enum values are stored as strings. For modeling purposes you will usually want to store them as integers.

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

  • name_to_value (dict[str, int] | None, default: None ) –

    A dictionary of name-value pairs for the enum. You can also use the __getitem__ method to specify the values. This is primarily used when cloning the type.

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

Examples:

>>> VectorT[768].vector_shape
768
>>> IntT.vector_shape
None

torch_dtype property #

torch_dtype

Corresponding torch type, or None if it doesn’t exist.

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.

full_name property #

full_name

The full name of the type, including type parameters.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'
>>> ImageT["PNG"].full_name
'image[PNG]'

parameters property #

parameters

The type parameters as a tuple.

as_string property #

as_string

Set this type to store its values as strings in the database

as_integer property #

as_integer

Set this type to store its values as integers in the database

numpy_dtype property #

numpy_dtype

The corresponding numpy type: np.int64, or np.str_ if the values are stored as strings.

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

__getitem__ #

__getitem__(*args)

Set the enum values for this type.

Examples:

>>> class Sentiment(Enum):
...     neg = 0
...     pos = 1
>>> EnumT[Sentiment]
enum_as_int[neg=0,pos=1]
>>> EnumT["foo", "bar"]
enum_as_int[foo=0,bar=1]
>>> EnumT["foo=2", "bar=3"]
enum_as_int[foo=2,bar=3]
>>> EnumT["foo=2", "bar"]
enum_as_int[foo=2,bar=3]
>>> EnumT[("foo", 1), ("bar", 2)]
enum_as_int[foo=1,bar=2]
>>> EnumT[{"foo": 1, "bar": 2}]
enum_as_int[foo=1,bar=2]

DocumentTypeHandle #

DocumentTypeHandle(notnull=False, unique=False)

Bases: OrcaTypeHandle

Represents a document column type.

Parameters:

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

Examples:

>>> VectorT[768].vector_shape
768
>>> IntT.vector_shape
None

torch_dtype property #

torch_dtype

Corresponding torch type, or None if it doesn’t exist.

parameters property #

parameters

The type parameters as a tuple.

Note

These should be in the same order as the parameters in the constructor

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.

full_name property #

full_name

The full name of the type, including type parameters.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'
>>> ImageT["PNG"].full_name
'image[PNG]'

numpy_dtype property #

numpy_dtype

The corresponding numpy type: np.str_

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

ImageTypeHandle #

ImageTypeHandle(format, notnull=False, unique=False)

Bases: OrcaTypeHandle, CustomSerializable[Image]

Represents an image column type

Parameters:

  • format (ImageFormat | str | None) –

    The image format, for example “PNG” or ImageFormat.PNG

  • notnull (bool, default: False ) –

    Whether the column must have a value

  • unique (bool, default: False ) –

    Whether the column must have a unique value

is_vector property #

is_vector

True if this is a vector type, False otherwise.

vector_shape property #

vector_shape

The shape of the vector, or None if this is not a vector type.

Examples:

>>> VectorT[768].vector_shape
768
>>> IntT.vector_shape
None

torch_dtype property #

torch_dtype

Corresponding torch type, or None if it doesn’t exist.

numpy_dtype property #

numpy_dtype

Corresponding numpy type, or None if it doesn’t exist.

notnull property #

notnull

Set a constraint that the column must have a value.

unique property #

unique

Set a constraint that each row in this column must have a unique value.

full_name property #

full_name

The full name of the type, including type parameters.

Examples:

>>> VectorT[768].full_name
'vector[768]'
>>> IntT.full_name
'int32'
>>> ImageT["PNG"].full_name
'image[PNG]'

parameters property #

parameters

The type parameters as a tuple.

from_string classmethod #

from_string(type_str)

Parses a type string into an OrcaTypeHandle.

Essentially, you can reconstruct a type after using str(), repr(), or the type’s full_name property.

Parameters:

  • type_str (str) –

    The type string to parse

Returns:

  • OrcaTypeHandle

    The type handle described by the string

Examples:

>>> OrcaTypeHandle.from_string("vector[128] NOT NULL UNIQUE") == VectorT[128].notnull.unique
True
>>> OrcaTypeHandle.from_string("image[PNG] UNIQUE") == ImageT["PNG"].unique
True

__eq__ #

__eq__(other)

Returns True if the two types are equal, False otherwise.

Parameters:

  • other (Any) –

    The other type to compare with

Returns:

  • bool

    True if the types are equal, False otherwise

__hash__ #

__hash__()

Generate a hash that is unique to the type, including type parameters and constraints.

Returns:

  • int

    Value of the hash

__str__ #

__str__()

Get the type name, including type parameters and constraints.

Returns:

  • str

    The string representation of the type

Examples:

>>> print(VectorT[768].notnull.unique)
vector[768] NOT NULL UNIQUE
>>> print(ImageT["PNG"].unique)
image[PNG] UNIQUE

__getitem__ #

__getitem__(format)

Sets the image format for this type.

Examples:

>>> ImageT["JPG"]
image[JPG]

binary_serialize #

binary_serialize(value)

Serializes the image as a binary stream, to send it to the server.

Parameters:

  • value (Image) –

    The image to serialize

Returns:

msgpack_deserialize #

msgpack_deserialize(value)

Deserializes the image from a msgpack-compatible dictionary.

Parameters:

  • value (dict[str, Any]) –

    The msgpack-compatible dictionary to deserialize

Returns:

  • Image

    The deserialized image