Add registration and login functionality

2025-08-29 00:34:40 +08:00
parent 09065f2ce7
commit 2fe3474d9e
3060 changed files with 29217 additions and 87137 deletions


@@ -722,11 +722,230 @@ The ``oracle_compress`` parameter accepts either an integer specifying the
number of prefix columns to compress, or ``True`` to use the default (all
columns for non-unique indexes, all but the last column for unique indexes).
.. _oracle_vector_datatype:
VECTOR Datatype
---------------
Oracle Database 23ai introduced a new VECTOR datatype for artificial intelligence
and machine learning search operations. The VECTOR datatype is a homogeneous array
of 8-bit signed integers, 8-bit unsigned integers (binary), 32-bit floating-point
numbers, or 64-bit floating-point numbers.
A vector's storage type can be either DENSE or SPARSE. A dense vector contains
meaningful values in most or all of its dimensions. In contrast, a sparse vector
has non-zero values in only a few dimensions, with the majority being zero.
Sparse vectors are represented by the total number of vector dimensions, an array
of indices, and an array of values, where each value's location in the vector is
given by the entry at the same position in the indices array. All other vector
values are treated as zero.
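The sparse representation described above can be illustrated in plain Python. This is a minimal sketch only: the helper name ``sparse_to_dense`` is hypothetical and is not part of SQLAlchemy or python-oracledb, and it assumes that indices are zero-based.

```python
from typing import List, Sequence


def sparse_to_dense(
    num_dimensions: int, indices: Sequence[int], values: Sequence[float]
) -> List[float]:
    """Expand a sparse (dimensions, indices, values) triple into a dense list."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dense = [0.0] * num_dimensions  # every position not listed stays zero
    for index, value in zip(indices, values):
        dense[index] = value  # values[i] is stored at position indices[i]
    return dense


# Hypothetical data: 10 dimensions, non-zero entries at positions 1 and 2.
dense = sparse_to_dense(10, [1, 2], [23.45, 221.22])
```

Only two of the ten positions carry data here, which is the case the SPARSE storage type is meant for.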
The storage formats that can be used with sparse vectors are float32, float64, and
int8. Note that the binary storage format cannot be used with sparse vectors.
Sparse vectors are supported when you are using Oracle Database 23.7 or later.
.. seealso::
`Using VECTOR Data
<https://python-oracledb.readthedocs.io/en/latest/user_guide/vector_data_type.html>`_ - in the documentation
for the :ref:`oracledb` driver.
.. versionadded:: 2.0.41 - Added VECTOR datatype
.. versionadded:: 2.0.43 - Added DENSE/SPARSE support
CREATE TABLE support for VECTOR
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the :class:`.VECTOR` datatype, you can specify the number of dimensions,
the storage format, and the storage type for the data. Valid values for the
storage format are enum members of :class:`.VectorStorageFormat`. Valid values
for the storage type are enum members of :class:`.VectorStorageType`. If
storage type is not specified, a DENSE vector is created by default.
To create a table that includes a :class:`.VECTOR` column::

    from sqlalchemy.dialects.oracle import (
        VECTOR,
        VectorStorageFormat,
        VectorStorageType,
    )

    t1 = Table(
        "t1",
        metadata,
        Column("id", Integer, primary_key=True),
        Column(
            "embedding",
            VECTOR(
                dim=3,
                storage_format=VectorStorageFormat.FLOAT32,
                storage_type=VectorStorageType.SPARSE,
            ),
        ),
        Column(...),
        ...,
    )

A VECTOR column can also leave the number of dimensions, the storage format, or
both unspecified. The column then accepts vectors of any dimension or format for
whichever attribute was left open.

**Examples**

* Only the dimension is fixed. The storage format is flexible, so vector data of
  any format, such as INT8 or BINARY, can be inserted::

      vector_col: Mapped[array.array] = mapped_column(VECTOR(dim=3))

* Only the storage format is fixed. The dimension is flexible, so vectors of any
  dimension can be used::

      vector_col: Mapped[array.array] = mapped_column(
          VECTOR(storage_format=VectorStorageFormat.INT8)
      )

* Both the dimension and the storage format are flexible. This creates a DENSE
  vector::

      vector_col: Mapped[array.array] = mapped_column(VECTOR)

* To create a SPARSE vector with both the dimension and the storage format
  flexible, use the :attr:`.VectorStorageType.SPARSE` storage type::

      vector_col: Mapped[array.array] = mapped_column(
          VECTOR(storage_type=VectorStorageType.SPARSE)
      )

Python Datatypes for VECTOR
~~~~~~~~~~~~~~~~~~~~~~~~~~~
VECTOR data can be inserted using a Python list or a Python ``array.array``
object. Python arrays of type FLOAT (32-bit), DOUBLE (64-bit), INT (8-bit signed
integers), or BINARY (8-bit unsigned integers) are used as bind values when
inserting into VECTOR columns::

    from sqlalchemy import insert, select

    with engine.begin() as conn:
        conn.execute(
            insert(t1),
            {"id": 1, "embedding": [1, 2, 3]},
        )

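The four array types named above can be built with the standard library ``array`` module. The sketch below assumes the typecode mapping used by python-oracledb (``"f"``, ``"d"``, ``"b"``, and ``"B"``); the variable names are illustrative only.

```python
import array

# Assumed typecode -> VECTOR storage format correspondence.
float32_vec = array.array("f", [1.5, 2.5, 3.5])  # FLOAT32
float64_vec = array.array("d", [1.5, 2.5, 3.5])  # FLOAT64
int8_vec = array.array("b", [-1, 2, 3])          # INT8 (signed)
binary_vec = array.array("B", [255, 0, 7])       # BINARY (unsigned bytes)

# Bytes per element for each typecode.
item_sizes = {
    vec.typecode: vec.itemsize
    for vec in (float32_vec, float64_vec, int8_vec, binary_vec)
}
```

Choosing the typecode explicitly makes the element width of the bind value visible, which a plain Python list does not convey.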
Data can be inserted into a sparse vector using the :class:`_oracle.SparseVector`
class, which creates an object consisting of the number of dimensions, an array
of indices, and a corresponding array of values::

    import array

    from sqlalchemy import insert, select
    from sqlalchemy.dialects.oracle import SparseVector

    sparse_val = SparseVector(10, [1, 2], array.array("d", [23.45, 221.22]))

    with engine.begin() as conn:
        conn.execute(
            insert(t1),
            {"id": 1, "embedding": sparse_val},
        )

VECTOR Indexes
~~~~~~~~~~~~~~
The VECTOR feature supports an Oracle-specific parameter ``oracle_vector``
on the :class:`.Index` construct, which allows the construction of VECTOR
indexes.
SPARSE vectors cannot be used in the creation of vector indexes.
To utilize VECTOR indexing, set the ``oracle_vector`` parameter to True to use
the default values provided by Oracle. HNSW is the default indexing method::

    from sqlalchemy import Index

    Index(
        "vector_index",
        t1.c.embedding,
        oracle_vector=True,
    )

The full range of parameters for vector indexes is available by using the
:class:`.VectorIndexConfig` dataclass in place of a boolean; this dataclass
allows full configuration of the index::

    Index(
        "hnsw_vector_index",
        t1.c.embedding,
        oracle_vector=VectorIndexConfig(
            index_type=VectorIndexType.HNSW,
            distance=VectorDistanceType.COSINE,
            accuracy=90,
            hnsw_neighbors=5,
            hnsw_efconstruction=20,
            parallel=10,
        ),
    )

    Index(
        "ivf_vector_index",
        t1.c.embedding,
        oracle_vector=VectorIndexConfig(
            index_type=VectorIndexType.IVF,
            distance=VectorDistanceType.DOT,
            accuracy=90,
            ivf_neighbor_partitions=5,
        ),
    )

For a complete explanation of these parameters, see the Oracle documentation
linked below.
.. seealso::
`CREATE VECTOR INDEX <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-B396C369-54BB-4098-A0DD-7C54B3A0D66F>`_ - in the Oracle documentation
Similarity Searching
~~~~~~~~~~~~~~~~~~~~
When using the :class:`_oracle.VECTOR` datatype with a :class:`.Column` or similar
ORM mapped construct, additional comparison functions are available, including:
* ``l2_distance``
* ``cosine_distance``
* ``inner_product``
Example Usage::

    result_vector = connection.scalars(
        select(t1).order_by(t1.embedding.l2_distance([2, 3, 4])).limit(3)
    )

    for user in result_vector:
        print(user.id, user.embedding)

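For orientation, the three comparison functions correspond to well-known vector metrics. The sketch below uses their conventional textbook definitions in plain Python. It is not how the database evaluates them, and the helper names simply mirror the method names listed above.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus the cosine of the angle between the two vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)
```

With these definitions, ordering by ``l2_distance`` or ``cosine_distance`` ascending places the most similar vectors first, which is what the ``order_by(...).limit(3)`` example relies on.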
FETCH APPROXIMATE support
~~~~~~~~~~~~~~~~~~~~~~~~~
Approximate vector search is performed only when all syntax and semantic rules
are satisfied, the corresponding vector index is available, and the query
optimizer decides to use it. If any of these conditions is not met, no
approximate search is performed and the query returns exact results.

To enable approximate searching during similarity searches on VECTOR columns,
the ``oracle_fetch_approximate`` parameter may be used with the
:meth:`.Select.fetch` method to add ``FETCH APPROX`` to the SELECT statement::

    select(users_table).fetch(5, oracle_fetch_approximate=True)

""" # noqa
from __future__ import annotations
from collections import defaultdict
from dataclasses import fields
from functools import lru_cache
from functools import wraps
import re
@@ -749,6 +968,9 @@ from .types import RAW
from .types import ROWID # noqa
from .types import TIMESTAMP
from .types import VARCHAR2 # noqa
from .vector import VECTOR
from .vector import VectorIndexConfig
from .vector import VectorIndexType
from ... import Computed
from ... import exc
from ... import schema as sa_schema
@@ -767,6 +989,7 @@ from ...sql import func
from ...sql import null
from ...sql import or_
from ...sql import select
from ...sql import selectable as sa_selectable
from ...sql import sqltypes
from ...sql import util as sql_util
from ...sql import visitors
@@ -828,6 +1051,7 @@ ischema_names = {
"BINARY_DOUBLE": BINARY_DOUBLE,
"BINARY_FLOAT": BINARY_FLOAT,
"ROWID": ROWID,
"VECTOR": VECTOR,
}
@@ -985,6 +1209,18 @@ class OracleTypeCompiler(compiler.GenericTypeCompiler):
    def visit_ROWID(self, type_, **kw):
        return "ROWID"

    def visit_VECTOR(self, type_, **kw):
        dim = type_.dim if type_.dim is not None else "*"
        storage_format = (
            type_.storage_format.value
            if type_.storage_format is not None
            else "*"
        )
        storage_type = (
            type_.storage_type.value if type_.storage_type is not None else "*"
        )
        return f"VECTOR({dim},{storage_format},{storage_type})"

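The rendering rule in ``visit_VECTOR`` is that every unspecified attribute becomes ``*``. A standalone sketch of that rule follows. The function ``render_vector_type`` is hypothetical, and plain strings such as ``"FLOAT32"`` and ``"SPARSE"`` stand in for the enum ``.value`` attributes, whose exact values are assumed here.

```python
from typing import Optional


def render_vector_type(
    dim: Optional[int] = None,
    storage_format: Optional[str] = None,
    storage_type: Optional[str] = None,
) -> str:
    """Render a VECTOR type string, using "*" for anything left unspecified."""
    parts = [
        "*" if part is None else str(part)
        for part in (dim, storage_format, storage_type)
    ]
    return "VECTOR({},{},{})".format(*parts)


fully_flexible = render_vector_type()
fixed_dim_only = render_vector_type(dim=3)
fully_specified = render_vector_type(3, "FLOAT32", "SPARSE")
```

The comparison against ``None`` rather than truthiness matters: it keeps the check explicit about "not provided" instead of depending on whether a value happens to be falsy.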
class OracleCompiler(compiler.SQLCompiler):
"""Oracle compiler modifies the lexical structure of Select
@@ -1223,6 +1459,29 @@ class OracleCompiler(compiler.SQLCompiler):
else:
return select._fetch_clause
    def fetch_clause(
        self,
        select,
        fetch_clause=None,
        require_offset=False,
        use_literal_execute_for_simple_int=False,
        **kw,
    ):
        text = super().fetch_clause(
            select,
            fetch_clause=fetch_clause,
            require_offset=require_offset,
            use_literal_execute_for_simple_int=(
                use_literal_execute_for_simple_int
            ),
            **kw,
        )

        if select.dialect_options["oracle"]["fetch_approximate"]:
            text = re.sub("FETCH FIRST", "FETCH APPROX FIRST", text)

        return text

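The override above post-processes the text produced by the base compiler instead of rebuilding the clause. A minimal sketch of that substitution in isolation is shown below. The helper ``add_approx`` is hypothetical, and the sample input string is an assumption about what the base ``fetch_clause`` emits.

```python
import re


def add_approx(fetch_text: str, fetch_approximate: bool) -> str:
    """Insert APPROX into a rendered FETCH FIRST clause when requested."""
    if fetch_approximate:
        return re.sub("FETCH FIRST", "FETCH APPROX FIRST", fetch_text)
    return fetch_text


exact_clause = "\n FETCH FIRST 5 ROWS ONLY"  # assumed base compiler output
approx_clause = add_approx(exact_clause, True)
unchanged_clause = add_approx(exact_clause, False)
```

Because the substitution matches on the literal text ``FETCH FIRST``, it has no effect when the rendered clause does not contain that phrase.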
def translate_select_structure(self, select_stmt, **kwargs):
select = select_stmt
@@ -1471,6 +1730,48 @@ class OracleCompiler(compiler.SQLCompiler):
class OracleDDLCompiler(compiler.DDLCompiler):
    def _build_vector_index_config(
        self, vector_index_config: VectorIndexConfig
    ) -> str:
        parts = []
        sql_param_name = {
            "hnsw_neighbors": "neighbors",
            "hnsw_efconstruction": "efconstruction",
            "ivf_neighbor_partitions": "neighbor partitions",
            "ivf_sample_per_partition": "sample_per_partition",
            "ivf_min_vectors_per_partition": "min_vectors_per_partition",
        }
        if vector_index_config.index_type == VectorIndexType.HNSW:
            parts.append("ORGANIZATION INMEMORY NEIGHBOR GRAPH")
        elif vector_index_config.index_type == VectorIndexType.IVF:
            parts.append("ORGANIZATION NEIGHBOR PARTITIONS")
        if vector_index_config.distance is not None:
            parts.append(f"DISTANCE {vector_index_config.distance.value}")

        if vector_index_config.accuracy is not None:
            parts.append(
                f"WITH TARGET ACCURACY {vector_index_config.accuracy}"
            )

        parameters_str = [f"type {vector_index_config.index_type.name}"]
        prefix = vector_index_config.index_type.name.lower() + "_"

        for field in fields(vector_index_config):
            if field.name.startswith(prefix):
                key = sql_param_name.get(field.name)
                value = getattr(vector_index_config, field.name)
                if value is not None:
                    parameters_str.append(f"{key} {value}")

        parameters_str = ", ".join(parameters_str)
        parts.append(f"PARAMETERS ({parameters_str})")

        if vector_index_config.parallel is not None:
            parts.append(f"PARALLEL {vector_index_config.parallel}")

        return " ".join(parts)

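To see what ``_build_vector_index_config`` produces, the same logic can be run against stand-in types. In the sketch below, ``IndexType`` and ``IndexConfig`` are hypothetical substitutes for ``VectorIndexType`` and ``VectorIndexConfig``. Their field names follow the method above, but the defaults (HNSW, everything else ``None``) and the use of a plain string for ``distance`` are assumptions.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Optional


class IndexType(Enum):  # stand-in for VectorIndexType
    HNSW = "HNSW"
    IVF = "IVF"


@dataclass
class IndexConfig:  # stand-in for VectorIndexConfig; defaults are assumed
    index_type: IndexType = IndexType.HNSW
    distance: Optional[str] = None
    accuracy: Optional[int] = None
    hnsw_neighbors: Optional[int] = None
    hnsw_efconstruction: Optional[int] = None
    ivf_neighbor_partitions: Optional[int] = None
    ivf_sample_per_partition: Optional[int] = None
    ivf_min_vectors_per_partition: Optional[int] = None
    parallel: Optional[int] = None


SQL_PARAM_NAME = {
    "hnsw_neighbors": "neighbors",
    "hnsw_efconstruction": "efconstruction",
    "ivf_neighbor_partitions": "neighbor partitions",
    "ivf_sample_per_partition": "sample_per_partition",
    "ivf_min_vectors_per_partition": "min_vectors_per_partition",
}


def build_vector_index_config(config: IndexConfig) -> str:
    parts = []
    if config.index_type is IndexType.HNSW:
        parts.append("ORGANIZATION INMEMORY NEIGHBOR GRAPH")
    elif config.index_type is IndexType.IVF:
        parts.append("ORGANIZATION NEIGHBOR PARTITIONS")
    if config.distance is not None:
        parts.append(f"DISTANCE {config.distance}")
    if config.accuracy is not None:
        parts.append(f"WITH TARGET ACCURACY {config.accuracy}")

    # Only fields whose name starts with the index type ("hnsw_" or "ivf_")
    # are emitted, so options for the other index type are silently ignored.
    parameters = [f"type {config.index_type.name}"]
    prefix = config.index_type.name.lower() + "_"
    for field in fields(config):
        if field.name.startswith(prefix):
            value = getattr(config, field.name)
            if value is not None:
                parameters.append(f"{SQL_PARAM_NAME[field.name]} {value}")
    parts.append(f"PARAMETERS ({', '.join(parameters)})")

    if config.parallel is not None:
        parts.append(f"PARALLEL {config.parallel}")
    return " ".join(parts)


default_sql = build_vector_index_config(IndexConfig())
hnsw_sql = build_vector_index_config(
    IndexConfig(
        distance="COSINE",
        accuracy=90,
        hnsw_neighbors=5,
        hnsw_efconstruction=20,
        parallel=10,
    )
)
ivf_sql = build_vector_index_config(
    IndexConfig(
        index_type=IndexType.IVF,
        distance="DOT",
        accuracy=90,
        ivf_neighbor_partitions=5,
    )
)
```

Deriving the field prefix from the index type name keeps the per-type options out of the main control flow: adding a new ``hnsw_`` or ``ivf_`` field only requires a new entry in the name mapping.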
def define_constraint_cascades(self, constraint):
text = ""
if constraint.ondelete is not None:
@@ -1503,6 +1804,9 @@ class OracleDDLCompiler(compiler.DDLCompiler):
            text += "UNIQUE "
        if index.dialect_options["oracle"]["bitmap"]:
            text += "BITMAP "
        vector_options = index.dialect_options["oracle"]["vector"]
        if vector_options:
            text += "VECTOR "
        text += "INDEX %s ON %s (%s)" % (
            self._prepared_index_name(index, include_schema=True),
            preparer.format_table(index.table, use_schema=True),
@@ -1520,6 +1824,11 @@ class OracleDDLCompiler(compiler.DDLCompiler):
                text += " COMPRESS %d" % (
                    index.dialect_options["oracle"]["compress"]
                )
        if vector_options:
            if vector_options is True:
                vector_options = VectorIndexConfig()

            text += " " + self._build_vector_index_config(vector_options)
        return text

def post_create_table(self, table):
@@ -1670,7 +1979,16 @@ class OracleDialect(default.DefaultDialect):
"tablespace": None,
},
),
(sa_schema.Index, {"bitmap": False, "compress": False}),
(
sa_schema.Index,
{
"bitmap": False,
"compress": False,
"vector": False,
},
),
(sa_selectable.Select, {"fetch_approximate": False}),
(sa_selectable.CompoundSelect, {"fetch_approximate": False}),
]
@util.deprecated_params(