Snowflake

Qdrant supports working with Snowflake text embedding models. You can find all the available models on HuggingFace.

Setting up the Qdrant and Snowflake models

# Python (qdrant-client with FastEmbed)
from qdrant_client import QdrantClient
from fastembed import TextEmbedding

# In-memory Qdrant instance, convenient for local experiments
qclient = QdrantClient(":memory:")
embedding_model = TextEmbedding("snowflake/snowflake-arctic-embed-s")

texts = [
    "Qdrant is the best vector search engine!",
    "Loved by Enterprises and everyone building for low latency, high performance, and scale.",
]

// JavaScript (@qdrant/js-client-rest with @xenova/transformers)
import { QdrantClient } from '@qdrant/js-client-rest';
import { pipeline } from '@xenova/transformers';

// Connects to a Qdrant instance running locally on the default REST port
const client = new QdrantClient({ url: 'http://localhost:6333' });

const extractor = await pipeline('feature-extraction', 'Snowflake/snowflake-arctic-embed-s');

const texts = [
    "Qdrant is the best vector search engine!",
    "Loved by Enterprises and everyone building for low latency, high performance, and scale.",
];

The following example shows how to embed documents with the snowflake-arctic-embed-s model, which generates sentence embeddings of size 384.

Embedding documents

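# Python: embed() returns a generator of NumPy arrays, one per input text
embeddings = embedding_model.embed(texts)

// JavaScript: CLS pooling with normalized output vectors
const embeddings = await extractor(texts, { normalize: true, pooling: 'cls' });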
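
The JavaScript call above passes `pooling: 'cls'` and `normalize: true`. As a rough illustration of what those two options mean, the sketch below takes the vector of the first (`[CLS]`) token as the sentence embedding and scales it to unit length. The function names and the toy 4-dimensional token vectors are hypothetical and chosen for readability; the real model produces 384-dimensional vectors, and the library performs these steps internally.

```python
import math


def cls_pool(token_vectors):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_vectors[0])


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Hypothetical per-token outputs for one sentence (3 tokens, 4 dimensions).
token_vectors = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_vectors))
print(sentence_embedding)
```

Because the result has unit length, comparing two such embeddings by dot product gives the same ranking as comparing them by cosine similarity.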

Converting the model outputs to Qdrant points

# Python
from qdrant_client.models import PointStruct

points = [
    PointStruct(
        id=idx,
        # Convert each NumPy array to a plain list of floats
        vector=embedding.tolist(),
        payload={"text": text},
    )
    for idx, (embedding, text) in enumerate(zip(embeddings, texts))
]

// JavaScript: tolist() turns the output tensor into nested arrays, one per text
let points = embeddings.tolist().map((embedding, i) => {
    return {
        id: i,
        vector: embedding,
        payload: {
            text: texts[i]
        }
    };
});
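
Before uploading, it can help to confirm that every vector has the size the collection will expect, since Qdrant rejects points whose vector dimension does not match the collection. The sketch below is a minimal, library-free version of the conversion step that adds this check. The helper name `build_points` and the plain-dict point format are assumptions made for illustration, and it expects the embeddings as a list (for example `list(embeddings)`), not a generator.

```python
EXPECTED_SIZE = 384  # embedding size stated for snowflake-arctic-embed-s


def build_points(embeddings, texts, expected_size=EXPECTED_SIZE):
    """Pair each embedding with its text and verify the vector size."""
    if len(embeddings) != len(texts):
        raise ValueError("embeddings and texts must have the same length")

    points = []
    for idx, (embedding, text) in enumerate(zip(embeddings, texts)):
        vector = [float(x) for x in embedding]
        if len(vector) != expected_size:
            raise ValueError(
                f"point {idx}: expected {expected_size} dimensions, got {len(vector)}"
            )
        points.append({"id": idx, "vector": vector, "payload": {"text": text}})
    return points


# Toy input: two placeholder vectors of the expected size.
demo_points = build_points([[0.0] * 384, [1.0] * 384], ["first", "second"])
print(len(demo_points), demo_points[0]["payload"])
```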

Creating a collection to insert the documents

# Python
from qdrant_client.models import VectorParams, Distance

COLLECTION_NAME = "example_collection"

# The vector size must match the model's embedding size (384)
qclient.create_collection(
    COLLECTION_NAME,
    vectors_config=VectorParams(
        size=384,
        distance=Distance.COSINE,
    ),
)
qclient.upsert(COLLECTION_NAME, points)

// JavaScript
const COLLECTION_NAME = "example_collection";

// The vector size must match the model's embedding size (384)
await client.createCollection(COLLECTION_NAME, {
    vectors: {
        size: 384,
        distance: 'Cosine',
    }
});

// wait: true makes the call return only after the points are stored
await client.upsert(COLLECTION_NAME, {
    wait: true,
    points
});
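
The collection above is configured with the cosine distance metric. As a small, library-free illustration of what that metric measures, the sketch below computes the cosine similarity of two vectors directly. The function name and the 2-dimensional example vectors are illustrative assumptions; this is not how Qdrant computes scores internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
print(same_direction, orthogonal, opposite)
```

The score depends only on direction, not on vector length, which is why vectors pointing the same way score 1 regardless of magnitude.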

Searching for documents with Qdrant

Once the documents are added, you can search for the most relevant documents.

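# Python: query_embed() returns a generator, so take its first (only) item
query_embedding = next(embedding_model.query_embed("What is the best to use for vector search scaling?"))

qclient.search(
    collection_name=COLLECTION_NAME,
    query_vector=query_embedding,
)

// JavaScript: embed the query with the same pooling and normalization as the documents
const query_embedding = await extractor("What is the best to use for vector search scaling?", {
    normalize: true,
    pooling: 'cls'
});

// tolist() returns one array per input text, so take the first
await client.search(COLLECTION_NAME, {
    vector: query_embedding.tolist()[0],
});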
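
Conceptually, the search scores stored vectors against the query vector and returns the best matches. The sketch below shows that idea as a brute-force ranking over a few toy points, assuming unit-length vectors so that the dot product equals cosine similarity. The `top_k` helper, the 2-dimensional vectors, and the payload texts are hypothetical; Qdrant itself uses an approximate HNSW index and does not scan every point like this.

```python
def top_k(query, points, k=2):
    """Rank points by dot product with the query and return the k best."""
    scored = [
        (sum(q * v for q, v in zip(query, point["vector"])), point)
        for point in points
    ]
    # Highest score first; the key avoids comparing the point dicts on ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"id": point["id"], "score": score, "payload": point["payload"]}
        for score, point in scored[:k]
    ]


# Toy unit-length vectors standing in for real 384-dimensional embeddings.
toy_points = [
    {"id": 0, "vector": [1.0, 0.0], "payload": {"text": "doc A"}},
    {"id": 1, "vector": [0.0, 1.0], "payload": {"text": "doc B"}},
    {"id": 2, "vector": [0.6, 0.8], "payload": {"text": "doc C"}},
]

results = top_k([0.8, 0.6], toy_points, k=2)
for hit in results:
    print(hit["id"], round(hit["score"], 2), hit["payload"]["text"])
```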