负载

Qdrant 的一个重要特性是能够存储与向量关联的额外信息。在 Qdrant 的术语中,这些信息被称为 payload(有效载荷)。

Qdrant 允许你存储任何可以使用 JSON 表示的信息。

以下是一个典型有效载荷的示例

{
    "name": "jacket",
    "colors": ["red", "blue"],
    "count": 10,
    "price": 11.99,
    "locations": [
        {
            "lon": 52.5200, 
            "lat": 13.4050
        }
    ],
    "reviews": [
        {
            "user": "alice",
            "score": 4
        },
        {
            "user": "bob",
            "score": 5
        }
    ]
}

有效载荷类型

除了存储有效载荷外,Qdrant 还允许你根据特定类型的值进行搜索。此功能作为搜索过程中的额外过滤器来实现,使你能够在语义相似度的基础上结合自定义逻辑。

在过滤期间,Qdrant 会检查与过滤条件类型匹配的值。如果存储的值类型不符合过滤条件,则该条件将被视为不满足。

例如,如果你对字符串数据应用 范围条件 (range condition),你将得到一个空结果。

然而,数组(同一类型的多个值)的处理方式略有不同。当我们对数组应用过滤器时,只要数组中至少有一个值满足条件,过滤就会成功。

有关过滤过程的详细讨论,请参阅 过滤 (Filtering) 一节。

让我们看看 Qdrant 支持搜索的数据类型

整数 (Integer)

integer - 64 位整数,范围从 -92233720368547758089223372036854775807

单个和多个 integer 值的示例

{
    "count": 10,
    "sizes": [35, 36, 38]
}

浮点数 (Float)

float - 64 位浮点数。

单个和多个 float 值的示例

{
    "price": 11.99,
    "ratings": [9.1, 9.2, 9.4]
}

布尔值 (Bool)

Bool - 二进制值。等于 truefalse

单个和多个 bool 值的示例

{
    "is_delivered": true,
    "responses": [false, false, true, false]
}

关键字 (Keyword)

keyword - 字符串值。

单个和多个 keyword 值的示例

{
    "name": "Alice",
    "friends": [
        "bob",
        "eva",
        "jack"
    ]
}

地理位置

geo 用于表示地理坐标。

单个和多个 geo 值的示例

{
    "location": {
        "lon": 52.5200,
        "lat": 13.4050
    },
    "cities": [
        {
            "lon": 51.5072,
            "lat": 0.1276
        },
        {
            "lon": 40.7128,
            "lat": 74.0060
        }
    ]
}

坐标应描述为一个包含两个字段的对象:lon(经度)和 lat(纬度)。

日期时间 (Datetime)

自 v1.8.0 起可用

datetime - RFC 3339 格式的日期和时间。

请参阅以下单个和多个 datetime 值的示例

{
    "created_at": "2023-02-08T10:49:00Z",
    "updated_at": [
        "2023-02-08T13:52:00Z",
        "2023-02-21T21:23:00Z"
    ]
}

支持以下格式

  • "2023-02-08T10:49:00Z" (RFC 3339, UTC)
  • "2023-02-08T11:49:00+01:00" (RFC 3339, 带时区)
  • "2023-02-08T10:49:00" (无时区,默认视为 UTC)
  • "2023-02-08T10:49" (无时区和秒)
  • "2023-02-08" (仅日期,默认视为午夜)

关于格式的说明

  • T 可以替换为空格。
  • TZ 符号不区分大小写。
  • 未指定时区时,始终假设为 UTC。
  • 时区可以采用以下格式:±HH:MM±HHMM±HHZ
  • 秒数最多可保留 6 位小数,因此 datetime 的最高精度为微秒。

UUID

自 v1.11.0 起可用

除了基本的 keyword 类型外,Qdrant 还支持 uuid 类型用于存储 UUID 值。在功能上,它与 keyword 相同,但在内部存储已解析的 UUID 值。

{
    "uuid": "550e8400-e29b-41d4-a716-446655440000",
    "uuids": [
        "550e8400-e29b-41d4-a716-446655440000",
        "550e8400-e29b-41d4-a716-446655440001"
    ]
}

UUID 的字符串表示(例如 550e8400-e29b-41d4-a716-446655440000)占用 36 字节。但当使用数字表示时,仅占用 128 位(16 字节)。

建议在有效载荷密集的集合中使用 uuid 索引类型,以节省内存并提高搜索性能。

创建带有有效载荷的点

REST API (Schema)

PUT /collections/{collection_name}/points
{
    "points": [
        {
            "id": 1,
            "vector": [0.05, 0.61, 0.76, 0.74],
            "payload": {"city": "Berlin", "price": 1.99}
        },
        {
            "id": 2,
            "vector": [0.19, 0.81, 0.75, 0.11],
            "payload": {"city": ["Berlin", "London"], "price": 1.99}
        },
        {
            "id": 3,
            "vector": [0.36, 0.55, 0.47, 0.94],
            "payload": {"city": ["Berlin", "Moscow"], "price": [1.99, 2.99]}
        }
    ]
}
from qdrant_client import QdrantClient, models

client = QdrantClient(url="https://:6333")

client.upsert(
    collection_name="{collection_name}",
    points=[
        models.PointStruct(
            id=1,
            vector=[0.05, 0.61, 0.76, 0.74],
            payload={
                "city": "Berlin",
                "price": 1.99,
            },
        ),
        models.PointStruct(
            id=2,
            vector=[0.19, 0.81, 0.75, 0.11],
            payload={
                "city": ["Berlin", "London"],
                "price": 1.99,
            },
        ),
        models.PointStruct(
            id=3,
            vector=[0.36, 0.55, 0.47, 0.94],
            payload={
                "city": ["Berlin", "Moscow"],
                "price": [1.99, 2.99],
            },
        ),
    ],
)
import { QdrantClient } from "@qdrant/js-client-rest";

const client = new QdrantClient({ host: "localhost", port: 6333 });

client.upsert("{collection_name}", {
  points: [
    {
      id: 1,
      vector: [0.05, 0.61, 0.76, 0.74],
      payload: {
        city: "Berlin",
        price: 1.99,
      },
    },
    {
      id: 2,
      vector: [0.19, 0.81, 0.75, 0.11],
      payload: {
        city: ["Berlin", "London"],
        price: 1.99,
      },
    },
    {
      id: 3,
      vector: [0.36, 0.55, 0.47, 0.94],
      payload: {
        city: ["Berlin", "Moscow"],
        price: [1.99, 2.99],
      },
    },
  ],
});
use qdrant_client::qdrant::{PointStruct, UpsertPointsBuilder};
use qdrant_client::{Payload, Qdrant};
use serde_json::json;

let client = Qdrant::from_url("https://:6334").build()?;

let points = vec![
    PointStruct::new(
        1,
        vec![0.05, 0.61, 0.76, 0.74],
        Payload::try_from(json!({"city": "Berlin", "price": 1.99})).unwrap(),
    ),
    PointStruct::new(
        2,
        vec![0.19, 0.81, 0.75, 0.11],
        Payload::try_from(json!({"city": ["Berlin", "London"]})).unwrap(),
    ),
    PointStruct::new(
        3,
        vec![0.36, 0.55, 0.47, 0.94],
        Payload::try_from(json!({"city": ["Berlin", "Moscow"], "price": [1.99, 2.99]}))
            .unwrap(),
    ),
];

client
    .upsert_points(UpsertPointsBuilder::new("{collection_name}", points).wait(true))
    .await?;
import static io.qdrant.client.PointIdFactory.id;
import static io.qdrant.client.ValueFactory.list;
import static io.qdrant.client.ValueFactory.value;
import static io.qdrant.client.VectorsFactory.vectors;

import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.PointStruct;
import java.util.List;
import java.util.Map;

QdrantClient client =
    new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());

client
    .upsertAsync(
        "{collection_name}",
        List.of(
            PointStruct.newBuilder()
                .setId(id(1))
                .setVectors(vectors(0.05f, 0.61f, 0.76f, 0.74f))
                .putAllPayload(Map.of("city", value("Berlin"), "price", value(1.99)))
                .build(),
            PointStruct.newBuilder()
                .setId(id(2))
                .setVectors(vectors(0.19f, 0.81f, 0.75f, 0.11f))
                .putAllPayload(
                    Map.of("city", list(List.of(value("Berlin"), value("London")))))
                .build(),
            PointStruct.newBuilder()
                .setId(id(3))
                .setVectors(vectors(0.36f, 0.55f, 0.47f, 0.94f))
                .putAllPayload(
                    Map.of(
                        "city",
                        list(List.of(value("Berlin"), value("London"))),
                        "price",
                        list(List.of(value(1.99), value(2.99)))))
                .build()))
    .get();
using Qdrant.Client;
using Qdrant.Client.Grpc;

var client = new QdrantClient("localhost", 6334);

await client.UpsertAsync(
    collectionName: "{collection_name}",
    points: new List<PointStruct>
    {
        new PointStruct
        {
            Id = 1,
            Vectors = new[] { 0.05f, 0.61f, 0.76f, 0.74f },
            Payload = { ["city"] = "Berlin", ["price"] = 1.99 }
        },
        new PointStruct
        {
            Id = 2,
            Vectors = new[] { 0.19f, 0.81f, 0.75f, 0.11f },
            Payload = { ["city"] = new[] { "Berlin", "London" } }
        },
        new PointStruct
        {
            Id = 3,
            Vectors = new[] { 0.36f, 0.55f, 0.47f, 0.94f },
            Payload =
            {
                ["city"] = new[] { "Berlin", "Moscow" },
                ["price"] = new Value[] { 1.99, 2.99 }
            }
        }
    }
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.Upsert(context.Background(), &qdrant.UpsertPoints{
    CollectionName: "{collection_name}",
    Points: []*qdrant.PointStruct{
        {
            Id:      qdrant.NewIDNum(1),
            Vectors: qdrant.NewVectors(0.05, 0.61, 0.76, 0.74),
            Payload: qdrant.NewValueMap(map[string]any{
                "city": "Berlin", "price": 1.99}),
        },
        {
            Id:      qdrant.NewIDNum(2),
            Vectors: qdrant.NewVectors(0.19, 0.81, 0.75, 0.11),
            Payload: qdrant.NewValueMap(map[string]any{
                "city": []any{"Berlin", "London"}}),
        },
        {
            Id:      qdrant.NewIDNum(3),
            Vectors: qdrant.NewVectors(0.36, 0.55, 0.47, 0.94),
            Payload: qdrant.NewValueMap(map[string]any{
                "city":  []any{"Berlin", "London"},
                "price": []any{1.99, 2.99}}),
        },
    },
})

更新有效载荷

在 Qdrant 中更新有效载荷提供了管理向量元数据的灵活方法。设置有效载荷 (set payload) 方法用于更新特定字段,同时保持其他字段不变;而 覆盖 (overwrite) 方法则替换整个有效载荷。开发人员还可以使用 清除有效载荷 (clear payload) 来删除所有元数据,或通过删除字段来移除特定键而不影响其余部分。这些选项为适应动态数据集提供了精确控制。

设置有效载荷

仅设置点上的给定有效载荷值。

REST API (Schema)

POST /collections/{collection_name}/points/payload
{
    "payload": {
        "property1": "string",
        "property2": "string"
    },
    "points": [
        0, 3, 100
    ]
}
client.set_payload(
    collection_name="{collection_name}",
    payload={
        "property1": "string",
        "property2": "string",
    },
    points=[0, 3, 10],
)
client.setPayload("{collection_name}", {
  payload: {
    property1: "string",
    property2: "string",
  },
  points: [0, 3, 10],
});
use qdrant_client::qdrant::{
    PointsIdsList, SetPayloadPointsBuilder,
};
use qdrant_client::Payload;
use serde_json::json;

client
    .set_payload(
        SetPayloadPointsBuilder::new(
            "{collection_name}",
            Payload::try_from(json!({
                "property1": "string",
                "property2": "string",
            }))
            .unwrap(),
        )
        .points_selector(PointsIdsList {
            ids: vec![0.into(), 3.into(), 10.into()],
        })
        .wait(true),
    )
    .await?;
import static io.qdrant.client.PointIdFactory.id;
import static io.qdrant.client.ValueFactory.value;

import java.util.List;
import java.util.Map;

client
    .setPayloadAsync(
        "{collection_name}",
        Map.of("property1", value("string"), "property2", value("string")),
        List.of(id(0), id(3), id(10)),
        true,
        null,
        null)
    .get();
using Qdrant.Client;
using Qdrant.Client.Grpc;

var client = new QdrantClient("localhost", 6334);

await client.SetPayloadAsync(
    collectionName: "{collection_name}",
    payload: new Dictionary<string, Value> { { "property1", "string" }, { "property2", "string" } },
    ids: new ulong[] { 0, 3, 10 }
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.SetPayload(context.Background(), &qdrant.SetPayloadPoints{
    CollectionName: "{collection_name}",
    Payload: qdrant.NewValueMap(
        map[string]any{"property1": "string", "property2": "string"}),
    PointsSelector: qdrant.NewPointsSelector(
        qdrant.NewIDNum(0),
        qdrant.NewIDNum(3)),
})

你不需要知道想要修改的点的 ID。另一种选择是使用过滤器。

POST /collections/{collection_name}/points/payload
{
    "payload": {
        "property1": "string",
        "property2": "string"
    },
    "filter": {
        "must": [
            {
                "key": "color",
                "match": {
                    "value": "red"
                }
            }
        ]
    }
}
client.set_payload(
    collection_name="{collection_name}",
    payload={
        "property1": "string",
        "property2": "string",
    },
    points=models.Filter(
        must=[
            models.FieldCondition(
                key="color",
                match=models.MatchValue(value="red"),
            ),
        ],
    ),
)
client.setPayload("{collection_name}", {
  payload: {
    property1: "string",
    property2: "string",
  },
  filter: {
    must: [
      {
        key: "color",
        match: {
          value: "red",
        },
      },
    ],
  },
});
use qdrant_client::qdrant::{Condition, Filter, SetPayloadPointsBuilder};
use qdrant_client::Payload;
use serde_json::json;

client
    .set_payload(
        SetPayloadPointsBuilder::new(
            "{collection_name}",
            Payload::try_from(json!({
                "property1": "string",
                "property2": "string",
            }))
            .unwrap(),
        )
        .points_selector(Filter::must([Condition::matches(
            "color",
            "red".to_string(),
        )]))
        .wait(true),
    )
    .await?;
import static io.qdrant.client.ConditionFactory.matchKeyword;
import static io.qdrant.client.ValueFactory.value;

import io.qdrant.client.grpc.Common.Filter;
import java.util.Map;

client
    .setPayloadAsync(
        "{collection_name}",
        Map.of("property1", value("string"), "property2", value("string")),
        Filter.newBuilder().addMust(matchKeyword("color", "red")).build(),
        true,
        null,
        null)
    .get();
using Qdrant.Client;
using Qdrant.Client.Grpc;
using static Qdrant.Client.Grpc.Conditions;

var client = new QdrantClient("localhost", 6334);

await client.SetPayloadAsync(
    collectionName: "{collection_name}",
    payload: new Dictionary<string, Value> { { "property1", "string" }, { "property2", "string" } },
    filter: MatchKeyword("color", "red")
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.SetPayload(context.Background(), &qdrant.SetPayloadPoints{
    CollectionName: "{collection_name}",
    Payload: qdrant.NewValueMap(
        map[string]any{"property1": "string", "property2": "string"}),
    PointsSelector: qdrant.NewPointsSelectorFilter(&qdrant.Filter{
        Must: []*qdrant.Condition{
            qdrant.NewMatch("color", "red"),
        },
    }),
})

自 v1.8.0 起可用

可以使用 key 参数仅修改有效载荷的特定键。

例如,假设点上存在以下有效载荷 JSON 对象

{
    "property1": {
        "nested_property": "foo",
    },
    "property2": {
        "nested_property": "bar",
    }
}

你可以通过以下请求修改 property1nested_property

POST /collections/{collection_name}/points/payload
{
    "payload": {
        "nested_property": "qux",
    },
    "key": "property1",
    "points": [1]
}

从而得到以下有效载荷

{
    "property1": {
        "nested_property": "qux",
    },
    "property2": {
        "nested_property": "bar",
    }
}

覆盖有效载荷

用给定的有效载荷完全替换现有的有效载荷。

REST API (Schema)

PUT /collections/{collection_name}/points/payload
{
    "payload": {
        "property1": "string",
        "property2": "string"
    },
    "points": [
        0, 3, 100
    ]
}
client.overwrite_payload(
    collection_name="{collection_name}",
    payload={
        "property1": "string",
        "property2": "string",
    },
    points=[0, 3, 10],
)
client.overwritePayload("{collection_name}", {
  payload: {
    property1: "string",
    property2: "string",
  },
  points: [0, 3, 10],
});
use qdrant_client::qdrant::{PointsIdsList, SetPayloadPointsBuilder};
use qdrant_client::Payload;
use serde_json::json;

client
    .overwrite_payload(
        SetPayloadPointsBuilder::new(
            "{collection_name}",
            Payload::try_from(json!({
                "property1": "string",
                "property2": "string",
            }))
            .unwrap(),
        )
        .points_selector(PointsIdsList {
            ids: vec![0.into(), 3.into(), 10.into()],
        })
        .wait(true),
    )
    .await?;
import static io.qdrant.client.PointIdFactory.id;
import static io.qdrant.client.ValueFactory.value;

import java.util.List;
import java.util.Map;

client
    .overwritePayloadAsync(
        "{collection_name}",
        Map.of("property1", value("string"), "property2", value("string")),
        List.of(id(0), id(3), id(10)),
        true,
        null,
        null)
    .get();
using Qdrant.Client;
using Qdrant.Client.Grpc;

var client = new QdrantClient("localhost", 6334);

await client.OverwritePayloadAsync(
    collectionName: "{collection_name}",
    payload: new Dictionary<string, Value> { { "property1", "string" }, { "property2", "string" } },
    ids: new ulong[] { 0, 3, 10 }
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.OverwritePayload(context.Background(), &qdrant.SetPayloadPoints{
    CollectionName: "{collection_name}",
    Payload: qdrant.NewValueMap(
        map[string]any{"property1": "string", "property2": "string"}),
    PointsSelector: qdrant.NewPointsSelector(
        qdrant.NewIDNum(0),
        qdrant.NewIDNum(3)),
})

设置有效载荷 一样,你不需要知道要修改点的 ID。另一种选择是使用过滤器。

清除有效载荷

此方法从指定点删除所有有效载荷键。

REST API (Schema)

POST /collections/{collection_name}/points/payload/clear
{
    "points": [0, 3, 100]
}
client.clear_payload(
    collection_name="{collection_name}",
    points_selector=[0, 3, 100],
)
client.clearPayload("{collection_name}", {
  points: [0, 3, 100],
});
use qdrant_client::qdrant::{ClearPayloadPointsBuilder, PointsIdsList};

client
    .clear_payload(
        ClearPayloadPointsBuilder::new("{collection_name}")
            .points(PointsIdsList {
                ids: vec![0.into(), 3.into(), 10.into()],
            })
            .wait(true),
    )
    .await?;
import static io.qdrant.client.PointIdFactory.id;

import java.util.List;

client
    .clearPayloadAsync("{collection_name}", List.of(id(0), id(3), id(100)), true, null, null)
    .get();
using Qdrant.Client;

var client = new QdrantClient("localhost", 6334);

await client.ClearPayloadAsync(collectionName: "{collection_name}", ids: new ulong[] { 0, 3, 100 });
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client.ClearPayload(context.Background(), &qdrant.ClearPayloadPoints{
    CollectionName: "{collection_name}",
    Points: qdrant.NewPointsSelector(
        qdrant.NewIDNum(0),
        qdrant.NewIDNum(3)),
})

删除有效载荷键

从点中删除特定的有效载荷键。

REST API (Schema)

POST /collections/{collection_name}/points/payload/delete
{
    "keys": ["color", "price"],
    "points": [0, 3, 100]
}
client.delete_payload(
    collection_name="{collection_name}",
    keys=["color", "price"],
    points=[0, 3, 100],
)
client.deletePayload("{collection_name}", {
  keys: ["color", "price"],
  points: [0, 3, 100],
});
use qdrant_client::qdrant::{DeletePayloadPointsBuilder, PointsIdsList};

client
    .delete_payload(
        DeletePayloadPointsBuilder::new(
            "{collection_name}",
            vec!["color".to_string(), "price".to_string()],
        )
        .points_selector(PointsIdsList {
            ids: vec![0.into(), 3.into(), 10.into()],
        })
        .wait(true),
    )
    .await?;
import static io.qdrant.client.PointIdFactory.id;

import java.util.List;

client
    .deletePayloadAsync(
        "{collection_name}",
        List.of("color", "price"),
        List.of(id(0), id(3), id(100)),
        true,
        null,
        null)
    .get();
using Qdrant.Client;

var client = new QdrantClient("localhost", 6334);

await client.DeletePayloadAsync(
    collectionName: "{collection_name}",
    keys: ["color", "price"],
    ids: new ulong[] { 0, 3, 100 }
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.DeletePayload(context.Background(), &qdrant.DeletePayloadPoints{
    CollectionName: "{collection_name}",
    Keys:           []string{"color", "price"},
    PointsSelector: qdrant.NewPointsSelector(
        qdrant.NewIDNum(0),
        qdrant.NewIDNum(3)),
})

或者,你可以使用过滤器从点中删除有效载荷键。

POST /collections/{collection_name}/points/payload/delete
{
    "keys": ["color", "price"],
    "filter": {
        "must": [
            {
                "key": "color",
                "match": {
                    "value": "red"
                }
            }
        ]
    }
}
client.delete_payload(
    collection_name="{collection_name}",
    keys=["color", "price"],
    points=models.Filter(
        must=[
            models.FieldCondition(
                key="color",
                match=models.MatchValue(value="red"),
            ),
        ],
    ),
)
client.deletePayload("{collection_name}", {
  keys: ["color", "price"],
  filter: {
    must: [
      {
        key: "color",
        match: {
          value: "red",
        },
      },
    ],
  },
});
use qdrant_client::qdrant::{Condition, DeletePayloadPointsBuilder, Filter};

client
    .delete_payload(
        DeletePayloadPointsBuilder::new(
            "{collection_name}",
            vec!["color".to_string(), "price".to_string()],
        )
        .points_selector(Filter::must([Condition::matches(
            "color",
            "red".to_string(),
        )]))
        .wait(true),
    )
    .await?;
import static io.qdrant.client.ConditionFactory.matchKeyword;

import io.qdrant.client.grpc.Common.Filter;
import java.util.List;

client
    .deletePayloadAsync(
        "{collection_name}",
        List.of("color", "price"),
        Filter.newBuilder().addMust(matchKeyword("color", "red")).build(),
        true,
        null,
        null)
    .get();
using Qdrant.Client;
using static Qdrant.Client.Grpc.Conditions;

var client = new QdrantClient("localhost", 6334);

await client.DeletePayloadAsync(
    collectionName: "{collection_name}",
    keys: ["color", "price"],
    filter: MatchKeyword("color", "red")
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.DeletePayload(context.Background(), &qdrant.DeletePayloadPoints{
    CollectionName: "{collection_name}",
    Keys:           []string{"color", "price"},
    PointsSelector: qdrant.NewPointsSelectorFilter(
        &qdrant.Filter{
            Must: []*qdrant.Condition{qdrant.NewMatch("color", "red")},
        },
    ),
})

有效载荷索引

为了更有效地进行带过滤器的搜索,Qdrant 允许你通过指定字段名称和类型来为有效载荷字段创建索引。

已索引的字段也会影响向量索引。详见 索引 (Indexing)

在实践中,我们建议对那些最有可能限制结果范围的字段创建索引。例如,使用对象 ID 的索引将比颜色索引高效得多,因为 ID 对每条记录都是唯一的,而颜色只有几种可能的值。

在涉及多个字段的复合查询中,Qdrant 将尝试首先使用限制性最强的索引。

要为字段创建索引,可以使用以下方法

REST API (Schema)

PUT /collections/{collection_name}/index
{
    "field_name": "name_of_the_field_to_index",
    "field_schema": "keyword"
}
client.create_payload_index(
    collection_name="{collection_name}",
    field_name="name_of_the_field_to_index",
    field_schema=models.PayloadSchemaType.KEYWORD,
)
client.createPayloadIndex("{collection_name}", {
  field_name: "name_of_the_field_to_index",
  field_schema: "keyword",
});
use qdrant_client::qdrant::{CreateFieldIndexCollectionBuilder, FieldType};

client
    .create_field_index(
        CreateFieldIndexCollectionBuilder::new(
            "{collection_name}",
            "name_of_the_field_to_index",
            FieldType::Keyword,
        )
        .wait(true),
    )
    .await?;
import io.qdrant.client.grpc.Collections.PayloadSchemaType;

client.createPayloadIndexAsync(
    "{collection_name}",
    "name_of_the_field_to_index",
    PayloadSchemaType.Keyword,
    null,
    true,
    null,
    null);
using Qdrant.Client;

var client = new QdrantClient("localhost", 6334);

await client.CreatePayloadIndexAsync(
    collectionName: "{collection_name}",
    fieldName: "name_of_the_field_to_index"
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

client.CreateFieldIndex(context.Background(), &qdrant.CreateFieldIndexCollection{
    CollectionName: "{collection_name}",
    FieldName:      "name_of_the_field_to_index",
    FieldType:      qdrant.FieldType_FieldTypeKeyword.Enum(),
})

索引使用标志会显示在 集合信息 API (collection info API) 的有效载荷模式中。

有效载荷模式示例

{
    "payload_schema": {
        "property1": {
            "data_type": "keyword"
        },
        "property2": {
            "data_type": "integer"
        }
    }
}

分面统计 (Facet counts)

v1.12.0 及更高版本可用

分面 (Faceting) 是一种特殊的计数技术,可用于多种目的

  • 了解有效载荷键存在哪些唯一值。
  • 了解包含每个唯一值的点数。
  • 了解通过匹配特定值,过滤器的限制程度如何。

具体而言,它是针对字段中值的计数聚合,类似于 SQL 中的 GROUP BYCOUNT(*) 命令。

特定字段的这些结果称为“分面 (facet)”。例如,当你查看电子商务搜索结果页面时,你可能会在侧边栏看到品牌列表,显示每个品牌的商品数量。这就是 "brand" 字段的一个分面。

要获取字段的分面统计,可以使用以下方法

REST API (Facet)

POST /collections/{collection_name}/facet
{
    "key": "size",
    "filter": {
      "must": {
        "key": "color",
        "match": { "value": "red" }
      }
    }
}
from qdrant_client import QdrantClient, models

client = QdrantClient(url="https://:6333")

client.facet(
    collection_name="{collection_name}",
    key="size",
    facet_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="color",
                match=models.MatchValue(value="red"),
            )
        ]
    ),
)
import { QdrantClient } from "@qdrant/js-client-rest";

const client = new QdrantClient({ host: "localhost", port: 6333 });

client.facet("{collection_name}", {
    filter: {
        must: [
            {
                key: "color",
                match: {
                    value: "red",
                },
            },
        ],
    },
    key: "size",
});
use qdrant_client::qdrant::{Condition, FacetCountsBuilder, Filter};
use qdrant_client::Qdrant;

let client = Qdrant::from_url("https://:6334").build()?;

client
    .facet(
         FacetCountsBuilder::new("{collection_name}", "size")
             .limit(10)
             .filter(Filter::must(vec![Condition::matches(
                 "color",
                 "red".to_string(),
             )])),
     )
     .await?;
import static io.qdrant.client.ConditionFactory.matchKeyword;

import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Common.Filter;
import io.qdrant.client.grpc.Points;

QdrantClient client = new QdrantClient(
                QdrantGrpcClient.newBuilder("localhost", 6334, false).build());

client
    .facetAsync(
        Points.FacetCounts.newBuilder()
            .setCollectionName("{collection_name}")
            .setKey("size")
            .setFilter(Filter.newBuilder().addMust(matchKeyword("color", "red")).build())
            .build())
        .get();
using Qdrant.Client;
using static Qdrant.Client.Grpc.Conditions;

var client = new QdrantClient("localhost", 6334);

await client.FacetAsync(
    "{collection_name}",
    key: "size",
    filter: MatchKeyword("color", "red")
);
import (
    "context"

    "github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
    Host: "localhost",
    Port: 6334,
})

res, err := client.Facet(context.Background(), &qdrant.FacetCounts{
    CollectionName: "{collection_name}",
    Key:            "size",
        Filter: &qdrant.Filter{
        Must: []*qdrant.Condition{
            qdrant.NewMatch("color", "red"),
        },
    },
})

响应将包含字段中每个唯一值的计数

{
  "response": {
    "hits": [
      {"value": "L", "count": 19},
      {"value": "S", "count": 10},
      {"value": "M", "count": 5},
      {"value": "XL", "count": 1},
      {"value": "XXL", "count": 1}
    ]
  },
  "time": 0.0001
}

结果按计数降序排列,然后按值升序排列。仅返回计数非零的值。

默认情况下,Qdrant 计算每个值计数的方式是近似值,以实现快速响应。对于大多数情况,这已经足够准确,但如果你需要调试存储,可以使用 exact 参数来获取精确计数。

POST /collections/{collection_name}/facet
{
    "key": "size",
    "exact": true
}
client.facet(
    collection_name="{collection_name}",
    key="size",
    exact=True,
)
client.facet("{collection_name}", {
    key: "size",
    exact: true,
});
use qdrant_client::qdrant::FacetCountsBuilder;

client
    .facet(
         FacetCountsBuilder::new("{collection_name}", "size")
             .limit(10)
             .exact(true),
     )
     .await?;
import io.qdrant.client.grpc.Points.FacetCounts;

client
      .facetAsync(
          FacetCounts.newBuilder()
              .setCollectionName("{collection_name}")
              .setKey("foo")
              .setExact(true)
              .build())
      .get();
using Qdrant.Client;
using Qdrant.Client.Grpc;

await client.FacetAsync(
    "{collection_name}",
    key: "size",
    exact: true
);
res, err := client.Facet(context.Background(), &qdrant.FacetCounts{
    CollectionName: "{collection_name}",
    Key:            "key",
    Exact:          qdrant.PtrOf(true),
})
此页面有用吗?

感谢您的反馈!🙏

对此感到遗憾。😔 你可以在 GitHub 上 编辑 此页面,或 创建 一个 GitHub 问题。