Implementing a Custom Connector for Cohere RAG

Time: 45 min	Level: Intermediate

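The usual approach to Retrieval Augmented Generation (RAG) requires users to build a prompt with the relevant context the LLM may rely on and to send it to the model manually. Cohere takes a different approach: its models can now talk to external tools and extract meaningful data on their own. You can connect virtually any data source and let the Cohere LLM know how to access it. Vector search pairs naturally with LLMs, and enabling semantic search over your data is a typical use case.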

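Cohere RAG has many interesting features, such as inline citations, which point to the specific parts of the documents used to generate the response.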

Cohere RAG citations

Source: https://docs.cohere.com/docs/retrieval-augmented-generation-rag

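Connectors have to implement a specific interface and expose the data source as an HTTP REST API. The Cohere documentation describes the general process of creating a connector. This tutorial guides you step by step through building such a service around Qdrant.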

Qdrant connector

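You probably already have some collections you would like to bring to the LLM. Maybe your pipeline was set up with a popular library such as Langchain, Llama Index, or Haystack. Cohere connectors can even implement more complex logic, such as hybrid search. In our case, we will start with a fresh Qdrant collection, index data using Cohere Embed v3, build the connector, and finally connect it to the Command-R model.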

Building the collection

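First, let's build a collection and configure it for the Cohere embed-multilingual-v3.0 model. It produces 1024-dimensional embeddings, and we can choose any of the distance metrics available in Qdrant. Our connector will act as a personal assistant for a software engineer, and it will expose our notes to suggest priorities or actions to perform.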

from qdrant_client import QdrantClient, models

client = QdrantClient(
    "https://my-cluster.cloud.qdrant.io:6333", 
    api_key="my-api-key",
)
client.create_collection(
    collection_name="personal-notes",
    vectors_config=models.VectorParams(
        size=1024,
        distance=models.Distance.DOT,
    ),
)

Our notes will be represented as simple JSON objects with the title and text of a specific note. The embeddings will be created from the text field only.

notes = [
    {
        "title": "Project Alpha Review",
        "text": "Review the current progress of Project Alpha, focusing on the integration of the new API. Check for any compatibility issues with the existing system and document the steps needed to resolve them. Schedule a meeting with the development team to discuss the timeline and any potential roadblocks."
    },
    {
        "title": "Learning Path Update",
        "text": "Update the learning path document with the latest courses on React and Node.js from Pluralsight. Schedule at least 2 hours weekly to dedicate to these courses. Aim to complete the React course by the end of the month and the Node.js course by mid-next month."
    },
    {
        "title": "Weekly Team Meeting Agenda",
        "text": "Prepare the agenda for the weekly team meeting. Include the following topics: project updates, review of the sprint backlog, discussion on the new feature requests, and a brainstorming session for improving remote work practices. Send out the agenda and the Zoom link by Thursday afternoon."
    },
    {
        "title": "Code Review Process Improvement",
        "text": "Analyze the current code review process to identify inefficiencies. Consider adopting a new tool that integrates with our version control system. Explore options such as GitHub Actions for automating parts of the process. Draft a proposal with recommendations and share it with the team for feedback."
    },
    {
        "title": "Cloud Migration Strategy",
        "text": "Draft a plan for migrating our current on-premise infrastructure to the cloud. The plan should cover the selection of a cloud provider, cost analysis, and a phased migration approach. Identify critical applications for the first phase and any potential risks or challenges. Schedule a meeting with the IT department to discuss the plan."
    },
    {
        "title": "Quarterly Goals Review",
        "text": "Review the progress towards the quarterly goals. Update the documentation to reflect any completed objectives and outline steps for any remaining goals. Schedule individual meetings with team members to discuss their contributions and any support they might need to achieve their targets."
    },
    {
        "title": "Personal Development Plan",
        "text": "Reflect on the past quarter's achievements and areas for improvement. Update the personal development plan to include new technical skills to learn, certifications to pursue, and networking events to attend. Set realistic timelines and check-in points to monitor progress."
    },
    {
        "title": "End-of-Year Performance Reviews",
        "text": "Start preparing for the end-of-year performance reviews. Collect feedback from peers and managers, review project contributions, and document achievements. Consider areas for improvement and set goals for the next year. Schedule preliminary discussions with each team member to gather their self-assessments."
    },
    {
        "title": "Technology Stack Evaluation",
        "text": "Conduct an evaluation of our current technology stack to identify any outdated technologies or tools that could be replaced for better performance and productivity. Research emerging technologies that might benefit our projects. Prepare a report with findings and recommendations to present to the management team."
    },
    {
        "title": "Team Building Event Planning",
        "text": "Plan a team-building event for the next quarter. Consider activities that can be done remotely, such as virtual escape rooms or online game nights. Survey the team for their preferences and availability. Draft a budget proposal for the event and submit it for approval."
    }
]

Storing the embeddings along with their metadata is fairly simple.

import cohere
import uuid

cohere_client = cohere.Client(api_key="my-cohere-api-key")

response = cohere_client.embed(
    texts=[
        note.get("text")
        for note in notes
    ],
    model="embed-multilingual-v3.0",
    input_type="search_document",
)

client.upload_points(
    collection_name="personal-notes",
    points=[
        models.PointStruct(
            id=uuid.uuid4().hex,
            vector=embedding,
            payload=note,
        )
        for note, embedding in zip(notes, response.embeddings)
    ]
)

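Our collection is now ready to be searched. In the real world, the set of notes changes over time, so the ingestion process won't be this straightforward. The data is not yet exposed to the LLM; we will build the connector in the next step.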
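One way to cope with notes that change over time is to derive each point ID from something stable, so that re-uploading an edited note overwrites the existing point instead of adding a duplicate. The sketch below is only an illustration: it assumes note titles are unique, and both NOTES_NAMESPACE and note_point_id are hypothetical names that do not appear in the tutorial.

```python
import uuid

# Hypothetical namespace for this collection; any fixed UUID would do.
NOTES_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "personal-notes")


def note_point_id(note: dict) -> str:
    """Derive a deterministic point ID from the note title.

    Assumption: titles are unique, so the same note always maps to the same ID.
    """
    return str(uuid.uuid5(NOTES_NAMESPACE, note["title"]))


first = note_point_id({"title": "Cloud Migration Strategy", "text": "old text"})
second = note_point_id({"title": "Cloud Migration Strategy", "text": "new text"})
print(first == second)  # True
```

Passing such an ID instead of uuid.uuid4().hex when building each PointStruct would make repeated uploads idempotent for a given title.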

Connector web service

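FastAPI is a modern web framework and a good fit for a simple HTTP API. We will use it to build our connector. As the model requires, there will be just one endpoint. It accepts POST requests at the /search path and requires a single query parameter. Let's define a corresponding model.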

from pydantic import BaseModel

class SearchQuery(BaseModel):
    query: str

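A RAG connector does not have to return documents in any specific format. There are some recommended best practices, but the Cohere models are quite flexible here. Results just have to be returned as JSON, with a list of objects in the results property of the output. We will use the same document structure as the Qdrant payloads, so no conversion is required. That calls for two more models.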

from typing import List

class Document(BaseModel):
    title: str
    text: str

class SearchResults(BaseModel):
    results: List[Document]
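To make the expected shape concrete, here is a minimal sketch that builds the same envelope with plain dictionaries and the standard json module. The build_response helper and the sample document text are made up for illustration; the actual service relies on the Pydantic models above.

```python
import json
from typing import Dict, List


def build_response(payloads: List[Dict[str, str]]) -> str:
    """Wrap document payloads in the `results` envelope and serialize to JSON."""
    return json.dumps({"results": payloads})


# Sample payload invented for this example.
body = build_response(
    [{"title": "Example note", "text": "Example text of the note."}]
)
print(body)
```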

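With the model classes ready, we can implement the logic that takes a query and returns the relevant notes. Note that the LLM does not define the number of documents to return. How many documents you bring into the context is entirely up to you.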
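Because that number is your decision, it may be convenient to make it configurable rather than fixed. The following sketch reads it from an environment variable; the variable name CONNECTOR_RESULT_LIMIT and the result_limit helper are assumptions introduced here, and the default of 2 mirrors the value used by the endpoint in this tutorial.

```python
import os
from typing import Mapping

DEFAULT_LIMIT = 2  # the endpoint in this tutorial returns two documents


def result_limit(env: Mapping[str, str] = os.environ) -> int:
    """Return the number of documents to retrieve per query.

    Falls back to DEFAULT_LIMIT when the (hypothetical) variable is
    missing or not an integer, and never returns less than 1.
    """
    raw = env.get("CONNECTOR_RESULT_LIMIT")
    if raw is None:
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_LIMIT
    return max(1, value)
```

The returned value could then be passed as the limit argument of the Qdrant query.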

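We also need to interact with two services: the Qdrant server and the Cohere API. FastAPI has a concept of dependency injection, and we will use it to provide both clients to the implementation.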

For queries, we need to set input_type to search_query when calling the Cohere API.

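import cohere
from fastapi import FastAPI, Depends
from qdrant_client import QdrantClient
from typing import Annotated

# Assumption: a local config.py module defines QDRANT_URL, QDRANT_API_KEY, and COHERE_API_KEY.
import config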

app = FastAPI()

def client() -> QdrantClient:
    return QdrantClient(config.QDRANT_URL, api_key=config.QDRANT_API_KEY)

def cohere_client() -> cohere.Client:
    return cohere.Client(api_key=config.COHERE_API_KEY)

@app.post("/search")
def search(
    query: SearchQuery,
    client: Annotated[QdrantClient, Depends(client)],
    cohere_client: Annotated[cohere.Client, Depends(cohere_client)],
) -> SearchResults:
    response = cohere_client.embed(
        texts=[query.query],
        model="embed-multilingual-v3.0",
        input_type="search_query",
    )
    results = client.query_points(
        collection_name="personal-notes",
        query=response.embeddings[0],
        limit=2,
    ).points
    return SearchResults(
        results=[
            Document(**point.payload)
            for point in results
        ]
    )
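The ingestion code and the endpoint call the same embedding model and differ only in input_type. A pair of small helpers can keep that distinction in one place. This is a sketch, not part of the original service: embed_documents and embed_query are hypothetical names, and the co argument is assumed to be any object with an embed method that accepts texts, model, and input_type, as cohere.Client does in the snippets above.

```python
from typing import Any, List, Sequence

EMBED_MODEL = "embed-multilingual-v3.0"


def embed_documents(co: Any, texts: Sequence[str]) -> List[List[float]]:
    """Embed texts that will be stored in the collection."""
    response = co.embed(
        texts=list(texts),
        model=EMBED_MODEL,
        input_type="search_document",
    )
    return response.embeddings


def embed_query(co: Any, query: str) -> List[float]:
    """Embed a single search query."""
    response = co.embed(
        texts=[query],
        model=EMBED_MODEL,
        input_type="search_query",
    )
    return response.embeddings[0]
```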

If the uvicorn server is installed, we can launch the application locally for development:

uvicorn main:app

FastAPI exposes interactive documentation at http://localhost:8000/docs, where we can test our endpoints. The /search endpoint is listed there.

FastAPI documentation

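We can interact with it and check which documents are returned for a specific query. For example, we want to recall what we are supposed to do about the infrastructure of the project.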

curl -X "POST" \
    -H "Content-type: application/json" \
    -d '{"query": "Is there anything I have to do regarding the project infrastructure?"}' \
    "http://localhost:8000/search"

The output should look like this:

{
  "results": [
    {
      "title": "Cloud Migration Strategy",
      "text": "Draft a plan for migrating our current on-premise infrastructure to the cloud. The plan should cover the selection of a cloud provider, cost analysis, and a phased migration approach. Identify critical applications for the first phase and any potential risks or challenges. Schedule a meeting with the IT department to discuss the plan."
    },
    {
      "title": "Project Alpha Review",
      "text": "Review the current progress of Project Alpha, focusing on the integration of the new API. Check for any compatibility issues with the existing system and document the steps needed to resolve them. Schedule a meeting with the development team to discuss the timeline and any potential roadblocks."
    }
  ]
}
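The same request can also be sent from Python, which is handy for quick smoke tests. The sketch below uses only the standard library; build_search_request and search are hypothetical helper names, and the base URL is assumed to be the local development address used above.

```python
import json
import urllib.request


def build_search_request(base_url: str, query: str) -> urllib.request.Request:
    """Build the POST request the connector expects on /search."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url: str, query: str) -> dict:
    """Send the request and decode the JSON response (requires a running service)."""
    with urllib.request.urlopen(build_search_request(base_url, query)) as resp:
        return json.load(resp)
```

With the service running, search("http://localhost:8000", "Is there anything I have to do regarding the project infrastructure?") would return the parsed response as a dictionary.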

Connecting to Command-R

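Our web service is implemented, yet it only runs on our local machine. It has to be exposed to the public network before Command-R can interact with it. For a quick experiment, setting up a tunnel with a service such as ngrok may be enough. This tutorial does not cover all the details, but the ngrok Quickstart is a great resource that describes the process step by step. Alternatively, you can deploy the service at a public URL.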

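Once that is done, we can create the connector and then tell the model to use it when interacting through the chat API. Creating a connector is a single call to the Cohere client: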

connector_response = cohere_client.connectors.create(
    name="personal-notes",
    url="https://this-is-my-domain.app/search",
)

connector_response.connector is a descriptor, with id being one of its attributes. We will use that identifier in our interactions, like this:

response = cohere_client.chat(
    message=(
        "Is there anything I have to do regarding the project infrastructure? "
        "Please mention the tasks briefly."
    ),
    connectors=[
        cohere.ChatConnector(id=connector_response.connector.id)
    ],
    model="command-r",
)

We changed the model to command-r, as this is currently the best Cohere model available to the public. response.text is the output of the model:

Here are some of the tasks related to project infrastructure that you might have to perform:
- You need to draft a plan for migrating your on-premise infrastructure to the cloud and come up with a plan for the selection of a cloud provider, cost analysis, and a gradual migration approach.
- It's important to evaluate your current technology stack to identify any outdated technologies. You should also research emerging technologies and the benefits they could bring to your projects.
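The inline citations mentioned at the beginning can be rendered on top of this text. The sketch below assumes each citation carries start and end character offsets plus a list of document_ids, which is how the Cohere chat API documents its citations as far as we know; treat those field names as an assumption and check the response of your SDK version. The annotate_citations helper is hypothetical and takes plain dictionaries, so SDK objects would need converting first.

```python
from typing import Iterable, Mapping


def annotate_citations(text: str, citations: Iterable[Mapping]) -> str:
    """Insert a [doc_id, ...] marker after each cited span of the text."""
    annotated = text
    # Process from the end so earlier offsets stay valid after each insertion.
    for citation in sorted(citations, key=lambda c: c["end"], reverse=True):
        marker = " [" + ", ".join(citation["document_ids"]) + "]"
        end = citation["end"]
        annotated = annotated[:end] + marker + annotated[end:]
    return annotated


# Made-up example data, not an actual API response.
example = annotate_citations(
    "Draft a plan for the cloud migration.",
    [{"start": 0, "end": 12, "document_ids": ["doc_0"]}],
)
print(example)  # Draft a plan [doc_0] for the cloud migration.
```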

You only need to create a specific connector once! Please do not call cohere_client.connectors.create for every message you send to the chat method.
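One way to avoid accidental duplicates is to look for an existing connector by name before creating a new one. The sketch below assumes that the client exposes connectors.list() and that its result has a connectors attribute whose items carry name and id; verify this against your SDK version. get_or_create_connector_id is a hypothetical helper, and it only inspects the first page of results.

```python
from typing import Any


def get_or_create_connector_id(co: Any, name: str, url: str) -> str:
    """Return the ID of the connector called `name`, creating it if needed.

    Assumption: `co.connectors.list()` returns an object with a `connectors`
    list; pagination is ignored in this sketch.
    """
    for connector in co.connectors.list().connectors:
        if connector.name == name:
            return connector.id
    created = co.connectors.create(name=name, url=url)
    return created.connector.id
```

The returned identifier can be stored in configuration and passed to cohere.ChatConnector on every chat call.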

Wrap up

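We have built a Cohere RAG connector that integrates with your existing knowledge base stored in Qdrant. We covered only the basic flow; in a real-world scenario you should also consider, for example, building an authentication system to prevent unauthorized access.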
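As a starting point for such protection, the service could require a shared secret on every request. The sketch below checks a bearer token in constant time using only the standard library. It assumes requests arrive with an "Authorization: Bearer <token>" header; is_authorized is a hypothetical helper, and wiring it into the FastAPI endpoint and into the connector registration is left out.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an `Authorization: Bearer <token>` header against a shared secret."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(
        token.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```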
