Pandas-AI

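Pandas-AI is a Python library that uses a generative AI model to interpret natural-language queries and translate them into Python code that interacts with pandas DataFrames, then returns the final result to the user.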

Installation

pip install "pandasai[qdrant]"

Usage

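You can start a conversation by instantiating an Agent on a pandas DataFrame. The default Pandas-AI LLM requires an API key.

A list of all supported LLMs is available in the Pandas-AI documentation.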

import os
import pandas as pd
from pandasai import Agent

# Sample DataFrame
sales_by_country = pd.DataFrame(
    {
        "country": [
            "United States",
            "United Kingdom",
            "France",
            "Germany",
            "Italy",
            "Spain",
            "Canada",
            "Australia",
            "Japan",
            "China",
        ],
        "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    }
)

os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"

agent = Agent(sales_by_country)
agent.chat("Which are the top 5 countries by sales?")
# Expected output (LLM responses may vary): China, United States, Japan, Germany, United Kingdom
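
As a sanity check on the question above, the same ranking can be computed directly in pandas. This is only a sketch of the kind of code such a query reduces to, not what Pandas-AI actually generates or runs internally; the variable name `top5` is introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the usage example
sales_by_country = pd.DataFrame(
    {
        "country": [
            "United States", "United Kingdom", "France", "Germany", "Italy",
            "Spain", "Canada", "Australia", "Japan", "China",
        ],
        "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    }
)

# Pick the five rows with the largest sales, in descending order
top5 = sales_by_country.nlargest(5, "sales")["country"].tolist()
print(top5)
# ['China', 'United States', 'Japan', 'Germany', 'United Kingdom']
```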

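Qdrant support

You can train Pandas-AI to better understand your data and improve the quality of its results.

Qdrant can be configured as a vector store that ingests the training data and retrieves semantically relevant content.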
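
To make the idea of "ingest, then retrieve semantically relevant content" concrete, here is a toy, dependency-free sketch. It is not how Qdrant or Pandas-AI is implemented: word counts stand in for a real embedding model, a Python list stands in for the Qdrant collection, and the names `embed`, `cosine` and `ToyVectorStore` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a real setup uses an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store: ingest documents, retrieve by similarity."""

    def __init__(self):
        self._items = []  # list of (vector, original text)

    def add(self, doc):
        self._items.append((embed(doc), doc))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


store = ToyVectorStore()
store.add("The fiscal year starts in April")
store.add("Sales figures are reported in US dollars")

# The stored note that shares the most vocabulary with the question ranks first
best_match = store.search("When does the fiscal year begin?")
print(best_match)
```

A real embedding model matches on meaning rather than shared words, which is why the configuration below names a sentence-transformers model.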

from pandasai import Agent
from pandasai.ee.vectorstores.qdrant import Qdrant

qdrant = Qdrant(
    collection_name="<SOME_COLLECTION>",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    url="http://localhost:6333",
    grpc_port=6334,
    prefer_grpc=True
)

# df is your own pandas DataFrame (the training code below assumes 'date' and 'sales' columns)
agent = Agent(df, vector_store=qdrant)

# Train with custom information
agent.train(docs="The fiscal year starts in April")

# Train with question/answer pairs of code snippets
query = "What are the total sales for the current fiscal year?"
response = """
import pandas as pd

df = dfs[0]

# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

# The model will use the information provided during training to generate a response
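
One caveat about the training snippet above: it uses April 1 of the current calendar year as the cutoff, so for dates in January through March that cutoff lies in the future. A minimal sketch of a helper that handles this case follows; the name `fiscal_year_start` and its `start_month` parameter are hypothetical and not part of Pandas-AI.

```python
from datetime import date


def fiscal_year_start(today, start_month=4):
    """Return the first day of the fiscal year that contains `today`.

    Assumes the fiscal year starts on day 1 of `start_month` (April by default).
    """
    # Before the start month, the current fiscal year began in the previous calendar year
    year = today.year if today.month >= start_month else today.year - 1
    return date(year, start_month, 1)


march_start = fiscal_year_start(date(2024, 3, 15))  # falls in the fiscal year that began in 2023
june_start = fiscal_year_start(date(2024, 6, 1))    # falls in the fiscal year that began in 2024
print(march_start, june_start)
# 2023-04-01 2024-04-01
```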

Further reading
