Pandas-AI
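Pandas-AI is a Python library that uses a generative AI model to interpret natural language queries and translate them into Python code that interacts with pandas DataFrames, then returns the final result to the user.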
Installation
pip install pandasai[qdrant]
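Usage
To start a conversation, instantiate an Agent on top of a pandas DataFrame. The default Pandas-AI LLM requires an API key.
See the Pandas-AI documentation for the list of all supported LLMs.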
import os
import pandas as pd
from pandasai import Agent
# Sample DataFrame
sales_by_country = pd.DataFrame(
    {
        "country": [
            "United States",
            "United Kingdom",
            "France",
            "Germany",
            "Italy",
            "Spain",
            "Canada",
            "Australia",
            "Japan",
            "China",
        ],
        "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    }
)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent(sales_by_country)
agent.chat("Which are the top 5 countries by sales?")
# OUTPUT: China, United States, Japan, Germany, United Kingdom
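For reference, the question above corresponds to a simple pandas operation. The following is a minimal sketch of the kind of code the agent might produce for it; the code Pandas-AI actually generates depends on the LLM and may differ. It is handy as a sanity check on the agent's answer.

```python
import pandas as pd

# Same sample data as in the example above
sales_by_country = pd.DataFrame(
    {
        "country": [
            "United States", "United Kingdom", "France", "Germany", "Italy",
            "Spain", "Canada", "Australia", "Japan", "China",
        ],
        "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    }
)

# Select the five rows with the largest "sales" values, highest first
top5 = sales_by_country.nlargest(5, "sales")["country"].tolist()
print(top5)
# ['China', 'United States', 'Japan', 'Germany', 'United Kingdom']
```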
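Qdrant support
You can train Pandas-AI to understand your data better and improve the quality of its results.
Qdrant can be configured as the vector store that ingests the training data and retrieves semantically relevant content.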
from pandasai.ee.vectorstores.qdrant import Qdrant

qdrant = Qdrant(
    collection_name="<SOME_COLLECTION>",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    url="http://localhost:6333",
    grpc_port=6334,
    prefer_grpc=True,
)
# df is the pandas DataFrame you want to query
agent = Agent(df, vector_store=qdrant)
# Train with custom information
agent.train(docs="The fiscal year starts in April")
# Train with Q/A pairs of queries and code snippets
query = "What are the total sales for the current fiscal year?"
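response = """
import pandas as pd
df = dfs[0]
# Find the start of the current fiscal year (April 1); before April it falls in the previous calendar year
today = pd.Timestamp.today().normalize()
fy_start = pd.Timestamp(year=today.year if today.month >= 4 else today.year - 1, month=4, day=1)
# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= fy_start]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""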
agent.train(queries=[query], codes=[response])
# The model will use the information provided in the training to generate a response
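A fiscal year that starts in April begins in the previous calendar year for any date from January through March, which is easy to get wrong in training snippets. Below is a minimal sketch of that rule using only the standard library. `fiscal_year_start` is a hypothetical helper written for illustration and is not part of Pandas-AI or Qdrant.

```python
from datetime import date
from typing import Optional


def fiscal_year_start(today: Optional[date] = None, start_month: int = 4) -> date:
    """Return the first day of the fiscal year that contains `today`.

    Hypothetical helper: assumes the fiscal year starts on day 1 of `start_month`.
    """
    today = today or date.today()
    # Dates before the start month belong to the fiscal year that began last year
    year = today.year if today.month >= start_month else today.year - 1
    return date(year, start_month, 1)


print(fiscal_year_start(date(2024, 2, 15)))  # 2023-04-01
print(fiscal_year_start(date(2024, 5, 1)))   # 2024-04-01
```

The same rule can be embedded in the code string passed to `agent.train(queries=..., codes=...)` so that retrieved examples handle the January to March case correctly.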