Swiftide

Swiftide is a Rust library for building LLM applications. It supports everything from simple prompt completion to fast, streaming indexing and query pipelines, as well as composable agents that use tools or call other agents.

High-level features

  • Simple primitives for common LLM tasks
  • Streaming indexing and query pipelines
  • Composable agents and pipelines
  • A modular, extensible API with minimal abstraction
  • Integrations with popular LLM and storage providers
  • Built-in pipeline transformations (or bring your own)
  • Graph-like workflows with tasks
  • Langfuse support

Installation

Install Swiftide with support for Qdrant, OpenAI, and Redis:

cargo add swiftide --features=qdrant,openai,redis

Note that Swiftide is barebones by default, so you need to enable a feature flag for each integration you want to use.
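The command above corresponds to a Cargo.toml entry along these lines (the version number is illustrative; pin to the latest release):

```toml
[dependencies]
# Only the listed integrations are compiled in.
swiftide = { version = "0.x", features = ["qdrant", "openai", "redis"] }
```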


Indexing example (step by step)

This example indexes .rs files using Swiftide, with Qdrant as the vector store.

use swiftide::{
    indexing,
    indexing::loaders::FileLoader,
    indexing::transformers::{ChunkCode, Embed, MetadataQACode},
    integrations::{self, qdrant::Qdrant, redis::Redis},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    tracing_subscriber::fmt::init();

    // 1. Set up OpenAI client for embedding and prompt models
    let openai_client = integrations::openai::OpenAI::builder()
        .default_embed_model("text-embedding-3-small")
        .default_prompt_model("gpt-3.5-turbo")
        .build()?;

    // 2. Set up Redis for caching which files/chunks are already processed
    let redis_url = std::env::var("REDIS_URL")
        .as_deref()
        .unwrap_or("redis://localhost:6379")
        .to_owned();

    indexing::Pipeline::from_loader(FileLoader::new(".").with_extensions(&["rs"]))
        // 3. Skip files/chunks already indexed (cached in Redis)
        .filter_cached(Redis::try_from_url(redis_url, "swiftide-examples")?)
        // 4. Generate metadata Q&A for code chunks, using an LLM
        .then(MetadataQACode::new(openai_client.clone()))
        // 5. Split code into chunks suitable for embedding
        .then_chunk(ChunkCode::try_for_language_and_chunk_size("rust", 10..2048)?)
        // 6. Embed code + metadata in batches
        .then_in_batch(Embed::new(openai_client.clone()).with_batch_size(10))
        // 7. Store results in a Qdrant collection
        .then_store_with(
            Qdrant::builder()
                .batch_size(50)
                .vector_size(1536)
                .collection_name("swiftide-examples")
                .build()?,
        )
        // 8. Run the pipeline asynchronously
        .run()
        .await?;
    Ok(())
}
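The REDIS_URL lookup in step 2 uses a handy std-only pattern: `Result::as_deref` turns the `Result<String, VarError>` into a `Result<&str, &VarError>`, so a string literal can serve as the fallback before a single `to_owned()` allocates the final `String`. A self-contained sketch (the variable name is hypothetical):

```rust
use std::env;

// Read an env var, falling back to a default if it is unset or invalid UTF-8.
fn url_from_env(key: &str, default: &str) -> String {
    env::var(key)
        .as_deref()
        .unwrap_or(default)
        .to_owned()
}

fn main() {
    // SWIFTIDE_DOC_UNSET_XYZ is almost certainly unset, so the default is used.
    let url = url_from_env("SWIFTIDE_DOC_UNSET_XYZ", "redis://localhost:6379");
    assert_eq!(url, "redis://localhost:6379");
    println!("{url}");
}
```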

Hybrid search example

Below is a simplified workflow for hybrid dense/sparse search using Qdrant.

use swiftide::{
    indexing::{
        self, EmbeddedField,
        loaders::FileLoader,
        transformers::{self, ChunkCode, MetadataQACode},
    },
    integrations::{fastembed::FastEmbed, openai, qdrant::Qdrant},
    query::{self, answers, query_transformers, search_strategies::HybridSearch},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    tracing_subscriber::fmt::init();

    // 1. Create fastembed (dense/sparse) clients
    let batch_size = 64;
    let fastembed_sparse = FastEmbed::try_default_sparse().unwrap().to_owned();
    let fastembed = FastEmbed::try_default().unwrap().to_owned();

    // 2. Use a compact OpenAI prompt model for metadata Q&A generation
    let openai = openai::OpenAI::builder()
        .default_prompt_model("gpt-4o-mini")
        .build()
        .unwrap();

    // 3. Set up Qdrant for both dense and sparse vectors
    let qdrant = Qdrant::builder()
        .batch_size(batch_size)
        .vector_size(384)
        .with_vector(EmbeddedField::Combined)
        .with_sparse_vector(EmbeddedField::Combined)
        .collection_name("swiftide-hybrid-example")
        .build()?;

    indexing::Pipeline::from_loader(FileLoader::new("swiftide-core/").with_extensions(&["rs"]))
        .then_chunk(ChunkCode::try_for_language_and_chunk_size("rust", 10..2048)?)
        .then(MetadataQACode::from_client(openai.clone()).build().unwrap())
        .then_in_batch(transformers::SparseEmbed::new(fastembed_sparse.clone()).with_batch_size(batch_size))
        .then_in_batch(transformers::Embed::new(fastembed.clone()).with_batch_size(batch_size))
        .then_store_with(qdrant.clone())
        .run()
        .await?;

    // 4. Run a hybrid search pipeline
    let openai = openai::OpenAI::builder()
        .default_prompt_model("gpt-4o")
        .build()
        .unwrap();

    let query_pipeline = query::Pipeline::from_search_strategy(
        HybridSearch::default()
            .with_top_n(20)
            .with_top_k(20)
            .to_owned(),
    )
    .then_transform_query(query_transformers::GenerateSubquestions::from_client(openai.clone()))
    .then_transform_query(query_transformers::Embed::from_client(fastembed.clone()))
    .then_transform_query(query_transformers::SparseEmbed::from_client(fastembed_sparse.clone()))
    .then_retrieve(qdrant.clone())
    .then_answer(answers::Simple::from_client(openai.clone()));

    let answer = query_pipeline
        .query("What are the different pipelines in Swiftide and how do they work?")
        .await
        .unwrap();

    println!("{}", answer.answer());
    Ok(())
}
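Hybrid search merges the dense and sparse result lists on the Qdrant side. As a rough illustration of how such a fusion can work (not Swiftide's or Qdrant's actual implementation), reciprocal rank fusion scores each document by summing 1/(k + rank) over the lists it appears in, so documents ranked well in both lists rise to the top:

```rust
use std::collections::HashMap;

// Reciprocal rank fusion: combine several ranked lists of document ids.
// `k` dampens the influence of top ranks; 60 is a common choice.
fn rrf(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            // Ranks are 0-based here, so add 1 before applying the formula.
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut ranked: Vec<_> = scores.into_iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    ranked
}

fn main() {
    let dense = vec!["a", "b", "c"];
    let sparse = vec!["b", "a", "d"];
    let fused = rrf(&[dense, sparse], 60.0);
    // "a" and "b" each appear in both lists, so they outrank "c" and "d".
    assert!(fused[0].0 == "a" || fused[0].0 == "b");
    assert_eq!(fused.len(), 4);
    println!("{fused:?}");
}
```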
