STORM

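STORM is an LLM system that writes Wikipedia-like articles from scratch based on Internet search. Co-STORM extends it by letting humans collaborate with the LLM system, which supports information seeking and knowledge curation that is better aligned with the user's preferences. The system cannot produce publication-ready articles, which often require a significant number of edits, but experienced Wikipedia editors have found it helpful in their pre-writing stage.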

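Key Features

Pre-writing stage: The system conducts Internet-based research to collect references and generates an outline.
Writing stage: The system uses the outline and references to generate a full-length article with citations.
Perspective-Guided Question Asking: Given the input topic, STORM discovers different perspectives by surveying existing articles on similar topics and uses them to control the question-asking process.
Simulated Conversation: STORM simulates a conversation between a Wikipedia writer and a topic expert grounded in Internet sources, which lets the language model update its understanding of the topic and ask follow-up questions.
Co-STORM LLM experts: This type of agent generates answers grounded on external knowledge sources and/or raises follow-up questions based on the discourse history.
Moderator: This agent generates thought-provoking questions inspired by information that the retriever discovered but that was not directly used in previous turns. Question generation can also be grounded!
Human user: The human user takes the initiative to either (1) observe the discourse to gain a deeper understanding of the topic, or (2) actively engage in the conversation by injecting utterances to steer the discussion focus.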
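The perspective-guided questioning and simulated conversation described above can be sketched as a simple loop. This is a minimal illustration, not STORM's implementation: `simulate_conversation`, `research`, `toy_ask`, and `toy_answer` are hypothetical names, and the two toy functions stand in for a language model and a search-grounded expert so that the loop runs offline.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]  # (question, answer)


def simulate_conversation(
    topic: str,
    perspective: str,
    ask: Callable[[str, str, List[Turn]], str],
    answer: Callable[[str], str],
    max_turns: int = 3,
) -> List[Turn]:
    """Alternate between a perspective-conditioned asker and a grounded answerer."""
    history: List[Turn] = []
    for _ in range(max_turns):
        question = ask(topic, perspective, history)
        if not question:  # the asker signals it has nothing more to ask
            break
        history.append((question, answer(question)))
    return history


def toy_ask(topic: str, perspective: str, history: List[Turn]) -> str:
    # Hypothetical stand-in for an LM: two canned questions, then stop.
    if len(history) >= 2:
        return ""
    return f"[{perspective}] question {len(history) + 1} about {topic}"


def toy_answer(question: str) -> str:
    # Hypothetical stand-in for an expert that answers from retrieved sources.
    return f"answer to: {question}"


def research(topic: str, perspectives: List[str]) -> Dict[str, List[Turn]]:
    """Run one simulated conversation per perspective and collect all turns."""
    return {
        p: simulate_conversation(topic, p, toy_ask, toy_answer) for p in perspectives
    }


if __name__ == "__main__":
    for perspective, turns in research("solar power", ["economist", "engineer"]).items():
        for q, a in turns:
            print(q, "->", a)
```

Each perspective gets its own conversation, so the collected question-answer pairs cover the topic from several angles before any outline is written.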

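Installation

To install the knowledge storm library, use:

pip install knowledge-storm

You can also install from source, which lets you modify the behavior of the STORM engine directly.

Clone the git repository.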

git clone https://github.com/stanford-oval/storm.git
cd storm

Install the required packages.

conda create -n storm python=3.11
conda activate storm
pip install -r requirements.txt

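API

The package currently supports:

Language model components: all language models supported by litellm
Embedding model components: all embedding models supported by litellm
Retrieval module components: YouRM, BingSearch, VectorRM, SerperRM, BraveRM, SearXNG, DuckDuckGoSearchRM, TavilySearchRM, GoogleSearch, and AzureAISearch

Both STORM and Co-STORM work at the information curation layer. To create their respective Runner classes, you need to set up the information retrieval module and the language model module.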

STORM

The STORM knowledge curation engine is defined as a simple Python class, STORMWikiRunner. Here is an example using the You.com search engine and OpenAI models.

import os
from knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs
from knowledge_storm.lm import LitellmModel
from knowledge_storm.rm import YouRM

lm_configs = STORMWikiLMConfigs()
openai_kwargs = {
    'api_key': os.getenv("OPENAI_API_KEY"),
    'temperature': 1.0,
    'top_p': 0.9,
}
# STORM is a LM system so different components can be powered by different models to reach a good balance between cost and quality.
# For a good practice, choose a cheaper/faster model for `conv_simulator_lm` which is used to split queries, synthesize answers in the conversation.
# Choose a more powerful model for `article_gen_lm` to generate verifiable text with citations.
gpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)
gpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)
lm_configs.set_conv_simulator_lm(gpt_35)
lm_configs.set_question_asker_lm(gpt_35)
lm_configs.set_outline_gen_lm(gpt_4)
lm_configs.set_article_gen_lm(gpt_4)
lm_configs.set_article_polish_lm(gpt_4)
# Check out the STORMWikiRunnerArguments class for more configurations.
engine_args = STORMWikiRunnerArguments(...)
rm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)
runner = STORMWikiRunner(engine_args, lm_configs, rm)

The STORMWikiRunner instance can be invoked with the simple run method:

topic = input('Topic: ')
runner.run(
    topic=topic,
    do_research=True,
    do_generate_outline=True,
    do_generate_article=True,
    do_polish_article=True,
)
runner.post_run()
runner.summary()

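do_research: if True, simulate conversations with different perspectives to collect information about the topic; otherwise, load the results.
do_generate_outline: if True, generate an outline for the topic; otherwise, load the results.
do_generate_article: if True, generate an article for the topic based on the outline and the collected information; otherwise, load the results.
do_polish_article: if True, polish the article by adding a summarization section and (optionally) removing duplicate content; otherwise, load the results.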
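The "run the stage, otherwise load the results" behavior of these flags can be illustrated with a small caching pattern. This is a sketch under stated assumptions, not the library's code: `run_or_load`, `pipeline`, and the one-JSON-file-per-stage layout are hypothetical, and the lambdas stand in for the real research and outline stages.

```python
import json
import tempfile
from pathlib import Path


def run_or_load(stage, enabled, output_dir, compute):
    """Run `compute` and cache its result when enabled; otherwise load the cache."""
    path = Path(output_dir) / f"{stage}.json"  # hypothetical file layout
    if enabled:
        result = compute()
        path.write_text(json.dumps(result), encoding="utf-8")
        return result
    if not path.exists():
        raise FileNotFoundError(f"stage '{stage}' was skipped but {path} is missing")
    return json.loads(path.read_text(encoding="utf-8"))


def pipeline(output_dir, do_research=True, do_generate_outline=True):
    """Two toy stages; the second consumes the output of the first."""
    notes = run_or_load("research", do_research, output_dir, lambda: ["fact A", "fact B"])
    outline = run_or_load(
        "outline",
        do_generate_outline,
        output_dir,
        lambda: {"sections": [f"About {n}" for n in notes]},
    )
    return notes, outline


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pipeline(tmp)  # first run computes and caches both stages
        notes, outline = pipeline(tmp, do_research=False)  # reuses cached research
        print(notes, outline)
```

The practical consequence is that a stage can only be skipped if an earlier run already left its results in the same output directory.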

Co-STORM

The Co-STORM knowledge curation engine is defined as a simple Python class, CoStormRunner. Here is an example using the Bing search engine and OpenAI models.

import os
from knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner
from knowledge_storm.lm import LitellmModel
from knowledge_storm.logging_wrapper import LoggingWrapper
from knowledge_storm.rm import BingSearch

# Co-STORM adopts the same multi LM system paradigm as STORM 
lm_config: CollaborativeStormLMConfigs = CollaborativeStormLMConfigs()
openai_kwargs = {
    "api_key": os.getenv("OPENAI_API_KEY"),
    "api_provider": "openai",
    "temperature": 1.0,
    "top_p": 0.9,
    "api_base": None,
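}
# Assumed model name; the original snippet uses gpt_4o_model_name without defining it.
gpt_4o_model_name = "gpt-4o"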
question_answering_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)
discourse_manage_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)
utterance_polishing_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=2000, **openai_kwargs)
warmstart_outline_gen_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)
question_asking_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=300, **openai_kwargs)
knowledge_base_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)

lm_config.set_question_answering_lm(question_answering_lm)
lm_config.set_discourse_manage_lm(discourse_manage_lm)
lm_config.set_utterance_polishing_lm(utterance_polishing_lm)
lm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)
lm_config.set_question_asking_lm(question_asking_lm)
lm_config.set_knowledge_base_lm(knowledge_base_lm)

# Check out the Co-STORM's RunnerArguments class for more configurations.
topic = input('Topic: ')
runner_argument = RunnerArgument(topic=topic, ...)
logging_wrapper = LoggingWrapper(lm_config)
bing_rm = BingSearch(bing_search_api_key=os.environ.get("BING_SEARCH_API_KEY"),
                     k=runner_argument.retrieve_top_k)
costorm_runner = CoStormRunner(lm_config=lm_config,
                               runner_argument=runner_argument,
                               logging_wrapper=logging_wrapper,
                               rm=bing_rm)

The CoStormRunner instance can be invoked with the warm_start() and step(...) methods.

# Warm start the system to build shared conceptual space between Co-STORM and users
costorm_runner.warm_start()

# Step through the collaborative discourse 
# Run either of the code snippets below in any order, as many times as you'd like
# To observe the conversation:
conv_turn = costorm_runner.step()
# To inject your utterance to actively steer the conversation:
costorm_runner.step(user_utterance="YOUR UTTERANCE HERE")

# Generate report based on the collaborative discourse
costorm_runner.knowledge_base.reorganize()
article = costorm_runner.generate_report()
print(article)
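The calls above can be wrapped in a small driver that observes a few turns before each user utterance is injected. This is only a sketch: `run_session` is a hypothetical helper, and `StubRunner` is an offline stand-in that mimics the method names used in the example (warm_start, step, knowledge_base.reorganize, generate_report) and returns plain strings rather than whatever the real runner returns.

```python
def run_session(runner, user_utterances, observe_turns=1):
    """Warm start, then observe some turns before injecting each user utterance."""
    runner.warm_start()
    observed = []
    for utterance in user_utterances:
        for _ in range(observe_turns):
            observed.append(runner.step())  # let the agents speak
        runner.step(user_utterance=utterance)  # steer the discussion
    runner.knowledge_base.reorganize()
    return observed, runner.generate_report()


class _StubKnowledgeBase:
    def reorganize(self):
        pass  # nothing to reorganize in the stub


class StubRunner:
    """Hypothetical stand-in so the driver can be exercised without API keys."""

    def __init__(self):
        self.knowledge_base = _StubKnowledgeBase()
        self.log = []

    def warm_start(self):
        self.log.append("warm_start")

    def step(self, user_utterance=""):
        if user_utterance:
            turn = f"user: {user_utterance}"
        else:
            turn = f"agent turn {len(self.log)}"
        self.log.append(turn)
        return turn

    def generate_report(self):
        return "\n".join(self.log)


if __name__ == "__main__":
    turns, report = run_session(StubRunner(), ["focus on cost"], observe_turns=2)
    print(report)
```

Passing a real runner instead of the stub should work as long as it exposes the same four methods.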

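Quick Start

We provide scripts in the examples folder as a quick start for running STORM and Co-STORM with different configurations.

We suggest using secrets.toml to set up the API keys. Create a file named secrets.toml under the root directory and add the following content: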

# ============ language model configurations ============ 
# Set up OpenAI API key.
OPENAI_API_KEY="your_openai_api_key"
# If you are using the API service provided by OpenAI, include the following line:
OPENAI_API_TYPE="openai"
# If you are using the API service provided by Microsoft Azure, include the following lines:
OPENAI_API_TYPE="azure"
AZURE_API_BASE="your_azure_api_base_url"
AZURE_API_VERSION="your_azure_api_version"
# ============ retriever configurations ============ 
BING_SEARCH_API_KEY="your_bing_search_api_key" # if using bing search
# ============ encoder configurations ============ 
ENCODER_API_TYPE="openai" # if using openai encoder

STORM examples

To run STORM with gpt family models with default configurations:

Run the following command.

python examples/storm_examples/run_storm_wiki_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing \
    --do-research \
    --do-generate-outline \
    --do-generate-article \
    --do-polish-article
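The command-line flags in this command correspond to the do_* arguments of the run method. A sketch of how such flags could be parsed with argparse is shown below; it is an illustration only, not the example script's actual code, and `build_parser`, `stage_flags`, and the `./results` default are assumptions.

```python
import argparse

STAGES = ("research", "generate-outline", "generate-article", "polish-article")


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a STORM-style command line.")
    parser.add_argument("--output-dir", type=str, default="./results")  # assumed default
    parser.add_argument("--retriever", type=str, required=True)
    for stage in STAGES:
        # Each stage is opt-in: the flag is False unless given on the command line.
        parser.add_argument(f"--do-{stage}", action="store_true")
    return parser


def stage_flags(argv):
    """Turn command-line arguments into keyword arguments for a run(...) call."""
    args = build_parser().parse_args(argv)
    # argparse converts dashes to underscores, e.g. --do-research -> args.do_research
    return {f"do_{s.replace('-', '_')}": getattr(args, f"do_{s.replace('-', '_')}") for s in STAGES}


if __name__ == "__main__":
    print(stage_flags(["--retriever", "bing", "--do-research", "--do-generate-outline"]))
```

Omitting a flag therefore means the corresponding stage is skipped and its results are loaded from the output directory.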

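To run STORM with your favorite language models or grounded on your own corpus, check out examples/storm_examples/README.md.

Co-STORM examples

To run Co-STORM with gpt family models with default configurations:

Add BING_SEARCH_API_KEY="xxx" and ENCODER_API_TYPE="xxx" to secrets.toml.
Run the following command.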

python examples/costorm_examples/run_costorm_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing
