Python API#

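In addition to the command-line interface, LLM provides a Python API for executing prompts.

Understanding this API is also important for writing plugins.

Basic prompt execution#

To run a prompt against the gpt-4o-mini model, run this code: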

import llm

model = llm.get_model("gpt-4o-mini")
# key= is optional, you can configure the key in other ways
response = model.prompt(
    "Five surprising names for a pet pelican",
    key="sk-..."
)
print(response.text())

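Note that the prompt is not evaluated until you call the response.text() method. This is a form of lazy loading.

If you inspect the response before it has been evaluated, it will look like this: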

<Response prompt='Your prompt' text='... not yet done ...'>
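The deferred evaluation described above can be illustrated with a small stand-alone sketch. LazyResponse and fake_model are hypothetical names invented for this illustration; this is not how llm.Response is implemented, only a minimal model of the same pattern (work happens on the first text() call and the result is reused afterwards).

```python
class LazyResponse:
    """Hypothetical stand-in illustrating deferred evaluation (not llm.Response)."""

    def __init__(self, prompt, run):
        self.prompt = prompt
        self._run = run      # callable that performs the expensive work
        self._text = None
        self._done = False

    def text(self):
        # The work happens on the first call only; later calls reuse the result.
        if not self._done:
            self._text = self._run(self.prompt)
            self._done = True
        return self._text

    def __str__(self):
        return self.text()

    def __repr__(self):
        shown = self._text if self._done else "... not yet done ..."
        return f"<LazyResponse prompt={self.prompt!r} text={shown!r}>"


calls = []


def fake_model(prompt):
    # Assumption: stands in for a real model call so the sketch runs offline.
    calls.append(prompt)
    return prompt.upper()


response = LazyResponse("Your prompt", fake_model)
print(repr(response))   # nothing has run yet
print(response.text())  # triggers the call
print(response.text())  # reuses the stored result, no second call
```
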

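The llm.get_model() function accepts a model ID or an alias. You can also omit the argument to use the currently configured default model, which is gpt-4o-mini unless you have changed the default.

In this example the key is set by the Python code. You can also provide the key using the OPENAI_API_KEY environment variable, or use the llm keys set openai command to store it in a keys.json file; see API key management.

The __str__() method of response also returns the response text, so you can do this instead: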

print(llm.get_model().prompt("Five surprising names for a pet pelican"))

You can run this command to see a list of available models and their aliases:

llm models

If you have set the OPENAI_API_KEY environment variable, you can omit the key= argument.

Calling llm.get_model() with an invalid model ID raises an llm.UnknownModelError exception.

System prompts#

For models that accept a system prompt, pass it as system="...":

response = model.prompt(
    "Five surprising names for a pet pelican",
    system="Answer like GlaDOS"
)

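Attachments#

Models that accept multi-modal input (images, audio, video, etc.) can be passed attachments using the attachments= keyword argument. This accepts a list of llm.Attachment() instances.

This example shows two attachments, one from a file path and one from a URL: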

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt(
    "Describe these images",
    attachments=[
        llm.Attachment(path="pelican.jpg"),
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
    ]
)

Use llm.Attachment(content=b"binary image content here") to pass binary content directly.

You can check which attachment types (if any) a model supports using the model.attachment_types set:

model = llm.get_model("gpt-4o-mini")
print(model.attachment_types)
# {'image/gif', 'image/png', 'image/jpeg', 'image/webp'}

if "image/jpeg" in model.attachment_types:
    # Use a JPEG attachment here
    ...
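Before attaching a local file, it can help to check that its type is one the model accepts. The sketch below guesses the MIME type from the file name with the standard library and tests it against a set shaped like model.attachment_types. The helper name supports_file is hypothetical, and the example set is copied from the gpt-4o-mini output above rather than read from a live model.

```python
import mimetypes


def supports_file(attachment_types, path):
    """Return True if the file's guessed MIME type is in attachment_types."""
    mime_type, _ = mimetypes.guess_type(path)
    return mime_type is not None and mime_type in attachment_types


# Example set copied from the gpt-4o-mini output shown above.
image_types = {"image/gif", "image/png", "image/jpeg", "image/webp"}

print(supports_file(image_types, "pelican.jpg"))
print(supports_file(image_types, "notes.txt"))
```

With a real model you would pass model.attachment_types as the first argument. Guessing from the file extension is only a heuristic; it does not inspect the file contents.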

Schema#

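As with the CLI tool, some models support passing a JSON schema that should be used for the resulting response.

You can pass this to the prompt(schema=) parameter as either a Python dictionary or a Pydantic BaseModel subclass: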

import llm, json
from pydantic import BaseModel

class Dog(BaseModel):
    name: str
    age: int

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Describe a nice dog", schema=Dog)
dog = json.loads(response.text())
print(dog)
# {"name":"Buddy","age":3}

You can also pass a schema directly, like this:

response = model.prompt("Describe a nice dog", schema={
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "age": {"title": "Age", "type": "integer"},
    },
    "required": ["name", "age"],
    "title": "Dog",
    "type": "object",
})
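Whichever form the schema takes, the response text arrives as a JSON string that your code still has to decode. The following sketch decodes it and checks that the keys listed under "required" in a dictionary schema are present. parse_structured is a hypothetical helper, not part of LLM, and it checks key presence only; it is not a full JSON schema validator.

```python
import json


def parse_structured(text, schema):
    """Decode JSON response text and check the schema's required keys exist."""
    data = json.loads(text)
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"response is missing required keys: {missing}")
    return data


dog_schema = {
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
    "type": "object",
}

# The string below stands in for response.text() so the sketch runs offline.
dog = parse_structured('{"name":"Buddy","age":3}', dog_schema)
print(dog["name"], dog["age"])
```
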

You can also use LLM's alternative schema syntax via the llm.schema_dsl(schema_dsl) function. This provides a quick way to construct a JSON schema for simple cases:

print(model.prompt(
    "Describe a nice dog with a surprising name",
    schema=llm.schema_dsl("name, age int, bio")
))

Pass multi=True to generate a schema that returns multiple items matching that specification:

print(model.prompt(
    "Describe 3 nice dogs with surprising names",
    schema=llm.schema_dsl("name, age int, bio", multi=True)
))

Fragments#

The fragment system from the CLI tool can also be accessed from the Python API, by passing fragments= and/or system_fragments= lists of strings to the prompt() method:

response = model.prompt(
    "What do these documents say about dogs?",
    fragments=[
        open("dogs1.txt").read(),
        open("dogs2.txt").read(),
    ],
    system_fragments=[
        "You answer questions like Snoopy",
    ]
)

This mechanism has limited utility in Python, as you can also assemble the contents of these strings together into the prompt= and system= strings directly.
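As a rough illustration of assembling the strings yourself, the sketch below joins fragment strings and a prompt into a single string. build_prompt is a hypothetical helper, and the ordering (fragments first) and newline separator are assumptions made for this example; they are not claimed to match how LLM combines fragments internally.

```python
def build_prompt(prompt, fragments=()):
    """Join fragment strings and the prompt into one string, fragments first."""
    parts = [*fragments, prompt]
    # Skip empty strings so they do not leave blank lines behind.
    return "\n".join(part for part in parts if part)


combined = build_prompt(
    "What do these documents say about dogs?",
    fragments=["Dogs bark.", "Dogs fetch."],
)
print(combined)
```

The resulting string could then be passed as the first argument to model.prompt().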

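Fragments become more interesting if you are using LLM's mechanisms for storing prompts to a SQLite database, but that is not yet part of the stable, documented Python API.

Some model plugins may include features that take advantage of fragments; for example, llm-anthropic aims to use them as part of a mechanism that taps into Claude's prompt caching system.

Model options#

For models that support options (view those with llm models --options) you can pass options as keyword arguments to the .prompt() method: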

model = llm.get_model()
print(model.prompt("Names for otters", temperature=0.2))

Passing an API key#

Models that accept API keys should take an additional key= parameter to their model.prompt() method:

model = llm.get_model("gpt-4o-mini")
print(model.prompt("Names for beavers", key="sk-..."))

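If you don't provide this argument, LLM will attempt to find it from an environment variable (OPENAI_API_KEY for OpenAI, other variables for other plugins) or from keys that have been saved using the llm keys set command.

Some model plugins may not yet have been upgraded to handle the key= parameter, in which case you will need to use one of the other mechanisms.

Models from plugins#

Any models you have installed as plugins are also available through this mechanism. For example, to use Anthropic's Claude 3.5 Sonnet model with llm-anthropic: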

pip install llm-anthropic

Then in your Python code:

import llm

model = llm.get_model("claude-3.5-sonnet")
# Use this if you have not set the key using 'llm keys set claude':
model.key = 'YOUR_API_KEY_HERE'
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())

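Some models do not use API keys at all.

Accessing the underlying JSON#

Most model plugins also make a JSON version of the prompt response available. Its structure differs between model plugins, so code built on it is likely to work only with a specific model provider.

You can access this JSON data as a Python dictionary using the response.json() method: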

import llm
from pprint import pprint

model = llm.get_model("gpt-4o-mini")
response = model.prompt("3 names for an otter")
json_data = response.json()
pprint(json_data)

Here is example output from GPT-4o mini:

{'content': 'Sure! Here are three fun names for an otter:\n'
            '\n'
            '1. **Splash**\n'
            '2. **Bubbles**\n'
            '3. **Otto** \n'
            '\n'
            'Feel free to mix and match or use these as inspiration!',
 'created': 1739291215,
 'finish_reason': 'stop',
 'id': 'chatcmpl-AznO31yxgBjZ4zrzBOwJvHEWgdTaf',
 'model': 'gpt-4o-mini-2024-07-18',
 'object': 'chat.completion.chunk',
 'usage': {'completion_tokens': 43,
           'completion_tokens_details': {'accepted_prediction_tokens': 0,
                                         'audio_tokens': 0,
                                         'reasoning_tokens': 0,
                                         'rejected_prediction_tokens': 0},
           'prompt_tokens': 13,
           'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0},
           'total_tokens': 56}}

Token usage#

Many models can return a count of the tokens used while executing a prompt.

The response.usage() method provides an abstraction over this:

pprint(response.usage())

Example output:

Usage(input=5,
      output=2,
      details={'candidatesTokensDetails': [{'modality': 'TEXT',
                                            'tokenCount': 2}],
               'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 5}]})

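The .input and .output properties are integers giving the number of input and output tokens. The .details property may be a dictionary of additional custom values, which vary by model.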
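Because .input and .output are plain integers, totals across several responses are easy to compute. The sketch below sums them over any iterable of response-like objects. total_tokens and fake_response are hypothetical names; the stand-in objects exist only so the example runs without an API key, and the handling of None is a defensive assumption for models that do not report counts.

```python
from types import SimpleNamespace


def total_tokens(responses):
    """Sum input and output token counts over completed responses."""
    total_in = total_out = 0
    for response in responses:
        usage = response.usage()
        if usage is None:
            continue  # assumption: tolerate responses with no usage data
        total_in += usage.input or 0
        total_out += usage.output or 0
    return total_in, total_out


def fake_response(input_tokens, output_tokens):
    # Stand-in exposing only the usage() method described above.
    usage = SimpleNamespace(input=input_tokens, output=output_tokens, details={})
    return SimpleNamespace(usage=lambda: usage)


totals = total_tokens(
    [fake_response(5, 2), fake_response(13, 43), fake_response(None, None)]
)
print(totals)
```
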

Streaming responses#

For models that support it, you can stream a response as it is generated, like this:

response = model.prompt("Five diabolical names for a pet goat")
for chunk in response:
    print(chunk, end="")

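The response.text() method described earlier does this for you: it runs through the iterator and gathers the results into a single string.

If a response has already been evaluated, response.text() will continue to return the same string.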
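If you want both behaviors at once, showing chunks as they arrive and keeping the full text, a small helper can do the collecting while it writes. stream_to is a hypothetical name; the sketch relies only on the response being iterable as strings, so a plain iterator stands in for a real response here.

```python
import io


def stream_to(response, out):
    """Write each chunk to out as it arrives and return the full text."""
    collected = []
    for chunk in response:
        out.write(chunk)
        out.flush()
        collected.append(chunk)
    return "".join(collected)


# Any iterable of strings can stand in for a streaming response here.
buffer = io.StringIO()
full_text = stream_to(iter(["Beelze", "bleat", "\n"]), buffer)
print(full_text, end="")
```

Passing sys.stdout as out would print the chunks to the terminal instead of a buffer.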

Async models#

Some plugins provide async versions of their supported models, suitable for use with Python asyncio.

To use an async model, call the llm.get_async_model() function instead of llm.get_model():

import llm
model = llm.get_async_model("gpt-4o")

You can then run a prompt using await model.prompt(...):

response = await model.prompt(
    "Five surprising names for a pet pelican"
)
print(await response.text())

Or use async for chunk in ... to stream the response as it is generated:

async for chunk in model.prompt(
    "Five surprising names for a pet pelican"
):
    print(chunk, end="", flush=True)

The await model.prompt() method accepts the same arguments as the synchronous model.prompt() method, including options, attachments, and key=.
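One reason to reach for async models is running several prompts concurrently. The sketch below follows the await model.prompt(...) and await response.text() pattern shown above and combines it with asyncio.gather. The names ask, ask_all, FakeAsyncModel and FakeAsyncResponse are hypothetical; the fake classes are stand-ins so the example runs offline and do not reflect how a real async model is implemented.

```python
import asyncio


async def ask(model, prompt):
    """Run one prompt on an async model and return its full text."""
    response = await model.prompt(prompt)
    return await response.text()


async def ask_all(model, prompts):
    """Run several prompts concurrently; results come back in input order."""
    return await asyncio.gather(*(ask(model, p) for p in prompts))


class FakeAsyncResponse:
    """Stand-in response with an awaitable text() method."""

    def __init__(self, text):
        self._text = text

    async def text(self):
        await asyncio.sleep(0)  # yield control, as a network call would
        return self._text


class FakeAsyncModel:
    """Stand-in model whose prompt() can be awaited."""

    async def prompt(self, prompt):
        return FakeAsyncResponse(f"echo: {prompt}")


answers = asyncio.run(ask_all(FakeAsyncModel(), ["pelican", "otter"]))
print(answers)
```

With a real installation you would pass the object returned by llm.get_async_model() in place of FakeAsyncModel().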

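Conversations#

LLM supports conversations, in which you ask a model follow-up questions within an ongoing conversation.

To start a new conversation, use the model.conversation() method: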

model = llm.get_model()
conversation = model.conversation()

You can then use the conversation.prompt() method to execute prompts against this conversation:

response = conversation.prompt("Five fun facts about pelicans")
print(response.text())

This works exactly the same as the model.prompt() method, except that the conversation is maintained across multiple prompts. So if you run this next:

response2 = conversation.prompt("Now do skunks")
print(response2.text())

You will get back five fun facts about skunks.

The conversation.prompt() method also supports attachments:

response = conversation.prompt(
    "Describe these birds",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg")
    ]
)

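Access conversation.responses for a list of all the responses returned so far in the conversation.

Listing models#

The llm.get_models() function returns a list of all available models, including those from plugins: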

import llm

for model in llm.get_models():
    print(model.model_id)

Use llm.get_async_models() to list async models:

for model in llm.get_async_models():
    print(model.model_id)
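The model list combines naturally with the attachment_types set covered in the attachments section, for example to find models that accept a given MIME type. models_accepting is a hypothetical helper, and the SimpleNamespace objects are stand-ins carrying only the two attributes used, so the sketch runs without any models installed.

```python
from types import SimpleNamespace


def models_accepting(models, mime_type):
    """Return the IDs of models whose attachment_types include mime_type."""
    return [
        model.model_id
        for model in models
        # getattr guards against objects that lack the attribute entirely.
        if mime_type in getattr(model, "attachment_types", set())
    ]


fake_models = [
    SimpleNamespace(model_id="vision-model", attachment_types={"image/jpeg", "image/png"}),
    SimpleNamespace(model_id="text-model", attachment_types=set()),
]

print(models_accepting(fake_models, "image/jpeg"))
```

With a real installation you would pass llm.get_models() as the first argument.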

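Running code when a response has completed#

For some applications, such as tracking the tokens used by an application, it can be useful to execute code as soon as a response has finished executing.

You can do this using the response.on_done(callback) method, which calls your callback function as soon as the response has completed (all tokens have been returned).

The callback you provide should have the signature def callback(response). When working with async models it can optionally be an async def function.

Example usage: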

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("a poem about a hippo")
response.on_done(lambda response: print(response.usage()))
print(response.text())

This outputs:

Usage(input=20, output=494, details={})
In a sunlit glade by a bubbling brook,
Lived a hefty hippo, with a curious look.
...

Or using an asyncio model, in which case you need to await response.on_done(done) to queue up the callback:

import asyncio, llm

async def run():
    model = llm.get_async_model("gpt-4o-mini")
    response = model.prompt("a short poem about a brick")
    async def done(response):
        print(await response.usage())
        print(await response.text())
    await response.on_done(done)
    print(await response.text())

asyncio.run(run())
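For the token-tracking use case mentioned at the start of this section, a callback can accumulate counts across many synchronous responses. TokenTracker and FakeResponse are hypothetical names. FakeResponse only imitates the documented behavior (callbacks registered with on_done() fire once the response completes) so the sketch runs offline; it is not the library's implementation.

```python
from types import SimpleNamespace


class TokenTracker:
    """Accumulate token counts from responses as each one finishes."""

    def __init__(self):
        self.input = 0
        self.output = 0
        self.completed = 0

    def track(self, response):
        # Register the callback and hand the response back for normal use.
        response.on_done(self._record)
        return response

    def _record(self, response):
        usage = response.usage()
        self.input += usage.input or 0
        self.output += usage.output or 0
        self.completed += 1


class FakeResponse:
    """Stand-in that fires on_done callbacks the first time text() is called."""

    def __init__(self, text, input_tokens, output_tokens):
        self._text = text
        self._usage = SimpleNamespace(input=input_tokens, output=output_tokens)
        self._callbacks = []
        self._done = False

    def on_done(self, callback):
        self._callbacks.append(callback)

    def usage(self):
        return self._usage

    def text(self):
        if not self._done:
            self._done = True
            for callback in self._callbacks:
                callback(self)
        return self._text


tracker = TokenTracker()
poem = tracker.track(FakeResponse("In a sunlit glade...", 20, 494))
before = tracker.completed  # still 0: the response has not been evaluated
poem.text()
poem.text()                 # a second call does not count twice
print(tracker.input, tracker.output, tracker.completed)
```
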

Other functions#

The llm top-level package includes some useful utility functions.

set_alias(alias, model_id)#

The llm.set_alias() function can be used to define a new alias:

import llm

llm.set_alias("mini", "gpt-4o-mini")

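The second argument can be a model identifier or another alias, in which case that alias will be resolved.

If the aliases.json file does not exist or contains invalid JSON, it will be created or overwritten.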

remove_alias(alias)#

Removes the alias with the given name from the aliases.json file.

Raises KeyError if the alias does not exist.

import llm

llm.remove_alias("turbo")

set_default_model(alias)#

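This sets the default model to the given model ID or alias. Any changes to defaults are persisted in the LLM configuration folder and affect all programs using LLM on the system, including the llm CLI tool.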

import llm

llm.set_default_model("claude-3.5-sonnet")

get_default_model()#

This returns the currently configured default model, or gpt-4o-mini if no default has been set.

import llm

model_id = llm.get_default_model()

To detect whether no default has been set, you can use this pattern:

if llm.get_default_model(default=None) is None:
    print("No default has been set")

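Here the default= parameter specifies the value that should be returned if no default is configured.

set_default_embedding_model(alias) and get_default_embedding_model()#

These two functions work the same as set_default_model() and get_default_model(), but for the default embedding model.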