使用#

运行提示的命令是 llm prompt 'your prompt'。这是默认命令，因此您可以使用 llm 'your prompt' 作为快捷方式。

执行提示#

这些示例使用默认的 OpenAI gpt-4o-mini 模型，这要求您首先设置一个 OpenAI API 密钥。

您可以安装 LLM 插件来使用来自其他提供商的模型，包括您可以在自己的计算机上直接运行的开放许可模型。

运行提示，并在标记生成时进行流式传输

llm 'Ten names for cheesecakes'

禁用流式传输并在响应完成后才返回

llm 'Ten names for cheesecakes' --no-stream

从 ChatGPT 4o-mini（默认）切换到 GPT-4o

llm 'Ten names for cheesecakes' -m gpt-4o

您可以使用 -m 4o 作为更短的快捷方式。

传递 --model <模型名称> 以使用不同的模型。运行 llm models 以查看可用模型的列表。

或者如果您知道名称太长无法输入，请使用 -q 一次或多次提供搜索词 - 将使用模型 ID 最短且匹配所有这些词（作为小写子字符串）的模型

llm 'Ten names for cheesecakes' -q 4o -q mini

要更改当前会话的默认模型，请设置 LLM_MODEL 环境变量

export LLM_MODEL=gpt-4.1-mini
llm 'Ten names for cheesecakes' # Uses gpt-4.1-mini

您可以直接将提示发送到标准输入，如下所示

echo 'Ten names for cheesecakes' | llm

如果您将文本发送到标准输入并提供参数，则生成的提示将由管道内容后跟参数组成

cat myscript.py | llm 'explain this code'

将运行一个提示，内容为

<contents of myscript.py> explain this code

对于支持系统提示的模型，系统提示是执行此类提示的更好工具。

模型选项#

一些模型支持选项。您可以使用 -o/--option name value 来传递这些选项 - 例如，要将温度设置为 1.5，请运行此命令

llm 'Ten names for cheesecakes' -o temperature 1.5

使用 llm models --options 命令查看每个模型支持哪些选项。

您还可以使用 llm models options 命令为模型配置默认选项。

附件#

一些模型是多模态的，这意味着它们不仅可以接受文本输入。GPT-4o 和 GPT-4o mini 可以接受图像，而像 Google Gemini 1.5 这样的模型也可以接受音频和视频。

LLM 将这些称为附件。您可以使用 -a 选项传递附件，如下所示

llm "describe this image" -a https://static.simonwillison.net/static/2024/pelicans.jpg

附件可以使用 URL 或文件路径传递，并且您可以将多个附件附加到一个提示中

llm "extract text" -a image1.jpg -a image2.jpg

您也可以通过使用 - 作为文件名将附件通过管道传递给 LLM

cat image.jpg | llm "describe this image" -a -

LLM 将尝试自动检测图像的内容类型。如果这不起作用，您可以改用 --attachment-type 选项（简写为 --at），该选项接受 URL/路径以及显式内容类型

cat myfile | llm "describe this image" --at - image/jpeg

系统提示#

您可以使用 -s/--system '...' 来设置系统提示。

llm 'SQL to calculate total sales by month' \
  --system 'You are an exaggerated sentient cheesecake that knows SQL and talks about cheesecake a lot'

这对于将内容通过管道传递到标准输入非常有用，例如

curl -s 'https://simonwillison.net/2023/May/15/per-interpreter-gils/' | \
  llm -s 'Suggest topics for this post as a JSON array'

或者生成自上次提交以来对 Git 仓库所做更改的描述

git diff | llm -s 'Describe these changes'

不同的模型以不同的方式支持系统提示。

OpenAI 模型特别擅长使用系统提示作为处理作为常规提示一部分发送的额外输入的指令。

其他模型可能会使用系统提示来更改模型的默认语音和态度。

系统提示可以保存为模板以创建可重用工具。例如，您可以像这样创建一个名为 pytest 的模板

llm -s 'write pytest tests for this code' --save pytest

然后像这样使用新模板

cat llm/utils.py | llm -t pytest

有关更多信息，请参阅提示模板。

提取围栏式代码块#

如果您使用 LLM 生成代码，只检索它生成的代码而不包含周围的解释性文本会很有用。

的 -x/--extract 选项将扫描响应以查找第一个 Markdown 围栏式代码块 - 看起来像这样

```python
def my_function():
    # ...
```

它将提取并仅返回该块的内容，不包括围栏式代码分隔符。如果没有围栏式代码块，它将返回完整的响应。

使用 --xl/--extract-last 返回最后一个围栏式代码块而不是第一个。

包括解释性文本的整个响应仍然会记录到数据库中，并且可以使用 llm logs -c 查看。

模式#

一些模型具有返回与提供的JSON 模式匹配的 JSON 的能力。OpenAI、Anthropic 和 Google Gemini 的模型都包含此功能。

有关使用此功能的详细指南，请参阅模式文档。

您可以直接将 JSON 模式传递给 --schema 选项

llm --schema '{
  "type": "object",
  "properties": {
    "dogs": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "bio": {
            "type": "string"
          }
        }
      }
    }
  }
}' -m gpt-4o-mini 'invent two dogs'

或使用 LLM 的自定义简洁模式语法，如下所示

llm --schema 'name,bio' 'invent a dog'

要对多个项目使用相同的简洁模式，请使用 --schema-multi

llm --schema-multi 'name,bio' 'invent two dogs'

您还可以将 JSON 模式保存到文件并使用 --schema 引用文件名

llm --schema dogs.schema.json 'invent two dogs'

或者像这样将您的模式保存到模板

llm --schema dogs.schema.json --save dogs
# Then to use it:
llm -t dogs 'invent two dogs'

请注意，不同的模型可能支持不同方言的 JSON 模式规范。

有关使用 llm logs --schema X 命令访问您之前使用此选项记录的 JSON 对象的提示，请参阅浏览使用模式创建的已记录 JSON 对象。

片段#

您可以使用 -f/--fragment 选项引用您希望加载到提示中的上下文片段。片段可以指定为 URL、文件路径或先前保存片段的别名。

片段是为运行更长的提示而设计的。LLM 将提示存储在数据库中中，多次重复相同的提示最终可能会存储为多个副本，浪费磁盘空间。一个片段将只存储一次，并被所有使用它的提示引用。

的 -f 选项可以接受磁盘上文件的路径、URL 或先前片段的哈希值或别名。

例如，询问关于 llm.datasette.io 上的 robots.txt 文件的问题

llm -f https://llm.datasette.com.cn/robots.txt 'explain this'

关于受磁盘上的某些 Python 代码启发的诗歌

llm -f cli.py 'a short snappy poem inspired by this code'

您可以使用任意数量的 -f 选项 - 这些片段将按照您提供的顺序连接在一起，并在末尾添加任何额外的提示。

片段也可以使用 --sf/--system-fragment 选项用于系统提示。如果您有一个名为 explain_code.txt 的文件，其中包含这些内容

Explain this code in detail. Include copies of the code quoted in the explanation.

您可以像这样将其作为系统提示运行

llm -f cli.py --sf explain_code.txt

您可以使用 llm fragments set 命令加载片段并为其指定别名，以便在将来的查询中使用

llm fragments set cli cli.py
# Then
llm -f cli 'explain this code'

使用 llm fragments 列出所有已存储的片段

llm fragments

您可以通过传递一个或多个 -q X 搜索字符串进行搜索。这将返回匹配所有这些字符串的结果，涵盖来源、哈希、别名和内容

llm fragments -q pytest -q asyncio

的 llm fragments remove 命令删除别名。它不会删除片段记录本身，因为这些记录链接到先前的提示和响应，不能独立于它们删除。

llm fragments remove cli

继续对话#

默认情况下，每次运行该工具时都会开始新的对话。

您可以通过传递 -c/--continue 选项选择继续上一个对话

llm 'More names' -c

这会将上一个对话的提示和响应作为对语言模型调用的一部分重新发送。请注意，这可能会快速增加代币数量，尤其是在您使用昂贵模型时。

--continue 将自动使用与您正在继续的对话相同的模型，即使您省略了 -m/--model 选项。

要继续不是最近的对话，请使用 --cid/--conversation <id> 选项

llm 'More names' --cid 01h53zma5txeby33t1kbe3xk8q

您可以使用 llm logs 命令查找这些对话 ID。

将 LLM 与 Bash 或 Zsh 一起使用的技巧#

要根据 uname -a 的输出了解有关您计算机操作系统的更多信息，请运行此命令

llm "Tell me about my operating system: $(uname -a)"

在双引号字符串中使用 $(command) 的这种模式是快速组合提示的一种有用方法。

补全提示#

一些模型是补全模型 - 它们不是针对响应聊天风格提示进行调优的，而是旨在完成句子或段落。

一个例子是 gpt-3.5-turbo-instruct OpenAI 模型。

您可以使用与聊天模型相同的方式提示该模型，但请注意，最适合的提示格式可能会有所不同。

llm -m gpt-3.5-turbo-instruct 'Reasons to tame a wild beaver:'

开始交互式聊天#

的 llm chat 命令启动与模型的持续交互式聊天。

这对于在您自己的机器上运行的模型特别有用，因为它可以避免每次向对话添加新提示时都必须将它们加载到内存中。

运行 llm chat，可选地带上 -m model_id，以开始聊天对话

llm chat -m chatgpt

每次聊天都会开始新的对话。每次对话的记录都可以通过日志访问。

您可以传递 -c 以开始一个对话，作为您最近提示的延续。这将自动使用最近使用的模型

llm chat -c

对于支持选项的模型，您可以使用 -o/--option 传递选项

llm chat -m gpt-4 -o temperature 0.5

您可以传递用于您的聊天对话的系统提示

llm chat -m gpt-4 -s 'You are a sentient cheesecake'

您还可以传递一个模板 - 这对于创建您希望返回的聊天角色非常有用。

以下是如何为您的 GPT-4 驱动的芝士蛋糕创建模板

llm --system 'You are a sentient cheesecake' -m gpt-4 --save cheesecake

现在，您可以随时使用此命令与您的芝士蛋糕开始新的聊天

llm chat -t cheesecake

Chatting with gpt-4
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
Type '!edit' to open your default editor and modify the prompt
> who are you?
I am a sentient cheesecake, meaning I am an artificial
intelligence embodied in a dessert form, specifically a
cheesecake. However, I don't consume or prepare foods
like humans do, I communicate, learn and help answer
your queries.

输入 quit 或 exit，然后按 <enter> 结束聊天会话。

有时您可能希望一次将多行文本粘贴到聊天中 - 例如在调试错误消息时。

要做到这一点，输入 !multi 开始多行输入。输入或粘贴您的文本，然后输入 !end 并按 <enter> 完成。

如果您粘贴的文本本身可能包含 !end 行，您可以使用 !multi abc 设置自定义分隔符，并在末尾跟随 !end abc

Chatting with gpt-4
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
Type '!edit' to open your default editor and modify the prompt.
> !multi custom-end
 Explain this error:

   File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

 !end custom-end

您还可以使用 !edit 打开您的默认编辑器，并在将提示发送到模型之前进行修改。

Chatting with gpt-4
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
Type '!edit' to open your default editor and modify the prompt.
> !edit

列出可用模型#

的 llm models 命令列出了可与 LLM 一起使用的每个模型及其别名。这包括使用插件安装的模型。

llm models

示例输出

OpenAI Chat: gpt-4o (aliases: 4o)
OpenAI Chat: gpt-4o-mini (aliases: 4o-mini)
OpenAI Chat: o1-preview
OpenAI Chat: o1-mini
GeminiPro: gemini-1.5-pro-002
GeminiPro: gemini-1.5-flash-002
...

添加一个或多个 -q term 选项以搜索与所有这些搜索词匹配的模型

llm models -q gpt-4o
llm models -q 4o -q mini

使用一个或多个 -m 选项指定特定模型，可以是其模型 ID 或其别名之一

llm models -m gpt-4o -m gemini-1.5-pro-002

添加 --options 以查看每个模型支持的选项文档

llm models --options

输出

OpenAI Chat: gpt-4o (aliases: 4o)
  Options:
    temperature: float
      What sampling temperature to use, between 0 and 2. Higher values like
      0.8 will make the output more random, while lower values like 0.2 will
      make it more focused and deterministic.
    max_tokens: int
      Maximum number of tokens to generate.
    top_p: float
      An alternative to sampling with temperature, called nucleus sampling,
      where the model considers the results of the tokens with top_p
      probability mass. So 0.1 means only the tokens comprising the top 10%
      probability mass are considered. Recommended to use top_p or
      temperature but not both.
    frequency_penalty: float
      Number between -2.0 and 2.0. Positive values penalize new tokens based
      on their existing frequency in the text so far, decreasing the model's
      likelihood to repeat the same line verbatim.
    presence_penalty: float
      Number between -2.0 and 2.0. Positive values penalize new tokens based
      on whether they appear in the text so far, increasing the model's
      likelihood to talk about new topics.
    stop: str
      A string where the API will stop generating further tokens.
    logit_bias: dict, str
      Modify the likelihood of specified tokens appearing in the completion.
      Pass a JSON string like '{"1712":-100, "892":-100, "1489":-100}'
    seed: int
      Integer seed to attempt to sample deterministically
    json_object: boolean
      Output a valid JSON object {...}. Prompt must mention JSON.
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: chatgpt-4o-latest (aliases: chatgpt-4o)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-mini (aliases: 4o-mini)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-audio-preview
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    audio/mpeg, audio/wav
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-audio-preview-2024-12-17
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    audio/mpeg, audio/wav
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-audio-preview-2024-10-01
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    audio/mpeg, audio/wav
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-mini-audio-preview
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    audio/mpeg, audio/wav
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4o-mini-audio-preview-2024-12-17
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    audio/mpeg, audio/wav
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4.1 (aliases: 4.1)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4.1-mini (aliases: 4.1-mini)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4.1-nano (aliases: 4.1-nano)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-3.5-turbo (aliases: 3.5, chatgpt)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-3.5-turbo-16k (aliases: chatgpt-16k, 3.5-16k)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4 (aliases: 4, gpt4)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4-32k (aliases: 4-32k)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4-1106-preview
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4-0125-preview
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4-turbo-2024-04-09
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4-turbo (aliases: gpt-4-turbo-preview, 4-turbo, 4t)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4.5-preview-2025-02-27
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: gpt-4.5-preview (aliases: gpt-4.5)
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o1
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
    reasoning_effort: str
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o1-2024-12-17
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
    reasoning_effort: str
  Attachment types:
    application/pdf, image/gif, image/jpeg, image/png, image/webp
  Features:
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o1-preview
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o1-mini
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
  Features:
  - streaming
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o3-mini
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
    reasoning_effort: str
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o3
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
    reasoning_effort: str
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Chat: o4-mini
  Options:
    temperature: float
    max_tokens: int
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    stop: str
    logit_bias: dict, str
    seed: int
    json_object: boolean
    reasoning_effort: str
  Features:
  - streaming
  - schemas
  - async
  Keys:
    key: openai
    env_var: OPENAI_API_KEY
OpenAI Completion: gpt-3.5-turbo-instruct (aliases: 3.5-instruct, chatgpt-instruct)
  Options:
    temperature: float
      What sampling temperature to use, between 0 and 2. Higher values like
      0.8 will make the output more random, while lower values like 0.2 will
      make it more focused and deterministic.
    max_tokens: int
      Maximum number of tokens to generate.
    top_p: float
      An alternative to sampling with temperature, called nucleus sampling,
      where the model considers the results of the tokens with top_p
      probability mass. So 0.1 means only the tokens comprising the top 10%
      probability mass are considered. Recommended to use top_p or
      temperature but not both.
    frequency_penalty: float
      Number between -2.0 and 2.0. Positive values penalize new tokens based
      on their existing frequency in the text so far, decreasing the model's
      likelihood to repeat the same line verbatim.
    presence_penalty: float
      Number between -2.0 and 2.0. Positive values penalize new tokens based
      on whether they appear in the text so far, increasing the model's
      likelihood to talk about new topics.
    stop: str
      A string where the API will stop generating further tokens.
    logit_bias: dict, str
      Modify the likelihood of specified tokens appearing in the completion.
      Pass a JSON string like '{"1712":-100, "892":-100, "1489":-100}'
    seed: int
      Integer seed to attempt to sample deterministically
    logprobs: int
      Include the log probabilities of most likely N per token
  Features:
  - streaming
  Keys:
    key: openai
    env_var: OPENAI_API_KEY

运行提示时，您可以将完整的模型名称或任何别名传递给 -m/--model 选项

llm -m 4o \
  'As many names for cheesecakes as you can think of, with detailed descriptions'

设置模型的默认选项#

要配置特定模型的默认选项，请使用 llm models options set 命令

llm models options set gpt-4o temperature 0.5

之后，每当您通过 gpt-4o 模型运行提示时，此选项将自动应用。

默认选项存储在 LLM 配置目录中的 model_options.json 文件中。

您可以使用 llm models options list 命令列出所有模型的默认选项

llm models options list

或者使用 llm models options show <model_id> 显示单个模型的默认选项

llm models options show gpt-4o

要清除默认选项，请使用 llm models options clear 命令

llm models options clear gpt-4o temperature

或者像这样清除模型的所有默认选项

llm models options clear gpt-4o