Allow configuration for passing previous thinking tokens #2203

@ihower

Description

Background

In #1744 and #2158 we added support for Claude 4 and Gemini 3, where thinking tokens must be returned in the same turn during function calling.

For previous turns, most reasoning models drop thinking tokens server-side even if they are sent back, similar to the OpenAI API; the main reason is that these tokens increase context window usage. However, starting with Claude Opus 4.5+ and Gemini 3, previous-turn thinking tokens are no longer always dropped: if included, the server preserves them in the context.

For Claude Opus 4.5+ and Gemini, including previous thinking blocks is a developer choice with clear trade-offs.

Pros: better reasoning and improved prompt caching
Cons: increased context window usage and higher token cost

References

In #2195, there is a proposal to always attach previous thinking tokens for Claude. But I don’t think we should attach thinking tokens to assistant messages by default when they are not strictly required.

I think the default should remain to not include previous thinking blocks, which keeps behavior consistent with previous versions. Since this behavior is optional and has meaningful trade-offs, I suggest adding an explicit configuration option.

Proposed Design

A simple approach would be an agent-level and RunConfig option, for example:

agent = Agent(
    ...,
    include_previous_reasoning=True,
)

# and/or

Runner.run(
    agent,
    ...,
    include_previous_reasoning=True,
)

However, this is still not enough, because there is another important decision.

Should previous thinking tokens be stored internally (when converting to raw items in to_input_list()), even if they are not sent, in order to keep flexibility for sending them in future turns?
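To make that distinction concrete, here is a minimal sketch of what the "keep internally" decision could mean when building the raw item history. The function name and the keep_previous_reasoning flag are hypothetical, not existing SDK API:

def build_history(raw_items: list[dict], keep_previous_reasoning: bool) -> list[dict]:
    """Illustrative sketch only: whether previous reasoning items survive the
    conversion to raw input items (e.g. in to_input_list()) is a separate
    decision from whether they are later sent to the model."""
    if keep_previous_reasoning:
        return list(raw_items)
    # Drop previous-turn reasoning items entirely.
    return [item for item in raw_items if item.get("type") != "reasoning"]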

An alternative, more refined design is a small enum-based policy:

previous_reasoning_policy:
  KEEP            # keep internally, but do not send
  KEEP_AND_SEND   # keep internally and send
  DROP            # do not keep, do not send

This gives developers explicit control, instead of silently increasing context usage and token cost.
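As a rough sketch of how this could look in Python (the enum name, its values, and the previous_reasoning_policy parameter are all placeholders, not existing SDK API):

from enum import Enum

class PreviousReasoningPolicy(Enum):
    """Hypothetical policy for previous-turn thinking tokens."""
    KEEP = "keep"                    # keep internally, but do not send
    KEEP_AND_SEND = "keep_and_send"  # keep internally and send
    DROP = "drop"                    # do not keep, do not send

# Possible usage, assuming the agent-level option proposed above:
# agent = Agent(..., previous_reasoning_policy=PreviousReasoningPolicy.KEEP_AND_SEND)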

Design Questions

  1. The OpenAI API does not currently fully support this behavior, though I believe this option should be added to give developers explicit control in the future.

    For the OpenAI API, previous reasoning tokens are still dropped server-side. When store=True is used, clients must continue to pass back the reasoning item IDs to avoid API errors, even though the actual reasoning content is not preserved.

    When store=False, the API allows developers to skip sending previous encrypted reasoning content, which helps reduce network bandwidth usage. So the proposed configuration still makes sense and provides real value by allowing explicit control over whether previous encrypted reasoning items are sent (see the sketch after this list).

  2. Because this is context management logic, not a model server-side parameter, I am unsure whether this belongs in ModelSettings, especially if it applies to LiteLLM only. I think it may be more appropriate as a top-level agent and run configuration as demonstrated in the code proposal above.
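To make the OpenAI case in point 1 concrete, here is a rough sketch of how a provider converter might decide what to send back, assuming Responses-API-style item dicts with a "type" field. The function name and the policy strings are hypothetical:

def filter_previous_reasoning(items: list[dict], policy: str, store: bool) -> list[dict]:
    """Illustrative only; not existing SDK behavior."""
    filtered = []
    for item in items:
        if item.get("type") != "reasoning":
            filtered.append(item)
        elif store:
            # store=True: reasoning item IDs must still be passed back to
            # avoid API errors, even though the content is not preserved.
            filtered.append(item)
        elif policy == "keep_and_send":
            # store=False: send previous encrypted reasoning content only
            # when the developer has opted in.
            filtered.append(item)
        # Otherwise skip the item to reduce network bandwidth usage.
    return filtered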

I’d like to get feedback or suggestions on the API design. If this approach makes sense, I can work on an implementation after #2158 is merged.
