Implement Session Memory (Jules) #749

Closed
148 changes: 148 additions & 0 deletions docs/agents.md
@@ -9,6 +9,10 @@ The most common properties of an agent you'll configure are:
- `instructions`: also known as a developer message or system prompt.
- `model`: which LLM to use, and optional `model_settings` to configure model tuning parameters like temperature, top_p, etc.
- `tools`: Tools that the agent can use to achieve its tasks.
- `memory`: Enables conversation memory for the agent. Can be `bool | SessionMemory | None`.
- `True`: Uses the default `SQLiteSessionMemory` (in-memory by default, suitable for single-process applications).
- `SessionMemory instance`: Uses the provided custom memory implementation (e.g., for persistent storage or custom logic).
- `None` (default): No memory is used. The agent will not remember previous turns, and conversation history must be managed manually by passing all previous messages in the `input` to `Runner.run()`.
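
Conceptually, this three-way setting can be pictured as a small normalization step. The sketch below is illustrative only: `InMemoryHistory` and `resolve_memory` are stand-in names, not SDK classes (the SDK performs the equivalent normalization in `Agent.__post_init__`):

```python
class InMemoryHistory:
    """Stand-in for a SessionMemory implementation (illustrative only)."""
    def __init__(self):
        self.items = []

def resolve_memory(memory):
    # True  -> a fresh default store; False/None -> no memory;
    # an instance -> used as-is.
    if memory is True:
        return InMemoryHistory()
    if memory is False or memory is None:
        return None
    return memory

custom = InMemoryHistory()
print(type(resolve_memory(True)).__name__)  # InMemoryHistory
print(resolve_memory(None))                 # None
print(resolve_memory(custom) is custom)     # True
```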

```python
from agents import Agent, ModelSettings, function_tool
@@ -131,6 +135,150 @@ robot_agent = pirate_agent.clone(
)
```

## Agent Memory

The `memory` parameter on the `Agent` class allows you to easily enable conversation memory, so the agent can remember previous turns of a conversation.

When `memory` is enabled, the agent automatically loads history before calling the LLM and saves the new turn's interactions (input and output) after the LLM responds.
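
The load-call-save cycle can be sketched with a toy memory and a fake model. This is an illustration of the flow only, not the SDK's actual `Runner` internals; `ListMemory`, `run_turn`, and `fake_llm` are hypothetical names:

```python
import asyncio

class ListMemory:
    """Minimal stand-in for a session memory (illustrative, not an SDK class)."""
    def __init__(self):
        self._items = []

    async def get_history(self):
        return list(self._items)

    async def add_items(self, items):
        self._items.extend(items)

async def run_turn(memory, call_llm, user_input):
    history = await memory.get_history()               # 1. load prior turns
    response = await call_llm(history + [user_input])  # 2. the LLM sees full context
    await memory.add_items([user_input, response])     # 3. persist the new turn
    return response

async def demo():
    memory = ListMemory()

    async def fake_llm(messages):
        # Stand-in for a real model call: reports how much history it received.
        return {"role": "assistant", "content": f"saw {len(messages) - 1} prior items"}

    first = await run_turn(memory, fake_llm, {"role": "user", "content": "hi"})
    second = await run_turn(memory, fake_llm, {"role": "user", "content": "again"})
    return first, second

r1, r2 = asyncio.run(demo())
print(r1["content"])  # saw 0 prior items
print(r2["content"])  # saw 2 prior items
```

The second turn sees two prior items (the first user message plus the assistant reply) without the caller re-sending them: that is exactly the bookkeeping that `memory` automates.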

### Default Memory

Setting `memory=True` uses the default `SQLiteSessionMemory`, which stores the conversation in an in-memory SQLite database. This is convenient for quick setups and single-process applications.

```python
import asyncio

from agents import Agent, Runner

async def main():
    agent = Agent(
        name="ConversationalAgent",
        instructions="Remember our previous conversation. Be friendly!",
        model="o3-mini",  # any model available in your setup
        memory=True,  # enable the default in-memory SQLite session memory
    )

    # Turn 1: the exchange is saved to memory after the model responds.
    result1 = await Runner.run(agent, "My favorite color is blue.")
    print(result1.final_output)

    # Turn 2: the history from turn 1 is loaded automatically, so the model
    # can answer from memory, e.g. "You said your favorite color is blue."
    result2 = await Runner.run(agent, "What did I say my favorite color was?")
    print(result2.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```

### Custom Memory

For more control, such as using persistent storage (e.g., a different database, file system) or implementing custom history management logic (e.g., summarization, windowing), you can provide your own session memory implementation.

The [`SessionMemory`][agents.memory.SessionMemory] type is a `@runtime_checkable` `typing.Protocol`. Your custom memory class must define all of the protocol's methods (`get_history`, `add_items`, `add_message`, and `clear`) with matching signatures. Because the protocol is `@runtime_checkable`, explicit inheritance from `SessionMemory` is not required for `isinstance` checks to pass, but inheriting is still good practice for clarity and static type checking.
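
The structural-typing mechanism can be seen with a toy protocol (stand-in names below, not the SDK's actual `SessionMemory`): a class that defines the right methods passes `isinstance` without inheriting:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class MemoryProtocol(Protocol):
    """Toy stand-in for a session-memory protocol (illustrative only)."""
    async def get_history(self) -> list: ...
    async def add_items(self, items: list) -> None: ...
    async def clear(self) -> None: ...

class DictBackedMemory:  # note: no inheritance from MemoryProtocol
    def __init__(self):
        self._items = []

    async def get_history(self) -> list:
        return list(self._items)

    async def add_items(self, items: list) -> None:
        self._items.extend(items)

    async def clear(self) -> None:
        self._items.clear()

# The runtime check only verifies that the methods exist; it does not
# validate their signatures, which is why static typing is still useful.
print(isinstance(DictBackedMemory(), MemoryProtocol))  # True
```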

The example below demonstrates creating a custom memory class by inheriting from `SessionMemory`:

```python
import asyncio

from agents import Agent, Runner
from agents.memory import SessionMemory, TResponseInputItem  # adjust imports as necessary

class MyCustomMemory(SessionMemory):
    def __init__(self):
        self.history: list[TResponseInputItem] = []

    async def get_history(self) -> list[TResponseInputItem]:
        # A real implementation might fetch from a database here.
        return list(self.history)  # return a copy

    async def add_items(self, items: list[TResponseInputItem]) -> None:
        # A real implementation might persist to a database here.
        self.history.extend(items)

    async def add_message(self, item: TResponseInputItem) -> None:
        # Convenience wrapper for a single item.
        await self.add_items([item])

    async def clear(self) -> None:
        self.history.clear()

async def main():
    agent = Agent(
        name="CustomMemoryAgent",
        instructions="I have a special memory.",
        model="o3-mini",  # example model
        memory=MyCustomMemory(),  # the agent uses this instance directly
    )

    result1 = await Runner.run(agent, "My name is Bob.")
    print(result1.final_output)

    # The second turn sees the first via MyCustomMemory.get_history().
    result2 = await Runner.run(agent, "What's my name?")
    print(result2.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```


## Forcing tool use

Supplying a list of tools doesn't always mean the LLM will use a tool. You can force tool use by setting [`ModelSettings.tool_choice`][agents.model_settings.ModelSettings.tool_choice]. Valid values are:
33 changes: 33 additions & 0 deletions docs/running_agents.md
@@ -84,6 +84,39 @@ async def main():
# California
```

!!! note "Simplified Conversations with Agent Memory"

    The above example demonstrates manual conversation management. If the agent is configured with memory (e.g., `Agent(..., memory=True)`), the history is managed automatically, and the same conversation looks like this:

    ```python
    async def main_with_memory():
        # Note: the agent is initialized with memory=True
        agent_with_memory = Agent(
            name="Assistant",
            instructions="Reply very concisely. Remember our conversation.",
            memory=True,  # enables automatic memory management
        )

        with trace(workflow_name="ConversationWithMemory", group_id=thread_id):  # assuming thread_id is defined
            # First turn: the agent's memory starts empty.
            result1 = await Runner.run(agent_with_memory, "What city is the Golden Gate Bridge in?")
            print(result1.final_output)
            # San Francisco
            # Memory now holds the question and the assistant's answer.

            # Second turn: Runner.run automatically uses the history
            # stored in agent_with_memory.memory.
            result2 = await Runner.run(agent_with_memory, "What state is it in?")
            print(result2.final_output)
            # California
    ```

    Refer to the [Agent Memory documentation in `agents.md`](agents.md#agent-memory) for more details on configuring memory.


## Exceptions

The SDK raises exceptions in certain cases. The full list is in [`agents.exceptions`][]. As an overview:
15 changes: 15 additions & 0 deletions src/agents/agent.py
@@ -13,6 +13,7 @@
from .handoffs import Handoff
from .items import ItemHelpers
from .logger import logger
from .memory import SessionMemory, SQLiteSessionMemory
from .mcp import MCPUtil
from .model_settings import ModelSettings
from .models.interface import Model
@@ -39,6 +40,12 @@ class ToolsToFinalOutputResult:
`output_type` of the agent.
"""

    memory: bool | SessionMemory | None = field(default=None, repr=False)
    """If True, a default SQLiteSessionMemory is used. If a SessionMemory instance is
    provided, it is used directly. If None or False, no memory is used.
    `repr=False` because the value can be a complex object.
    """


ToolsToFinalOutputFunction: TypeAlias = Callable[
[RunContextWrapper[TContext], list[FunctionToolResult]],
@@ -178,6 +185,14 @@ class Agent(Generic[TContext]):
"""Whether to reset the tool choice to the default value after a tool has been called. Defaults
to True. This ensures that the agent doesn't enter an infinite loop of tool usage."""

    def __post_init__(self):
        if self.memory is True:
            # Default to an in-memory SQLite database for now; this could be made
            # configurable later (e.g., via the Agent constructor or a global config).
            self.memory = SQLiteSessionMemory()
        elif self.memory is False:  # explicitly disabled
            self.memory = None

    def clone(self, **kwargs: Any) -> Agent[TContext]:
        """Make a copy of the agent, with the given arguments changed. For example, you could do:
        ```
117 changes: 117 additions & 0 deletions src/agents/memory.py
@@ -0,0 +1,117 @@
from __future__ import annotations

import json
import sqlite3
import time
from typing import TYPE_CHECKING, Protocol, runtime_checkable

if TYPE_CHECKING:
    from .items import TResponseInputItem


@runtime_checkable
class SessionMemory(Protocol):
    """Protocol for session memory implementations."""

    async def get_history(self) -> list[TResponseInputItem]:
        """Returns the conversation history as a list of input items."""
        ...

    async def add_message(self, item: TResponseInputItem) -> None:
        """Adds a single message/item to the history."""
        ...

    async def add_items(self, items: list[TResponseInputItem]) -> None:
        """Adds a list of items to the history."""
        ...

    async def clear(self) -> None:
        """Clears the entire history."""
        ...


class SQLiteSessionMemory(SessionMemory):
    """A SessionMemory implementation that stores conversation history in an SQLite
    database. Each item is stored as a JSON string.
    """

    def __init__(self, db_path: str | None = None, *, table_name: str = "chat_history"):
        """Initializes the SQLite session memory.

        Args:
            db_path: Path to the SQLite database file. If None, an in-memory
                database is used.
            table_name: The name of the table used to store chat history.
        """
        self.db_path = db_path if db_path else ":memory:"
        self.table_name = table_name
        # A single connection is kept open for the lifetime of this object:
        # separate calls to sqlite3.connect(":memory:") would each create their
        # own independent, empty database.
        self._conn = sqlite3.connect(self.db_path)
        self._init_db()

    def _get_conn(self):
        # Synchronous sqlite3 is acceptable for a simple default; a production
        # async implementation would use something like aiosqlite.
        return self._conn

    def _init_db(self):
        with self._get_conn() as conn:
            cursor = conn.cursor()
            cursor.execute(f"""
                CREATE TABLE IF NOT EXISTS {self.table_name} (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    timestamp REAL NOT NULL,
                    item_json TEXT NOT NULL
                )
            """)
            cursor.execute(
                f"CREATE INDEX IF NOT EXISTS idx_{self.table_name}_timestamp "
                f"ON {self.table_name} (timestamp)"
            )
            conn.commit()

    async def get_history(self) -> list[TResponseInputItem]:
        """Returns the conversation history in insertion order."""
        with self._get_conn() as conn:
            cursor = conn.cursor()
            # `id` breaks ties between rows whose float timestamps are equal,
            # which keeps ordering deterministic within a batch insert.
            cursor.execute(
                f"SELECT item_json FROM {self.table_name} ORDER BY timestamp ASC, id ASC"
            )
            rows = cursor.fetchall()
        history = []
        for row in rows:
            try:
                history.append(json.loads(row[0]))
            except json.JSONDecodeError as e:
                # In a real app, use logging instead of print.
                print(
                    f"Warning: SQLiteSessionMemory - could not decode JSON from "
                    f"database: {row[0]}. Error: {e}"
                )
        return history

    async def add_message(self, item: TResponseInputItem) -> None:
        """Adds a single message/item to the history."""
        await self.add_items([item])

    async def add_items(self, items: list[TResponseInputItem]) -> None:
        """Adds a list of items to the history."""
        current_timestamp = time.time()
        with self._get_conn() as conn:
            cursor = conn.cursor()
            for item in items:
                try:
                    item_json = json.dumps(item)
                except TypeError as e:
                    print(
                        f"Warning: SQLiteSessionMemory - error serializing item to "
                        f"JSON: {item}. Error: {e}"
                    )
                    continue
                cursor.execute(
                    f"INSERT INTO {self.table_name} (timestamp, item_json) VALUES (?, ?)",
                    (current_timestamp, item_json),
                )
            conn.commit()

    async def clear(self) -> None:
        """Clears the entire history from the table."""
        with self._get_conn() as conn:
            conn.execute(f"DELETE FROM {self.table_name}")
            conn.commit()
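
One subtlety with batch inserts: near the current epoch a double's resolution is roughly 2.4e-7 seconds, so sub-microsecond per-item timestamp offsets can collapse to the same stored value. A standalone `sqlite3` sketch (table and column names here are illustrative) showing the integer `id` column as a deterministic tie-breaker:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chat (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " timestamp REAL NOT NULL, item TEXT NOT NULL)"
)

now = time.time()
for item in ["first", "second", "third"]:
    # All three rows share one float timestamp, as happens inside a fast batch.
    conn.execute("INSERT INTO chat (timestamp, item) VALUES (?, ?)", (now, item))

# `id` breaks the timestamp tie, preserving insertion order deterministically.
rows = [r[0] for r in conn.execute("SELECT item FROM chat ORDER BY timestamp ASC, id ASC")]
print(rows)  # ['first', 'second', 'third']
```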
19 changes: 18 additions & 1 deletion src/agents/result.py
@@ -78,7 +78,24 @@ def final_output_as(self, cls: type[T], raise_if_incorrect_type: bool = False) -
return cast(T, self.final_output)

    def to_input_list(self) -> list[TResponseInputItem]:
        """
        Creates a new list of input items representing the sequence of interactions
        for the specific agent run that produced this result. It merges `self.input`
        (the items that initiated this run) with all `self.new_items` (items generated
        during the run, such as messages, tool calls, and tool results).

        This method is useful for:
        - Manually continuing a conversation when the agent runs without session memory.
        - Inspecting the inputs and outputs of a single `Runner.run()` call, even if
          that run was part of a larger conversation managed by built-in agent memory.
        - Extracting a specific segment of a conversation for logging or debugging.

        Note: if the agent has active session memory, this list does NOT include items
        from turns prior to the run that produced this `RunResult`. To get the complete
        history from an agent with memory, access the memory object directly.
        """
        original_items: list[TResponseInputItem] = ItemHelpers.input_to_new_input_list(self.input)
        new_items = [item.to_input_item() for item in self.new_items]
