Short-Term Memory
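Short-term memory lets an application remember earlier interactions within a single thread or conversation. Memory matters for agents because it lets them retain context, adjust their behavior based on feedback, and adapt to user preferences.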
Note
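A thread groups the multiple turns of one session, much as an email client groups the messages of a single conversation.
The most common form of short-term memory is conversation history. The catch is that the messages list grows with the conversation and can eventually exceed the model's context window. Even when a model supports a very long context, long inputs tend to cost more, respond more slowly, and leave the model open to distraction by stale or irrelevant content.
Chat models receive context as messages: system, human, AI, and tool messages. Many applications therefore need to actively "forget" stale information or compress older messages into a summary.
To keep information across conversations, such as long-lived user preferences, account details, or historical facts, use long-term memory. Short-term memory is mainly about persisting state and managing context within a single thread.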
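Usage
To give an agent short-term memory, that is, thread-level persistence, specify a checkpointer when you create the agent.
A LangChain agent manages short-term memory as part of its agent state. The most important field in the default state is messages. The checkpointer persists this state to memory or to a database, so later calls with the same thread_id can resume from it.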
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver

# get_user_info is assumed to be a tool defined elsewhere in the application.
agent = create_agent(
    "gpt-5.4",
    tools=[get_user_info],
    checkpointer=InMemorySaver(),  # keeps thread state in process memory
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Hi! My name is Bob."}]},
    {"configurable": {"thread_id": "1"}},  # selects which thread to resume
)
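The key is thread_id. Each thread_id maps to its own session state, which lets the agent keep the context of different users or sessions isolated from one another.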
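To make the role of thread_id concrete, here is a toy sketch in plain Python. It is not LangGraph's checkpointer: the ToyCheckpointer class, its load and save methods, and the toy_invoke helper are hypothetical names invented for illustration. The only idea taken from the text above is that state is saved and restored per thread_id.

```python
from collections import defaultdict


class ToyCheckpointer:
    """Hypothetical in-memory store: one message list per thread_id."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict]] = defaultdict(list)

    def load(self, thread_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate stored state by accident.
        return list(self._threads[thread_id])

    def save(self, thread_id: str, messages: list[dict]) -> None:
        self._threads[thread_id] = list(messages)


def toy_invoke(saver: ToyCheckpointer, thread_id: str, user_text: str) -> list[dict]:
    """Restore the thread's history, append the new turn, and persist it."""
    history = saver.load(thread_id)
    history.append({"role": "user", "content": user_text})
    saver.save(thread_id, history)
    return history


saver = ToyCheckpointer()
toy_invoke(saver, "1", "Hi! My name is Bob.")
thread_1 = toy_invoke(saver, "1", "What's my name?")  # sees the earlier turn
thread_2 = toy_invoke(saver, "2", "What's my name?")  # starts from scratch
```

Thread "1" accumulates both turns, while thread "2" contains only its own message, which is the isolation the text describes.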
In production
Do not rely on a purely in-memory checkpointer in production. Use a database-backed checkpointer instead, for example Postgres:
pip install langgraph-checkpoint-postgres

from langchain.agents import create_agent
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://postgres:postgres@localhost:5432/postgres?sslmode=disable"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # prepare the checkpoint tables before first use
    # get_user_info is assumed to be a tool defined elsewhere.
    agent = create_agent(
        "gpt-5.4",
        tools=[get_user_info],
        checkpointer=checkpointer,
    )
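Besides Postgres, checkpointer implementations are also available for SQLite, Azure Cosmos DB, and other backends.
Customizing agent memory
By default, an agent manages short-term memory with AgentState, which includes the messages field. You can extend AgentState with your own short-term fields, such as user_id, preferences, the current task stage, or counters.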
from langchain.agents import create_agent, AgentState
from langgraph.checkpoint.memory import InMemorySaver


class CustomAgentState(AgentState):
    user_id: str
    preferences: dict


agent = create_agent(
    "gpt-5.4",
    tools=[get_user_info],
    state_schema=CustomAgentState,
    checkpointer=InMemorySaver(),
)

result = agent.invoke(
    {
        "messages": [{"role": "user", "content": "Hello"}],
        "user_id": "user_123",
        "preferences": {"theme": "dark"},
    },
    {"configurable": {"thread_id": "1"}},
)
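Custom state is a good place for information from the current session that later steps need to read. Data that must persist across sessions belongs in a Store or in long-term memory.
Common patterns
Once short-term memory is enabled, a long conversation can exceed the model's context window. Common ways to handle this are:
- Trim messages: before calling the model, keep only the most recent N messages, or trim by token count.
- Delete messages: permanently remove specific messages, or all messages, from LangGraph state.
- Summarize messages: condense earlier messages into a summary and use it in place of the full history.
- Custom strategies: apply your own filtering, grouping, compression, or importance-based selection.
All of these strategies aim to keep the key information while avoiding context window overflow.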
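The "trim by token count" option in the list above can be sketched in plain Python. This is a minimal illustration, not a LangChain API: estimate_tokens and trim_to_budget are hypothetical helpers, messages are plain dicts, and the word count is only a crude stand-in for a real tokenizer.

```python
def estimate_tokens(message: dict) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(message["content"].split())


def trim_to_budget(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the newest messages whose combined estimate fits max_tokens."""
    kept: list[dict] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "hi my name is bob"},
    {"role": "assistant", "content": "hello bob"},
    {"role": "user", "content": "write a short poem about cats"},
    {"role": "assistant", "content": "cats nap in sunbeams"},
]
trimmed = trim_to_budget(history, max_tokens=10)
```

With a budget of 10 estimated tokens, only the last two messages survive. A real application would swap in the tokenizer that matches its model, since word counts can differ noticeably from actual token counts.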
Trim messages
You can trim the message history in a @before_model middleware. The example below keeps the first message plus the most recent few.
from typing import Any

from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig


@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last few messages to fit context window."""
    messages = state["messages"]

    if len(messages) <= 3:
        return None

    first_msg = messages[0]
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages,
        ]
    }


agent = create_agent(
    your_model_here,
    tools=your_tools_here,
    middleware=[trim_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

agent.invoke({"messages": "hi, my name is bob"}, config)
agent.invoke({"messages": "write a short poem about cats"}, config)
agent.invoke({"messages": "now do the same but for dogs"}, config)
final_response = agent.invoke({"messages": "what's my name?"}, config)
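When trimming, make sure the resulting message history is still valid. For example, some providers require the history to start with a user message, and an assistant message that contains tool calls must be followed by the corresponding tool results.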
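One way to guard against the first of these problems is to repair the start of a trimmed window before sending it to the model. The sketch below is an assumption-laden illustration in plain Python: drop_invalid_prefix is a hypothetical helper, messages are plain dicts with a role key, and it covers only the case where trimming leaves a tool result or assistant message at the front of the window.

```python
def drop_invalid_prefix(messages: list[dict]) -> list[dict]:
    """Drop leading messages until the history starts with a user message."""
    for index, message in enumerate(messages):
        if message["role"] == "user":
            return messages[index:]
    # No user message at all: nothing valid to keep under this rule.
    return []


# A window cut in the middle of a tool exchange: the tool result has lost
# the assistant tool call that it answers.
window = [
    {"role": "tool", "content": "72F", "tool_call_id": "call_1"},
    {"role": "assistant", "content": "It is 72F."},
    {"role": "user", "content": "thanks, and tomorrow?"},
    {"role": "assistant", "content": "Sunny again."},
]
repaired = drop_invalid_prefix(window)
```

The repaired window starts at the user message, so the orphaned tool result is gone. The cost is that the assistant reply before it is also dropped, which is usually preferable to sending a history the provider rejects.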
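Delete messages
You can use RemoveMessage to delete messages from graph state. The messages field of the default AgentState already uses the add_messages reducer, so it supports message deletion.
To delete specific messages: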
from langchain.messages import RemoveMessage


def delete_messages(state):
    messages = state["messages"]
    if len(messages) > 2:
        # Remove the two earliest messages.
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
To delete all messages:
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES


def delete_messages(state):
    return {"messages": [RemoveMessage(id=REMOVE_ALL_MESSAGES)]}
You can also delete old messages in an @after_model middleware:
from langchain.messages import RemoveMessage
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.runtime import Runtime


@after_model
def delete_old_messages(state: AgentState, runtime: Runtime) -> dict | None:
    """Remove old messages to keep conversation manageable."""
    messages = state["messages"]
    if len(messages) > 2:
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
    return None


agent = create_agent(
    "gpt-5-nano",
    tools=[],
    system_prompt="Please be concise and to the point.",
    middleware=[delete_old_messages],
    checkpointer=InMemorySaver(),
)
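Deleting messages discards information, so it is best suited to clearing out context that is stale, useless, or should no longer be retained.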
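To build intuition for why returning removal markers deletes messages, the following toy reducer mimics the general idea in plain Python. It is not the add_messages implementation: toy_reducer, the remove flag, and the REMOVE_ALL sentinel are hypothetical stand-ins for RemoveMessage and REMOVE_ALL_MESSAGES, and real behavior may differ in details.

```python
REMOVE_ALL = "__remove_all__"  # hypothetical sentinel for this sketch


def toy_reducer(current: list[dict], updates: list[dict]) -> list[dict]:
    """Merge updates into current: append, replace by id, or remove by id."""
    merged = list(current)
    for update in updates:
        if update.get("remove"):
            if update["id"] == REMOVE_ALL:
                merged = []  # wipe everything seen so far
            else:
                merged = [m for m in merged if m["id"] != update["id"]]
            continue
        for i, existing in enumerate(merged):
            if existing["id"] == update["id"]:
                merged[i] = update  # same id: replace in place
                break
        else:
            merged.append(update)  # new id: append
    return merged


state = [
    {"id": "1", "content": "a"},
    {"id": "2", "content": "b"},
    {"id": "3", "content": "c"},
]
after_delete = toy_reducer(state, [{"id": "1", "remove": True}, {"id": "2", "remove": True}])
after_reset = toy_reducer(state, [{"id": REMOVE_ALL, "remove": True}, {"id": "4", "content": "d"}])
```

The first call mirrors deleting the two earliest messages. The second mirrors the trimming pattern of clearing the history and then re-adding the messages to keep.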
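Summarize messages
Trimming and deleting both discard information outright. A finer-grained approach is to condense earlier messages into a summary and use that summary in place of the full history. LangChain provides a built-in SummarizationMiddleware.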
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.runnables import RunnableConfig

checkpointer = InMemorySaver()

agent = create_agent(
    model="gpt-5.4",
    tools=[],
    middleware=[
        SummarizationMiddleware(
            model="gpt-5.4-mini",
            trigger=("tokens", 4000),
            keep=("messages", 20),
        )
    ],
    checkpointer=checkpointer,
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

agent.invoke({"messages": "hi, my name is bob"}, config)
agent.invoke({"messages": "write a short poem about cats"}, config)
agent.invoke({"messages": "now do the same but for dogs"}, config)
final_response = agent.invoke({"messages": "what's my name?"}, config)
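Summarization suits long conversations, but the quality of the summary affects later answers. Important facts are better stored in structured form in state or long-term memory than left to a natural-language summary alone.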
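The trigger-and-keep idea behind summarization can be sketched without any library. This is a simplified illustration, not how SummarizationMiddleware is implemented: compact_history and fake_summarize are hypothetical names, the trigger here counts messages rather than tokens, and fake_summarize stands in for a model call.

```python
from typing import Callable


def compact_history(
    messages: list[dict],
    summarize: Callable[[list[dict]], str],
    trigger: int,
    keep: int,
) -> list[dict]:
    """Replace all but the last `keep` messages with one summary message."""
    if len(messages) <= trigger or keep >= len(messages):
        return messages  # below the trigger, or nothing old enough to fold
    split = len(messages) - keep
    older, recent = messages[:split], messages[split:]
    summary = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summarize(older),
    }
    return [summary] + recent


def fake_summarize(messages: list[dict]) -> str:
    """Placeholder for a model call: joins the user turns verbatim."""
    return "; ".join(m["content"] for m in messages if m["role"] == "user")


history = [
    {"role": "user", "content": "hi, my name is bob"},
    {"role": "assistant", "content": "Hello Bob!"},
    {"role": "user", "content": "write a short poem about cats"},
    {"role": "assistant", "content": "Cats nap in sunbeams."},
    {"role": "user", "content": "what's my name?"},
]
compacted = compact_history(history, fake_summarize, trigger=4, keep=2)
```

The first three messages collapse into a single summary message and the last two stay verbatim. Whether the user's name survives depends entirely on the summarizer, which is why the text recommends storing important facts in structured state as well.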
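Accessing memory
Short-term memory is, in essence, agent state. You can read or modify it in tools, prompt middleware, before_model, and after_model.
Read short-term memory in a tool
A tool can read state through ToolRuntime. The runtime parameter is hidden from the model and does not appear in the tool schema.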
from langchain.agents import create_agent, AgentState
from langchain.tools import tool, ToolRuntime


class CustomState(AgentState):
    user_id: str


@tool
def get_user_info(runtime: ToolRuntime) -> str:
    """Look up user info."""
    user_id = runtime.state["user_id"]
    return "User is John Smith" if user_id == "user_123" else "Unknown user"


agent = create_agent(
    model="gpt-5-nano",
    tools=[get_user_info],
    state_schema=CustomState,
)

result = agent.invoke({
    "messages": "look up user information",
    "user_id": "user_123",
})
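Write short-term memory from a tool
A tool can return a Command to update state. This is useful for saving intermediate results, or for making new information available to later tools and prompts.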
from pydantic import BaseModel
from langchain.tools import tool, ToolRuntime
from langchain.messages import ToolMessage
from langchain.agents import create_agent, AgentState
from langgraph.types import Command


class CustomState(AgentState):
    user_name: str


class CustomContext(BaseModel):
    user_id: str


@tool
def update_user_info(runtime: ToolRuntime[CustomContext, CustomState]) -> Command:
    """Look up and update user info."""
    user_id = runtime.context.user_id
    name = "John Smith" if user_id == "user_123" else "Unknown user"
    return Command(update={
        "user_name": name,
        "messages": [
            ToolMessage(
                "Successfully looked up user information",
                tool_call_id=runtime.tool_call_id,
            )
        ],
    })


@tool
def greet(runtime: ToolRuntime[CustomContext, CustomState]) -> str | Command:
    """Use this to greet the user once you found their info."""
    user_name = runtime.state.get("user_name", None)
    if user_name is None:
        return Command(update={
            "messages": [
                ToolMessage(
                    "Please call the 'update_user_info' tool first.",
                    tool_call_id=runtime.tool_call_id,
                )
            ]
        })
    return f"Hello {user_name}!"


agent = create_agent(
    model="gpt-5-nano",
    tools=[update_user_info, greet],
    state_schema=CustomState,
    context_schema=CustomContext,
)
Access short-term memory in a prompt
You can use dynamic_prompt middleware to generate the system prompt dynamically from the runtime context or state.
from typing import TypedDict

from langchain.agents import create_agent
from langchain.agents.middleware import dynamic_prompt, ModelRequest


class CustomContext(TypedDict):
    user_name: str


def get_weather(city: str) -> str:
    """Get the weather in a city."""
    return f"The weather in {city} is always sunny!"


@dynamic_prompt
def dynamic_system_prompt(request: ModelRequest) -> str:
    user_name = request.runtime.context["user_name"]
    return f"You are a helpful assistant. Address the user as {user_name}."


agent = create_agent(
    model="gpt-5-nano",
    tools=[get_weather],
    middleware=[dynamic_system_prompt],
    context_schema=CustomContext,
)
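The example above reads user_name from the runtime context. The prompt-building logic itself can be kept as a pure function over state-like data, which makes it easy to test in isolation. The sketch below assumes a plain dict standing in for agent state; build_system_prompt is a hypothetical helper, and wiring it into a dynamic_prompt middleware is not shown here.

```python
def build_system_prompt(state: dict) -> str:
    """Compose a system prompt from optional short-term state fields."""
    prompt = "You are a helpful assistant."

    user_name = state.get("user_name")
    if user_name:
        prompt += f" Address the user as {user_name}."

    preferences = state.get("preferences") or {}
    if preferences:
        # Sort keys so the prompt is stable across calls.
        listed = ", ".join(f"{k}={v}" for k, v in sorted(preferences.items()))
        prompt += f" Known preferences: {listed}."

    return prompt


full_prompt = build_system_prompt(
    {"user_name": "Bob", "preferences": {"theme": "dark"}}
)
bare_prompt = build_system_prompt({})
```

Falling back to a generic prompt when a field is missing avoids a KeyError on the first turn of a thread, before any tool has written user_name into state.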
Before Model
@before_model middleware runs before the model is called. It suits message trimming, context injection, sensitive-information filtering, and dynamic prompt preparation.
%%{
init: {
"fontFamily": "monospace",
"flowchart": {
"curve": "basis"
}
}
}%%
graph TD
S(["\_\_start\_\_"])
PRE(before_model)
MODEL(model)
TOOLS(tools)
END(["\_\_end\_\_"])
S --> PRE
PRE --> MODEL
MODEL -.-> TOOLS
MODEL -.-> END
TOOLS --> PRE
classDef blueHighlight fill:#E5F4FF,stroke:#006DDD,color:#030710;
classDef neutral fill:#F2FAFF,stroke:#40668D,stroke-width:2px,color:#2F4B68;
class S blueHighlight;
class END blueHighlight;
class PRE,MODEL,TOOLS neutral;
from typing import Any

from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langgraph.runtime import Runtime


@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last few messages to fit context window."""
    messages = state["messages"]

    if len(messages) <= 3:
        return None

    first_msg = messages[0]
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages,
        ]
    }


agent = create_agent(
    "gpt-5-nano",
    tools=[],
    middleware=[trim_messages],
    checkpointer=InMemorySaver(),
)
After Model
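@after_model middleware runs after the model is called. It suits response validation, message deletion, sensitive-content filtering, and state updates based on the model's output.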
%%{
init: {
"fontFamily": "monospace",
"flowchart": {
"curve": "basis"
}
}
}%%
graph TD
S(["\_\_start\_\_"])
MODEL(model)
POST(after_model)
TOOLS(tools)
END(["\_\_end\_\_"])
S --> MODEL
MODEL --> POST
POST -.-> END
POST -.-> TOOLS
TOOLS --> MODEL
classDef blueHighlight fill:#E5F4FF,stroke:#006DDD,color:#030710;
classDef greenHighlight fill:#F6FFDB,stroke:#6E8900,color:#2E3900;
classDef neutral fill:#F2FAFF,stroke:#40668D,stroke-width:2px,color:#2F4B68;
class S blueHighlight;
class END blueHighlight;
class POST greenHighlight;
class MODEL,TOOLS neutral;
from langchain.messages import RemoveMessage
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.runtime import Runtime


@after_model
def validate_response(state: AgentState, runtime: Runtime) -> dict | None:
    """Remove messages containing sensitive words."""
    STOP_WORDS = ["password", "secret"]
    last_message = state["messages"][-1]
    if any(word in last_message.content for word in STOP_WORDS):
        return {"messages": [RemoveMessage(id=last_message.id)]}
    return None


agent = create_agent(
    model="gpt-5-nano",
    tools=[],
    middleware=[validate_response],
    checkpointer=InMemorySaver(),
)
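Summary
Short-term memory handles the persistence and management of context within a single thread. It saves agent state through a checkpointer and restores a session by thread_id. As conversations grow, combine trimming, deletion, summarization, and middleware to manage the context window. For a real agent, short-term memory covers more than chat history: it also includes the state schema, tool reads and writes, dynamic prompts, and message lifecycle management.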