LangGraph

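LangGraph is a low-level agent orchestration framework and runtime for long-running, stateful agents and workflows that need reliable control. Its focus is orchestration: how to keep an agent executing durably through complex tasks, recover its state, stream output, bring in human review, and manage both short-term and long-term memory.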

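Relationship to LangChain

The relationship to LangChain can be summarized as follows:

LangChain: provides higher-level abstractions such as models, tools, and the agent loop; suited to building common agents quickly.
LangGraph: provides low-level orchestration; suited to custom complex state machines, deterministic flows, human intervention, and long-running tasks.
Deep Agents: provides a more complete agent harness on top of LangGraph, for example planning, sub-agents, file system tools, and context management.
LangSmith: handles tracing, debugging, evaluation, prompts, and deployment.

If you are just starting to build agents, it is usually best to begin with LangChain's create_agent. Move to LangGraph when you need finer control over nodes, edges, state, recovery, streaming, or human-in-the-loop.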

Installation

With pip:

pip install -U langgraph

With uv:

uv add langgraph

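The core of LangGraph is the graph: nodes represent execution steps, edges describe how control flows between steps, and state is the data that is passed between nodes and updated along the way.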

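from langgraph.graph import StateGraph, MessagesState, START, END

# A node: receives the current state and returns a partial state update.
def mock_llm(state: MessagesState):
    return {"messages": [{"role": "ai", "content": "hello world"}]}

# Build the graph: START -> mock_llm -> END.
graph = StateGraph(MessagesState)
graph.add_node(mock_llm)  # the node name defaults to the function name
graph.add_edge(START, "mock_llm")
graph.add_edge("mock_llm", END)
graph = graph.compile()  # turn the definition into an executable graph

# Run the graph with an initial user message.
graph.invoke({"messages": [{"role": "user", "content": "hi!"}]})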

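In this example:

MessagesState is the graph's state schema; it holds the list of messages.
mock_llm is a node; it receives the state and returns a state update.
START and END are the entry and exit points of the graph.
compile() turns the defined graph into an executable object.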

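Core capabilities

LangGraph does not abstract prompts or agent architecture. Instead, it provides the low-level capabilities that long-running, stateful workflows need.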

Durable Execution

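Durable execution lets an agent resume after a failure, an interruption, or a long pause. This matters for agents that run many steps, call external systems, or wait for human feedback.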

Human-in-the-loop

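LangGraph can interrupt at any node or tool call so that a human can inspect, modify, or approve the agent's state and actions. This suits high-risk operations such as sending email, modifying a database, executing trades, or calling production systems.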

Memory

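LangGraph supports stateful agents: it can manage short-term memory for the current conversation and can also connect to long-term memory that spans sessions. State is a first-class citizen of the graph, and nodes exchange information through it.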

Streaming

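LangGraph can stream state changes, LLM tokens, custom events, and subgraph output while the agent runs. The application layer can then show what the agent is doing in real time instead of waiting for the final result.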

Debugging with LangSmith

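LangGraph works with LangSmith for tracing, debugging, and evaluation. The behavior of a complex agent usually spans many nodes and tools, and LangSmith helps you observe execution paths, state changes, and runtime metrics.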

Production Deployment

LangGraph is suited to deploying long-running, stateful agent workflows. Combined with LangSmith / LangGraph Server, it offers more complete deployment, debugging, and observability.

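When to use it

LangGraph is a good fit when:

The agent flow is more than a simple LLM + tool loop.
You need explicit nodes, edges, and state transitions.
You need durable execution and recovery from failure.
You need human-in-the-loop approval or editing.
You need long-running, multi-step, resumable workflows.
You need to combine deterministic logic with agentic decisions.
You need finer-grained control over streaming, memory, and subgraphs.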

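If the requirement is simply "the model calls tools based on the user's question and answers", a LangChain agent is usually simpler. If the requirement is "design a reliable, resumable, auditable complex agent system", LangGraph is the better fit.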

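Ecosystem

LangGraph can be used on its own or together with the LangChain ecosystem:

LangChain provides model, tool, and agent abstractions; many LangChain agents are built on LangGraph under the hood.
LangSmith Observability is used to trace, evaluate, and monitor agents.
LangSmith Deployment is used to deploy and scale long-running, stateful agent workflows.
Deep Agents provides a higher-level agent harness on top of LangGraph.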

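Summary

LangGraph is positioned not as "a simpler way to write agents" but as "a more controllable way to orchestrate agents". It breaks agent execution down into graph, node, edge, and state, so developers can control the flow explicitly, persist state, handle interruptions, bring in human review, and support streaming output. For complex agent engineering, LangGraph is a more reliable low-level runtime than prompt + tool calling alone.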