How to add message history
This guide assumes familiarity with the following concepts:
This guide previously covered the RunnableWithMessageHistory abstraction. You can access this version of the guide in the v0.2 docs.
As of the v0.3 release of LangChain, we recommend that LangChain users take advantage of LangGraph persistence to incorporate memory
into new LangChain applications.
If your code is already relying on RunnableWithMessageHistory
or BaseChatMessageHistory
, you do not need to make any changes. We do not plan on deprecating this functionality in the near future as it works for simple chat applications and any code that uses RunnableWithMessageHistory
will continue to work as expected.
Please see How to migrate to LangGraph Memory for more details.
Passing conversation state into and out a chain is vital when building a chatbot. LangGraph implements a built-in persistence layer, allowing chain states to be automatically persisted in memory, or external backends such as SQLite, Postgres or Redis. Details can be found in the LangGraph persistence documentation.
In this guide we demonstrate how to add persistence to arbitrary LangChain runnables by wrapping them in a minimal LangGraph application. This lets us persist the message history and other elements of the chain's state, simplifying the development of multi-turn applications. It also supports multiple threads, enabling a single application to interact separately with multiple users.
Setupโ
Let's initialize a chat model:
- OpenAI
- Anthropic
- Azure
- Cohere
- NVIDIA
- FireworksAI
- Groq
- MistralAI
- TogetherAI
pip install -qU langchain-openai
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
pip install -qU langchain-anthropic
import getpass
import os
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-3-5-sonnet-20240620")
pip install -qU langchain-openai
import getpass
import os
os.environ["AZURE_OPENAI_API_KEY"] = getpass.getpass()
from langchain_openai import AzureChatOpenAI
llm = AzureChatOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
pip install -qU langchain-google-vertexai
import getpass
import os
os.environ["GOOGLE_API_KEY"] = getpass.getpass()
from langchain_google_vertexai import ChatVertexAI
llm = ChatVertexAI(model="gemini-1.5-flash")
pip install -qU langchain-cohere
import getpass
import os
os.environ["COHERE_API_KEY"] = getpass.getpass()
from langchain_cohere import ChatCohere
llm = ChatCohere(model="command-r-plus")
pip install -qU langchain-nvidia-ai-endpoints
import getpass
import os
os.environ["NVIDIA_API_KEY"] = getpass.getpass()
from langchain import ChatNVIDIA
llm = ChatNVIDIA(model="meta/llama3-70b-instruct")
pip install -qU langchain-fireworks
import getpass
import os
os.environ["FIREWORKS_API_KEY"] = getpass.getpass()
from langchain_fireworks import ChatFireworks
llm = ChatFireworks(model="accounts/fireworks/models/llama-v3p1-70b-instruct")
pip install -qU langchain-groq
import getpass
import os
os.environ["GROQ_API_KEY"] = getpass.getpass()
from langchain_groq import ChatGroq
llm = ChatGroq(model="llama3-8b-8192")
pip install -qU langchain-mistralai
import getpass
import os
os.environ["MISTRAL_API_KEY"] = getpass.getpass()
from langchain_mistralai import ChatMistralAI
llm = ChatMistralAI(model="mistral-large-latest")
pip install -qU langchain-openai
import getpass
import os
os.environ["TOGETHER_API_KEY"] = getpass.getpass()
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
base_url="https://api.together.xyz/v1",
api_key=os.environ["TOGETHER_API_KEY"],
model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)
Example: message inputsโ
Adding memory to a chat model provides a simple example. Chat models accept a list of messages as input and output a message. LangGraph includes a built-in MessagesState
that we can use for this purpose.
Below, we:
- Define the graph state to be a list of messages;
- Add a single node to the graph that calls a chat model;
- Compile the graph with an in-memory checkpointer to store messages between runs.
The output of a LangGraph application is its state. This can be any Python type, but in this context it will typically be a TypedDict
that matches the schema of your runnable.
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
# Define a new graph
workflow = StateGraph(state_schema=MessagesState)
# Define the function that calls the model
def call_model(state: MessagesState):
response = llm.invoke(state["messages"])
# Update message history with response:
return {"messages": response}
# Define the (single) node in the graph
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)
# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
When we run the application, we pass in a configuration dict
that specifies a thread_id
. This ID is used to distinguish conversational threads (e.g., between different users).
config = {"configurable": {"thread_id": "abc123"}}
We can then invoke the application:
query = "Hi! I'm Bob."
input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print() # output contains all messages in state
==================================[1m Ai Message [0m==================================
It's nice to meet you, Bob! I'm Claude, an AI assistant created by Anthropic. How can I help you today?
query = "What's my name?"
input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()
==================================[1m Ai Message [0m==================================
Your name is Bob, as you introduced yourself at the beginning of our conversation.
Note that states are separated for different threads. If we issue the same query to a thread with a new thread_id
, the model indicates that it does not know the answer:
query = "What's my name?"
config = {"configurable": {"thread_id": "abc234"}}
input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()
==================================[1m Ai Message [0m==================================
I'm afraid I don't actually know your name. As an AI assistant, I don't have personal information about you unless you provide it to me directly.
Example: dictionary inputsโ
LangChain runnables often accept multiple inputs via separate keys in a single dict
argument. A common example is a prompt template with multiple parameters.
Whereas before our runnable was a chat model, here we chain together a prompt template and chat model.
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
[
("system", "Answer in {language}."),
MessagesPlaceholder(variable_name="messages"),
]
)
runnable = prompt | llm
For this scenario, we define the graph state to include these parameters (in addition to the message history). We then define a single-node graph in the same way as before.
Note that in the below state:
- Updates to the
messages
list will append messages; - Updates to the
language
string will overwrite the string.
from typing import Sequence
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict
class State(TypedDict):
messages: Annotated[Sequence[BaseMessage], add_messages]
language: str
workflow = StateGraph(state_schema=State)
def call_model(state: State):
response = runnable.invoke(state)
# Update message history with response:
return {"messages": [response]}
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
config = {"configurable": {"thread_id": "abc345"}}
input_dict = {
"messages": [HumanMessage("Hi, I'm Bob.")],
"language": "Spanish",
}
output = app.invoke(input_dict, config)
output["messages"][-1].pretty_print()
==================================[1m Ai Message [0m==================================
ยกHola, Bob! Es un placer conocerte.
Managing message historyโ
The message history (and other elements of the application state) can be accessed via .get_state
:
state = app.get_state(config).values
print(f'Language: {state["language"]}')
for message in state["messages"]:
message.pretty_print()
Language: Spanish
================================[1m Human Message [0m=================================
Hi, I'm Bob.
==================================[1m Ai Message [0m==================================
ยกHola, Bob! Es un placer conocerte.
We can also update the state via .update_state
. For example, we can manually append a new message:
from langchain_core.messages import HumanMessage
_ = app.update_state(config, {"messages": [HumanMessage("Test")]})
state = app.get_state(config).values
print(f'Language: {state["language"]}')
for message in state["messages"]:
message.pretty_print()
Language: Spanish
================================[1m Human Message [0m=================================
Hi, I'm Bob.
==================================[1m Ai Message [0m==================================
ยกHola, Bob! Es un placer conocerte.
================================[1m Human Message [0m=================================
Test
For details on managing state, including deleting messages, see the LangGraph documentation: