LLMs don't remember anything. Every API call starts cold. LangChain's memory abstractions work around this by injecting conversation history back into each prompt at call time — but the pattern you pick has real consequences for token cost, latency, and how well context survives long sessions.
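The core mechanic is easy to sketch in plain Python. This is an illustrative stub, not LangChain's API: `call_llm`, `build_prompt`, and `chat` are hypothetical names, and the model client is faked.

```python
# Illustrative sketch: the model is stateless, so the memory layer
# re-injects the stored transcript into every prompt it sends.
history: list[tuple[str, str]] = []  # (role, text) pairs kept client-side

def call_llm(prompt: str) -> str:
    return "ok"  # stand-in for a real API call

def build_prompt(user_msg: str) -> str:
    # Prepend every prior turn so the model "remembers" the session.
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {user_msg}")
    return "\n".join(lines)

def chat(user_msg: str) -> str:
    reply = call_llm(build_prompt(user_msg))  # history travels with each call
    history.append(("user", user_msg))
    history.append(("assistant", reply))
    return reply

chat("hi")
chat("what did I just say?")
print(build_prompt("next"))  # both earlier turns appear verbatim
```

Note the cost implication: because the transcript rides along on every call, token usage grows with each turn, which is exactly the trade-off the patterns below manage differently.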
A March 14 guide on the LoopPass blog covers five patterns in Python, each with a different approach to that trade-off:
- Transcript (ConversationBufferMemory): stores the full verbatim message history. Simple and complete, but token costs compound with every turn. Best suited to short, high-precision tasks.
- Window (ConversationBufferWindowMemory): retains only the last k turns. Token usage stays predictable; anything older is gone.
- Summary (ConversationSummaryMemory): runs a secondary LLM call to maintain a compressed digest of the conversation. Lower overhead for long sessions, but introduces <a href="/news/2026-03-14-context-gateway-llm-compression-proxy">extra latency and model-dependent compression quality</a>.
- Entity (ConversationEntityMemory): extracts structured facts about named people or concepts and surfaces them on each turn. The go-to for personalized assistants that need to recall user-specific details across a session.
- Vector Retrieval: stores history in a vector database like FAISS and uses semantic search to pull only the most relevant past exchanges. No fixed window, no summary artifacts — relevant context surfaces based on what the current query resembles.
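The window pattern's eviction behavior can be shown without any LangChain dependency. This is a toy re-implementation of the idea behind ConversationBufferWindowMemory, not the library's class:

```python
from collections import deque

class WindowMemory:
    """Toy sketch: keep only the last k turns, like a buffer-window memory."""

    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # deque silently evicts older turns

    def save(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def context(self) -> str:
        return "\n".join(f"user: {u}\nassistant: {a}" for u, a in self.turns)

mem = WindowMemory(k=2)
for i in range(5):
    mem.save(f"question {i}", f"answer {i}")

print(mem.context())  # only the last two turns remain; turns 0-2 are gone
```

The appeal is the flat cost curve: context length is bounded by k regardless of session length. The risk is equally visible, since anything evicted is unrecoverable.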
Vector Retrieval is the most architecturally distinct of the five. It treats conversation history as a searchable corpus rather than a linear buffer, which scales better for <a href="/news/2026-03-15-self-evolving-skill-pattern-claude-code-five-gate-governance">multi-session applications</a> and longer-running agents. The guide illustrates each pattern with minimal Python snippets and worked examples, including a password-reset support bot and a personalized coding tutor.
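The retrieval idea reduces to "embed, store, rank by similarity." The sketch below substitutes bag-of-words overlap for real embeddings and a list for a real vector store like FAISS, so it runs on the standard library alone; all names are illustrative:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class VectorMemory:
    """Toy sketch: store past exchanges, retrieve the most similar to a query."""

    def __init__(self):
        self.entries: list[tuple[Counter, str]] = []

    def save(self, exchange: str) -> None:
        self.entries.append((embed(exchange), exchange))

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

mem = VectorMemory()
mem.save("user asked how to reset a forgotten password")
mem.save("user prefers Python examples over JavaScript")
print(mem.retrieve("I forgot my password again"))
```

Only the password exchange surfaces for the password query; the unrelated preference stays out of the prompt. That selectivity, rather than a fixed window or lossy summary, is what makes the pattern scale to long and multi-session histories.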
One thing the guide skips: LangChain has been steering practitioners toward LangGraph for stateful agent design, where memory is managed as explicit graph state rather than injected abstractions. These five patterns remain practical for chatbot-style applications. For autonomous agents with branching logic and complex state requirements, LangGraph is increasingly the default starting point. Right now, knowing both is the baseline.
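The "memory as explicit graph state" shift can be sketched in plain Python. This is an analogy for the LangGraph approach, not its API: each node function receives the full state and returns an updated copy, so conversation history is a visible field rather than a hidden injection layer. All names here are hypothetical.

```python
from typing import Callable

State = dict  # illustrative: real LangGraph uses typed state schemas

def classify(state: State) -> State:
    # Node 1: derive an intent field from the input.
    intent = "reset" if "password" in state["input"].lower() else "other"
    return {**state, "intent": intent}

def respond(state: State) -> State:
    # Node 2: produce a reply and append the turn to explicit history.
    reply = "Sending reset link." if state["intent"] == "reset" else "How can I help?"
    history = state["history"] + [(state["input"], reply)]
    return {**state, "history": history, "output": reply}

def run(nodes: list[Callable[[State], State]], state: State) -> State:
    for node in nodes:  # a real graph adds conditional edges and branching here
        state = node(state)
    return state

final = run([classify, respond], {"input": "I lost my password", "history": []})
print(final["output"])
```

Because the state is an ordinary value passed between nodes, it can be inspected, persisted, or branched on, which is what makes this style the default for agents with complex state requirements.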