LLMs don't remember anything. Every API call starts cold. LangChain's memory abstractions work around this by injecting conversation history back into each prompt at call time — but the pattern you pick has real consequences for token cost, latency, and how well context survives long sessions.
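The core mechanic is easy to sketch in plain Python. This is an illustrative stub, not LangChain's API: `call_llm`, `build_prompt`, and `chat` are hypothetical names, and the model client is faked.

```python
# Illustrative sketch: the model is stateless, so the memory layer
# re-injects the stored transcript into every prompt it sends.
history: list[tuple[str, str]] = []  # (role, text) pairs kept client-side

def call_llm(prompt: str) -> str:
    return "ok"  # stand-in for a real API call

def build_prompt(user_msg: str) -> str:
    # Prepend every prior turn so the model "remembers" the session.
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {user_msg}")
    return "\n".join(lines)

def chat(user_msg: str) -> str:
    reply = call_llm(build_prompt(user_msg))  # history travels with each call
    history.append(("user", user_msg))
    history.append(("assistant", reply))
    return reply

chat("hi")
chat("what did I just say?")
print(build_prompt("next"))  # both earlier turns appear verbatim
```

Note the cost implication: because the transcript rides along on every call, token usage grows with each turn, which is exactly the trade-off the patterns below manage differently.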
A March 14 guide on the LoopPass blog covers five patterns in Python, each with a different approach to that trade-off:
- Transcript (ConversationBufferMemory): stores the full verbatim message history. Simple and complete, but token costs compound with every turn. Best suited to short, high-precision tasks.
- Window (ConversationBufferWindowMemory): retains only the last k turns. Token usage stays predictable; anything older is gone.
- Summary (ConversationSummaryMemory): runs a secondary LLM call to maintain a compressed digest of the conversation. Lower overhead for long sessions, but introduces <a href="/news/2026-03-14-context-gateway-llm-compression-proxy">extra latency and model-dependent compression quality</a>.
- Entity (ConversationEntityMemory): extracts structured facts about named people or concepts and surfaces them on each turn. The go-to for personalized assistants that need to recall user-specific details across a session.
- Vector Retrieval: stores history in a vector database like FAISS and uses semantic search to pull only the most relevant past exchanges. No fixed window, no summary artifacts — relevant context surfaces based on what the current query resembles.
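The window pattern's eviction behavior can be shown without any LangChain dependency. This is a toy re-implementation of the idea behind ConversationBufferWindowMemory, not the library's class:

```python
from collections import deque

class WindowMemory:
    """Toy sketch: keep only the last k turns, like a buffer-window memory."""

    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # deque silently evicts older turns

    def save(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def context(self) -> str:
        return "\n".join(f"user: {u}\nassistant: {a}" for u, a in self.turns)

mem = WindowMemory(k=2)
for i in range(5):
    mem.save(f"question {i}", f"answer {i}")

print(mem.context())  # only the last two turns remain; turns 0-2 are gone
```

The appeal is the flat cost curve: context length is bounded by k regardless of session length. The risk is equally visible, since anything evicted is unrecoverable.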
Vector Retrieval is the most architecturally distinct of the five. It treats conversation history as a searchable corpus rather than a linear buffer, which scales better for <a href="/news/2026-03-15-self-evolving-skill-pattern-claude-code-five-gate-governance">multi-session applications</a> and longer-running agents. The guide illustrates each pattern with minimal Python snippets and worked examples, including a password-reset support bot and a personalized coding tutor.
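The retrieval idea reduces to "embed, store, rank by similarity." The sketch below substitutes bag-of-words overlap for real embeddings and a list for a real vector store like FAISS, so it runs on the standard library alone; all names are illustrative:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class VectorMemory:
    """Toy sketch: store past exchanges, retrieve the most similar to a query."""

    def __init__(self):
        self.entries: list[tuple[Counter, str]] = []

    def save(self, exchange: str) -> None:
        self.entries.append((embed(exchange), exchange))

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

mem = VectorMemory()
mem.save("user asked how to reset a forgotten password")
mem.save("user prefers Python examples over JavaScript")
print(mem.retrieve("I forgot my password again"))
```

Only the password exchange surfaces for the password query; the unrelated preference stays out of the prompt. That selectivity, rather than a fixed window or lossy summary, is what makes the pattern scale to long and multi-session histories.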
One thing the guide skips: LangChain has been steering practitioners toward LangGraph for stateful agent design, where memory is managed as explicit graph state rather than injected abstractions. These five patterns remain practical for chatbot-style applications. For autonomous agents with branching logic and complex state requirements, LangGraph is increasingly the default starting point. Right now, knowing both is the baseline.
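The "memory as explicit graph state" shift can be sketched in plain Python. This is an analogy for the LangGraph approach, not its API: each node function receives the full state and returns an updated copy, so conversation history is a visible field rather than a hidden injection layer. All names here are hypothetical.

```python
from typing import Callable

State = dict  # illustrative: real LangGraph uses typed state schemas

def classify(state: State) -> State:
    # Node 1: derive an intent field from the input.
    intent = "reset" if "password" in state["input"].lower() else "other"
    return {**state, "intent": intent}

def respond(state: State) -> State:
    # Node 2: produce a reply and append the turn to explicit history.
    reply = "Sending reset link." if state["intent"] == "reset" else "How can I help?"
    history = state["history"] + [(state["input"], reply)]
    return {**state, "history": history, "output": reply}

def run(nodes: list[Callable[[State], State]], state: State) -> State:
    for node in nodes:  # a real graph adds conditional edges and branching here
        state = node(state)
    return state

final = run([classify, respond], {"input": "I lost my password", "history": []})
print(final["output"])
```

Because the state is an ordinary value passed between nodes, it can be inspected, persisted, or branched on, which is what makes this style the default for agents with complex state requirements.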