April 28, 2025

Building Organizational Memory with Zep: A Developer’s Guide

In today’s AI-driven landscape, creating systems with long-term memory capabilities has become increasingly important. If you’re building a customer service chatbot, for example, remembering interactions with its users and maintaining history longer than the context window are crucial parts of a successful system. While many components go into building a fully functional long-term memory solution, Zep can be a powerful tool to help developers implement long-term memory in their AI applications.

What is Zep?

Zep is an API-based solution that enables developers to store structured and unstructured data as well as memories from various conversations. It provides a way to store, retrieve, and contextualize conversation history and general knowledge, helping LLMs maintain context over extended periods.

Key Features and Architecture

Zep’s memory system is organized into a graph where all data is stored and maintained. The graph data is synthesized and managed by Graphiti, Zep’s underlying memory storage solution. Graphiti builds a knowledge graph from the data: a network of interconnected facts, such as “Customer John wants to close his account.” Each fact contains a “triplet” made up of two entities, or nodes (“John”, “Account”), and the relationship between them, also known as an edge (“close”). Zep’s API is a wrapper on top of Graphiti, so you don’t need to understand the underlying technology to use it.
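
To make the triplet idea concrete, here is a minimal sketch of how a single fact could be modeled in plain Python. The `Fact` class and its field names are illustrative assumptions, not Graphiti's or Zep's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One knowledge-graph fact: two entity nodes joined by a relationship edge.

    Hypothetical illustration only; not Graphiti's real schema.
    """
    source: str    # entity node the fact is about
    relation: str  # edge describing the relationship
    target: str    # entity node on the other end of the edge
    text: str      # the fact written in natural language


fact = Fact(
    source="John",
    relation="close",
    target="Account",
    text="Customer John wants to close his account.",
)

# Collect the distinct nodes and the edges implied by a list of facts
facts = [fact]
nodes = {f.source for f in facts} | {f.target for f in facts}
edges = [(f.source, f.relation, f.target) for f in facts]
```

As more facts are added, entities that appear in several facts become shared nodes, which is what turns a flat list of statements into a connected graph.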

Depending on your needs, Zep offers two main ways to store, manage, and retrieve your data. Let’s look at how you can store various types of information in your knowledge graph:

1. User Data

Many AI-integrated applications have some concept of a user or individual interacting with the platform. Often you will want to remember specific information pertaining to that user without mixing it with information about other users. Some data may be private, or simply confusing when combined with another user’s graph, so keeping it separate and accessing it only when needed helps avoid confusion. This is where user data comes into play. User data represents individual interactions with your application. Each user has their own segmented graph with sessions that track conversation progression over time. This creates a personalized memory context for each user.

from zep_cloud.client import Zep
from zep_cloud.types import Message

# Initialize the Zep client
client = Zep(api_key="your_api_key")

# Add a new session for this user
session = client.memory.add_session(
    user_id="user_123",
    session_id="session_456"
)

# Add messages to the session
messages = [
    Message(
        role_type="user",
        content="response"
    )
]

# The session already belongs to user_123, so only the session ID is passed here
client.memory.add(
    session_id="session_456",
    messages=messages
)

# Search the user's graph
memory = client.graph.search(
    query="query",
    user_id="user_123"
)

2. Group Data

Sometimes you may want to store general information about your product or organization in the graph to add further context when sending information to your LLM. Group data allows shared context across multiple users, which avoids duplicating the data in each user’s memory graph. Each group, like each user, has its own memory graph, so you can organize memory into separate groups for different needs (“group_product_info”, “group_employee_names”, etc.). However, because of how the knowledge graph is built and indexed, this feature is probably best suited to static data or information that rarely changes (more on that later).

# Create a group
client.group.add(group_id="group_id")

# Add data to the group
client.graph.add(
    group_id="group_id",
    type="text",
    data="data"
)
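
One way to picture the split between user and group graphs is as two separate stores that are consulted together when building context. The sketch below uses plain dictionaries as stand-ins for the two kinds of graph. The names `user_graphs`, `group_graphs`, and `collect_facts` are hypothetical, the sample data is made up for illustration, and nothing here calls the Zep API.

```python
# Plain-dict stand-ins for per-user graphs and shared group graphs (sample data)
user_graphs = {
    "user_123": ["John wants to close his account."],
    "user_789": ["Maria asked about upgrading her plan."],
}
group_graphs = {
    "group_product_info": ["The Basic plan includes email support."],
}


def collect_facts(user_id, group_ids):
    """Return the user's private facts followed by facts from the shared groups."""
    facts = list(user_graphs.get(user_id, []))
    for group_id in group_ids:
        facts.extend(group_graphs.get(group_id, []))
    return facts


john_context = collect_facts("user_123", ["group_product_info"])
maria_context = collect_facts("user_789", ["group_product_info"])
```

Both users see the shared product fact, but neither sees the other's private facts, and the product fact is stored only once.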

Integrating Zep into Your LLM Application

Now that we understand the basic structure of how data is organized in Zep, let’s explore the areas where this tool could be used in your applications.

Long-Term Conversation History

Zep excels at maintaining comprehensive chat histories between users and AI. The API allows you to perform several actions on the graph depending on your needs. Below are some of the ways you can interact with the session objects stored in the user graph:

Pull individual sessions and their respective chat messages.
Run natural language queries against specific sessions to pull out contextual information.
Read each session’s context string, which is updated as the conversation evolves.

When storing and retrieving data from your knowledge graph, it’s important to understand the ingestion process: the phase where data is indexed into the graph for future use. This process uses LLMs on Zep’s side to generate the knowledge graph, which can be time-consuming depending on the type and volume of uploaded data. While there are strategies to mitigate these delays, integrating complementary data stores alongside Zep can improve the quality and responsiveness of your AI interactions.
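
Because ingestion takes time, the context Zep returns may lag behind the latest turns of a conversation. One possible mitigation, sketched here under the assumption that your application keeps its own short buffer of recent messages, is to append those messages to whatever context Zep returns. `RecentMessageBuffer` and `build_prompt_context` are hypothetical helpers, not part of the Zep SDK.

```python
from collections import deque


class RecentMessageBuffer:
    """Keeps the last few messages locally so they are available immediately."""

    def __init__(self, max_messages=6):
        # A deque with maxlen drops the oldest message once the buffer is full
        self._messages = deque(maxlen=max_messages)

    def add(self, role, content):
        self._messages.append((role, content))

    def render(self):
        return "\n".join(f"{role}: {content}" for role, content in self._messages)


def build_prompt_context(zep_context, buffer):
    """Combine the (possibly stale) Zep context with the local recent messages."""
    parts = []
    if zep_context:
        parts.append(f"Long-term context:\n{zep_context}")
    recent = buffer.render()
    if recent:
        parts.append(f"Most recent messages:\n{recent}")
    return "\n\n".join(parts)


buffer = RecentMessageBuffer(max_messages=2)
buffer.add("user", "Hi")
buffer.add("assistant", "Hello, how can I help?")
buffer.add("user", "I want to close my account.")

# In a real application the first argument would be the context string from Zep
combined = build_prompt_context("John is a customer.", buffer)
```

The buffer size is a trade-off: a larger buffer covers longer ingestion delays but duplicates more of what Zep will eventually return.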

# Retrieve recent context for a user
recent_context = client.memory.get(
    session_id="last_session_id"
).context

# The user's latest message (placeholder value)
current_query = "query"

# Incorporate this context into your LLM prompt
llm_prompt = f"""
User history context:
{recent_context}

Now respond to the current query: {current_query}
"""

Memory Segmentation

The API’s architecture allows you to segment memories by group, making it easier to retrieve relevant context. This works best for gathering general, high-level context from a group’s knowledge graph. The data stored within groups should ideally be static, or at least not require very specific knowledge about your organization. For more granular information, such as research documents, you would need to integrate another data retrieval mechanism to supply context to your LLM. It’s also important to keep query size in mind, as it can affect how quickly Zep returns results.

# Search for specific memories related to a topic
search_results = client.graph.search(
    group_id="department_roles",
    query="Who is the head of engineering?",
    limit=1
)



def get_memory_facts(edges):
    """Return the facts from the matching edges, one per line."""
    if edges:
        return "\n".join(edge.fact for edge in edges)
    return "None"


facts = get_memory_facts(search_results.edges)
print(f"facts:\n{facts}")
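
Since query size can affect retrieval performance, it may help to normalize and cap queries before sending them to the search endpoint. The helper below is a minimal sketch: `MAX_QUERY_CHARS` is an assumed, illustrative limit rather than a documented Zep value, and `prepare_query` is a hypothetical name.

```python
MAX_QUERY_CHARS = 200  # assumed limit for illustration; tune for your workload


def prepare_query(raw_query, max_chars=MAX_QUERY_CHARS):
    """Collapse whitespace and trim an overlong query at a word boundary."""
    query = " ".join(raw_query.split())
    if len(query) <= max_chars:
        return query
    trimmed = query[:max_chars]
    # If the cut landed in the middle of a word, drop the partial word
    if query[max_chars] != " " and " " in trimmed:
        trimmed = trimmed.rsplit(" ", 1)[0]
    return trimmed.rstrip()


short_query = prepare_query("  Who is   the head of engineering?  ")
long_query = prepare_query("word " * 100, max_chars=22)
```

The result of `prepare_query` can then be passed as the `query` argument of the search call shown above.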

Building a Complete Memory Solution

While Zep provides great conversation memory capabilities, it represents just one piece of a comprehensive memory system. Data retrieval from the API is best suited to general, big-picture context. For more complex applications where precision is needed, consider using Zep alongside other specialized tools:

Use Zep for: Conversation history, user preferences, and basic contextual memory
Complement with: Vector databases (like Pinecone or Weaviate) for semantic search across large document collections
Consider adding: A structured database for fast retrieval of specific facts or records

# Example of a hybrid approach
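def get_context_for_query(session_id, query):
    # Get user context from Zep for the given session
    user_context = client.memory.get(
        session_id=session_id
    ).context

    # Get relevant documents from vector DB (vector_db is a placeholder client)
    relevant_docs = vector_db.similarity_search(query, k=2)

    # Get structured data from traditional DB (sql_db is a placeholder client).
    # Bind the search term as a parameter instead of formatting it into the SQL;
    # the placeholder syntax depends on your database driver.
    structured_data = sql_db.query(
        "SELECT * FROM product_info WHERE keywords LIKE ?",
        (f"%{query}%",)
    )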
    
    # Combine all contexts
    return {
        "user_history": user_context,
        "knowledge_base": relevant_docs,
        "product_data": structured_data
    }
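
The dictionary returned by a function like this still has to be turned into prompt text. One simple way to do that, assuming each value can be rendered as a string, is sketched here. `format_context`, its section labels, and the sample values are hypothetical choices for illustration.

```python
def format_context(context, max_chars_per_section=500):
    """Render each context source as a labeled section, truncating long ones."""
    sections = []
    for name, value in context.items():
        if not value:
            continue  # skip sources that returned nothing
        # Lists (for example retrieved documents) become one item per line
        if isinstance(value, (list, tuple)):
            text = "\n".join(str(item) for item in value)
        else:
            text = str(value)
        sections.append(f"{name}:\n{text[:max_chars_per_section]}")
    return "\n\n".join(sections)


prompt_context = format_context({
    "user_history": "John wants to close his account.",
    "knowledge_base": ["Doc A: closing an account", "Doc B: refund policy"],
    "product_data": [],
})
```

Capping each section separately keeps one large source, such as a long document, from crowding the others out of the prompt.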

Conclusion

Zep provides an accessible entry point into building systems with organizational memory. Its intuitive API and straightforward integration make it a valuable tool, especially for managing conversational history and providing high-level context to AI agents.

While it has limitations, particularly around ingestion speed and complex querying, these can be mitigated through proper architecture design. For most applications, Zep works best as part of a broader memory system, complemented by other specialized tools for different types of memory and retrieval needs.

As you begin implementing your own organizational memory system, consider starting with Zep for conversation management, and then expand your architecture as your needs grow more complex.

Note: The code examples in this post are simplified for illustration purposes. Refer to the official Zep documentation for comprehensive implementation details.