Building Dynamic LLM Workflows with EmbeddingRouterChain: A Guide to Content-Based Routing in LangChain
In modern LLM applications, the ability to intelligently route user inputs to different processing chains is becoming increasingly important. Rather than creating rigid, rule-based routing logic, developers are turning to content-based routing that can dynamically determine the appropriate path for each input. LangChain’s EmbeddingRouterChain provides a powerful solution to this challenge by using embeddings to make intelligent routing decisions. In this article, we’ll explore how to implement this approach in your LLM applications.
Understanding Content-Based Routing
Before diving into the implementation details, let’s understand what content-based routing means in the context of LLM applications. Content-based routing refers to directing inputs to different processing paths based on the semantic meaning of the content rather than explicit rules or keywords.
For example, you might want to route:
- Questions about pricing to a pricing information chain
- Technical support inquiries to a troubleshooting chain
- Feature requests to a product feedback chain
Traditional approaches might use keyword matching or classification models, but EmbeddingRouterChain takes a more sophisticated approach by using vector embeddings to understand the semantic meaning of inputs.
How EmbeddingRouterChain Works
EmbeddingRouterChain is a specialized implementation of LangChain’s RouterChain that uses embeddings to route between different destination chains. Here’s how it works:
- You define a set of possible destinations, each with descriptive text
- These descriptions are converted to embeddings and stored in a vector store
- When an input arrives, it’s converted to an embedding
- The input embedding is compared with the destination descriptions
- The input is routed to the destination with the closest semantic match
This approach allows for nuanced, meaning-based routing that can handle variations in phrasing and intent.
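To make the mechanism concrete, here is a minimal, framework-free sketch of the same idea. It assumes an embed function that maps text to a vector (a stand-in for a real embedding model) and picks the destination by cosine similarity:
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def route(user_input, destinations, embed):
    # destinations maps a name to its description; embed maps text to a vector
    input_vec = embed(user_input)
    return max(
        destinations,
        key=lambda name: cosine_similarity(input_vec, embed(destinations[name])),
    )
In practice, EmbeddingRouterChain embeds the destination descriptions once up front, stores them in a vector store, and uses similarity search instead of a linear scan.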
Implementing EmbeddingRouterChain
Let’s walk through implementing an EmbeddingRouterChain in your LangChain application.
Step 1: Install Required Dependencies
First, ensure you have the necessary packages installed:
pip install langchain openai faiss-cpu
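A note on versions: these examples use the classic langchain package layout; in newer releases the OpenAI and FAISS integrations moved into separate packages (langchain-community, langchain-openai), so you may need to pin an older langchain version or adjust the import paths. Also, the OpenAI classes read your API key from the OPENAI_API_KEY environment variable, so set it before running the examples (shown in Python here; exporting it in your shell works just as well):
import os
os.environ.setdefault("OPENAI_API_KEY", "sk-...")  # placeholder: use your real key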
Step 2: Import Required Components
from langchain.chains.router.embedding_router import EmbeddingRouterChain
from langchain.chains import LLMChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
Step 3: Create Destination Chains
Let’s create several chains that our router will direct traffic to:
# Technical support chain
tech_template = """You are a technical support specialist.
Help solve the following technical problem: {input}"""
tech_prompt = PromptTemplate(template=tech_template, input_variables=["input"])
tech_chain = LLMChain(llm=OpenAI(), prompt=tech_prompt)
# Pricing information chain
pricing_template = """You are a pricing specialist.
Provide information about pricing for: {input}"""
pricing_prompt = PromptTemplate(template=pricing_template, input_variables=["input"])
pricing_chain = LLMChain(llm=OpenAI(), prompt=pricing_prompt)
# General information chain
general_template = """You are a helpful assistant.
Provide information about: {input}"""
general_prompt = PromptTemplate(template=general_template, input_variables=["input"])
general_chain = LLMChain(llm=OpenAI(), prompt=general_prompt)
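Each destination chain is an ordinary LLMChain, so you can sanity-check one in isolation before wiring up the router (the exact output will vary from run to run):
# Quick standalone check of a destination chain
result = tech_chain.invoke({"input": "The app crashes on startup"})
print(result["text"])  # LLMChain returns its completion under the "text" key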
Step 4: Define Destination Descriptions
Next, we’ll define descriptions for each destination. These descriptions help the router understand when to use each chain:
destinations = [
("tech_support", tech_chain, "Questions about technical issues, error messages, troubleshooting, bugs, and how to fix technical problems"),
("pricing", pricing_chain, "Questions about pricing, costs, subscriptions, plans, and payment options"),
("general", general_chain, "General questions, company information, and anything not related to technical support or pricing")
]
# Format for EmbeddingRouterChain
destination_chains = {name: chain for name, chain, _ in destinations}
names_and_descriptions = [(name, [description]) for name, _, description in destinations]
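Each name is paired with a list of descriptions because from_names_and_descriptions accepts several descriptions per destination, and each one is embedded as its own entry in the vector store. This is a convenient way to cover multiple phrasings of the same intent, for example:
# Optional: several phrasings per destination can improve routing robustness
names_and_descriptions = [
    ("tech_support", [
        "Questions about technical issues, error messages, and troubleshooting",
        "Reports of bugs, crashes, and broken features",
    ]),
    ("pricing", ["Questions about pricing, costs, subscriptions, plans, and payment options"]),
    ("general", ["General questions and anything not related to technical support or pricing"]),
]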
Step 5: Create the Router Chain
Now, we’ll create our EmbeddingRouterChain:
# Initialize embeddings
embeddings = OpenAIEmbeddings()
# Create the router chain
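# routing_keys tells the router which input fields to embed (the default is ["query"])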
router_chain = EmbeddingRouterChain.from_names_and_descriptions(
names_and_descriptions=names_and_descriptions,
vectorstore_cls=FAISS,
embeddings=embeddings,
routing_keys=["input"]
)
# Create the full chain
from langchain.chains.router import MultiRouteChain
full_chain = MultiRouteChain(
router_chain=router_chain,
destination_chains=destination_chains,
default_chain=general_chain
)
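Before running the full chain, it can be useful to inspect the routing decision on its own. The router’s output contains the chosen destination name and the inputs it will forward:
# Peek at the routing decision without invoking a destination chain
route = router_chain.invoke({"input": "How much does the pro plan cost?"})
print(route["destination"])  # expected: "pricing" (similarity-based, so not strictly guaranteed)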
Step 6: Use the Router in Your Application
Finally, let’s see the router in action:
# Test with different inputs
responses = []
# Technical question
tech_question = "I'm getting a 404 error when trying to access the dashboard"
responses.append(full_chain.invoke({"input": tech_question}))
# Pricing question
pricing_question = "What's the cost of the enterprise plan?"
responses.append(full_chain.invoke({"input": pricing_question}))
# General question
general_question = "What year was your company founded?"
responses.append(full_chain.invoke({"input": general_question}))
# Print responses (the generated text is under the "text" key)
for i, response in enumerate(responses):
    print(f"Response {i+1}: {response['text']}")
Advanced Usage Patterns
Custom Vector Stores
While our example uses FAISS for simplicity, you can use other vector stores supported by LangChain, though some require extra setup:
from langchain.vectorstores import Chroma, Pinecone
# Using Chroma
router_chain = EmbeddingRouterChain.from_names_and_descriptions(
names_and_descriptions=names_and_descriptions,
vectorstore_cls=Chroma,
embeddings=embeddings,
routing_keys=["input"]
)
# Pinecone requires more setup: from_names_and_descriptions does not forward
# extra arguments such as index_name to the vector store, so it is easier to
# build the store yourself and hand it to EmbeddingRouterChain directly
import pinecone
from langchain.schema import Document
pinecone.init(api_key="your-api-key", environment="your-environment")
documents = [
    Document(page_content=description, metadata={"name": name})
    for name, descriptions in names_and_descriptions
    for description in descriptions
]
vectorstore = Pinecone.from_documents(documents, embeddings, index_name="your-index-name")
router_chain = EmbeddingRouterChain(vectorstore=vectorstore, routing_keys=["input"])
Using Memory
You can add memory to your router-based system to maintain context across interactions:
from langchain.memory import ConversationBufferMemory
# For memory to have any effect, the prompt must reference the memory key;
# otherwise the loaded history never reaches the LLM. Extend each template
# with a {chat_history} slot:
tech_template = """You are a technical support specialist.
Previous conversation:
{chat_history}
Help solve the following technical problem: {input}"""
tech_prompt = PromptTemplate(
    template=tech_template, input_variables=["chat_history", "input"]
)
# Give each chain its own memory so histories stay separate
tech_memory = ConversationBufferMemory(memory_key="chat_history")
tech_chain = LLMChain(llm=OpenAI(), prompt=tech_prompt, memory=tech_memory)
# Extend the pricing and general templates the same way, then rebuild the
# destination mapping so the router uses the memory-enabled chains
destination_chains = {
    "tech_support": tech_chain,
    "pricing": pricing_chain,
    "general": general_chain
}
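One caveat: the router itself is stateless. Each turn is routed purely on its own wording, so a terse follow-up can land in a different chain than the original question and miss the history accumulated there:
# Rebuild the full chain with the memory-enabled destinations
full_chain = MultiRouteChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=general_chain
)
full_chain.invoke({"input": "I'm getting a 404 error on the dashboard"})  # -> tech_support
full_chain.invoke({"input": "It still fails after clearing the cache"})  # likely tech_support, but not guaranteed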
Streaming Responses
For a more responsive user experience, you can implement streaming with the router:
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# Create streaming LLM
streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
# Update chains to use streaming
tech_chain = LLMChain(llm=streaming_llm, prompt=tech_prompt)
pricing_chain = LLMChain(llm=streaming_llm, prompt=pricing_prompt)
general_chain = LLMChain(llm=streaming_llm, prompt=general_prompt)
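After rebuilding destination_chains and the full chain with these streaming chains, responses print to stdout token by token as they are generated. Note that only the destination LLM call streams; the embedding lookup that picks the route happens first and produces no visible output:
# Tokens appear in the console as the routed chain generates them
full_chain.invoke({"input": "How do I fix a dependency conflict during installation?"})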
Best Practices for Effective Routing
To get the most out of EmbeddingRouterChain, consider these best practices:
- Write Clear Descriptions: The quality of your destination descriptions greatly affects routing accuracy. Make them detailed and distinct.
- Balance Specificity and Coverage: Descriptions should be specific enough to distinguish between destinations but broad enough to capture variations.
- Include Multiple Phrasings: Consider including multiple ways users might express the same intent, as in the multi-description example in Step 4.
- Add a Default Chain: Always include a default chain to handle inputs that don’t clearly match any destination.
- Monitor and Refine: Track routing decisions and adjust descriptions over time based on performance.
Example: Customer Support Router
Let’s implement a more comprehensive customer support router:
from langchain.chains.router.embedding_router import EmbeddingRouterChain
from langchain.chains import LLMChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
# Define chains for different support departments
billing_template = """You are a billing specialist.
Help with the following billing issue: {input}
Be friendly and professional."""
billing_prompt = PromptTemplate(template=billing_template, input_variables=["input"])
billing_chain = LLMChain(llm=OpenAI(), prompt=billing_prompt)
technical_template = """You are a technical support engineer.
Help with the following technical issue: {input}
Provide step-by-step troubleshooting guidance."""
technical_prompt = PromptTemplate(template=technical_template, input_variables=["input"])
technical_chain = LLMChain(llm=OpenAI(), prompt=technical_prompt)
account_template = """You are an account manager.
Help with the following account question: {input}
Ensure the customer feels valued."""
account_prompt = PromptTemplate(template=account_template, input_variables=["input"])
account_chain = LLMChain(llm=OpenAI(), prompt=account_prompt)
general_template = """You are a customer support representative.
Help with the following question: {input}
Be helpful and courteous."""
general_prompt = PromptTemplate(template=general_template, input_variables=["input"])
general_chain = LLMChain(llm=OpenAI(), prompt=general_prompt)
# Define descriptions for routing
destinations = [
("billing", billing_chain, "Questions about payments, invoices, charges, refunds, subscription fees, pricing plans, or billing issues"),
("technical", technical_chain, "Questions about software bugs, error messages, installation problems, configuration issues, or how to use specific features"),
("account", account_chain, "Questions about account access, login problems, account settings, profile updates, or account security"),
("general", general_chain, "General questions, company information, or inquiries not related to billing, technical issues, or account management")
]
# Set up router chain
destination_chains = {name: chain for name, chain, _ in destinations}
names_and_descriptions = [(name, [description]) for name, _, description in destinations]
embeddings = OpenAIEmbeddings()
router_chain = EmbeddingRouterChain.from_names_and_descriptions(
names_and_descriptions=names_and_descriptions,
vectorstore_cls=FAISS,
embeddings=embeddings,
routing_keys=["input"]
)
from langchain.chains.router import MultiRouteChain
support_chain = MultiRouteChain(
router_chain=router_chain,
destination_chains=destination_chains,
default_chain=general_chain
)
# Example usage
query = "I can't log into my account after changing my password yesterday"
response = support_chain.invoke({"input": query})
print(response["text"])  # the routed chain's answer lives under the "text" key
Conclusion
EmbeddingRouterChain provides a powerful, flexible approach to content-based routing in LLM applications. By leveraging embeddings to understand the semantic meaning of user inputs, you can create more intelligent, dynamic workflows that direct queries to the most appropriate processing chains.
This approach offers several advantages over traditional routing methods:
- Semantic Understanding: Routes based on meaning rather than just keywords
- Flexibility: Handles variations in how users phrase their requests
- Maintainability: Easier to extend with new destinations without complex rule changes
- Scalability: Works well even as the number of possible destinations grows
As you build more complex LLM applications, consider implementing EmbeddingRouterChain to create more dynamic, responsive workflows that can intelligently direct user inputs to the most appropriate processing paths.
By combining the power of embeddings with LangChain’s flexible architecture, you can create more sophisticated LLM applications that better understand and respond to user needs.
This post was originally written in my native language and then translated using an LLM. I apologize if there are any grammatical inconsistencies.