Mastering Structured Outputs with LangChain’s BaseChatOpenAI: A Comprehensive Guide to Schema Validation and Tool Calling
In modern AI applications, getting structured, predictable outputs from language models is essential for building reliable systems. LangChain’s BaseChatOpenAI class provides robust capabilities for working with OpenAI’s chat models while ensuring outputs conform to specific schemas. This guide explores how to use BaseChatOpenAI for structured outputs and tool calling, two powerful features that can dramatically improve the reliability of your AI applications.
Understanding BaseChatOpenAI
BaseChatOpenAI is a foundational class in LangChain that implements the BaseChatModel interface. It provides a seamless way to interact with OpenAI’s chat models while taking advantage of LangChain’s Runnable interface, which includes useful methods like with_config, with_types, with_retry, and more.
Let’s start with a basic example of initializing the model:
from langchain_openai import BaseChatOpenAI
# Initialize the model
chat_model = BaseChatOpenAI(
    model="gpt-4o",
    temperature=0.7,
    max_tokens=500
)
# Simple invocation
response = chat_model.invoke("Explain quantum computing in one sentence.")
print(response.content)
Schema Validation with Structured Outputs
One of the most powerful features of BaseChatOpenAI is its ability to ensure outputs conform to a specific schema. This is crucial for applications where you need predictable, structured data rather than free-form text.
Using with_structured_output()
The with_structured_output() method allows you to define a schema that the model output should follow. This can be specified in various ways:
- As an OpenAI function/tool schema
- As a JSON Schema
- As a TypedDict class
- As a Pydantic class
Here’s an example using a Pydantic model:
from pydantic import BaseModel, Field
from typing import List
class MovieRecommendation(BaseModel):
    title: str = Field(description="The title of the recommended movie")
    year: int = Field(description="The year the movie was released")
    genres: List[str] = Field(description="List of genres the movie belongs to")
    director: str = Field(description="The director of the movie")
    reason: str = Field(description="Why this movie is being recommended")
# Create a structured output model
structured_model = chat_model.with_structured_output(MovieRecommendation)
# Get a recommendation
recommendation = structured_model.invoke(
    "Recommend a sci-fi movie from the 1980s with a brief explanation."
)
print(f"Title: {recommendation.title}")
print(f"Year: {recommendation.year}")
print(f"Director: {recommendation.director}")
print(f"Genres: {recommendation.genres}")
print(f"Reason: {recommendation.reason}")
Method Options for Structured Outputs
When using with_structured_output(), you can specify different methods for generating structured data:
- Function Calling (default): Uses OpenAI’s tool-calling API
- JSON Mode: Uses OpenAI’s JSON mode, requiring instructions in your prompt
- JSON Schema: Uses OpenAI’s Structured Output API (for newer models like gpt-4o)
# Using JSON Schema method
structured_model = chat_model.with_structured_output(
    MovieRecommendation,
    method="json_schema",
    strict=True
)
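If you use method="json_mode" instead, the API does not enforce your schema for you, so the prompt itself must describe the JSON you expect; a minimal sketch:
# Using JSON mode: the prompt has to spell out the desired JSON structure
json_mode_model = chat_model.with_structured_output(
    MovieRecommendation,
    method="json_mode"
)
recommendation = json_mode_model.invoke(
    "Recommend a sci-fi movie from the 1980s. "
    "Respond in JSON with the keys: title, year, genres, director, reason."
)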
Handling Raw Responses and Errors
You can get both the parsed output and the raw model response by setting include_raw=True:
structured_model = chat_model.with_structured_output(
    MovieRecommendation,
    include_raw=True
)
result = structured_model.invoke("Recommend a comedy movie from the 1990s.")
print("Raw Response:", result["raw"])
print("Parsed Response:", result["parsed"])
print("Parsing Error:", result["parsing_error"]) # None if no error
Tool Calling with BaseChatOpenAI
Tool calling (formerly known as function calling) allows models to call specific functions or tools based on user input. This is particularly useful for creating agents that can interact with external systems.
Binding Tools to the Model
You can bind tools to your BaseChatOpenAI instance using the bind_tools() method:
from langchain_core.tools import tool

@tool
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather in a given location."""
    # Simulated API call
    weather_info = {"temperature": 22, "condition": "Sunny", "humidity": 60}
    return f"Weather in {location}: {weather_info['temperature']}°{unit}, {weather_info['condition']}, {weather_info['humidity']}% humidity"

@tool
def search_database(query: str, limit: int = 5) -> List[dict]:
    """Search the database for information related to the query."""
    # Simulated database search
    results = [{"title": f"Result {i}", "content": f"Content for {query} {i}"} for i in range(limit)]
    return results
# Bind tools to the model
tool_model = chat_model.bind_tools(
    tools=[get_weather, search_database],
    tool_choice="auto"  # Let the model decide which tool to use
)

# Use the model with tools
response = tool_model.invoke(
    "What's the weather like in New York today? Also, find information about climate change."
)
print(response.content)
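When the model decides to call tools, the message content is often empty and the requested calls appear in response.tool_calls instead. A minimal sketch of inspecting and executing them with the two tools defined above:
# Each entry carries the tool name and the arguments the model chose
for tool_call in response.tool_calls:
    print(tool_call["name"], tool_call["args"])

# Dispatch each call to the matching tool and collect the outputs
tools_by_name = {"get_weather": get_weather, "search_database": search_database}
for tool_call in response.tool_calls:
    output = tools_by_name[tool_call["name"]].invoke(tool_call["args"])
    print(output)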
Tool Choice Options
The tool_choice parameter controls how tools are selected:
- "auto": Model decides whether to use a tool
- "none": No tools are used
- "any", "required", or True: Force at least one tool to be called
- A specific tool name: Force the use of that specific tool (see the sketch after this list)
- A dictionary specifying a tool: {"type": "function", "function": {"name": "tool_name"}}
Parallel Tool Calls
For newer models that support it, you can enable parallel tool calls:
tool_model = chat_model.bind_tools(
    tools=[get_weather, search_database],
    tool_choice="auto",
    parallel_tool_calls=True  # Enable parallel tool calls
)
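With parallel calls enabled, a single response may contain several tool calls at once, for example one get_weather call per city mentioned in the prompt:
response = tool_model.invoke("What's the weather in New York and in London right now?")
print(len(response.tool_calls))  # may be 2: one get_weather call per city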
Advanced Features
Token Counting
BaseChatOpenAI provides methods to count tokens, which is useful for ensuring inputs fit within context windows:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Tell me about artificial intelligence.")
]
# Count tokens in messages
token_count = chat_model.get_num_tokens_from_messages(messages)
print(f"Token count: {token_count}")
Streaming Responses
For long responses, streaming can provide a better user experience:
from langchain_core.messages import HumanMessage
# Enable streaming
streaming_model = BaseChatOpenAI(
    model="gpt-4o",
    streaming=True
)

# Stream the response
for chunk in streaming_model.stream([HumanMessage(content="Write a short story about a robot learning to paint.")]):
    print(chunk.content, end="", flush=True)
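The same pattern works asynchronously with astream(), which is handy inside async web handlers; a minimal sketch:
import asyncio

async def stream_story():
    # Asynchronously print chunks as they arrive
    async for chunk in streaming_model.astream(
        [HumanMessage(content="Write a short story about a robot learning to paint.")]
    ):
        print(chunk.content, end="", flush=True)

asyncio.run(stream_story())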
Caching Responses
You can enable caching to avoid redundant API calls:
from langchain_core.globals import set_llm_cache
from langchain_core.caches import InMemoryCache

# Set up caching
set_llm_cache(InMemoryCache())

# Create model with caching
cached_model = BaseChatOpenAI(
    model="gpt-4o",
    cache=True
)
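With the cache in place, an identical repeated call is served from memory instead of triggering a new API request; a minimal sketch:
# First call goes to the API and populates the cache
print(cached_model.invoke("What is the capital of France?").content)

# An identical second call returns the cached result without a new API request
print(cached_model.invoke("What is the capital of France?").content)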
Retry Mechanism
For handling transient errors, you can add automatic retries:
# Create model with retry capability
retry_model = chat_model.with_retry(
    stop_after_attempt=3,
    wait_exponential_jitter=True
)
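You can also restrict retries to specific exception types, for example only rate-limit errors raised by the openai package; a minimal sketch:
from openai import RateLimitError

# Only retry when the API reports a rate limit; fail fast on other errors
rate_limit_retry_model = chat_model.with_retry(
    retry_if_exception_type=(RateLimitError,),
    stop_after_attempt=3
)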
Combining Structured Outputs with Tool Calling
You can combine structured outputs with tool calling for even more powerful capabilities. Note that with_structured_output() is itself built on tool calling (with the default function_calling method) and does not accept additional tools, so a common pattern is to run a tool-calling step first and then pass the tool results to a structured-output model:
class SearchResults(BaseModel):
    query: str = Field(description="The search query")
    results: List[dict] = Field(description="The search results")
    summary: str = Field(description="A summary of the search results")

# Step 1: gather data with the search tool
search_results = search_database.invoke({"query": "quantum computing"})

# Step 2: summarize the tool output into the SearchResults schema
summary_model = chat_model.with_structured_output(SearchResults)
response = summary_model.invoke(
    f"Query: quantum computing\nSearch results: {search_results}\n"
    "Summarize these search results."
)
print(response.summary)
Performance Considerations
When working with BaseChatOpenAI, there are several parameters you can adjust to optimize performance:
optimized_model = BaseChatOpenAI(
    model="gpt-4o",
    temperature=0.2,  # Lower for more deterministic outputs
    top_p=0.95,  # Control diversity
    frequency_penalty=0.5,  # Reduce repetition
    presence_penalty=0.5,  # Encourage diversity
    max_tokens=1000,  # Control response length
    request_timeout=30.0  # Timeout for requests
)
Error Handling with Fallbacks
You can create fallback chains to handle potential errors:
from langchain_openai import ChatOpenAI
# Create primary and fallback models
primary_model = BaseChatOpenAI(model="gpt-4o")
fallback_model = ChatOpenAI(model="gpt-3.5-turbo")
# Create a model with fallback
robust_model = primary_model.with_fallbacks(
    fallbacks=[fallback_model],
    exceptions_to_handle=(Exception,)
)
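The fallback chain is used like any other model: if the primary model raises one of the handled exceptions, the same input is retried transparently on the fallback:
response = robust_model.invoke("Summarize the theory of relativity in one sentence.")
print(response.content)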
Conclusion
LangChain’s BaseChatOpenAI provides a powerful interface for working with OpenAI’s chat models, with special emphasis on structured outputs and tool calling. These features enable you to build more reliable AI applications by ensuring outputs conform to expected schemas and allowing models to interact with external tools and systems.
By mastering these capabilities, you can create AI applications that not only generate high-quality text but also produce structured data that can be directly used in your application logic. Whether you’re building chatbots, content generation tools, or complex AI agents, BaseChatOpenAI offers the flexibility and control you need to create robust, production-ready systems.
Remember that the choice of method for structured outputs (function_calling, json_mode, or json_schema) depends on your specific use case and the models you’re working with. Newer models like GPT-4o support more advanced features like parallel tool calls and the JSON Schema method, while older models may have different capabilities and limitations.
With the techniques covered in this guide, you’re well-equipped to build sophisticated AI applications that leverage the full power of LangChain and OpenAI’s chat models.
This post was originally written in my native language and then translated using an LLM. I apologize if there are any grammatical inconsistencies.