AI agents have emerged as a powerful approach to building more sophisticated and autonomous systems, offering capabilities beyond traditional static models.
What makes AI agents particularly powerful is their ability to orchestrate complex tasks and continuously assess their own operations. This self-monitoring improves output quality and reliability.
To facilitate the development of these intelligent agents, OpenAI has introduced the OpenAI Agents SDK, which replaces its previous Swarm framework. The open-source SDK provides developers with a robust toolkit for building and deploying AI agents.
In this article, we'll explore the practical aspects of this SDK, diving into its core functionalities, examining how to build basic and advanced applications, and even looking at integrating external Large Language Model (LLM) providers for enhanced flexibility.
Principles
The OpenAI Agents SDK is built upon four fundamental principles that work together to create robust, reliable AI systems. Let's explore how these principles form the foundation of agent-based development:
At the heart of the system are Agents - the primary actors that drive the execution process. These specialized components handle LLM interactions, utilize tools, and generate responses based on their designated tasks.
The system enables smooth transitions between agents through handoffs. A recommended approach is implementing a triaging agent that intelligently routes tasks to the most appropriate specialized agent.
Guardrails serve as crucial safety mechanisms, performing two essential functions: screening user inputs for potential malicious content and validating agent outputs to ensure quality and appropriateness.
The tracing system provides detailed visibility into the agent's operations. Through the OpenAI dashboard, developers can monitor tool usage and track generated outputs, enabling thorough analysis and optimization.
These four principles work together to create a robust framework that enables developers to build sophisticated, reliable AI agent systems while maintaining control and visibility throughout the process.
Building our first system
We start with a simple example to understand the core principles of building an agent system. At a minimum, we need to define an agent and give it instructions on what it should do. We can also give it a tool, such as the built-in WebSearchTool, so that it can use it automatically.
```python
from agents import Agent, Runner, WebSearchTool

research_agent = Agent(
    name="Research Agent",
    instructions="You are a helpful research assistant.",
    tools=[WebSearchTool()],
)

result = Runner.run_sync(
    research_agent,
    "Are AI agents sentient already?",
)
print(result.final_output)
# Output: No, AI agents are not sentient. They do not have consciousness,
# self-awareness, emotions, or subjective experiences. AI processes information
# and makes decisions based on data and algorithms. Sentience involves having
# thoughts, feelings, and an awareness of one's own existence, which AI lacks.
```
Finally, we use a Runner to provide the research_agent with a query and get the result (in this case synchronously).
We can also create our own tools and provide an agent with them like this:
```python
from agents import Agent, function_tool

@function_tool
def get_weather(city: str) -> str:
    return f"It's always sunny in {city}."

weather_agent = Agent(
    name="Weather Agent",
    instructions="You are a helpful weather assistant.",
    tools=[get_weather],
)
```
Of course, this is a simplified example, but we could also call a weather API inside the get_weather function.
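For instance, the tool could fetch from a real weather endpoint and reduce the JSON response to a short string for the LLM. A minimal sketch, where the endpoint URL and response shape are hypothetical stand-ins for whichever weather provider you use:

```python
import json
from urllib.request import urlopen

def summarize_weather(payload: dict) -> str:
    # Reduce a (hypothetical) weather API response to a short string for the LLM.
    return f"{payload['condition']}, {payload['temperature_c']}°C in {payload['city']}"

def get_weather(city: str) -> str:
    # Hypothetical endpoint; substitute your weather provider's real API.
    with urlopen(f"https://example.com/weather?city={city}") as resp:
        payload = json.load(resp)
    return summarize_weather(payload)
```

In real use, get_weather would still be decorated with @function_tool exactly as above; keeping the response-summarizing step separate makes it easy to test without network access.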
The real power comes in when we use multiple agents in combination. We can then define handoffs to select which agent to use. Let's look at an example: we create a third agent, call it triage_agent, and make it choose between the research_agent and the weather_agent:
```python
triage_agent = Agent(
    name="Triage Agent",
    instructions="You decide which agent to hand the work off to.",
    handoffs=[weather_agent, research_agent],
)

result = Runner.run_sync(
    triage_agent,
    "What is the weather in San Francisco?",
)
print(result.final_output)
# Output: It's always sunny in San Francisco! There you have it!
```
Now that we understand how the system generally works, let's dive deeper into the different functionalities.
Options for creating agents
Agents can take on different roles and come with a lot of customization options. The only properties an agent must have are:
- name: indicates what the agent is used for.
- instructions: similar to a system prompt; should clearly describe the task the agent should fulfill.
There are many other options we can set, and some of them are particularly relevant. One of the more common ones is the model, which can range from anything OpenAI offers (such as o1, o3-mini, or gpt-4o) to external providers (see later in the article for an example using Gemini).
We can also pass context to an Agent to add custom data from our database (or any other source). This is done via the context type. We can provide any Python object we want, so, for example, a ChatContext could look like this:
```python
from dataclasses import dataclass

from agents import Agent

@dataclass
class ChatContext:
    channels: list[Channel]
    users: list[User]

async def fetch_channels() -> list[Channel]:
    return ...

agent = Agent[ChatContext](
    ...
)
```
Usually, the output of an agent is plain text (a str object). However, we can change that by adding the output_type option to an agent. With that, we can define a Pydantic object we expect as output.
We first need to define the object and then hand it to the agent upon initialization:
```python
from pydantic import BaseModel

from agents import Agent

class WeatherEvent(BaseModel):
    forecast: str
    temperature: float
    hourly_temperature: list[float]

agent = Agent(
    name="Weather agent",
    instructions="Get the weather for a certain location",
    output_type=WeatherEvent,
)
```
The last interesting parameter we want to cover is handoffs. As mentioned, these can be considered sub-agents to which the primary agent can hand off tasks. Let's assume we have an e-commerce app where people either want to shop for something or ask for a refund. We can create a triage_agent that delegates the initial task to whichever agent fits best:
```python
from agents import Agent

shopping_agent = Agent(...)
refund_agent = Agent(...)

triage_agent = Agent(
    name="Triage agent",
    instructions=(
        "Help the user with their questions. "
        "If they ask about shopping, hand off to the shopping agent. "
        "If they ask about refunds, hand off to the refund agent."
    ),
    handoffs=[shopping_agent, refund_agent],
)
```
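It's worth noting that the handoff decision is made by the LLM itself, based on the instructions and the handoff agents' names and descriptions. Purely as a mental model (the SDK does not do keyword matching, and these function and agent names are ours for illustration), the triage behavior resembles:

```python
def triage(query: str) -> str:
    # Toy stand-in for the LLM's handoff decision; in the SDK the model decides.
    q = query.lower()
    if "refund" in q or "return" in q:
        return "refund_agent"
    if "buy" in q or "shop" in q:
        return "shopping_agent"
    return "triage_agent"  # no handoff; the triage agent answers directly
```

The advantage of the real LLM-driven handoff is that it generalizes far beyond fixed keywords, at the cost of being less predictable than explicit routing code.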
Using an external LLM provider
When we specify the model for our Agent objects, we can select from the variety of OpenAI models (such as o1, o3-mini, or gpt-4o, as mentioned above).
However, we can also use models from other providers, at least if they offer an OpenAI-compatible API endpoint (which has become an industry standard by now). So, let's look at an example of integrating the Gemini API from Google (integrating other providers like Anthropic, DeepSeek, or even local models is very similar).
We can take different approaches, but we'll go over the one that specifically sets the provider for a single agent so that we can mix and match as we like.
We need to define a client using the AsyncOpenAI type (the regular OpenAI version works as well) and provide it with a BASE_URL (for Gemini, this is https://generativelanguage.googleapis.com/v1beta/openai/). In addition, an API key is required, and then we can initialize the client like this:
```python
import os

from openai import AsyncOpenAI

BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
API_KEY = os.getenv("GOOGLE_API_KEY")

client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY)
```
Next, we can define the model using the OpenAIChatCompletionsModel class from the agents SDK and conveniently hand it to an Agent object:
```python
from agents import Agent, OpenAIChatCompletionsModel

model = OpenAIChatCompletionsModel(
    model="gemini-2.0-flash",
    openai_client=client,
)

agent = Agent(
    name="Assistant",
    instructions="Help the user with their research",
    model=model,
)
```
The same principle applies to other providers as well. One note: it may be necessary to disable tracing when using external providers, since tracing uploads data to the OpenAI dashboard and will fail if we don't have an OPENAI_API_KEY defined:
```python
from agents import set_tracing_disabled

set_tracing_disabled(disabled=True)
```
With this technique, we can incorporate any other provider into our system.
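When mixing several providers, it can help to centralize the per-provider settings in one place. A small sketch: the Gemini base URL is the one used above, while the Ollama entry is a common local default that you should verify for your setup:

```python
import os

# Base URL and API-key env var per provider. The Gemini entry matches the
# example above; the Ollama entry is an assumed local default -- verify it.
PROVIDERS = {
    "gemini": {
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "api_key_env": "GOOGLE_API_KEY",
    },
    "ollama": {
        "base_url": "http://localhost:11434/v1",
        "api_key_env": "OLLAMA_API_KEY",  # often unused for local models
    },
}

def client_config(provider: str) -> dict:
    # Returns kwargs suitable for AsyncOpenAI(base_url=..., api_key=...).
    cfg = PROVIDERS[provider]
    return {
        "base_url": cfg["base_url"],
        "api_key": os.getenv(cfg["api_key_env"], "unused"),
    }
```

With this, creating a client is just `AsyncOpenAI(**client_config("gemini"))`, and adding a new provider means adding one dictionary entry.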
Guardrails
Another principle that the Agents SDK is built upon is guardrails. They are intended for one of two use cases (or both):
- Input guardrails: verify the user input before handing it into a more complex flow
- Output guardrails: check the output of an agent before handing it to the user
The idea is that we check for a condition, and if a guardrail is activated, we signal a tripwire. This is equivalent to raising either an InputGuardrailTripwireTriggered or an OutputGuardrailTripwireTriggered exception.
Let's assume we want to check whether a user is trying to solve their math homework using our system. We can use an input guardrail for that. The first things we define are an output type and a guardrail_agent:
```python
from pydantic import BaseModel

from agents import Agent

class MathHomeworkOutput(BaseModel):
    is_math_homework: bool
    reasoning: str

guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you to do their math homework.",
    output_type=MathHomeworkOutput,
)
```
Then, we create a function that receives the context (through the RunContextWrapper from the agents SDK) and returns a GuardrailFunctionOutput object. We attach it to an agent via the input_guardrails parameter, and we're done:
```python
from agents import (
    Agent,
    GuardrailFunctionOutput,
    RunContextWrapper,
    Runner,
    TResponseInputItem,
    input_guardrail,
)

@input_guardrail
async def math_guardrail(
    ctx: RunContextWrapper[None],
    agent: Agent,
    input: str | list[TResponseInputItem],
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_math_homework,
    )

agent = Agent(
    name="Customer support agent",
    instructions="You help customers with their questions.",
    input_guardrails=[math_guardrail],
)
```
The output guardrails work similarly, so we won't cover that here.
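Stripped of the SDK, the tripwire pattern itself is simple: a check function either passes the text through or raises, and the caller catches the exception and substitutes a safe reply. A plain-Python illustration (the names here are ours, standing in for the SDK's guardrail function and OutputGuardrailTripwireTriggered exception):

```python
class TripwireTriggered(Exception):
    """Stand-in for the SDK's guardrail tripwire exceptions."""

def output_guardrail(output: str) -> str:
    # Example check: block outputs containing an internal marker.
    if "CONFIDENTIAL" in output:
        raise TripwireTriggered("output failed the guardrail check")
    return output

def respond(output: str) -> str:
    try:
        return output_guardrail(output)
    except TripwireTriggered:
        return "Sorry, I can't share that."
```

In the real SDK, the check would typically be another LLM call (as in the math homework example), but the control flow around the tripwire is the same.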
Tracing
The last important part of the entire system is first-class support for tracing. We can go far with it, but we want to quickly demonstrate how much we get out of the box.
By default, tracing is enabled and gives us a ton of detail in the Traces dashboard. It records information about each workflow you define: inside, you can see which agents were called and which handoffs happened. Per agent, we see the tools being used as well as the queries and responses.
Tracing can be described in more detail, so let us know if you're interested in learning more, but this should be sufficient for a quick overview.
Summary
In summary, the OpenAI Agents SDK is a great, intuitive framework. It makes entering the whole Agent field straightforward and supports more complex use cases.
The guardrails feature is cumbersome to set up but very powerful if done well. The out-of-the-box support for traces is excellent and helps you reason about your system and analyze usage, bottlenecks, and more.
We're excited about the SDK's potential and would like to give OpenAI huge kudos for making the entire framework open-source. Integrating external providers is also very welcome, and we're optimistic that the SDK will improve in future iterations.
Ready to explore the power of AI in your applications? Stream offers a comprehensive suite of SDKs. Our latest set of tutorials on integrating Voice Agents using our SDKs is a great way to get a taste of what’s possible when using our Chat and Video SDKs.