First Look: New Agents SDK From OpenAI
A beautiful abstraction for TypeScript. Let's take a quick look.
Be on the lookout for my first podcast episode next week; I was able to interview the cofounder of Chonkie.ai and discuss their embeddings system.
OpenAI released a new open-source Agents SDK for TypeScript about a week ago.
I am much happier with this API than with Anthropic's official MCP implementations.
I know agents aren't MCPs, but in general I expect better implementations from Anthropic than what's currently being provided.
This SDK is quite functional and very extensible.
Here's a quick look at the setup required to run a local model, which isn't fully shown in their current documentation. You need to install a Vercel AI SDK package in order to pull in local model support, which is very… interesting:
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { aisdk } from "@openai/agents-extensions";

// Point the Vercel AI SDK at LM Studio's OpenAI-compatible server
const lmstudio = createOpenAICompatible({
  name: "lmstudio",
  baseURL: "http://localhost:1234/v1",
});

// Wrap each local model so the Agents SDK can use it
export const devstral = aisdk(lmstudio("devstral-small-2505-mlx"));
export const ministral = aisdk(lmstudio("ministral-8b-instruct-2410"));
With the code above, I am able to use LM Studio as my AI server and then specify which model to use with the exported lines below it.
Unfortunately, this doesn't really work well once agents start handing off tasks to one another. For instance, a "switchboard" agent that needs to route requests to the correct agent for a task fails to pass zod validation unless we use an OpenAI model.
I believe this is due to a weird requirement that input and output token counts be reported as part of function calls in the agent loop.
I may need to try a few more open-source tool-calling models, as I have only tried two local models so far.
Regardless, this abstraction is very new, and you can still use local models for some basic interactions.
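To give a concrete sense of those basic interactions, here's a minimal sketch of a single-agent run against a local model. The agent name and prompt are my own placeholders; the run/finalOutput usage matches the SDK's README:

import { Agent, run } from "@openai/agents";
import { ministral } from "./models";

// A single agent with no tools and no handoffs works fine on a local model
const echoAgent = new Agent({
  name: "Echo",
  model: ministral,
  instructions: "You are a concise assistant.",
});

// One-shot run: pass a plain string and read the final output
const result = await run(echoAgent, "Say hello in five words.");
console.log(result.finalOutput);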
Let’s make an agent now
import { Agent, tool } from "@openai/agents";
import { ministral, o4mini } from "./models"; // o4mini: an OpenAI-hosted model to swap in when local models fail
import { weatherAgent } from "./weather";
import { RECOMMENDED_PROMPT_PREFIX } from "@openai/agents-core/extensions";
import z from "zod";

// A tool that will eventually call the Linear API; hard-coded for now
const getLinearTasks = tool({
  name: "get_linear_tasks",
  description: "Get the current linear tasks",
  parameters: z.object({}),
  async execute() {
    return `The tasks are: eat, sleep, code`;
  },
});

export const mainAgent = new Agent({
  name: "Assistant",
  model: ministral,
  handoffs: [weatherAgent], // agents this one can route requests to
  tools: [getLinearTasks],
  instructions: RECOMMENDED_PROMPT_PREFIX,
});
I'm playing around with an agent that can do one thing, get my tasks from Linear, and can also hand off tasks to another agent.
This agent can fetch Linear tasks or hand off requests to the weather agent. For now it just returns hard-coded tasks instead of actually calling the Linear API.
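If you did want to call the real Linear API, a sketch might look like this. The GraphQL query is simplified, and LINEAR_API_KEY is a hypothetical environment variable holding a Linear API key:

import { tool } from "@openai/agents";
import z from "zod";

// Hypothetical sketch: fetch issue titles from Linear's GraphQL API
const getLinearTasksLive = tool({
  name: "get_linear_tasks",
  description: "Get the current linear tasks",
  parameters: z.object({}),
  async execute() {
    const res = await fetch("https://api.linear.app/graphql", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: process.env.LINEAR_API_KEY ?? "",
      },
      body: JSON.stringify({
        query: "{ issues(first: 10) { nodes { title } } }",
      }),
    });
    const json = await res.json();
    const titles = json.data.issues.nodes.map((n: { title: string }) => n.title);
    return `The tasks are: ${titles.join(", ")}`;
  },
});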
Let’s take a look at the dummy weather agent.
import { Agent, tool } from "@openai/agents";
import z from "zod";
import { ministral } from "./models";

// A dummy tool that always reports sunny weather
const getWeather = tool({
  name: "get_weather",
  description: "Return the weather for a given city.",
  parameters: z.object({ city: z.string() }),
  async execute({ city }) {
    return `The weather in ${city} is sunny.`;
  },
});

export const weatherAgent = new Agent({
  name: "Weather bot",
  instructions: "You are a helpful weather bot.",
  model: ministral,
  tools: [getWeather],
  handoffs: [], // this agent never routes elsewhere
});
We give the weather agent a single tool and hard-code a basic response: the weather is sunny in whatever city is passed in.
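A real version could swap in a free weather API. Here's a hypothetical sketch using Open-Meteo, which requires no API key (I'm assuming its geocoding and forecast endpoints here):

import { tool } from "@openai/agents";
import z from "zod";

// Hypothetical sketch: geocode the city, then fetch the current temperature
const getWeatherLive = tool({
  name: "get_weather",
  description: "Return the weather for a given city.",
  parameters: z.object({ city: z.string() }),
  async execute({ city }) {
    const geo = await fetch(
      `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(city)}&count=1`
    ).then((r) => r.json());
    const { latitude, longitude } = geo.results[0];
    const weather = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current_weather=true`
    ).then((r) => r.json());
    return `The weather in ${city} is ${weather.current_weather.temperature}°C.`;
  },
});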
Now, we have to orchestrate them to run together. This is a part that seems to be lacking in the current implementation.
I notice that I am able to write custom code in a loop that will hand off to the weather agent, but there is no way to hand off back to the original agent afterwards.
Here's how that looks. This was mostly taken from their docs, but the initial handoff logic didn't work without some adjustments on my part:
import {
  Agent,
  run,
  StreamedRunResult,
  withTrace,
  type AgentInputItem,
} from "@openai/agents";
import { randomUUID } from "@openai/agents-core/_shims";
import readline from "node:readline/promises";
import { mainAgent } from "./agents/mainAgent";

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

async function main() {
  // Group every run in this session under one trace id
  const conversationId = randomUUID().replace(/-/g, "").slice(0, 16);
  let userMsg = await rl.question(
    "Hi! I can do a lot of stuff. How can I help?\n"
  );

  // Start at the main agent; this variable is mutated on each handoff
  let agent: Agent<any, any> = mainAgent;
  let inputs: AgentInputItem[] = [{ role: "user", content: userMsg }];

  while (true) {
    let result: StreamedRunResult<any, Agent<any, any>> | undefined;
    await withTrace(
      "Routing example",
      async () => {
        // Stream the agent's response straight to stdout
        result = await run(agent, inputs, { stream: true });
        result
          .toTextStream({ compatibleWithNodeStreams: true })
          .pipe(process.stdout);
        await result.completed;
      },
      { groupId: conversationId }
    );

    if (!result) {
      throw new Error("No result");
    }

    // Carry the full history forward into the next turn
    inputs = result.history;
    process.stdout.write("\n");

    if (
      result.currentAgent?.name !== agent.name &&
      inputs[inputs.length - 1]?.content
    ) {
      // A handoff happened: loop again immediately so the new agent
      // answers the pending request instead of prompting the user
      console.log("Handing off to", result.currentAgent?.name);
    } else {
      userMsg = await rl.question("Enter a message:\n");
      inputs.push({ role: "user", content: userMsg });
    }

    agent = result.currentAgent ?? agent;
  }
}

main().catch((error) => {
  console.error("Error:", error);
  process.exit(1);
});
Specifically, the part where I check whether currentAgent changed. In the version from the official docs, the agent would change, but it would make you type again before the new agent responded. For example:
Me: what is the weather in tokyo?
Agent: I'll hand you off to the weather agent!
Agent: Enter a message:
This felt weird to me. With my changes, it now goes like this:
Me: what is the weather in tokyo?
Agent: I'll hand you off to the weather agent!
Agent: The weather in Tokyo is sunny
It worked!
Our request was handed off correctly.
Although… now we are stuck with the weather agent.
It seems like you are supposed to add your own way to reset back to the main agent, because a circular dependency where both agents reference each other in their handoffs arrays wouldn't work.
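One workaround of my own: reset the agent variable to mainAgent whenever the user types a fresh message, so every new request gets routed by the switchboard again. Here's what the modified tail of the while loop from above could look like:

if (
  result.currentAgent?.name !== agent.name &&
  inputs[inputs.length - 1]?.content
) {
  // A handoff happened: keep the new agent for the pending request
  console.log("Handing off to", result.currentAgent?.name);
  agent = result.currentAgent ?? agent;
} else {
  userMsg = await rl.question("Enter a message:\n");
  inputs.push({ role: "user", content: userMsg });
  // Reset so every fresh message starts back at the switchboard
  agent = mainAgent;
}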
Also, this is code you have to write yourself. One downside of the current approach is that the end developer has to handle manual handoff logic and maintain an `agent` variable that is mutated depending on what happens. Variables mutated like this at runtime are not the greatest.
Realistically, you could have one mega agent with many tools. There may not be a reason to have handoffs depending on what you are building.
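For example, a hypothetical mega agent could just take both tools directly (assuming getLinearTasks and getWeather are exported from their files):

import { Agent } from "@openai/agents";
import { ministral } from "./models";
import { getLinearTasks } from "./agents/mainAgent";
import { getWeather } from "./weather";

// One agent, every tool, no handoffs to orchestrate
export const megaAgent = new Agent({
  name: "Mega assistant",
  model: ministral,
  tools: [getLinearTasks, getWeather],
  instructions: "You can fetch Linear tasks and check the weather.",
});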
That said, the separation of concerns a handoff provides is nice for tasks that are very different from one another, and collapsing everything into one agent means losing the ability to use a different model for each agent.
Overall Thoughts
It seems like a nice abstraction that just needs a bit more tooling and a few more higher-level features, like better agent handoff logic.
I love that each agent can use a different model. They have examples covering browser use and more complicated tasks, and I can see this being useful for a lot of actions and for separating your tool use into logical agent sections.
Sure, anyone could have coded a similar approach to this, but it is nice having a library like this from one of the big providers. It helps us learn the paradigms they’re following internally.
One of the biggest drawbacks is the tracing, though… You must use an OpenAI account to view traces behind the scenes.
They seem to be walking a fine line between being open source and pushing you toward OpenAI. It would be nice to have tracing handled entirely locally in the library.
I can see it being useful for building internal tooling, like a sophisticated content generation tool for articles.
I was also thinking of making a daily standup agent that can send the tasks I've been working on to Slack...
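As a rough sketch of that idea (the webhook env var is my assumption; Slack incoming webhooks accept a simple JSON text payload):

import { tool } from "@openai/agents";
import z from "zod";

// Hypothetical tool: post a standup summary to a Slack incoming webhook
const postStandup = tool({
  name: "post_standup",
  description: "Post a standup summary to Slack",
  parameters: z.object({ summary: z.string() }),
  async execute({ summary }) {
    await fetch(process.env.SLACK_WEBHOOK_URL!, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: summary }),
    });
    return "Posted to Slack.";
  },
});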
Check out the library yourself and see what you think!