OpenAI has launched AgentKit, a set of tools for building, deploying, and optimising agents. It aims to enable developers to design workflows visually and embed agentic UIs faster using new building blocks.

New building blocks
AgentKit introduces three building blocks: Agent Builder, available in beta, for creating and managing multi-agent workflows; Connector Registry, beginning its beta rollout to some API, ChatGPT Enterprise and Edu customers with a Global Admin Console, for managing data connections; and ChatKit, for building customisable chat-based agent experiences.
Evals capabilities
OpenAI has expanded Evals, the platform it launched last year to help developers test prompts and measure model behaviour. Evals gains four new capabilities for measuring and improving agent performance: datasets, trace grading, automated prompt optimisation, and third-party model support.
Reinforcement fine-tuning
To enhance agent performance, OpenAI is making Reinforcement Fine-tuning (RFT) generally available on OpenAI o4-mini and in private beta for GPT-5. RFT enables developers to customise OpenAI’s reasoning models. New features in the RFT beta include custom tool calls, which train models to select the right tools at the right time for improved reasoning, and custom graders, which let developers set evaluation criteria for what matters most in their use case.
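
To make the idea of a custom grader concrete, here is a minimal Python sketch of a scoring function that rewards both correct tool selection and a correct final answer. It is an illustration only and does not use OpenAI's RFT grader format or API: the `Rollout` class, the `grade` function, and the `tool_weight` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """Hypothetical record of one agent attempt (not an OpenAI type)."""
    tool_calls: list[str]   # names of the tools the model called, in order
    final_answer: str       # the model's final response


def grade(rollout: Rollout, expected_tools: list[str],
          expected_answer: str, tool_weight: float = 0.5) -> float:
    """Return a score in [0, 1] combining tool choice and answer correctness."""
    if expected_tools:
        # Count positions where the called tool matches the expected tool.
        matches = sum(
            1 for got, want in zip(rollout.tool_calls, expected_tools)
            if got == want
        )
        # Dividing by the longer sequence penalises missing and extra calls.
        tool_score = matches / max(len(expected_tools), len(rollout.tool_calls))
    else:
        # No tools expected: full credit only if none were called.
        tool_score = 1.0 if not rollout.tool_calls else 0.0

    # Case-insensitive exact match on the final answer.
    answer_ok = rollout.final_answer.strip().lower() == expected_answer.strip().lower()
    answer_score = 1.0 if answer_ok else 0.0

    return tool_weight * tool_score + (1 - tool_weight) * answer_score


if __name__ == "__main__":
    attempt = Rollout(tool_calls=["search"], final_answer="41")
    print(grade(attempt, ["search", "calculator"], "42"))
```

Raising `tool_weight` in this sketch would push the score towards rewarding tool selection over answer correctness, which is the kind of use-case-specific trade-off a custom grader is meant to express.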