Ultimate Guide to ChatKit Widgets: How to Actually Use Them

Updated June 2026. Rewritten for the current OpenAI ChatKit widgets model, with working patterns, the gotchas we hit in production, and when a widget is the wrong call.

OpenAI ChatKit widgets are the part of ChatKit that turns a plain text chat into something that looks like a real app. Instead of the assistant replying with a wall of markdown, it returns a card, a list, a form, or a set of buttons that the user can actually interact with. Done well, it is the difference between a chatbot and a product. Done badly, it is a slow form with extra steps.

We are Osher Digital, a Brisbane-based AI consultancy, and we have shipped ChatKit interfaces for clients who wanted a chat surface that did more than talk. This guide is what we wish the docs led with: what the widgets are, how they fit together, the patterns that hold up, and the places they bit us. If you want the broader picture of the SDK first, start with our guide to OpenAI ChatKit and come back here for the widget detail.

One note before we start. The ChatKit widget schema moves quickly, so treat the code here as patterns rather than gospel and check the field names against the official ChatKit documentation before you ship. The shapes change. The thinking does not.

What ChatKit Widgets Are

A ChatKit widget is a structured block of UI that the assistant emits as part of its response, rendered natively by the ChatKit front end. Rather than describing a product in a paragraph, the assistant returns a product card with an image, a price, and a “Buy” button. Rather than asking five questions in sequence, it returns a form with five fields. The model decides when a widget fits and what goes in it; ChatKit handles the rendering.

This matters because chat is a terrible interface for some things. Picking a date, choosing from twelve options, confirming an order: all of these are slower and more error-prone as free text than as a tappable control. Widgets let you keep the conversational flow where it helps and drop into structured UI where it does not. That is the whole point of them.

If you are building on the React side, the @openai/chatkit-react package gives you the components and hooks to mount the chat surface and receive widget interactions.

The server side is where you define what the assistant can render and what happens when a user acts on a widget. That split, dumb client and smart server, is the same one you want for any agent that takes real actions.

The ChatKit Widget Types You Will Actually Use

There are more widget primitives than most projects need. In practice, a handful do almost all the work.

Cards group a title, some text, an optional image, and one or more actions. This is the workhorse: a product, a booking, a search result, a record summary.
Lists render a set of items, usually cards, with consistent layout. Search results and option pickers live here.
Forms collect structured input: text fields, selects, date pickers, checkboxes. They submit back to your server as a single payload rather than as parsed free text.
Buttons and action rows turn a decision into a tap. Confirm, cancel, “show me more”, “book this”.

You compose these. A search result is a list of cards, each card carrying a button that fires an action. Keep the composition shallow. Deeply nested widgets are hard to read on a phone and harder to debug when a field comes back empty.
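A shallow composition like that can be sketched with a few TypeScript types. The type names and fields here (WidgetAction, CardWidget, ListWidget, SearchHit, the view_record action) are our own placeholders for illustration, not the official ChatKit schema, so check the real field names before you copy anything.

```typescript
// Hypothetical widget shapes for illustration; not the official ChatKit schema.
type WidgetAction = { type: string; payload: Record<string, unknown> };

type ButtonWidget = { type: "button"; label: string; action: WidgetAction };

type CardWidget = {
  type: "card";
  title: string;
  text?: string;
  actions: ButtonWidget[];
};

type ListWidget = { type: "list"; items: CardWidget[] };

// A made-up search hit coming from your own data layer.
type SearchHit = { id: string; name: string; summary: string };

// Compose one level deep: a list of cards, each carrying a single button.
function searchResultsList(hits: SearchHit[]): ListWidget {
  return {
    type: "list",
    items: hits.map((hit): CardWidget => ({
      type: "card",
      title: hit.name,
      text: hit.summary,
      actions: [
        {
          type: "button",
          label: "View details",
          action: { type: "view_record", payload: { recordId: hit.id } },
        },
      ],
    })),
  };
}

const results = searchResultsList([
  { id: "rec_1", name: "Initial consult", summary: "30 minutes" },
  { id: "rec_2", name: "Follow-up", summary: "15 minutes" },
]);
```

Nothing nests deeper than list, card, button, which keeps the payload easy to trace when a field comes back empty.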

How to Render a ChatKit Widget

The flow has three parts: mount the chat surface on the client, define the widget your server returns, and handle the action when the user interacts. Here is the client side, kept minimal.

import { ChatKit, useChatKit } from "@openai/chatkit-react";

export function SupportChat() {
  const chat = useChatKit({
    api: { url: "/api/chatkit" },
    // Fired when a user interacts with a widget the assistant rendered
    onWidgetAction: async (action) => {
      if (action.type === "book_appointment") {
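        await fetch("/api/book", {
          method: "POST",
          // Declare the JSON body so the server parses it as JSON
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(action.payload),
        });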
      }
    },
  });

  return <ChatKit control={chat.control} className="h-[600px]" />;
}

On the server, you return a widget as a structured object rather than a string. The exact schema is defined by ChatKit, but the shape is consistent: a type, some content, and actions that carry a payload back to your onWidgetAction handler. A card with a single action looks roughly like this.

// Server-side: the widget your tool or handler returns
const appointmentCard = {
  type: "card",
  title: "Tuesday, 11 June at 2:00 PM",
  text: "30 minute consultation with Dr Lee",
  actions: [
    {
      type: "button",
      label: "Confirm booking",
      // This payload arrives in onWidgetAction on the client
      action: { type: "book_appointment", payload: { slotId: "slot_8842" } },
    },
  ],
};

The key idea is that the action carries an explicit payload you control. Do not rely on the model to re-state the slot ID in text. Put the ID in the payload when you build the widget, so the confirmation is exact and the model cannot fumble it. This single habit removes a whole category of bugs.
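One way to enforce that habit is to build the card in a function that takes your own record, so the ID can only come from your data. This is a minimal sketch: the Slot type and buildAppointmentCard helper are hypothetical, and the widget shape mirrors the rough example above rather than a confirmed ChatKit schema.

```typescript
// Hypothetical record from your booking system.
type Slot = {
  id: string;
  startsAt: string;
  practitioner: string;
  minutes: number;
};

// Build the widget from the record so no field depends on model-generated text.
function buildAppointmentCard(slot: Slot) {
  return {
    type: "card",
    title: slot.startsAt,
    text: `${slot.minutes} minute consultation with ${slot.practitioner}`,
    actions: [
      {
        type: "button",
        label: "Confirm booking",
        // The ID is copied from the record, never parsed out of model text.
        action: { type: "book_appointment", payload: { slotId: slot.id } },
      },
    ],
  };
}

const card = buildAppointmentCard({
  id: "slot_8842",
  startsAt: "Tuesday, 11 June at 2:00 PM",
  practitioner: "Dr Lee",
  minutes: 30,
});
```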

Building an Interactive ChatKit Widget With a Form

Forms are where widgets earn their keep, because they replace a fragile multi-turn text exchange with one clean submission. Instead of asking for name, then email, then preferred time across three messages, you return one form and get back a structured payload.

const intakeForm = {
  type: "form",
  title: "Book a consultation",
  fields: [
    { name: "full_name", label: "Full name", type: "text", required: true },
    { name: "email", label: "Email", type: "email", required: true },
    { name: "service", label: "Service", type: "select",
      options: ["Initial consult", "Follow-up", "Second opinion"] },
    { name: "preferred_date", label: "Preferred date", type: "date" },
  ],
  submit: { label: "Request booking", action: { type: "submit_intake" } },
};

When the user submits, the whole field set arrives as one payload on submit_intake. You validate it server-side (never trust the client form alone), write it to your system, and return a confirmation card. The conversation stays in the chat, but the data collection happened in a control built for it. For agents that need to take real actions off the back of these submissions, the wiring is the same work we describe in AI agent development.
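The server-side check might look something like the sketch below. The validateIntake function and its result types are our own illustration, the field names mirror the form above, and the email pattern is deliberately loose; a real build would likely lean on whatever validation library the project already uses.

```typescript
// Hypothetical validator for the submit_intake payload.
type IntakeData = {
  full_name: string;
  email: string;
  service: string | null;
  preferred_date: string | null;
};

type ValidationResult =
  | { ok: true; data: IntakeData }
  | { ok: false; errors: string[] };

const SERVICES = ["Initial consult", "Follow-up", "Second opinion"];

// Treat anything that is not a string as empty, and trim the rest.
function asString(value: unknown): string {
  return typeof value === "string" ? value.trim() : "";
}

function validateIntake(payload: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];
  const fullName = asString(payload["full_name"]);
  const email = asString(payload["email"]);
  const service = asString(payload["service"]);
  const preferredDate = asString(payload["preferred_date"]);

  if (fullName === "") errors.push("full_name is required");
  // Loose check: text, an "@", more text, a dot, more text.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) errors.push("email is invalid");
  // Optional fields are only checked when present.
  if (service !== "" && SERVICES.indexOf(service) === -1) {
    errors.push("service is not a known option");
  }
  if (preferredDate !== "" && Number.isNaN(Date.parse(preferredDate))) {
    errors.push("preferred_date is not a date");
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    data: {
      full_name: fullName,
      email,
      service: service === "" ? null : service,
      preferred_date: preferredDate === "" ? null : preferredDate,
    },
  };
}
```

Returning the full error list, rather than stopping at the first failure, lets you re-render the form once with every problem marked.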

When to Use a Widget and When Not To

Reach for a widget when the user has to choose, confirm, or enter structured data. Those are the cases where a tappable control beats text every time:

Picking a date or time, where free text is slow and ambiguous.
Choosing one option from a set, such as a service or a plan.
Confirming an action that has consequences, like a booking or a payment.
Collecting a few related fields at once, instead of asking for them one message at a time.
Showing a record the user then acts on, such as an order or a ticket.

Skip the widget when plain text is genuinely better. An explanation, a summary, an answer to a question: forcing those into a card adds friction and hides the content. We have seen teams widget-ify everything because they were excited about the feature, and the result felt like filling in a tax return one card at a time. Text is a feature, not a fallback.

The other case to skip widgets is when you actually need a full custom interface. If the interaction is the core of your product and demands precise layout, animation, or complex state, build it as a real app surface and use ChatKit for the conversational parts. Widgets are excellent for structured moments inside a chat. They are not a replacement for a designed application.

ChatKit Widgets: The Production Gotchas

The demos are smooth. Production is where you meet the edges. Here are the ones that cost us time, so they do not cost you any.

The model omits a required field. If you let the model populate widget content freely, it will occasionally leave out a field your renderer expects, and the widget renders blank or breaks. Build widgets from your own data in code wherever you can, and only let the model choose which widget to show, not assemble its internals field by field.
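A small registry is one way to hold that line. In this sketch the model's output is reduced to a widget name and a record ID, and code fills in every field. The ModelChoice shape, the builder map, and the in-memory orders are all hypothetical stand-ins for your own tool output and data layer.

```typescript
// Hypothetical: the only thing the model decides is which widget and which record.
type ModelChoice = { widget: string; recordId: string };

type Order = { id: string; item: string; totalAud: number };
type Widget = { type: "card"; title: string; text: string };
type WidgetBuilder = (order: Order) => Widget;

// In-memory stand-in for a real data store.
const orders = new Map<string, Order>([
  ["ord_1", { id: "ord_1", item: "Desk lamp", totalAud: 89 }],
]);

function orderSummary(order: Order): Widget {
  return {
    type: "card",
    title: `Order ${order.id}`,
    text: `${order.item}, AUD ${order.totalAud}`,
  };
}

// A Map rather than a plain object, so a name like "toString" cannot match by accident.
const builders = new Map<string, WidgetBuilder>([["order_summary", orderSummary]]);

function renderChoice(choice: ModelChoice): Widget | null {
  const build = builders.get(choice.widget);
  const order = orders.get(choice.recordId);
  // Unknown widget name or missing record: return null so the caller falls back to text.
  if (build === undefined || order === undefined) return null;
  return build(order);
}
```

Because every field is set in orderSummary, a card either renders complete or does not render at all.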

Stale actions. A user scrolls up and taps “Confirm” on a card from four messages ago. The slot is long gone. Every action payload needs server-side validation against current state, and your handler needs a graceful “that option is no longer available” path. Assume every button can be pressed late.
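As a rough sketch of that check, the handler below looks up the slot's current status before doing anything else. The slotStatus map stands in for a real booking system, and handleBookAppointment and its result shape are hypothetical names, not part of ChatKit.

```typescript
type SlotStatus = "open" | "taken";
type ActionResult = { ok: boolean; message: string };

// In-memory stand-in for the real booking system.
const slotStatus = new Map<string, SlotStatus>([
  ["slot_8842", "open"],
  ["slot_1001", "taken"],
]);

function handleBookAppointment(payload: { slotId: string }): ActionResult {
  // Check current state, not the state at the time the card was rendered.
  const status = slotStatus.get(payload.slotId);
  if (status !== "open") {
    // Covers both taken slots and IDs that no longer exist.
    return { ok: false, message: "That option is no longer available." };
  }
  slotStatus.set(payload.slotId, "taken");
  return { ok: true, message: "Booking confirmed." };
}
```

In a real system the check and the write need to happen atomically, for example in a single database transaction, or two late taps can still both pass the check.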

Mobile layout. A card that looks tidy on a wide screen wraps badly on a phone, and most real ChatKit traffic is mobile, a good share of it iOS. Test every widget at a narrow width before you call it done, and keep titles and labels short.

Double submission. Users tap twice when a network is slow. Without an idempotency key on the action, you get two bookings. We learned this the way everyone does, with a duplicate in production. Put a unique key in the payload and dedupe on the server.
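The dedupe itself is small. This sketch assumes the widget payload carries an idempotencyKey field that you generated when you built the widget; that field name, the bookOnce function, and the in-memory map are illustrative, and a production version would use a shared store such as a database table with a unique constraint on the key.

```typescript
type BookingPayload = { slotId: string; idempotencyKey: string };
type BookingResult = { bookingId: string; duplicate: boolean };

// Maps idempotencyKey to the booking it already produced.
const processed = new Map<string, string>();
let nextBooking = 1;

function bookOnce(payload: BookingPayload): BookingResult {
  const existing = processed.get(payload.idempotencyKey);
  if (existing !== undefined) {
    // Same key seen before: return the original booking instead of creating another.
    return { bookingId: existing, duplicate: true };
  }
  const bookingId = `booking_${nextBooking}`;
  nextBooking += 1;
  processed.set(payload.idempotencyKey, bookingId);
  return { bookingId, duplicate: false };
}
```

A double tap sends the same key twice, so the second call returns the first booking and the user sees one confirmation.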

ChatKit Widgets vs Building Your Own UI

The honest comparison. ChatKit widgets give you native rendering, a consistent look, and a tested interaction model for free. You write a schema, not a component library. For most chat-first products, that is exactly the trade you want, and it gets you to production far faster than hand-rolling.

You give up fine control. If your brand demands a very specific look, or you need behaviour the widget set does not support, you will hit the ceiling. At that point you either accept the constraints or move that interaction out of ChatKit into your own React components and keep ChatKit for the conversation. There is no shame in a hybrid; it is usually the right answer for a mature product.

ChatKit sits alongside the rest of OpenAI’s agent tooling, which we cover in our guide to OpenAI AgentKit if you are choosing across the stack. The ChatKit examples and source are worth a read before you commit either way, and they are the fastest way to see the current widget schema in practice.

What ChatKit Widgets Cost to Run

The widgets themselves are part of ChatKit, so there is no separate widget fee. Your cost is the model usage behind the conversation plus your own hosting. For a typical support or booking assistant, the model spend lands in the low tens to low hundreds of dollars a month depending on volume and which model you run; a build for an Australian client handling a few thousand sessions a month sat around $150 to $400 AUD a month in model fees.

The build effort is the real cost. Wiring widgets to real systems, validation, and the edge cases above usually run a few weeks of engineering for a first interface.

If you want a chat surface that books, buys, or files something rather than just answering questions, that is the work we do. Book a call and we will scope it with you.

Frequently Asked Questions

What are OpenAI ChatKit widgets?

ChatKit widgets are structured UI blocks, such as cards, lists, forms, and buttons, that the assistant renders inside a ChatKit chat instead of replying with plain text. They let users tap, choose, and submit data through controls built for the job, rather than typing everything as free text.

How do I add a widget to ChatKit?

Mount the chat surface with the ChatKit React components, define the widget as a structured object your server returns, and handle the user’s interaction in an action handler on the client. The widget carries an explicit payload so the action that comes back is exact rather than re-stated by the model.

What widget types does ChatKit support?

The ones you will use most are cards, lists, forms, and buttons or action rows. You compose them, so a search result is typically a list of cards each carrying an action. Check the official documentation for the full current set, since the schema is still evolving.

When should I use a widget instead of text?

Use a widget when the user has to choose, confirm, or enter structured data, such as picking a date or confirming an order. Use plain text for explanations and answers. Putting everything in widgets adds friction and makes the conversation feel like a form.

Do ChatKit widgets work on mobile?

Yes, and most real traffic is mobile, so design for it. Cards and forms that look fine on a wide screen can wrap awkwardly on a phone. Keep titles and labels short and test every widget at a narrow width before shipping.

How much do ChatKit widgets cost?

There is no separate fee for widgets; you pay for the model usage behind the chat plus your own hosting. A typical support or booking assistant runs from the low tens to the low hundreds of dollars a month in model fees depending on volume. The bigger cost is the engineering to wire widgets into your systems properly.

Can ChatKit widgets replace a custom app interface?

For structured moments inside a chat, yes. For an interaction that is the core of your product and needs precise layout or complex state, no. Many mature products use a hybrid: ChatKit for the conversation and custom React components for the parts that need full control.

Does ChatKit run on iOS and native apps, or only the web?

ChatKit is built around a web surface and the React SDK, which is what most teams embed. You can render that surface inside a native iOS or Android app through a web view, and the widgets behave the same way. If you need a fully native control set rather than an embedded web view, that is the point where you move the interaction into your own app code and keep ChatKit for the conversation.

ChatKit widgets are one of those features that look simple in a demo and reveal their depth in production. Get the payload discipline and the edge cases right and they are a fast way to ship a chat that actually does things. If you want help building one that holds up, get in touch.
