Shipping LLM Features with Structured Outputs

Last week I wrote about rapid prototyping for consulting work and how a single working session can take an idea from “what if?” to “click here.” This post covers the next step: turning model output into reliable data your app can trust.

The short version: ask models for structured results and validate them. Don’t rely on “please return JSON.” Use JSON Schema or typed models, enforce it, and build a simple fallback when the response slips. This is the difference between a cool demo and a feature you can ship. For background, see OpenAI’s Structured Outputs guide and the OpenAI Cookbook intro.

What “structured outputs” actually mean

With structured outputs, the model returns a payload that conforms to a schema (JSON Schema or a typed class), and your SDK can parse to that shape. That’s more reliable than generic “JSON mode” and pairs well with tool/function calling. If you’re on Azure, their docs mirror the approach and list the supported subset of JSON Schema: Azure OpenAI structured outputs.

Rule of thumb: if the result will be consumed by code, treat it like an API contract. Give the model the contract, not just a vibe.

A tiny, production-shaped pattern

Below is a minimal Node/TypeScript example using OpenAI’s Responses API with a JSON Schema, then validating with Zod before persistence.

1) Define the contract once

// schema.ts
import { z } from "zod";

export const Ticket = z.object({
  priority: z.enum(["low", "medium", "high"]),
  title: z.string().min(1),
  summary: z.string().min(1),
  tags: z.array(z.string()).default([]),
});
export type Ticket = z.infer<typeof Ticket>;

2) Ask the model for the exact shape

// llm.ts
import OpenAI from "openai";
import { Ticket } from "./schema";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });

export async function triageToTicket(userText: string) {
  const response = await client.responses.create({
    model: "gpt-4.1-mini",
    input: `Turn this user text into a triage ticket.\nTEXT:\n${userText}`,
    // The Responses API takes the schema under text.format
    // (Chat Completions uses response_format instead).
    text: {
      format: {
        type: "json_schema",
        name: "ticket",
        schema: {
          type: "object",
          properties: {
            priority: { type: "string", enum: ["low", "medium", "high"] },
            title: { type: "string" },
            summary: { type: "string" },
            tags: { type: "array", items: { type: "string" } }
          },
          // strict mode requires every property to appear in `required`
          required: ["priority", "title", "summary", "tags"],
          additionalProperties: false
        },
        strict: true
      }
    }
  });

  const raw = response.output_text; // SDK convenience accessor; adapt to your SDK version
  return JSON.parse(raw) as unknown;
}

Structured outputs via text.format (Responses API) or response_format (Chat Completions) are documented in OpenAI’s guide and the Cookbook.

3) Validate + repair

// guard.ts
import { Ticket } from "./schema";

export function guard(result: unknown) {
  const ok = Ticket.safeParse(result);
  if (ok.success) return { kind: "ok" as const, value: ok.data };

  // Optional “repair” pass: ask the model to fix to schema using the errors
  return { kind: "error" as const, issues: ok.error.format() };
}
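The optional repair pass can be as simple as feeding the validation issues back to the model along with the invalid payload. A minimal sketch of building that prompt; the function name and wording are my own, not from any SDK:

```typescript
// Build a "repair" prompt from the invalid payload and the validation issues.
// The caller sends this back to the model for one corrective attempt.
export function buildRepairPrompt(invalid: unknown, issues: unknown): string {
  return [
    "The JSON below failed validation against the ticket schema.",
    "Return a corrected JSON object that satisfies the schema. Return only JSON.",
    `INVALID:\n${JSON.stringify(invalid)}`,
    `ERRORS:\n${JSON.stringify(issues)}`,
  ].join("\n");
}
```

Because the prompt carries the concrete errors, the second attempt usually lands inside the schema.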

Prompt patterns that reduce flakiness

You’re still extracting structured data from a conversation, so precision matters. The prompt you pass in deserves as much care as the schema if you want consistent results.

1) Give the model the job and the contract.

You are a senior triage assistant. Produce a JSON object that matches the provided JSON Schema.
Only return the JSON object, nothing else.
If required fields are missing, infer sensible defaults and continue.

2) Provide examples for edge cases.

Edge case examples:
- If vague user text → set priority to "low" and include a "needs-clarification" tag.
- If PII present → include a "pii" tag and summarize without PII.

3) Tell it how to behave if uncertain.

If you cannot satisfy the schema, return a minimal valid object with a "needs-review" tag.

These patterns align with how structured outputs and function/tool schemas are enforced. See the Structured Outputs guide for details.
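The three patterns above can be folded into a single instruction block to prepend to the user text. A sketch, with illustrative wording:

```typescript
// Assemble the job description, edge-case examples, and uncertainty rule
// into one instruction string for the model.
export function buildTriageInstructions(): string {
  return [
    "You are a senior triage assistant. Produce a JSON object that matches the provided JSON Schema.",
    "Only return the JSON object, nothing else.",
    'If the user text is vague, set priority to "low" and add a "needs-clarification" tag.',
    'If PII is present, add a "pii" tag and summarize without PII.',
    'If you cannot satisfy the schema, return a minimal valid object with a "needs-review" tag.',
  ].join("\n");
}
```

Keeping the instructions in one function makes it easy to version them alongside the schema they describe.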

When you need action, not just data: tool/function calling

If the next step is “create a JIRA issue,” define a tool with a parameter schema and let the model call it. The same strict schema applies to tool arguments, which reduces brittle string parsing. Start with one tool, log each invocation, and gate execution behind human approval in production. The basics are in Function calling and the API reference.

const tools = [
  {
    type: "function",
    name: "create_ticket",
    description: "Create a ticket in the tracker",
    parameters: {
      type: "object",
      properties: {
        title: { type: "string" },
        summary: { type: "string" },
        priority: { type: "string", enum: ["low", "medium", "high"] }
      },
      required: ["title", "summary", "priority"],
      additionalProperties: false
    },
    strict: true
  }
];

// Call the model with tools. Route tool calls to your code.
// Keep a review switch in front of actual side effects.
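Routing tool calls can be a lookup table keyed by tool name, so unknown names fail loudly instead of silently doing nothing. A minimal sketch; the handler body is a stand-in for your real tracker integration:

```typescript
// Map tool names to handlers; unknown names become explicit errors.
type ToolHandler = (args: Record<string, unknown>) => string;

const handlers: Record<string, ToolHandler> = {
  // Stand-in for a real "create ticket" side effect.
  create_ticket: (args) => `created: ${String(args.title)}`,
};

export function routeToolCall(name: string, argsJson: string): string {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(JSON.parse(argsJson) as Record<string, unknown>);
}
```

In production, the human-approval gate mentioned above sits between this router and the actual side effect.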

Fallbacks that keep your app upright

API calls are generally cheap enough that you can retry once with a “repair” prompt that includes the validation errors. Pair that with a “needs review” flag (or something like it) in your app for responses that only manage a minimal valid object; between them you cover most edge cases while keeping the feature functional.

One more case you need to cover: refusals. When the model declines a request, the refusal may or may not come back in your structured format, so check for a content part of type "refusal" and surface a user-friendly explanation.

  "output": [{
    "id": "msg_10987654321",
    "type": "message",
    "role": "assistant",
    "content": [
      {
        "type": "refusal",
        "refusal": "I'm sorry, I cannot assist with that request."
      }
    ]
  }],
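A small helper can pull the refusal message out of a payload shaped like the excerpt above. The types are simplified to just the fields we read:

```typescript
// Scan the output items for a refusal content part; return its message,
// or null if the response contains no refusal.
type ContentPart = { type: string; refusal?: string };
type OutputItem = { type: string; content?: ContentPart[] };

export function findRefusal(output: OutputItem[]): string | null {
  for (const item of output) {
    for (const part of item.content ?? []) {
      if (part.type === "refusal" && part.refusal) return part.refusal;
    }
  }
  return null;
}
```

When this returns a string, show it (or a friendlier paraphrase) to the user instead of running the JSON through the guard.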

Further reading (and where I’ve used these patterns)