SwiftUI Guide

Build an AI Chat App in SwiftUI: Streaming From Scratch vs the Pre-Wired Version

A working AI chat screen is deceptively hard: the bubble UI takes an afternoon, but token streaming, cancellation, and secure key handling are where weeks disappear. This guide walks the from-scratch path step by step, then shows what's already wired in The Swift Kit.

Last updated: 2026-06-09 · 8 min read · By Ahmed Gagan, iOS Engineer

Quick Answer

To build an AI chat app in SwiftUI you stream responses from an LLM (OpenAI, Anthropic Claude, or Apple Foundation Models) into a scrolling list of message bubbles, appending tokens to the last message as they arrive over a URLSession bytes stream. The hard parts are not the UI — they are server-side key handling, per-token state updates without stutter, request cancellation, and rate limiting. You can write this from scratch in roughly a week, or start from The Swift Kit, where the streaming chat, proxied keys, and rate limiting are already wired.

From-scratch effort: ~1 week for a robust streaming chat (UI, proxy, cancellation, limits)

Recommended transport: URLSession.bytes(for:) + SSE parsing for true token streaming

Pre-wired in The Swift Kit: OpenAI + Claude + Apple Foundation Models, keys proxied, per-user rate limiting

Why the bubbles are the easy 10%

Most 'build a SwiftUI ChatGPT clone' tutorials stop at a pretty bubble list and a single non-streaming request. That demo looks finished but ships almost none of the real work. The hard 90% is everything around the UI: streaming tokens so the response feels alive instead of arriving as a frozen block, keeping the key off the device so it can't be scraped from your binary, cancelling an in-flight generation cleanly, retrying after a dropped connection without losing the partial answer, and stopping a single user from burning your entire OpenAI budget in an afternoon. None of that shows up in a screenshot, which is exactly why it gets underestimated. Streaming in particular changes the architecture: your view model must mutate the last message in place on the main actor, your network layer reads bytes incrementally rather than awaiting a full body, and your provider abstraction has to normalize different SSE formats between OpenAI and Claude.

Streaming changes your whole network + state layer, not just the view
Server-side key proxying is non-negotiable for a shipped app
Cancellation, retry, and rate limiting are the parts demos skip

From scratch vs the pre-wired version

Writing it yourself is the right call when AI chat IS the product and you want full control over the provider abstraction, prompt pipeline, and streaming internals — you will learn a lot and own every line. The trade-off is honest: expect roughly a week to reach something robust, plus ongoing maintenance as provider APIs shift. The pre-wired path is The Swift Kit, where streaming chat against OpenAI, Anthropic Claude, and free on-device Apple Foundation Models is already built, with keys proxied through Supabase Edge Functions and per-user rate limiting in place. You start at the feature work — your prompts, your UX — instead of the plumbing. The Swift Kit is $99 one-time for unlimited commercial projects with lifetime updates and a 14-day refund, so the math is simply whether a week-plus of plumbing is worth more than $99 to you. If chat is one feature in a larger app rather than the whole app, the pre-wired version almost always wins.

From scratch: full control, ~1 week + maintenance, best when chat is the core
Pre-wired: ship today, $99 one-time, best when chat is one feature among many
Both end at the same UI — the difference is the week of plumbing underneath

Build a streaming AI chat screen in SwiftUI

Do it yourself with the steps below, or skip the plumbing — The Swift Kit ships this exact stack (streaming chat, server-proxied keys, per-user rate limiting) pre-wired so you start at step 7.

1
Model your message + conversation state
Define an Identifiable Message with a role and a mutable text string, and hold the messages in an @Observable view model. The mutable text is what lets you append streamed tokens to the last bubble in place.
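```
import Foundation
import Observation

enum Role { case user, assistant }

struct Message: Identifiable {
  let id = UUID()
  let role: Role   // .user / .assistant
  var text: String // var: appended during streaming
}

@Observable final class ChatVM {
  var messages: [Message] = []
  var isStreaming = false
  var error: String? // set by the error handling in step 6
}
```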

2
Build the bubble list UI
Render messages in a ScrollViewReader + LazyVStack so you can auto-scroll as each new token arrives. Keep user and assistant bubbles visually distinct and pin a composer at the bottom.
```
ScrollViewReader { proxy in
  ScrollView {
    LazyVStack(spacing: 8) {
      ForEach(vm.messages) { msg in
        BubbleView(message: msg).id(msg.id)
      }
    }
  }
  .onChange(of: vm.messages.last?.text) { _, _ in
    // Unwrap first so the ID type matches .id(msg.id) exactly (UUID, not UUID?)
    guard let lastID = vm.messages.last?.id else { return }
    withAnimation { proxy.scrollTo(lastID, anchor: .bottom) }
  }
}
```

3
Proxy your API key server-side
Never ship an OpenAI or Anthropic key in the app binary — it can be extracted in minutes. Put the key in a serverless function (Supabase Edge Function) that calls the LLM and streams the response back. The Swift Kit does exactly this so the key never touches the client.
```
# Edge Function holds the secret; the app only knows this URL
supabase functions deploy chat
supabase secrets set OPENAI_API_KEY=sk-...
```
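On the client side, the app then only needs the function's URL and something that identifies the signed-in user. The following is a minimal sketch of building that request, not The Swift Kit's actual code: the endpoint, the `ChatRequestBody` shape, and the use of a bearer token for the user are all assumptions made for illustration.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on Linux
#endif

/// Hypothetical JSON body the Edge Function is assumed to accept.
struct ChatRequestBody: Encodable {
  struct Turn: Encodable {
    let role: String    // "user" or "assistant"
    let content: String
  }
  let messages: [Turn]
}

/// Builds a POST to your own proxy. No provider key appears anywhere:
/// the only credential is the user's session token (an assumption here).
func makeChatRequest(endpoint: URL,
                     userToken: String,
                     turns: [ChatRequestBody.Turn]) throws -> URLRequest {
  var request = URLRequest(url: endpoint)
  request.httpMethod = "POST"
  request.setValue("application/json", forHTTPHeaderField: "Content-Type")
  request.setValue("text/event-stream", forHTTPHeaderField: "Accept")
  request.setValue("Bearer \(userToken)", forHTTPHeaderField: "Authorization")
  request.httpBody = try JSONEncoder().encode(ChatRequestBody(messages: turns))
  return request
}
```

Sending the user's token rather than a shared secret is what later makes per-user limits possible, since the function can tell callers apart.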

4
Open a streaming request with URLSession bytes
Use URLSession.bytes(for:) to read the response line by line as it arrives, instead of awaiting the full body. Each Server-Sent-Events data line carries a token delta that you parse and append.
```
let (bytes, response) = try await URLSession.shared.bytes(for: request)
guard (response as? HTTPURLResponse)?.statusCode == 200 else { throw URLError(.badServerResponse) }
for try await line in bytes.lines {
  guard line.hasPrefix("data: ") else { continue } // skip event: and comment lines
  if line == "data: [DONE]" { break }               // OpenAI-style end-of-stream sentinel
  let token = try parseDelta(line)                   // your own JSON delta parser
  await MainActor.run { vm.messages[vm.messages.count - 1].text += token }
}
```
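The loop above calls a `parseDelta` helper that the guide leaves undefined. One possible shape is sketched below, assuming your proxy relays the provider's JSON unchanged: OpenAI-style chunks carry the text at `choices[0].delta.content`, while Anthropic-style `content_block_delta` events carry it at `delta.text`. The `StreamChunk` type is a hypothetical name for this example, and events without text (role headers, stop events, pings) are mapped to an empty string so the caller can append unconditionally.

```swift
import Foundation

/// One decoded SSE payload. Every field is optional so a single type
/// covers both provider shapes (hypothetical type for this sketch).
struct StreamChunk: Decodable {
  struct Choice: Decodable {
    struct Delta: Decodable { let content: String? }
    let delta: Delta
  }
  struct TextDelta: Decodable { let text: String? }

  let choices: [Choice]? // OpenAI style: choices[0].delta.content
  let delta: TextDelta?  // Anthropic style: delta.text
}

/// Takes a full "data: {...}" line and returns the text it carries.
/// Throws if the payload is not valid JSON.
func parseDelta(_ line: String) throws -> String {
  let payload = line.dropFirst("data: ".count)
  let chunk = try JSONDecoder().decode(StreamChunk.self, from: Data(payload.utf8))
  return chunk.choices?.first?.delta.content // OpenAI-style chunk
    ?? chunk.delta?.text                      // Anthropic-style event
    ?? ""                                     // no text in this event
}
```

Decoding one permissive type keeps the streaming loop provider-agnostic. If you later need finish reasons or usage counts, a per-provider decoder is likely the cleaner design.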

5
Append tokens on the main actor without stutter
Mutate the last message's text on the MainActor so SwiftUI redraws smoothly. Batch very fast token streams if you see frame drops, and flip isStreaming so the composer shows a stop button.
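If per-token updates do cause frame drops, one way to batch is to buffer deltas and hand them to the view model at most once per interval. The sketch below is an illustration under assumptions, not a measured recommendation: `TokenBatcher` is a hypothetical helper, the 50 ms default is arbitrary, and the clock is passed in as a parameter so the logic can be checked without waiting.

```swift
import Foundation

/// Collects token deltas and releases them in batches (hypothetical helper).
struct TokenBatcher {
  private var pending = ""
  private var lastFlush: TimeInterval
  let interval: TimeInterval

  init(interval: TimeInterval = 0.05,
       now: TimeInterval = Date().timeIntervalSinceReferenceDate) {
    self.interval = interval
    self.lastFlush = now
  }

  /// Buffers `token`. Returns the accumulated text once `interval`
  /// has elapsed since the last flush, otherwise nil.
  mutating func add(_ token: String,
                    now: TimeInterval = Date().timeIntervalSinceReferenceDate) -> String? {
    pending += token
    guard now - lastFlush >= interval else { return nil }
    return flush(now: now)
  }

  /// Returns whatever is still buffered. Call once when the stream ends
  /// so the final tokens are not lost.
  mutating func flush(now: TimeInterval = Date().timeIntervalSinceReferenceDate) -> String? {
    guard !pending.isEmpty else { return nil }
    let batch = pending
    pending = ""
    lastFlush = now
    return batch
  }
}
```

In the streaming loop you would append only when `add` returns a batch, then call `flush` after the loop exits, which cuts the number of main-actor hops without changing the final text.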
6
Handle cancellation and errors
Store the streaming Task so a Stop button can call task.cancel(). Wrap the loop to catch network drops mid-stream and surface a retry, leaving the partial message intact rather than discarding it.
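```
streamTask = Task {
  do { try await stream() }
  catch is CancellationError { /* keep partial text */ }
  // URLSession reports a cancelled task as URLError(.cancelled): treat it the same way
  catch let urlError as URLError where urlError.code == .cancelled { /* keep partial text */ }
  catch { vm.error = error.localizedDescription }
  vm.isStreaming = false
}
```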
7
Add rate limiting and a paywall
Free AI usage is a cost sink — gate requests with per-user limits in your Edge Function and put heavier usage behind a paywall. The Swift Kit wires Supabase Edge Function rate limiting and RevenueCat entitlements so this is a config change, not a build.
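Enforcement stays on the server, but the app still has to react sensibly when the function refuses a request. The sketch below shows one way to map the proxy's response to a UI decision. It assumes the function answers with HTTP 429 plus an optional `Retry-After` header (in seconds) when a user is over the limit, and with HTTP 402 when the request needs a paid entitlement. Both status choices and the `ChatGate` type are assumptions for illustration, not documented behaviour of The Swift Kit or Supabase.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // HTTPURLResponse lives here on Linux
#endif

/// What the chat screen should do with a proxy response (hypothetical type).
enum ChatGate: Equatable {
  case allowed                               // start reading the stream
  case rateLimited(retryAfter: TimeInterval?) // show "try again in N seconds"
  case paywall                               // present the upgrade screen
  case failed(status: Int)                   // generic error with retry
}

func gate(for response: HTTPURLResponse) -> ChatGate {
  switch response.statusCode {
  case 200..<300:
    return .allowed
  case 429:
    // Retry-After may be absent or non-numeric; both cases yield nil.
    let seconds = response.value(forHTTPHeaderField: "Retry-After")
      .flatMap { TimeInterval($0) }
    return .rateLimited(retryAfter: seconds)
  case 402:
    return .paywall
  default:
    return .failed(status: response.statusCode)
  }
}
```

Checking the gate before iterating `bytes.lines` keeps limit and paywall handling out of the token loop, and because the decision comes from the server's status code, a modified client cannot grant itself more usage.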

From scratch vs The Swift Kit

The Swift Kit vs Build from scratch comparison
Feature	The Swift Kit	Build from scratch
Token streaming (SSE)	Pre-wired	Build yourself
Server-side key proxy	Supabase Edge Functions	DIY backend
Per-user rate limiting	Included	Build yourself
Providers	OpenAI + Claude + Apple FM	Wire each yourself
Time to first chat	Same day	~1 week
Cost	$99 one-time	Your time

Frequently Asked Questions

How do I stream AI responses token-by-token in SwiftUI?

Use URLSession.bytes(for:) and iterate the response with for try await line in bytes.lines. Each Server-Sent-Events line carries a token delta; parse it and append to the last message's text on the MainActor so SwiftUI redraws the bubble as text grows. The Swift Kit ships this streaming loop already built for both OpenAI and Claude.

Is it safe to put my OpenAI or Claude API key in the app?

No. Keys embedded in an iOS binary can be extracted with standard tools in minutes, exposing you to unlimited billing. Proxy the key through a serverless function instead — The Swift Kit uses Supabase Edge Functions so the key lives server-side and the app only ever talks to your endpoint.

OpenAI, Claude, or Apple Foundation Models — which should I use?

Apple Foundation Models run free and on-device, ideal for privacy and cost but limited in capability. OpenAI and Anthropic Claude are more powerful but billed per token and require a server proxy. The Swift Kit wires all three behind one interface so you can switch or mix without rewriting your chat layer.

How long does it take to build a streaming AI chat app from scratch?

The bubble UI is an afternoon, but a robust version — streaming, server-proxied keys, cancellation, retry on dropped connections, and per-user rate limiting — is realistically about a week, plus ongoing upkeep as provider APIs change. Starting from The Swift Kit collapses that to configuration.

How do I stop one user from running up my AI bill?

Enforce limits server-side, not in the app, since client checks are trivially bypassed. The Swift Kit implements per-user rate limiting inside Supabase Edge Functions and pairs it with RevenueCat entitlements so you can gate heavier usage behind a paywall.

Keep exploring

Related guides

Add a Paywall to a SwiftUI App
ChatGPT Wrapper Boilerplate for iOS
Best AI App Boilerplate for iOS
Add Auth to a SwiftUI App

Skip the week of plumbing

The Swift Kit ships streaming AI chat across OpenAI, Claude, and Apple Foundation Models — keys proxied, rate limiting wired, paywall ready. $99 one-time, unlimited commercial projects, lifetime updates, 14-day refund. Start at the feature work, not the plumbing.

