Build an AI Chat App in SwiftUI: Streaming From Scratch vs the Pre-Wired Version
A working AI chat screen is deceptively hard: the bubble UI takes an afternoon, but token streaming, cancellation, and secure key handling are where weeks disappear. This guide walks the from-scratch path step by step, then shows what's already wired in The Swift Kit.
To build an AI chat app in SwiftUI you stream responses from an LLM (OpenAI, Anthropic Claude, or Apple Foundation Models) into a scrolling list of message bubbles, appending tokens to the last message as they arrive over a URLSession bytes stream. The hard parts are not the UI — they are server-side key handling, per-token state updates without stutter, request cancellation, and rate limiting. You can write this from scratch in roughly a week, or start from The Swift Kit, where the streaming chat, proxied keys, and rate limiting are already wired.
Why the bubbles are the easy 10%
Most 'build a SwiftUI ChatGPT clone' tutorials stop at a pretty bubble list and a single non-streaming request. That demo looks finished but ships almost none of the real work. The hard 90% is everything around the UI: streaming tokens so the response feels alive instead of arriving as a frozen block, keeping the key off the device so it can't be scraped from your binary, cancelling an in-flight generation cleanly, retrying after a dropped connection without losing the partial answer, and stopping a single user from burning your entire OpenAI budget in an afternoon. None of that shows up in a screenshot, which is exactly why it gets underestimated. Streaming in particular changes the architecture: your view model must mutate the last message in place on the main actor, your network layer reads bytes incrementally rather than awaiting a full body, and your provider abstraction has to normalize different SSE formats between OpenAI and Claude.
- Streaming changes your whole network + state layer, not just the view
- Server-side key proxying is non-negotiable for a shipped app
- Cancellation, retry, and rate limiting are the parts demos skip
From scratch vs the pre-wired version
Writing it yourself is the right call when AI chat IS the product and you want full control over the provider abstraction, prompt pipeline, and streaming internals — you will learn a lot and own every line. The trade-off is honest: expect roughly a week to reach something robust, plus ongoing maintenance as provider APIs shift. The pre-wired path is The Swift Kit, where streaming chat against OpenAI, Anthropic Claude, and free on-device Apple Foundation Models is already built, with keys proxied through Supabase Edge Functions and per-user rate limiting in place. You start at the feature work — your prompts, your UX — instead of the plumbing. The Swift Kit is $99 one-time for unlimited commercial projects with lifetime updates and a 14-day refund, so the math is simply whether a week-plus of plumbing is worth more than $99 to you. If chat is one feature in a larger app rather than the whole app, the pre-wired version almost always wins.
- From scratch: full control, ~1 week + maintenance, best when chat is the core
- Pre-wired: ship today, $99 one-time, best when chat is one feature among many
- Both end at the same UI — the difference is the week of plumbing underneath
Build a streaming AI chat screen in SwiftUI
Do it yourself with the steps below, or skip the plumbing — The Swift Kit ships this exact stack (streaming chat, server-proxied keys, per-user rate limiting) pre-wired so you start at step 7.
- 1
Model your message + conversation state
Define an Identifiable Message with a role and a mutable text string, and hold the messages in an @Observable view model. Because text is a var, you can append streamed tokens to the last bubble in place.
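```swift
import Foundation
import Observation

enum Role { case user, assistant }

struct Message: Identifiable {
    let id = UUID()
    let role: Role    // .user or .assistant
    var text: String  // var: appended to during streaming
}

@Observable
final class ChatVM {
    var messages: [Message] = []
    var isStreaming = false
    var error: String?  // surfaced when a stream fails (used in step 6)
}
```
- 2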
Build the bubble list UI
Render messages in a ScrollViewReader + LazyVStack so you can auto-scroll to the newest token. Keep user and assistant bubbles visually distinct and pin a composer at the bottom.
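The list in this step renders a BubbleView, which the guide does not define. One possible shape is sketched below, assuming the Message and Role types from step 1 (repeated here so the block stands alone). The colors, padding, and corner radius are arbitrary choices, not part of the original design.

```swift
import SwiftUI

// Assumed to match the types from step 1.
enum Role { case user, assistant }

struct Message: Identifiable {
    let id = UUID()
    let role: Role
    var text: String
}

/// Hypothetical bubble: user messages hug the trailing edge, assistant messages the leading edge.
struct BubbleView: View {
    let message: Message

    private var isUser: Bool { message.role == .user }

    var body: some View {
        HStack {
            if isUser { Spacer(minLength: 40) }
            Text(message.text)
                .padding(.horizontal, 12)
                .padding(.vertical, 8)
                .background(isUser ? Color.accentColor : Color.gray.opacity(0.2))
                .foregroundStyle(isUser ? Color.white : Color.primary)
                .clipShape(RoundedRectangle(cornerRadius: 16))
            if !isUser { Spacer(minLength: 40) }
        }
        .padding(.horizontal)
    }
}
```

Because the view reads message.text directly, every token appended to the last message redraws only that bubble's text.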
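```swift
ScrollViewReader { proxy in
    ScrollView {
        LazyVStack(spacing: 8) {
            ForEach(vm.messages) { msg in
                BubbleView(message: msg).id(msg.id)
            }
        }
    }
    // Fires on every appended token, keeping the newest text in view.
    .onChange(of: vm.messages.last?.text) { _, _ in
        // Unwrap so the ID type matches the UUID passed to .id(_:) above.
        guard let lastID = vm.messages.last?.id else { return }
        withAnimation { proxy.scrollTo(lastID, anchor: .bottom) }
    }
}
```
- 3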
Proxy your API key server-side
Never ship an OpenAI or Anthropic key in the app binary — it can be extracted in minutes. Put the key in a serverless function (Supabase Edge Function) that calls the LLM and streams the response back. The Swift Kit does exactly this so the key never touches the client.
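On the client side, proxying means the request carries a credential for the signed-in user and nothing that belongs to the LLM provider. A minimal sketch follows, assuming a hypothetical JSON body shape (ProxyChatBody) and a user access token issued by your auth layer. Neither the type names nor the body format come from The Swift Kit, so adapt them to whatever your function expects.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // only needed when building off Apple platforms
#endif

// Hypothetical wire format for the proxy.
struct ChatTurn: Codable, Equatable {
    let role: String     // "user" or "assistant"
    let content: String
}

struct ProxyChatBody: Codable, Equatable {
    let messages: [ChatTurn]
    let stream: Bool
}

/// Builds a POST request for the proxy. `proxyURL` and `userAccessToken` are
/// placeholders: the token identifies the signed-in user, not the LLM provider.
func makeProxyRequest(proxyURL: URL,
                      userAccessToken: String,
                      turns: [ChatTurn]) throws -> URLRequest {
    var request = URLRequest(url: proxyURL)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.setValue("text/event-stream", forHTTPHeaderField: "Accept")
    request.setValue("Bearer \(userAccessToken)", forHTTPHeaderField: "Authorization")
    request.httpBody = try JSONEncoder().encode(ProxyChatBody(messages: turns, stream: true))
    return request
}
```

The provider key never appears in this code path. The function attaches it server-side, so rotating the key does not require an app update.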
```sh
# Edge Function holds the secret; the app only knows this URL
supabase functions deploy chat
supabase secrets set OPENAI_API_KEY=sk-...
```
- 4
Open a streaming request with URLSession bytes
Use URLSession.bytes(for:) to read the response line by line as it arrives, instead of awaiting the full body. Each Server-Sent Events data: line carries a token delta that you parse and append.
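The loop in this step calls a parseDelta helper that the guide leaves undefined. Here is one way it might look, assuming your proxy forwards OpenAI-style chat completion chunks unchanged. The StreamChunk and StreamParseError names are illustrative, and only the fields needed for the text delta are decoded.

```swift
import Foundation

// Mirrors only the fields needed from an OpenAI-style streaming chunk:
// {"choices":[{"delta":{"content":"Hel"}}]}
struct StreamChunk: Decodable {
    struct Choice: Decodable {
        struct Delta: Decodable { let content: String? }
        let delta: Delta
    }
    let choices: [Choice]
}

enum StreamParseError: Error { case notADataLine }

/// Extracts the text delta from one SSE `data:` line.
/// Returns "" for chunks that carry no text (for example, a role-only first chunk).
func parseDelta(_ line: String) throws -> String {
    let prefix = "data: "
    guard line.hasPrefix(prefix) else { throw StreamParseError.notADataLine }
    let payload = Data(line.dropFirst(prefix.count).utf8)
    let chunk = try JSONDecoder().decode(StreamChunk.self, from: payload)
    return chunk.choices.first?.delta.content ?? ""
}
```

Anthropic's streaming format differs: text arrives in content_block_delta events, with the text under delta.text. A multi-provider app needs a second decoder behind the same function signature.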
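```swift
// Assumes an empty assistant message was appended before the request started.
let (bytes, response) = try await URLSession.shared.bytes(for: request)
guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
    throw URLError(.badServerResponse)  // don't parse an error body as tokens
}
for try await line in bytes.lines {
    guard line.hasPrefix("data: ") else { continue }  // skip non-data SSE lines
    if line == "data: [DONE]" { break }               // OpenAI's end-of-stream marker
    let token = try parseDelta(line)
    await MainActor.run { vm.messages[vm.messages.count - 1].text += token }
}
```
- 5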
Append tokens on the main actor without stutter
Mutate the last message's text on the main actor so SwiftUI redraws smoothly. If a fast stream causes dropped frames, batch tokens and apply them to the UI at a fixed interval instead of once per token. Set isStreaming while the request runs so the composer shows a Stop button.
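Batching can be as simple as buffering tokens and releasing them at most once per interval. The TokenBatcher below is a hypothetical helper, not something from the guide or The Swift Kit, and the 30-updates-per-second default is an arbitrary starting point to tune against real frame timings.

```swift
import Foundation

/// Coalesces streamed tokens and releases them at most once per `interval`.
struct TokenBatcher {
    let interval: TimeInterval
    private var pending = ""
    private var lastFlush: Date?

    init(interval: TimeInterval = 1.0 / 30.0) {  // roughly 30 UI updates per second
        self.interval = interval
    }

    /// Buffers `token`; returns the accumulated text when it is time to update the UI, else nil.
    mutating func add(_ token: String, now: Date = Date()) -> String? {
        pending += token
        if let last = lastFlush, now.timeIntervalSince(last) < interval { return nil }
        lastFlush = now
        return flush()
    }

    /// Returns whatever is still buffered; call once when the stream ends.
    mutating func flush() -> String? {
        guard !pending.isEmpty else { return nil }
        defer { pending = "" }
        return pending
    }
}
```

Inside the streaming loop you would append to the last message only when add(_:) returns text, then call flush() once after the loop so the tail of the response is not lost.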
- 6
Handle cancellation and errors
Store the streaming Task so a Stop button can call task.cancel(). Wrap the loop to catch network drops mid-stream and surface a retry, leaving the partial message intact rather than discarding it.
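```swift
streamTask = Task {
    do {
        try await stream()
    } catch is CancellationError {
        // User pressed Stop: keep the partial text.
    } catch let urlError as URLError where urlError.code == .cancelled {
        // URLSession can report a cancelled task this way: keep the partial text.
    } catch {
        vm.error = error.localizedDescription
    }
    vm.isStreaming = false
}
```
- 7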
Add rate limiting and a paywall
Free AI usage is a cost sink — gate requests with per-user limits in your Edge Function and put heavier usage behind a paywall. The Swift Kit wires Supabase Edge Function rate limiting and RevenueCat entitlements so this is a config change, not a build.
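The per-user limit itself is a small piece of logic. The sketch below shows a fixed-window counter in Swift to match the rest of this guide. In a real deployment the same logic would run inside the Edge Function with counts stored in a database rather than in memory, and FixedWindowLimiter is an illustrative name, not The Swift Kit's implementation.

```swift
import Foundation

/// Fixed-window limiter: each user gets `limit` requests per `window` seconds.
struct FixedWindowLimiter {
    let limit: Int
    let window: TimeInterval
    private var usage: [String: (windowStart: Date, count: Int)] = [:]

    init(limit: Int, window: TimeInterval) {
        self.limit = limit
        self.window = window
    }

    /// Returns true and records the request if `userID` is still under the limit.
    mutating func allow(userID: String, now: Date = Date()) -> Bool {
        var entry = usage[userID] ?? (windowStart: now, count: 0)
        if now.timeIntervalSince(entry.windowStart) >= window {
            entry = (windowStart: now, count: 0)  // window expired: start a new one
        }
        guard entry.count < limit else { return false }
        entry.count += 1
        usage[userID] = entry
        return true
    }
}
```

A paywall then becomes a question of which limit applies: look up the user's entitlement, pick a higher limit for paying users, and have the proxy reject the request (HTTP 429 is the conventional status) when allow returns false.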
From scratch vs The Swift Kit
| Feature | The Swift Kit | Build from scratch |
|---|---|---|
| Token streaming (SSE) | Pre-wired | Build yourself |
| Server-side key proxy | Supabase Edge Functions | DIY backend |
| Per-user rate limiting | Included | Build yourself |
| Providers | OpenAI + Claude + Apple FM | Wire each yourself |
| Time to first chat | Same day | ~1 week |
| Cost | $99 one-time | Your time |
Frequently Asked Questions
How do I stream AI responses token-by-token in SwiftUI?
Is it safe to put my OpenAI or Claude API key in the app?
OpenAI, Claude, or Apple Foundation Models — which should I use?
How long does it take to build a streaming AI chat app from scratch?
How do I stop one user from running up my AI bill?
Skip the week of plumbing
The Swift Kit ships streaming AI chat across OpenAI, Claude, and Apple Foundation Models — keys proxied, rate limiting wired, paywall ready. $99 one-time, unlimited commercial projects, lifetime updates, 14-day refund. Start at the feature work, not the plumbing.
Get The Swift Kit: $99
One-time purchase · Lifetime updates · 14-day refund