Tutorial

Ship an On-Device AI Feature with Foundation Models in 1 Hour (SwiftUI 2026)

One-hour playbook for shipping a production AI feature via Apple Foundation Models. @Generable guided generation, tool calling, fallbacks, paywalling without a backend.

Ahmed Gagan
14 min read


Apple's Foundation Models framework is the fastest way to ship a real AI feature in a SwiftUI app in 2026. No API keys, no server, no monthly bill, no rate limits, no network latency. The 3-billion-parameter on-device model can summarize, classify, rewrite, and produce typed Swift objects. This is the one-hour playbook: from zero to a shippable AI feature that passes App Review, gracefully handles devices without Apple Intelligence, and paywalls properly for subscription apps.

Short version: add import FoundationModels, create a LanguageModelSession, call respond(to:generating:) with an @Generable struct, and show the output in SwiftUI. Everything else (fallback UI, streaming, paywall gate, tool calls) is a 10-minute add-on.

The minimum shippable feature (10 minutes)

We'll build a journaling mood tagger. User writes an entry, the app extracts mood, top topics, and a one-sentence summary. Entirely on-device. Works in airplane mode. Free per call.

Prerequisites:

  • Xcode 26 or newer (Xcode 26.3 recommended).
  • iOS 26 deployment target.
  • A Mac with Apple silicon (Foundation Models does not build on Intel Macs).
  • Apple Intelligence enabled on your test device (iPhone 15 Pro or newer, any iPad with M1+).

Step 1: the @Generable type

Foundation Models' killer feature for developers is guided generation via @Generable. You define a Swift struct with the shape you want, and the model returns a typed instance. No JSON parsing, no string matching, no hallucinated keys.

import FoundationModels

@Generable
struct JournalAnalysis {
    @Guide(description: "One-word mood label, like 'calm', 'anxious', 'excited'")
    let mood: String

    @Guide(description: "2-4 top topics mentioned in the entry")
    let topics: [String]

    @Guide(description: "One-sentence reflection under 140 characters")
    let summary: String
}

The @Guide descriptions are how you steer the model. Short and specific beats long and abstract. Treat them as prompts embedded in the type.

Step 2: the session call

Create a session, call respond(to:generating:), await the response. One line of real logic.

import FoundationModels

final class JournalAI {
    private let session = LanguageModelSession()

    func analyze(_ text: String) async throws -> JournalAnalysis {
        let response = try await session.respond(
            to: "Analyze this journal entry: \(text)",
            generating: JournalAnalysis.self
        )
        return response.content
    }
}

That is the entire AI layer. You now have a working feature.

Step 3: the SwiftUI view

Display the input, the result, and a loading state. Use the standard Task pattern.

struct JournalView: View {
    @State private var entry = ""
    @State private var analysis: JournalAnalysis?
    @State private var isAnalyzing = false
    let ai = JournalAI()

    var body: some View {
        VStack(alignment: .leading, spacing: 16) {
            TextEditor(text: $entry)
                .frame(minHeight: 120)
                .overlay(RoundedRectangle(cornerRadius: 12).stroke(.secondary.opacity(0.3)))

            Button("Analyze") { Task { await run() } }
                .disabled(entry.isEmpty || isAnalyzing)
                .buttonStyle(.borderedProminent)

            if isAnalyzing { ProgressView("Analyzing...") }

            if let a = analysis {
                VStack(alignment: .leading, spacing: 8) {
                    Label("Mood: \(a.mood)", systemImage: "heart")
                    Text("Topics: \(a.topics.joined(separator: ", "))")
                    Text(a.summary).italic()
                }.padding().background(.regularMaterial).clipShape(RoundedRectangle(cornerRadius: 12))
            }
        }.padding()
    }

    @MainActor
    private func run() async {
        isAnalyzing = true
        defer { isAnalyzing = false }
        analysis = try? await ai.analyze(entry)
    }
}

Step 4: handle devices without Apple Intelligence

Foundation Models only runs on Apple Intelligence-enabled devices. On older iPhones, first-generation iPad Pros, and Intel Macs, SystemLanguageModel.default.availability reports the model as unavailable.

import FoundationModels

enum AIAvailability {
    static var isAvailable: Bool {
        // Pattern-match rather than compare with ==, because the
        // .unavailable case carries an associated reason
        if case .available = SystemLanguageModel.default.availability {
            return true
        }
        return false
    }
}

In your view, check before calling. Show a graceful fallback: "AI analysis requires an iPhone 15 Pro or newer. Upgrade or enable Apple Intelligence in Settings." Never show the feature as broken; explain the requirement.
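One way to wire that in is a small gate view that wraps any AI-powered content. This is a sketch: the unavailable-reason case names (deviceNotEligible, appleIntelligenceNotEnabled, modelNotReady) are as I recall them from the iOS 26 SDK, so verify them against the FoundationModels headers before shipping.

```swift
import FoundationModels
import SwiftUI

// Wraps AI-powered content and swaps in an explanation when the
// on-device model cannot run on this device.
struct AvailabilityGate<Content: View>: View {
    @ViewBuilder var content: () -> Content

    var body: some View {
        switch SystemLanguageModel.default.availability {
        case .available:
            content()
        case .unavailable(.deviceNotEligible):
            Text("AI analysis requires an iPhone 15 Pro or newer.")
        case .unavailable(.appleIntelligenceNotEnabled):
            Text("Enable Apple Intelligence in Settings to use AI analysis.")
        case .unavailable(.modelNotReady):
            Text("The AI model is still downloading. Try again shortly.")
        case .unavailable:
            Text("AI analysis is unavailable on this device.")
        }
    }
}
```

Wrapping JournalView's analyze controls in AvailabilityGate keeps the requirement explanation in one place instead of scattering checks through the view.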

Step 5: streaming output for longer responses

For features where the output is longer (summaries, rewrites, story generation), stream token by token so the UI feels responsive. Use streamResponse(to:generating:) instead of respond(to:generating:).

// The stream yields snapshots whose content is
// JournalAnalysis.PartiallyGenerated: every property is optional until
// the model has produced it, so bind it to a matching state var, e.g.
// @State var partialAnalysis: JournalAnalysis.PartiallyGenerated?
for try await partial in session.streamResponse(
    to: prompt,
    generating: JournalAnalysis.self
) {
    await MainActor.run { partialAnalysis = partial.content }
}

Partial responses contain the progressive state. You see mood appear first, then topics, then summary, with the UI updating incrementally. Feels fast because no network is involved.

Step 6: tool calls for custom functions

Foundation Models supports tool calling, which lets the model invoke Swift functions you register. Use this when the model needs external data to produce a good answer.

final class WeatherTool: Tool {
    let name = "currentWeather"
    let description = "Get the current weather in a city"

    @Generable
    struct Arguments { let city: String }

    func call(arguments: Arguments) async throws -> String {
        // call your weather service
        return "72°F, sunny in \(arguments.city)"
    }
}

let session = LanguageModelSession(tools: [WeatherTool()])

When the model determines it needs weather, it calls your function, receives the output, and integrates it into the final response. Entirely on-device and automatic.
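To see the tool fire, here is a sketch of a call site; the prompt text is illustrative, and this would run inside an async context.

```swift
// The model decides it needs currentWeather, calls the Swift function,
// and folds the returned string into its natural-language answer.
let session = LanguageModelSession(tools: [WeatherTool()])
let response = try await session.respond(
    to: "Should I bring a jacket in San Francisco today?"
)
print(response.content)  // plain String, since no generating: type was given
```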

Step 7: paywalling AI features without a backend

One of the biggest advantages of Foundation Models is that you can paywall AI features without a backend. No API keys to protect, no rate limits to enforce, no cost per call. Use RevenueCat entitlements as the only gate.

@MainActor
private func analyze() async {
    guard RevenueCatEntitlements.shared.hasPro else {
        showPaywall = true
        return
    }
    // proceed with on-device call
    analysis = try? await ai.analyze(entry)
}

The user subscribes via RevenueCat's StoreKit integration, RevenueCat syncs entitlement locally, the paywall gate flips open, and the AI feature works forever without any server ever hearing about it. Lowest-cost business model in AI right now.
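The RevenueCatEntitlements helper above is not part of RevenueCat's SDK; one plausible shape for it, assuming an entitlement identifier of "pro" configured in the RevenueCat dashboard:

```swift
import RevenueCat

// Hypothetical wrapper around RevenueCat's customer info.
// customerInfo() serves locally cached entitlements when offline,
// so the gate keeps working in airplane mode.
final class RevenueCatEntitlements {
    static let shared = RevenueCatEntitlements()
    private(set) var hasPro = false

    func refresh() async {
        let info = try? await Purchases.shared.customerInfo()
        hasPro = info?.entitlements["pro"]?.isActive == true
    }
}
```

Call refresh() at launch and after a purchase completes so the cached hasPro flag stays current.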

Performance notes

  • Latency. First-token time is typically 100-400 ms on iPhone 15 Pro (A17 Pro). On iPhone 16 Pro (A18 Pro) expect 70-250 ms.
  • Throughput. 20-40 tokens per second sustained on-device. Slower than cloud GPT-4o (~60-80 t/s) but faster than cold-starting a cloud API for short prompts.
  • Memory. ~3 GB peak while the model is loaded. Free after the session deallocates. Keep session lifecycles short; do not hold a session between user actions.
  • Battery. A typical 5-second inference uses roughly 2-4 percent of the Neural Engine's hourly budget. Hundreds of short inferences per day are fine.
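Given the memory note above, one way to keep session lifetimes short is to scope the session to a single call rather than storing it on a long-lived object (as the earlier JournalAI class does). A sketch:

```swift
import FoundationModels

// The model loads when the session is created; its working set can be
// released once the session deallocates at the end of this scope.
func analyzeShortLived(_ text: String) async throws -> JournalAnalysis {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Analyze this journal entry: \(text)",
        generating: JournalAnalysis.self
    )
    return response.content
}
```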

What Foundation Models cannot do (yet)

Honest limitations in April 2026:

  • Not multimodal. Text in, text (or structured) out. No image understanding, no audio. Use Vision Framework for images.
  • Not state-of-the-art on complex reasoning. A 3B model is not GPT-4o. For multi-step reasoning, coding tasks, or long-form generation beyond 2-3K tokens, cloud models still win.
  • No memory across sessions. Each session starts fresh. You build your own conversation history if needed.
  • Limited to English and 9 other languages. Check availability for your target markets.
  • Cannot fine-tune. You steer via prompts and @Guide, not by training.
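Within one session, though, the transcript does carry across turns, so multi-turn exchanges need no extra plumbing; it is only across sessions (and app launches) that you persist and replay history yourself. A sketch, with the instructions text and weekText placeholder as illustrative assumptions:

```swift
import FoundationModels

// Sequential turns on the same session share a transcript, so the
// follow-up can refer back to the first exchange.
let weekText = "Slept badly, shipped the release, ran twice."
let session = LanguageModelSession(
    instructions: "You are a reflective journaling assistant."
)
let summary = try await session.respond(to: "Summarize my week: \(weekText)")
let habit = try await session.respond(to: "Now suggest one habit to change.")
```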

Production checklist before shipping

  • Handle SystemLanguageModel.availability cases explicitly with UI for each state.
  • Wrap every respond call in a try/catch with specific error messaging.
  • Respect LanguageModelSession's token limits (~4K context). Truncate user input if needed.
  • Privacy manifest: because Foundation Models runs on-device, no NSPrivacyCollectedDataType entry needed for AI inputs.
  • App Review note: state that AI runs on-device, no data leaves the device. Reviewers look for this.
  • Test on iPhone 15 Pro (slowest Apple Intelligence device) and iPhone 16 Pro (fastest) for realistic latency.
  • Handle the "Apple Intelligence not available" case with a graceful fallback, not a crash.
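For the try/catch item above, a sketch of specific error messaging, assuming a message state var alongside the earlier view's properties; the GenerationError case names are as I recall them from the iOS 26 SDK, so confirm them against the FoundationModels docs.

```swift
do {
    analysis = try await ai.analyze(entry)
} catch let error as LanguageModelSession.GenerationError {
    // Map known on-device failure modes to actionable copy
    switch error {
    case .exceededContextWindowSize:
        message = "That entry is too long for on-device analysis. Try a shorter one."
    case .guardrailViolation:
        message = "This entry can't be analyzed."
    default:
        message = "Analysis failed. Please try again."
    }
} catch {
    message = "Analysis failed. Please try again."
}
```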

What The Swift Kit ships

The Swift Kit ships Foundation Models pre-wired through a provider-agnostic AIService protocol with three built-in example features: journal mood tagger, weekly summary generator, and a smart categorizer for any list. Device availability is handled, fallback UI for non-Apple-Intelligence devices is included, and RevenueCat gating is one line of code.

$99 one-time, unlimited commercial projects. See every integration on the features page or jump to pricing.

Final recommendation

If your SwiftUI app has any natural place for AI (summaries, categorization, smart input, contextual suggestions), ship it with Foundation Models before trying a cloud alternative. The economics are better (zero marginal cost), the latency is better (no network), the privacy posture is better (nothing leaves the device), and App Review treats on-device AI more kindly than cloud AI. Use cloud models only when Foundation Models genuinely cannot do the task.


Ready to ship your iOS app faster?

The Swift Kit gives you a production-ready SwiftUI codebase with onboarding, paywalls, auth, AI integrations, and a full design system. Stop rebuilding boilerplate — start building your product.

$149 $99 one-time · save $50
  • Full source code
  • Unlimited projects
  • Lifetime updates
  • 50+ makers shipping