AI Photo App Boilerplate for iOS — Vision + Image-Gen for the Calorie-Counter / Plant-ID Class

Point the camera at a plate of food or a leaf, get structured data back. The Swift Kit ships the exact pipeline that the calorie-counter and plant-ID apps live on — Vision capture, server-proxied image analysis, and DALL·E generation — wired end to end so you build the prompt, not the plumbing.

Last updated: 2026-06-06 · 6 min read · By Ahmed Gagan, iOS Engineer

Quick Answer

The Swift Kit is an AI photo app boilerplate for iOS, $99 one-time, built for the calorie-counter / plant-ID class of app where a photo goes in and structured data comes out. It connects camera capture to OpenAI Vision analysis and DALL·E image generation through Supabase Edge Functions, so your API keys stay server-side and each user is rate-limited individually. Auth, paywall, and the photo-to-analysis loop come already connected; you write the classification prompt and the result UI, not the infrastructure.

Price: $99 one-time, lifetime updates
Vision pipeline: Camera capture → OpenAI Vision → structured result
Image gen: DALL·E via Supabase Edge Functions
Rate limiting: Per-user, server-side (Edge Functions)

The photo-to-data loop, already wired

Every app in the calorie-counter / plant-ID family is the same shape under the hood: capture an image, send it to a model, parse the answer into something your UI can render. The Swift Kit ships that loop intact. The camera and photo-picker hand a frame to a Vision call, the call runs through a Supabase Edge Function (so your OpenAI key never ships in the binary), and the streamed response lands back in SwiftUI. You're not gluing AVFoundation to a networking layer to a JSON decoder from scratch — that path already exists. What's left is the part that's actually yours: the prompt that turns 'photo of a plate' into calories and macros, or 'photo of a leaf' into a species name and care notes.

  • PhotosPicker + camera capture feeding a single analysis entry point
  • OpenAI Vision request proxied through a Supabase Edge Function
  • Structured response decoded into a Swift model ready for your result screen
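
The server half of that loop can be sketched in TypeScript, the language Supabase Edge Functions are written in. This is a minimal illustration under stated assumptions, not the kit's actual code: `MealAnalysis`, `buildVisionRequest`, and `parseMealAnalysis` are hypothetical names, the result schema is invented for the example, and the model name is an assumed choice. The request body follows the OpenAI Chat Completions shape for image input.

```typescript
// Hypothetical result schema that a calorie-counter result screen might render.
interface MealAnalysis {
  dish: string;
  calories: number;
  proteinGrams: number;
  carbsGrams: number;
  fatGrams: number;
}

// Build a Chat Completions body with one text prompt and one inline JPEG.
function buildVisionRequest(prompt: string, jpegBase64: string) {
  return {
    model: "gpt-4o-mini", // assumed model choice
    response_format: { type: "json_object" }, // ask for a JSON object back
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${jpegBase64}` },
          },
        ],
      },
    ],
  };
}

// Accept only finite, non-negative numbers.
const num = (v: unknown): number | null =>
  typeof v === "number" && Number.isFinite(v) && v >= 0 ? v : null;

// Parse the model's JSON text; return null if it does not match the schema.
function parseMealAnalysis(raw: string): MealAnalysis | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const dish = d["dish"];
  const calories = num(d["calories"]);
  const proteinGrams = num(d["proteinGrams"]);
  const carbsGrams = num(d["carbsGrams"]);
  const fatGrams = num(d["fatGrams"]);
  if (
    typeof dish !== "string" || calories === null ||
    proteinGrams === null || carbsGrams === null || fatGrams === null
  ) {
    return null;
  }
  return { dish, calories, proteinGrams, carbsGrams, fatGrams };
}
```

Validating on the server means the app only ever decodes a known shape. Note that JSON mode expects the prompt itself to ask for JSON, so the prompt you write should name the fields you want back.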

Why server-side proxying matters for this exact app

Photo-AI apps are a prime target for key scraping. A calorie counter that ships its OpenAI key in the app bundle can have that key extracted and drained soon after it gains traction, and Vision calls are expensive enough that abuse hurts fast. The Swift Kit routes every model call through Supabase Edge Functions, so the key lives on the server and each user hits a per-user rate limit before they can run up your bill. For a freemium plant-ID app where free users get, say, three scans a day before the paywall, that limit is the whole business model, and it is already enforced at the edge rather than on the honor system in the client.

  • API keys proxied server-side via Edge Functions — never in the app
  • Per-user rate limiting to gate free scans before the paywall
  • Apple Foundation Models available on-device and free for cheap pre-checks
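
To make the free-scan gate concrete, here is a rough TypeScript sketch of a per-user daily quota. It is an illustration under assumptions, not the kit's implementation: `DailyScanLimiter` and `tryConsume` are hypothetical names, and the counts live in memory only to keep the example self-contained. A deployed Edge Function would likely need to keep counts in a database or other shared store, since in-memory state is not guaranteed to survive between invocations.

```typescript
// Fixed-window quota: each non-subscriber gets N scans per UTC calendar day.
class DailyScanLimiter {
  private counts = new Map<string, { day: string; used: number }>();
  private readonly freeScansPerDay: number;

  constructor(freeScansPerDay: number) {
    this.freeScansPerDay = freeScansPerDay;
  }

  // Returns true (and records the scan) if the user may scan, false otherwise.
  tryConsume(userId: string, isSubscriber: boolean, now: Date = new Date()): boolean {
    if (isSubscriber) return true; // paid users bypass the free quota

    const day = now.toISOString().slice(0, 10); // e.g. "2026-06-06" (UTC)
    const entry = this.counts.get(userId);
    // A stored entry from an earlier day counts as zero scans used today.
    const used = entry !== undefined && entry.day === day ? entry.used : 0;

    if (used >= this.freeScansPerDay) return false;
    this.counts.set(userId, { day, used: used + 1 });
    return true;
  }
}

// Usage: three free scans a day, then the client shows the paywall.
const limiter = new DailyScanLimiter(3);
const allowed = limiter.tryConsume("user-123", false);
```

When `tryConsume` returns false, the proxy would reject the request before any paid model call is made, and the app can present the paywall.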

Generate, not just recognize

Vision answers 'what is this'; image generation answers 'show me this.' The Swift Kit wires DALL·E through the same Edge Function layer, which opens the second half of the photo-AI category — the apps that take your photo and give you a transformed one back. Think a meal-logging app that renders a plated 'goal' version of a recipe, or a garden app that generates a styled illustration of the plant it just identified. Because recognition and generation share one proxy and one rate-limiter, you can mix them in a single flow without standing up a second backend. Apple Foundation Models run on-device for free when you want a fast, private pre-classification before paying for a cloud Vision call.
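
One way to picture recognition and generation sharing a proxy is a single planning function that checks the same quota and then picks the upstream endpoint. The sketch below is an assumption about how such a layer could look, not the kit's code: `planUpstreamCall`, `ProxyPayload`, and `ProxyPlan` are hypothetical names and the model names are assumed choices. The two URLs are OpenAI's Chat Completions and Images endpoints.

```typescript
type Action = "analyze" | "generate";

interface ProxyPayload {
  prompt: string;
  imageBase64?: string; // required for "analyze", ignored for "generate"
}

type ProxyPlan =
  | { ok: true; url: string; body: Record<string, unknown> }
  | { ok: false; status: number; error: string };

// Decide what to send upstream. `allow` is the shared per-user quota check,
// so both actions draw on the same budget.
function planUpstreamCall(
  action: Action,
  userId: string,
  payload: ProxyPayload,
  allow: (userId: string) => boolean,
): ProxyPlan {
  const image = payload.imageBase64;
  // Reject malformed requests before spending any quota.
  if (action === "analyze" && image === undefined) {
    return { ok: false, status: 400, error: "analyze requires an image" };
  }
  if (!allow(userId)) {
    return { ok: false, status: 429, error: "scan quota exceeded" };
  }
  if (action === "generate") {
    return {
      ok: true,
      url: "https://api.openai.com/v1/images/generations",
      body: { model: "dall-e-3", prompt: payload.prompt, n: 1, size: "1024x1024" },
    };
  }
  return {
    ok: true,
    url: "https://api.openai.com/v1/chat/completions",
    body: {
      model: "gpt-4o-mini",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: payload.prompt },
            { type: "image_url", image_url: { url: `data:image/jpeg;base64,${image}` } },
          ],
        },
      ],
    },
  };
}
```

Because the quota check is passed in, an identify-then-illustrate flow costs the user two units from one budget rather than requiring a second limiter.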

What you still have to build

Honest scope: the boilerplate gives you the pipeline, not the domain. It does not know that 200g of grilled chicken is 330 calories, and it does not ship a food or plant database. You bring the prompt engineering, the result-screen design, and any reference data or post-processing your accuracy depends on. The Swift Kit's job is to delete the four weeks of camera, networking, key-security, paywall, and auth work that sit between you and your first real Vision call — so the time you spend is on the part that makes your app yours, like tuning the prompt until macro estimates stop drifting.
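
As an example of the post-processing that stays on your side, the sketch below checks whether a model's calorie estimate agrees with its own macros, using the standard Atwater factors (4 kcal per gram of protein or carbohydrate, 9 kcal per gram of fat). The function names, the 15% tolerance, and the 165 kcal per 100g reference value are illustrative assumptions; the reference value is simply what the 330-calorie figure above implies for 200g of grilled chicken.

```typescript
interface MacroEstimate {
  calories: number;
  proteinGrams: number;
  carbsGrams: number;
  fatGrams: number;
}

// Calories implied by the macros, using Atwater general factors.
function caloriesFromMacros(proteinG: number, carbsG: number, fatG: number): number {
  return 4 * proteinG + 4 * carbsG + 9 * fatG;
}

// True when stated calories are within `tolerance` (relative) of the
// calories implied by the macros. The default tolerance is an assumption.
function isSelfConsistent(est: MacroEstimate, tolerance = 0.15): boolean {
  const implied = caloriesFromMacros(est.proteinGrams, est.carbsGrams, est.fatGrams);
  if (implied === 0) return est.calories === 0;
  return Math.abs(est.calories - implied) / implied <= tolerance;
}

// Scale a per-100g reference value (from your own database) to a portion.
function caloriesForPortion(grams: number, kcalPer100g: number): number {
  return (grams * kcalPer100g) / 100;
}

// Usage: 200g of grilled chicken at an assumed 165 kcal per 100g.
const portionCalories = caloriesForPortion(200, 165);
const plausible = isSelfConsistent({
  calories: portionCalories,
  proteinGrams: 62,
  carbsGrams: 0,
  fatGrams: 7,
});
```

A check like this will not make the model accurate, but it can flag responses worth retrying or hiding while you tune the prompt.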

The Swift Kit vs. building the photo-AI pipeline from scratch

| Feature | The Swift Kit | Build from scratch |
| --- | --- | --- |
| Camera + photo-picker to analysis entry point | Pre-wired | Build AVFoundation + PhotosPicker glue yourself |
| OpenAI Vision / DALL·E key security | Proxied via Supabase Edge Functions | Risk shipping key in bundle or stand up own backend |
| Per-user rate limiting for free scans | Enforced at the edge | Design and host it yourself |
| Auth + RevenueCat paywall around scans | Feature-flagged, included | Integrate two SDKs from zero |
| On-device free model option | Apple Foundation Models wired | Manual Core ML / Foundation Models setup |
| Cost | $99 one-time | 3–5 weeks of engineering time |
| Domain accuracy (food/plant data) | You provide it | You provide it |

Frequently Asked Questions

Can I build a calorie-counter app with this boilerplate?
Yes — it's one of the two reference shapes this page is written around. The camera-to-Vision pipeline is wired, so you point it at food, write a prompt that returns calories and macros, and render the result. You supply the nutrition logic and any reference database; the boilerplate supplies the capture, proxying, rate-limiting, and paywall.
Does it include a food or plant database?
No, and that's deliberate. The Swift Kit ships the pipeline, not the domain data. A calorie or species database is the part that differentiates your app, so you bring it (or rely on the model plus your own post-processing). What you don't rebuild is the camera, networking, key security, and subscription layer.
How are my OpenAI Vision and DALL·E keys protected?
They never ship in the app. Every model call routes through a Supabase Edge Function, so the keys live server-side and each user passes through per-user rate limiting first. For a photo-AI app — the category most prone to key scraping and bill-draining abuse — this is the difference between a viable freemium model and a surprise invoice.
Can I cap free scans before showing the paywall?
Yes. The per-user rate limiting in the Edge Functions is what gates free usage, and RevenueCat handles the paywall and entitlements. So 'three free plant IDs a day, then upgrade' is wired through the two systems already in the kit rather than something you bolt on after launch.
When would I be better off not using this?
If your app is pure on-device image classification with a Core ML model you already trained and no cloud calls, no subscriptions, and no accounts, most of this kit goes unused — the value here is the server-proxied AI plus paywall plus auth stack. A single-screen offline classifier doesn't need it.
Does it use cloud AI only, or can it run on-device?
Both. OpenAI Vision and DALL·E run in the cloud through the Edge Function proxy, and Apple Foundation Models run on-device for free. A common pattern is a fast, private on-device pre-check before spending money on a cloud Vision call — both paths are available in the kit.

Ship the photo-to-data loop, not the plumbing

Get the Vision capture, server-proxied image analysis, DALL·E generation, paywall, and per-user rate limiting wired and ready. $99 once — spend your time on the prompt that turns a photo into your app's answer.

Get The Swift Kit — $99

One-time purchase · Lifetime updates · 14-day refund