AI Photo App Boilerplate for iOS — Vision + Image-Gen for the Calorie-Counter / Plant-ID Class
Point the camera at a plate of food or a leaf, get structured data back. The Swift Kit ships the exact pipeline that the calorie-counter and plant-ID apps live on — Vision capture, server-proxied image analysis, and DALL·E generation — wired end to end so you build the prompt, not the plumbing.
The Swift Kit is an AI photo app boilerplate for iOS, $99 one-time, built for the calorie-counter / plant-ID class of app where a photo goes in and structured data comes out. It wires Apple Vision capture to OpenAI Vision and DALL·E image generation through Supabase Edge Functions, so your API keys stay server-side and each user is rate-limited individually. Auth, paywall, and the photo-to-analysis loop come already connected; you write the classification prompt and the result UI, not the infrastructure.
The photo-to-data loop, already wired
Every app in the calorie-counter / plant-ID family is the same shape under the hood: capture an image, send it to a model, parse the answer into something your UI can render. The Swift Kit ships that loop intact. The camera and photo-picker hand a frame to a Vision call, the call runs through a Supabase Edge Function (so your OpenAI key never ships in the binary), and the streamed response lands back in SwiftUI. You're not gluing AVFoundation to a networking layer to a JSON decoder from scratch — that path already exists. What's left is the part that's actually yours: the prompt that turns 'photo of a plate' into calories and macros, or 'photo of a leaf' into a species name and care notes.
- PhotosPicker + camera capture feeding a single analysis entry point
- OpenAI Vision request proxied through a Supabase Edge Function
- Structured response decoded into a Swift model ready for your result screen
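As a rough illustration of the "structured response" step, here is a minimal TypeScript sketch that validates a model reply before it is handed to the app. It assumes the check runs server-side in the Edge Function (Supabase Edge Functions are written in TypeScript). The `MealAnalysis` shape, its field names, and `parseMealAnalysis` are hypothetical and not taken from the kit.

```typescript
// Hypothetical result shape for a calorie-counter prompt. The field names are
// assumptions for illustration, not the kit's actual schema.
interface MealAnalysis {
  dish: string;
  calories: number;
  proteinGrams: number;
  carbsGrams: number;
  fatGrams: number;
}

// True only for finite numbers that are zero or greater.
function isFiniteNonNegative(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value) && value >= 0;
}

// Parses the model's text reply. Returns a typed result, or null when the
// reply is not valid JSON or any expected field is missing or malformed.
function parseMealAnalysis(reply: string): MealAnalysis | null {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { dish, calories, proteinGrams, carbsGrams, fatGrams } =
    data as Record<string, unknown>;

  if (typeof dish !== "string" || dish.length === 0) return null;
  if (
    !isFiniteNonNegative(calories) ||
    !isFiniteNonNegative(proteinGrams) ||
    !isFiniteNonNegative(carbsGrams) ||
    !isFiniteNonNegative(fatGrams)
  ) {
    return null;
  }
  return { dish, calories, proteinGrams, carbsGrams, fatGrams };
}
```

Rejecting malformed replies at this point means the result screen only ever sees complete data. A Swift model on the client could then mirror the same fields.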
Why server-side proxying matters for this exact app
Photo-AI apps are the ones that get key-scraped. A calorie counter that ships its OpenAI key in the app bundle gets that key pulled and drained within days of any traction — Vision calls are expensive enough that abuse hurts fast. The Swift Kit routes every model call through Supabase Edge Functions, which means the key lives on the server and each user hits a per-user rate limit before they can run up your bill. For a freemium plant-ID app where free users get, say, three scans a day before the paywall, that limit is the whole business model — and it's already enforced at the edge, not on the honor system in the client.
- API keys proxied server-side via Edge Functions — never in the app
- Per-user rate limiting to gate free scans before the paywall
- Apple Foundation Models available on-device, at no per-call cost, for quick pre-checks
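The free-scan cap described above can be sketched as a per-user daily counter. This is a minimal illustration under stated assumptions, not the kit's implementation: `DailyScanLimiter`, `FREE_SCANS_PER_DAY`, and the in-memory `Map` are hypothetical. A deployed Edge Function would likely need to keep counts in persistent storage, since separate function instances do not share memory.

```typescript
// "Three scans a day" is the example figure used in the text above.
const FREE_SCANS_PER_DAY = 3;

// Fixed-window limiter: each user gets `limit` scans per UTC calendar day.
class DailyScanLimiter {
  private counts = new Map<string, number>();
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Records a scan and returns true if the user is still under the limit
  // for the UTC day containing `now`. Returns false otherwise.
  tryConsume(userId: string, now: Date): boolean {
    const day = now.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    const key = `${userId}:${day}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

When `tryConsume` returns false, the function would skip the model call entirely and the client could present the paywall, so a blocked scan never costs an API request.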
Generate, not just recognize
Vision answers 'what is this'; image generation answers 'show me this.' The Swift Kit wires DALL·E through the same Edge Function layer, which opens the second half of the photo-AI category — the apps that take your photo and give you a transformed one back. Think a meal-logging app that renders a plated 'goal' version of a recipe, or a garden app that generates a styled illustration of the plant it just identified. Because recognition and generation share one proxy and one rate-limiter, you can mix them in a single flow without standing up a second backend. Apple Foundation Models run on-device for free when you want a fast, private pre-classification before paying for a cloud Vision call.
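One way to picture recognition and generation sharing a single proxy is one entry point that checks the quota once and then dispatches on the request kind. The sketch below is illustrative only: `ProxyRequest`, `Backends`, and `handleProxyRequest` are hypothetical names, and the backends are injected as plain functions rather than real OpenAI calls.

```typescript
// Hypothetical request shapes: one for analysis, one for image generation.
type ProxyRequest =
  | { kind: "analyze"; imageBase64: string }
  | { kind: "generate"; prompt: string };

type ProxyResponse = { status: 200 | 429; body: string };

// Stand-ins for the model calls, injected so the routing can be tested
// without network access.
interface Backends {
  analyze(imageBase64: string): Promise<string>;
  generate(prompt: string): Promise<string>;
}

// Single entry point: one quota check covers both kinds of request.
async function handleProxyRequest(
  userId: string,
  request: ProxyRequest,
  allow: (userId: string) => boolean,
  backends: Backends,
): Promise<ProxyResponse> {
  if (!allow(userId)) {
    return { status: 429, body: "Scan limit reached" };
  }
  const body =
    request.kind === "analyze"
      ? await backends.analyze(request.imageBase64)
      : await backends.generate(request.prompt);
  return { status: 200, body };
}
```

Because the quota check sits in front of the dispatch, a flow that identifies a plant and then generates an illustration of it draws on the same per-user allowance for both calls.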
What you still have to build
Honest scope: the boilerplate gives you the pipeline, not the domain. It does not know that 200g of grilled chicken is 330 calories, and it does not ship a food or plant database. You bring the prompt engineering, the result-screen design, and any reference data or post-processing your accuracy depends on. The Swift Kit's job is to delete the four weeks of camera, networking, key-security, paywall, and auth work that sit between you and your first real Vision call — so the time you spend is on the part that makes your app yours, like tuning the prompt until macro estimates stop drifting.
The Swift Kit vs. building the photo-AI pipeline from scratch
| Feature | The Swift Kit | Build from scratch |
|---|---|---|
| Camera + photo-picker to analysis entry point | Pre-wired | Build AVFoundation + PhotosPicker glue yourself |
| OpenAI Vision / DALL·E key security | Proxied via Supabase Edge Functions | Risk shipping key in bundle or stand up own backend |
| Per-user rate limiting for free scans | Enforced at the edge | Design and host it yourself |
| Auth + RevenueCat paywall around scans | Feature-flagged, included | Integrate two SDKs from zero |
| On-device free model option | Apple Foundation Models wired | Manual Core ML / Foundation Models setup |
| Cost | $99 one-time | 3–5 weeks of engineering time |
| Domain accuracy (food/plant data) | You provide it | You provide it |
Frequently Asked Questions
Can I build a calorie-counter app with this boilerplate?
Does it include a food or plant database?
How are my OpenAI Vision and DALL·E keys protected?
Can I cap free scans before showing the paywall?
When would I be better off not using this?
Does it use cloud AI only, or can it run on-device?
Ship the photo-to-data loop, not the plumbing
Get the Vision capture, server-proxied image analysis, DALL·E generation, paywall, and per-user rate limiting wired and ready. $99 once — spend your time on the prompt that turns a photo into your app's answer.
Get The Swift Kit: $99
One-time purchase · Lifetime updates · 14-day refund