
Fitly

A native iOS wardrobe app. Photograph your clothes, describe the occasion, and an LLM curates the outfit while nano-banana generates a real mockup of you wearing it. Shelved due to image model limitations.

// Why I Built This

I took a business entrepreneurship course in college. One of the teams proposed a smart mirror that generated outfits and put your clothes on you in real time. This was 2019, well before any AI image generator could do this. The team positioned it as something for the home, but I couldn't help thinking stores would be dying to have a mirror that shows the item for sale on you as you walk past. I imagined then, and still do, that converting the clothing to a mesh and draping it on you in the mirror is probably the best way to solve that engineering problem. But I digress.

With agentic coding we've seen the App Store become, in a way, the new dropshipping. People are building apps with hard paywalls, weekly subscriptions, little to no value outside of viral potential, and making a killing — so naturally I wanted in on the action. Kidding. Partially. I think my idea actually has value.

The idea was simple: take photos of your entire wardrobe (shirts, pants, outerwear, accessories, shoes) and type the occasion into the chatbox. An LLM picks the items, and nano-banana uses a reference picture of you plus the images of your clothes to create a real mockup of your outfit.

Outfit generator view with selected wardrobe items and generated mockup
// How It Works
Three-step generation pipeline

The app follows a three-step pipeline. First, analyze: when you photograph a clothing item, GPT-5 nano generates a detailed description (“Blue denim jeans, straight fit, medium wash with subtle distressing”). Second, curate: you type an occasion (“classy fall fit”) and the LLM selects 3-6 items from your wardrobe, ranks them by visual complexity, and explains why they work together. Third, generate: your reference photo and the selected clothing items go to nano-banana, which produces a photorealistic mockup of you wearing the outfit.
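
In code, the whole flow is three awaited calls. Here's a client-side sketch; the endpoint paths, request shapes, and the Backend helper are placeholders I made up for illustration, not the app's actual API.

import Foundation

// Client-side sketch of the pipeline. Endpoints, request shapes, and type
// names are illustrative placeholders, not the app's actual API.
enum Backend {
    static let base = URL(string: "https://api.example.com")! // placeholder host

    static func post<B: Encodable, R: Decodable>(_ path: String, _ body: B) async throws -> R {
        var req = URLRequest(url: base.appendingPathComponent(path))
        req.httpMethod = "POST"
        req.setValue("application/json", forHTTPHeaderField: "Content-Type")
        req.httpBody = try JSONEncoder().encode(body)
        let (data, _) = try await URLSession.shared.data(for: req)
        return try JSONDecoder().decode(R.self, from: data)
    }
}

struct WardrobeItem: Codable {
    let id: UUID
    var photoBase64: String // the garment photo
    var summary: String?    // "Blue denim jeans, straight fit, ..."
}

struct Analysis: Codable { let summary: String }
struct Curation: Codable { let itemIDs: [UUID]; let reasoning: String }
struct Mockup: Codable { let imageBase64: String }

// Step 1, analyze: runs once per garment, when you photograph it.
func analyze(_ item: WardrobeItem) async throws -> Analysis {
    try await Backend.post("analyze", ["image": item.photoBase64])
}

// Step 2, curate: the occasion plus every item's text description.
func curate(occasion: String, wardrobe: [WardrobeItem]) async throws -> Curation {
    struct Req: Encodable { let occasion: String; let summaries: [String] }
    return try await Backend.post("curate",
        Req(occasion: occasion, summaries: wardrobe.compactMap(\.summary)))
}

// Step 3, generate: reference photo + selected items -> photorealistic mockup.
func generate(referenceBase64: String, selected: [WardrobeItem]) async throws -> Mockup {
    struct Req: Encodable { let reference: String; let items: [WardrobeItem] }
    return try await Backend.post("generate",
        Req(reference: referenceBase64, items: selected))
}

Analysis runs once, at photo time, so a generation request only pays for the curate and generate calls.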

Smart complexity ranking

The curation step pre-computes which items have the most visual detail — patterns, logos, textures. The top two most complex items get sent to nano-banana as images so the model can see exact patterns and colors. Simpler items get sent as text descriptions to keep the input manageable. This hybrid approach balances output quality against the model's input limits.
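
The split itself is a few lines. A sketch, assuming a complexityScore that comes out of the analyze step; the names here are mine:

// Hybrid payload split: the two most complex items go to nano-banana as
// pixels, the rest as text descriptions.
struct ScoredItem {
    let photoBase64: String
    let summary: String         // e.g. "khaki shorts"
    let complexityScore: Double // patterns, logos, textures score higher
}

struct MockupInputs {
    let images: [String]        // items the model must actually see
    let descriptions: [String]  // items a sentence can reconstruct
}

func splitInputs(_ selected: [ScoredItem], maxImages: Int = 2) -> MockupInputs {
    let ranked = selected.sorted { $0.complexityScore > $1.complexityScore }
    return MockupInputs(
        images: ranked.prefix(maxImages).map(\.photoBase64),
        descriptions: ranked.dropFirst(maxImages).map(\.summary)
    )
}

A patterned flannel goes in as an image because no sentence will nail the plaid; khaki shorts survive the round trip as two words.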

Wardrobe grid with clothing items
All data on-device

The iOS app stores everything locally with SwiftData. Clothing photos, AI descriptions, reference photos, saved outfit generations — all on your device. The backend is a stateless Express server that's just a pass-through to the AI APIs. No user data is stored server-side.
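
A sketch of what the local schema can look like; the field names are my guesses, but the pattern is standard SwiftData: mark a class @Model, attach a model container, and everything persists on-device.

import SwiftUI
import SwiftData

// Illustrative schema; the app's real models aren't shown in this post.
@Model
final class ClothingItem {
    @Attribute(.externalStorage) var photo: Data // large blobs live outside the DB file
    var summary: String   // the GPT-5 nano description
    var category: String  // "shirt", "pants", "shoes", ...

    init(photo: Data, summary: String, category: String) {
        self.photo = photo
        self.summary = summary
        self.category = category
    }
}

@main
struct FitlyApp: App {
    var body: some Scene {
        WindowGroup { Text("Fitly") }
            // A local persistent store; the stateless backend never sees it.
            .modelContainer(for: ClothingItem.self)
    }
}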

// Architecture
┌──────────────────────────────────────┐
│            SwiftUI (iOS)             │
│                                      │
│  ┌────────────┐  ┌────────────────┐  │
│  │  Closet    │  │  Outfit        │  │
│  │  Manager   │  │  Generator     │  │
│  └─────┬──────┘  └───────┬────────┘  │
│        │                 │           │
│  ┌─────▼─────────────────▼────────┐  │
│  │  SwiftData (local storage)     │  │
│  │  (clothes, photos, outfits)    │  │
│  └───────────────┬────────────────┘  │
└──────────────────┼───────────────────┘
                   │ REST
     ┌─────────────▼──────────────┐
     │    Express (stateless)     │
     │                            │
     │  ┌──────────────────────┐  │
     │  │  GPT-5 nano          │  │
     │  │  (analyze + curate)  │  │
     │  └──────────────────────┘  │
     │                            │
     │  ┌──────────────────────┐  │
     │  │  Gemini 2.5 Flash    │  │
     │  │  Image (nano-banana) │  │
     │  │  (generate mockup)   │  │
     │  └──────────────────────┘  │
     └────────────────────────────┘

The iOS app is native SwiftUI with SwiftData for persistence (more on the Expo port below). The backend is a stateless Express server: no database, just a pass-through to GPT-5 nano for clothing analysis and curation, and to Gemini 2.5 Flash Image for the mockup generation. Total cost per outfit generation is around 4 cents, dominated by the image generation.
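
That figure is easy to sanity-check. Back-of-envelope, using the providers' public list prices as of this writing; this is my arithmetic, not billing data from the app:

// Gemini 2.5 Flash Image bills image output at $30 per 1M tokens,
// and one generated image counts as 1290 tokens. GPT-5 nano's text
// tokens ($0.05/1M in, $0.40/1M out) round to roughly a tenth of a cent.
let mockup   = 1_290.0 / 1_000_000.0 * 30.0 // ≈ $0.0387 per generated image
let curation = 0.001                        // analysis + curation text, approximate
print(mockup + curation)                    // ≈ $0.04, almost all of it the image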

// Decisions I Made
Native iOS over Expo

An earlier version of this app was built with Expo. I ported it to native SwiftUI using Cursor — this project predates my Claude Code era. Native gives better camera integration, SwiftData for local persistence, and a path to the App Store without the Expo build service overhead.

Hybrid image + text input

Sending 4-5 clothing photos directly to the image model was unreliable. The complexity ranking step solved this — the most visually complex items (patterns, logos, textures) get sent as images where the model needs to see exact details, while simpler items (“khaki shorts”) are described in text. This keeps the input within the model's sweet spot.

Shelved, not abandoned

When it worked, it worked well. But half the time nano-banana couldn't handle the multi-image inputs reliably enough to produce a good output. Nano-banana-pro improves on this, but the cap of 200 generations per day per API account, plus a cost of around 15 cents an image, makes scaling the idea challenging. I've since seen this exact concept built successfully by others; it shows up left and right on X. The model capabilities will catch up.

// Stack
iOS: SwiftUI + SwiftData
Backend: Node.js + Express (stateless)
Curation: GPT-5 nano (vision + reasoning)
Image Gen: Gemini 2.5 Flash Image
Storage: On-device (SwiftData)
Cost: ~$0.04 per outfit generation