
Screenshot-to-App

A generation mode where the user drops one or more screenshots of an existing UI and the AI rebuilds it as a working application. VULK pairs vision models with the brand engine to match colors, fonts, and layout.

Screenshot-to-app turns a static image — a Dribbble shot, a competitor's homepage, a hand-drawn sketch — into a live, editable codebase. A vision-language model extracts the visual hierarchy (sections, components, copy, fonts, colors), the brand engine maps the palette and typography to the closest production-ready tokens, and the generation agent emits the corresponding React or Next.js project.
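The extraction-then-mapping step above can be sketched in TypeScript. The types and the `nearestToken` helper below are illustrative stand-ins, not VULK's actual schema or brand-engine API; the squared-RGB nearest-color match is a deliberate simplification of how a sampled palette might snap to production-ready tokens.

```typescript
// Hypothetical shape of the structured spec a vision-language model might emit.
interface ExtractedSection {
  name: string;         // e.g. "hero", "pricing"
  components: string[]; // detected component kinds
  copy: string[];       // extracted text content
}

interface ScreenshotSpec {
  sections: ExtractedSection[];
  palette: string[]; // hex colors sampled from the image
  fonts: string[];   // best-guess font family names
}

// Snap a sampled hex color to the nearest token in a small production palette
// using squared RGB distance (a simplification of a real brand engine).
function nearestToken(hex: string, tokens: Record<string, string>): string {
  const rgb = (h: string) => [1, 3, 5].map((i) => parseInt(h.slice(i, i + 2), 16));
  const [r, g, b] = rgb(hex);
  let best = "";
  let bestDist = Infinity;
  for (const [name, value] of Object.entries(tokens)) {
    const [tr, tg, tb] = rgb(value);
    const d = (r - tr) ** 2 + (g - tg) ** 2 + (b - tb) ** 2;
    if (d < bestDist) {
      bestDist = d;
      best = name;
    }
  }
  return best;
}

const tokens = { primary: "#1d4ed8", surface: "#f8fafc", ink: "#0f172a" };
console.log(nearestToken("#1e40af", tokens)); // "#1e40af" is closest to primary
```

Every sampled color resolves to a named token this way, so the generated project references design tokens rather than raw hex values lifted from the image.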

In VULK, screenshot-to-app is invoked by dragging an image into the chat composer or pasting it from the clipboard. The image goes to a multimodal model (Gemini 3.1 Pro, Claude, or GPT-4o, depending on the task), which returns a structured spec; that spec is passed to the same intent modeler used for text prompts. The brand engine then locks the color palette and font pairing, and the live preview typically matches the screenshot at the section level within 90 seconds.
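The hand-off between stages can be read as a small async pipeline. The sketch below is an assumption about the shape of that flow; `describeImage`, `modelIntent`, and `lockBrand` are hypothetical stand-ins passed in as parameters, not VULK functions.

```typescript
// Minimal shape for the structured spec flowing between stages (illustrative).
type Spec = { sections: string[]; palette: string[]; fonts: string[] };
type Brand = { palette: string[]; fonts: string[] };

// Orchestrates: image -> multimodal model -> spec -> intent modeler + brand engine.
async function screenshotToApp(
  image: Uint8Array,
  describeImage: (img: Uint8Array) => Promise<Spec>, // multimodal model call
  modelIntent: (spec: Spec) => Promise<string>,      // same modeler as text prompts
  lockBrand: (spec: Spec) => Brand,                  // brand engine locks tokens
): Promise<{ intent: string; brand: Brand }> {
  const spec = await describeImage(image); // vision model emits structured spec
  const intent = await modelIntent(spec);  // spec reuses the text-prompt path
  const brand = lockBrand(spec);           // palette + font pairing locked
  return { intent, brand };
}
```

Because the spec enters the same intent modeler as text prompts, everything downstream of extraction (generation, preview, edits) behaves identically whether the project started from a screenshot or a written prompt.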

See /docs/creating-projects/writing-prompts.
