Nano Banana Can Spruce Up Messy Wireframes - Kind Of

Like writing your great ideas on cafe napkins? Nano Banana might help you convert those into musings that other people can actually read.


To much excitement in the AI world, late last month, Google began rolling out Gemini 2.5 Flash Image (product name: Nano Banana).

The image generation model is available via the Gemini app as well as AI Studio (and other Google AI offerings including Vertex AI). Most attention has focused on its inpainting abilities: the model lets users pair prompts with existing images to produce new, improved generations.

Since release, users have been discovering use-cases at a rapid pace. Today, I stumbled upon a reasonably successful one: using Nano Banana to take rough, jotted-down-on-a-napkin type sketches and bring them forward, iteratively, into far more polished versions.

The process was imperfect. For one, text adherence remains a major challenge even for state-of-the-art models, and unless each AI generation is carefully reviewed, small botched details are easy to miss - which can be catastrophic in diagram-heavy workflows like tech writing.

While Gemini did initially succeed in turning my chicken-scrawl-level notes into legible text, as I moved from V2 to V3 the "Home" button in the bottom nav bar suddenly read "Lome":

Nano Banana introducing textual errors during iterative inpainting runs. Screenshot: 09th September.


But even if it's not 100% there, I can say with confidence that there is huge potential here, and that the use-cases for this one image refinement workflow alone are vast.

Moving beyond the niche context of wireframing and UI/UX workflows, this sketch refinement model could be applied to just about any form of note-taking in which scrappy handwriting needs to get cleaned up a little before it can be passed over to other human beings.

From Messy Sketch To Clear Sketch (Kind Of). Or Sketch To Diagram. You Choose.

A huge boon here: the ability to use prompts to dictate all manner of edits. This is "just" classic inpainting, sure. But Google bakes it into a conventional chat interface which makes it feel like you have room to be as expansive as you please in describing how you would like the source image changed.

By way of note: this is fast becoming the defining characteristic for how Google is attempting to differentiate its multimodal AI offering (as loaded onto Gemini) in an increasingly crowded market. Gemini TTS is - in my humble opinion - one of the most exciting and interesting developments in AI-enhanced TTS for some time.

Eleven Labs (et al) have moved TTS far beyond the robotic synth voices of yesteryear, but just doing "natural sounding" voices is no longer a significant differentiator. Cleverly, Gemini has brought a novel feature to TTS that's actually rather significant: in addition to providing the text to be narrated you can provide expansive instructions for your voice "actor." The results, I've found, can be powerful.

In Nano Banana, too, I found there to be enormous creative latitude just waiting to be tapped into with some astute prompting and some imagination in the use-cases that can be supported.

Today I had a use-case to try it out with: I'm currently working on a heads-up display (or really, heads-down).

Heads up display


My wife recently gave birth to a wonderful baby boy. As the house's night owl (and its reigning tech fiend in chief), I've set up a few IP cams which I keep a periodic eye on during the night. By day, I keep one vigilant eye on my agenda/calendar. My project: blend an NVR and a DAKboard-style calendar viewer into one dashboard to run on this display. ChatGPT encouraged me to jot down wireframes, as not only is Figma-to-code possible but apparently sketch-to-Figma now is, too.

So for the first time in ... it may be years ... I got to wireframing. And I use that term probably much too generously. Because this is what emerged on paper:

My first hand drawn wireframe. Unredacted. What do you mean you could tell?


That raw input, coupled with my cleanup prompt, got me to this:

Gemini's first enhanced wireframe


V1 to V2 was probably my most successful iteration.

Not only are the missing lines added, but the text is clearer.

Nano Banana's first pass edit on a raw sketch improving the clarity of the text and boxes


If we look really closely, however, we can see text adherence flaws cropping up even here. When it comes to dashboards (like news consumption), I believe that less is more. But as an asthmatic (living somewhere with relatively poor air quality, often exacerbated by forest fires), local air pollution is one of the few indicators that I can gain useful information from. PM2 is a good metric that's also readily available from free APIs.

But zoom into Nano Banana's edit and you'll see that the unit of measure has mysteriously become "pmp2" instead:

Nano Banana introduces a textual error during an image iteration.


Zoom in further and you'll see that the likely culprit is the beginning of an upward pen stroke which Gemini interpreted as being a partial attempt at a 'p' rather than a standard feature of my chicken scrawl. It tried to be helpful. But in this case it did not succeed and the unit of measurement was botched.

Errors like these may seem like minor inconveniences, but they are worth pointing out because they represent very significant (but hopefully not insurmountable) hurdles in the way of this technology being applied in diagramming workflows where, as a technical writer, I see much potential.
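For anyone wanting to make the "carefully review each generation" step a little less manual, here is a minimal sketch of one way to flag text drift between what you intended and what the model rendered. It is not part of my actual workflow: the function names, the 0.6 similarity threshold, and the "Cameras" and "Calendar" labels are all made up for illustration, and it assumes you already have the labels as plain strings (typed by hand or pulled out by an OCR step that isn't shown here). Only the "Home"/"Lome" and "pm2"/"pmp2" pairs come from the slips described above.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two labels (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_label_drift(expected, rendered, threshold=0.6):
    """Return (expected_label, closest_rendered_label_or_None) for every
    expected label that does not appear verbatim in the rendered labels."""
    drift = []
    for label in expected:
        if label in rendered:
            continue  # label survived the edit unchanged
        # Closest rendered label by similarity; None if nothing was rendered.
        best = max(rendered, key=lambda c: similarity(label, c), default=None)
        if best is not None and similarity(label, best) >= threshold:
            drift.append((label, best))  # probably a garbled version
        else:
            drift.append((label, None))  # label seems to be missing entirely
    return drift


# Hypothetical label lists: what I meant vs. what was read off the generated image.
expected = ["Home", "Cameras", "Calendar", "pm2"]
rendered = ["Lome", "Cameras", "Calendar", "pmp2"]

drift = find_label_drift(expected, rendered)
for intended, found in drift:
    print(f"{intended!r} -> {found!r}")
```

A check like this only catches labels you already know should be there. It says nothing about pseudotext the model invents from scratch, so it supplements a visual review rather than replacing one.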

Here too, however, we see the disconnect between AI marketing and reality that often feels crazy-making. Google boasts that Gemini 2.5 sports "high fidelity" text rendering. But, as with the Flux models, one doesn't need to try more than a few image-gen prompts with text instructions (for example: "a man is wearing a name badge that says Daniel") to form a rather jaundiced view of such claims.

Can Nano Banana Turn Napkin Scribbles Into Boardroom-Worthy Powerpoints?

So is Nano Banana the secret weapon that can allow you to write messy ideas on the back of coffee cups in Starbucks (or napkins in Subway) and AI-magic-them into beautiful diagrams ready to be printed in your tech docs (or boardroom pitch)?

Gemini's second wireframe which took my model of an embedded NVR page for my dashboard and layered on credible simulated camera frames.


At least at the time of writing, my vote is with "not quite yet."

Like almost all generative AI hitting the market these days, there's an appreciable gap between what the model's makers say it will do well and what it can actually handle in real gritty prompting.

Wireframing and diagrammatic workflows may be a strong litmus test for how well and faithfully Nano Banana can both render text and not alter existing text (unless instructed to do so, of course).

But even on the V2 to V3 iteration (asking Nano Banana to edit wireframes containing its own intelligible renderings of my handwriting) I saw pseudotext creep in, nonsensical edits, and overall spotty success. As with most AI tools, the end result was imperfect, and even that result came only after successive rounds of undoing mistakes before adding touches.

The potential and capabilities, on the other hand, are great. For tech doc writers looking for a bright new assistant - give it just a little bit more time to marinate.

Daniel Rosehill

Automation specialist and technical communications professional bridging AI systems, workflow orchestration, and strategic communications for enhanced business performance.
