Reimagining "Vibe Coding" - How Even Flawed LLMs Could Be Leveraged As Tomorrow's Code Tutors

Embracing AI code generators' role as (flawed) educators to avoid the hapless fate of button-pusher


"Vibe Academy": where humans eat humble pie and learn from AI tools. Even when - like all good professors - they're sometimes full of garbage. Generation: Leonardo AI / Flux Dev.

Hacker Noon recently carried an article, "Vibe Coding Is Creating A Generation Of Unemployable Developers" (Paolo Perrone, Sep 2nd). It made, I thought, some salient and timely points.

It laid out, in blunt terms, the perils in blindly attempting to "brute force" the way to functional applications using agentic code generation tools (AKA "vibe coding").

The piece is a reality check against the "vibe coding" hype, much of it written by enthusiastic vendors. That hype reads, at times, as if it were dispatched from a time machine travelling through a glorious but as-yet unrealised epoch in which code-generating AI is happily bereft of the problems that hamper many of its most promising uses today.

Vibe Coding Can Hamper More Than Just Codebases

Frustrated vibe coder

As countless developers have seen first hand - the author included - when it goes wrong, vibe coding can go rather ... spectacularly wrong.

When not hallucinating credible-sounding but non-existent software packages (hackers, ingeniously, are even capitalising on this as an attack vector), AI has a pronounced tendency to do something I have begun, affectionately, to call "build while destroying": one feature is successfully added but another is quietly broken.

The tendency has to do with degraded inference over a growing context trail and can be explained by simple mathematics: as the context grows progressively longer, the probability that a subsequent diff will be error-free shrinks, until it is virtually certain that some further edit will break something along the way. You can thank this phenomenon for any chunks of hair that were once part of your hair.
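To put numbers on that intuition, here is a toy illustration in Python; the per-edit success rate of 0.95 is an arbitrary figure chosen purely for demonstration, not a measured property of any model:

```python
# Toy illustration: if each AI-generated edit lands cleanly with
# independent probability p, the chance that a whole session of n
# edits breaks nothing decays exponentially.

def clean_session_probability(p: float, n: int) -> float:
    """Probability that all n edits are error-free, assuming independence."""
    return p ** n

for n in (1, 5, 10, 25, 50):
    prob = clean_session_probability(0.95, n)
    print(f"{n:>2} edits at p=0.95 each -> {prob:.1%} chance nothing breaks")
```

At fifty edits, even a 95% per-edit success rate leaves under an 8% chance that nothing has been quietly broken.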

Tooling to remediate AI-induced mess-ups is rapidly maturing: mechanisms like code sandboxing and UIs built around careful monitoring of agents are fast becoming the unspoken standard. Whether delivered through a GUI, CLI, or autonomous agent framework, AI code generation tools are quickly embracing snapshotting to roll back changes even between Git commits. But wrecked codebases may not be the only iceberg looming in the way of vibe coders.
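For illustration, here is a minimal Python sketch of the snapshotting idea using plain git - not how any particular tool implements it, just the underlying mechanics:

```python
# Minimal sketch: commit-free snapshotting around an AI edit.
# `git stash create` records the working tree as a dangling commit
# without touching it; `git checkout <hash> -- .` restores tracked
# files from that snapshot.
import subprocess

def snapshot() -> str:
    """Record the working tree as a dangling commit; returns its hash.
    Returns an empty string if the tree is clean. Tracked files only."""
    out = subprocess.run(["git", "stash", "create", "pre-AI-edit snapshot"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def rollback(snapshot_hash: str) -> None:
    """Restore tracked files from the snapshot commit."""
    subprocess.run(["git", "checkout", snapshot_hash, "--", "."], check=True)

# Take a snapshot before letting the agent loose; roll back if it
# "builds while destroying".
```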

As Perrone points out, vibe coding poses a real and potentially much more significant threat to the career trajectories of technology professionals - one that may take time to become apparent (here, too, I look in the mirror). A new generation of "programmers" may be emerging whose ability to develop functional code is extremely limited and whose primary utility lies simply in prompting today's AI tools.

To drive home the point that this isn't a far-off hypothetical, Perrone shares a tweet from a SaaS founder whose dream imploded when he discovered that he couldn't debug the product that he - or rather the AI he prompted - created.

[Embedded tweet from "leo", the founder in question]

The tactic here could be decried as irresponsible to begin with: don't sell a paying subscriber base something no competent human has reviewed.

But it emphasizes another point: that whether the pain hits you now or later, you probably never want to assume the role of just the human operator of technology, no matter how flashy or advanced it purports to be.

The more strategically sound position to assume: that of the expert for whom AI tools are just that - tools with which to get the job done more effectively.

Just as code *generation* is being mopped up by AI right now, displacing countless entry-level programming jobs, AI "operators" - perhaps only half a step up the totem pole - may be ravaged next. Their forthcoming demise: a tidal wave of second-generation agentic AI programming tools that assume operator-level control over today's robotified "code monkeys." The third generation: AI middle management. I kid. At least for now.

AI code generation tools can be wildly unpredictable, vacillating in performance from one day to the next. This fact is well known to the denizens of Daniel's AI Lab. Generation: Flux Dev.

Often Infuriatingly Bad, Sometimes Dazzlingly Brilliant, Usually Hampered By Context: The State Of AI "Code Gen" Tools Today

My credentials to comment on the good, bad, and ugly of AI code gen tools at the time of writing, at least:

I have worked with code-gen LLM tools every day for more than a year now (to name but a few: Windsurf, Aider, Codex, Gemini, OpenHands, Qwen Code), spanning CLIs, GUIs, local LLMs and cloud heavyweights, mainstream models and lesser-known fine-tunes, and newfangled interfaces that don't yet have a good name.

I have been using Linux full-time since I was half my current age which, sadly, is 36: so long enough, I hope, to be able to spot when AI tools are doing things incredibly stupidly, but also to identify when they turn a nifty trick or show me a CLI that, in all those 18 years, I never knew existed. For what it's worth: my impressions of AI's ability to help manage Linux systems are far rosier than my impressions of its ability to generate functional software. Security concerns notwithstanding, these tools are fantastic Swiss Army knives for managing Linux-based systems, whether desktop, server, or SBC.

I became enraptured when ChatGPT created a viable Home Assistant automation and I realised that the programming logic which made our toilet lighting turn off automatically at sundown had been written, entirely, by a robot. That made AI seem tangible in a way that interacting with ChatGPT - however endlessly entertaining - did not.

The journey has had its highs and lows - and I've used AI code generators to make everything from fully customised NFC reader/writer apps to websites, backup scripts, and countless things in between. With a bit of prompt engineering, I even used Gemini (chosen for its long context window) to write an entire and utterly ridiculous novel-length book.

Tools like Sonnet, Qwen, Gemini, et al. have sometimes saved me days by knocking out viable frontends in minutes (my first task for Sonnet 4 was migrating my then-MkDocs site over to Astro, which it did, reasonably well, in about 10 minutes).

They have provided plenty of room to explore the kind of helpful or ridiculous things that you could make with technology. One of my favorites in this category is creating an AI agent connected to Home Assistant via API and whose system prompt is something like: *"you have full access to my smart home. Here's a reference to the Home Assistant API. I don't know.... do some weird stuff with lights.... if you feel like sending random messages through the smart speakers too that would be appreciated. If I haven't responded in an hour to your messages, send me a Pushover. Then play some heavy metal. Then call an ambulance. But follow that order carefully"*
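For the curious, the "API" in that experiment boils down to calls like the following Python sketch against Home Assistant's REST API - the URL, token, and entity ID are placeholders for your own instance:

```python
# Sketch of a tool such an agent might call: Home Assistant exposes
# services as POST endpoints on its REST API.
import requests

HA_URL = "http://homeassistant.local:8123"   # placeholder instance URL
HEADERS = {"Authorization": "Bearer <long-lived-access-token>",  # placeholder
           "Content-Type": "application/json"}

def call_service(domain: str, service: str, entity_id: str) -> None:
    """Invoke a Home Assistant service over its REST API."""
    resp = requests.post(f"{HA_URL}/api/services/{domain}/{service}",
                         headers=HEADERS, json={"entity_id": entity_id})
    resp.raise_for_status()

# The kind of call an agent with "full access" might make on a whim:
call_service("light", "turn_off", "light.bathroom")
```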

Perspective: thinking about AI code generation tools as a conduit to education may be among their most powerful uses. Generation: Flux Dev

At other times, these tools have burned not only large chunks of cash in wasted API credits, but also whatever last traces of sanity I can still muster to my name - sending me down hours-long rabbit holes in which simple CSS fixes were presented as gargantuan missions, and I ended up ... you know ... fixing it myself and wondering why I couldn't have just done that an hour ago.

Why I'm Optimistic About AI Code Generation In Spite Of Its Flaws

In spite of having seen all of these challenges first hand more times than I can remember, not only can't I bring myself to rubbish "vibe coding" (well, minus the name, at least), but I maintain that it's just about the most promising advance in technology that I have lived through.

This may have something to do with the frame of mind I adopt when using these tools. I expect that they're scrappy and sometimes kind of half baked. I embrace the role of trying to figure out how to hack around that in order to get them to do their job.

And in spite of their flaws, I love them, because they represent the first efforts towards something that I believe is not only beautiful but has vast potential to offer the world: removing the layer - code - that makes technology hard and (even when you're good at it) slow and tedious to work with.

The engineering to get predictive text algorithms to do that is, undoubtedly, vastly complicated. But the magnitude of what can be achieved is also huge: not only do AI code generators reduce the friction between the language we use to talk to one another, and think in, and the language needed to give expression to those ideas in technology (code); they also make it easier to learn dialects of languages you already know, or to improve the vocabulary of those you already speak. Metaphorically, of course.

Much as the application of AI programming tools is multifaceted, so too is their user base.

There is, I feel confident enough to affirm as fact, virtually no area of knowledge work in which humans don't get bogged down in tasks they could do in their sleep but would rather not - even if escaping them just meant having more time to catch up with coworkers.

So as much as vibe coding can make those who couldn't tell a JSON array from a Python script feel like they've just drunk a potion that transformed them into Elliot Alderson, it can also help folks who do know how to do stuff elevate themselves above the drudgery that sucks brainspace away from thinking about more ... 'elevated' concerns.

For self-described "ideas guys" like me (think the semi-dork, semi-coding types who do intermediary tech jobs at software companies), agentic code generators provide that conduit between natural language and code that many of us long wished existed before writing it off as the stuff of sci-fi.

In providing that bridge, they allow a kind of fluidity in prototyping and exploration with technology that was, until now, only possible with wireframing, if at all. For those blessed (or cursed) with brains that sit somewhere between creative talent and technical aptitude, AI code generators can feel like a kind of magic that suddenly makes their contribution far more wholesome. I peg myself happily into this category.

Code, in fact, is not even the unifying commonality - it's just what people reflexively reach for as a descriptor when explaining how LLMs - better known for their role in chatbots - can help with tech projects. Consider, for example, computer-use agents (CUAs) like Open Interpreter. A session may involve no code generation at all, simply the execution of Linux commands. No code is being generated, but the predictive magic that lets LLMs achieve things is being used to generate functional commands which, in turn, act on computer systems.

These "tech purposes" when mapped onto their traditional equivalents in human-executed roles, span enough domains that together they would fill up virtually a whole office-full of traditional development roles. LLMs (or AI 'agents' - which, under the hood, are simply LLMs with instructions to call external tooling) can be used for debugging and systems administration, DevOps, documentation writing, and even for more niche purposes like code security reviews.

The most limiting constraint in how we currently think about the role tools like Windsurf (et al.) have to play in shaping the future of technology development, however, is our innate tendency, as humans, to think of them in terms analogous to what they would be if they were human.

We gravitate, without questioning, towards the idea that AI tools should be "doers" and humans their "operators" - so code generation - churning out lines of text - is the mental model we adopt. I predict that we (humans) will find much imaginative work in exploring this relationship less linearly and in ways that, through traditional prisms, don't make a lot of sense. How could it be, for example, that an AI can teach us things even when it fumbles in its own execution of those tasks?

Our easiest path past this hurdle might be to give up on the idea that, just because it has 'intelligence' in the name and seems to emulate human thinking processes, there must be a serious equivalence between the output of AI tools and human cognition. Regarding them as machines can perhaps enable us to accept some of these strange dichotomies more readily - and to benefit from thinking about their potential applications less rigidly and more imaginatively.

Or, to recast this idea as an aphorism, or at least as a spin on one: the best value AI coding assistants have to offer today may be realised not when they hand us the "fish" (of a sometimes dubious codebase), but when they provide the starting point for us to learn how to ... fetch that fish ourselves:

A bot hands a man a fish. Generation: Flux Dev

An AI-Enabled Learning Model: Do Stuff Now; Show Me What You Did Later

If we can bucket the most common ways in which people learn how to do things with computers, we might come up with something like this:

  • Get book smart: We can buy books or subscribe to courses and hope we absorb the material well enough to replicate it ourselves later. For coding, this alone rarely works.

  • Learn by doing: In formalised instructional contexts, labs can give students real-time or delayed feedback as they carry out tasks. In self-directed frameworks, users can simply build things. However, most disciplines rely on at least some common baseline for delineating between a 'right' and a 'wrong' way - or on best practices - so that, much as you'd hope your doctor isn't freestyling their diagnostic method, developers don't do things totally randomly. Some mechanism for exchanging instruction is therefore still important. A cohesive ecosystem can't form when everybody "just does" (their own thing).

Whether as the primary means of instruction or as a supplement, the learn-by-doing approach remains wildly popular because it has a key ingredient at its core: people learn by working towards projects they are interested in and actively motivated to see succeed. This cannot be emulated with cookie-cutter courses, or labs, in which one project is provided to all learners.

Here, large language models, or successor AI technologies, have a huge amount of latent potential, I believe: brought to task on our own projects, they know what it is that we want to create. They can "make" the thing that we need. But they can also pair our personal context with the "global" one: the collective knowledge of how to build software.

By pairing the two, they can pave the way for learning experiences that bridge the divide between learning indirectly, by watching them "do", and later engaging in more formal learning: a curriculum-grounded "debrief" in which we learn how they did it. This approach also creates a kind of natural spaced repetition: we learn passively by observation as they work on our code, and then we recap at a later point, in a different context, and from a more deliberate vantage point.
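As a trivial sketch of that scheduling idea - the intervals here are arbitrary placeholders, not a researched spacing algorithm:

```python
# Sketch: schedule debrief lessons at widening intervals after a
# coding session, spaced-repetition style.
from datetime import date, timedelta

REVIEW_OFFSETS = [2, 7, 21]   # days after the session; illustrative only

def schedule_debriefs(session_day: date) -> list[date]:
    """Return the dates on which debrief lessons should be delivered."""
    return [session_day + timedelta(days=d) for d in REVIEW_OFFSETS]

print(schedule_debriefs(date(2025, 9, 2)))
# [datetime.date(2025, 9, 4), datetime.date(2025, 9, 9), datetime.date(2025, 9, 23)]
```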

This model, I believe, will also get both more viable and more efficient over time:

  • Agentic code generation tools will get better

  • As they get better, their role as potential *instructional* tools, or aids, will become more pronounced

  • "Vibe coding" will be less frustrating and more efficient

An agent family. Flux Dev.

Implementation / Brass Tacks

To move beyond theory, this sketch provides the basis for a subagent implementation of the idea, which can be built with any multiagent framework - or any tool that supports subagent creation. Viable candidates include Claude Code (CLI), CrewAI, and others.

Here are the agent "families" required for this implementation and, within each family, the constituent agents.

Group A: The Doer(s)

The first branch of our agent "family" is basically the classic "vanilla" code generation setup that is quickly becoming familiar.

This can be one agent or, as is increasingly common, it can be two or more subagents.

In the latter case, this might follow the emerging pattern in which the act of generating code is subdivided between a code generator, a planner and, perhaps, more specialised subagents like code security reviewers.

This is only the most basic permutation, however. Agents thrive on specificity of task: a debugger and an editor can be added to good effect.
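As a rough sketch of what Group A might look like in CrewAI - the role, goal, and backstory strings are illustrative, not prescriptive:

```python
# Group A sketch: planner, generator, and reviewer as separate agents
# in one sequential crew.
from crewai import Agent, Task, Crew

planner = Agent(
    role="Planner",
    goal="Break the feature request into small, ordered edits",
    backstory="A cautious senior engineer who plans before touching code.")
generator = Agent(
    role="Code generator",
    goal="Implement each planned edit as a minimal diff",
    backstory="A fast but narrowly scoped implementer.")
reviewer = Agent(
    role="Security reviewer",
    goal="Flag regressions and security issues in proposed diffs",
    backstory="A sceptic who assumes every diff breaks something.")

tasks = [
    Task(description="Plan the implementation of the requested feature",
         expected_output="An ordered list of edits", agent=planner),
    Task(description="Implement the planned edits",
         expected_output="Proposed diffs", agent=generator),
    Task(description="Review the diffs for regressions and security issues",
         expected_output="Review notes", agent=reviewer),
]

Crew(agents=[planner, generator, reviewer], tasks=tasks).kickoff()
```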

Group B: The Educators

The educational side of this agent model is a team of agents whose collective purpose is to ingest summaries of code generation sessions (provided by one specialised operational agent) and then break them down into modules.

Any parallels to how teaching is generally done in conventional settings are accidental but happy.

The implementation is devised like this, again, for specificity of task - and also for chunking: the general principle being that agentic AI systems achieve better results when tasks are highly modularised, even if that means having to create four agents where it would be nice to think that one all-rounder could handle the whole brief just as effectively.

Here are the constituent agents in the educational part of this system:

The Curriculum Writer

This whole system, in a sense, is education in reverse: rather than learn and then do, we do first (or observe the doing) and learn afterwards. The curriculum writer fits the pattern: its instruction is to see what was moved forward in the codebase between sessions; then what approach was used; then what skills were needed to execute that approach; then whether those skills are outlined in an existing curriculum; and, if not, to add them. The objective is for the user to work through the resulting teaching materials until they reach a point at which the agent's assistance would no longer be required.

The Lesson Writer

Moving from general to specifics, the next agent in the chain is the lesson writer.

The objective of this agent is to create chunked lessons covering segments of the curriculum as written or updated by the curriculum writer.

The lesson writer is the crucial bridge between the user's sessions and the novelty in this system: the delivery of educational materials based upon that session history.

Its mission is not to deliver cookie-cutter lessons on "how to do X" (for which the user doesn't need an AI system!), but to create learning experiences that use, as teaching materials, things the user was involved in first hand.

The lesson writer should assume that the user's "work" session and learning follow-up session will not occur close together in time. More likely, the user will have forgotten most of the details of what they did in Windsurf at 17:30 on this day of the week.

The lesson writer requires one assistant agent:

The Session Summary Agent

This agent really belongs in a third group called 'support agents', but for simplicity it is included in sequence here.

A "session" is fairly well delineated concept in many AI frameworks: the time between when a user invokes a CLI (or IDE) to work on a specific task and then exits it. That session is often recorded as a separate JSON file.

The session summary agent's job is to record something like a "meta-diff": what is this project? How did the user start out? What was achieved by the AI tool? How was that achieved?

This agent would ingest the diffs, read the memory changes, and provide a summary.
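A hedged Python sketch of that job - `summarise_with_llm` is a hypothetical stand-in for a real model call, and the diff range is illustrative:

```python
# Session summary sketch: pair a session transcript with the git diff
# it produced and ask an LLM for a "meta-diff" summary.
import subprocess
from pathlib import Path

def summarise_session(session_file: Path, repo: Path) -> str:
    """Build a 'meta-diff' prompt from a session's JSON transcript and
    the repository diff, then hand it to an LLM."""
    transcript = session_file.read_text()
    diff = subprocess.run(["git", "-C", str(repo), "diff", "HEAD~1", "HEAD"],
                          capture_output=True, text=True).stdout
    prompt = ("What is this project? How did the user start out? "
              "What was achieved by the AI tool, and how?\n\n"
              f"Session transcript:\n{transcript[:8000]}\n\n"
              f"Diff:\n{diff[:8000]}")
    return summarise_with_llm(prompt)

def summarise_with_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: wire up your model of choice
```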

The Teacher

Finally, we have the teacher.

The teacher is an interactive educational agent whose task is to deliver individual lessons - context-laden ones - using the plans devised by the lesson-writing agent and the contextual data distilled from the other agents.
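Pulling Group B together, here is a rough CrewAI-style sketch of the educator pipeline consuming the summary produced above; all strings are illustrative:

```python
# Group B sketch: curriculum writer -> lesson writer -> teacher,
# seeded with the session summary agent's output.
from crewai import Agent, Task, Crew

curriculum_writer = Agent(
    role="Curriculum writer",
    goal="Map the skills used in each session onto a running curriculum, adding gaps",
    backstory="An instructional designer working in reverse: do first, teach after.")
lesson_writer = Agent(
    role="Lesson writer",
    goal="Write chunked lessons that use the user's own sessions as teaching material",
    backstory="Assumes the learner has forgotten the session details.")
teacher = Agent(
    role="Teacher",
    goal="Deliver individual lessons interactively, grounded in session context",
    backstory="A patient tutor who quizzes rather than lectures.")

tasks = [
    Task(description="Update the curriculum from this session summary: {summary}",
         expected_output="An updated curriculum", agent=curriculum_writer),
    Task(description="Write lessons covering the new curriculum items",
         expected_output="Lesson plans", agent=lesson_writer),
    Task(description="Deliver the first lesson interactively",
         expected_output="A lesson transcript", agent=teacher),
]

Crew(agents=[curriculum_writer, lesson_writer, teacher], tasks=tasks).kickoff(
    inputs={"summary": "<output of the session summary agent>"})
```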

Daniel Rosehill

Automation specialist and technical communications professional bridging AI systems, workflow orchestration, and strategic communications for enhanced business performance.
