💡
TL;DR: This article is about how vibe coding is evolving. It brought too much chaos: Upwork is flooded with broken AI projects, and 45% of AI-generated code contains security vulnerabilities. Then Claude Code happened, and Andrej Karpathy admitted he's never felt this far behind as a programmer. 2026 is the year we stop pretending AI can do it all and start learning how to harness it properly.

Goodbye vibe coding. Welcome agentic engineering.

Last Christmas, Andrej Karpathy dropped a post that sums up what I feel:

"I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue."

He co-founded OpenAI and built Tesla's Autopilot.
If he's behind, then I'm a toddler in a cave.

He continued:

"There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations..."

Read that list again. Agents. Subagents. Hooks. MCP. Skills. Plugins. If you don't care to know what those mean and still think you'll just prompt your way into the future, unsubscribe from this blog, because it won't be fun for you. (No really, just leave.)

This is not Kansas anymore, Dorothy. We have a full ecosystem of strange tools now:

"Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession."

Karpathy is describing exactly what I've been experiencing with Claude Code, with my client project, and of course with Alfred: basically every agent harness I've duct-taped together over the last year. Except now there's a (not so new) name for it.

Agentic engineering.


Disaster of Vibes

In February 2025, Karpathy coined the term vibe coding. He described it on X as "fully giving in to the vibes, embrace exponentials, and forget that the code even exists."

And to be fair, it worked. Kind of.

By March 2025, TechCrunch reported that 25% of YC's Winter 2025 batch had codebases that were 95% AI-generated. YC managing partner Jared Friedman clarified:

"It's not like we funded a bunch of non-technical founders. Every one of these people is highly technical, completely capable of building their own products from scratch. A year ago, they would have built their product from scratch — but now 95% of it is built by an AI."

The tools had arrived. Cursor. Lovable. Bolt. Claude Code. Everyone was vibe coding their way to MVPs in days instead of months.

But you still need to know what you're doing.

I wrote about this evolution throughout 2025, from defending vibe coding as "a nail gun: loud, fast, and dangerous if you ignore the manual" to eventually admitting I was wrong after experiencing firsthand why prototypes collapse in production.

Even Karpathy ended up hand-coding his own project called Nanochat:

"It's basically entirely hand-written. I tried to use Claude/Codex agents a few times but they just didn't work well enough at all and net unhelpful."

The godfather of vibe coding hand-coded his own project.


In May 2025, security researcher Matt Palmer published a CVE (CVE-2025-48757) that should have been a wake-up call for the entire vibe coding movement.

He and his colleague scanned 1,645 web apps built with Lovable, one of the hottest vibe coding platforms on the market (I'm an ambassador btw). Of those, 170 allowed anyone to access sensitive user data: names, email addresses, financial information, API keys. No authentication required.

The problem wasn't that the AI wrote bad code. The code compiled. It ran. It looked fine.

The problem was that nobody configured the database security properly. Row Level Security policies were either missing or misconfigured. And the people building these apps had no idea that was even something to check.

As Semafor reported:

"Even if AI models write flawless code, vibe-coded software can still have major security flaws because of how it's implemented. The models generating code can't yet see the big picture and scrutinize how it will ultimately be used."

This is the crux of it. The AI can write the code. It cannot understand the system.


Nearly half of AI-written code is insecure

Lovable wasn't an outlier. Veracode's 2025 GenAI Code Security report tested 100 leading LLMs across 80 tasks and found that 45% of AI-generated code contains security flaws—with no real improvement across newer or larger models.

The Glide blog summarized the common vulnerabilities:

"Common types of security flaws observed in AI-generated snippets include the addition of malicious code, SQL injection, insecure file handling, improper authentication/authorization, and insecure file handling."

Let me be clear about what's happening here.

AI has solved syntax. You can describe what you want in plain English and get functional code. That part works.

AI has not solved architecture. It doesn't know your system. It doesn't know what's important. It doesn't know that your database shouldn't be publicly accessible. It generates code that works in isolation and falls apart in production.

I've been saying this since I launched this blog: AI-written code is spaghetti. Great for MVPs and prototypes but falls apart when hundreds of people need to collaborate and maintain the same code.


The Context Management Problem

The deeper issue goes beyond code to context. I wrote about this in November when explaining why workflow automation tools break: they treat context as an afterthought, forcing you to manually manage state, schema, concurrency, and validation.

The same is true for vibe coding. AI can write a function but it can't maintain context across an entire project. It doesn't know that the function it wrote in file A will break the function in file B. It doesn't remember what you told it three prompts ago.

That's not "no-code." That's ops-engineer cosplay.


This is what's happening in software

About 80% of software engineering work is what I'd call syntax translation. Taking architectural decisions and implementing them in code. Writing functions. Connecting APIs. The stuff that used to take months now takes hours.

That 80% is getting automated.

The remaining 20% is architecture. Understanding how systems fit together. Making decisions about security, scalability, maintainability. Knowing what questions to ask before writing a single line of code.

That 20% is becoming the entire job.

I noticed this when building my $30K app with Lovable—the coding took 48 hours, but the architecture and planning took 200+. Guess which part the AI couldn't do?

The traditional career path in software has always moved in this direction. You start as a junior dev writing code. You become a senior dev designing systems. Eventually you become an architect who barely touches a keyboard. This is the standard progression.

What's changing is the timeline. You used to have 7-10 years to make that transition. Now you have maybe 2.


What's Agentic Engineering?

Reuven Cohen (aka rUv) has been talking about this shift for years. He founded the Agentics Foundation and created Claude Flow. In a September 2025 podcast, he explained the breakthrough:

"The major breakthrough we had probably around two years ago was this idea of a recursive loop... what really made these systems work well was recursion, feeding errors, logs, successes, and failures back into the system so the system could understand the context of what was happening."

This is the key insight. The AI can write code. But it can't debug at scale. It can't understand the bigger picture. It can't maintain context across an entire project.

Unless you harness it.
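Here's a minimal sketch of that recursive loop in Python. The `generate_patch` call is a hypothetical stand-in for whatever model or agent you use, and the test command assumes a pytest project; the point is only that failures flow back into the next attempt instead of being thrown away:

```python
import subprocess

def run_tests() -> tuple[bool, str]:
    """Run the project's test suite and capture its output (assumes pytest)."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def agentic_loop(task: str, generate_patch, max_attempts: int = 5) -> bool:
    """Toy harness loop: generate, test, feed failures back in, repeat."""
    feedback = ""
    for attempt in range(max_attempts):
        # generate_patch is a hypothetical model call; it sees the task plus
        # everything that went wrong on the previous attempt.
        generate_patch(task=task, feedback=feedback)
        passed, output = run_tests()
        if passed:
            return True
        # The crucial part: errors and logs become context for the next pass.
        feedback = f"Attempt {attempt + 1} failed:\n{output}"
    return False
```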

In 2026 you will win by building harnesses

A harness is not a prompt. Prompts are probabilistic suggestions. "Please don't delete the database" is a prompt. The AI might listen. It might not. (Ask Jason Lemkin about what happened when Replit's AI deleted his production database despite explicit instructions not to.)

A harness is a set of deterministic constraints. The agent literally cannot access production. The agent must run tests before committing. The agent cannot merge without human approval.

In Claude Code, this looks like:

  • Hooks that enforce workflow steps (see the sketch after this list)
  • Skills that encode domain-specific patterns
  • MCP connections that give controlled access to tools and databases
  • Plugins that package all of the above into reusable configurations
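That first bullet is more concrete than it sounds. Here's a rough sketch of a PreToolUse hook script, assuming (as Anthropic's hooks documentation describes) that the tool call arrives as JSON on stdin and that a blocking exit code stops it; verify the exact payload fields and exit codes against the current docs, and treat the forbidden markers as placeholders for whatever "production" means in your setup:

```python
#!/usr/bin/env python3
# Sketch of a Claude Code PreToolUse hook that refuses Bash commands which
# look like they touch production. Check Anthropic's hooks docs for the
# exact stdin payload and exit-code semantics; this is an assumption-laden
# illustration, not a drop-in config.
import json
import sys

FORBIDDEN = ("prod", "production", "DROP TABLE", "--force")

payload = json.load(sys.stdin)                       # tool call details on stdin
command = payload.get("tool_input", {}).get("command", "")

if any(marker in command for marker in FORBIDDEN):
    # A blocking exit code stops the tool call; stderr is fed back to the
    # agent so it knows why it was refused.
    print(f"Blocked: '{command}' looks like it targets production.", file=sys.stderr)
    sys.exit(2)

sys.exit(0)  # allow everything else
```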

Anthropic's documentation describes MCP (Model Context Protocol) like this:

"MCP servers give Claude Code access to your tools, databases, and APIs... Implement features from issue trackers, analyze monitoring data, query databases, integrate designs, automate workflows."

This is the new abstraction layer Karpathy was talking about. You're not writing code anymore. You're configuring agents. You're building the constraints within which AI operates safely and effectively.


From AI-First Operator to Agentic Engineer

Last year, I wrote about becoming an "AI-First Operator": someone who uses AI tools as leverage while exercising judgment about when and how to deploy them.

The big idea behind my Operator Bootcamp was to help you build disposable software as a way to solve your problems.

But if you want your solutions to work well, you'll need to move towards Agentic Engineering, and that happens by learning how to build your own harnesses.

It's not that different. Same AI capabilities, but with deterministic constraints. Same speed, but with guardrails that prevent the AI from destroying your work.

Operators focus on creating disposable solutions. Agentic engineers make things last.


Jevons Paradox and the Future of Software Engineers

Here's the counterintuitive part: I don't think this kills software engineering jobs. I think it creates more of them.

This is Jevons Paradox, first observed in the 1860s with coal consumption:

"When steam engines became more efficient, coal consumption didn't drop—it exploded. Cheaper operations meant more factories, more trains, more applications, and ultimately, a net increase in consumption."

The same thing happened with software. As one analysis noted:

"Despite productivity improvements, we have more software engineers employed than ever before and being paid record amounts. The huge demand for software has meant that companies in aggregate did not fire software engineers—they instead hired them in droves to produce more of it."

If AI makes code production 10x cheaper, the world won't need 10x fewer developers. It'll demand 10x more software.

What changes is what developers do. As another analysis put it:

"There may be less jobs for traditional code jockeys, but they will be replaced by jobs that manage outcomes driven by many applications working together."

Less coding. More architecting. More orchestrating. More harnessing.


What This Means For You

If you've been trying vibe coding in 2025 and wondering why everything breaks, here's the reality:

You don't need to learn to code. But you do need to learn agentic engineering.

That means two things:

1. Understand Architecture

You need to know how systems talk to each other. What a database does. What an API is. Why security configurations matter. Not so you can write the code—so you can tell when the AI is doing something stupid.

This doesn't mean becoming a software engineer. It means becoming technically literate. Understanding the scaffolding around code even if you never write the code yourself.

2. Build Your Harness

A harness is how you capture the 10X Karpathy is talking about.

You can use other people's harnesses. Plugins are now available in Claude Code. There are marketplaces emerging with pre-built configurations for specific workflows.

Or you can build your own. If you understand your domain—construction workflows, legal document processing, healthcare operations—you can encode that understanding into skills and hooks that make the AI dramatically more useful than any generic tool.

The harness is the moat. The harness is the product. The harness is what separates "I vibe coded a broken app" from "I built a system that actually works."


Harnesses for Everything

I've been building Alfred, my AI butler project, since August 2024. The vision was always clear: a personal assistant that actually works, that knows my context, that proactively helps manage my life instead of waiting for me to prompt it.

For over a year, I kept hitting the same wall: persistent context management. Alfred would work brilliantly for a session, then forget everything. I'd build elaborate n8n workflows to maintain state, and they'd collapse under their own complexity. The project stalled repeatedly.

Now I understand why. Alfred needed a harness: deterministic hooks that enforce workflow steps (morning briefings, task captures, follow-up sequences) combined with probabilistic tool access (reasoning about priorities, drafting communications, researching questions). The duality clicks in a way pure prompting never did.
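A rough sketch of that split, with every name below a hypothetical stand-in rather than Alfred's actual code: the trigger and the data gathering are deterministic, and only the prioritisation is left to the model.

```python
import datetime

# Sketch of the Alfred-style duality: the schedule and the data sources are
# deterministic; only the reasoning step is probabilistic. All function names
# here are hypothetical stand-ins, not Alfred's real implementation.
def morning_briefing(fetch_calendar, fetch_open_tasks, prioritise, deliver):
    today = datetime.date.today()
    events = fetch_calendar(today)       # deterministic: always runs, same source
    tasks = fetch_open_tasks()           # deterministic: state lives outside the model
    ranked = prioritise(events, tasks)   # probabilistic: the model reasons about priorities
    deliver(f"Briefing for {today}:\n{ranked}")  # deterministic: fixed delivery channel
```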

What I demoed in my last build-in-public post in December is exactly that.

My development work gets a separate harness:

Dovetail.
Different domain, different constraints, different hooks.
Same architectural principle.

This is where the industry is heading. Software development harnesses are leading the way because that's where the pain is most acute right now.

But every knowledge work domain will need its own harnesses. Operations. Sales. Legal. Healthcare. The pattern is universal.

I wrote in my very first post that we needed "neurosymbolic hybrids"—neural networks governed by symbolic systems. I quoted Gary Marcus on how humans use symbolic reasoning to handle outliers that neural networks can't manage.

Harnesses are the symbolic system: hardcoded reasoning through hooks, skills, and MCP configs. These are the deterministic rules that govern the probabilistic neural network. It's not an agent that suddenly uses hard logic, but it makes what we have incredibly useful. Neurosymbolic architecture is emerging right now, built by practitioners who got tired of their agents misbehaving.
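To make that concrete, here's a toy sketch: a hypothetical `draft_email` model call wrapped in hardcoded rules it cannot talk its way around. The rules are boring on purpose; that's what makes them dependable.

```python
# Toy neurosymbolic split: the model proposes, the harness disposes.
# draft_email and send are hypothetical callables; the checks are the
# deterministic layer that never depends on the model's mood.
APPROVED_RECIPIENTS = {"client@example.com", "team@example.com"}

def send_outreach(context: str, draft_email, send) -> str:
    to, body = draft_email(context)        # probabilistic: the neural part
    if to not in APPROVED_RECIPIENTS:      # symbolic: a hard allowlist
        return f"Blocked: {to} is not an approved recipient."
    if len(body) > 2000:                   # symbolic: arbitrary but enforced
        return "Blocked: draft too long, needs human review."
    send(to, body)                         # only runs inside the rails
    return f"Sent to {to}."
```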


Your opportunity in 2026

Let me repeat one sentence from Karpathy:

"Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it."

He's right. There is no manual.

Nobody taught us how to combine hooks with MCP servers with skills with plugins. Nobody defined best practices for agentic engineering workflows. The term itself barely exists outside a few podcasts and GitHub repos.

This is the opportunity.

The people who figure out how to hold this alien tool, who build the harnesses, who write the manuals, who train the next generation of agentic engineers will have a structural advantage that lasts for years.

The 10X is available. It's sitting right there. Most people won't claim it because they're still trying to learn prompt engineering or waiting for AI to just figure it out on its own.

Later this month I'll do a free webinar session where I walk you through my own harnesses. I'll show you which plugins, hooks, and MCP tools I use, and how I turned Claude Code into my b... I mean, my go-to solution in my setup.

I'll send you the registration link tomorrow.