Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan
Sequoia Capital | AI, Technology, Future of Work

30 min | Summarized Jun 29, 2026

TL;DR

  • December marked a stark shift — agentic coding finally just works.
  • Software 3.0: prompting replaces code; the context window is the lever.
  • LLMs automate what you can verify, creating jagged capabilities.
  • Vibe coding raises the floor; agentic engineering preserves the quality bar.
  • You can outsource thinking, but not understanding — taste still matters.

Key Insights

  1. "Never felt more behind" was about a real, sudden shift

    Karpathy explained his startling comment by pointing to December as a clear turning point. With the latest models, code chunks "just came out fine," he stopped correcting them, and he found himself trusting the system and vibe coding. He argued people who only saw AI as a ChatGPT-style tool in 2024 needed to look again.

  2. Software 3.0 makes prompting the new programming

    He laid out his framing: software 1.0 is writing code, 2.0 is programming by curating datasets and training neural networks, and 3.0 is prompting. In 3.0, what's in the context window is your lever over the LLM, which acts as an interpreter performing computation in information space.

  3. The install-as-text example

    To make 3.0 concrete, Karpathy described how installing a certain tool is no longer a ballooning shell script but a block of text you copy-paste to your agent. The agent reads your environment and debugs in the loop, which he argued is far more powerful than spelling out every detail in 1.0 code.

  4. MenuGen and the "Nano Banana" moment

    He built a small app (running on Vercel) to turn a photo of a restaurant menu into pictures of the dishes. Then he saw the 3.0 version: just hand the photo to Gemini and ask its image model ("Nano Banana") to render the dishes onto the menu image. His reaction was that his whole app "shouldn't exist" — the neural network does the work directly.

  5. It's general information processing, not just code

    Karpathy stressed this goes beyond coding. His LLM knowledge-bases project turns a pile of documents into a wiki — something that couldn't exist before because no traditional code could recompile facts into a new, useful reframing. He finds the genuinely new capabilities more exciting than mere speedups.

  6. The far extrapolation: neural computers

    Pushed on what looks obvious in hindsight, he imagined "neural computers" where raw video or audio feeds a neural net that uses diffusion to render a UI unique to the moment. He suggested the early-computing question of calculator-versus-neural-net may flip, with neural nets becoming the host process and CPUs the co-processor.

  7. LLMs automate what you can verify

    Karpathy's verifiability thesis: traditional computers automate what you can specify in code, while LLMs automate what you can verify. Because frontier labs train with reinforcement learning and verification rewards, models peak in verifiable domains like math and code and get rougher elsewhere.

  8. Jagged intelligence, and staying in the loop

    He illustrated "jaggedness" with a contrast: a model may advise you to walk to a car wash 50 meters away, yet a state-of-the-art model can refactor a 100,000-line codebase or find zero-days. His takeaway: stay in the loop, treat the models as tools, and learn which "circuits" you're in, fine-tuning when you're outside the trained distribution.

  9. You're at the mercy of what labs train on

    Using the example that chess improved sharply from one GPT model to the next because chess data entered the pre-training set, Karpathy argued capabilities depend heavily on what labs happen to include. He advised exploring the model, since it ships with "no manual" and works in some settings but not others.

  10. Vibe coding vs. agentic engineering

    He distinguished the two: vibe coding raises the floor so anyone can build software, while agentic engineering preserves the professional quality bar — no introduced vulnerabilities, still fully responsible for the software, but faster. Agents are spiky, fallible, and stochastic but powerful; coordinating them without sacrificing quality is the discipline. He believes the ceiling now far exceeds the old "10x engineer."

  11. Taste, judgment, and oversight grow more valuable

    As agents handle more, Karpathy said humans must own the spec, plan, design, and taste. He treats agents like interns with great recall (he no longer memorizes API details) but flagged that their code is often bloated and brittle, and shared a bug where his agent matched payments by email instead of a persistent user ID. He hopes quality improves once labs reward aesthetics.

  12. You can't outsource understanding

    On education, he cited a line that stuck with him: you can outsource your thinking, but not your understanding. He feels he's become the bottleneck — knowing what to build and why, and directing his agents — and uses knowledge bases to process information, because you can't be a good director without understanding, which LLMs don't excel at.
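
The verifiability thesis in insight 7 can be illustrated with a toy example. This is a minimal sketch, not how any lab actually trains: the names `verification_reward`, `candidate_sort_good`, and `candidate_sort_buggy` are hypothetical, and the "reward" is assumed to be simply the fraction of test cases a candidate passes. The point is only that a task with an automatic checker yields a score without a human grading each output.

```python
from typing import Callable, Sequence

# A test case pairs an input with the output we expect for it.
TestCase = tuple[list[int], list[int]]


def verification_reward(
    candidate: Callable[[list[int]], list[int]],
    tests: Sequence[TestCase],
) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    if not tests:
        raise ValueError("need at least one test case")
    passed = 0
    for given, expected in tests:
        try:
            # Pass a copy so a candidate that mutates its input cannot corrupt the test.
            if candidate(list(given)) == expected:
                passed += 1
        except Exception:
            # A crashing candidate simply fails this check.
            pass
    return passed / len(tests)


def candidate_sort_good(xs: list[int]) -> list[int]:
    """Hypothetical model output that is correct."""
    return sorted(xs)


def candidate_sort_buggy(xs: list[int]) -> list[int]:
    """Hypothetical model output with a subtle flaw: it drops duplicates."""
    return sorted(set(xs))


TESTS: list[TestCase] = [
    ([3, 1, 2], [1, 2, 3]),
    ([], []),
    ([2, 2, 1], [1, 2, 2]),
    ([5], [5]),
]

if __name__ == "__main__":
    print(verification_reward(candidate_sort_good, TESTS))
    print(verification_reward(candidate_sort_buggy, TESTS))
```

Sorting is easy to check this way; a task such as "write a tasteful essay" has no comparable checker, which is one way to read why capabilities are rougher outside verifiable domains.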
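
The payment-matching bug mentioned in insight 11 can be sketched as follows. The summary does not describe the actual code or the exact failure, so everything here is an assumption made for illustration: the `User` and `Payment` types, the field names, and the scenario in which a user changes their email after paying.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: str  # persistent identifier; assumed never to change
    email: str    # mutable; the user can update it at any time


@dataclass
class Payment:
    user_id: str       # ID recorded at checkout
    email: str         # email as it was at checkout
    amount_cents: int


def match_by_email(payment: Payment, users: list[User]) -> Optional[User]:
    """Fragile lookup: breaks as soon as the stored email no longer matches."""
    return next((u for u in users if u.email == payment.email), None)


def match_by_user_id(payment: Payment, users: list[User]) -> Optional[User]:
    """Robust lookup: keys on the identifier that does not change."""
    return next((u for u in users if u.user_id == payment.user_id), None)


# Hypothetical scenario: the user pays, then changes their email address.
users = [User(user_id="u_1", email="old@example.com")]
payment = Payment(user_id="u_1", email="old@example.com", amount_cents=500)
users[0].email = "new@example.com"

if __name__ == "__main__":
    print(match_by_email(payment, users))    # no match after the email change
    print(match_by_user_id(payment, users))  # still finds the user
```

Both lookups pass a quick test written before the email changes, which is the kind of brittleness a reviewer who owns the spec would be expected to catch.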

Chapter Breakdown

  • 0:38 "Never felt more behind": the December shift
  • 2:38 Software 1.0, 2.0, and 3.0
  • 4:49 MenuGen and the "Nano Banana" moment
  • 9:45 Verifiability and jagged intelligence
  • 15:48 Vibe coding vs. agentic engineering
  • 19:26 What human skill grows more valuable
  • 23:32 Animals vs. ghosts
  • 25:16 Toward an agent-native world
  • 27:40 Education: thinking vs. understanding