The Ground is Moving...
- ai
- agents
- engineering-culture
I didn't work this out on my own. I'd been watching the industry argue about AI for a while without being able to name what bothered me about the argument, until I read two old books that have nothing to do with software: Thomas Kuhn's The Structure of Scientific Revolutions, which is about how a field actually changes, and Everett Rogers' Diffusion of Innovations, which is about how new things spread through it. Read side by side they described the trend I was already half-seeing better than I could, and they're most of the reason I now think the usual cautious instinct, the one I've leaned on for years, is about to backfire.
The instinct is to wait it out: ignore the first wave of a new thing, let the loud people and the consultants bleed on the edges, hold off until the tooling is real and the worst ideas have died, and then adopt the parts that are obviously useful. I don't mean that sarcastically. It has worked well for engineers for a long time, because it's basically how sane engineering organizations avoid wasting five years on every distributed object system, JavaScript framework, NoSQL database, and blockchain pitch that shows up wearing a lanyard and asking for budget. Most senior engineers aren't slow because they're lazy; they're slow because they've learned, correctly, that being first is usually a more expensive way to be wrong.
The whole strategy assumes the thing moving through the adoption curve is only a tool, and if it's only a tool then waiting is fine, because you skip the early pain, read the postmortems, adopt the boring version, and still come out ahead of most organizations. But if it changes what counts as competent work, the same cautious move turns into a trap, because by the time the safe version arrives the people who spent the messy phase building intuition already know where the bodies are buried.
That mismatch is what I think a lot of good engineers are underestimating right now. I don't say that because I think the models are magic, or because anyone should sprint after every product launch with their hair on fire. The question I keep coming back to is narrower: is AI closer to a better IDE feature, where waiting costs you a little productivity and you catch up fine, or closer to a change in how the work itself gets done, where waiting too long leaves you fluent in a way of working the field has already moved past?
Kuhn is useful here, even though he's easy to abuse. His model is that a field spends long stretches in normal science, working inside a shared frame of assumptions, methods, and acceptable questions, until the anomalies start to pile up. The old frame needs more and more awkward patches, a crisis builds, and eventually the field reorganizes around a new frame. The part that matters for us is the human one: expertise doesn't transfer cleanly across that break, so the physicist who was brilliant under one set of assumptions doesn't automatically stay brilliant under the next, and seniority by itself isn't a passport.
Software isn't physics, and I don't want to oversell the analogy. We aren't replacing the idea of mass: a compiler still compiles, latency still matters, distributed state still ruins your afternoon, and anyone who tells you the fundamentals no longer matter is probably selling something with a monthly seat price. Where Kuhn does fit is one level up, at what counts as the normal working loop. If that loop used to be think, type, run, debug, repeat, it's increasingly becoming describe, generate, inspect, test, and review. The fundamentals still matter, but the bottleneck has moved, and when a bottleneck moves, the skill profile around it moves too.
Rogers gives the other half of the picture. He mapped the familiar curve (innovators, early adopters, early majority, late majority, and laggards), and most good professionals are, and I mean this as a compliment, optimized to live in the early majority. Innovators are often wrong, early adopters routinely confuse pain with insight, and the early majority waits until there's real evidence. In normal technology adoption that's a great place to stand: you get most of the benefit without ever volunteering to be the bug report.
Overlay Rogers on Kuhn, though, and the curve starts to behave differently. When the new thing is only a tool, sitting in the late majority just costs you a temporary productivity gap. When it changes the working model, sitting in the late majority on the new model can mean you've become a laggard still working in the old one, because the market and your team have already moved their definition of competent output. Nothing I've actually lived through is the right size for this. The closest rhyme I can find happened before my time: the jump from the command line to the mouse and the GUI, when the whole way a person drives a computer changed, and knowing how to use one stopped meaning you'd memorized the right commands. The command line didn't die; most engineers, me included, still live in it. But it stopped being the thing everyone had to learn, and a far bigger world walked in through the new interface. That's the shape this has for me: a new way of telling a computer what you want, layered on top of everything we'd already built.
You don't have to reach back that far to feel the mechanism, though. The biggest shift in my own career was cloud, and even that undersells what's happening now. Cloud mostly repackaged capability that already worked behind a nicer API, but it shows the pattern cleanly. The engineer who moved early spent years building instinct for managed services, autoscaling, and infrastructure as code, and that instinct quietly became the job, while the engineer who waited got genuinely excellent at hand-tuned capacity planning and bare-metal high availability, right up until the market repriced that expertise toward zero. The skill didn't get worse; the ground under it moved. That's the part I actually worry about with AI: not that good engineers get replaced by a model, but that they keep getting sharper at a version of the job the rest of the field has quietly stopped paying for.
My read is that AI is closer to that kind of interface shift than to a better IDE feature, not because today's tools always produce good code (they often don't), but because the unit of work itself is changing. In roughly the time it used to take to get oriented, a good engineer with an agent can now ask for test scaffolding, a first-pass refactor, a PR review, a failing repro, and a draft migration plan. A lot of that still needs human judgment, and plenty of it will be wrong on the first try. But the agent doesn't really replace taste; it widens what you apply taste to, and the engineer who learns to steer that ends up with a very different ceiling than the one who just learns to type faster.
It's worth being concrete about what steering actually means, because it isn't clever phrasing and it isn't a personality trait you either have or you don't. It comes down to two fairly unglamorous habits: being deliberate about what you put in front of the model before it starts, and being disciplined about how you check what it gives back. Neither habit is new in spirit, which is exactly why they're easy to underrate, and together they're most of the difference between an engineer whose work gets sharper with these tools and one who just produces more code than they can actually stand behind.
The first habit is the part people have started calling context engineering, which sounds fancier than it is: it's mostly about not burying the model in things it doesn't need. It's tempting to treat the context window like free storage, dumping in the whole repo, a couple of half-relevant runbooks, and an hour of old chat on the theory that more can't hurt, but past a certain point more is the thing that hurts. You don't get a sharper model; you get context rot, the documented way these systems degrade as the window fills, where the agent starts applying a convention from the wrong service or relitigating a decision you settled two files ago, confidently and without flagging any of it. The fix is almost always less, not more, which still feels backwards every time I reach for it. I keep the working set small and push everything else into skills, so a deploy runbook or a review checklist sits on a shelf the agent only reaches for when the task actually needs it. Nine times out of ten, a better answer comes from fixing what the model is looking at, not from finding a cleverer way to ask.
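Here's roughly what that shelf looks like if you squint at it as code. This is a minimal sketch in Python, not how any particular agent framework does it: the `Skill` type, the keyword triggers, and `build_context` are all names I made up for illustration, and matching on keywords is a deliberately crude stand-in for however a real tool decides a skill is relevant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Reference material kept out of the default context (hypothetical type)."""
    name: str
    triggers: frozenset  # assumption: words in the task that suggest this skill is needed
    body: str


def select_skills(task: str, shelf: list[Skill]) -> list[Skill]:
    """Return only the skills whose trigger words appear in the task text."""
    words = set(task.lower().split())
    return [skill for skill in shelf if skill.triggers & words]


def build_context(task: str, working_set: list[str], shelf: list[Skill]) -> str:
    """Assemble the prompt from the task, a small working set, and matching skills."""
    parts = [f"TASK: {task}"]
    parts += working_set
    parts += [f"SKILL {s.name}:\n{s.body}" for s in select_skills(task, shelf)]
    return "\n\n".join(parts)


# Illustrative shelf; the contents are placeholders, not real runbooks.
SHELF = [
    Skill("deploy-runbook", frozenset({"deploy", "rollback"}),
          "Tag the release, then roll out to staging first."),
    Skill("review-checklist", frozenset({"review", "pr"}),
          "Check error handling and test coverage."),
]

context = build_context(
    "review this pr for the billing service",
    ["billing/models.py (excerpt)"],
    SHELF,
)
```

With that task, only the review checklist gets pulled in; the deploy runbook stays on the shelf. The design choice worth noticing is that the default is exclusion: nothing enters the context unless the task gives a reason for it.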
The second habit is where the real work lives, because a clean context still won't save you if you take the model's word for anything. So I don't read the diff and decide it looks right; I make the agent prove it. It writes the tests first, so the code has to satisfy a spec instead of grading its own homework, then it deploys the change to an isolated environment, feeds it real inputs, and watches the outputs and the database to see what actually happens instead of what it intended. My job moves up a level, to deciding what good output looks like before any of that runs and building the evals it has to clear, so the question quietly shifts from whether the model will hallucinate to which checks are standing between it and main. None of it is new, which is the whole point, because it's the same testing, review, and CI we've leaned on for years, only now aimed at a new kind of author that writes fast and sounds sure of itself. That's the competent work hiding in plain sight, the way instinct for managed services hid inside "we moved to the cloud," and you only build it by running the loop yourself, because there won't be any postmortems to read later.
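The same idea as a toy gate, sketched in Python. The names here (`Check`, `gate`, `slugify_candidate`, `slugify_fixed`) are hypothetical, and the three checks stand in for whatever tests and evals a real pipeline would run. The only point is the ordering: the spec exists before the candidate does, and the merge decision comes from the checks, not from how plausible the diff looks.

```python
from typing import Callable, NamedTuple


class Check(NamedTuple):
    """One named pass/fail check applied to a candidate function."""
    name: str
    run: Callable[[Callable[[str], str]], bool]


# The spec is written first, before any candidate implementation exists.
SPEC = [
    Check("lowercases", lambda f: f("Hello") == "hello"),
    Check("spaces become hyphens", lambda f: f("a b c") == "a-b-c"),
    Check("trims edges", lambda f: f("  x  ") == "x"),
]


def gate(candidate: Callable[[str], str], spec: list[Check]) -> tuple[bool, list[str]]:
    """Run every check; return (ok, names of failed checks)."""
    failures = []
    for check in spec:
        try:
            passed = check.run(candidate)
        except Exception:
            passed = False  # a crash counts as a failed check, not a skipped one
        if not passed:
            failures.append(check.name)
    return (not failures, failures)


def slugify_candidate(text: str) -> str:
    """A plausible first draft that reads fine in a diff."""
    return text.lower().replace(" ", "-")


def slugify_fixed(text: str) -> str:
    """A revision that splits on whitespace, so edge padding disappears."""
    return "-".join(text.lower().split())


first_try = gate(slugify_candidate, SPEC)
second_try = gate(slugify_fixed, SPEC)
```

The first draft looks right and fails only the trimming check; the revision clears all three. Nothing about the draft's wording would have told you that, which is the argument for deciding what good output looks like before anything runs.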
The uncomfortable part is that the reps matter more than the takes, because you can agree with every cautious sentence in this post and still fall behind if your actual contact with these tools is asking a chatbot to explain an error message now and then. The people getting visibly better aren't necessarily the ones with the strongest opinions about model capability; they're the ones building small things, watching where the agent drifts, learning which tasks need fresh context, writing a skill when the same instruction keeps repeating, adding tests because the model's confidence isn't evidence, and slowly developing a feel for when the tool is genuinely helping versus dressing up uncertainty as confident prose.
There's a reasonable version of skepticism here, and I don't want to flatten it, because the concerns are real: the security and IP exposure of pushing your codebase through somebody else's model, the way model behavior swings between versions so the thing that worked last month quietly stops working this one, the vendor lock-in that shows up the moment your whole workflow assumes one provider, and the real risk that some organizations will use AI as an excuse to cut mentoring and review and call the savings a win. To me, though, those read as arguments for learning the new loop with discipline rather than standing outside it and waiting for a version that shows up with no tradeoffs, because no such version is coming, and if it somehow did, the people who practiced on the messy one would still understand it better than the people who waited.
The risk I take most seriously isn't deskilling, because I don't think these tools erode the foundation of someone who already has one. A lot of people call AI a force multiplier; I think it's closer to a skill multiplier, which is a less comforting thing. There's a floor it hands everyone, a baseline you get no matter what you know, but above that floor it mostly multiplies whatever you walk in with, so real depth compounds while half a skill just gets you half a skill amplified, confidently and at speed. You can vibecode a working mobile app that way, and it'll genuinely run, but a demo and an enterprise system are separated by exactly the foundations the multiplier can't supply: the failure modes you thought to handle, the data model that survives contact with real users, the security boundary you drew on purpose. So the danger was never that good engineers would get worse; it's that people who never built the foundation get fluent at shipping output they were never equipped to evaluate, and nobody finds out what's missing until something subtle breaks in a way the tests didn't catch and no one in the room can reconstruct why.
What I'd actually do, if I were trying to stay honest with myself instead of just collecting opinions, is pick one boring piece of work a week and force it all the way through that loop: tests first, fresh-context review, CI as the judge, my own hand on the merge. The goal isn't to become the person who trusts the tool by default; it's to become the person who knows, from actual scar tissue, which parts of the work the agent can carry and which parts still need me, because the only honest way to learn that line is to run the loop enough times that the failure modes stop being theoretical and start being things I've watched happen.
The part I don't know yet is whether most teams will use AI to get better at the work or to quietly avoid learning it, and my honest suspicion is both, because incentives are incentives and humans are humans. But for an individual engineer the bet looks clear enough to me: if the ground really is moving under the adoption curve, then waiting for permission is a far riskier strategy than it looks from the old floor.