Three things I am certain of

    My thinking rests on three principles; everything that follows depends on them.

    First: Every model we use today is the worst model we will ever use again. As OpenAI Chief Product Officer Kevin Weil has put it, models will only get better from here, and may improve at an exponential rate.

    Second: The cost of artificial intelligence is collapsing. Epoch AI's data shows inference prices falling at a median rate of 50x per year, and since the beginning of 2024 that rate has accelerated to 200x per year. This is Moore's Law in overdrive.

    Third: The bottleneck shifts from code to intent. If the first two points hold, the limiting factor in software engineering is no longer writing code but defining the right problems, setting the right context, evaluating outputs appropriately, and knowing when to intervene and when to let go.

    From optician to optometrist

    In my experience, engineers fall into two categories: requirement receivers and requirement creators. Requirement receivers take requirements as given and turn them into code. Requirement creators analyze the problem deeply, clarify goals, set constraints, and shape the solution before any code is written.

    The best engineers I have worked with have always been requirement creators. This is not new. What has changed is that LLMs now make this a baseline expectation rather than a differentiating advantage. When AI agents can generate code from well-structured specifications, pure implementation work without ownership of the problem is no longer a viable long-term skill. The bar for the industry has moved.

    Even before AI, great technical leaders had the same skills: setting context up front, building guardrails for autonomy, and knowing when to step in and when to let work proceed on its own. In an LLM-centered era, every effective engineer needs to operate this way. Every engineer becomes a technical leader.

    I wrote about this last year, when AI-driven software development was still in the early agentic stage. Since then, LLM-based development has grown exponentially.

    My own workflow has evolved through three stages, from Cursor/Windsurf early on, to Claude Code 4.0, and now to Claude Code Opus 4.6: from clarifying 'how', to clarifying 'what', and now spending most of my time on 'why'. When something breaks, my first thought is no longer 'where is the bug in the code?' but 'where did I fail to express my intent clearly?' That shift in mindset is the difference between treating AI as a tool and treating it as a teammate.

    What does this mean for developer experience?

    If specifications are now the primary artifact, the highest-leverage investment is a shared context layer: the documentation, specifications, and standards that let any tool operate effectively across the organization. Claude Code, Cursor, and Codex are interchangeable execution layers. The context layer is what matters.

    In practice, this means treating documentation as living infrastructure. CLAUDE.md, AGENTS.md, specification documents, and architecture decision records become the working memory of the engineering organization.
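    As one illustration, a project-level context file might look like the sketch below. The section names and contents are hypothetical, not a prescribed schema; the point is that conventions an agent needs live next to the code, under version control.

```markdown
# CLAUDE.md — project context (hypothetical example)

## What this service does
Billing API: computes invoices from usage events. Source of truth for
pricing rules is docs/specs/pricing.md.

## Conventions
- Python 3.12, typed; run `make lint test` before proposing changes.
- New endpoints need an ADR in docs/adr/ before implementation.

## Boundaries
- Never modify migrations/ by hand; use `make migration`.
- Secrets come from the environment, never from the repo.
```

    Because the file is plain Markdown in the repository, it serves humans and agents alike, and it goes stale loudly: a wrong convention here breaks agent output quickly, which forces the team to keep it current.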

    This shared context layer also changes how product and engineering collaborate. When specifications, architecture decisions, and product principles live in the same source control system, both sides work from the same source of truth. The familiar failure mode, where a product manager writes a product requirements document (PRD) and engineers walk away with different interpretations, goes away.

    Today, code has two audiences: humans and agents. Increasingly, agents will be the primary one. That means codebases need to be agent-friendly: readable, writable, parseable, and debuggable by an agent. Well-structured Markdown documentation, consistent naming conventions, and clear architecture decision records all matter. If the codebase is not agent-friendly, specifications get longer (and likely less precise), context windows fill up faster, and the whole workflow slows down. Clarity and precision matter for humans and agents alike.

    Tool design should also prioritize command-line interfaces (CLIs), application programming interfaces (APIs), and headless architectures. Agents perform best when they can operate autonomously, without a human clicking through a browser. The more headless the process, the more agent work can run in parallel.
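    A minimal sketch of what "agent-friendly" means for a CLI, using hypothetical names (the `check_status` lookup is a stand-in for a real API call): machine-readable JSON on stdout, a meaningful exit code, and no interactive prompts.

```python
# Sketch of a headless, agent-friendly CLI. All names are illustrative.
import argparse
import json

def check_status(service: str) -> dict:
    """Hypothetical status check; a real tool would query an API here."""
    return {"service": service, "healthy": True, "version": "1.4.2"}

def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Headless status check")
    parser.add_argument("service")
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable output")
    args = parser.parse_args(argv)
    result = check_status(args.service)
    if args.json:
        print(json.dumps(result))  # agents parse this directly
    else:
        print(f"{result['service']}: {'ok' if result['healthy'] else 'down'}")
    return 0 if result["healthy"] else 1  # exit code doubles as the signal

exit_code = main(["billing", "--json"])  # demo invocation with explicit argv
```

    The design choice that matters is that nothing requires a human in the loop: an agent can invoke this, parse stdout, branch on the exit code, and run many such calls in parallel.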

    I have also made a practice of deliberately resetting context between major iterations: clearing all accumulated context so that the specifications have to stand on their own. If a brand-new engineer or leader can get up to speed and contribute immediately, the organization has institutional knowledge that survives any transition. Imagine a world where ramp-up during any handover is as fast as spinning up the project itself. Getting there requires discipline in how the shared context is maintained.

    How do I measure whether it is effective?

    I believe measurement can be divided into three levels, each providing information to the one above it.

    Developer experience is the foundation: the metric is how quickly an engineer can go from zero knowledge to a meaningful, validated contribution. It covers four things: environment readiness (setup and onboarding speed), codebase legibility (how long an agent can work before needing human intervention), toolchain reliability (continuous integration cycle time, test reliability), and feedback-loop speed (time from change to validated correctness).

    The middle layer is engineering health: deployment frequency, change failure rate, mean time to recovery, system stability. In agent-driven environments throughput can rise dramatically, so quality gates and defect attribution must scale with it. You need to know not just where things broke, but whether the cause was a specification defect, insufficient test coverage, or a review miss.
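    Two of these middle-layer metrics are straightforward to compute from a deploy log; the sketch below shows one way, with assumed field names and made-up sample data.

```python
# Minimal sketch: deployment frequency and change failure rate from a
# deploy log. The record shape ("day", "failed") is an assumption.
from datetime import date

deploys = [
    {"day": date(2025, 1, 6), "failed": False},
    {"day": date(2025, 1, 8), "failed": True},
    {"day": date(2025, 1, 9), "failed": False},
    {"day": date(2025, 1, 13), "failed": False},
]

def deployment_frequency(deploys: list, days: int) -> float:
    """Deploys per day over the observation window."""
    return len(deploys) / days

def change_failure_rate(deploys: list) -> float:
    """Share of deploys that caused a failure in production."""
    failures = sum(1 for d in deploys if d["failed"])
    return failures / len(deploys)

print(deployment_frequency(deploys, days=7))  # 4 deploys over one week
print(change_failure_rate(deploys))           # 1 failure out of 4 deploys
```

    The attribution question in the text (spec defect vs. coverage gap vs. review miss) would be an extra field on each failed record; the arithmetic stays the same, sliced by cause.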

    The top layer is business outcomes: feature throughput, time to market, customer satisfaction. Engineering health feeds these outcomes directly, because higher stability and faster iteration cycles mean the company ships more, and customers feel it.

    The causal chain is clear: invest in the foundation, and its benefits will accumulate layer by layer.

    I don't believe we should measure the adoption rate of artificial intelligence or the lines of code generated by agents. Business units don't care how the code is written; they care about whether we deliver faster, with fewer bugs, and higher customer satisfaction.

    What I am uncertain about is how long it will take for agent-generated code to meet the reliability bar of regulated environments like financial services. The trend is clear, but there is still a gap between 'good enough for a hobby project' and 'passes a SOX audit'. Companies that build the validation and compliance infrastructure first will have a significant advantage.

    How will this change organizational design?

    There are three reasons for the existence of the traditional engineering pyramid structure: problem decomposition, information flow, and accountability. This structure makes sense when everyone is only responsible for a narrow field, and the transfer of information is essentially a high-loss interpersonal process.

    LLMs change this math in two ways. First, when each person can carry a wider span of control (because agents handle most of the execution), fewer layers are needed. Second, and more fundamentally: in a specification-driven world, goals, outcomes, and solutions are precisely defined and live in a single codebase, and everyone works from the same source of truth, so passing information through a hierarchy becomes nearly frictionless. The layers that exist mainly to translate and relay information become less necessary.

    As agents take on more of the execution work, every function benefits, not just engineering. Product owners can prototype without waiting for an iteration cycle. Designers can build and ship, not just produce mockups. Engineers can go deep on customer problems, not just technical ones. The people who stand out will be those with deep domain knowledge, a real understanding of customer pain points, and sharp judgment about what 'good' looks like.

    The trade-off that leaders truly face is not speed versus quality, but autonomy versus oversight: to what extent do you trust your systems, your agents, and your employees to operate independently?

    Opportunity

    I don't know whether this transition will happen in two years or five years. For some companies and fields, it may happen more quickly.

    The transition to LLM-native engineering is not about finding the perfect tools or prompts, but about building a rigorous methodology that treats artificial intelligence as a powerful yet fallible partner and embedding that methodology into the culture, processes, and infrastructure of the organization.

    Companies that integrate artificial intelligence into the core of their products rather than as an add-on technology will have the most advantage. They have the institutional strength to build models, the data required to train them, and the willingness to let AI reshape internal workflows (not just customer-facing processes). Leaders who can build bridges from 'AI as a product' to 'AI as a process' will gain a competitive advantage that is hard to replicate.