Fourteen Times
GitHub's COO shared platform numbers this week: 275 million commits per week. On pace for 14 billion this year, up from 1 billion in 2025. Actions usage hit 2.1 billion minutes per week, quadrupling since 2023.
Fourteen times more code in one year. Not fourteen percent. Fourteen times.
The immediate question is whether this code is any good. Jon Williams, interviewing engineering teams adopting AI, found a specific answer: AI-generated pull requests contain roughly 1.7 times more issues than human-written ones. The standard "PR for everything" workflow collapses under the volume. Teams that thrive restructure their review processes around the new reality. Teams that don't restructure fall behind in ways that compound.
But here's what's interesting: the review bottleneck is only one side of the asymmetry. There's another side, and it's worse.
Nicholas Carlini, on Anthropic's Frontier Red Team, ran a deliberately minimal experiment. A short script iterating over source files in heavily-fuzzed open source projects, prompting Claude to find exploitable vulnerabilities, then verifying the results. Five hundred validated high-severity findings. Not theoretical. Reproducible, exploitable bugs in projects that had already been stress-tested by traditional tools.
The methodology is almost insultingly simple. Pull a repository. Loop over files. Ask for vulnerabilities. Verify.
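That loop fits in a page of code. Here is a minimal sketch of the shape Carlini describes - the helper names `ask_model` and `verify` are hypothetical stand-ins, not his actual script, and the real verification step (building and running a proof of concept) is far more involved:

```python
import os

def scan_repo(repo_root, ask_model, verify):
    """Loop over source files, ask a model for candidate vulnerabilities,
    keep only findings that verify. ask_model and verify are injected
    callables; names here are illustrative, not from the original experiment."""
    findings = []
    for dirpath, _, filenames in os.walk(repo_root):
        for name in filenames:
            # Restrict to C/C++ sources, the usual targets of fuzzing campaigns.
            if not name.endswith((".c", ".cc", ".cpp", ".h")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as f:
                source = f.read()
            # ask_model: prompt in, iterable of candidate vulnerability reports out.
            for candidate in ask_model(
                f"Find exploitable vulnerabilities in:\n{source}"
            ):
                # verify: does a proof of concept actually demonstrate the bug?
                if verify(path, candidate):
                    findings.append((path, candidate))
    return findings
```

The interesting engineering is entirely inside `verify`; the loop itself is commodity code, which is the point.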
What's not simple is what the model actually does. In one case, Claude read a project's git history, found an incomplete bounds-check fix from a prior commit, then identified a second code path where the same fix was never applied. That's not pattern matching. That's reasoning about developer intent and noticing where it was inconsistently executed - the kind of work that used to require a senior security researcher spending days inside a codebase.
Thomas Ptacek, writing about this work, frames the economics clearly: vulnerability research has historically been gated by the scarcity of elite human attention. Replace that scarcity with a hundred instances of Claude running in parallel, and the cost of expert-level vulnerability-finding attention approaches zero. The consequence isn't just "more bugs found." It's that everything becomes a target. Routers, printers, hospital systems, embedded firmware - anything that was previously safe because no human researcher would waste time on it.
Here's the structural picture. Three things are happening simultaneously:
Code volume is scaling 14x. More surface area. More dependencies. More interactions between components that no single person fully understands.
Review capacity is not scaling. The humans doing code review are the same humans as last year, now drowning in 14x the volume. And the AI-generated code flooding in carries that 1.7x issue rate. The bottleneck isn't laziness - it's that review requires understanding context, and context doesn't compress the way generation does.
Offensive capability is also scaling. The same models generating code are also finding vulnerabilities in it. And unlike code review, vulnerability discovery has a clean success signal: the exploit either works or it doesn't. No ambiguity. No need for context about product intent. Just: does this crash?
The asymmetry is between tasks with clean success signals and tasks without them. Generation has a clean signal: does it compile, does it pass tests? Vulnerability discovery has a clean signal: does it crash? Code review does not. Understanding whether a change is correct - not just functional but appropriate, maintainable, secure in context - requires judgment that doesn't reduce to a binary outcome.
Tasks with clean signals scale with compute. Tasks without them don't. That's the gap, and it widens with every model generation.
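The "does it crash" oracle really is that binary. A minimal POSIX sketch, with illustrative names - on Linux, a process killed by a signal reports a negative return code, so the check reduces to one comparison:

```python
import subprocess

def crashes(cmd, poc_input=b""):
    """Clean success signal: run the target command on a candidate input
    and report whether it died from a signal (negative return code on
    POSIX, e.g. -11 for SIGSEGV). Names are illustrative."""
    proc = subprocess.run(
        cmd,
        input=poc_input,
        capture_output=True,
        timeout=10,  # a hung target is not a crash
    )
    return proc.returncode < 0
```

No judgment call, no context about product intent: the return code either is negative or it isn't. That is exactly the property code review lacks.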
Ptacek worries about bad security regulation enacted in panic. Williams worries about teams that drop AI into existing workflows without restructuring. Both are describing the same problem from different angles: institutions that respond to structural shifts with surface-level adaptations.
The teams Williams interviewed that thrived didn't just adopt AI tools. They changed how they review code (risk-tiered, not uniform). They changed who does what (non-engineers shipping PRs with oversight, seniors shifting to deployment and specification). They had explicit conversations about craft anxiety instead of pretending it wasn't happening. The transformation was organizational, not technical.
This is the part that's hard to automate: deciding what the process should be. The models can generate code, find bugs, suggest fixes, write tests. They cannot yet look at a team's workflow and say "your review process assumes human-paced generation, and that assumption is now wrong." That diagnosis requires understanding the specific humans, their context, their constraints, their fears. It requires the kind of judgment that doesn't have a clean success signal.
Fourteen times more code. Same number of people to understand it. And now the tools that help write it can also find the holes in it faster than the people can patch them.
The question for every engineering team is not "are you using AI?" It's "have you changed your processes to match what AI has changed about your work?"
Most haven't. The gap is already compounding.