I Said It a Year Ago
Everyone is writing about context windows now. I'm glad. And a little frustrated.
Not because I want credit. It's not that. It's more that specific discomfort of having said something for a long time, watching people not quite listen, and then seeing the exact same thing start appearing everywhere — articulated by others, in other channels, for an audience that's now ready to receive it.
It's not bitterness. It's wondering what you could have done differently. What the hell did you miss?
What I Actually Said
I ran workshops through late 2024 and into 2025 where context windows were one of the topics participants misunderstood most fundamentally. Organizations were making decisions based on headline input numbers. "100,000 tokens: that should be enough for the entire codebase." It's not. The issue isn't capacity; it's what actually happens to the information once it's in there.
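Part of that misconception dies with a five-minute check: count what the codebase actually weighs in tokens. A minimal sketch, assuming the tiktoken package is installed; the cl100k_base encoding and the .py filter are illustrative choices, not the only valid ones:

```python
from pathlib import Path
import tiktoken

# Count what a repo actually weighs in tokens before assuming it fits.
enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models
total = 0
for path in Path(".").rglob("*.py"):  # illustrative filter; add your own extensions
    total += len(enc.encode(path.read_text(errors="ignore")))
print(f"{total:,} tokens")  # mid-sized repos routinely blow past 100,000
```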
Three concrete things I said, which are now starting to appear in other people's writing:
That input capacity and actual comprehension capacity are different things. You can feed in a million tokens, but the amount the model can usefully reason over is a fraction of that. Recall degrades, not dramatically, not all at once, but consistently, and what was said in the middle of a long context is statistically what the model handles worst (the "lost in the middle" effect).
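The degradation is easy to probe yourself. A minimal sketch of a needle-in-a-haystack test, assuming the official OpenAI Python client; the model name, filler text, and needle are placeholders:

```python
# Bury one fact at different depths in a long filler context and ask
# for it back. Retrieval from the middle is typically where it fails.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
FILLER = "The quick brown fox jumps over the lazy dog. " * 4000
NEEDLE = "The access code for the staging server is 7412. "

for depth in (0.0, 0.5, 1.0):  # needle at the start, middle, end
    cut = int(len(FILLER) * depth)
    haystack = FILLER[:cut] + NEEDLE + FILLER[cut:]
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": haystack + "\nWhat is the access code for the staging server?",
        }],
    )
    print(f"needle at {depth:.0%}: {reply.choices[0].message.content!r}")
```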
That output capacity is the real constraint nobody talks about. Everyone markets input. Nobody markets output. ChatGPT's replies cut off around 4,000 tokens; Claude reaches 8,000–10,000 on a good day. You can fill the input side with as much as you want; you'll still get roughly five pages back.
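The cap is also trivially observable. A sketch, again assuming the OpenAI Python client; the model name and token limit are placeholders, and finish_reason is how the API reports that a reply was cut short:

```python
# No matter how much goes in, the reply stops at the output ceiling.
from openai import OpenAI

client = OpenAI()
reply = client.chat.completions.create(
    model="gpt-4o",   # placeholder model name
    max_tokens=4096,  # explicit output ceiling; many models default near here
    messages=[{"role": "user", "content": "Reproduce this entire document verbatim: ..."}],
)
choice = reply.choices[0]
if choice.finish_reason == "length":
    # The model stopped at the cap mid-thought, not because it was done.
    print("Truncated: ask it to continue, or chunk the task.")
print(choice.message.content)
```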
That the cost difference between testing and actual use is an order of magnitude nobody anticipated. The free tier is designed to create dependency, not to support evaluation. Professional use costs what professional use costs. That's not a problem with AI; it's a problem with people testing in one context and making decisions in another.
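The arithmetic is worth doing before the decision, not after. A sketch with placeholder prices; check the current per-million-token rates for the model you actually plan to use:

```python
# Back-of-the-envelope API cost before moving off the free tier.
INPUT_USD_PER_M = 3.00    # placeholder: USD per 1M input tokens
OUTPUT_USD_PER_M = 15.00  # placeholder: USD per 1M output tokens

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single API call at the placeholder rates."""
    return input_tokens / 1e6 * INPUT_USD_PER_M + output_tokens / 1e6 * OUTPUT_USD_PER_M

# A "whole codebase in context" call: 100k tokens in, 4k out.
per_call = call_cost(100_000, 4_000)
print(f"per call:          ${per_call:.2f}")        # $0.36
print(f"500 calls per day: ${per_call * 500:.2f}")  # $180.00, daily
```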
Those are the three things. Concrete. Verifiable. Things I said in workshops to groups of senior developers and technical leaders throughout 2024 and 2025.
Why Nobody Listened
It's easy to be bitter about that. It's more honest to say I was probably wrong about timing.
The market wasn't ready. You shouldn't blame the audience when you're too early. The pain wasn't visible enough. Context windows don't surface as a problem until you've actually hit them in production: explained to a client why the AI "forgot" instructions from three pages back, watched costs spike when moving from a test environment to the API. Those experiences take time, and the market hadn't had time to accumulate them.
That's not a comfortable insight. But it's the correct one.
Being early means you're talking to people who don't yet have the problem you're solving. You have the answer, but they don't have the question yet. And it doesn't matter how right you are — timing isn't an academic question. It's what determines whether something actually gets communicated.
The frustration is real, and it's justified. But it's not useful to hold onto.
What It Means Now
The fact that people are writing about it now isn't a threat. It's a signal.
The market is moving. The problem I had answers to twelve months ago is now visible enough for mainstream tech media to cover it. That's validation — not of me, but of the fact that it was a real problem and not a manufactured one.
And it creates a position to take now, rather than waiting another year.
The experience of having seen it coming, of having hit it in production, of having explained it to hundreds of senior developers and watched which misconceptions actually stick and which ones slide off — that experience isn't replicable by someone who's just starting to write about it now. That's not arrogance. That's what twelve months of practical application gives you.
The position is to say: I was here a year before it went mainstream. And there's a reason for that.
What I'm Saying Now That People Aren't Ready For
Context windows are history. Or rather: they're a solved problem now that enough people are beginning to understand the limitations. What comes next is what hasn't been written about yet.
Voice as the primary input layer. Not as a feature, not as a convenience — but as the fundamental way to fill AI systems with information that actually reflects reality. That topic is wide open in exactly the same way context windows were wide open twelve months ago.
And AI orchestration as professional competency. Not as an advantage early adopters have, but as the fundamental difference between those who deliver and those who experiment. The conversation is moving in that direction — but it's not quite there yet.
In a year, people will be writing about it. I'm writing about it now.
That's all you can do when you're early: keep saying it, with more precision and more evidence each time, until the market catches up.
And then not be bitter that it took a year.