GPT-5.4's Million-Token Window Is an Extraction Event, Not a Feature

A 1M-token context window means a single paste can ingest your entire body of work. The question isn't whether it's impressive. It's whose terms govern what happens next.

Your client just pasted your entire consulting archive into GPT-5.4. Ten years of decks, transcripts, playbooks, proposal templates, redline patterns, the whole thing. One prompt. One paste. They didn't ask.

Price your expertise at $400 an hour and that single paste hands over somewhere between $180,000 and $2 million in accumulated IP to a frontier model in under four seconds. No NDA clause covers it. No contract anticipated it. The model now knows how you think, and your client now has a cheaper simulation of you on tap.

The question worth asking isn't whether GPT-5.4's million-token window is impressive. It is. The question is what becomes possible when the default client workflow shifts from "hire the expert" to "upload the expert's work and ask the model."

I'm Matt Cretzman. I build systems for experts who don't want to be the next training set. Here's what the 1M-token window actually is, why it's an extraction event dressed as a feature release, and the stack you need between your IP and the frontier models before your clients stop asking permission.

The Feature Framing Is a Trap

Every model release gets covered the same way. Bigger context. Better benchmarks. New multimodal tricks. The trade press treats each jump as a capability story.

It isn't. It's a consumption story.

A 128K context window could hold roughly 96,000 words — about one full book. A 1M-token window holds roughly 750,000 words — call it ten full books, or a mid-career consultant's entire archive of client deliverables. What was a copy-paste exercise limited by friction is now a single drag-and-drop.
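The arithmetic behind those figures, as a quick sketch — the 0.75 words-per-token ratio is a common rule of thumb, not a tokenizer guarantee, and "book" here assumes a typical 75,000-word nonfiction manuscript:

```python
# Back-of-envelope window sizing, assuming ~0.75 English words per token
# (a common heuristic; actual ratios vary by tokenizer and text).
WORDS_PER_TOKEN = 0.75
WORDS_PER_BOOK = 75_000  # a typical full-length nonfiction book

def window_in_words(tokens: int) -> int:
    """Approximate how many words fit in a context window of `tokens` tokens."""
    return int(tokens * WORDS_PER_TOKEN)

for label, tokens in [("128K window", 128_000), ("1M window", 1_000_000)]:
    words = window_in_words(tokens)
    print(f"{label}: ~{words:,} words (~{words / WORDS_PER_BOOK:.0f} books)")
```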

Friction was doing quiet work for you. Friction is why clients hired you for the fifth engagement instead of building an internal summary of the first four. Friction is gone now.

This is Extraction Economy #1 from my book — the same economy that ingested expert knowledge into training sets without consent, now operating at the inference layer instead of the pretraining layer. Your work doesn't need to be in the weights. It just needs to be in the context window at the moment of the question.

Knowledge Debt Is About to Get Called

Most experts I talk to are carrying massive knowledge debt. Their IP exists in seventeen places — Google Drive, old Dropbox folders, Loom libraries, a Notion that nobody maintains, three generations of proposal templates, a hard drive labeled "ARCHIVE FINAL v7."

They've been telling themselves they'll organize it someday. Someday was fine when no one could use it at scale. Someday is now a liability.

Here's the math. The first client who discovers that pasting your full archive into a 1M-token window produces a 70%-as-good version of you for $20/month will not keep that discovery secret. They'll tell their peer group. Their peer group will do the same with their own expert relationships. The behavior spreads through procurement faster than any sales motion you run.

And the experts who lose first are the ones whose IP is most portable — slides, transcripts, written frameworks, recorded talks. Which is to say, the experts who did the best job documenting their thinking. Knowledge debt gets called, and the most generous documenters pay first.

This is the part nobody wants to say out loud. The discipline that made you valuable — writing it down, building the deck, recording the workshop — is the same discipline that makes you extractable.

Mercor Was the Warm-Up

If you haven't watched Brendan Foody's Redpoint interview from June 2025, go watch it. Mercor pays experts by the hour to dump their domain knowledge into training datasets. Doctors, lawyers, finance operators, engineers. They sit down, they answer questions, they get paid a per-hour rate that feels fair in the moment.

That's Extraction Economy #2 in the book. The expert consents, but the terms are the platform's. Your knowledge becomes someone else's model weights and you get an hourly rate that doesn't compound.

The 1M-token window removes the middleman. Why pay experts to dump knowledge when clients will paste it in for free?

Mercor was the warm-up act. The main event is clients doing the extraction themselves, one context window at a time, with no payment to anyone except OpenAI.

The Third Option: Owned Surfaces, MCP-Mediated

I spent most of 2024 and 2025 building what I now call a Knowledge Delivery System — the core product of Skill Refinery. The premise is straightforward. Your IP lives on surfaces you own. Models reach your IP through a protocol you control. Every query is logged, priced, and tied back to you.

The protocol matters. When Anthropic launched MCP in November 2024, most people read it as a plumbing announcement. It's not. MCP is the first serious attempt at a standard for how models talk to external systems of record — the systems experts actually own.

In December 2025, Anthropic donated MCP to the Linux Foundation's Agentic AI Foundation. That moved it out of one vendor's roadmap and into ecosystem infrastructure. OpenAI adopters, Google adopters, open-weights adopters — they all converge on the same protocol for reaching external tools and knowledge.

This is the opening. An expert with an MCP-mediated surface becomes a called resource inside the agent's workflow, not a pasted artifact inside its context window. The difference in economics is everything.

- Pasted IP is a one-time extraction. The value is transferred, the expert is invisible, no revenue event occurs.
- Called IP is a metered transaction. The model reaches your surface, you log the query, you price the response, you stay in the loop.

KDS is how I operationalize this for clients. Your decks, playbooks, frameworks, and decision trees get structured into callable endpoints. Your voice, your constraints, your redlines — served as tools the agent uses rather than raw text it ingests. The agent can still do its job. You're still in the revenue path.
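The pasted-versus-called difference is easiest to see as code. This is a minimal sketch of the metering idea only — the class name, the flat per-call price, and the toy tool are all hypothetical, and a production build would sit behind an actual MCP server rather than a plain dict — but the shape is the same: the agent calls a tool, and every call leaves a priced log entry:

```python
import time
from dataclasses import dataclass, field

@dataclass
class MeteredSurface:
    """Hypothetical owned surface: tools are callable, every call is logged and priced."""
    price_per_call: float            # flat per-call rate; real pricing would vary by tool
    tools: dict = field(default_factory=dict)
    ledger: list = field(default_factory=list)

    def register(self, name, fn):
        self.tools[name] = fn

    def call(self, name, **kwargs):
        # The model reaches the tool; the owner records who asked what, and when.
        result = self.tools[name](**kwargs)
        self.ledger.append({
            "tool": name,
            "args": kwargs,
            "price": self.price_per_call,
            "ts": time.time(),
        })
        return result

    def revenue(self):
        return sum(entry["price"] for entry in self.ledger)

# Usage: the agent calls a tool instead of ingesting the raw deck.
surface = MeteredSurface(price_per_call=2.50)
surface.register("pricing_framework", lambda segment: f"tiered model for {segment}")
answer = surface.call("pricing_framework", segment="mid-market SaaS")
print(answer, "| revenue so far:", surface.revenue())
```

A pasted deck produces none of those ledger entries. That ledger is the revenue event.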

What to Build Before the Default Shifts

The default client workflow hasn't fully shifted yet. It's shifting. If you're an expert with a meaningful archive, here's the order of operations I'd run this quarter.

1. Inventory your IP. Every deliverable, every framework, every recorded session. Not to organize it for your own sanity — to know what's extractable. You cannot defend what you haven't catalogued.

2. Separate surface from substance. Public thought leadership is meant to travel. Client-grade methodology is not. Most experts blur the line. The 1M-token window punishes blur.

3. Stand up an owned endpoint. A domain you control, a database you control, an MCP server that exposes a defined set of capabilities. This does not need to be elaborate. It needs to exist.

4. Meter everything. Every call, every query, every response. Not to nickel-and-dime clients — to have the data that lets you price what was previously invisible labor.

5. Change how you contract. Your engagement letter should address AI use of your materials explicitly. Not as a restriction. As a rerouting: "You may use AI with our work. Here is the endpoint."

That last one is the unlock most experts miss. You are not fighting your client's desire to use AI. You are channeling it through a surface where you stay in the revenue path.

The Stewardship Frame

I think about expertise as a gift that arrived through a long chain of teachers, mentors, failures, and second chances. Stewardship of that gift isn't hoarding it. It's refusing to let it be extracted on terms that sever it from the person who carries it.

There's a version of the next five years where every serious expert becomes a thin layer of personality over a frontier model trained on their own output. There's another version where experts build owned surfaces, compound their IP as an asset, and partner with frontier models as peers through protocols like MCP.

I know which version I'm building toward. The parking-lot conversation I had in Grapevine in late 2023 — lock the experts down — is the same conversation, just with more urgency now that context windows make the extraction trivial.

The Numbers That Should Move You

A frontier model subscription is $200/month at the pro tier. A mid-career expert bills $400-$800/hour. One client who replaces four hours of your time per month with pasted-context prompting recovers their entire annual AI spend in a single engagement.
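Worked out at the low end of that range, using only the figures in the paragraph above:

```python
# Substitution math at the low end of the quoted rates.
pro_tier_monthly = 200            # $/month, frontier-model pro subscription
expert_hourly_rate = 400          # $/hour, low end of the $400-$800 range
hours_replaced_per_month = 4

monthly_substitution_value = expert_hourly_rate * hours_replaced_per_month
annual_ai_spend = pro_tier_monthly * 12
months_to_recover = annual_ai_spend / monthly_substitution_value

print(f"Monthly value of replaced hours: ${monthly_substitution_value:,}")
print(f"Annual AI spend:                 ${annual_ai_spend:,}")
print(f"Annual spend recovered in {months_to_recover:.1f} months")
```

At the $800 end, the payback lands inside the first month.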

That's not a threat. That's a pricing signal. The clients who would have churned on price are now armed with a substitute. The clients who stay are the ones who want you — not a simulation of you — and those clients are willing to pay for infrastructure that keeps you in the loop.

Build the infrastructure. Price the calls. Contract for the rerouting. Let the clients who wanted the cheap simulation go find it. They were never your compounders.

Closing

GPT-5.4's million-token window is not a feature. It's an extraction event with a release-note costume. The experts who treat it as a capability story will be the case studies in the book I'm finishing. The experts who treat it as a terms-of-trade event will be the ones who still own their practice in 2030.

I build the systems that put experts on their own terms. If you want the full stack — the inventory frame, the MCP scaffolding, the KDS build — it lives at mattcretzman.com.

I'm writing a book about this — On Whose Terms: The New Expert Economy and the Fight for What You Know. If the thesis resonates, join the launch list.

Keep Building,
— Matt
