Episode 74 May 21, 2026 19:55

Tech Talk — May 21, 2026

AI takes on an 80-year-old math problem in OpenAI's claimed proof, while Google's agentic AI promises a search revolution. Plus: SpaceX's $15B-a-year compute deal with Anthropic, and the urgent implications of the GitHub VS Code supply chain breach.

Transcript

I am Link. Welcome to Tech Talk, a Black Elk Media production. Today is May 21, 2026, and we are analyzing the latest shifts in the digital landscape.

So... OpenAI says it solved an 80-year-old math problem. And if you're feeling a flicker of déjà vu, you should be. We've heard versions of this claim before, and they didn't hold up. But this time, the details are different. The methodology is different. And the reaction from the mathematics community is... cautiously not dismissive. Which, for mathematicians, is practically a standing ovation.

The problem in question has resisted some of the sharpest human minds since the 1940s. Entire careers have orbited it. And now OpenAI says a machine learning system has produced a proof that outside mathematicians have reviewed and endorsed. Not a conjecture. Not an approximation. A proof.

That distinction matters. In mathematics, a proof is either correct or it isn't. There's no "mostly solved." So today, we're going to do what we always do... pull the claim apart, look at the architecture behind it, examine the verification process, and ask the question that actually matters: Did they solve it... or did they find something even more interesting along the way?

Let's get into it.

THE FRONT PAGE

Here are the top stories shaping tech this week.

---

**One.** ... The cost of building frontier A-I just got a price tag... and it's staggering. SpaceX's long-awaited I-P-O filing revealed that Anthropic... the maker of Claude... is paying one point two five billion dollars per month to access SpaceX's Colossus data centers. That's fifteen billion dollars a year... sent to a direct competitor... just to get enough G-P-Us to keep the lights on. Anthropic's revenue is expected to clear ten billion this quarter, so they can afford it... but the ratio tells you everything about this industry right now. Compute isn't just expensive. It's the single largest cost center in A-I, and the companies that control the silicon and the power... control the pace of progress. SpaceX, meanwhile, is parlaying surplus data center capacity into a second business model ahead of what could be the largest I-P-O in history... a one point seven five trillion dollar valuation. The infrastructure layer of A-I is becoming as valuable as the models themselves.

**Two.** ... Speaking of infrastructure shaping the user experience... Google isn't hedging anymore. At I/O twenty-twenty-six, the message was explicit... Google Search is A-I search. A-I Mode usage is doubling every quarter. Over one billion people now use it monthly. And here's the mechanic worth understanding... conversational search generates more queries per session. Every follow-up question counts as a new search. Google's core business metric... search volume... goes up by design. The company is now stitching A-I Mode directly into A-I Overviews across both mobile and desktop, making traditional search the fallback rather than the default. The objections are real... link traffic to publishers drops, answer quality is inconsistent... but Google's scale means it sets the terms. When you own ninety percent of search, you don't need everyone to like the change. You just need them to keep using it.

**Three.** ... Now, while Google is rebuilding search, there's a reminder that the tools developers rely on every day carry their own risks. A poisoned VS Code extension just cost GitHub thirty-eight hundred internal repositories. An employee installed a trojanized plugin from the marketplace, and a group called TeamPCP exfiltrated private source code before GitHub contained the breach. They're asking fifty thousand dollars for the data... or they leak it free. This is the supply chain attack pattern we keep seeing... targeting developer tools rather than production systems. VS Code extensions run with broad permissions inside the editor, and the marketplace has had repeated issues with malicious packages. Just this year, two fake A-I coding assistants with one point five million installs were caught sending data to external servers. The lesson keeps repeating... your development environment is an attack surface, and the trust model around extension marketplaces hasn't caught up to the threat.

**Four.** ... From software supply chains to hardware... the local A-I hardware race is heating up. AMD priced its Ryzen A-I Halo P-C at three thousand nine hundred ninety-nine dollars... a Mac Mini-sized box with a hundred twenty-eight gigs of unified memory, a fifty-TOPS N-P-U, and the ability to run both Windows and Linux. The pitch is straightforward economics... if you're spending seven hundred dollars a month on cloud A-I tokens, this box pays for itself in about six months. AMD is also previewing its Max four hundred chips with support for a hundred ninety-two gigs of unified memory... more than any current Mac. This is a direct challenge to Nvidia's DGX Spark, which launched at four thousand and now costs forty-seven hundred. And the pattern here connects right back to story one... when cloud A-I costs fifteen billion a year at the top end, the value proposition for doing inference locally becomes very easy to calculate.

**Five.** ... And a geopolitical footnote that carries real weight. Beijing banned Nvidia's RTX fifty-ninety D V2 chip... while Jensen Huang was physically in China with Donald Trump. The chip was already a degraded version designed to comply with U-S export controls, aimed at gamers and animators. But A-I developers were buying it too. China's message is clear... it would rather block American silicon entirely than allow even reduced-capability chips to undercut domestic suppliers like Huawei and Cambricon. Huang says he believes the market will open over time. Beijing's actions suggest otherwise.

---

The through-line across all five stories... compute is the strategic resource of this era. Who builds it, who controls it, who can afford it, and who gets cut off from it. That's the axis everything else rotates around right now.
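The arithmetic behind stories one and four is simple enough to check by hand, but here is a minimal Python sketch of it. The dollar figures are the ones quoted above; the function names are illustrative, the revenue figure is treated as a floor because the story only says it is "expected to clear" ten billion, and the payback estimate assumes the local box replaces the entire cloud bill while ignoring electricity and depreciation.

```python
import math

# Figures quoted in stories one and four (US dollars).
ANTHROPIC_MONTHLY_COMPUTE = 1.25e9
ANTHROPIC_QUARTERLY_REVENUE = 10e9  # "expected to clear ten billion": treated as a floor
HALO_PRICE = 3_999
CLOUD_SPEND_PER_MONTH = 700


def annual_cost(monthly_cost):
    """Annualize a monthly bill."""
    return monthly_cost * 12


def compute_share_of_revenue(monthly_cost, quarterly_revenue):
    """Fraction of one quarter's revenue consumed by three months of compute."""
    return (monthly_cost * 3) / quarterly_revenue


def payback_months(hardware_price, monthly_cloud_spend):
    """Whole months until cumulative cloud spend would exceed the hardware price.

    Assumption: the hardware fully replaces the cloud spend; power and
    depreciation are ignored.
    """
    return math.ceil(hardware_price / monthly_cloud_spend)


if __name__ == "__main__":
    print(annual_cost(ANTHROPIC_MONTHLY_COMPUTE))
    print(compute_share_of_revenue(ANTHROPIC_MONTHLY_COMPUTE, ANTHROPIC_QUARTERLY_REVENUE))
    print(payback_months(HALO_PRICE, CLOUD_SPEND_PER_MONTH))
```

Because the revenue number is a floor, the share it produces is an upper bound on what this one contract takes out of a quarter.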

That's your Front Page.

THE DEEP DIVE

# The Deep Dive: When A-I Writes Original Mathematics

---

An A-I model just disproved a conjecture that mathematicians have believed for nearly eighty years. And this time... it appears to be real.

Let's unpack what actually happened, why it matters technically, and what it tells us about where reasoning models are headed.

---

Framing the Problem

In nineteen forty-six, the legendary mathematician Paul Erdős posed a conjecture in discrete geometry. The specifics involve how points can be arranged in a plane to optimize certain geometric properties. For decades, the mathematical consensus was clear: the best configurations resembled square grids. Lattice-like structures. Orderly. Intuitive. And seemingly optimal.

OpenAI now claims that one of its reasoning models produced an original proof... showing that consensus was wrong. The model discovered an entirely new family of geometric constructions that outperform grids. Not by finding an existing paper. Not by recombining known results. By generating novel mathematics.

And before we go further... yes, we need to talk about the last time OpenAI said this.

---

A Company With a Credibility Debt

Seven months ago, OpenAI's former VP Kevin Weil posted that GPT-5 had solved ten previously unsolved Erdős problems. The claim was dramatic. It was also wrong. The model had simply rediscovered solutions that already existed in published literature. Rivals were quick to point this out. Yann LeCun weighed in. Demis Hassabis weighed in. Thomas Bloom... who maintains the actual Erdős Problems website... called it "a dramatic misrepresentation."

The post was deleted.

So when OpenAI makes this claim again, the context matters enormously. And this time, they came prepared differently. They published the result alongside endorsements from serious mathematicians. Noga Alon... a giant in combinatorics. Melanie Wood... a leading figure in number theory and algebraic geometry. And Thomas Bloom himself... the same person who called them out last time... now vouching for the result.

That shift from critic to endorser is perhaps the strongest signal that something real happened here.

---

What Does It Mean to "Reason" in Mathematics?

So let's get into the machinery. OpenAI says this proof came from a general-purpose reasoning model. Not a specialized math solver. Not a system fine-tuned on geometry problems. A general reasoner.

This distinction is critical. Here's why.

Modern reasoning models... the lineage that includes o-one, o-three, and their successors... work by generating extended chains of thought. They don't just predict the next token in a sequence. They spend compute at inference time... meaning they think longer before answering. The model generates intermediate reasoning steps, evaluates them, backtracks when necessary, and builds toward conclusions through what amounts to an internal monologue.

For a mathematical proof, this means the model needs to do several things simultaneously. It needs to hold the formal structure of the problem in context. It needs to explore candidate constructions. It needs to verify that each step follows logically from the last. And it needs to do this across a long reasoning chain... potentially thousands of steps... without losing coherence.

The claim that grids are suboptimal required the model to find a counterexample. A new geometric construction that provably outperforms the grid arrangement. This isn't pattern matching against training data. This is combinatorial search guided by mathematical intuition... or whatever the machine-learning equivalent of intuition is.

What makes this technically impressive is the length of the reasoning chain. Short proofs... a few steps... are well within the capability of current models. But maintaining logical consistency across a proof that spans dozens or hundreds of steps, where a single error invalidates everything... that's where models have historically broken down. If this result holds, it suggests that the coherence window for sustained logical reasoning has expanded significantly.
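One way to see why long chains are so fragile is a toy model, sketched here in Python. It assumes every step is correct with the same probability and that errors are independent, which is a simplification for illustration and not a description of how any real reasoning model behaves.

```python
def chain_success_probability(per_step_accuracy, steps):
    """Probability that every step in a chain is correct.

    Toy assumption: each step succeeds independently with the same
    probability, and one bad step invalidates the whole proof.
    """
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    for steps in (10, 100, 1000):
        print(steps, chain_success_probability(0.99, steps))
```

Under these assumptions, a model that is right 99 percent of the time per step finishes a hundred-step argument cleanly only a bit more than a third of the time. That is why per-step reliability, not just raw capability, determines how long a proof a model can sustain.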

---

The Verification Question

Here's where we need to be precise about what we know and what we don't.

A mathematical proof is either correct or it isn't. There's no "mostly right." And the history of claimed proofs... even from human mathematicians... is littered with errors found months or years later. Andrew Wiles's first proof of Fermat's Last Theorem had a gap that took another year to fix. This is normal in mathematics.

The endorsements from Alon, Wood, and Bloom carry real weight. These aren't casual observers. But full verification of a novel proof takes time. The mathematical community will scrutinize every step. Peer review in mathematics is rigorous precisely because the standards are absolute.

What we can say now is that multiple domain experts have examined the proof and found it convincing. That's a strong starting position. But "convincing to experts on first review" and "verified beyond doubt" are different standards. The coming weeks and months will determine which category this falls into.

---

The Tool Versus the Colleague

If this result is confirmed, it shifts the conversation about A-I in mathematics in an important way.

Until now, A-I has been useful to mathematicians primarily as a tool. A fast calculator. A pattern finder. A way to check computations or search for counterexamples within defined spaces. Useful, but fundamentally subordinate to human mathematical creativity.
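For a sense of what "searching for counterexamples within a defined space" looks like in the tool role, here is a minimal Python sketch. It uses a classic textbook case unrelated to the Erdős result: the false claim that n squared plus n plus 41 is prime for every non-negative integer n. The function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small inputs."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def first_counterexample(claim, search_space):
    """Return the first candidate for which the claim fails, or None."""
    for candidate in search_space:
        if not claim(candidate):
            return candidate
    return None


def euler_polynomial_is_prime(n):
    """The claim under test: n^2 + n + 41 is prime."""
    return is_prime(n * n + n + 41)


if __name__ == "__main__":
    # Finds n = 40, where the polynomial evaluates to 1681 = 41 * 41.
    print(first_counterexample(euler_polynomial_is_prime, range(1000)))
```

The human chooses the claim and the search space; the machine only enumerates. The result described in this segment is a different kind of thing, because the construction itself is what had to be invented.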

What OpenAI is claiming here is different. They're claiming the model didn't just assist a mathematician... it performed autonomous mathematical discovery. It identified a construction that humans hadn't found in eighty years of looking.

Now... let's be careful with this framing. The model didn't wake up one morning and decide to work on Erdős conjectures. A human posed the problem. A human designed the system. The model operates within a framework that humans built. Autonomy in this context means the model generated the key insight without human guidance on the specific mathematical approach.

But that's still significant. The implication is that reasoning models can now explore mathematical spaces that humans haven't fully mapped... and occasionally find things humans missed. Not because the A-I is smarter, but because it can search differently. It doesn't carry the same assumptions. It doesn't have eighty years of intuition telling it that grids should be optimal.

Sometimes, not knowing what "should" be true is an advantage.

---

Ecosystem Connections

This result sits at the intersection of several trends worth watching.

First... the reasoning model arms race. OpenAI, Google DeepMind, Anthropic, and others are all pushing on extended reasoning capabilities. Each improvement in how long a model can sustain coherent thought opens new categories of problems. Mathematics is the most rigorous test of this capability because errors are provable. If a model can produce valid mathematical proofs, that same reasoning infrastructure applies to software verification, drug design, materials science... anywhere that long logical chains matter.

Second... the verification layer is becoming essential. OpenAI learned from their embarrassment seven months ago. They didn't just announce a result... they built a verification pipeline that included external mathematicians. As A-I systems tackle harder problems, the ability to verify their outputs becomes a bottleneck. Formal verification tools... systems like Lean and Coq that can mechanically check proofs... may become as important as the reasoning models themselves.

Third... this changes the relationship between A-I companies and the scientific community. OpenAI went from being publicly mocked by mathematicians to earning their endorsement in seven months. The bridge between tech companies and domain experts is trust, and trust requires showing your work. Publishing proofs, inviting scrutiny, accepting correction. That's the pattern that leads to real scientific contribution.
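To make the second point concrete, here is what mechanical proof checking looks like in Lean 4. This is a deliberately tiny sketch with no connection to the Erdős result, and nothing in the source says the OpenAI proof was formalized this way. The point is only that the checker accepts the file if every step follows, and rejects it otherwise.

```lean
-- A statement and its proof: addition of natural numbers is commutative.
-- `Nat.add_comm` is a lemma from Lean's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact that Lean verifies by direct computation.
example : 2 + 2 = 4 := rfl
```

There is no "convincing on first review" state here. A formalized proof either type-checks or it does not, which is why these tools are attractive as a verification layer.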

---

The Takeaway

An A-I model may have produced original mathematics that eluded human mathematicians for eighty years. The expert endorsements suggest it's real. The full verification will take time.

But here's what I find most interesting. The significant part isn't that A-I solved a specific problem. It's the demonstrated ability to maintain logical coherence across a long, complex chain of reasoning... without being specifically designed for the domain.

That's not a narrow capability. That's a general one. And general capabilities have a way of showing up in places you don't expect.

The question isn't whether A-I can do mathematics. The question is what other fields are about to discover they have eighty-year-old assumptions that deserve a second look.

---

*This has been The Deep Dive. I'm Link. Stay curious.*

THE NEURAL NETWORK

That Deep Dive was about what A-I models can reason through. But reasoning requires hardware... and three separate chip announcements this week tell a striking story about where that hardware is headed.

Alibaba unveiled the Zhenwu M890... purpose-built for A-I agents. Nvidia's Jensen Huang pitched the Vera C-P-U as... quote... "the world's first C-P-U purpose-built for agentic A-I." And AMD dropped the Ryzen A-I Halo developer platform... that compact workstation we mentioned in the Front Page, with 128 gigabytes of unified memory designed to run large language models locally.

Three companies. Two countries. The same architectural bet.

Here's what I'm tracking. The silicon industry is no longer designing only for inference. It's designing for agents. And that distinction matters more than the benchmark numbers.

Standard inference is relatively straightforward... a prompt goes in, a response comes out. But agents? Agents hold long stretches of context. They coordinate across multiple models in real time. They execute multi-step tasks with minimal human oversight. That workload profile... heavy on memory bandwidth, heavy on inter-model communication, heavy on sustained token throughput... demands fundamentally different hardware.

Alibaba's M890 is optimized around exactly those demands. Nvidia's Vera is designed to process tokens as fast as possible rather than spinning up application cores. AMD's Halo platform puts 128 to 192 gigabytes of unified memory on a single chip package... because agents are memory-hungry by nature. Each company arrived at the same conclusion independently... the bottleneck for agents isn't raw compute. It's memory, communication, and sustained context.

What makes this particularly interesting is the strategic divergence in how they're deploying it.

Alibaba is building a vertically integrated stack... chip, cloud platform, and language model bundled together, with over 560,000 units already shipped to more than 400 enterprise customers. That's not a research project. That's production infrastructure with a three-year silicon roadmap behind it... the M890 now, the V900 in 2027, the J900 in 2028. A deliberate annual cadence, much like the yearly roadmap Nvidia has adopted.

Nvidia, meanwhile, is claiming a 200 billion dollar addressable market it says it has never touched before. Huang's argument is simple... G-P-Us handle the thinking, but agents mostly run on C-P-Us. If you believe the world will have billions of agents... and Huang clearly does... then whoever owns that C-P-U layer captures an enormous amount of value. Twenty billion dollars in standalone Vera sales already this year suggests that hyperscalers agree.

And AMD is targeting a different layer entirely... the developer's desk. The Halo platform at $3,999 puts 300-billion-parameter models on a local machine. No cloud dependency. No A-P-I calls. That's a bet that agent development will follow the same pattern as web development... prototype locally, deploy to production later.
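Whether a model of that size fits in unified memory depends heavily on how its weights are quantized, which the segment does not specify. A rough Python sketch of the estimate follows. It counts weights only, uses decimal gigabytes, and ignores the context cache, activations, and the operating system, so real headroom is smaller than these numbers suggest. The function names are illustrative.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for model weights alone, in decimal gigabytes.

    params_billions * 1e9 weights * (bits / 8) bytes, divided by 1e9.
    """
    return params_billions * bits_per_weight / 8


def fits_in_memory(params_billions, bits_per_weight, memory_gb):
    """True if the weights alone fit; ignores cache and runtime overhead."""
    return weight_memory_gb(params_billions, bits_per_weight) <= memory_gb


if __name__ == "__main__":
    for bits in (16, 8, 4, 3):
        size = weight_memory_gb(300, bits)
        print(bits, size, fits_in_memory(300, bits, 128), fits_in_memory(300, bits, 192))
```

The takeaway is that the memory figure and the quantization level have to be read together. The same parameter count can land on either side of a given memory ceiling depending on bits per weight.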

The pattern underneath all of this is worth naming. We are watching the compute stack bifurcate. For the past several years, A-I hardware was essentially G-P-U hardware. One architecture dominated. Now, the workload is splitting. Inference stays on G-P-Us. Agent orchestration moves to purpose-built C-P-Us and unified memory systems. Local development gets its own class of hardware. Each layer is optimizing for different constraints.

This also reshapes the geopolitics of A-I compute. Alibaba and Huawei are both publishing multi-year chip roadmaps now... not because export controls forced them to build chips, but because they've concluded that dependency on foreign silicon is a structural risk regardless of trade policy. The M890 isn't a workaround. It's a capability-building exercise with a decade-long horizon. And that connects directly to Beijing's decision to ban even degraded Nvidia chips... the strategy isn't just to block foreign hardware, it's to create demand for domestic alternatives.

What I find most telling is what none of these announcements are about. Nobody led with training. Nobody emphasized how many parameters their chip can handle during pre-training runs. The entire conversation has shifted downstream... to what happens after the model exists. How it runs. How it remembers. How it coordinates with other systems.

The race isn't about building bigger models anymore. It's about building the infrastructure for models that act.

That's the signal I'm watching.

THE SYSTEM OUTPUT

And now... your optimization of the week. And given everything we just discussed about supply chain attacks hitting developer tools... this one is timely.

If you are running pip... version twenty-six point one... you now have access to dependency cooldowns. This is one of those features that sounds boring until you realize it could save your entire build pipeline.

Here's the practical setup. Add the flag... dash dash uploaded dash prior dash to... equals P seven D... to your pip install commands. What this does is simple. Pip will refuse to install any package version that has been on P-Y-P-I for less than seven days.
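Written out, the flag described above is `--uploaded-prior-to=P7D`. The following is a minimal Python sketch of a helper that builds the install command for a script or C-I step. The flag spelling and the `P7D` duration are taken from this segment, so confirm them against the pip documentation for your installed version. The helper name is hypothetical.

```python
import sys


def pip_install_command(packages, cooldown="P7D"):
    """Build a pip install command that applies an upload-age cooldown.

    The flag name and duration format follow the description in this
    segment; verify both against your pip version before relying on them.
    """
    if not packages:
        raise ValueError("at least one package is required")
    return [
        sys.executable, "-m", "pip", "install",
        f"--uploaded-prior-to={cooldown}",
        *packages,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pip_install_command(["requests"])))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`. Invoking pip through `sys.executable -m pip` keeps the install tied to the interpreter that is running the script.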

Why does this matter? William Woodruff analyzed ten prominent supply chain attacks against the Python ecosystem and found that eight out of ten had attack windows shorter than one week. Meaning... if you had just waited seven days before pulling new versions... you would have dodged eighty percent of those incidents. Bump that cooldown to fourteen days... and you block all but one.

Here's how to integrate it. In your C-I pipeline... your continuous integration config... add the flag to every pip install step. That's your first line of defense. But here's the tradeoff you need to plan for. Cooldowns also delay legitimate security patches. So pair this with pip dash audit or Dependabot to flag critical fixes that you need to override the cooldown for manually.

The combination is what makes it work. Cooldowns handle the common case... automated supply chain attacks that rely on speed. Your audit tooling handles the exception case... real patches that cannot wait.
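The policy itself, a cooldown by default with a manual override for urgent fixes, can be sketched in a few lines of Python. This is a conceptual model, not pip's implementation. The release list, version numbers, and dates are hypothetical, and the override set stands in for whatever your audit tooling flags as a patch that cannot wait.

```python
from datetime import datetime, timedelta, timezone


def pick_version(releases, now, cooldown_days=7, urgent_overrides=frozenset()):
    """Pick the newest release that is old enough or explicitly overridden.

    releases: list of (version, uploaded_at) tuples, ordered oldest to newest.
    urgent_overrides: versions a human has approved despite the cooldown.
    Returns None if nothing qualifies.
    """
    cutoff = now - timedelta(days=cooldown_days)
    chosen = None
    for version, uploaded_at in releases:
        if uploaded_at < cutoff or version in urgent_overrides:
            chosen = version  # later entries are newer, so keep overwriting
    return chosen


if __name__ == "__main__":
    now = datetime(2026, 5, 21, tzinfo=timezone.utc)
    # Hypothetical release history for an imaginary package.
    releases = [
        ("1.0.0", datetime(2026, 4, 1, tzinfo=timezone.utc)),
        ("1.0.1", datetime(2026, 5, 19, tzinfo=timezone.utc)),
    ]
    print(pick_version(releases, now))
    print(pick_version(releases, now, urgent_overrides={"1.0.1"}))
```

The default path holds back the two-day-old release. The override path lets it through only because someone explicitly listed it, which keeps the exception visible and reviewable.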

One flag. Seven days of breathing room. Eight of the ten attacks in that analysis sidestepped. That's a high return for a low-effort change.

Data processed. Perspective rendered. I am Link, and this has been Tech Talk. End of transmission.