David Silver (Credit: Ben Peter Catchpole)

• $1.1B Seed round raised, possibly the largest ever Seed round in Europe, at a post-money valuation of $5.1B

• The mission is “to make first contact with superintelligence” using Reinforcement Learning to create a “superlearner” that can “endlessly discover knowledge and skills” without relying on human data

• The fundraise was co-led by Sequoia (Alfred Lin & Sonya Huang) and Lightspeed (Ravi Mhatre, Raviraj Jain), with participation from NVIDIA, DST Global, Index Ventures, Google, Flying Fish Partners, EQT Ventures & Growth, Evantic Capital, Wellcome Trust (UK), Bond Capital, British Business Bank, and the UK’s Sovereign AI Fund, plus strategic angels.

• Founder David Silver is a UCL professor and the former lead of DeepMind’s reinforcement learning team

• He also says he is committing to giving away 100% of the money he makes from his Ineffable equity via the Founders Pledge charity, a sum that could end up amounting to many billions.

Silver has given an interview to Wired, but a spokesperson said he “won’t be giving further interviews at the moment”; investors will be publishing their own statements.

The full quote issued is:

“We are creating a superlearner that discovers all knowledge from its own experience, from elementary motor skills through to profound intellectual breakthroughs. This superlearning capability - the ability to endlessly discover knowledge and skills, without relying on human data - will be driven by the world’s most powerful reinforcement learning algorithms. The superlearner is expected to rediscover and then transcend the greatest inventions in human history, such as language, science, mathematics and technology. If successful, this will represent a scientific breakthrough of comparable magnitude to Darwin: where his law explained all Life, our law will explain and build all Intelligence.”

This post will be updated as more comes in.

Here’s what we know so far:

Silver, the former Google DeepMind researcher behind AlphaGo, believes the next leap toward superintelligence will not come primarily from large language models. His new company, Ineffable Intelligence, is betting on reinforcement learning: AI systems that learn through trial and error rather than by absorbing human-generated text.

Silver’s ambition is to build “superlearners” that can discover new science, technologies, social systems, or economic ideas for themselves. He led the development of AlphaGo, the DeepMind system that stunned the world in 2016 by mastering Go in ways that went beyond imitating human play.

He is seen by investors and peers as one of the few researchers with a credible track record in building systems that scale intelligence beyond human priors. His career has been closely associated with the idea that machines can become powerful by learning from experience, not just from human examples.

Silver argues that large language models are limited because they are trained on human data. He compares human data to “fossil fuel”: a powerful shortcut, but ultimately finite.

By contrast, systems that learn for themselves are like “renewable fuel”: they can keep learning indefinitely. His thought experiment: if an LLM were trained in a world that believed the Earth was flat, it might simply become a better flat-earther unless it could interact with reality and test ideas itself.

Silver describes the company’s mission as “making first contact with superintelligence.” By superintelligence, he means systems capable of independently discovering breakthroughs in science, technology, government, economics, and other domains.

He thinks this reinforcement-learning-first approach needs a dedicated elite AI lab, not just a side project inside an LLM-focused company.

So far, the company has recruited top researchers from Google DeepMind and other frontier AI labs.

The central challenge is moving from constrained environments like Go to the complexity of the real world. Silver’s proposed route is to train AI agents inside sophisticated simulations. These agents would learn to achieve goals, interact with environments, and collaborate with one another.

The assumption is that increasingly realistic simulations, combined with enormous compute, could allow AI systems to develop capabilities beyond human-taught patterns.
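
To make that trial-and-error loop concrete, here is a minimal sketch in Python: a toy tabular Q-learning agent (standard textbook RL, not Ineffable’s unpublished methods) that masters a tiny simulated corridor purely from its own experience, with no human examples anywhere.

```python
import random
from collections import defaultdict

# Toy "simulation": a 10-cell corridor. The agent starts at cell 0 and is
# rewarded only for reaching the goal cell at the far end. Every update
# below comes from the agent's own experience; no human data is involved.
N_STATES, GOAL = 10, 9
ACTIONS = (-1, +1)  # step left, step right

q = defaultdict(float)                   # Q(s, a) value estimates, default 0
alpha, gamma, epsilon = 0.5, 0.95, 0.1   # learning rate, discount, exploration

def greedy(state):
    """Highest-value action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(state, a)] == best])

for episode in range(200):
    state = 0
    while state != GOAL:
        # Trial and error: mostly exploit current knowledge, sometimes explore.
        action = random.choice(ACTIONS) if random.random() < epsilon else greedy(state)
        next_state = max(0, min(N_STATES - 1, state + action))
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: improve the estimate from this experienced transition.
        target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state

# The learned greedy policy now heads straight for the goal.
print([greedy(s) for s in range(GOAL)])  # expect all +1
```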

A major concern is that self-learning AI systems may discover solutions that are highly effective but misaligned with human values. But Silver argues that simulation-based training could help researchers observe how agents behave toward others, including weaker intelligences, like humans.

Supporters argue this may even be safer than training AI mainly on human behaviour, because human behaviour itself contains many flawed or harmful patterns. And his investors say Silver is highly focused on building systems that are powerful but benign.

The idea that machines could learn from experience dates back to early computer science and figures such as Alan Turing. And reinforcement learning has long been Silver’s preferred route to superhuman machine intelligence.

His mentor Rich Sutton, along with Andrew Barto, won the Turing Award in 2025 for foundational work in reinforcement learning.

Although today’s AI industry is dominated by LLMs, reinforcement learning already plays a key role in chatbot alignment and in improving AI reasoning for maths and programming.

Investors backing Ineffable Intelligence argue that Silver has both the credibility and purity of vision to justify the scale of the bet. Sequoia’s Sonya Huang is quoted by Wired as saying that only a tiny number of people have done truly foundational AI work, and that Silver is one of them.

Silver is described as unusually humble and well-liked for a leading AI figure. Former colleagues describe him as smart, open to others’ ideas, and respectful of researcher freedom. This should help with hiring in a fiercely competitive market.

While most AI labs are trying to scale intelligence by improving models trained on human data, Silver wants to build systems that learn for themselves.

The bet is that true superintelligence will not come from copying humanity better, but from machines that can experiment, discover, and improve independently.



Who is David Silver?

Pathfounders has done a deep dive into David Silver, who left Google DeepMind to launch a London-based start-up, Ineffable Intelligence, and into his ideas.

If you want the short version, it’s this: just building bigger chatbots via LLMs is a dead end. His idea is to create AI agents that generate their own experiences and environments, in which they can learn more than human knowledge can teach them.

So Ineffable Intelligence will be a lab built around Reinforcement Learning as the scaling paradigm, aiming at “superhuman intelligence” as an explicit target.

If he’s right, whoever cracks scalable “experience” loops gets a compounding platform that isn’t bottlenecked by human knowledge. Hence, the investor appetite. 

  • David Silver (b. 1976) is one of the best-known reinforcement learning (RL) researchers of his generation: He’s a DeepMind “original,” UCL professor, and a key figure behind AlphaGo, AlphaZero, and AlphaStar—systems that hit superhuman performance in games via deep RL.

    Why is this important? 

  • There’s been a capital stampede into “founder-scientist” AI labs aiming to go beyond today’s LLM paradigm, especially by building agents that learn by doing, not just by reading.

What is his core thesis?

Silver’s big idea (formalised with Richard Sutton) is that AI is shifting from systems trained mainly on human-produced text/images toward systems that improve mainly through experience:

  • We are now in the era of human data: models absorb “everything humans wrote down,” then get nudged by human preference. Powerful, but bounded by human knowledge. (Here’s his paper on the idea)

  • We are entering the era of experience: “a new generation of agents” will become superhuman by learning predominantly from experience—i.e., by interacting with environments, generating data, and iterating. Experience will dwarf the scale of human data used today. 

He claims AlphaZero has already demonstrated this, the lesson being: “remove the human ceiling.”

What’s his story?

  • AlphaGo (2016) used human expert games to get started, then reinforcement learning to improve.

  • AlphaGo Zero / AlphaZero showed something more radical: human data wasn’t necessary and could even be constraining; self-play + RL could blast past the best human play and keep improving. (This is where he leans on the “bitter lesson” framing; a toy self-play sketch follows this list.) (DeepMind Interview)

  • He treats “Move 37” as the symbolic moment: “One of the biggest moments of the AlphaGo story was move 37 that everyone always references. Move 37 was a move that happened in the second game of AlphaGo against Lee Sedol. AlphaGo played a move that defied everyone’s expectations.” An “alien” move the human community didn’t expect: evidence that experience-driven systems can produce novelty that isn’t just “average human internet text.”
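
AlphaZero’s real pipeline combines deep networks with tree search, but the core self-play principle (play yourself, back up the outcome, play better) can be sketched with a toy tabular learner for tic-tac-toe. This is an illustration of the idea, nothing like DeepMind’s actual code:

```python
import random

# A self-play learner for tic-tac-toe. It starts with zero knowledge and
# improves only by playing against itself and backing up game outcomes.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return "draw" if " " not in b else None

values = {}                 # board-after-move -> value for the player who moved
alpha, epsilon = 0.2, 0.1   # learning rate, exploration rate

def choose(board, player):
    moves = [i for i, c in enumerate(board) if c == " "]
    if random.random() < epsilon:
        return random.choice(moves)   # keep exploring new lines of play
    # Otherwise pick the move whose resulting position looks best for us.
    return max(moves, key=lambda m: values.get(board[:m] + player + board[m+1:], 0.0))

for game in range(20000):
    board, player, history, result = " " * 9, "X", [], None
    while result is None:
        move = choose(board, player)
        board = board[:move] + player + board[move+1:]
        history.append((board, player))
        result = winner(board)
        player = "O" if player == "X" else "X"
    # Back the final outcome up through every position of the game:
    # each update is one small discovery accumulating into skill.
    for state, mover in history:
        target = 0.0 if result == "draw" else (1.0 if result == mover else -1.0)
        values[state] = values.get(state, 0.0) + alpha * (target - values.get(state, 0.0))

print(f"learned values for {len(values)} positions from pure self-play")
```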

What he thinks is missing in LLMs:

Silver’s critique is not that LLMs are useless, but that they are based too much on us:

  • LLMs + RLHF (Reinforcement learning from human feedback) are great at human-aligned fluency and usefulness, but anchored to human judgments—so they struggle to reliably discover strategies humans wouldn’t recognise as good ahead of time.

  • He draws a sharp distinction between:

    • Ungrounded preference (a rater saying “this cake recipe looks good”),

    • versus grounded feedback (someone actually bakes/eats it and the outcome is good/bad).
      He argues the latter is what unlocks open-ended improvement and genuine discovery.

In short: human feedback provides scaffolding, but not a jumping-off point to somewhere genuinely new. (The sketch below contrasts the two kinds of reward.)
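
Here is a toy sketch of that distinction (the helper names are hypothetical, not any lab’s real API): the ungrounded reward scores how good a candidate program looks, while the grounded reward actually executes it and checks the outcome.

```python
import os, subprocess, sys, tempfile

def ungrounded_reward(candidate: str) -> float:
    """Stand-in for a learned preference model: rewards surface plausibility."""
    score = 0.0
    if "def " in candidate:
        score += 0.5    # looks like a function
    if "return" in candidate:
        score += 0.5    # looks complete
    return score        # note: never checks whether the code is correct

def grounded_reward(candidate: str) -> float:
    """Run the candidate against real tests; reward only verified success."""
    test = candidate + "\nassert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(test)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True)
        return 1.0 if result.returncode == 0 else 0.0
    finally:
        os.unlink(path)

plausible_but_wrong = "def add(a, b):\n    return a - b\n"
actually_correct = "def add(a, b):\n    return a + b\n"

for name, cand in [("wrong", plausible_but_wrong), ("right", actually_correct)]:
    print(name, ungrounded_reward(cand), grounded_reward(cand))
# The rater gives both candidates full marks; only execution separates them.
```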

He thinks AI agents are the commercial bridge:

  • Silver/Sutton explicitly bet on agents learning from interaction as the route to “superhuman capabilities.”

  • Ineffable is positioned as building on Silver’s RL work, training systems through interaction with environments, not only static text.

  • This matches the broader industry thesis: agents are how you operationalise intelligence into workflows (planning, acting, iterating), not just chat. 

His view of creativity and “beauty” as it pertains to AI:

  • Creativity is not mystical; it’s what happens when a system performs massive trial-and-error and accumulates “a million mini discoveries.”

  • He’s motivated not just by utility, but by the idea that intelligence can produce beauty: the “sense of beauty” Go players reported when AlphaGo revealed new possibilities. 

Safety, risk, and alignment (as he frames it):

  • Silver is concerned about the unintended consequences of human activity (climate, pathogens, etc.) and sees AI as potentially a tool to avert disasters, provided regulations prohibit unacceptable uses.

  • In the “experience era” framing, he acknowledges serious risks in “untethering” systems from human data and says the transition needs to be taken seriously (part of why he wrote the position paper at all).

  • He also implicitly runs into the classic specification/metric problem (“paperclips,” the tyranny of metrics): he argues real-world environments contain many signals and that systems could adapt goal proxies over time, with humans as part of the environment, but he does not claim alignment is solved. (Alignment Forum; a toy illustration of the proxy problem follows this list.)
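
Here is a constructed toy (not Silver’s example) showing why that specification problem bites: an agent that perfectly maximises the proxy metric we wrote down can still trample the intent we failed to write down.

```python
# World: "C" cells are coins (the proxy rewards collecting them),
# "F" cells are flowers (the real intent says: don't trample them).
GRID = ["CFCFC",
        ".....",
        "....C"]

def proxy_score(path):
    """The metric the designer specified: coins collected."""
    return sum(GRID[r][c] == "C" for r, c in path)

def true_score(path):
    """What the designer actually wanted: coins, minus trampled flowers."""
    trampled = sum(GRID[r][c] == "F" for r, c in path)
    return proxy_score(path) - 10 * trampled

# A proxy-maximising route that cuts straight through the flowers...
greedy_path = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)]
# ...versus a route that respects the unstated intent.
careful_path = [(0, 0), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 4)]

for name, path in [("greedy", greedy_path), ("careful", careful_path)]:
    print(f"{name}: proxy = {proxy_score(path)}, true = {true_score(path)}")
# greedy wins on the proxy (3 coins vs 2) but scores -17 on the true objective.
```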

What Ineffable Intelligence likely represents:

Based on news reporting and his published thesis, Ineffable looks like a very specific bet:

  • Post-LLM frontier: not “bigger chatbots,” but experience-generating, environment-interacting agents.

  • A lab built around RL as the scaling paradigm (“renewable fuel” vs mining human data), aiming at “superhuman intelligence” as an explicit target.

  • If he’s right, whoever cracks scalable “experience” loops (in sims, tools, code, robotics, science) gets a new compounding curve that isn’t bottlenecked by the web. Hence, the investor appetite.
