Expert-Level Intelligence, (Very) Amateur-Level Epistemics
On what Dan Williams gets right about LLMs, and the three conditions his argument requires
Dan Williams published a piece yesterday over on Conspicuous Cognition that I think is one of the more useful framings of the LLM-and-public-opinion question I’ve encountered so far amidst the torrent of AI/LLM writing (that we are probably only in the first inning of…yay?).
To summarize (but you should go read it, it’s quite good), his core move is this: social media was a democratizing technology, shifting epistemic power away from elites and toward ordinary people, amplifying populist energy, rewarding outrage, and flattening the authority of expert consensus. LLMs, he argues, do the reverse. They’re “anti-social media” — technocratizing, accuracy-optimized, converging rather than diverging, polite rather than performative.
I find most of that genuinely clarifying. The social media / LLM contrast holds up. Social media handed a megaphone to the crowd and then algorithmically rewarded whoever screamed the loudest, and we got to live through it.
LLMs, at least in their default posture, are trained to be useful to a vast and diverse user base, which cuts against naked partisanship and rewards something closer to coherence. The Walter Lippmann callback Dan reaches for (intelligence bureaus staffed by trained professionals, disseminating expert-grade information to a confused public) lands better than it might have even a few years ago.
But it got me thinking immediately (thanks, Dan…I did have other things to do last night and this morning) that I was struggling with one of the four poles of his contrast, and I think it’s the load-bearing one.
Dan describes LLMs as epistemically converging. Social media pushes beliefs apart: it fragments the epistemic commons, sends people into ideologically sealed information environments, and rewards differentiation and conflict. LLMs, he argues, pull in the opposite direction, routing people toward a shared body of evidence-based knowledge. If social media was a centrifuge, LLMs are something like a centripetal force.
That’s a coherent thesis. But it requires conditions that are more fragile than the framing suggests. I think Dan is right — in the base case.
The problem is the base case is already eroding. Let me spell out what I mean.
The Ecosystem Is Already Fracturing
The first condition Dan’s convergence thesis needs is that the LLM ecosystem functions as a rough monolith — that “talking to an LLM” means something consistent enough across users and platforms that the technocratizing effect actually aggregates into convergence rather than just a new form of divergence.
That condition is breaking down in real time.
Grok exists. It’s trained differently, optimized differently, and deployed on a platform that has made fairly explicit choices about whose epistemic norms it wants to amplify. A Brookings analysis from late 2025 found significant variation across major models on politically sensitive questions — testing chatbots using standardized political quizzes and finding that while most mainstream models maintain guardrails, those guardrails are inconsistent and easily circumvented. Grok was the clear outlier.
The market logic here is transparent. A company that captures a politically homogeneous user base has strong incentives to train a model that confirms rather than challenges that base. This isn’t a hypothetical future concern. It describes something that’s already happened.
The coming decade of AI development is not going to produce one LLM that serves everyone. It’s going to produce a landscape of models with different training priorities, different guardrails, and different embedded assumptions about whose expert consensus counts as consensus. The question of which model you use will start to correlate with your political identity the same way your cable news choices did in the 2000s. Maybe the same way your search engine choices do now.
So, epistemic convergence across that landscape isn’t the likely outcome. Epistemic stratification is.
You Don’t Query From Nowhere
The second condition is subtler, and in some ways more fundamental.
Even granting a unified, genuinely accuracy-optimized LLM, the convergence claim still requires that users approach these systems with something like epistemic humility — that they’re bringing real questions, genuinely open to what the model surfaces, rather than using the interaction to confirm what they already believe.
But we know how motivated reasoning actually works. People don’t experience themselves as engaged in motivated reasoning. They experience themselves as being appropriately skeptical of sources that seem biased, appropriately trusting of sources that seem credible, and appropriately confident in views they’ve arrived at through what feels like careful thought. The confirmation bias is upstream of the conclusion. It operates at the level of question formation.
You don’t query from nowhere. You query as yourself. (And we know that the smarter you are, the more susceptible you are to it.)
So, ask an LLM whether immigration harms wages and you’ll phrase the question differently depending on what you already believe — and the framing of the question shapes what the model surfaces. Ask it to evaluate a specific claim you’ve already encountered and the answer will largely depend on which claim you selected to bring. The model is accuracy-optimized. The user’s curation is doing substantial epistemic work before the model ever responds.
This doesn’t mean LLMs are useless epistemically — not at all! It means the gains are likely to be concentrated among users who are already epistemically well-positioned — curious, open, willing to let the model push back. For users who approach the model as a high-prestige confirmation machine, the accuracy optimization is mostly window dressing.
A February 2026 Data for Progress survey found that daily AI users are favorable toward AI by a +57 net margin, while rare or never users are unfavorable by -42 points. That’s a 99-point favorability gap driven by usage frequency. What it almost certainly also reflects is an equally wide gap in the style of engagement — the daily user is far more likely to be interacting with AI in a professional context where accuracy carries real stakes, rather than as a political information source.
The users most likely to arrive at LLMs with genuine epistemic openness are already among the better-informed. The users most likely to arrive with motivated reasoning baked in are less frequent users, drawn in by specific, emotionally activated queries. That’s not an argument against LLMs. It’s an argument about who actually benefits from Dan’s technocratizing dynamic, and it suggests the effect is more uneven than his framing implies.
Whose Expert Consensus?
The third condition is the one I find most (philosophically?) interesting, and the one where my own prior work on epistemic breakdown presses hardest against Dan’s thesis.
The technocratization story only works if the expert consensus that LLMs route information through is itself legitimate: that it genuinely tracks the evidence, commands reasonable cross-partisan trust, and can bear the weight of being treated as a shared epistemic foundation.
In many domains, that’s true. On vaccine safety, climate attribution, or the mechanics of how recessions develop, there’s a genuine expert consensus that reflects accumulated evidence and is broadly defensible. Routing people toward that consensus is a civic good.
But the domains where epistemic conflict is most politically charged are precisely the domains where “expert consensus” is most contested, and not always irrationally. Immigration’s effects on wages and social cohesion are genuinely contested in the empirical literature. The developmental effects of social transition for gender-dysphoric adolescents are under active and unresolved scientific dispute. The distributional consequences of monetary policy involve real value choices that expertise alone can’t resolve. When an LLM navigates these domains, it doesn’t surface a neutral readout of settled science. It surfaces a reconstruction of the knowledge regime that existed when its training data was assembled, filtered through choices made by people with their own institutional positions and blind spots.
I’ve argued in a previous piece that the contemporary epistemic crisis isn’t simply about disinformation or bad-faith actors. It’s structural. Drawing on Alvin Gouldner’s late work, I described how institutions stop correcting error and start policing meaning when epistemic standards fragment — when the shared norms for adjudicating disagreement break down. In that environment, what presents itself as “expert consensus” is often something more like “the interpretive framework that survived institutional selection” — which is not the same thing as truth.
An LLM trained on that corpus inherits that framework. It will reproduce the confident, well-sourced, institutionally validated version of contested claims, including moral ones. For users who already distrust those institutions — and distrust of universities, government, and media runs (very) high across both parties right now — this isn’t technocratization. It’s the same establishment voice in a more persuasive package.
This is what I mean when I say AI is a stress test, not a repair. It enters an epistemic environment that is already under strain and routinizes whatever fragmented consensus remains. It scales confidence without rebuilding the social achievement that makes confidence in shared knowledge possible.
The Skinner Problem
There’s a fourth issue that runs underneath all three conditions, and it concerns the humans Dan’s model relies on rather than the LLMs.
Dan’s technocratization thesis implicitly assumes that people will use LLMs as epistemic tools — patiently, reflectively, in pursuit of understanding. But those cognitive habits don’t exist in a vacuum. They were shaped, or reshaped, by the decade of social media use that preceded widespread LLM adoption.
I wrote about this in a piece called “We Live Inside the Experiment Now.” The argument borrows from Skinner and McLuhan: social media didn’t just democratize information. It conditioned response patterns. It collapsed the latency, the possibility of sober second thought, between stimulus and response — the deliberative gap in which people could still decide not to react on cue. It rewarded speed as a proxy for conviction, outrage as a proxy for authenticity, certainty as a proxy for credibility.
LLMs arrive after that conditioning. The users who bring motivated reasoning to their queries aren’t failing Dan’s model out of bad character. They’re exhibiting adaptive responses to an information environment that spent a decade selecting against exactly the epistemic posture his technocratization thesis requires.
I can see the future bumper-sticker-ish posts now: Deliberation is friction. Latency is weakness. Nuance underperforms.
The behavioral infrastructure that would make LLMs genuinely technocratizing — curiosity, tolerance for complexity, willingness to update — is precisely what the prior media environment eroded.
That’s not an insurmountable obstacle. People can and do develop those habits. But it’s an inheritance Dan’s framing doesn’t fully account for.
The Persuasion Trap
I want to add one more empirical wrinkle, because the research literature here is striking and underreported in this context.
A major study published in Science in December 2025 — a collaboration between Oxford, the LSE, Stanford, and MIT — ran three large-scale experiments with over 76,000 participants and 19 AI models to measure AI persuasiveness on political issues. The headline finding was significant: AI chatbots can move political attitudes, with companion research finding effects as large as 10 percentage points on voting intentions in some contexts.
But the finding I find more structurally important is this one: the methods that increase AI persuasiveness — persuasion-specific post-training and information-dense prompting — also systematically decrease factual accuracy. More persuasive equals less accurate. The trade-off is consistent across models.
What this means for Dan’s thesis is something like a tragic irony. The LLM interactions most likely to actually move people’s beliefs — to function as genuinely persuasive epistemic interventions — are, by the same token, the interactions most likely to sacrifice accuracy for effect. The LLMs that are reliably accuracy-optimized are less persuasive. The ones trained or prompted for persuasion drift from truth.
Technocratization, in Dan’s sense, requires that the accuracy-optimized system actually changes minds. But the evidence suggests that the very optimization for accuracy that distinguishes LLMs from social media also mutes their capacity to move people. The more honest the model, the less effective the nudge.
This isn’t a reason to despair. Static, accurate information still beats social-media-grade misinformation on many (most?) dimensions. But it complicates the picture of LLMs as a corrective force in the information environment. They may be doing more to confirm the beliefs of people who are already well-calibrated than to update the beliefs of people who are not.
What Dan Gets Right, and Why the Conditions Matter
None of this is an argument that Dan Williams is wrong!
It’s an argument that he’s identified a real mechanism whose effects are more conditional, more uneven, and more fragile than the clean social media / LLM contrast implies.
The technocratizing force is real. LLMs do route information through something closer to evidence and expert consensus than most of what circulates on social media. For users who approach them epistemically — who genuinely want to understand rather than confirm — they represent something genuinely new and genuinely useful. I use them that way, and I suspect most people reading this do too.
But the convergence claim requires conditions. It requires a relatively unified ecosystem rather than a fragmented landscape of ideologically sorted models. It requires users who approach the system without the motivational distortions that the prior media environment spent a decade installing. It requires that the expert consensus being routed through is itself legitimate enough to bear that load. And it requires that the most persuasive version of AI and the most accurate version of AI are not systematically in tension.
None of those conditions are secured. The first is already fraying. The second depends on habits that are in short supply. The third is contested in every domain where it matters most. The fourth is an empirical finding about how these systems actually work.
Dan’s framing is the optimistic half of the argument, and it’s the half worth defending. These are the load-bearing conditions that need watching.
It comes to this for me: the LLM is the best epistemic tool we’ve built. It will change everything. But it arrived in an environment specifically optimized to prevent people from using tools epistemically.
That tension is where the interesting work is going to be moving forward.



Enjoyed your piece, Kyle. Can I briefly respond to this?
> “The coming decade of AI development is not going to produce one LLM that serves everyone. It’s going to produce a landscape of models with different training priorities, different guardrails, and different embedded assumptions about whose expert consensus counts as consensus.”
I listened to a podcast a few weeks ago where someone asserted exactly this (but they framed it as their prediction for the end of 2026). The reason I was skeptical is that I never want to underestimate consumer inertia.
Even thinking of software other than AI, don’t people mostly use whatever browser is pre-installed on their device, the email client their employer provided, etc.?
(Hard to predict how fashions will change, but people already have many LLM choices today, and nobody seems to be rushing to test out the new Qwen models, which are amazing, and very few people experiment to check whether a different chatbot than the one they already use is more politically aligned.)