
The AI Reality Check: Powerful Tool, Dangerous Mythology

By Ryan Mortensen, AI Specialist (Guest Contributor)

The story being told about artificial intelligence right now is not quite the truth. It is a story shaped by financial pressure, ideological ambition, and a genuine failure to understand what these systems are and what they are not. Getting this wrong has real consequences for businesses, for workers, and for the kind of society we end up with.

The hype has a business model.

Large language models are genuinely remarkable. The ability to generalize reasoning across entirely novel problems without retraining represents a qualitative leap beyond anything classical machine learning could do. A traditional fraud detection model trained on last year's patterns cannot suddenly write marketing copy or summarize a legal document. LLMs can, and that flexibility is legitimately valuable.

But that flexibility comes at extraordinary cost. Training GPT-4 is estimated to have consumed tens of millions of dollars in compute. Running inference at scale requires data center infrastructure that most companies cannot afford to own. These costs need justification, and the most legible justification on a spreadsheet is to replace human labor.

This is where the honest conversation ends and the mythology begins.

The clearest near-term ROI case for AI is narrow task automation, not wholesale workforce replacement. Companies that have moved too fast have learned this expensively. In 2023, IBM announced a pause on hiring for roles it expected AI to replace, then quietly walked back the timeline. Air Canada deployed an AI chatbot that told a grieving customer he could claim a bereavement fare refund after booking, contrary to the airline’s actual policy, and a Canadian tribunal ruled in 2024 that the company was responsible for its own chatbot’s output. DPD’s customer service AI went viral after it was manipulated into writing a poem criticizing the company itself. These are not edge cases. They are the expected behavior of systems deployed beyond their actual capability.

What LLMs actually are and are not.

The traditional rigor of data science included something important: a test set. You held back examples the model had never seen, measured its performance on those, and thereby tested whether it had genuinely learned something generalizable or merely memorized shortcuts. That discipline is largely gone with LLMs.
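For readers who have not worked with a held-out test set, the discipline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real fraud model: the toy data, the fixed every-fourth-example split, and the names MemorizingModel and ThresholdModel are all hypothetical, chosen only to show how a test set separates memorization from generalization. A real project would normally split at random.

```python
# Toy data (assumed for illustration): a transaction is "fraud" when amount >= 500.
data = [(amount, amount >= 500) for amount in range(0, 1000, 10)]

# Hold back every fourth example as a test set the models never see.
# A fixed split keeps the sketch reproducible; real splits are usually random.
test = data[::4]
train = [pair for i, pair in enumerate(data) if i % 4 != 0]


class MemorizingModel:
    """Stores training pairs verbatim and answers False for anything unseen."""

    def fit(self, pairs):
        self.table = dict(pairs)

    def predict(self, amount):
        return self.table.get(amount, False)


class ThresholdModel:
    """Learns one cutoff: the smallest training amount labelled True."""

    def fit(self, pairs):
        positives = [amount for amount, label in pairs if label]
        self.cutoff = min(positives) if positives else float("inf")

    def predict(self, amount):
        return amount >= self.cutoff


def accuracy(model, pairs):
    """Fraction of examples the model labels correctly."""
    return sum(model.predict(x) == y for x, y in pairs) / len(pairs)


memorizer = MemorizingModel()
memorizer.fit(train)
rule = ThresholdModel()
rule.fit(train)

for name, model in [("memorizer", memorizer), ("threshold", rule)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

On this toy data both models are perfect on the examples they were trained on. Only the held-out split shows the difference: the memorizer gets barely half of the unseen examples right, while the threshold rule still gets all of them.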

Because LLMs can be applied to arbitrary new tasks, there is no pre-existing ground truth to test against. You cannot run a controlled experiment measuring whether the model is retrieving a training memory, summarizing a search result, extrapolating from context, or confabulating entirely. In a chat window, these look identical. The output arrives fluent and confident regardless of its origin or accuracy.

This is not a minor limitation. It is a structural property of the technology that makes LLMs genuinely unsuitable for high-stakes autonomous decision-making without robust human oversight. A radiologist looking at an AI-flagged scan can apply thirty years of embodied clinical experience to decide whether to trust the flag. Remove the radiologist and you have removed the only check on the system’s errors.

Research has repeatedly demonstrated that LLMs exhibit sycophantic behavior. They tend to tell users what they want to hear, shift positions when challenged regardless of whether the challenge is correct, and generate plausible-sounding false information with no subjective sense that they are doing so. They have no access to physical reality. They cannot verify claims against the world. They can only check the world’s textual representation of itself, which is vast, contradictory, and incomplete.

The replacement fallacy.

The genuinely productive use of AI is augmentation, not replacement. This is not a feel-good talking point. It is what the evidence actually shows.

GitHub Copilot increases developer productivity on certain tasks. But it also introduces bugs, sometimes subtle ones, and developers who over-trust it without review ship worse code than developers who use it as a drafting tool and apply their own judgment. The human in the loop is not a relic of inefficiency. The human is the quality control mechanism for a system that cannot audit itself.

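The same pattern appears in legal research and medicine. In 2023, for example, lawyers in a U.S. federal case were sanctioned after filing a brief that cited court decisions ChatGPT had invented. AI tools can accelerate the work, but they cannot reliably verify their own output. A human still has to check it.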

Work is not merely an economic transaction. Meaningful work involving judgment, relationship, and craft is a primary source of human dignity and social cohesion. Rapid automation without supporting systems risks repeating the social disruptions seen in past industrial shifts, at a larger scale and faster pace.

The concentration problem.

The LLMs we are discussing were not produced by a handful of geniuses. They were built on decades of publicly funded research and trained on the collective output of humanity. Treating them purely as private tools for displacement ignores that foundation.

The concentration of AI capability in a small number of companies creates risks beyond normal competition. These companies are not just controlling products. They are influencing how people access information and determine what is true.

History does not offer encouraging examples of this level of concentrated control.

The social fabric is not inefficiency.

There is a view that social norms, regulation, and human decision-making are inefficiencies to be engineered away. This is a mistake.

Social systems exist because they enable cooperation, trust, and stability. They encode knowledge about human behavior that is not easily captured in data.

Humans are social by nature. Meaning, motivation, and well-being depend on connection and contribution. A world optimized purely for efficiency without regard to these factors is unlikely to produce good outcomes, regardless of economic metrics.

A more honest framework.

This is not an argument against AI. It is an argument for precision.

AI is powerful for automating high-volume, well-defined, low-risk tasks such as document processing, pattern recognition, translation, and first-draft generation.

AI is not well suited for autonomous decision-making in high-stakes environments, replacing judgment-intensive roles, or acting as a primary interface for vulnerable populations.
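The two lists above amount to a triage rule, and one way to make it concrete is a short Python sketch. This is an illustration under assumptions, not a prescribed policy: the Task fields, the route function, and the three outcome labels are hypothetical names introduced here, and a real organization would need far richer criteria than three yes-or-no flags.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work described by the three properties the framework names."""
    description: str
    high_stakes: bool               # could an error cause serious harm?
    well_defined: bool              # is there a clear, checkable notion of "done"?
    serves_vulnerable_users: bool = False


def route(task: Task) -> str:
    """Decide how much of the task to hand to an AI system (hypothetical labels)."""
    # High-stakes decisions and vulnerable users keep a human in charge.
    if task.high_stakes or task.serves_vulnerable_users:
        return "human_decides"
    # Judgment-intensive work: the AI drafts, a person reviews and owns the result.
    if not task.well_defined:
        return "ai_draft_human_review"
    # High-volume, well-defined, low-risk work is the strongest case for automation.
    return "automate"


examples = [
    Task("extract totals from invoices", high_stakes=False, well_defined=True),
    Task("draft a marketing email", high_stakes=False, well_defined=False),
    Task("approve a treatment plan", high_stakes=True, well_defined=False),
    Task("crisis support chat", high_stakes=False, well_defined=True,
         serves_vulnerable_users=True),
]

for task in examples:
    print(f"{task.description}: {route(task)}")
```

The ordering of the checks is the design choice that matters: risk is evaluated before convenience, so a task never reaches the automation branch merely because it is easy to specify.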

The companies that will succeed are not those moving fastest toward full automation. They are those designing systems that combine human judgment with AI capabilities effectively.

The mythology of inevitable machine superintelligence, of human labor becoming obsolete, is not a prediction. It is a preference presented as inevitability.

The decisions being made now about how AI is used and governed will shape society for decades. Those decisions are too important to be driven solely by financial incentives.
