
Economics of Impossibility


Monolithic Scaling

The dominant paradigm in AI research has centered on the scaling hypothesis: that larger models trained on more data with greater computational resources will eventually achieve general intelligence. Nearly all modern AI progress follows this path, scaling existing architectures, such as transformer-based models, with more data, parameters, and compute.

While scaling has delivered impressive gains in the past, recent benchmark trends suggest diminishing returns. Performance is approaching plateaus on many benchmarks unless scaling is paired with architectural additions (e.g., retrieval-augmented generation, tool use, or memory extensions), and it may plateau well before reaching AGI-level capabilities.

Training costs have surged into billions of dollars, raising sustainability and exclusivity concerns. Marginal improvements require exponential increases in compute, which is economically unsustainable. Moreover, brute-force scaling does not resolve fundamental challenges such as grounding (linking symbols to real-world meaning), compositional reasoning, and generalization outside training distributions. These scaling limits highlight that current approaches are more about engineering optimization than genuine breakthroughs in cognition. The lesson of the last two years is clear: scaling produces surface-level competence, but without new architectures or paradigms, it will not yield the depth of general intelligence. 


The Exponential Scaling of Resource Costs

GPT-4's training costs exceeded $100 million. Current projections suggest that next-generation models approaching AGI capabilities will require investments in the billions for single training runs. This represents just the tip of the financial iceberg: the real costs extend far beyond raw compute. A single state-of-the-art training run requires months of preparation, involving hundreds of specialized engineers commanding salaries exceeding $1 million annually, with some key individuals receiving tens of millions in retention bonuses. Organizations must also provide massive computational resources for experimentation, effectively giving each top researcher millions of dollars in compute budget annually. Acquiring and cleaning high-quality training data costs millions, with specialized datasets commanding premium prices. The legal costs of navigating copyright and privacy regulations add another layer of expense, with some organizations spending tens of millions on legal teams to defend their data usage practices. Failed experiments, which constitute the majority of attempts, represent pure financial loss with no recoverable value. The iteration cycle for large models means organizations must be prepared to spend hundreds of millions on experiments that may yield no usable results.
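To see how these line items compound, here is a minimal back-of-envelope sketch in Python. Every figure below is an assumption chosen for illustration, not a reported number from any organization:

```python
# Illustrative budget for one frontier-model training program.
# All figures are assumptions for the sketch, not reported numbers.

cost_items_usd = {
    "final training run (compute)": 150e6,
    "engineering salaries (~100 staff, fully loaded)": 100 * 1.2e6,
    "researcher experimentation compute (~50 researchers)": 50 * 3e6,
    "data acquisition and cleaning": 20e6,
    "legal and compliance": 15e6,
    "failed experiments (majority of attempts)": 200e6,
}

total = sum(cost_items_usd.values())
for name, usd in sorted(cost_items_usd.items(), key=lambda kv: -kv[1]):
    print(f"{name:55s} ${usd / 1e6:7.1f}M")
print(f"{'total':55s} ${total / 1e6:7.1f}M")
```

Even with these deliberately conservative assumptions, the program total lands well above $600 million, with failed experiments and experimentation compute rivaling the headline training run itself.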


Energy Consumption and Carbon Footprint

The computational demands for training state-of-the-art AI models have risen exponentially, with profound environmental implications. Training frontier systems often requires tens of thousands of high-performance GPUs operating continuously for weeks or months, consuming as much energy as hundreds or even thousands of households use in a year, with significant carbon emissions. Beyond energy, these facilities draw heavily on water resources for cooling, straining local ecosystems and communities. If current scaling trajectories persist, future models may require dedicated power plants, raising serious concerns about long-term sustainability. Furthermore, these resource-intensive practices risk exacerbating global energy inequities, as access to such compute remains concentrated in the hands of a few actors. Together, these factors highlight an urgent need to rethink the environmental, social, and economic sustainability of scaling toward AGI.
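A rough energy estimate makes the scale concrete. The cluster size, power draw, run length, and household baseline below are all assumptions for the sketch, not measurements of any real system:

```python
# Back-of-envelope energy estimate for one frontier training run.
# All inputs are assumptions, not measurements of any real system.

gpus = 10_000             # accelerators in the training cluster
gpu_power_kw = 0.7        # ~700 W per GPU under sustained load
pue = 1.2                 # power usage effectiveness (cooling, overhead)
run_days = 60             # duration of the training run

energy_mwh = gpus * gpu_power_kw * pue * run_days * 24 / 1_000
household_mwh_per_year = 10.7   # roughly a US household's annual usage

print(f"Training energy: {energy_mwh:,.0f} MWh")
print(f"~{energy_mwh / household_mwh_per_year:,.0f} households' annual consumption")
```

Under these assumptions a single two-month run consumes roughly 12,000 MWh, on the order of a thousand households' annual usage, and larger clusters or longer runs scale that figure linearly.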


Infrastructure as a Barrier to Entry

Building AGI-capable infrastructure requires more than purchasing hardware. Organizations must construct specialized data centers with advanced cooling systems capable of handling megawatt-scale power draws. These facilities require locations with access to massive power grids, often necessitating direct agreements with utility companies or even the construction of dedicated power generation facilities. The global chip shortage has transformed access to high-end GPUs into a geopolitical issue. NVIDIA's H100 GPUs, essential for large-scale training, cost $30,000 each with waiting lists extending over a year. Organizations need tens of thousands of these chips, requiring not just capital but also political and business relationships to secure priority access. Compute access has become a currency of power. 


Economic Impossibility

The Token Economy's Fundamental Flaw

The prevailing business model of selling API access through token-based pricing faces an insurmountable mathematical reality: the operational costs of running large language models exceed the prices markets will bear. 

In 2025, token pricing for many commonly used high-capacity models runs between $0.01 and $0.06 per 1,000 tokens (more where output tokens dominate), depending on the input/output split. Yet the actual cost to serve these requests, including compute, energy, infrastructure amortization, and human oversight, often exceeds $0.10 per thousand tokens, meaning providers absorb losses of 70-90% of cost at the lower end of that pricing range. This economic inversion becomes more severe as models grow larger. Each generation of models requires approximately 10x more compute for training and 3-5x more for inference, yet market prices for tokens have decreased by roughly 90% year-over-year as competition intensifies. Organizations find themselves in a race to the bottom, subsidizing usage through venture capital or profits from other business lines while hoping for future efficiency gains that may never materialize.
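The arithmetic of this inversion is simple. A minimal sketch using the illustrative figures above (not any provider's actual prices or costs):

```python
# Unit economics of token-priced inference, using the illustrative
# figures from the text (not any provider's actual prices or costs).

cost_per_1k_tokens = 0.10            # all-in serving cost per 1K tokens

for price_per_1k in (0.01, 0.03, 0.06):
    loss = cost_per_1k_tokens - price_per_1k
    loss_pct = loss / cost_per_1k_tokens * 100
    print(f"price ${price_per_1k:.2f}/1K -> loss ${loss:.2f}/1K "
          f"({loss_pct:.0f}% of cost)")
```

The 70-90% loss band corresponds to the $0.01-$0.03 end of the pricing range; even at the $0.06 ceiling, the provider still loses 40% of cost on every request under these assumptions.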

The Hidden Costs of Inference at Scale

The true cost of inference extends far beyond raw compute. Each request requires maintaining hot models in memory, consuming expensive GPU RAM continuously regardless of utilization. A single high-capacity model requires approximately 300-800 GB of high-bandwidth memory, spread across tens of GPUs that must remain powered and cooled 24/7. At typical utilization rates of 30-40%, the majority of infrastructure costs serve idle capacity.
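Utilization drives the effective cost per token directly, as a small sketch shows. The replica cost and throughput figures are assumptions for illustration:

```python
# Effective serving cost as a function of utilization.
# Replica cost and throughput are assumed figures for illustration.

replica_daily_cost_usd = 20_000       # power, cooling, amortized hardware
tokens_per_day_full_load = 5e8        # throughput at 100% utilization

for utilization in (1.0, 0.4, 0.3):
    tokens_served = tokens_per_day_full_load * utilization
    cost_per_1k = replica_daily_cost_usd / tokens_served * 1_000
    print(f"utilization {utilization:.0%}: ${cost_per_1k:.3f} per 1K tokens")
```

At the 30-40% utilization rates typical of production deployments, the same hardware costs 2.5-3x more per delivered token than it would at full load, which is how an assumed $0.04 per 1K tokens at capacity becomes $0.10-$0.13 in practice.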

Load balancing and redundancy requirements multiply these costs. Production systems require multiple model replicas across different geographic regions to ensure low latency and high availability. Each replica represents millions in hardware investment and thousands in daily operational costs. The need for instant response times prevents efficient batching of requests, forcing systems to operate well below peak efficiency.

The human costs of maintaining these systems prove equally substantial. Each production model requires teams of engineers for monitoring, optimization, and incident response. Content moderation, safety filtering, and quality assurance add layers of human oversight that scale linearly with usage. These human costs, often invisible in simplified economic analyses, can exceed computational costs for systems serving diverse global audiences.

The Depreciation Disaster

The rapid pace of AI advancement creates a depreciation crisis for infrastructure investments. Hardware purchased for training current-generation models becomes obsolete before loans are repaid. GPUs costing $30,000 today will be worth less than $5,000 in three years, not due to wear but because newer architectures offer 10x better performance per dollar.
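Those figures imply a roughly 10x performance-per-dollar improvement every three years. Compounded annually, the implied resale curve looks like this (a sketch under that single assumption):

```python
# Resale value implied by a 10x performance-per-dollar gain every
# three years (the text's figure). If buyers pay for delivered
# performance, value falls with the perf/$ advantage of new chips.

purchase_price = 30_000.0
gain_per_3yr = 10.0
annual_factor = gain_per_3yr ** (1 / 3)   # ~2.15x decay per year

for year in range(4):
    value = purchase_price / annual_factor ** year
    print(f"year {year}: ~${value:,.0f}")
```

Under this assumption a $30,000 GPU is worth about $14,000 after one year, $6,500 after two, and $3,000 after three, consistent with the sub-$5,000 figure above, and well ahead of any conventional loan amortization schedule.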

This depreciation disaster extends to trained models themselves. A model costing $100 million to train today becomes economically obsolete within 18 months as competitors release superior alternatives. Unlike traditional software that can be maintained and updated incrementally, large language models require complete retraining to incorporate new knowledge or capabilities. The sunk cost of training cannot be recovered through incremental revenue before obsolescence strikes.

Organizations find themselves on a treadmill of continuous capital destruction. Each generation of models requires new infrastructure investments before previous investments are amortized. The accounting reality of AI development shows massive capital expenditures with minimal recoverable value, hidden temporarily by growth-focused metrics that ignore unit economics. This financial reality will become undeniable as growth slows and investors demand profitability. 


The Insufficiency of Traditional Funding Models

Traditional venture capital, which fueled previous technological revolutions, proves inadequate for AGI development. Even the largest VC funds, managing billions in capital, cannot support the sustained burn rates required for competitive AGI research. A typical Series A startup with $50 million in funding would exhaust its entire capital on a single failed training run of a large model.

This has forced a fundamental shift in funding models. Only organizations with either massive existing revenue streams (big tech companies) or unprecedented access to patient capital (sovereign wealth funds, government programs) can sustain AGI development. The traditional Silicon Valley model of iterative development and pivoting becomes impossible when each iteration costs hundreds of millions. 


Economic Concentration and Dependency

Centralized AI deepens economic monopolies by consolidating compute, models, and infrastructure in the hands of a few corporations. This creates global dependency where nations, institutions, and individuals must rely on external actors for critical AI capabilities.

Monopolization of Value Creation

As centralized actors own both the models and the platforms for deploying them, they also monopolize the economic value chains of intelligence. From foundational models to downstream applications, centralized players extract rents at every level. Local developers, startups, and civic organizations are reduced to consumers of proprietary APIs rather than co-creators of value, stifling distributed innovation.

The API Economy and Perpetual Dependency Trap

As direct competition becomes impossible, smaller organizations are forced into dependent relationships with AGI leaders through API access. This creates a feudal structure where "vassal" companies build applications on top of "lord" companies' models, paying both financially and through data sharing. The terms of these relationships can be changed unilaterally, leaving dependent companies vulnerable to destruction at the whim of their AI providers.

Global Dependency on Few Actors

This concentration leads to structural dependency: nations, businesses, and institutions worldwide become reliant on a few corporations or governments for critical AI capabilities. This mirrors historical patterns of dependency in oil, finance, or telecommunications, where control of key infrastructures translates into geopolitical leverage. AI, as a general-purpose technology, raises the stakes by embedding this dependency into all economic and governance systems.

Suppression of Alternative Economies

By concentrating control, centralized AI actively suppresses the emergence of alternative economic models such as cooperative AI development, open-source innovation, or localized AI ecosystems. The possibility of plural economies of intelligence is marginalized in favor of a globalized, corporate-controlled paradigm.