
Failed Remedies: Alignment, Interpretability, and the Limits of Control & Ethical Models

Alignment and Safety Concerns

Narrow Alignment to Institutional Priorities

In centralized or monolithic AI systems, alignment is primarily defined by the institutions that build and control them. Instead of being oriented toward the diverse values of global humanity, alignment efforts are typically constrained by corporate interests, shareholder priorities, or state agendas. This risks creating AGI that is “aligned” with narrow economic or geopolitical goals rather than with plural, democratic, or planetary needs.

Safety as a Proprietary Domain

Safety research in centralized settings is often treated as a proprietary advantage rather than a public good. This leads to fragmented, opaque safety protocols where crucial knowledge is withheld from the broader community. As a result, transparency and collective oversight are sacrificed, leaving the global public in the dark about how high-stakes AI systems are being safeguarded.

The Illusion of Universal Alignment

Centralized AI often claims to pursue “universal human values” in its alignment. In reality, universality becomes homogenization, where a small group defines what counts as safe, ethical, or desirable. This suppresses cultural pluralism and undermines inclusivity, creating an AGI trajectory that cannot truly represent the heterogeneity of human and non-human life.


Lack of Interpretability and Auditability

Black-Box Nature of Centralized Systems

Many cutting-edge AI models, especially large neural networks, are black boxes: their internal decision-making processes are not transparent even to their creators. In centralized AI, this problem is amplified: the models are not only technically opaque but also kept deliberately closed, preventing external scrutiny. This means that the reasoning behind critical outputs remains inaccessible, even when decisions affect human rights, access to resources, or life-and-death outcomes.

Barriers to Accountability

Without interpretability and auditability, accountability collapses. When an AI system denies a loan, recommends a prison sentence, or misdiagnoses a patient, neither individuals nor institutions can trace the reasoning behind the outcome. In centralized contexts, users are often told to “trust the system,” while the very mechanisms for independent audit are withheld from public oversight.

Risks of Hidden Bias and Exploitation

Opaque systems make it nearly impossible to detect embedded biases, manipulations, or exploitative design choices. For example, biased training data or harmful model objectives can go unnoticed until they have already caused systemic harm. Centralized control further means that only insiders can detect or conceal such risks, leaving society vulnerable to undiscovered flaws.
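
To make concrete what independent auditing would require, the sketch below shows a minimal disparity check that an outside reviewer could run only if given access to model decisions and group labels. The data, group names, and the loan scenario are hypothetical illustrations; the point is that closed, centralized systems deny outsiders even this basic level of access.

```python
# Minimal sketch of an external bias audit (hypothetical data and groups).
# An auditor would need access to model decisions and demographic labels,
# which closed, centralized systems typically do not provide.

from collections import defaultdict

def approval_rates(decisions):
    """Compute per-group approval rates from (group, approved) pairs."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += int(approved)
    return {g: approvals[g] / totals[g] for g in totals}

# Hypothetical sample of loan decisions drawn from a deployed model.
sample = [("group_a", True), ("group_a", True), ("group_a", False),
          ("group_b", False), ("group_b", False), ("group_b", True)]

rates = approval_rates(sample)
gap = max(rates.values()) - min(rates.values())
print(rates, f"demographic parity gap: {gap:.2f}")
```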

Asymmetry of Knowledge and Power

Lack of interpretability and auditability creates radical asymmetries: a small group of insiders partially understands how systems behave, while the public and regulators remain in the dark. This asymmetry grants centralized actors disproportionate power over markets, societies, and governance. It also denies democratic oversight, since neither citizens nor independent institutions can verify how decisions are made.

Inhibiting Collective Safety

When AI models are closed and unauditable, safety research cannot be collective. Independent researchers, ethicists, or civil society groups are unable to probe systems for vulnerabilities or risks. This prevents the emergence of shared safety protocols and leaves humanity reliant on corporate promises - a fragile arrangement in the face of existential or systemic risks.

Consequences for Trust and Legitimacy

A system that is both technically black-box and institutionally opaque cannot earn lasting trust. Over time, opacity undermines legitimacy: people and societies begin to question not only specific outputs but the broader governance structures of AI. Centralized AI, by refusing interpretability and resisting auditability, risks losing its social license altogether.


Alternative Models and Their Limitations

The Open-Source Promise and Its Unique Constraints

In previous waves of technology — from operating systems to programming languages to databases — open-source models unlocked democratization, enabling global communities of developers to collaborate, iterate, and share knowledge. This model catalyzed the modern internet and gave rise to resilient, community-driven infrastructures.

In the context of AGI, however, this promise encounters unique structural barriers. Unlike software that can be written and distributed on modest personal computers, the computational demands of training and operating frontier models are immense. The required resources, including vast GPU clusters, terabytes of curated data, and continuous energy-intensive training, make participation prohibitively expensive. Thus, even if the code is free, the barrier of infrastructure costs excludes most contributors.
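
To give a sense of scale, the snippet below sketches a back-of-envelope training-cost estimate using the commonly cited approximation of roughly 6 x parameters x tokens for total training FLOPs. The model size, token count, GPU throughput, and price are illustrative assumptions, not figures for any specific system, and real runs typically cost more due to lower hardware utilization.

```python
# Rough back-of-envelope estimate of frontier-model training cost.
# Uses the common approximation: training FLOPs ~ 6 * parameters * tokens.
# All numbers below are illustrative assumptions, not measurements.

params = 70e9            # assumed model size: 70 billion parameters
tokens = 2e12            # assumed training corpus: 2 trillion tokens
flops = 6 * params * tokens            # ~8.4e23 FLOPs

gpu_flops_per_s = 300e12               # assumed sustained throughput per GPU
gpu_hours = flops / gpu_flops_per_s / 3600   # ~780,000 GPU-hours

cost_per_gpu_hour = 2.0                # assumed cloud price in USD
print(f"~{gpu_hours:,.0f} GPU-hours, roughly ${gpu_hours * cost_per_gpu_hour:,.0f}")
```

Even under these optimistic assumptions, the bill runs into the millions of dollars, well beyond the reach of individual developers or most community projects.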

Institutional Gatekeeping of Open AI

This resource barrier means that open-source contributions are effectively limited to those with institutional or corporate backing: universities with access to supercomputing facilities, research labs with industrial funding, or corporations with cloud dominance. The supposed democratic field of open-source AI development is, in practice, narrowed to a small elite. The vast majority of individual developers and civic actors cannot meaningfully contribute, reinforcing the very inequalities open-source was meant to dissolve.

Coordination Failures in Collaborative Efforts

Even when communities attempt collaborative development, they face severe coordination challenges:

  • Fragmentation of efforts: Multiple groups build similar models in isolation; without shared discovery, distribution, or systemic access, duplicated effort prevents any one project from reaching critical mass.
  • Lack of shared governance: Missing or disputed standards, architectures, and priorities prevent alignment on common goals.
  • Resource allocation dilemmas: No clear mechanism exists for assembling and distributing compute, funding, or intelligence contributions equitably across participants.

This mirrors the tragedy of the commons: every actor waits for someone else to bear the cost of creating, hosting, and maintaining AI. As a result, progress stalls or depends on sporadic benefactors, rather than emerging from a robust commons.

The Illusion of Democratization

Open-source AGI efforts risk creating an illusion of democratization. The code may be available, but the means of production, distribution, and value creation remain enclosed within institutional strongholds. For most communities, open-source models are not tools of empowerment but consumption artifacts: downloaded and run, but not shaped, governed, or meaningfully advanced. This reduces participation to the margins while leaving strategic control in the hands of a few.

Dependency on Proprietary Ecosystems

Paradoxically, many open-source projects remain dependent on proprietary infrastructures to run or scale. Community-driven models often rely on the reliability, scalability, and compute provided by Big Tech clouds. Closed API dependencies and systemic lock-in on such platforms mean that AI cannot be freely (free as in freedom) distributed or put to use. This creates a hybrid dependency, in which open-source exists but cannot escape the gravitational pull of centralized infrastructures. The result is a pseudo-openness that does not constitute true autonomy.

The Broader Implication

The limitation of alternative models highlights a critical paradox: openness in AI is necessary for democratization, yet openness alone is insufficient without shared infrastructures, collective governance, and sustainable resources. Without solving the resource and coordination bottlenecks, open-source AGI cannot deliver on its historic promise. Instead, it risks becoming a shadow ecosystem, symbolic openness overlaying deep dependency on the same centralized powers it seeks to challenge.


Ethical Homogenization and Moral Imperialism

Export of Narrow Ethical Frameworks

Centralized AI development is concentrated within a few cultural, political, and corporate contexts. As a result, the ethical frameworks embedded in these systems reflect the priorities and worldviews of their originators. When exported globally, these frameworks impose narrow conceptions of safety, fairness, and responsibility that often fail to resonate with or respect local traditions.

Suppression of Plural Ethics

Human societies embody plural ethical systems - diverse traditions, indigenous worldviews, communal practices, and alternative philosophies. Centralized AI, however, cannot easily accommodate this plurality. Instead, it converges on simplified, standardized notions of ethics, suppressing the complexity of lived moral landscapes. This leads to ethical homogenization, where diversity is erased in favor of corporate-approved “universal” values.

Moral Imperialism in Practice

When AI systems trained and aligned within specific cultural or political contexts are deployed globally, they act as vehicles of moral imperialism. For example, definitions of what constitutes “harmful speech,” “desirable behavior,” or “acceptable content” are shaped by centralized actors and then imposed as defaults across societies. In effect, moral decisions are outsourced to distant institutions that dictate standards for billions without consultation.