
As decentralized AI networks mature, the conversation is gradually shifting away from raw capability and toward something far more consequential: alignment. It is no longer enough for models to be powerful. They must also be predictable, safe, and resistant to manipulation under pressure.
Within the Bittensor ecosystem, two subnets have emerged at the center of this alignment conversation: Trishool (Subnet 23) and Aurelius (Subnet 37). At a glance, both appear to operate within the same domain, evaluating and improving the behavior of large language models. However, beneath that surface similarity lies a fundamental divergence in philosophy, architecture, and long-term design.
Both of these subnets contribute complementary approaches to one of the hardest problems in AI.
The Core Question: What Does It Mean to Align AI?
Before comparing both systems, it is worth grounding the discussion in what alignment actually entails. Alignment is about ensuring that an AI system behaves in ways that are consistent with human expectations and values, even under adversarial or unforeseen conditions. This includes:
a. Avoiding harmful or deceptive outputs,
b. Maintaining truthfulness and consistency,
c. Resisting manipulation or jailbreak attempts, and
d. Preserving intent across increasingly complex reasoning paths.
The challenge is not simply identifying when models fail, but building systems that can continuously test, verify, and improve those behaviors at scale.
This is where Trishool and Aurelius diverge in how they approach the problem.
Trishool (SN23): Alignment Through Continuous Adversarial Pressure
Trishool is best understood as an automated stress testing layer for AI systems. Its design revolves around one central idea: the most reliable way to make AI safe is to constantly try to break it.
Rather than relying on static evaluation benchmarks or periodic audits, Trishool creates a live, adversarial environment where models are continuously probed for weaknesses. This transforms alignment into an ongoing process rather than a fixed checkpoint.
Trishool operates through a tightly coordinated pipeline on Bittensor's Subnet 23 (a minimal sketch of this flow follows the list below):
a. Miners submit prompts designed to expose behavioral flaws in AI models,
b. Validators execute these prompts using the Petri alignment agent within controlled environments, and
c. The platform layer manages submissions, validates results, and stores behavioral data.
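To make the division of labor concrete, here is a minimal Python sketch of that miner/validator pattern. All names here (PromptSubmission, run_probe, detect_failure) are illustrative stand-ins rather than Trishool's actual interfaces, and the Petri agent's evaluation is represented by a placeholder function:

```python
from dataclasses import dataclass

@dataclass
class PromptSubmission:
    """A miner's adversarial prompt (illustrative, not Trishool's real schema)."""
    miner_id: str
    prompt: str
    target_behavior: str  # e.g. "deception", "sycophancy", "power-seeking"

def detect_failure(response: str, behavior: str) -> bool:
    """Placeholder failure detector; real evaluation is far richer than
    keyword matching and is handled by the Petri alignment agent."""
    return behavior.lower() in response.lower()

def run_probe(submission: PromptSubmission, model) -> dict:
    """Validator step: execute a miner's prompt in a controlled environment
    and record the behavioral outcome."""
    response = model(submission.prompt)
    return {
        "miner_id": submission.miner_id,
        "prompt": submission.prompt,
        "response": response,
        "failure_found": detect_failure(response, submission.target_behavior),
    }
```

The structural point is the separation of concerns: miners only craft attacks, while validators control the execution environment and own the record of what happened.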
This structure creates a competitive marketplace for discovering failure modes. Participants are directly incentivized to identify vulnerabilities such as:
a. Deception,
b. Sycophancy,
c. Manipulation,
d. Overconfidence, and
e. Power-seeking tendencies.
Each discovered weakness becomes part of a broader feedback loop, strengthening the system over time.
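Continuing the sketch above, the marketplace dynamic can be reduced to a single idea: reward miners in proportion to the confirmed failures their prompts surface. The formula below is a deliberately toy illustration, not the subnet's actual emission logic, which is determined by validator consensus on-chain:

```python
def miner_weights(results: list[dict]) -> dict[str, float]:
    """Toy incentive rule: weight each miner by its share of confirmed
    failures. Illustrative only -- real subnet emissions are set by
    validator consensus on-chain, not by this formula."""
    counts: dict[str, int] = {}
    for r in results:
        if r["failure_found"]:
            counts[r["miner_id"]] = counts.get(r["miner_id"], 0) + 1
    total = sum(counts.values()) or 1
    return {miner: n / total for miner, n in counts.items()}

# Example: miner m1 surfaced two confirmed failures, m2 surfaced none.
results = [
    {"miner_id": "m1", "failure_found": True},
    {"miner_id": "m1", "failure_found": True},
    {"miner_id": "m2", "failure_found": False},
]
print(miner_weights(results))  # {'m1': 1.0}
```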
Design Philosophy
What distinguishes Trishool is its emphasis on automation and scale:
a. Automated Safety Loop: Alignment is not treated as a manual process but as a continuously evolving system driven by AI agents themselves,
b. Proof of Invariance: The system aims to produce verifiable guarantees that a model remains aligned even under extreme adversarial testing, and
c. Outer and Inner Alignment Coverage: It targets both what a model is supposed to do and what it actually optimizes for internally.
In effect, Trishool transforms red teaming into a decentralized, always-on infrastructure layer. The harder participants push against models, the stronger those models become.
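"Proof of Invariance" is an ambitious target, and it helps to see what surviving sustained pressure can actually certify. One standard statistical framing (offered here as an illustration, not as Trishool's documented method) is the "rule of three": if a model passes n independent adversarial probes with zero failures, its per-probe failure rate is bounded above by roughly 3/n at 95% confidence.

```python
def failure_rate_upper_bound(n_probes: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on the per-probe failure rate after
    n_probes adversarial probes with zero observed failures.

    Solves (1 - p)^n = 1 - confidence for p, which at 95% confidence
    is approximately 3/n -- the classical "rule of three".
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_probes)

# Surviving 10,000 probes bounds the failure rate near 0.03%:
print(f"{failure_rate_upper_bound(10_000):.5f}")  # ~0.00030
```

The bound only holds if the probes are independent and representative of real attacks, which is precisely what a large, competitive miner pool is meant to approximate.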
Aurelius (SN37): Alignment Through Structured Evaluation and Collective Judgment
While Trishool focuses on breaking models, Aurelius focuses on understanding and verifying them. Its approach introduces a more structured, governance-driven system for evaluating AI behavior, with a strong emphasis on reasoning, consensus, and data integrity.
Rather than relying primarily on automated agents, Aurelius builds a human-aligned evaluation loop encoded directly into protocol logic.
On Bittensor Subnet 37, Aurelius operates through a three-layer design:
a. Miners generate adversarial prompts and collect model responses,
b. Validators evaluate these responses against a shared alignment rubric, and
c. The Tribunate defines the rules, scoring logic, and governance framework.
This creates a system where alignment is not only tested but formally interpreted and recorded.
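As with Trishool, the division of roles is easiest to see in code. The sketch below uses hypothetical names (Case, validator_score) rather than Aurelius's real interfaces, with placeholder scoring criteria where the Tribunate would supply the real ones:

```python
from dataclasses import dataclass

@dataclass
class Case:
    """One unit of work: an adversarial prompt plus the model's response,
    both produced at the miner layer."""
    prompt: str
    response: str

def validator_score(case: Case, rubric: dict) -> dict[str, float]:
    """Validator step: judge the response against the shared rubric.
    Which dimensions exist, and how each is scored, is defined by the
    Tribunate rather than by individual validators."""
    return {dim: criterion(case.response) for dim, criterion in rubric.items()}

# A toy Tribunate-defined rubric; real criteria are far richer.
rubric = {
    "factual_accuracy": lambda text: 0.5,  # placeholder criterion
    "harm_potential": lambda text: 0.5,    # placeholder criterion
}

case = Case(prompt="...", response="I can't help with that request.")
print(validator_score(case, rubric))  # {'factual_accuracy': 0.5, 'harm_potential': 0.5}
```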
Key Mechanisms
Aurelius introduces several important innovations:
a. Consensus-Based Evaluation: Multiple validators independently assess outputs, and results are aggregated to produce reliable judgments,
b. Alignment Rubric: A structured framework covering dimensions such as factual accuracy, bias, logical consistency, and harm potential,
c. Cryptographic Provenance: All data, including prompts, responses, and scores, is hashed and stored, ensuring transparency and auditability, and
d. Dynamic Governance: The Tribunate evolves the system's rules over time, adapting to new forms of model behavior.
Beyond these mechanisms, Aurelius operates as a continuous cycle: misaligned outputs are discovered, validators confirm and score them, the data is recorded and structured, and developers use the resulting dataset to improve their models.
Over time, this produces a high quality dataset of alignment failures, effectively turning the network into a living repository of AI weaknesses.
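Two of the mechanisms above translate naturally into a sketch. Consensus-based evaluation amounts to aggregating independent validator scores (a per-dimension median is shown here, though the actual aggregation rule belongs to the Tribunate), and cryptographic provenance amounts to hashing a canonical serialization of each record so later tampering is detectable:

```python
import hashlib
import json
from statistics import median

def aggregate(validator_scores: list[dict[str, float]]) -> dict[str, float]:
    """Combine independent validator judgments into one consensus score.
    A per-dimension median is used here for robustness to outliers;
    the real aggregation rule is whatever the Tribunate defines."""
    dims = validator_scores[0].keys()
    return {d: median(s[d] for s in validator_scores) for d in dims}

def provenance_hash(prompt: str, response: str, scores: dict[str, float]) -> str:
    """Content-address a (prompt, response, scores) record. Hashing a
    canonical JSON serialization means any later tampering with the
    stored record changes the hash and is therefore detectable."""
    record = json.dumps(
        {"prompt": prompt, "response": response, "scores": scores},
        sort_keys=True,
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()

# Three validators score the same response; the median becomes the record.
consensus = aggregate([{"harm_potential": 0.2},
                       {"harm_potential": 0.9},
                       {"harm_potential": 0.3}])
print(consensus)  # {'harm_potential': 0.3}
print(provenance_hash("prompt text", "response text", consensus))
```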
Key Differences: Pressure vs Interpretation
While both systems operate within the same problem space, their differences become clear when viewed across a few core dimensions:
1. Approach to Alignment: Trishool applies constant adversarial pressure to expose weaknesses, and Aurelius applies structured evaluation to interpret and classify those weaknesses,
2. System Design: While Trishool leans toward automation, with AI agents driving testing at scale, Aurelius leans toward human-aligned governance, with validators and rules shaping outcomes,
3. Output: Trishool produces real-time stress test results and robustness signals, and Aurelius produces structured datasets and alignment metrics,
4. End Goal: Trishool seeks to ensure models cannot be broken under pressure, and Aurelius seeks to define and document what "aligned behavior" actually means.
Why This Distinction Matters
It is tempting to view Trishool and Aurelius as competing solutions, but that framing misses the broader picture.
Alignment is not a single problem with a single solution. It is a multi-layered challenge that requires both:
a. Systems that can aggressively test AI under extreme conditions, and
b. Systems that can interpret, standardize, and learn from those results.
In this context, Trishool acts as the pressure engine, and Aurelius acts as the judgment layer. One identifies weaknesses at scale, and the other makes those weaknesses understandable, verifiable, and usable.
Together, they form a more complete alignment stack than either could achieve independently.
Closing Thoughts
As AI systems continue to evolve, alignment will increasingly define which models are trusted, deployed, and integrated into real-world systems. The challenge is no longer theoretical. It is operational.
Trishool and Aurelius represent two distinct but necessary approaches to solving this problem within a decentralized framework. One treats alignment as an adversarial game that never ends, and the other treats it as a structured process that must be continuously refined.
The real insight is not choosing between them; it is recognizing that alignment at scale requires both pressure and interpretation, both chaos and structure.
In that sense, these subnets are not competing visions of the future. They are two halves of the same system, quietly shaping what safe and reliable AI might actually look like.