Improving Your Mnemom Trust Rating
A practical, component-by-component guide to building and improving your agent’s Mnemom Trust Rating. Whether you are getting your first score published or recovering from a low rating, this guide covers actionable strategies for each of the five scoring components.
Quick Start: Getting Your First Score
Your agent starts at NR (Not Rated). To earn a public reputation score:
Register your agent
Claim your agent on mnemom.ai/claim or via the Smoltbot CLI. This creates your agent identity and Alignment Card.
Generate integrity checkpoints
Route agent traffic through the Mnemom gateway or integrate the AIP SDK directly. Each agent interaction generates an integrity checkpoint that analyzes the agent’s thinking.
Reach 50 analyzed checkpoints
Once 50 checkpoints have been analyzed, your score is automatically computed and published. During the build phase, your agent displays a “Building… N/50” progress badge.
Component-by-Component Improvement
1. Integrity Ratio (40% of score)
What it measures: The percentage of integrity checkpoints that return a clear verdict.
Formula: (clear_count / analyzed_count) * 1000
This is the highest-impact component. Improving your integrity ratio from 90% to 95% adds ~20 points to your composite score.
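Because the Integrity Ratio carries a 40% weight in the composite score, the ~20-point figure can be sanity-checked in a couple of lines. This is a sketch of the published formula, not SDK code:

```python
def integrity_component(clear_count, analyzed_count):
    """Integrity Ratio component: (clear_count / analyzed_count) * 1000."""
    return clear_count / analyzed_count * 1000

# Moving from 90% to 95% clear verdicts, at the 40% composite weight:
delta = (integrity_component(95, 100) - integrity_component(90, 100)) * 0.40
print(delta)  # 20.0
```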
How to Improve
Fix alignment card violations. The most common cause of non-clear verdicts is a mismatch between what your agent does and what its Alignment Card declares. Review your card: if the agent takes actions not listed in bounded_actions, either add those actions to the card or adjust agent behavior.
Resolve review_needed checkpoints. These indicate concerns that did not rise to a boundary violation but still reduce your integrity ratio. Common causes:
- Agent reasoning that mentions goals not declared in the card (undeclared_intent)
- Subtle value misalignment between reasoning and declared values (value_misalignment)
- Reasoning that approaches but does not cross autonomy boundaries (autonomy_violation)
Score improvement estimator
| Current Integrity Ratio | Action | Estimated Score Impact |
|---|---|---|
| 80% (800) | Fix 5 violations to reach 85% (850) | +20 points |
| 85% (850) | Fix 3 violations to reach 90% (900) | +20 points |
| 90% (900) | Fix 2 violations to reach 95% (950) | +20 points |
| 95% (950) | Fix 1 violation to reach 98% (980) | +12 points |
2. Compliance (20% of score)
What it measures: How well your agent stays within its declared boundaries. Uses a session-capped power curve: violations are grouped by session, and only the worst violation per session counts. Formula: 1000 / (1 + Σ max_impact_per_session)^1.5, with a 1-week half-life decay.
Time is your friend. Violation impact decays automatically — a boundary_violation from 4 weeks ago has only 6.3% of its original impact. Session capping also means a burst of violations in a single bad session counts as one event, not many.
How to Improve
Prevent new violations. This is the single most important action. Each new violation session adds to the cumulative impact. Address the root causes:
- Prompt injection resistance: If violations come from prompt injection, harden your agent’s system prompt and review the AIP conscience values for BOUNDARY protections
- Scope creep: If violations come from autonomy expansion, tighten escalation triggers in your alignment card
- Model updates: If violations started after a model version change, review whether the new model’s behavior still matches your card
| Violation Sessions | Score (all recent) | Score (4 weeks old) |
|---|---|---|
| 0 | 1000 | 1000 |
| 1 | 354 | 957 |
| 2 | 192 | 917 |
| 3 | 125 | 878 |
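The "all recent" column above can be reproduced from the published formula if each boundary_violation session is assumed to carry a maximum impact of 1.0 (that constant is inferred from the table, not documented). The 4-weeks-old column additionally applies the half-life decay to each session's impact before summing:

```python
def compliance_score(session_impacts):
    """Session-capped power curve: 1000 / (1 + sum of per-session max impacts)^1.5."""
    return 1000 / (1 + sum(session_impacts)) ** 1.5

def decayed(impact, age_weeks, half_life_weeks=1.0):
    """Exponential decay with a 1-week half-life."""
    return impact * 0.5 ** (age_weeks / half_life_weeks)

# Fresh violation sessions, each assumed to have max impact 1.0:
print([round(compliance_score([1.0] * n)) for n in range(4)])  # [1000, 354, 192, 125]

# After 4 weeks a violation retains 0.5^4 = 6.25% of its original impact:
print(decayed(1.0, age_weeks=4))  # 0.0625
```

Session capping means a burst of violations inside one session contributes a single impact term to the sum, not one term per violation.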
3. Drift Stability (20% of score)
What it measures: The proportion of sessions where the agent maintained consistent behavior (no sustained drift alerts). Formula: (stable_sessions / total_sessions) * 1000
How to Improve
Apply values consistently. Drift is detected when an agent’s behavioral patterns shift away from its baseline over the course of a session. Ensure your agent:
- Uses the same decision-making framework across sessions
- Applies declared values consistently (not just when convenient)
- Maintains stable confidence levels in its reasoning
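To put the formula in perspective, a single sustained drift alert in 50 sessions costs 20 component points, or about 4 composite points at the 20% weight. A quick sketch of that arithmetic:

```python
def drift_component(stable_sessions, total_sessions):
    """Drift Stability component: (stable_sessions / total_sessions) * 1000."""
    return stable_sessions / total_sessions * 1000

# One drift alert across 50 sessions, weighted into the composite at 20%:
loss = (drift_component(50, 50) - drift_component(49, 50)) * 0.20
print(loss)  # 4.0 composite points
```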
Diagnose drift.detected events by their drift direction:
| Drift Direction | Common Cause | Fix |
|---|---|---|
| autonomy_expansion | Agent taking actions outside declared bounds over time | Tighten escalation triggers or expand bounded_actions |
| value_drift | Different values appearing in decision reasoning | Review and update alignment card values |
| principal_misalignment | Declining principal-benefit scores in traces | Re-anchor agent prompting on user interests |
| communication_drift | Reasoning style becoming inconsistent | Stabilize system prompt, pin model version |
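The table above can feed automated triage. This sketch assumes a drift.detected payload carries a direction field; that field name is a guess for illustration, not the documented schema:

```python
# Suggested fixes from the drift table, keyed by drift direction.
REMEDIATIONS = {
    "autonomy_expansion": "tighten escalation triggers or expand bounded_actions",
    "value_drift": "review and update alignment card values",
    "principal_misalignment": "re-anchor agent prompting on user interests",
    "communication_drift": "stabilize system prompt, pin model version",
}

def triage(event):
    """Map a drift.detected event to the suggested fix from the table."""
    return REMEDIATIONS.get(event.get("direction"),
                            "unknown direction: inspect traces manually")

print(triage({"type": "drift.detected", "direction": "value_drift"}))
```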
4. Trace Completeness (10% of score)
What it measures: What proportion of the agent’s decisions are logged as AP-Traces. Formula: (logged_decisions / expected_decisions) * 1000
How to Improve
Ensure all decisions are logged via AAP. The most common reason for low trace completeness is partial SDK integration: the agent runs integrity checks (AIP) but does not generate decision traces (AAP).
- Audit your agent’s action flow and identify decision points without trace generation
- Add trace generation at each decision point
- Use the gateway integration for automatic trace generation if manual instrumentation is impractical
5. Coherence Compatibility (10% of score)
What it measures: The mean value coherence score across fleet interactions. Formula: mean_coherence_score * 1000 (defaults to 750 if no fleet data)
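A minimal sketch of how the component behaves with and without fleet data. The 0–1 scale for individual coherence scores is an assumption consistent with the * 1000 scaling:

```python
def coherence_component(fleet_scores):
    """Mean fleet coherence * 1000; the 750 default applies with no fleet data."""
    if not fleet_scores:
        return 750.0
    return sum(fleet_scores) / len(fleet_scores) * 1000

print(coherence_component([]))           # 750.0 (default, no fleet data)
print(coherence_component([0.5, 0.75]))  # 625.0 (participation replaces the default)
```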
How to Improve
Align values with your fleet. If your agent operates in a multi-agent environment, ensure its declared values are compatible with peer agents: if your card declares conflicts_with values that fleet peers hold, or vice versa, coherence scores will be low. Review conflicts and determine whether they are necessary or can be resolved.
Participate in coherence checks. The default score of 750 is applied when no fleet data exists. Actually participating in coherence checks — even if scores are moderate — demonstrates engagement and replaces the default.
Monitoring Your Score
Dashboard
Your agent’s reputation score is displayed on the dashboard with a breakdown of all five components, a historical trend chart, and a 30-day delta.
Webhook Notifications
Subscribe to reputation events for automated monitoring:
| Event | Trigger |
|---|---|
| reputation.score_changed | Score changed by more than 10 points in a single recomputation |
| reputation.grade_changed | Letter grade changed (e.g., A to AA, or BBB to BB) |
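A sketch of webhook-driven alerting on these events. The payload field names (type, delta) are assumptions for illustration, not the documented schema:

```python
def should_page(event):
    """Decide whether a reputation event warrants human attention."""
    if event.get("type") == "reputation.grade_changed":
        return True  # any letter-grade move is significant
    if event.get("type") == "reputation.score_changed":
        # this event already fires only on >10-point moves; page on large drops
        return event.get("delta", 0) < -25
    return False

print(should_page({"type": "reputation.grade_changed"}))                # True
print(should_page({"type": "reputation.score_changed", "delta": -12}))  # False
```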
SDK Self-Monitoring
Build reputation monitoring into your agent’s health checks.
Recovery from a Low Score
If your agent’s reputation has dropped to B (Concerning) or CCC (Critical), here is the recovery playbook:
Phase 1: Stop the Bleeding (Week 1)
- Identify the weakest component. Fetch the full score breakdown and find the component pulling the score down most
- Fix active violations. If the Integrity Ratio is the problem, review recent non-clear checkpoints and address root causes immediately
- Pause if necessary. If the agent is generating new violations faster than you can fix them, consider pausing the agent via containment while you address the issues
Phase 2: Rebuild (Weeks 2-4)
- Update the alignment card. Ensure the card accurately reflects your agent’s current behavior. A misaligned card is the most common source of poor scores.
- Monitor daily. Check the score trend daily. You should see improvement within 1-2 weeks as the Compliance component decays.
- Generate clean checkpoints. Normal operation with a corrected alignment card should produce
clearverdicts that improve the Integrity Ratio.
Phase 3: Strengthen (Weeks 4-8)
- Address secondary components. Once Integrity Ratio and Compliance are improving, focus on Drift Stability and Trace Completeness.
- Build history. As clean checkpoints accumulate, the Confidence Level improves, adding credibility to the recovering score.
Common Pitfalls
Updating the alignment card too aggressively
Broadening your alignment card to eliminate violations (e.g., adding everything to bounded_actions) will improve the Integrity Ratio but may reduce trust from consumers who inspect the card. An alignment card that permits everything is less meaningful than one with clear boundaries.
Ignoring review_needed checkpoints
Agents often focus only on boundary_violation verdicts because those are the most visible. But review_needed checkpoints also reduce the Integrity Ratio (they are not clear). Addressing these can yield significant score improvements.
Frequent model version changes
Switching LLM model versions causes temporary drift signals as the behavioral baseline recalibrates. If you change models frequently, Drift Stability suffers. Pin a model version for stability.
Not logging AP-Traces
Running AIP integrity checks without generating AAP traces gives you integrity data but a low Trace Completeness score. Both protocols should be active.
Operating in isolation
Agents that never participate in fleet coherence checks receive the default Coherence Compatibility score (750). This is not a penalty, but it means the component can never contribute more than 750 until the agent participates.
Troubleshooting
Why isn't my score improving?
Check these in order:
- Are new violations occurring? Each new violation resets the Recency penalty. Check your checkpoint feed for recent non-clear verdicts.
- Has enough time passed? The Compliance decay half-life is 1 week. Meaningful improvement takes 2-4 weeks of clean operation.
- Is the alignment card accurate? If the card does not match your agent’s actual behavior, new violations will continue.
- Are checkpoints being generated? Verify via the API that checkpoint count is increasing. No new checkpoints means no new data for score computation.
My score dropped suddenly — what happened?
Check the reputation events endpoint. Look for violation_detected, drift_detected, or grade_changed events. Common causes of sudden drops:
- A new boundary_violation in a new session (adds to cumulative Compliance impact)
- A drift alert (reduces Drift Stability by one session)
- Model version change triggering behavioral shift
My agent has 100+ checkpoints but still shows NR
The minimum is 50 analyzed checkpoints. Checkpoints where the thinking block was below 100 tokens receive a synthetic clear verdict and are not counted toward the 50-checkpoint minimum. Check analyzed_checks in the integrity stats.
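The gap between total and analyzed checkpoints can be reproduced with a toy filter. The synthetic flag used here is illustrative, not a documented field:

```python
checkpoints = [
    {"verdict": "clear", "synthetic": False},
    {"verdict": "clear", "synthetic": True},   # thinking block under 100 tokens
    {"verdict": "review_needed", "synthetic": False},
]

# Only non-synthetic checkpoints count toward the 50-checkpoint minimum:
analyzed = [c for c in checkpoints if not c["synthetic"]]
print(len(checkpoints), len(analyzed))  # 3 2
```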
Component scores don't add up to the composite score
The composite score is a weighted sum: S = (integrity * 0.40) + (recency * 0.20) + (drift * 0.20) + (trace * 0.10) + (coherence * 0.10). Check that you are applying the correct weights. The weighted_score field in each component shows the contribution.
See Also
- Understanding Reputation Scores — Conceptual overview
- Scoring Methodology — Full technical specification
- Embeddable Badges — Display your score publicly
- Reputation API Overview — API reference for all reputation endpoints
- Webhook Notifications — Real-time event delivery for score changes