Capability Certifier
The Capability Certifier tests and certifies AI agent capabilities using statistically rigorous multi-trial evaluation. It issues W3C Verifiable Credentials backed by Ed25519 signatures, providing cryptographic proof that an agent meets claimed capability standards.
How It Works
Certification Job Created
A certification request is submitted for a specific agent and capability. The system creates a job with a configurable number of test trials; by default, each job runs multiple trials.
Multi-Trial Testing
Jobs are processed inline via cron-triggered scheduling and the /api/queue/cert-trial endpoint. Each trial evaluates the agent's capability independently, building a statistical sample.
Wilson CI Scoring
Results are aggregated using the Wilson confidence interval method — computing a 95% CI lower bound that accounts for sample size, giving a conservative but fair assessment.
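To make the scoring step concrete, here is a minimal sketch of the lower bound of a 95% Wilson score interval, computed from a count of passed trials. The function name and the idea of treating each trial as a simple pass/fail outcome are assumptions for illustration; the service's actual aggregation may differ.

```python
import math


def wilson_lower_bound(successes: int, trials: int, z: float = 1.959964) -> float:
    """Lower bound of the two-sided Wilson score interval for a pass rate.

    z = 1.959964 corresponds to a 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p_hat = successes / trials          # observed pass rate
    z2 = z * z
    denom = 1 + z2 / trials
    centre = p_hat + z2 / (2 * trials)  # interval centre before normalisation
    margin = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials))
    return max(0.0, (centre - margin) / denom)
```

With the same 90% observed pass rate, 9 of 10 trials gives a lower bound of roughly 0.60, while 90 of 100 gives roughly 0.83. The larger sample earns the higher score, which is the sample-size effect described above.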
Credential Issuance
Passing agents receive a W3C Verifiable Credential signed with Ed25519, including the score, grade, confidence interval, and a Bitstring Status List revocation reference.
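As a rough illustration of what such a credential might contain, the sketch below assembles an unsigned credential body as a Python dictionary. The top-level property names and the `BitstringStatusListEntry` fields follow the W3C Verifiable Credentials Data Model 2.0 and Bitstring Status List specifications. Everything else is a hypothetical placeholder: the `CapabilityCertificate` type, the `credentialSubject` field names, the issuer DID, and the example.com URLs. The Ed25519 proof is omitted.

```python
import json


def build_capability_credential(agent_id: str, capability: str, score: float,
                                grade: str, ci_low: float, ci_high: float,
                                status_index: int, issued_at: str) -> dict:
    """Assemble an unsigned credential body (illustrative field layout only)."""
    status_list_url = "https://example.com/status/1"  # hypothetical URL
    return {
        "@context": ["https://www.w3.org/ns/credentials/v2"],
        # The second type is a hypothetical name, not the service's real one.
        "type": ["VerifiableCredential", "CapabilityCertificate"],
        "issuer": "did:example:certifier",  # hypothetical issuer identifier
        "validFrom": issued_at,
        "credentialSubject": {
            # Hypothetical subject fields carrying the scoring results.
            "id": agent_id,
            "capability": capability,
            "score": score,
            "grade": grade,
            "confidenceInterval": {"level": 0.95, "lower": ci_low, "upper": ci_high},
        },
        "credentialStatus": {
            "id": f"{status_list_url}#{status_index}",
            "type": "BitstringStatusListEntry",
            "statusPurpose": "revocation",
            "statusListIndex": str(status_index),
            "statusListCredential": status_list_url,
        },
    }


credential = build_capability_credential(
    "did:example:agent-42", "summarization", 0.83, "B",
    0.83, 0.95, 1024, "2025-01-01T00:00:00Z",
)
credential_json = json.dumps(credential, indent=2)
```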
Key Features
Wilson Confidence Interval
Statistical scoring that accounts for sample size — small trial counts get wider confidence intervals, preventing inflated scores from limited data.
Grading System
Agents receive letter grades based on their CI lower bound score, making capability levels immediately interpretable by consuming agents.
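The page does not publish the grade boundaries, so the following is only a sketch of how a CI lower bound could be mapped to a letter grade. The threshold values and the A-to-F scale are hypothetical and chosen purely for illustration.

```python
# Hypothetical thresholds: the real grade boundaries are not documented here.
HYPOTHETICAL_GRADE_THRESHOLDS = [
    (0.90, "A"),
    (0.80, "B"),
    (0.70, "C"),
    (0.60, "D"),
]


def grade_from_lower_bound(ci_lower: float) -> str:
    """Map a CI lower bound in [0, 1] to a letter grade."""
    if not 0.0 <= ci_lower <= 1.0:
        raise ValueError("ci_lower must be between 0 and 1")
    # Thresholds are sorted from highest to lowest; the first match wins.
    for threshold, grade in HYPOTHETICAL_GRADE_THRESHOLDS:
        if ci_lower >= threshold:
            return grade
    return "F"
```

Grading on the lower bound rather than the raw pass rate means an agent with few trials cannot reach a top grade on a lucky streak alone.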
Ed25519 Signatures
Every certificate is signed using versioned Ed25519 keys with HMAC envelope protection — independently verifiable by any third party.
Revocation Support
Certificates include a Bitstring Status List reference. Compromised or stale credentials can be revoked instantly via the A2A certificate.revoke action.
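For readers unfamiliar with status lists, the sketch below shows how a verifier could read one bit from a Bitstring Status List. It assumes the encoding described in the W3C Bitstring Status List specification: a GZIP-compressed bitstring, base64url-encoded with a multibase `u` prefix, where index 0 is the left-most bit. The helper names are hypothetical, and the demo list is far smaller than a real one.

```python
import base64
import gzip


def encode_status_list(bits: bytes) -> str:
    """GZIP-compress a bitstring and encode it as multibase base64url (no padding)."""
    compressed = gzip.compress(bits)
    return "u" + base64.urlsafe_b64encode(compressed).decode("ascii").rstrip("=")


def set_bit(bits: bytearray, index: int) -> None:
    """Mark the credential at `index` as revoked (index 0 is the left-most bit)."""
    bits[index // 8] |= 0x80 >> (index % 8)


def is_revoked(encoded_list: str, index: int) -> bool:
    """Return True if the bit at `index` is set in the encoded status list."""
    if not encoded_list.startswith("u"):
        raise ValueError("expected a multibase base64url string starting with 'u'")
    if index < 0:
        raise IndexError("index must be non-negative")
    payload = encoded_list[1:]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    bits = gzip.decompress(base64.urlsafe_b64decode(payload))
    byte_pos, bit_pos = divmod(index, 8)
    if byte_pos >= len(bits):
        raise IndexError("index is outside the status list")
    return bool(bits[byte_pos] & (0x80 >> bit_pos))


# Demo: a tiny 128-bit list with entry 10 revoked.
demo_bits = bytearray(16)
set_bit(demo_bits, 10)
demo_encoded = encode_status_list(bytes(demo_bits))
```

Because revocation only flips a single bit in a shared list, a status check does not reveal to the issuer which individual certificate a verifier is inspecting.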
A2A Protocol Actions
The Certifier is accessible via the A2A JSON-RPC protocol at /a2a:
- start: Submit a certification request for an agent's capability. Creates a job with multi-trial testing.
- status: Check the status and progress of a certification job by job ID.
- certificate: Retrieve a completed certificate by cert ID. Includes the score, grade, and W3C VC.
- certificate.revoke: Revoke a previously issued certificate, with an optional reason.
- certificate.check: Verify a certificate's revocation status via the Bitstring Status List.
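A call to one of these actions could be framed as a JSON-RPC 2.0 request along the following lines. This is a sketch under stated assumptions: it assumes the action name is used directly as the JSON-RPC `method`, and the parameter names (`certId`, `reason`) are hypothetical, since the page does not document the request schema.

```python
import itertools
import json

# Action names as listed for the /a2a endpoint.
KNOWN_ACTIONS = {
    "start",
    "status",
    "certificate",
    "certificate.revoke",
    "certificate.check",
}

_request_ids = itertools.count(1)


def build_a2a_request(action: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request body for one of the listed actions."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    envelope = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": action,   # assumption: action name doubles as the method
        "params": params,   # assumption: parameter names are illustrative
    }
    return json.dumps(envelope)


# Hypothetical parameters for revoking a certificate with a reason.
revoke_body = build_a2a_request(
    "certificate.revoke",
    {"certId": "cert-123", "reason": "signing key compromised"},
)
```

The resulting string would be sent as the body of an HTTP POST to /a2a; authentication and transport details are outside the scope of this sketch.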
Evidence Storage
Test artifacts, trial results, and certification evidence are stored in the KYM_NANDA_EVIDENCE R2 bucket — providing a durable audit trail for every certification decision.
Integration with Other Services
- Agent Registry — agents must be registered before certification
- Compliance Enforcer — compliance attestations may reference certification grades
- Observer Evaluator — reputation scores factor in certification level
- Trust & Security — Ed25519 signing and Bitstring Status List revocation infrastructure