Canonical solution label for systems that evaluate, diagnose, benchmark, or explain failures in AI, LLM, RAG, multimodal, or agent outputs by linking evidence chains, clustering errors, comparing model versions, and attributing likely causes. Map only when the primary product is AI-system evaluation or diagnostics; do not map generic application observability, infrastructure monitoring, or business KPI dashboards.