Result: Leveraging multi-agent framework for root cause analysis.

Title:
Leveraging multi-agent framework for root cause analysis.
Source:
Complex & Intelligent Systems; Jan2026, Vol. 12 Issue 1, p1-13, 13p
Database:
Complementary Index

More Information

Root cause analysis (RCA) is critical for operational stability in complex integrated systems, such as cloud-native platforms and distributed power metering infrastructures. However, achieving automated RCA faces three fundamental challenges: (1) the complex mapping between anomalies and runtime data, (2) the semantic gap between unstructured logs and structured metrics, and (3) the combinatorial explosion of causal relationships when inferring root causes. While large language models (LLMs) offer potential for automated RCA due to their superior reasoning and knowledge-linking capabilities, their susceptibility to hallucinations constrains practical deployment. Specifically, prevailing single-agent architectures exacerbate error propagation and context-switching failures during multi-step RCA reasoning, leading to incorrect root cause identification. To address these limitations, we propose MA-RCA (multi-agent root cause analysis), a collaborative framework deploying specialized agents for distinct subtasks. Each agent operates within a dedicated domain to minimize context-switching failures. Crucially, to counteract hallucinations and error propagation, MA-RCA introduces two agents: a Retrieval Agent that grounds hypotheses in external domain knowledge (e.g., historical documentation) using retrieval-augmented generation, and a Validation Agent that verifies hypotheses by executing dynamic tests against runtime data. Experimental evaluations on cloud-native platforms (Nezha, 95.2% F1) and distributed power metering infrastructures (82.8% F1) demonstrate the effectiveness of MA-RCA in automating multi-domain RCA, bridging intelligent computing with complex system resilience through agent collaboration. [ABSTRACT FROM AUTHOR]
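
The division of labor described in the abstract (candidate hypotheses, grounded by a Retrieval Agent, then checked by a Validation Agent) might be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: every name here (`Hypothesis`, `retrieval_agent`, `validation_agent`, `run_rca`) is hypothetical, a plain dictionary lookup stands in for retrieval-augmented generation, a fixed threshold check stands in for dynamic tests against runtime data, and no LLM is involved.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hypothesis:
    cause: str                      # candidate root cause label (hypothetical)
    metric: str                     # runtime metric expected to be abnormal
    threshold: float                # value above which the metric counts as abnormal
    evidence: Optional[str] = None  # supporting document attached by retrieval


def retrieval_agent(hyps: List[Hypothesis],
                    knowledge_base: Dict[str, str]) -> List[Hypothesis]:
    """Keep only hypotheses that have a matching knowledge-base entry.

    Assumption: a dict lookup replaces real retrieval-augmented generation.
    """
    grounded = []
    for h in hyps:
        doc = knowledge_base.get(h.cause)
        if doc is not None:
            grounded.append(Hypothesis(h.cause, h.metric, h.threshold, doc))
    return grounded


def validation_agent(hyps: List[Hypothesis],
                     runtime_data: Dict[str, float]) -> List[Hypothesis]:
    """Keep only hypotheses whose metric exceeds its threshold.

    Assumption: a static threshold check replaces dynamic tests;
    a metric missing from runtime_data is treated as 0.0.
    """
    return [h for h in hyps if runtime_data.get(h.metric, 0.0) > h.threshold]


def run_rca(hyps: List[Hypothesis],
            knowledge_base: Dict[str, str],
            runtime_data: Dict[str, float]) -> List[Hypothesis]:
    """Chain the two agents: ground first, then validate."""
    return validation_agent(retrieval_agent(hyps, knowledge_base), runtime_data)


# Made-up example data, for illustration only.
candidates = [
    Hypothesis("cpu_saturation", "cpu_util", 0.90),
    Hypothesis("disk_full", "disk_util", 0.95),
    Hypothesis("undocumented_cause", "unknown_metric", 0.0),
]
knowledge_base = {
    "cpu_saturation": "Past incident note: sustained CPU load caused timeouts.",
    "disk_full": "Past incident note: full disk blocked log writes.",
}
runtime_data = {"cpu_util": 0.97, "disk_util": 0.40}

surviving = run_rca(candidates, knowledge_base, runtime_data)
print([h.cause for h in surviving])
```

In this toy run, the undocumented hypothesis is dropped at the retrieval step because nothing in the knowledge base supports it, and the disk hypothesis is dropped at the validation step because the runtime metric does not exceed its threshold. Separating the two filters mirrors the abstract's point that grounding and verification are handled by distinct agents.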

Copyright of Complex & Intelligent Systems is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)