Multi-agent systems are prone to the failures typical of any distributed system. Agents and resources may become unavailable because of machine crashes, communication breakdowns, process failures, and numerous other hardware and software faults. Most work on fault handling in multi-agent systems deals with detecting and recovering from faults such as state inconsistencies, and relies on traditional techniques for recovering from failures in other distributed systems. However, traditional fault-tolerance techniques are designed for specific situations and require special infrastructural support. We argue for fault-tolerance techniques that can be readily implemented using generic agents, with minimal or no modification to the agent infrastructure. We propose that theories from the multi-agent systems literature can be effectively combined with basic fault-tolerance principles to design robust multi-agent systems. In particular, we argue that (1) teamwork can be used to create a robust brokered architecture that recovers a multi-agent system from broker failures without incurring undue overhead, (2) teamwork can also be used to guarantee a specified number of brokers in a large multi-agent system, and (3) agent autonomy can be used to prevent thrashing and to guarantee that an agent provides an acceptable level of quality of service. To validate our approach, we present experimental evidence using the Adaptive Agent Architecture (AAA).
|Number of pages||8|
|Publication status||Published - 3 Dec 2000|
|Event||4th International Conference on Autonomous Agents - Barcelona, Spain (3 Jun 2000 → 7 Jun 2000)|