Detecting node failures and the Phi accrual failure detector
Partial failure is an inherent aspect of distributed systems; because the processes and the network are asynchronous, a slow node is hard to tell apart from a crashed one, which makes failure detection a complex topic.
Failure detectors usually provide a way to identify and handle failures