Fault tolerance: Difference between revisions

From Citizendium
imported>Howard C. Berkowitz
Revision as of 17:29, 5 June 2009

This article is a stub and thus not approved.

In engineering, fault tolerance is the property of a system that keeps operating when one or more of its subcomponents fail. Fault tolerance does not imply that the system has no single point of failure, only that at least some parts can fail without stopping operation. A system might survive the failure of an individual component, for example, yet have no backup against complete physical destruction.

When components are lost, the system may lose some performance or functionality. Alternatively, it can be engineered with enough reserve capacity that users do not notice some failures.
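The idea of reserve capacity can be illustrated with a little arithmetic. The sketch below is only an illustration under simplifying assumptions: all units are identical, load can be shared freely among them, and the function name `survivable_failures` and its parameters are hypothetical, not taken from any standard.

```python
import math


def survivable_failures(units: int, unit_capacity: float, peak_load: float) -> int:
    """Return how many units can fail while the rest still carry peak_load.

    Assumes identical units and perfectly divisible load (a simplification).
    """
    if units < 0 or unit_capacity <= 0 or peak_load < 0:
        raise ValueError("units and peak_load must be >= 0, unit_capacity > 0")
    # Smallest number of working units able to carry the peak load.
    needed = math.ceil(peak_load / unit_capacity)
    # Units beyond that number are reserve; never report a negative reserve.
    return max(units - needed, 0)


# Five units of capacity 100 serving a peak load of 300 leave two in reserve.
reserve = survivable_failures(units=5, unit_capacity=100, peak_load=300)
```

With exactly three such units there is no reserve, so any single failure would be visible to users as degraded service.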

Simply duplicating components does not make a system fault-tolerant; there must also be an intelligent mechanism that routes work around the failed component.
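A minimal sketch of such a mechanism follows, assuming each duplicated component is modeled as a callable that either returns a result or raises an exception. The names `run_with_failover`, `AllReplicasFailed`, `broken_replica`, and `healthy_replica` are hypothetical and chosen only for illustration; real systems typically add health checks, timeouts, and load balancing.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllReplicasFailed(Exception):
    """Raised when every replica has failed; carries the individual errors."""


def run_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # treat any exception as a component fault
            errors.append(exc)
    raise AllReplicasFailed(errors)


def broken_replica() -> str:
    raise RuntimeError("component failure")


def healthy_replica() -> str:
    return "ok"


# The work is routed past the failed component to the healthy one.
result = run_with_failover([broken_replica, healthy_replica])
```

Without the dispatching function, the second replica would sit idle while the first one failed, which is why duplication alone is not enough.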