Reliability and safety are two interrelated terms that define the overall effectiveness of any working system, be it an electrical system, a mechanical system, or a combination of both.
The concept of reliability deals with the term “failure,” which is defined as the inability of a system to perform a required function under given conditions for a given time interval. Conversely, the reliability of a system describes how well the system performs that function while mitigating the risk of failure.
Safety, on the other hand, is defined as “freedom from the risk that is not tolerable.” While safety is an important aspect of any system design, it deserves particular attention in electrical systems because the risks generally cannot be seen until they manifest themselves, unlike in a mechanical system, where the operator can often detect anomalies from appearance and behavior.
Safety vs. reliability
There is no distinct line between reliability and safety. Most systems are required to be both reliable and safe. However, the specific requirements for reliability and safety can differ or even conflict with each other, depending upon the operational context.
For example, adding a new circuit breaker or fuse to a circuit can enhance human safety. However, any protective device added to a system also introduces a new failure point, which can reduce the reliability and availability of the system.
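The trade-off can be illustrated with a simple series model. The sketch below assumes that every element in the chain must work for the circuit to deliver power and that the elements fail independently; the reliability figures are hypothetical and chosen only for illustration.

```python
from math import prod


def series_reliability(reliabilities):
    """Reliability of a chain in which every element must work.

    Assumes the elements fail independently of one another.
    """
    return prod(reliabilities)


# Hypothetical reliabilities over some fixed operating period.
circuit = [0.99, 0.98]   # e.g. supply and load wiring
breaker = 0.995          # added protective device (may fail or trip spuriously)

before = series_reliability(circuit)
after = series_reliability(circuit + [breaker])

print(f"without breaker: {before:.4f}")
print(f"with breaker:    {after:.4f}")
```

Because each reliability is below 1, adding one more element to the series chain can only lower the product, even though the same device improves safety.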
Similarly, some failures can affect reliability and safety at the same time. For example, an insulation failure in a piece of high-voltage equipment not only reduces its reliability but also creates a safety hazard with potentially fatal consequences. Some safety-critical electrical systems (such as backup emergency power at health facilities and communications systems on offshore oil-producing platforms) must be both safe and reliable to ensure human survival.
The DfR process
While failure causes can vary in nature, poor design is one cause that can have far-reaching consequences. Designers, manufacturers, and end-users strive to minimize the recurrence of failures through the six-step Design for Reliability (DfR) process shown in the Figure and outlined as follows:
1. Identify: In the first step, the system’s goals, requirements, and specifications are identified. These deliverables serve as the decision-making guide for the remainder of the process.
2. Design: This is the core step of system design. It involves decisions around component identification, material selection, engineering design (electrical, mechanical, and controls), software development, and human factors (user interfaces and interactions). Initial design typically requires several iterations based on the feedback from other steps in the process.
3. Analyze: In this step, the design is analyzed from the perspective of identifying uncertainties. This encompasses identifying the design weaknesses, improving the product robustness, and refining the understanding of operating context and environmental conditions.
4. Verify: In this step, the solution design is checked against the specifications identified in Step 1. Any shortcomings identified at this stage are fed back into the design before it is finalized.
5. Validate: This step involves checking the solution with the end-user. This includes gathering end-users’ feedback and confirming from their responses whether the system meets their expectations.
6. Control: The last step entails handover of the system to the end-user and carrying out monitoring and improvement as required for future operations.
When it comes to electrical systems, the common causes of failure include:
- Electrical stresses (high or low voltage/current)
- Thermal stresses (high temperatures)
- Mechanical stresses (overload, shock, vibration, tension, over-speed)
- Moisture/humidity
A reliable design for electrical systems should address all these dominant causes of failure in addition to the factors that influence safe operation. Some important considerations include service reliability within the specific operating context, availability requirements (e.g., redundancy), maintainability, safe electrical isolation barriers, and environmental factors.
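To make the link between availability requirements and redundancy concrete, the following sketch uses the common steady-state relation between mean time between failures (MTBF) and mean time to repair (MTTR). It assumes identical, independent units, any one of which can carry the load; the function names and figures are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single repairable unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(unit_availability, n_units):
    """Availability of n identical, independent units in parallel.

    The system is considered up if at least one unit is up.
    """
    return 1.0 - (1.0 - unit_availability) ** n_units


# Hypothetical figures for a single power supply unit.
single = availability(mtbf_hours=2000.0, mttr_hours=8.0)
dual = redundant_availability(single, n_units=2)

print(f"single unit:    {single:.6f}")
print(f"two in parallel: {dual:.6f}")
```

The model ignores common-cause failures and the reliability of any changeover device, both of which would reduce the benefit of redundancy in practice.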
Reliability design techniques
The following reliability techniques can be used to design and develop electrical systems that are safe as well as reliable:
1. Design Failure Modes and Effects Analysis (D-FMEA)
D-FMEA is a reliability technique that identifies and documents all possible failures and their potential impacts during a system design. It adopts a “bottom-up” approach of analysis, which means analysis starts from the lowest replaceable unit (LRU) up to the subsystem and system level, thus providing a level of granularity for risk assessment.
Through a failure mode analysis, it determines the effect of each design failure on the system’s operation and identifies the single points of failure critical to its reliability and safety. It also ranks each failure according to the criticality of its effect and its probability of occurrence. In this respect it works much like a risk assessment and provides a basis for prioritizing resources and cost at the design stage. Some potential failure modes include:
- Short circuits
- Open circuits
- Arc flash
- Fire and explosion
- Insulation breakdown
- Failure of safety-critical equipment to operate
- Loss of power
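The ranking step can be sketched as a small worksheet. The example below assumes a simple score formed by multiplying a severity rating by an occurrence rating, each on a hypothetical 1 to 10 scale; real D-FMEA worksheets often use additional factors and organization-specific scales, and the items and ratings shown are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    mode: str
    severity: int    # 1 (negligible) to 10 (catastrophic), hypothetical scale
    occurrence: int  # 1 (remote) to 10 (frequent), hypothetical scale

    @property
    def criticality(self) -> int:
        # Assumed scoring rule: severity multiplied by occurrence.
        return self.severity * self.occurrence


def rank(modes):
    """Return failure modes ordered from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# Hypothetical worksheet entries.
worksheet = [
    FailureMode("HV cable", "Insulation breakdown", severity=10, occurrence=3),
    FailureMode("Feeder breaker", "Failure to operate", severity=9, occurrence=2),
    FailureMode("Control relay", "Open circuit", severity=4, occurrence=5),
]

for m in rank(worksheet):
    print(f"{m.criticality:3d}  {m.item}: {m.mode}")
```

Sorting by the score puts the failure modes that most deserve design attention at the top of the list.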
2. Fault Tree Analysis (FTA)
Unlike D-FMEA, FTA is a “top-down” approach, starting from the system and working down to the subsystem and component level. It is a failure-oriented, deductive approach that represents undesirable events associated with the system and its subsystems in the form of logic gates. FTA allows safety-related events, not only reliability-related ones, to be taken as the top event. Being a top-down approach, FTA considers multiple failures and is useful for analyzing highly complex functional paths in which a combination of noncritical events may produce an undesirable critical event.
FTA is a visual tool that uses a standard set of symbols and nomenclature that give this technique wide acceptance across industries. It can be used as both a qualitative and quantitative tool.
The symbols used in FTA are logic gates (such as AND, OR, and NOT) drawn from standard Boolean logic, and they can represent the state of any electrical subsystem or component. For example, a circuit with two switches (A and B) connected in series conducts only when both are closed, so it can be portrayed as an AND gate between switch A and switch B.
FTA can be effectively used for the analysis of safety and mission-critical emergency power supplies, nuclear reactor control systems, and fire and gas protection systems.
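As a rough illustration of quantitative FTA, the sketch below evaluates a small fault tree for an emergency power supply. It assumes the basic events are independent; the tree structure, event names, and probabilities are hypothetical.

```python
from math import prod


def and_gate(*probabilities):
    """Output event occurs only if all input events occur (independent inputs)."""
    return prod(probabilities)


def or_gate(*probabilities):
    """Output event occurs if at least one input event occurs (independent inputs)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities over the period of interest.
p_mains_lost = 0.05
p_generator_fails_to_start = 0.02
p_transfer_switch_stuck = 0.01

# Intermediate event: the backup supply is unavailable if either
# the generator fails to start or the transfer switch is stuck.
p_backup_unavailable = or_gate(p_generator_fails_to_start, p_transfer_switch_stuck)

# Top event: the critical load loses power only if the mains are lost
# and the backup supply is unavailable.
p_top = and_gate(p_mains_lost, p_backup_unavailable)

print(f"P(backup unavailable)   = {p_backup_unavailable:.4f}")
print(f"P(loss of power to load) = {p_top:.5f}")
```

The same tree can be read qualitatively: the top event requires the loss of mains plus at least one backup failure, so no single basic event is sufficient to cause it.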
3. Sneak Circuit Analysis (SCA)
SCA is based upon the identification of sneak circuits that may inadvertently be designed into the system. It is a powerful analytical tool that can be used to design mission- and safety-critical systems. A sneak circuit is an unexpected path or logic flow within a system that, under certain conditions, can initiate an undesired function or inhibit the desired function. The path may consist of hardware, software, operator actions, or combinations of these elements that can impact both the reliability and safety of electrical circuits. There are four categories of sneak circuits:
- Sneak paths: These cause current or logic to flow along an unwanted path, which may initiate an undesired operation or inhibit a desired one. An example is a failure of bus coupler logic that allows the coupler to close between two unsynchronized power sources.
- Sneak timing: This occurs when events happen in an undesirable sequence. An example is an unexpected control signal sequence that results in hazardous operation of a machine.
- Sneak indications: These cause a false or ambiguous display of operating conditions and may lead the operator to take the wrong action. An example is a wrong indication of earthing switch status to a substation operator.
- Sneak labels: These are incorrect or ambiguous labels on system control functions (switching, input, output, power, etc.) that lead the operator to take an incorrect action. An example is the mislabeling of two switches, resulting in unsafe operation on a distribution system.
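The sneak path category lends itself to a simple illustration. The sketch below models a network as a graph of possible connections and lists every path between two points that the designer did not intend. It is a much-simplified path search, not a full SCA method, and the network, node names, and helper functions are hypothetical.

```python
def simple_paths(graph, start, goal, visited=()):
    """Yield every path from start to goal that visits no node twice."""
    visited = visited + (start,)
    if start == goal:
        yield visited
        return
    for neighbour in graph.get(start, ()):
        if neighbour not in visited:
            yield from simple_paths(graph, neighbour, goal, visited)


def sneak_paths(graph, start, goal, intended):
    """Return the paths from start to goal that are not in the intended set."""
    return [p for p in simple_paths(graph, start, goal) if p not in intended]


# Hypothetical two-source network with a bus coupler (all switches assumed closed).
graph = {
    "source_A": ["bus_1"],
    "source_B": ["bus_2"],
    "bus_1": ["source_A", "coupler", "load"],
    "bus_2": ["source_B", "coupler"],
    "coupler": ["bus_1", "bus_2"],
    "load": ["bus_1"],
}

# The only path the designer intends: source A feeds the load through bus 1.
intended = {("source_A", "bus_1", "load")}

# Any path that ties the two sources together is unintended.
for path in sneak_paths(graph, "source_A", "source_B", intended):
    print(" -> ".join(path))
```

In this toy network the search reports the route through the bus coupler that would parallel the two sources, which is the kind of path the coupler logic is meant to block.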
Conclusion
While the reliability and safety of electrical systems may at times have conflicting requirements, they can also complement each other, because the occurrence and consequences of many electrical failures and hazards are intertwined. By selecting the right technique for the right system, it is possible to design an electrical system that strikes a good balance between safety and reliability while meeting end-user requirements.
Bryan Christiansen is the founder and CEO of Limble CMMS in Lehi, Utah. He can be reached at [email protected].