If you're an electrical contractor or engineer, you've no doubt noticed that customers with critical power needs are increasingly demanding 24/7/365 up-time from their computer-based systems due to the computer-driven nature of today's society. Internet service providers (ISPs), call response centers, insurance providers, manufacturers, and other businesses with computer-driven key processes absolutely must remain operational.
To meet this ever-growing demand, computer hardware and software developers offer solutions that incorporate varying degrees of redundancy and fault tolerance. Hardware manufacturers like Dell Computer and IBM offer redundant and fault-tolerant fileservers. Microsoft operating systems software supports various levels of fault tolerance, mirroring redundant hardware support. Telecommunications and network equipment is available for supporting redundant LAN, WAN, Internet, and satellite communications. Although these technologies can provide some insurance, they're only as reliable as the electric utility source that powers them.
Critical, computer-heavy systems therefore need a power source that is more reliable and independent of the grid. The solution may seem simple: install an uninterruptible power supply (UPS) or standby generator. While this provides a degree of redundancy and may be enough for many businesses, it still doesn't meet the criteria for a truly redundant or fault-tolerant system. For those with ultra-critical applications, a single UPS or generator is itself a single point of failure. These customers need a truly fault-tolerant or N+1 redundant system.
You have several system configuration methods at your disposal to provide a redundant system that offers the proper level of protection. The level of redundancy you specify for an application will ultimately depend on the degree of reliability demanded and what your customer is willing to spend. Be aware that some customers may not understand their system's requirements and therefore won't be able to justify the costs associated with your recommendations, so be prepared to make your case based on hard numbers (see the sidebar, “Putting a Price on Reliability”).
So what exactly are your options? Consider the following four examples, which fall into two classes: single-UPS and multiple-UPS configurations. For consistency's sake, they all use computer-based servers as the critical load. However, the critical load could also be process control computers, test equipment, or medical devices.
Single-UPS configurations. The single-UPS, single-server configuration, as shown in Fig. 1, is the most common approach. It's used for most home computers, office workstations, and other equipment where the end-user needs limited-term backup power. This configuration offers short-term protection from a loss of electric utility power that lasts a few minutes to several hours. Keep in mind that power to the critical load will still be lost if the outage outlasts the battery backup time.
Backup times beyond a few minutes usually require the use of a true online UPS design that will accept additional battery packs. In this design, the load is always fed through the UPS. The incoming AC power is rectified into DC power, which then charges the batteries. This DC power is then inverted back into AC power to feed the load. If the incoming AC power fails, the inverter is then fed from the batteries and continues to supply the load. In addition to providing ride-through for power outages, an online UPS provides very high isolation of the critical load from all power line disturbances. However, the online operation increases system losses and may be unnecessary for protecting many loads.
Few inexpensive off-line (standby) and line-interactive UPS designs support the connection of extended battery packs. It's important to note that off-line and line-interactive UPS units pass the electric utility voltage directly through the UPS while the device is operating from the utility line, thereby subjecting the connected equipment to a far greater number of power problems. In addition, off-line (standby) UPS designs typically provide very limited surge protection and voltage regulation.
With off-line (standby) and line-interactive UPS units, the battery and inverter aren't switched over to power the critical load until the electric utility voltage is lost. This switchover to battery operation can present problems for sensitive equipment because a 4- to 50-millisecond power dropout will occur during the switchover. The CBEMA curve indicates that sensitive equipment can be expected to ride through an interruption of only about 8 milliseconds, so a longer dropout may disrupt it.
In contrast, a true online UPS regenerates totally new, clean output power 100% of the time, whether it's operating from the utility or the battery. An online UPS doesn't experience a switchover or the related output dropout.
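The switchover figures above can be compared against the ride-through limit with a few lines of Python. The sketch below is illustrative only. The 4- to 50-millisecond dropout range and the 8-millisecond limit come from the text, while the function name, the lookup table, and the idea of applying the same range to both off-line and line-interactive units are assumptions. Real transfer times should come from the manufacturer's data sheet.

```python
# Ride-through limit for sensitive equipment, as cited in the text (CBEMA curve).
RIDE_THROUGH_LIMIT_MS = 8.0

# (best case, worst case) output dropout in milliseconds when utility power fails.
# The 4-50 ms range is the figure quoted in the text; an online UPS has no switchover.
DROPOUT_MS = {
    "off-line (standby)": (4.0, 50.0),
    "line-interactive": (4.0, 50.0),
    "online": (0.0, 0.0),
}


def dropout_risk(topology: str, limit_ms: float = RIDE_THROUGH_LIMIT_MS) -> str:
    """Classify a UPS topology by comparing its dropout range with the limit."""
    best, worst = DROPOUT_MS[topology]
    if worst <= limit_ms:
        return "within limit"
    if best <= limit_ms:
        return "depends on the unit"
    return "exceeds limit"


for name in DROPOUT_MS:
    print(f"{name}: {dropout_risk(name)}")
```

The middle result is the practical point: a standby or line-interactive unit at the fast end of the range may be acceptable, so the specific unit's rated transfer time matters more than the topology label.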
For ultra-critical applications, the single UPS represents a single point of failure because if it fails, power will be lost. Most UPS products on the market have an internal bypass mode so that in the event of a UPS failure, the UPS will attempt to connect the critical load directly to the electric utility feed, if available. However, this emergency bypass transfer isn't guaranteed. Some UPS failures may cause internal protection devices like fuses or circuit breakers to open, resulting in a loss of power to the critical load.
UPS battery failures are another area of concern. Because they're electrochemical devices, batteries can and do fail without warning. If the UPS battery fails during an electric utility power outage, the critical load will be dropped without warning.
Many UPS manufacturers try to combat this problem by incorporating battery test circuitry into their products, but it isn't 100% effective because many internal battery failure modes are difficult to detect. Improving this circuitry would require very costly smart batteries and sophisticated monitoring and test circuitry, which would drive the price of the UPS beyond an acceptable market price and still not solve the single-point-of-failure issue.
Another area of concern is UPS servicing. In most cases, servicing the UPS requires powering down the unit, which in turn means powering down the critical load. This may be acceptable for a home or office computer, but it could be a disaster for an ISP. You can solve this problem by installing an external maintenance bypass switch box in the power distribution system (Fig. 2). Most online UPS manufacturers offer this as an option. This device allows you to switch the critical load to electric utility or generator power while you remove the UPS for service.
This is a good, cost-effective solution, and it's ideal for applications that require limited power redundancy. However, you must be careful when selecting a maintenance bypass switch because some, such as break-before-make types, will disrupt the power to the connected equipment for as long as 50 milliseconds during the manual switchover. If you find that the equipment is sensitive to this type of power disruption, you should recommend a “no-break” or “make-before-break” type external maintenance bypass switch.
Multiple-UPS configurations. Some applications are so critical that they require redundant systems. To meet the demands of this growing market, fully N+1 type redundant fileservers, computers, and operating systems software are available. To ensure this level of redundancy, you can configure two separate servers or computers in a mirroring configuration. Server operating systems software is available that supports separate server mirroring via redundant network connections.
But be careful because many so-called “redundant” servers are packaged in a single enclosure and may not be 100% (N+1) redundant. In fact, they often share common processors, internal power busses, disk subsystems, and controllers. These common elements represent single points of failure and require you to power down the entire server for servicing. A better designation for this class of products would be “fault-tolerant,” but some are presented as redundant. The same holds true for some UPS units that are advertised as truly redundant.
Fault-tolerant servers are often available with multiple power supplies and DC bus isolation circuitry, which allows for the connection of multiple UPS units. For some applications that don't require true overall system redundancy, the connection of multiple UPS units solves the UPS servicing and short-term power redundancy problems. Because two UPS units are connected to the server, you also double the amount of battery backup time (Fig. 3).
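As a rough check on the doubling of backup time, the arithmetic can be sketched in Python. This is an idealized estimate, not a manufacturer's runtime calculation. It assumes runtime is simply usable battery energy times inverter efficiency divided by load, and that the server's two power supplies share the load equally. It also ignores the fact that real batteries tend to deliver more energy at lighter loads. The 500-Wh battery, 400-W load, and 90% efficiency figures are hypothetical.

```python
def runtime_minutes(battery_wh: float, load_w: float, inverter_eff: float = 0.9) -> float:
    """Idealized backup time: usable battery energy divided by the load."""
    return battery_wh * inverter_eff / load_w * 60.0


SERVER_LOAD_W = 400.0  # hypothetical server load
BATTERY_WH = 500.0     # hypothetical usable battery energy per UPS

# One UPS carries the whole server load.
single_ups = runtime_minutes(BATTERY_WH, SERVER_LOAD_W)

# Two UPS units, each feeding one power supply and carrying half the load
# (assumes equal load sharing between the server's power supplies).
dual_ups = runtime_minutes(BATTERY_WH, SERVER_LOAD_W / 2)

print(f"single UPS: {single_ups:.1f} min")
print(f"two UPS units: {dual_ups:.1f} min")
print(f"ratio: {dual_ups / single_ups:.1f}x")
```

Under these assumptions the ratio is exactly two regardless of the numbers chosen. In practice the gain depends on how the power supplies actually share load and on the battery's discharge characteristics.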
When the system absolutely has to be up and available 100% of the time, it's best to use two identical servers configured with a mirroring operating system and powered by two UPS units (Fig. 4). This is a practical and simple method to ensure N+1 redundancy. It eliminates single-point-of-failure and servicing issues.
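The benefit of removing single points of failure can be illustrated with basic availability arithmetic: components that must all work (series) multiply their availabilities, while redundant paths (parallel) fail only if every path fails. The sketch below is a simplification that assumes independent failures and uses hypothetical availability figures for a UPS and a server. It isn't based on data for any particular product, and the function names are illustrative.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components must work: availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Redundant paths: the system is down only if every path is down."""
    return 1.0 - prod(1.0 - a for a in availabilities)


UPS = 0.999     # hypothetical availability of one UPS
SERVER = 0.999  # hypothetical availability of one server

# One UPS feeding one server (the single-UPS case).
single_path = series(UPS, SERVER)
# Two UPS units feeding one server with dual power supplies.
dual_ups = series(parallel(UPS, UPS), SERVER)
# Two mirrored servers, each with its own UPS.
mirrored = parallel(series(UPS, SERVER), series(UPS, SERVER))

HOURS_PER_YEAR = 8760
for label, a in [("single UPS, single server", single_path),
                 ("two UPS units, single server", dual_ups),
                 ("two UPS units, mirrored servers", mirrored)]:
    downtime_h = (1.0 - a) * HOURS_PER_YEAR
    print(f"{label}: availability {a:.6f}, about {downtime_h:.2f} h/yr down")
```

With these placeholder figures, adding a second UPS helps only until the single server becomes the limiting element. The mirrored arrangement is the one that removes the remaining series component.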
You can also eliminate single points of failure presented by the building's electrical system by always powering multiple UPS units from different dedicated electrical circuits, each protected by its own circuit breaker. Thus, an equipment failure on one circuit won't affect equipment operation on the other. However, avoid connecting the UPS units to two independent service panels powered from different transformers because such connections can cause unwanted common-mode noise and currents to flow on equipment ground connections and may result in network or data reliability problems.
Many businesses can get away with employing limited means of backup power, but only because their information infrastructure doesn't demand 24/7/365 availability. For others, though, the slightest outage can cripple operations and cost millions. Numerous redundant configurations exist for the varying needs of today's critical power systems, and finding the right one for your customers will go a long way toward helping them ride through those costly seconds and minutes of downtime.
Stout is engineering manager with Falcon Electric, Inc. in Irwindale, Calif.
Sidebar: Putting a Price on Reliability
It's not always easy to help a customer justify the expense of a redundant system, but the following factors can help you quantify the cost of downtime:
Risk, liability, or litigation — The cost of litigation, even if your customer successfully defends itself, can be substantial. In the case of liability, the cost can be enormous, depending on contractual obligations.
Scrapped materials — This cost can be significant in industries where both the manufacturing process and product quality are extremely dependent on power reliability.
Customer dissatisfaction — Although difficult to quantify, this factor can create a negative perception of your customer's business among its current and future clients.
Lost productivity — Even if your customer's business shuts down due to lost power, overhead costs continue and compound the resulting loss of revenue.
Customer safety — In some manufacturing processes, such as crane operation in steel production, power loss can create real safety issues for your customer's employees.
Contractual, agency, or governmental requirements — Significant losses can result from liquidated damage clauses in contracts for failing to meet specific deadlines as a result of power failures or service disruption.
Lost sales or clients — Your client's inability to meet its customers' demands can drive them to competitors, resulting in possibly permanent lost sales.
Bad will or lost stock value — Good will is difficult to quantify and takes years of good performance to achieve, yet loss of sales and/or clients from a single power quality event can negate your customer's hard work.
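One way to turn the factors above into the hard numbers a customer needs is a simple expected-cost comparison. The Python sketch below is only an illustration. The cost categories loosely follow this sidebar, but every dollar figure, the outage rates, and the function name are hypothetical placeholders to be replaced with the customer's own estimates.

```python
def annual_downtime_cost(outages_per_year: float,
                         hours_per_outage: float,
                         hourly_costs: dict,
                         per_event_costs: dict) -> float:
    """Expected yearly cost: time-based losses plus fixed losses per outage."""
    per_outage = (hours_per_outage * sum(hourly_costs.values())
                  + sum(per_event_costs.values()))
    return outages_per_year * per_outage


# Hypothetical costs that accrue for every hour the business is down.
HOURLY = {"lost productivity": 5_000.0, "lost sales": 12_000.0}
# Hypothetical costs incurred once per outage, however long it lasts.
PER_EVENT = {"scrapped materials": 20_000.0, "contract penalties": 10_000.0}

# Hypothetical outage rates without and with a redundant power system.
baseline = annual_downtime_cost(2.0, 1.5, HOURLY, PER_EVENT)
with_redundancy = annual_downtime_cost(0.2, 1.5, HOURLY, PER_EVENT)
avoided = baseline - with_redundancy

print(f"expected annual cost without redundancy: ${baseline:,.0f}")
print(f"expected annual cost with redundancy:    ${with_redundancy:,.0f}")
print(f"expected annual cost avoided:            ${avoided:,.0f}")
```

Comparing the avoided cost with the annualized price of the redundant system gives the customer a concrete basis for the decision. Factors that are hard to price, such as safety and good will, still need to be weighed separately.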