We touched on the Failure Effect before in our chat 5-Reliability made easy – RCM Jargons. Failure Effects (FE) need more elaboration because they are the basis for deciding on maintenance activities. So what goes into the failure effect section? What are its resources? And how do we get the best results out of it? First of all, what do we record failure effects for?
Failure effects are for Failure Modes. What do we mean by failure modes?
Nowadays, all machinery, regardless of its size, has its function linked directly or indirectly to rotating equipment. Even if we are discussing linear motion from a hydraulic cylinder or an air valve, the source of the hydraulic or air power is a rotating pump or compressor. So one of the common failures is the stoppage of this rotational source. The causes of this stoppage are Failure Modes. If you consider a pump, there are many reasons why it may stop rotating. A few of them are: ball bearing stuck, valve closed, motor tripped, etc. Since each of these has many causes of its own, you can consider each of them as a Failure in its own right. Then you step the failure modes down to their causes, such as lubrication (missing or wrong), looseness, coupling, heating, induced vibrations and so on.
In short, depending on the level of your analysis, you will come up with many Failure Modes for each Failure. I believe we can have a complete separate talk about Failure Modes later.
For each Failure Mode you need to come up with a Failure Effect. You might think that the Failure Effects will repeat because every Failure Mode ends up stopping the rotating drive. Well, to some extent yes, but not that much. As we shall see below, the Failure Effect of each specific Failure Mode varies somewhat. That variation is what actually holds the details and clues you are looking for.
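To make the Failure, Failure Mode and Failure Effect relationship concrete, here is a minimal sketch of how such records could be kept. The class and field names are hypothetical, and the effect texts are invented placeholders; only the three causes come from the pump example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureMode:
    cause: str   # why the failure happened
    effect: str  # what was observed when this cause occurred


@dataclass
class Failure:
    description: str
    modes: List[FailureMode] = field(default_factory=list)


# The causes are taken from the pump example; the effect texts are
# invented placeholders, not records from a real plant.
pump_stops = Failure(
    description="Pump stops rotating",
    modes=[
        FailureMode("Ball bearing stuck", "Noise and heat at the bearing before the stop"),
        FailureMode("Valve closed", "Pressure indication changes before the pump stops"),
        FailureMode("Motor tripped", "Trip indication on the panel, pump stops at once"),
    ],
)

# One Failure, several Failure Modes, one Failure Effect per mode.
for mode in pump_stops.modes:
    print(f"{pump_stops.description} <- {mode.cause}: {mode.effect}")
```

Even in this toy record, the three modes end in the same stoppage yet each carries a different effect description, which is where the useful clues sit.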
What is in the Failure Effect?
It starts with a question that seems obvious but unleashes a wealth of information and opportunities for improvement. The very first Question is:
First Question: How did you come to know that this failure happened?
The question of how the failure was detected has many possible answers. The complete process may have stopped, and the operator dug down to find which equipment initiated the stoppage. It may simply be an audible or visual indicator available to the operator. It might also be an indication on the SCADA screen. Other failures, like leakages, might be detected through inspection rounds, or through predictive rounds, like overheated electrical terminals or gas leakages. Some failures are discovered only when another failure occurs.
Examples of this are hidden failures, such as those of standby equipment or protection devices that fail while waiting for a trigger to start. Their failure is discovered only when they are triggered to take action but don’t respond. RCM (Reliability-Centered Maintenance) calls this condition a multiple failure. A multiple failure means that the original hidden failure is compounded by the failure that resulted from the device not performing its job.
The link between “How did you know?” and the need for redesign.
A typical multiple-failure scenario is a 1+1 pump system feeding a cooling tank for the process. The main running pump fails. The standby pump is not ready to start due to a failure of its own. Consequently, the cooling water tank level drops until the process stops. The process stoppage is triggered either by a low-level process interlock signal or by a high-temperature signal from the temperature sensor.
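The 1+1 pump scenario can be reduced to a small truth table. The sketch below is only an illustration: the function name and the returned texts are hypothetical, and the tank level, interlock and temperature signals are deliberately left out.

```python
def cooling_process_state(main_ok: bool, standby_ok: bool) -> str:
    """Return what the operator would see for a 1+1 pump arrangement."""
    if main_ok:
        # The standby pump is idle, so its condition changes nothing here.
        return "running on main pump"
    if standby_ok:
        return "running on standby pump"
    # Main pump failed while the standby was already in a failed state.
    return "process stopped (multiple failure)"


for main_ok in (True, False):
    for standby_ok in (True, False):
        state = cooling_process_state(main_ok, standby_ok)
        print(f"main_ok={main_ok}, standby_ok={standby_ok}: {state}")
```

Notice that while the main pump is healthy, the result is the same whether the standby pump is healthy or not. That is exactly why the standby failure is hidden: nothing tells the operator about it until the main pump fails too.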
This example and other such partial answers about the Failure Effects drive the need for a redesign. Redesign doesn’t necessarily mean reconstructing the equipment or the process, or purchasing a new one. If you manage to improve the alarm or alert system so that it is clearer, easier to understand and reflects the level of consequence, you will decrease the effect of the failure, or the effect of this specific failure mode. Now comes the second question, which reveals the consequences of this failure:
Second Question: What happened when this failure mode (cause) occurred or was detected?
This question needs a shield against biased answers. Imagine you are in charge of applying RCM or are personally leading the initiative to test the reliability of the maintenance system. You know that you are looking for concrete answers about the consequence of this failure mode. Whether it is a safety, environmental, operational or cost consequence, or a mix of them, never ask explicitly using those words.
The target of this question is to record as many facts as possible. Listen to the different stories and always be eager to get more details. Look for recorded facts first. That’s why, without a maintenance system in place that emphasizes the importance of feedback, it will be hard to catch the facts. There is an important note here: reliability analysis comes to improve an existing working maintenance system, not to start one up. If you need to start up a maintenance system, you can join one of our trainings.
What clues to look for in Question 2 answers?
Try to dig deep and be attentive to:
- Wastes generated: what kind, and how would they be treated?
- Did these wastes or effects cross the facility boundaries? Examples may be dust particles, fumes, gases, smell, sound, vibration, radiation, dirt in food or pharmaceutical products, etc.
- Was there any injury or fatality?
- Were there risks of injury or fatality? Risks can be:
  - If more gas or dust had been generated, someone might have suffered from lack of oxygen
  - A small fire occurred but was under control
  - An explosion or fire might have occurred
  - Someone might face electrocution if this leakage reached the electrical parts
  - The machine body became the source of a minor electric shock
  - Hot, cold, molten or burst material might reach someone or other equipment.
Those risks, even if they did not materialize, need to be considered as probable consequences of a future failure. The risks and facts collected enable proper monetization of the problem, which in turn validates or undermines the feasibility of the maintenance activity that is already in place or the proposed one.
But wait a moment, where do we find that information? Moreover, should we record only the failures that have already occurred? The answers to those questions and more will be in our next chat.
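One simple way to picture this monetization is to compare the yearly loss you expect with and without a maintenance task against what the task costs. The sketch below is a deliberately simplified assumption, not an RCM formula from this series: the function names are hypothetical and every number is made up for illustration.

```python
def expected_annual_loss(failures_per_year: float, cost_per_failure: float) -> float:
    """Expected yearly loss = how often it fails x what each failure costs."""
    return failures_per_year * cost_per_failure


def task_is_worth_doing(loss_without_task: float,
                        loss_with_task: float,
                        task_cost_per_year: float) -> bool:
    """A task pays off if the loss it avoids exceeds its own yearly cost."""
    return (loss_without_task - loss_with_task) > task_cost_per_year


# Made-up figures: one failure every two years, each costing 40,000,
# reduced to one failure every ten years by a task costing 6,000 per year.
loss_without = expected_annual_loss(failures_per_year=0.5, cost_per_failure=40_000)
loss_with = expected_annual_loss(failures_per_year=0.1, cost_per_failure=40_000)

worth_it = task_is_worth_doing(loss_without, loss_with, task_cost_per_year=6_000)
print("Task is worth doing:", worth_it)
```

The cost per failure is where the facts and risks collected above go in: the more complete the Failure Effect record, the less guesswork in that figure.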