As we continue this series, Reliability Made Easy, we keep simplifying reliability and its implementation. In this chat we shall walk through the main jargon used in RCM (Reliability-Centered Maintenance). These terms are essential to understanding how RCM is applied. It is similar to picking up any new hobby, skill or language: when you don’t understand the basic vocabulary used within that community, it is hard to follow what its members mean. Even when someone tries to explain it to you, without the correct definitions you won’t get it right.
Another important tip is to “Begin with the end in mind”, the second habit in Stephen Covey’s famous “The 7 Habits of Highly Effective People”. The end point of understanding RCM is highly reliable equipment. What stands behind reliable equipment is a maintenance program that you can rely on. To achieve this maintenance program, you need to analyze your current maintenance performance the RCM way. The RCM way ends up in two fundamental sheets: the Information Worksheet and the Decision Worksheet.
Information vs Decision Worksheets
The Information Worksheet prompts you to think carefully about your equipment and its functions in the following ways:
- How to divide the equipment into systems and their subsystems
- What the required main (primary) and secondary functions of each system and its individual equipment are
- All the instances that stop the delivery of those functions, completely or partially, i.e. failures
- All the possible causes of those failures, whether recorded or forecasted
- An essay-like detailed description of the failure effect(s)
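The items above nest inside each other, so one way to picture the Information Worksheet is as a small hierarchy. The following is only a minimal sketch under assumptions: the class names, the field names and the conveyor entries are invented for illustration and are not a standard RCM template.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FailureMode:
    cause: str   # a single event that causes the functional failure
    effect: str  # essay-like description of what happens when it occurs


@dataclass
class FunctionalFailure:
    description: str  # total or partial loss of the function
    modes: List[FailureMode] = field(default_factory=list)


@dataclass
class Function:
    statement: str        # what the users need the equipment to do
    primary: bool = True  # False for secondary functions
    failures: List[FunctionalFailure] = field(default_factory=list)


@dataclass
class Subsystem:
    name: str
    functions: List[Function] = field(default_factory=list)


def worksheet_rows(subsystem: Subsystem) -> List[Tuple[str, str, str, str, str]]:
    """Flatten the hierarchy into one worksheet row per failure mode."""
    rows = []
    for fn in subsystem.functions:
        for ff in fn.failures:
            for fm in ff.modes:
                rows.append(
                    (subsystem.name, fn.statement, ff.description, fm.cause, fm.effect)
                )
    return rows


# Hypothetical entries, invented only to show the shape of the sheet.
conveyor = Subsystem(
    name="Belt conveyor",
    functions=[
        Function(
            statement="Transfer material to the next station",
            failures=[
                FunctionalFailure(
                    description="Does not transfer any material",
                    modes=[
                        FailureMode("Electric power loss", "Conveyor stops."),
                        FailureMode("Gearbox stuck", "Drive mechanism stops."),
                    ],
                )
            ],
        )
    ],
)

for row in worksheet_rows(conveyor):
    print(" | ".join(row))
```

Each printed row corresponds to one line of the worksheet: subsystem, function, failure, cause and effect.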
The Decision Worksheet helps us conclude the following:
- Any needed routine maintenance, its cycle and who will do it
- Whether some equipment needs redesign i.e. modification or upgrade
- Which equipment we might allow to fail
We shall continue this chat with the Information Worksheet; in the coming chat we shall explain the Decision Worksheet jargon.
System vs Sub-System
The split between system and subsystem defines the level of data you are going to collect. In other words, you need to collect the detailed functions and failure modes of each subsystem. If you select a broad subsystem, you might get tens to hundreds of failure modes or causes. On the other hand, if you go into too much detail, you might find it difficult to describe how the subsystem fails, as the failure won’t be evident to operators. Moreover, the technical description of the function might be difficult and not obvious. You may also miss the interlinks between subsystems, and you may face too much repetition of failure modes between subsystems.
It is better explained with an example. Let’s think about a conveyor, whether belt, chain or roller. It is either a standalone conveyor or part of a cascaded group. What level of subsystem will you target? Will you consider the drive mechanism as a separate subsystem? Or will you consider the whole conveyor, with its protection sensors, drive mechanism, conveying medium and structure, as one unit?
There is no right or wrong answer. You will need to weigh the level of work you can do, the technical details available and how evident the failures are. The importance of the equipment also plays a vital role in the effort you will put into the RCM analysis.
The people who know best how the equipment works and what it is expected to do are the users of that equipment. However, the definition of a function needs to take the original capability of the equipment into account. No maintenance can extend equipment beyond its original capability in the long run; that is no longer maintenance but an upgrade.
There are two categories of functions: primary functions and secondary functions. Primary functions describe the original reason for acquiring the equipment, including the output quantity, quality and speed. You can apply this to the conveyor example. Secondary functions include safety, comfort, efficiency and environment. Think of the fuel efficiency of a truck or the material spillage from a conveyor.
The literal definition of a functional failure in RCM II by John Moubray says it occurs when: “an asset is unable to fulfil a function to a standard of performance that is acceptable to the user”. This covers both primary and secondary functions. As is often said, the first one to detect the failure when it occurs is the user of the equipment. Besides, the user has access to a wealth of easily collected information about the circumstances and consequences of the failure.
Simply put, failure modes are the causes of the failure. The same failure can occur for many reasons, and all of these causes need to be listed. If you think of the total stoppage of the drive mechanism of the belt conveyor as an example, you can find a lot of modes. Some of them are: electric power loss, a stuck motor or gearbox (for many reasons), sensor issues, an upstream jam, etc. However, you also need to consider human mistakes and induced failure modes. So include as many as you can, on the condition that each failure mode is a single event that causes a functional failure.
The failure effect is an essay-like detailed description of what happens when each failure mode occurs. The importance of this part is huge, so don’t spare any effort or time to include everything recorded or noticed. This part will be crucial for decision making. It will save you the time and effort of digging for more information when making the decisions. The failure effect description needs to include:
- how the failure is detected: Alarm sound, visual alarm, sound, smell, detected by the operator, detected during daily inspection, detected on the SCADA system in the control room, etc.
- what happened when the failure occurred: process totally stopped, partial slowdown occurred, operation continued but some products scrapped, operation continued but spillage occurred, etc. You can read more about this in this chat: 2-Reliability made easy. Whom we need to; Operators or Maintainers?
- were there any safety concerns with the failure? Did operators or maintainers take any risky actions? Was any waste produced that may breach environmental regulations? Did we lose operation time, volume or quality of product? All these are classified as consequences of the failure.
- how it will be repaired and how long that takes: 3 hrs. to install a new part and discard the old, 4 hrs. to install a new ball bearing in the gearbox, 15 min to replace the temperature sensor, etc.
- numbers recorded for time, quantities and cost, as far as possible.
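The checklist above can also be kept as a structured record, so that gaps show up before the decision stage. This is a minimal sketch under assumptions: the `FailureEffect` class, its field names and the pairing of failure modes with repair times are hypothetical, chosen only to mirror the bullets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureEffect:
    failure_mode: str
    detected_by: str             # alarm, operator, daily inspection, SCADA, ...
    what_happened: str           # process stopped, partial slowdown, spillage, ...
    safety_concern: bool
    environmental_concern: bool
    repair_action: str
    repair_hours: float          # time needed for the repair

    def missing_fields(self) -> List[str]:
        """Return the names of descriptive fields that were left empty."""
        text_fields = {
            "detected_by": self.detected_by,
            "what_happened": self.what_happened,
            "repair_action": self.repair_action,
        }
        return [name for name, value in text_fields.items() if not value.strip()]


# Hypothetical records; only the repair durations echo the examples in the text.
effects = [
    FailureEffect(
        failure_mode="Gearbox bearing seized",
        detected_by="Detected by the operator",
        what_happened="Process totally stopped",
        safety_concern=False,
        environmental_concern=False,
        repair_action="Install a new ball bearing in the gearbox",
        repair_hours=4.0,
    ),
    FailureEffect(
        failure_mode="Temperature sensor failed",
        detected_by="Detected on the SCADA system",
        what_happened="",  # deliberately left empty to show the gap check
        safety_concern=False,
        environmental_concern=False,
        repair_action="Replace the temperature sensor",
        repair_hours=0.25,
    ),
]

total_repair_hours = sum(e.repair_hours for e in effects)
incomplete = {e.failure_mode: e.missing_fields() for e in effects if e.missing_fields()}

print(f"Total repair time: {total_repair_hours} h")
print(f"Records with gaps: {incomplete}")
```

Keeping the repair time as a number rather than free text makes it easy to total the downtime later, while the gap check flags descriptions that still need input from operators or maintainers.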