What is Failure Tracking?

Failure tracking is the practice of recording information about equipment failures. Any event in which equipment cannot perform its function within the limits and under the conditions required of it is considered a failure. The severity of a failure can range from potential to full failure.

APM uses the failure record’s date information to calculate statistics (such as time between failures) that measure equipment reliability and maintainability. Failure tracking also provides a way to measure the savings gained by avoiding failures. Failure tracking allows you to make informed decisions when targeting assets for improvement.

This topic provides an overview of failure tracking concepts.

Contents

Failure and Anomaly Records

Failure Severity

PF Intervals

Failures and Root Cause Analyses

Failure Follow-up Actions

Failure and Anomaly Records

In APM, a failure record can be created for each occurrence of an equipment failure or anomaly. The record documents the dates when the failure or anomaly occurred, was reported, and was resolved. The record includes the failure severity and, in the case of potential and partial failures, the PF interval: an estimate of the amount of time before a full failure occurs.

The record tracks the failure through to its resolution, documenting how the failure was identified (for example, indicator reading resulting in an alarm), the steps taken to avoid or correct the problem (for example, work order tasks), any delays that occurred (time needed to get parts), downtime incidents, and the savings achieved by avoiding a more severe failure.

Failure records can be created:

• Automatically when an indicator alarm is acknowledged with a work document or the monitoring option. (Failure records are created automatically if the Failure Creation setting is enabled for the indicator’s alarm state.)
• Automatically if an indicator reading is marked as being fixed during inspection. The failure record is given the status Resolved. The information in the record is taken from the alarm state’s failure settings and the indicator reading.

Note: In the Acknowledge Indicator Alarm dialog, the Create or link to a failure or anomaly option is selected by default when the alarm state requires that a failure record be created. When this option is selected, the Failure tab appears in the dialog. You can clear the option if needed, or you can select it even when the alarm state does not require a failure record.

• From scratch from a checksheet or inspection report
• From scratch from an asset or the site
• When linking a work request to a work order
• When planning a work order task
• When reporting activity for a work order task
• When reviewing indicator readings in an inspection report

Besides indicator alarm states, the acknowledgment policies in the site’s inspection management settings affect whether failure records are created. These settings specify which acknowledgment methods allow failure records to be created.

You can also create failure actions and record delays and downtime on failure records. Downtime incidents can be created automatically at the same time that failure records are generated.

When APM uses AssetWise Enterprise Interoperability (AWEIS) to exchange data with an external CMMS, failure records can be created or updated from the events associated with interop work requests, interop work orders, or both. For more information, see Overview of Work Management with AWEIS.

Failure Severity

A failure’s severity is used to indicate the seriousness of the failure. Three severity types are supported:

• Potential: a condition that has been noticed that would result in an actual failure if the problem is not resolved. The actual failure has not yet occurred. Potential failures are normally reported as the result of an indicator reading that has raised a warning alarm against the asset.
• Partial: the asset’s performance has decreased to a point where it is no longer performing one of its functions at the specified levels. The asset is still functioning.
• Total: the asset’s performance has decreased to a point where the asset is no longer performing at its required level. The asset has completely failed.

You can create failure severities as required for your organization.

PF Intervals

The PF interval estimates how long it will take for a potential or partial failure to escalate to a full failure if the problem is not resolved beforehand.

PF interval is usually tracked in units of time (hours, days, weeks, and so on). However, an asset that has a cumulative primary indicator can use a PF interval measured in indicator values, such as cycles or operating hours.

For time-based PF intervals, the following values are tracked on the failure record:

• Original PF interval: The PF interval as of the date and time when the failure occurred.
• PF Interval: The current PF interval if the original value has been changed manually.
• Elapsed time: The time that has elapsed since the failure occurred.
• Remaining PF interval: The time remaining until a full failure will occur. If the failure is not resolved beforehand, the remaining PF interval can be a negative value.
• Expected failure date: The date on which the remaining PF interval will reach 0.

For PF intervals based on operating hours or cycles, the following values are tracked on the failure:

• Original PF interval: The PF interval, as derived from the asset indicator alarm state as of the date when the failure occurred.
• PF Interval: The current PF interval if the original value has been changed manually.
• Elapsed value: The value that has accumulated against the indicator since the failure occurred.
• Remaining PF interval: The value remaining until a full failure will occur. If the failure is not resolved beforehand, the remaining PF interval can be a negative value.
• Expected failure date: The date on which the remaining PF interval will reach 0, based on the value at which the full failure will occur and the indicator’s average accumulation.
• Expected failure value: The indicator value at which a full failure is expected to occur, that is, the value at which the remaining PF interval will reach 0.

Note: If you replace or remove an asset’s primary cumulative indicator after failure records have been created, the failure records are not affected by the change. They continue to use the original indicator for calculating failure statistics, PF intervals, or both.

Recording the Initial PF Interval

The PF interval is recorded when the failure record is created. If the failure was created from an indicator alarm acknowledgment, the PF interval on the indicator state is copied to the failure record. The copied value can be adjusted to reflect your best guess for the time that elapsed from when the failure started to be noticeable to the time when the reading was taken.

For example, an indicator state has a PF interval of four months (120 days), and the indicator is read every 60 days. When a failure is created from the indicator and state, the PF interval on the failure is prefilled with a value of four months. Because the reading value indicates that the failure started to occur 20 days ago, you could manually adjust the PF interval to 100 days.
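The adjustment in this example is a simple subtraction. The following Python sketch restates it; the function and parameter names are illustrative assumptions and are not part of APM.

```python
def adjusted_pf_interval(state_pf_days: int, days_since_onset: int) -> int:
    """Return the PF interval to record on the failure.

    state_pf_days: PF interval copied from the indicator state, in days.
    days_since_onset: best guess of how long ago the failure became
    noticeable (hypothetical input; in practice a manual estimate).
    """
    if days_since_onset < 0:
        raise ValueError("days_since_onset must not be negative")
    return state_pf_days - days_since_onset


# Values from the example: 120-day PF interval, onset 20 days before the reading.
print(adjusted_pf_interval(120, 20))
```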

Updating Remaining PF Intervals

The time remaining for a time-based PF interval and the expected failure date are usually calculated by a scheduled action. (PF intervals can also be recalculated manually.) Every time that the action is run, the elapsed time and remaining interval are recalculated. The expected failure date remains constant. For best results, run the scheduled action daily, or more often if PF intervals are shorter than one day.

The following table shows an example of a PF interval reported on June 1 for a partial failure that occurred on the same date. The PF interval is recalculated on the first of each subsequent month. Note that the elapsed time and remaining PF interval change, but the expected failure date does not.

Date Reported or Recalculated	PF Interval	Elapsed Time	Remaining PF Interval	Expected Failure Date
June 1	100 days	0 days	100 days	Sept 8
July 1	100 days	30 days	70 days	Sept 8
Aug 1	100 days	61 days	39 days	Sept 8
Sept 1	100 days	92 days	8 days	Sept 8
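The elapsed time and remaining PF interval in the table can be reproduced with plain date arithmetic. The following Python sketch is only an illustration of that calculation, not APM's scheduled action; the function name and the year 2024 are assumptions.

```python
from datetime import date


def recalculate(pf_interval_days: int, occurred: date, run_date: date) -> tuple[int, int]:
    """Return (elapsed_days, remaining_days) as of run_date.

    The remaining value goes negative once the PF interval has passed
    without the failure being resolved.
    """
    elapsed = (run_date - occurred).days
    remaining = pf_interval_days - elapsed
    return elapsed, remaining


occurred = date(2024, 6, 1)  # hypothetical year; the example gives only month and day
for run_date in (date(2024, 6, 1), date(2024, 7, 1), date(2024, 8, 1), date(2024, 9, 1)):
    elapsed, remaining = recalculate(100, occurred, run_date)
    print(run_date.isoformat(), elapsed, remaining)
```

Each run changes only the elapsed and remaining values; the PF interval and the occurrence date are inputs that stay fixed.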

Indicator-based PF intervals are recalculated each time a reading is entered for the indicator. For example, a failure is reported on June 1 with a PF interval of 100 cycles. At the time of the reading, the asset’s cycle count is 15,000. Based on the PF interval and the indicator’s average usage of two cycles per day, an expected failure date of July 20 is calculated.

The next reading of the indicator is entered on June 15. The reading on this date is 15,040 cycles. The asset has been consuming cycles at a faster rate than average. This results in a new expected failure date of July 15. Note that the expected failure value of 15,100 cycles has not changed. This value is fixed, but the date on which the value is expected to be reached changes.

Date and Reading	PF Interval in Cycles	Elapsed Cycles	Remaining PF Interval in Cycles	Expected Failure Value	Expected Failure Date
June 1, 15000 cycles	100	0	100	15100	July 20
June 15, 15040 cycles	100	40	60	15100	July 15
July 1, 15080 cycles	100	80	20	15100	July 10
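The indicator-based recalculation can be sketched in the same way. The Python below assumes that the expected failure date is the reading date plus the remaining cycles divided by the average accumulation per day, rounded up to whole days. The function name, the rounding, and the year 2024 are assumptions, and the day-counting convention that APM uses is not stated here, so dates from this sketch can differ by a day from the product's.

```python
from datetime import date, timedelta
from math import ceil


def recalculate_cycles(value_at_failure, pf_interval, reading, reading_date, avg_per_day):
    """Return (elapsed, remaining, expected_failure_value, expected_failure_date)."""
    if avg_per_day <= 0:
        raise ValueError("avg_per_day must be positive")
    expected_failure_value = value_at_failure + pf_interval  # fixed for the failure
    elapsed = reading - value_at_failure
    remaining = pf_interval - elapsed  # negative once the value is passed
    days_left = ceil(remaining / avg_per_day)  # assumed rounding rule
    return elapsed, remaining, expected_failure_value, reading_date + timedelta(days=days_left)


# June 15 reading from the example: 15,040 cycles, average of two cycles per day.
print(recalculate_cycles(15000, 100, 15040, date(2024, 6, 15), 2))
```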

Failures and Root Cause Analyses

You can use APM to evaluate a failure’s suitability for root cause analysis, based on the failure’s severity, consequence priority, and probability of recurring. APM calculates the criticality index as follows:

Criticality Index = Consequence priority * Failure Severity * Probability

The value of the criticality index in turn determines whether RCA is required, recommended, or not required.
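As a minimal sketch of this evaluation, the following Python multiplies the three factors and maps the result to an RCA recommendation. The numeric scales, the threshold values, and the assumption that a higher index means a stronger case for RCA are all hypothetical; the values that APM actually uses depend on how your site is configured.

```python
def criticality_index(consequence_priority: float, failure_severity: float, probability: float) -> float:
    """Criticality Index = Consequence priority * Failure Severity * Probability."""
    return consequence_priority * failure_severity * probability


def rca_recommendation(index: float, required_at: float = 60, recommended_at: float = 30) -> str:
    """Map an index to an RCA outcome using hypothetical thresholds."""
    if index >= required_at:
        return "required"
    if index >= recommended_at:
        return "recommended"
    return "not required"


# Hypothetical ratings: consequence priority 5, severity 4, probability 3.
index = criticality_index(5, 4, 3)
print(index, rca_recommendation(index))
```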

If the evaluation determines that RCA is warranted, you can create the analysis from the failure record or request that an RCA be performed.

Failure Follow-up Actions

You can define failure follow-up actions, for example, a mitigation action to provide a temporary solution until the failure can be analyzed and resolved. Each action includes a sequence number, action type, description, due date, employee assignment, and action status. The document thus provides a record that can be tracked by due date, status, and owner.

Each failure follow-up action is categorized by type:

• Follow-up work – Create work requests and work orders or, if APM has been configured to interact with SAP, maintenance orders and notifications to perform corrective tasks
• Inspection – Create indicator checksheets to monitor the asset
• Mitigation – Create mitigation actions that provide temporary solutions
• Other – Describe and assign a general action

The objects that you add from follow-up actions reference the action. For example, mitigation properties include a link to the source follow-up action.

You can view a list of failure follow-up actions for the site. In the Performance Management view, select the Failures tab. Select “Failure actions by due date” in the configuration list.
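To illustrate how follow-up actions can be tracked by due date, status, and owner, the following Python sketch models an action with the fields listed earlier and returns the open actions ordered by due date. The class, the field names, the status values, and the sample data are illustrative assumptions, not APM's data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FollowUpAction:
    sequence: int
    action_type: str   # "Follow-up work", "Inspection", "Mitigation", or "Other"
    description: str
    due_date: date
    assigned_to: str
    status: str        # hypothetical values such as "Open" or "Completed"


def open_actions_by_due_date(actions, closed_statuses=("Completed", "Cancelled")):
    """Return actions that are not closed, earliest due date first."""
    open_actions = [a for a in actions if a.status not in closed_statuses]
    return sorted(open_actions, key=lambda a: (a.due_date, a.sequence))


actions = [
    FollowUpAction(1, "Mitigation", "Install temporary bypass", date(2024, 6, 10), "J. Smith", "Open"),
    FollowUpAction(2, "Inspection", "Weekly vibration checksheet", date(2024, 6, 5), "A. Lee", "Open"),
    FollowUpAction(3, "Other", "Notify operations", date(2024, 6, 3), "J. Smith", "Completed"),
]
for action in open_actions_by_due_date(actions):
    print(action.due_date, action.action_type, action.description)
```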
