ALGORITHM
This paper reports on the development of a failure pattern recognition model for a mining truck. The
model inputs, VIMS data collected in a mine, were processed using one of the Decision Tree
algorithms, a module of the IBM Intelligent Miner for Data software. The results indicate that the
Decision Tree allows for identification and quantification of relations between the various types of
VIMS data. As such, it can be used to develop a model for predicting
truck condition and performance. Full development of this capability requires further research.
1. Introduction
Modern mining equipment is fitted with numerous sensors that monitor its condition and performance. Data
collected by these sensors is used to alert the operator to existence of abnormal operating conditions and to
perform emergency shutdown if the pre-set values of the monitoring parameters are exceeded. This data is
also used for post-failure diagnostics and for reporting and analysis of equipment performance.
It is believed that availability of this voluminous data, together with availability of sophisticated data
processing methods and tools, may allow for extraction of additional information contained in the data. One
method that may permit this is data mining 1,2.
The research presented in this paper investigates the use of data collected from various sensors installed
on a mining truck to construct a truck model that may allow reliable projection of both
truck performance and truck condition into the future. The data studied was collected by a variety of
sensors installed on an off-highway mining truck that together constitute Caterpillar's VIMS (Vital Information
Management System) 3. The data mining tool was the IBM Intelligent Miner for Data 4.
2. Data Description
The data used in this research consists of 81,911 snapshot (event recorder) and datalogger records, each
containing values of 70 truck parameters measured over a period of time. The data was collected from a
Caterpillar 789C truck during its normal operation in a surface mine.
The snapshot stores a segment of truck history that contains values of all 70 monitored parameters
recorded over a period of six minutes, with each parameter value recorded once per second. Snapshot
recording is triggered by one of a set of predefined events, usually the occurrence of an abnormal situation
in which a specific parameter reaches a critical value. A snapshot record describes truck conditions from five
minutes before the event to one minute after the event 3. In this paper every snapshot record is called
an “event” for simplicity.
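The snapshot window described above (five minutes before the trigger to one minute after, at one sample per second) amounts to 360 samples per parameter. A minimal sketch of cutting such a window out of a longer one-second series follows. The function name and the synthetic `readings` list are hypothetical and only illustrate the arithmetic; they do not represent the VIMS recording logic.

```python
# Window lengths follow the text; everything else here is illustrative.
PRE_EVENT_S = 5 * 60   # five minutes before the triggering event
POST_EVENT_S = 1 * 60  # one minute after the triggering event


def snapshot_window(samples, trigger_index):
    """Return the one-second samples surrounding the trigger.

    `samples` is a list with one reading per second. The window is
    clipped at the ends of the list if the series is too short.
    """
    start = max(0, trigger_index - PRE_EVENT_S)
    stop = min(len(samples), trigger_index + POST_EVENT_S)
    return samples[start:stop]


# Synthetic series: the reading at second i is simply i.
readings = list(range(1000))
window = snapshot_window(readings, trigger_index=500)
```

With a sufficiently long series the window always holds 360 samples, which matches six minutes of one-second readings.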
Unlike the snapshot, the datalogger records values of all truck parameters that are monitored by VIMS
over varying periods of time, also at one-second intervals 4. The start and end of a recording are triggered
manually, with individual records covering periods of up to 30 minutes of truck operation. Datalogger
records do not have to be associated with any events.
Of the 70 truck parameters used in this research, 26 were recorded as categorical values and the
remaining 44 as numeric values. Examples of basic statistical descriptions of both the categorical and
numeric parameter values are presented in Table 1 and Table 2.
3. Experimental Design
The Engine Speed is defined as the actual rotational speed of the crankshaft. For the modeled truck, the
High Engine Speed event is activated when the engine speed reaches 2250 rpm and deactivated when the speed drops to 1900
rpm. The Engine Cool Flow is defined as the coolant flow status in the engine cooling system. During
normal operation, the coolant flow switch is closed. The switch opens when coolant flow is less than
specified; its opening triggers the event.
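The two engine speed thresholds quoted above form a hysteresis band: the event switches on at 2250 rpm and stays on until the speed falls to 1900 rpm. A minimal sketch of that logic follows, assuming a plain list of one-second rpm readings. The function name and sample values are hypothetical, and the treatment of readings exactly at a threshold (inclusive) is an assumption.

```python
ACTIVATE_RPM = 2250    # event switches on when speed reaches this value (from the text)
DEACTIVATE_RPM = 1900  # event switches off when speed drops to this value (from the text)


def engine_speed_event_flags(rpm_series):
    """Return one boolean per sample: True while the event is active."""
    active = False
    flags = []
    for rpm in rpm_series:
        if not active and rpm >= ACTIVATE_RPM:
            active = True
        elif active and rpm <= DEACTIVATE_RPM:
            active = False
        flags.append(active)
    return flags


# Made-up readings, chosen only to exercise both thresholds.
flags = engine_speed_event_flags([1800, 2100, 2250, 2000, 1950, 1900, 2100])
```

Note that the reading of 2000 rpm keeps the event active, because the speed has not yet dropped to the deactivation threshold.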
$$\mathrm{gini}(s) = 1 - \sum_{j} p_j^{2} \qquad (1)$$

$$\mathrm{gini}_{split}(s) = \frac{n_1}{n}\,\mathrm{gini}(s_1) + \frac{n_2}{n}\,\mathrm{gini}(s_2) \qquad (2)$$
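Equations (1) and (2) can be checked numerically with a short sketch. It reads p_j as the relative frequency of class j in the record set s, and n_1 and n_2 as the sizes of the two subsets s_1 and s_2 produced by a split of the n records, which is the usual interpretation of the gini index in SPRINT-style classifiers 9. The function names and class labels below are illustrative and are not taken from the Intelligent Miner software.

```python
from collections import Counter


def gini(labels):
    """Equation (1): one minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Equation (2): size-weighted gini index of the two subsets."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical labels: a perfectly mixed two-class set.
mixed = ["event", "event", "normal", "normal"]
pure_split = gini_split(["event", "event"], ["normal", "normal"])
useless_split = gini_split(["event", "normal"], ["event", "normal"])
```

A split that separates the classes completely drives the split index to zero, while a split that leaves both subsets as mixed as the parent leaves it unchanged. The tree-growing step prefers the split with the lowest value.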
The tree accuracy is estimated by testing the classifier on subsequent cases whose correct
classification has been observed 6. The v-fold cross-validation technique estimates the tree error rate. This
estimate of the error rate is used to prune the tree and choose the best classifier. More detail about this
algorithm can be found elsewhere 9.
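The v-fold error estimate mentioned above can be sketched as follows. The data is divided into v folds, each fold is held out once while a classifier is fitted on the rest, and the misclassifications on the held-out folds are pooled. This is a generic illustration, not the Intelligent Miner procedure: the fold assignment by index, the majority-class stand-in classifier, and all names and data are assumptions, and the pruning step itself is not shown.

```python
from collections import Counter


def v_fold_error(records, labels, v, fit, predict):
    """Estimate the error rate by v-fold cross-validation."""
    n = len(records)
    errors = 0
    for fold in range(v):
        # Record i belongs to fold (i mod v); that fold is held out for testing.
        test_idx = [i for i in range(n) if i % v == fold]
        train_idx = [i for i in range(n) if i % v != fold]
        model = fit([records[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        errors += sum(predict(model, records[i]) != labels[i] for i in test_idx)
    return errors / n


def fit_majority(train_records, train_labels):
    """Stand-in classifier: remember the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, record):
    """Always predict the remembered majority label."""
    return model


# Made-up data: ten records, two of which carry the minority label.
records = list(range(10))
labels = ["normal"] * 8 + ["event"] * 2
error = v_fold_error(records, labels, v=5, fit=fit_majority, predict=predict_majority)
```

In practice a decision tree learner would take the place of `fit_majority` and `predict_majority`, and the estimate would be compared across candidate pruned trees.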
Fig. 1. Confusion matrix of training dataset (90% of available data) with four classes
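An overall error rate is obtained from a confusion matrix as the share of records that fall off the main diagonal. The sketch below shows the arithmetic for a four-class matrix. The counts are invented for illustration and are not the values from Fig. 1; rows are taken to be actual classes and columns predicted classes, which is an assumption about the figure layout.

```python
def error_rate(matrix):
    """Fraction of records that lie off the main diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


# Invented counts for four classes (NOT the data behind Fig. 1).
example = [
    [50, 2, 1, 0],
    [3, 40, 0, 2],
    [0, 1, 30, 1],
    [1, 0, 2, 20],
]
rate = error_rate(example)
```

Here 140 of the 153 invented records lie on the diagonal, so the error rate is 13/153.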
To ensure that the analyzed event records are independent, all records related to the Engine Cool Flow event
were removed from the analyzed data set. Models based on the new data set yielded a much lower,
satisfactory error rate. The related confusion matrices are shown in figure 2 and figure 3.
The results of the modeling improved significantly. The error rates obtained for the data used
for training and for testing were 6.182% and 6.165%, respectively. This confirms
The Engine Speed events have the highest prediction accuracy in this case, which suggests
that High Engine Speed events of many trucks might be predicted using a model developed for one truck
only. Further work is needed to confirm this.
5. Conclusions
If real-time data on truck condition is available, a predictive model can be built to project the truck
condition into the future. Such a model may be built using a classification tree algorithm as described in this
paper.
If VIMS snapshot data is used in model construction and in modeling, attention has to be paid to the
way this data is acquired. If several events take place during the snapshot data recording, only the primary
event that triggered the recording can be used in evaluations.
The truck condition model described in this paper cannot be freely used to predict the condition of
other trucks. It appears that only some of the wide variety of events can be predicted in this situation.
Which specific events can be modeled, and how reliable the related predictions are, needs
further investigation.
Acknowledgements
Financial support of the investigations reported on in this paper by Caterpillar, Inc. of Peoria, Illinois, is
gratefully acknowledged.
References
1. T. S. Golosinski, Data Mining Uses in Mining. Proceedings, Computer Applications in the Minerals
Industries (APCOM), Beijing, China, 2001, pp. 763-766.
2. T. S. Golosinski, H. Hu, and R. Elias, Data Mining VIMS for Information on Truck Condition.
Proceedings, Computer Applications in the Minerals Industries (APCOM), Beijing, China, 2001, pp.
397-402.
3. Caterpillar, Inc., Vital Information Management System (VIMS): System Operation Testing and
Adjusting (1999), Company publication.
4. IBM (International Business Machines Corporation), Manual: “Using the Intelligent Miner for Data”
(2000), Company publication.
5. IBM (International Business Machines Corporation), Intelligent Miner for Data: Enhance Your
Business Intelligence (1999). Company publication.
6. J. R. Quinlan, C4.5: Programs for Machine Learning (1993), Morgan Kaufmann Publishers, Inc.
7. J. Jang, C. Sun, Neuro-Fuzzy and Soft Computing (1997), Prentice-Hall, Inc.
8. L. Breiman, J. Friedman, Classification and Regression Tree (1984), Wadsworth International Group.
9. J. Shafer, SPRINT: A Scalable Parallel Classifier for Data Mining, in Proceedings of the 22nd VLDB
Conference, Mumbai (Bombay), India, 1996.