[Figure: diagram with the labels Predictive algorithm, Data Warehouse, Data Marts, Cubes, Aggregations, Learning, Implementation/Testing, Evaluation, New scoring, reports]
4. A Data Mining Case Study in Automotive Manufacturing Domain

Automotive manufacturing is a market where the manufacturer does not interact with the consumer directly, yet an understanding of the market, its trends and moods, and changing consumer tastes and preferences is fundamental to staying competitive.
The information gathered to produce an automotive data mining solution is the following [22]:
Supply chain process (sales, inventory, orders, production plan).
Manufacturing information (car configurations/packages/options codes and description).
Marketing information (dealers, business centers, etc.).
Customers’ trends information (website activities).

An enterprise data warehouse is built to hold web data, inventory data, car demand data and sales data in order to better analyze and predict car sales, manage car inventory and plan car production. Sales and marketing managers are interested in better leveraging data in support of the enterprise goals and objectives. Managers envision an analytic environment that will improve their ability to support planning and inventory management, incentives management, and ultimately production planning, and that will enable them to meet the expectations of a decision-making process supported by appropriate data and trends.
Regardless of functional boundaries and the type of analysis needed, their requirements focus on improved access to detailed data and on more consistent and more integrated information.
A data warehouse that combines online and offline behavioral data for decision-making purposes is a strategic tool which business users can leverage to improve sales demand forecasting, improve model/trim level mix planning, adjust body model/trim level mix with inventory data, and reduce days on lot.
[Figure: “LR Overall SEB ZZ: Sales & Prediction”. Series: Prediction, SALES, with the Train, Test and Eval periods marked. x-axis: Week/Month (15/4 to 99/11); y-axis: Value (0 to 4000).]

SET     MAPE
Train   8.4 %
Test    18.6 %
Eval    15.5 %
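The MAPE values reported for the Train, Test and Eval sets are mean absolute percentage errors. The paper does not spell out its formula, so the following is only a minimal sketch that assumes the standard definition; the function name `mape` and the weekly figures are hypothetical and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes the standard definition. Weeks with zero actual sales are
    skipped because the percentage error is undefined for them.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        raise ValueError("no non-zero actual values to score")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical weekly sales and predictions, for illustration only.
weekly_sales = [2000, 2500, 1000, 4000]
weekly_prediction = [1800, 2750, 1100, 3600]
score = mape(weekly_sales, weekly_prediction)
```

In this made-up example every week is off by 10 % of the actual value, so the score is about 10. Computing one such score per set (Train, Test, Eval) gives a table of the shape shown above.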
The main goal of this data mining solution was to get initial positive prediction results and to measure the prediction score of different data sources using the findings of correlation studies.
Using our proposed ASD-DM methodology, the enterprise data warehouse was created as a result of the speculation phase, and the ETL package was defined and developed.
The collaboration phase was one of the most important phases, as it required extensive discussion and intensive collaborative teamwork. The method to be used to model our prediction solution was not specified in advance. The main goal was to test and evaluate candidate solutions and select the one that gives the most accurate prediction results on a weekly basis for the next 4 weeks.
We started by using neural networks to get the first set of prediction results. The training subset covered April 2002 to June 2003, the test subset July 2003 to September 2003, and the evaluation subset October 2003 to November 2003. Fig. 2 shows the overall results for the Sebring model family.
After evaluating the first prediction method, we tried another method based on linear regression without using separated weeks. Fig. 3 shows the results obtained.
This method provided more accurate results for the first week, but the predictions for the following weeks were unacceptable.
Another collaboration cycle was launched, and the team adopted a new model named ANFIS (Adaptive Neuro-Fuzzy Inference System). This method combines fuzzy logic and neural networks: values are clustered into fuzzy sets, membership functions are estimated during training, and neural networks are used to estimate the weights. The results obtained were more accurate, and this method was adopted in our solution as the MAPE errors do not exceed 10%.

5. Conclusion

In this paper, we explained the use of data mining techniques in software engineering tasks such as programming, testing, maintenance, reliability, and quality. Due to the uncertain nature of predictive data mining application requirements, we proposed a new framework, ASD-DM, based on agile methodology, specifically the Adaptive Software Development (ASD) methodology, and the CRISP-DM data mining

[4] Slaughter S. A., Levine L., Ramesh B., Pries-Heje J., and Baskerville R., “Aligning Software Processes with Strategy”, MIS Quarterly, Vol. 30, No. 4, pp. 891-918, December 2006.
[Figure: weekly sales and prediction plot. Series: Prediction, SALES. x-axis: Week / Month; y-axis: Value.]

SET     MAPE
Train   17.3 %
Test    6.6 %
Eval    21.7 %
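The Train, Test and Eval sets in the case study are chronological: April 2002 to June 2003 for training, July to September 2003 for testing, and October to November 2003 for evaluation. A minimal sketch of such a split follows. The record layout, the function name `split_by_period`, the exact first and last days of each period, and the sample values are assumptions for illustration; this is not the authors' ETL code.

```python
from datetime import date

# Period boundaries follow the months given in the case study; the exact
# first and last days are an assumption (inclusive on both ends).
PERIODS = {
    "train": (date(2002, 4, 1), date(2003, 6, 30)),
    "test": (date(2003, 7, 1), date(2003, 9, 30)),
    "eval": (date(2003, 10, 1), date(2003, 11, 30)),
}


def split_by_period(records):
    """Group (week_start, sales) records into train/test/eval by date.

    Records outside every period are dropped. Splitting by time rather
    than at random keeps future weeks out of the training data.
    """
    subsets = {name: [] for name in PERIODS}
    for week_start, sales in records:
        for name, (first, last) in PERIODS.items():
            if first <= week_start <= last:
                subsets[name].append((week_start, sales))
                break
    return subsets


# Hypothetical weekly records, for illustration only.
sample = [
    (date(2002, 4, 8), 1900),
    (date(2003, 6, 30), 2100),
    (date(2003, 7, 7), 2300),
    (date(2003, 10, 6), 1700),
    (date(2003, 12, 1), 1500),  # outside all periods, dropped
]
subsets = split_by_period(sample)
```

Keeping the evaluation weeks strictly later than the training weeks mirrors how the forecasts would be used in practice, where only past weeks are available when predicting the next 4 weeks.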
[12] Rupnik R., Kukar M., and Krisper M., “Integrating data
mining and decision support through data mining based
decision support system”, Journal of Computer Information
Systems, Spring 2007.
[13] Maqbool O., Babri H. A., Karim A. and Sarwar M.,
“Metarule-guided association rule mining for program
understanding”, IEE Proc.-Softw., Vol. 152, No. 6,
December 2005.