
OPIM 5671 – SPRING 2017

Group Case Study – Team 10

LARSEN AND TOUBRO: SPARE PARTS FORECASTING

- Arjun Sawhney
- Nikhila Manne
- Prasraban Mukhopadhyay
- Saloni Bhoyar
EXECUTIVE SUMMARY:

This report provides analysis and recommended strategies for forecasting Larsen and Toubro’s spare-parts demand. Reliable forecasting is critical to solving the problem of spare-part unavailability in the market, which could otherwise lead to loss of potential revenue or to the unavailable part’s market being cannibalized, for example by counterfeit products. The solution to this problem is to identify the best forecasting technique for each spare part. However, Larsen and Toubro is one of India’s largest technology, engineering, construction, and manufacturing companies, carrying some 20,000 spare parts, and building a forecasting model for every part would be very time consuming and expensive. We have therefore proposed a strategy to identify the game-changing spare parts and to build forecasting models just for these. Additionally, we have taken a deep dive into various forecasting techniques and model-selection criteria, showcasing them on 11 sample spare parts; this work will assist in building the best forecasting models for the selected crucial spare parts.

PROBLEM STATEMENT:

Larsen and Toubro (L&T) is one of India's largest technology, engineering, construction, and manufacturing companies. Its Construction and Mining Business (CMB) sold equipment such as Dozer

Shovels, Dozers, Dumpers, Hydraulic Excavators, Motor Graders, Pipe Layers, Surface Miners,

Tipper Trucks, Wheel Dozers, and Wheel Loaders. Supply of spare parts was critical, since the

customer faced severe losses in case of equipment unavailability. Forecasting was done on an ad

hoc basis based on the experience of the planning personnel. The value of each spare part ranged

from INR 10 to INR 8 million. It was therefore critical to maintain the right balance in spare-parts inventories, since unavailability led to loss of revenue, decreased profitability, and customer dissatisfaction, and gave rise to a market for fake products. On the other hand, excess inventory led
to high inventory carrying costs, working capital lock-in and a possibility of spare parts becoming

obsolete. Vijaya Kumar, Deputy General Manager of CMB, had to arrive at a forecasting methodology for the 20,000-odd spare parts. Developing 20,000 forecasting models would be not only very time consuming but also very expensive to build and manage.

Kumar wanted to build the forecasting model quickly so that he could roll out the forecasting

strategy on a pan-India basis within a few weeks.

BACKGROUND:

Classical statistical methods, such as exponential smoothing and regression analysis, have been

used by decision makers for several decades in forecasting spare parts demand. These models are

then benchmarked against each other, a process of comparing different forecasting methods to determine which one best matches reality. Benchmarks are the reference parameters against which two or more forecasting methods are evaluated, in connection with the actual demand that occurred. Two kinds of parameters are widely used for model evaluation: absolute accuracy measures (e.g. MAPE, RMSE) and accuracy measures relative to other methods.
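As a concrete illustration, the absolute measures can be computed directly from actuals and forecasts. Below is a minimal Python sketch on hypothetical demand numbers (the names and values are ours, not from the case):

```python
import numpy as np

# Hypothetical actual demand and one method's forecasts for a spare part.
actual = np.array([120.0, 95.0, 130.0, 110.0, 150.0, 98.0])
forecast = np.array([110.0, 100.0, 125.0, 118.0, 140.0, 105.0])

# RMSE penalizes large errors more heavily than small ones.
rmse = np.sqrt(np.mean((actual - forecast) ** 2))

# MAPE is scale-free, so parts whose unit values differ widely
# (INR 10 to INR 8 million) can be compared on a common footing.
mape = np.mean(np.abs((actual - forecast) / actual)) * 100

print(f"RMSE = {rmse:.2f}, MAPE = {mape:.1f}%")
```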

METHODOLOGY:

We have followed a 5-step approach to determine the best forecasting model:

Step 1: Identify whether the series is stationary or nonstationary by checking for trend and seasonality in the time series (a minimal check is sketched below):

- Has neither trend nor seasonality → stationary
- Has either trend or seasonality → nonstationary
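As an illustrative sketch of Step 1, the Python (statsmodels) snippet below runs a unit root test and inspects the autocorrelations on a hypothetical monthly demand series; the report's own checks were done with ACF/PACF/IACF plots and root tests in a forecasting tool:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller, acf

# Hypothetical monthly demand with a mild upward trend.
rng = np.random.default_rng(0)
demand = pd.Series(50 + 0.8 * np.arange(60) + rng.normal(0, 5, 60))

# Augmented Dickey-Fuller test: a large p-value suggests a unit root,
# i.e. the series is nonstationary.
p_value = adfuller(demand)[1]
print(f"ADF p-value: {p_value:.3f} -> "
      f"{'nonstationary' if p_value > 0.05 else 'stationary'}")

# A slowly decaying ACF hints at trend; a spike near lag 12
# (for monthly data) would hint at seasonality.
print(np.round(acf(demand, nlags=12), 2))
```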


Step 2a: If stationary, identify p and q to build ARMA Model:

1. Finding q such that ACF(q) falls outside confidence limits and ACF(k) falls inside

confidence limits for all k>q

2. Finding p such that PACF(p) or IACF(p) falls outside the confidence limits (choosing the higher of the two values) and PACF(k) falls inside the confidence limits for all k>p.

3. Determining all ordered pairs (j,k) such that 0≤j≤p and 0≤k≤q and trying out models with each of these combinations (see the sketch after this list).
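A minimal sketch of this grid search on hypothetical data, assuming the ACF/PACF suggested p = 2 and q = 2; ARIMA(j, 0, k) is used here to express an ARMA(j, k) model:

```python
import itertools
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical stationary demand series.
rng = np.random.default_rng(1)
demand = pd.Series(50 + rng.normal(0, 5, 80))

results = {}
for j, k in itertools.product(range(3), range(3)):  # all 0<=j<=p, 0<=k<=q
    try:
        fit = ARIMA(demand, order=(j, 0, k)).fit()
        results[(j, k)] = np.sqrt(np.mean(fit.resid ** 2))  # in-sample RMSE
    except Exception:
        continue  # some combinations may fail to converge

best = min(results, key=results.get)
print(f"Best (p, q) by RMSE: {best}")
```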

Step 2b: If nonstationary, detrend and deseasonalize:

1. Diagnose and determine appropriate trend and/or seasonal components.

2. Obtain the residuals after applying the trend and/or seasonal components.

3. Verify that the residuals appear to be stationary.

4. Determine an appropriate ARMA model for the residual series (a minimal sketch of steps 1-4 follows).
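A minimal sketch of steps 1-4 with a linear trend component, again on hypothetical data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Hypothetical trending demand series.
rng = np.random.default_rng(2)
demand = pd.Series(30 + 1.2 * np.arange(60) + rng.normal(0, 4, 60))

t = sm.add_constant(np.arange(len(demand)))
trend_fit = sm.OLS(demand, t).fit()          # 1. fit the trend component
residuals = demand - trend_fit.predict(t)    # 2. obtain the residuals

# 3. verify the residuals look stationary (small ADF p-value).
print(f"Residual ADF p-value: {adfuller(residuals)[1]:.3f}")
# 4. an ARMA model would then be fit to `residuals`, as in Step 2a.
```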

Step 3: Add intervention and regressors, if applicable

Step 4: Combine all the components (trend, seasonality, ARMA, interventions, regressors)

Step 5: Evaluate models. We evaluate and trade off the following factors (a residual-diagnostics sketch follows this list):

1. Whether the model passes all the tests (white noise test, unit & seasonal root tests)

2. Whether the model has acceptable error (e.g. RMSE)

3. Whether the parameters are significant

4. Whether the model is too complex

5. Whether the forecast is reasonable
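As an illustration of criteria 1 and 2, the sketch below runs the Ljung-Box white noise test on the residuals of a hypothetical fitted model and computes its RMSE:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical series and candidate model, standing in for any model above.
rng = np.random.default_rng(3)
demand = pd.Series(50 + rng.normal(0, 5, 80))
fit = ARIMA(demand, order=(1, 0, 0)).fit()

# White noise test: large p-values mean the model has captured the
# autocorrelation structure (criterion 1).
print(acorr_ljungbox(fit.resid, lags=[6, 12]))
print(f"RMSE: {np.sqrt(np.mean(fit.resid ** 2)):.2f}")  # criterion 2
```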


RESULTS¹:

We have applied the forecasting techniques to a sample of 11 variables (refer to the Appendix for detailed analysis) and arrived at the best possible model for each of them. Below is the list of models that, based on our analysis, best forecast each of the 11 variables:

Variable         Best Model Suggested
205-70-N1190     Cubic Trend + Point: JUL 2010 + MA(3)
PC_198_27_42263  Cubic Trend + MA(1)
PC_203_32_51461  Mean model
PC_600_863_4210  Linear Trend + AR(3)
PC_6735_61_3410  Linear Trend + AR(7)
D30141135        Quadratic Trend + ARMA(1,3)
600-181-6740I    AR(1)
07063-51210I     Quadratic Trend
_600_319_45401   Cubic Trend + AR(13)
_7000_B2011l     AR(11)
_6735_51_51431   Mean model

CONCLUSION AND RECOMMENDATIONS:

We recommend a strategic solution² to the problem of otherwise having to build 20,000 forecasting models to cater to the demand for every spare part: rationalize the demand for each spare part quantitatively and qualitatively. We suggest doing this by identifying the firm-level variables, i.e. the spare parts causing the highest variability in the supply chain and in revenue. Once these have been identified, the forecasting techniques discussed in the methodology and demonstrated in the appendix on 11 sample variables can be applied and rolled out on a pan-India basis within a few weeks.

¹ Results provide an outline of the best forecasting models that L&T should use to forecast the different spare items.
² This paragraph provides the strategic solution for Vijaya Kumar to adopt in developing forecasting models for demand estimation of the 20,000 spare parts.
In addition to ‘uncertainty reduction methods’ like forecasting, ‘uncertainty management methods’

such as adding redundant spare parts have also been devised to cope with demand uncertainty in

manufacturing planning and control systems. Many of these uncertainty reduction or management

methods may perform well in general, but perform poorly when demand for an item is

lumpy or intermittent.
REFERENCES:

1. http://cb.hbsp.harvard.edu/cbmp/access/61717709

2. Business Forecasting Using SAS®: A Point-and-Click Approach

3. http://tesi.cab.unipd.it/25014/1/TesiCallegaro580457.pdf

APPENDIX³:
In the section below, we explain in detail the various forecasting models for each of the 11 variables provided in the “L&T Spare parts Forecasting” dataset and suggest the best forecasting technique for each.

Variable 205-70-N1190:

Trend Analysis: Condition 1:

We can see that there is a slight upward trend in the data.

Condition 2:

³ The appendix details the various forecasting models developed for the data provided in the Excel sheet titled “L&T Spare parts Forecasting” and justifies the choice of each specific forecasting model.
We can see that there are no significant values at lag ‘1’ for the PACF and IACF functions. Hence, this condition fails.

Condition 3:

The ACF is decreasing after lag ‘1’

Condition 4:

Before applying first difference:


After applying first difference:

The ACF has significant values at lag ‘1’ after the first difference is applied.

Condition 5:

Before applying difference at lag ‘1’:


After applying difference at lag ‘1’:

As condition ‘2’ fails but the unit root tests give significant results after applying first difference, there may be a
trend in our data.

Seasonality Analysis:

Condition 1:
We can see that there is no pattern that is repeated over time.

Condition 2:

We can see that there is no significant value at lag ‘S’

Condition 3:
ACF does not have significant value at lag ‘S’

Condition 4:

Before applying difference at lag ‘S’:

After applying difference at lag ‘S’:


ACF has significant values at lag ‘S’ after the difference is applied at lag ‘S’

Condition 5:

Before applying difference at lag ‘S’:

After applying difference at lag ‘S’:


The seasonal root test fails. Hence, we conclude that there is no seasonality in our data.

We can see that there is an event on 07/01/2012:

From the PACF and IACF graphs below, we can see that p = 0.
From the ACF, we can see that q <= 5.

Trying various combinations of p,q to see the best model that fits our data:

The best model that fits our data is: Cubic Trend + Point: JUL 2010 + MA(3)

Since including the trend helps us model our data better, we conclude there is a trend in our data.
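For illustration, this structure (cubic trend, a point intervention, and MA(3) errors) could be expressed in Python/statsmodels roughly as below; the dates, index, and demand values are hypothetical stand-ins for the actual series:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly series standing in for 205-70-N1190.
idx = pd.date_range("2008-01-01", periods=60, freq="MS")
rng = np.random.default_rng(4)
demand = pd.Series(rng.poisson(40, 60).astype(float), index=idx)

t = np.arange(len(demand))
exog = pd.DataFrame({
    "t": t, "t2": t ** 2, "t3": t ** 3,              # cubic trend
    "jul2010": (idx == "2010-07-01").astype(float),  # point intervention
}, index=idx)

# Regression components plus MA(3) errors, mirroring the selected model.
fit = ARIMA(demand, exog=exog, order=(0, 0, 3)).fit()
print(fit.params)
```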
Residual plot:

Correlation functions:

White noise:
Parameter estimates:

Variable PC_198_27_42263:

Trend Analysis: Condition 1:

We can see that there is a slight upward trend.

Condition 2: ACF, PACF, IACF have significant values at lag ‘1’


Condition 3: ACF reduces significantly after lag ‘1’

Condition 4: Before applying difference at lag ‘1’:


After applying difference at lag ‘1’: the ACF has a significant value at lag ‘1’

Condition 5: Before applying difference at lag ‘1’:

After applying difference at lag ‘1’:


Hence, we conclude that there is a trend in our data.

Seasonality analysis:

Condition 1: We can see that there is no pattern that is repeated over a period of time.

Condition 2: ACF, PACF and IACF have significant values at lag ‘S’

Condition 3:
ACF has significant values at multiples of lag ‘S’

Condition 4: Before applying difference at lag ‘S’

After applying difference at lag ‘S’:

The ACF has significant values at lag ‘S’

Condition 5: Before applying difference at lag ‘S’


After applying difference at lag ‘S’:

Conditions 1,4,5 fail, hence we can conclude that there is no seasonality in our data.

From the PACF and IACF, we can conclude that the value of p <= 1.

From the ACF, we can conclude that q <= 14.

Trying various combinations of p and q to model our data, we select Cubic Trend + MA(1) as the best model (see the Results section):


Residual plot:

Correlation functions:
White noise:

Parameter estimates:

Variable: PC_203_32_51461

Trend Analysis: Condition 1:


Condition 2:

Condition 3:

Condition 4: Before applying first difference at lag 1:

After applying difference at lag 1:


Condition 5:

Before applying first difference at lag 1:

After applying difference at lag 1:


Conditions 1 and 2 fail, so there is no TREND in our data.

Seasonality analysis:

Condition 1:

Condition 2:

Condition 3:
Condition 4:

Before applying first difference at lag ‘S’:

After applying difference at lag ‘S’:

Condition 5:

Before applying difference at lag S:


After applying difference at lag ‘S’:

Conditions 2 and 5 fail, hence we conclude that there is no SEASONALITY in our data. This implies that our series is stationary.

Checking the PACF and IACF, we evaluate p = 0.

Checking the values of the ACF, q = 0.


Hence, we conclude that a baseline mean model will best fit our data.
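The mean model itself is trivial to express; a sketch on a hypothetical stand-in series:

```python
import numpy as np
import pandas as pd

# Hypothetical series standing in for PC_203_32_51461.
rng = np.random.default_rng(5)
demand = pd.Series(60 + rng.normal(0, 8, 48))

# The mean model forecasts every future period with the historical average.
print(f"Forecast for all horizons: {demand.mean():.1f}")
```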

Residual plot:

Correlation functions:

White noise:
Parameter estimates:

However, we see an event at 09/01/2012

If we consider this event in our data model, we see that the white noise increases:
Hence, we conclude that the mean model suits best to our data.

Variable PC_600_863_4210:

Firstly, we identify if the series is stationary or non-stationary by checking if there is trend and seasonality in the
series.

From the autocorrelation plots below, we see that there is no significant spike at lag 1 in the ACF, IACF, and PACF. After applying first differencing, lag 1 becomes significant instead.

After applying a difference of order S, there is not much change at lag S.

Unit Root and Seasonal root tests before applying differencing. The seasonal root test shows that there is no
seasonality in the series as the results are significant. The unit root test results are not significant, so let us apply
first differencing and check.
After applying first differencing, the unit root test results become significant, signifying that there is trend in the
series.
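A sketch of this difference-then-retest step on a hypothetical trending series (the tests above came from the forecasting tool's unit root and seasonal root output):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Hypothetical stand-in for the PC_600_863_4210 series.
rng = np.random.default_rng(6)
y = pd.Series(20 + 0.9 * np.arange(60) + rng.normal(0, 3, 60))

dy = y.diff(1).dropna()    # first difference (lag 1) targets trend
dys = y.diff(12).dropna()  # difference at lag S = 12 would target seasonality

print(f"level ADF p-value:       {adfuller(y)[1]:.3f}")   # not significant
print(f"differenced ADF p-value: {adfuller(dy)[1]:.3f}")  # significant -> trend
```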

Upon running various models for trend, we find that the cubic trend model gives the lowest RMSE.

Now, we also model the error in the trend models and see that again Cubic Trend + AR(13) gives the lowest RMSE.

Cubic Trend + AR(13)

However, the lags in the IACF and PACF for the cubic trend models are still significant even after applying the error model, and the white noise is also higher. We see the same issue with the Quadratic Trend + AR(13) model as with the Cubic Trend + AR(13) model.

So, this leaves us with the Linear Trend + AR(3) model, which seems to be the best option considering the white
noise test and the ACF plots.

All the lags in ACF, PACF and IACF plots are now within the boundaries.
White noise is reduced compared to the more sophisticated models like Cubic Trend/ Quadratic Trend models.

The prediction errors seem a little random.
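As a sketch, the selected "Linear Trend + AR(3)" structure can be expressed compactly in Python/statsmodels on a hypothetical stand-in series:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical stand-in for the PC_600_863_4210 series.
rng = np.random.default_rng(7)
y = pd.Series(20 + 0.9 * np.arange(60) + rng.normal(0, 3, 60))

# trend="ct" adds an intercept and a linear time trend alongside AR(3) errors.
fit = ARIMA(y, order=(3, 0, 0), trend="ct").fit()
print(fit.params)
```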


Variable PC_6735_61_3410:

Firstly, we identify if the series is stationary or non-stationary by checking if there is trend and seasonality in the
series.

From the below autocorrelation plots, we see that there is a significant spike at lag 1 in ACF, IACF and PACF which
shows there could be trend in the series and there doesn’t appear to be any significant spikes at lag S (signifying no
trace of seasonality in the series).

After applying first differencing, lag 1 becomes less significant, but other lags become significant.
After applying a difference of order S, there is not much change at lag S.

Unit Root and Seasonal root tests before applying differencing. The seasonal root test shows that there is no
seasonality in the series as the results are significant. The unit root test results are not significant, so let us apply
first differencing and check.
After applying first differencing, the unit root test results become almost significant, signifying that there is trend
in the series.

Upon running various models for trend, we find that the cubic trend model gives the lowest RMSE:

Now, we also model the error in the trend models and see that again Cubic Trend + AR(13) gives the lowest RMSE. However, the lags in the IACF and PACF for the cubic trend models are still significant even after applying the error model, and the white noise is also higher.

We see the same issue with the Quadratic Trend + AR(13) model. So, this leaves us with the Linear Trend + AR(7) model, which seems to be the best option considering the white noise test and the ACF plots.
All the lags in ACF, PACF and IACF plots are now within the boundaries.
White noise is reduced compared to the more sophisticated models like Cubic Trend/ Quadratic Trend models.

Parameter Estimates:

Variable: D30141135

Selecting the time series variable as D30141135 and viewing the Time Series Plot for this selected spare part.

From the pattern observed above, trend and seasonality seem to be missing. We check the other parameters to validate this.

Viewing the autocorrelation plots before any differencing is applied: the ACF plot seems to be gradually decreasing. The ACF and PACF have significant lag-1 values, but the IACF does not. There is no significant value at the 12th lag, so seasonality does not appear to exist.
Viewing the autocorrelation plots after first differencing.

The ACF now has few significant values after first differencing.

Applying seasonal differencing on the autocorrelation plots and observing the change.
Viewing the Unit Root tests and the seasonal root tests for this time series before applying any differences.

We can see that there are many insignificant values present.

Applying simple difference and viewing the unit root tests.


We can see that not all values have become significant in the unit root tests. Trend is not present. Applying
seasonal difference and viewing the seasonal root tests.

We can see that after applying the seasonal difference, there are insignificant values present in the seasonal root tests. We can confirm that there is no seasonality present in this time series. This time series is a STATIONARY time series.

Modeling the Time Series for the part D30141135

Since the time series is a Stationary Time Series, we will be applying ARMA(p,q) models.

From the autocorrelation plots, the value of p = 1 (from PACF) and the value of q = 3 (from ACF)

The below models were developed and their RMSE evaluation is also shown.
From the above, choosing a model of low complexity yet low RMSE, we select the “Quadratic Trend + ARMA(1,3)” model. Its RMSE is 87.53.

Showing the model details below. The model prediction looks like this.

The model predictions cover most of the points as we can see above.

The prediction errors are shown below. They are quite random in nature, which is good.

The prediction error autocorrelation plots are shown below.


The Prediction error white noise test, unit root tests and seasonal root tests are shown below.

We can see that none of the noise tests are significant, which is good.

The parameter estimates are shown below for this model developed.

The statistics of fit are shown below.


The forecast for this part is plotted below.

The forecasted dataset is shown below along with the upper and lower boundary values.

It can be seen that the demand is likely to be dropping in the near future for the spare part D30141135.
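For illustration, a forecast table with upper and lower boundary values can be produced as below; the series is a hypothetical stand-in, and the model is simplified to a plain ARMA(1,3) without the quadratic trend:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical stand-in for the D30141135 series.
rng = np.random.default_rng(8)
demand = pd.Series(200 + rng.normal(0, 30, 72))
fit = ARIMA(demand, order=(1, 0, 3)).fit()

pred = fit.get_forecast(steps=12)
table = pred.conf_int(alpha=0.05).assign(forecast=pred.predicted_mean)
print(table)  # lower bound, upper bound, and point forecast per horizon
```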

Variable: 600-181-6740I

Selecting the time series variable as 600-181-6740I and viewing the Time Series Plot for this selected spare part.
From the pattern observed above, trend and seasonality seem to be missing. We check the other parameters to validate this.

Viewing the autocorrelation plots before any differencing is applied: the ACF plot seems to be gradually decreasing. The ACF and PACF have significant lag-1 values, but the IACF does not. There is no significant value at the 12th lag, so seasonality does not appear to exist.

Viewing the autocorrelation plots after first differencing.


The ACF now has few significant values after first differencing.

Applying seasonal differencing on the autocorrelation plots and observing the change.

Viewing the Unit Root tests and the seasonal root tests for this time series before applying any differences.
We can see that there are many insignificant values present.

Applying simple difference and viewing the unit root tests.

We can see that not all values have become significant in the unit root tests. Trend is not present. Applying
seasonal difference and viewing the seasonal root tests.
We can see that after applying the seasonal difference, there are insignificant values present in the seasonal root tests. We can confirm that there is no seasonality present in this time series. This time series is a STATIONARY time series.

Modeling the Time Series for the part 600-181-6740I

Since the time series is a Stationary Time Series, we will be applying ARMA(p,q) models.

From the autocorrelation plots, the value of p = 1 (from PACF) and the value of q = 1 (from ACF)

The below models were developed and their RMSE evaluation is also shown.

From the above, choosing a model of low complexity yet low RMSE, we select the “AR(1)” model. Its RMSE is 48.69. It is to be noted that the noise reduction is the best for this model when compared to the other models developed.

Showing the model details below. The model prediction looks like this.
The model predictions cover most of the points as we can see above.

The prediction errors are shown below. They are quite random in nature, which is good.

The prediction error autocorrelation plots are shown below.


The Prediction error white noise test, unit root tests and seasonal root tests are shown below.

We can see that none of the noise tests are significant, which is good.

The parameter estimates are shown below for this model developed.

The statistics of fit are shown below.


The forecast for this part is plotted below.

The forecasted values with the Upper bound and Lower bound values are shown below.

Variable: 07063-51210I

Selecting the time series variable as 07063-51210I and viewing the Time Series Plot for this selected spare part.
From the pattern observed above, an upward trend seems to be present, while seasonality seems to be missing. We check the other parameters to validate this.

Viewing the autocorrelation plots before any differencing is applied: the ACF plot seems to be gradually decreasing, though irregularities are present. The ACF and PACF have significant lag-1 values, but the IACF does not. There is no significant value at the 12th lag, so seasonality does not appear to exist.

Viewing the autocorrelation plots after first differencing.


The ACF now has few significant values after first differencing.

Applying seasonal differencing on the autocorrelation plots and observing the change.

Viewing the Unit Root tests and the seasonal root tests for this time series before applying any differences.
We can see that there are many insignificant values present.

Applying simple difference and viewing the unit root tests.

We can see that all insignificant values have now become significant. Trend is present in the time series. Applying
seasonal difference and viewing the seasonal root tests.
We can see that the seasonal root tests were all significant before applying the seasonal differencing. This proves
that seasonality does not exist. The time series has only Trend and thus is a Non Stationary time series.

Modeling the Time Series for the part 07063-51210I

Since the time series is a Non-Stationary Time Series, we will be applying trend models to capture the trend present in the time series for this part and ARMA(p,q) models to model the noise.

From the autocorrelation plots, the value of p = 1 (from IACF) and the value of q = 0 (from ACF)

The below models were developed and their RMSE evaluation is also shown.

From the above, choosing a model of low complexity yet low RMSE, we select the “Quadratic Trend” model. Its RMSE is 62.94. It is to be noted that the noise reduction is the best for this model when compared to the other models developed.

Showing the model details below. The model prediction looks like this.
The prediction errors are shown below. They are quite random in nature, which is good.

The prediction error autocorrelation plots are shown below.


The Prediction error white noise test, unit root tests and seasonal root tests are shown below.

We can see that none of the noise tests are significant, which is good.

The parameter estimates are shown below for this model developed.

The statistics of fit are shown below.

The forecast for this part is plotted below.


The forecasted values with the Upper bound and Lower bound values are shown below.

Variable - _600_319_45401

Visually, it looks like there is an upward trend.


There are significant values at lag 1 for the ACF and PACF, and the ACF values decay slowly from lag 1. There are no significant values at lag S in any of the graphs.

These are the values of ACF, PACF and IACF before any differencing is applied.
ACF has fewer significant values after first differencing is applied.

The unit root tests become significant once first differencing is applied.


After seasonal differencing is applied, we can see no results that indicate seasonality.

Hence we can conclude that the data has a trend but no seasonality.
After applying many models, the best model is Cubic Trend + AR(13) (see the Results section), as selected in the picture above.

The errors look randomly distributed.


The values lie well within the significance limits.

White noise test is minimized.


Parameter estimates look acceptable, with p-values < 0.05.

These statistics show that it has the best R-square and minimum RMSE.
Variable: _6735_51_51431

Looking at the plot, there seems to be no trend or seasonality.

Since there are no lags outside the significance lines and no error terms, we can say the variable has neither trend nor seasonality, and the series is stationary without any error terms.
White noise is already minimal, so we suggest using the baseline mean model as the reference for prediction.

The errors look randomly distributed.


All values are within significance range.

White noise is minimal.

Parameter estimates are within the significance range.


The prediction is based on the average of the previous values.

Variable - _7000_B2011l

Visually, there seems to be no trend and seasonality.


Since there are no significant lags at lag 1 and there is no decaying pattern, we can conclude that the series is stationary and there is no trend or seasonality. From the above plot we can see that p = 11, and we will check different models with different p values.

These are the tests before applying any differencing.


After applying different models, the best model was AR(11).

The error terms look randomly distributed.


All values are within significance range.

White noise is minimal.


Parameter estimates have significant values with p < 0.05.

The fit statistics show that this model has the least RMSE.

Visually, the forecast graph looks acceptable.
