PROBLEM A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic. During an earlier trial, 22 cars were test driven using the additive, and the mean number of kilometers per liter was 10.5.
NULL HYPOTHESIS There is no significant difference in engine efficiency between the present trial and the earlier trial.
ALTERNATE HYPOTHESIS There is a significant difference in engine efficiency between the present trial and the earlier trial.
PROCEDURE
1. Select the Analyze menu. 2. Click on Compare Means and then One-Sample T Test to open the One-Sample T Test dialogue box. 3. Select withadd and move the variable into the Test Variable(s): box. 4. In the Test Value: box, type the mean score (10.5). 5. Click OK.
OUTPUT
One-Sample Statistics
           N    Mean    Std. Deviation   Std. Error Mean
withadd    22   13.86   2.748            .586

One-Sample Test (Test Value = 10.5)
           t       df   95% Confidence Interval of the Difference
                        Lower   Upper
withadd    5.741   21   2.15    4.58
INFERENCE 1. The difference between the sample mean and the hypothesized mean is assessed by consulting the t-value, degrees of freedom (df) and two-tail significance. 2. If the value for two-tail significance is less than .05 (p < .05), then the difference between the means is significant. 3. The cars in the present trial appear to have greater engine efficiency than those in the earlier trial, t(21) = 5.74, p < .05.
RESULT The output indicates that there is a significant difference in engine efficiency between the present trial and the earlier trial.
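The t-value in the output can be checked by hand from the One-Sample Statistics table. The sketch below is only an illustration in Python (not part of the SPSS procedure); it assumes the rounded summary values printed above, so it gives t of about 5.73 rather than the 5.741 that SPSS computes from the unrounded data. The critical value 2.080 is the two-tailed t-table value for df = 21 at alpha = .05.

```python
import math

# Summary statistics as printed in the One-Sample Statistics table.
n = 22
sample_mean = 13.86
sample_sd = 2.748
test_value = 10.5  # mean kilometers per liter from the earlier trial

# Standard error of the mean and the one-sample t statistic.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - test_value) / standard_error
df = n - 1

# Two-tailed critical t for df = 21 at alpha = .05, taken from a t table.
T_CRITICAL = 2.080
significant = abs(t_stat) > T_CRITICAL

print(f"SE = {standard_error:.3f}, t({df}) = {t_stat:.2f}, significant = {significant}")
```

Because the statistic is far beyond the critical value, the conclusion does not depend on the small rounding difference.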
OBJECTIVE:
To find out the difference in opinion among two sets of people by Independent sample t-test using SPSS.
PROBLEM:
As marketers of brand jeans, we want to find out whether a set of customers in Delhi and set of customers in Mumbai thought of our brand in the same way or not. A small survey was conducted in both the cities and the ratings were obtained on an interval scale 1-7. We want to find out whether the two sets of rating are significantly different.
S.NO. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
RATING 2 3 3 4 5 4 4 5 3 4 5 4 3 3 4
CITY 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
S.NO. 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
RATING 3 4 5 6 5 5 5 4 3 3 5 6 6 6 5
CITY 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
NULL HYPOTHESIS There is no significant difference between the ratings given by the customers in Mumbai and Delhi at the 95% confidence level.
ALTERNATE HYPOTHESIS
There is a significant difference between the ratings given by the customers in Mumbai and Delhi at the 95% confidence level.
PROCEDURE:
1. The variables are entered in the Variable View of the SPSS data editor, where city is a categorical variable with nominal measure and the respondent's rating is a scale variable. 2. In the Values cell for city, enter the value labels 1 = Mumbai and 2 = Delhi. 3. The given data is entered in the Data View. 4. Choose Analyze from the main menu. 5. Then choose Compare Means > Independent-Samples T Test. 6. In the Independent-Samples T Test dialogue box, the rating given by the respondents is entered as the test variable and the city they belong to is entered as the grouping variable. 7. Click Define Groups and enter the specified values (1 and 2) for the groups. 8. The output is generated and analyzed, and the inference is obtained.
Group Statistics
                      City     N    Mean
Respondent's rating   Mumbai   15   3.73
                      Delhi    15   4.73

Independent Samples Test (Ratings)
Levene's Test for Equality of Variances: F = .727, Sig. = .401

                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       -2.745   28       .010              -1.000            .364                    -1.746         -.254
Equal variances not assumed   -2.745   26.759   .011              -1.000            .364                    -1.748         -.252
INFERENCE:
1. The Independent-Samples T Test procedure compares the two group means (Mumbai and Delhi). 2. The mean values for the two groups are displayed in the Group Statistics table (3.73 - 4.73 = -1.00). 3. One version of the test assumes that the variances of the two groups are equal; Levene's test checks this assumption. 4. The significance value for Levene's test is high (0.401 is greater than 0.10), so equal variances are assumed for both groups and the second test (equal variances not assumed) is ignored. 5. The significance value for the t-test (0.010) is less than 0.05, and the confidence interval for the mean difference does not contain zero. 6. So the null hypothesis is rejected and the alternate hypothesis is accepted. This indicates that there is a significant difference between the two group means.
RESULT:
There is a significant difference in the ratings on the brand, given by the respondents in the cities of Mumbai and Delhi.
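As a cross-check outside SPSS, the equal-variances t-test can be recomputed from the 30 ratings listed in the problem. The following is a minimal Python sketch using only the standard library; the function name pooled_t_test is an illustrative choice, and the p-value is not computed (the t statistic, df and standard error are enough to compare with the output table).

```python
import math
from statistics import mean, variance

# Ratings from the data table: city code 1 (S.No. 1-15) and city code 2 (S.No. 16-30).
city1 = [2, 3, 3, 4, 5, 4, 4, 5, 3, 4, 5, 4, 3, 3, 4]
city2 = [3, 4, 5, 6, 5, 5, 5, 4, 3, 3, 5, 6, 6, 6, 5]

def pooled_t_test(a, b):
    """Independent-samples t statistic assuming equal variances."""
    n1, n2 = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean(a) - mean(b)) / std_error
    return t_stat, n1 + n2 - 2, std_error

t_stat, df, std_error = pooled_t_test(city1, city2)
print(f"mean difference = {mean(city1) - mean(city2):.3f}")
print(f"t({df}) = {t_stat:.3f}, std. error of difference = {std_error:.3f}")
```

The result agrees with the "Equal variances assumed" row: t = -2.745 with df = 28 and a standard error of .364.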
PROBLEM A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic. Does engine efficiency improve when the additive is used? This is a repeated-measures t-test design.
NULL HYPOTHESIS There is no significant difference between engine efficiency with and without the additive.
ALTERNATE HYPOTHESIS There is a significant difference between engine efficiency with and without the additive.
PROCEDURE 1. Select the Analyze menu. 2. Click on Compare Means and then Paired-Samples T Test to open the Paired-Samples T Test dialogue box. 3. Select the variables without and withadd and move them into the Paired Variables: box. 4. Click OK.
OUTPUT
Paired Samples Statistics and Paired Samples Correlations: N = 22.
Paired Differences, 95% Confidence Interval of the Difference: Lower -6.651, Upper -4.?6 (one digit illegible).
INFERENCE 1. The test determines whether the two sets of measurements come from the same or different populations. 2. The significance is determined by looking at the probability level (p) specified under the heading two-tail significance. 3. If the probability value is less than the specified alpha value, then the observed t-value is significant. 4. The 95 percent confidence interval is constructed so that, over repeated samples, 95 percent of such intervals will contain the true difference between the population means. 5. The additive significantly improves the number of kilometers to the liter, t(21) = -8.66, p < .05.
RESULT The output shows that there is a significant difference between engine efficiency with and without the additive.
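The paired-samples t statistic is the mean of the within-pair differences divided by its standard error. The sketch below illustrates that arithmetic in Python. The raw kilometer-per-liter readings are not reproduced in this write-up, so the two short lists are hypothetical values invented purely for illustration, and paired_t_test is an illustrative name.

```python
import math
from statistics import mean, stdev

def paired_t_test(first, second):
    """Paired-samples t statistic for first minus second."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    std_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / std_error, n - 1

# Hypothetical readings (NOT the data from the 22 cars in the problem).
without_additive = [8, 9, 7, 10, 8]
with_additive = [12, 13, 12, 15, 13]

t_stat, df = paired_t_test(without_additive, with_additive)
print(f"t({df}) = {t_stat:.2f}")
```

As in the SPSS output, the sign is negative when the first variable of the pair (without) has the lower mean.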
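Paired Samples Statistics (N = 22 per variable)
Pair 1   without   (values partly illegible)
         withadd   Mean 13.86   Std. Deviation 2.748   Std. Error Mean .586

Paired Samples Correlations
Pair 1   without & withadd   N 22   (Correlation and Sig. illegible)

Paired Samples Test
Pair 1   without - withadd   Std. Error Mean .619   t -8.663   df 21   (Mean, Std. Deviation and Sig. (2-tailed) partly illegible)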
OBJECTIVE:
To test the preferred ad copy by the target population before the launch of its campaign.
PROBLEM:
There are three different versions of advertising copy created by an advertising agency for a campaign. Let us call these versions of copy as adcopy 1, 2 and 3. A sample of 18 respondents is selected from the target population in the nearby areas of the city. At random, these 18 respondents are assigned to the 3 versions of ad copy. Each version of ad copy is thus shown to six of the respondents. The respondents are asked to rate their liking for the ad copy shown to them on a scale of 1 to 10. (1 = Not liked at all, 10 = Liked a lot, and other values in between these two).
S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Ad copy 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3
Rating 6.00 7.00 5.00 8.00 8.00 8.00 4.00 4.00 5.00 7.00 7.00 6.00 5.00 5.00 4.00 7.00 8.00 7.00
Null Hypothesis There is no difference in the ratings between the three versions of the ad copy at 95% confidence level.
Alternative Hypothesis
There is a significant difference between the three versions of the ad copy at 95% confidence level.
PROCEDURE:
1. The variables are defined in the Variable View and the given data is entered in the Data View. 2. Choose Analyze > Compare Means > One-Way ANOVA. 3. In the One-Way ANOVA dialog box, select ratings for the Dependent List and ad copy as the Factor. 4. Select other options as required (for example Descriptive and Homogeneity of variance test under Options). 5. The output is generated and analysed, and the inference is obtained.
OUTPUT:
Descriptives: Ratings
            N    Mean     Std. Deviation   Std. Error   95% CI Lower Bound   95% CI Upper Bound   Minimum   Maximum
Ad copy 1   6    7.0000   1.26491          .51640       5.6726               8.3274               5.00      8.00
Ad copy 2   6    5.5000   1.37840          .56273       4.0535               6.9465               4.00      7.00
Ad copy 3   6    6.0000   1.54919          .63246       4.3742               7.6258               4.00      8.00
Total       18   6.1667   1.46528          .34537       5.4380               6.8953               4.00      8.00

Test of Homogeneity of Variances: Ratings
df1   df2   Sig.
2     15    .596

ANOVA: Ratings
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   7.000            2    3.500         1.780   .203
Within Groups    29.500           15   1.967
Total            36.500           17
INFERENCE:
1. The descriptive of the ratings are obtained in terms of mean and standard deviation.
3. The significance value for Levene's test of homogeneity of variances is high (0.596 > 0.05), so the variances for the three versions can be treated as equal and the assumption behind ANOVA is justified. In the ANOVA table, Sig. represents the significance level of the F-test (0.203).
4. Since 0.203 is greater than 0.05, the null hypothesis is not rejected and the alternate hypothesis is not accepted.
RESULT:
There is no significant difference in the preferences over the three versions of the ad copy by a target population before the launch of its campaign.
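The ANOVA table can be reproduced directly from the 18 ratings in the problem. The following Python sketch (standard library only; one_way_anova is an illustrative name) computes the between-groups and within-groups sums of squares and the F ratio. It does not compute the significance value, which SPSS reports as .203.

```python
from statistics import mean

# Ratings from the data table, six respondents per ad copy.
ratings = {
    "ad copy 1": [6, 7, 5, 8, 8, 8],
    "ad copy 2": [4, 4, 5, 7, 7, 6],
    "ad copy 3": [5, 5, 4, 7, 8, 7],
}

def one_way_anova(groups):
    """Return SS between, SS within, df between, df within and F."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_between, ss_within, df_between, df_within, f_ratio = one_way_anova(list(ratings.values()))
print(f"SS between = {ss_between:.3f} (df {df_between})")
print(f"SS within  = {ss_within:.3f} (df {df_within})")
print(f"F = {f_ratio:.3f}")
```

The values match the ANOVA table (7.000, 29.500 and F = 1.780), which confirms that the flattened output was read correctly.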
CORRELATION
OBJECTIVE:
To find the interrelationship between the dependent and the independent variables.
PROBLEM:
A manufacturer and the marketer of electric motors would like to build a regression model consisting of 5 or 6 independent variables to sales. Past data has been collected for 15 sales territories, on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.
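Dependent Variable: Sales (Y), in Rs. lakhs in the territory.
Independent Variables:
X1 Market potential of the territory (in Rs. lakhs).
X2 No. of dealers of the company in the territory.
X3 No. of sales people in the territory.
X4 Index of competitor activity in the territory on a 5-point scale.
X5 No. of service people in the territory.
X6 No. of existing customers in the territory.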
S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sales (Y) 5 60 20 11 45 6 15 22 29 3 16 8 18 23 81
Dealers (X2) 1 12 5 2 12 3 5 7 4 1 4 2 7 10 15
People (X3) 6 30 15 10 20 8 18 16 15 6 11 9 14 10 35
Competition (X4) 5 4 3 3 2 2 4 3 2 5 4 3 3 4 4
Service (X5) 2 5 2 2 4 3 5 6 5 2 2 3 4 3 7
Customers (X6) 20 50 25 20 30 16 30 40 39 5 17 10 31 43 70
Null Hypothesis
There is no significant relationship between the independent and the dependent variables at 95% confidence interval.
Alternative Hypothesis
There is significant relationship between the independent and dependent variables at 95% confidence interval.
PROCEDURE: Let the estimating equation be Y = a + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6. 1. The variables are defined in the Variable View of the SPSS data editor. 2. Enter the data in the Data View. 3. Choose Analyze > Correlate > Bivariate from the main menu. 4. In the Bivariate Correlations dialogue box, select all the dependent and independent variables. 5. Select the Pearson correlation coefficient with a one-tailed test of significance. 6. Also include the statistics for mean and standard deviation. 7. The output is generated and analyzed, and the inference is obtained.
OUTPUT:
Descriptive Statistics
                                                   Mean    Std. Deviation   N
Sales in Rs. lakhs in the territory                24.13   21.980           15
Market potential in the territory (in Rs. lakhs)   55.80   42.543           15
No. of dealers of the company in the territory     6.00    4.408            15
No. of sales people in the territory               14.87   8.340            15
Index of competitor activity in the territory      3.40    .986             15
No. of service people in the territory             3.67    1.633            15
No. of existing customers in the territory         29.73   16.829           15
Correlations (Pearson Correlation, N = 15 for every pair)
                            Sales      Market     Dealers    Sales people   Competitor   Service    Customers
Sales in Rs. lakhs          1          .945(**)   .908(**)   .953(**)       -.046        .726(**)   .878(**)
Market potential            .945(**)   1          .837(**)   .877(**)       .140         .613(**)   .831(**)
No. of dealers              .908(**)   .837(**)   1          .855(**)       -.082        .685(**)   .860(**)
No. of sales people         .953(**)   .877(**)   .855(**)   1              -.036        .794(**)   .854(**)
Index of competitor         -.046      .140       -.082      -.036          1            -.178      -.015
No. of service people       .726(**)   .613(**)   .685(**)   .794(**)       -.178        1          .818(**)
No. of existing customers   .878(**)   .831(**)   .860(**)   .854(**)       -.015        .818(**)   1
** Correlation is significant at the 0.01 level (1-tailed).
Sig. (1-tailed): .000 for every ** correlation except service people with sales (.001), with market potential (.008) and with dealers (.002). For the index of competitor activity: with sales .436, market potential .309, dealers .385, sales people .449, service people .263, existing customers .479.
INFERENCE:
The correlations table shows Pearson correlation coefficients, significance values, and the number of cases with non missing values.
1. The Pearson correlation coefficient measures the linear association between two variables; its value ranges from -1 to 1. 2. The sign of the correlation coefficient indicates the direction of the relationship. Here sales has a negative coefficient with the index of competitor activity (-.046) and positive coefficients with market potential, number of dealers, number of sales people, number of service people and number of existing customers. 3. The absolute value of the correlation coefficient indicates the strength, with larger absolute values indicating stronger relationships. 4. The significance levels for market potential (0.000), number of dealers (0.000), number of sales people (0.000), number of service people (0.001) and number of existing customers (0.000) are all less than 0.05. So the null hypothesis is rejected and the alternate hypothesis is accepted: these correlations are significant and the variables are linearly related with sales. 5. The significance level for the index of competitor activity (0.436) is greater than 0.05, so that correlation is not significant and the variable is not linearly related with sales. 6. This suggests that the index of competitor activity, taken on its own, is of little use to the manufacturer in explaining sales.
RESULT:
Hence there is dependence between the sales (dependent variable) and the market potential of the territory, number of dealers of the company in the territory, number of sales people in the territory, number of service people in the territory and number of existing customers in the territory (independent variables). The index of competitor activity in the territory and sales are negatively correlated (r = -.046), but the correlation is not significant.
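Any single entry of the correlation matrix can be verified from the data table. The Python sketch below (standard library only; pearson_r is an illustrative name) recomputes the coefficient between Sales (Y) and Dealers (X2) from the 15 territories listed in the problem. Only that one pair is shown, and the significance value is not computed.

```python
import math
from statistics import mean

# Sales (Y) and Dealers (X2) for the 15 territories, from the data table.
sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]
dealers = [1, 12, 5, 2, 12, 3, 5, 7, 4, 1, 4, 2, 7, 10, 15]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(sales, dealers)
print(f"r(sales, dealers) = {r:.3f}")
```

The result, about .908, matches the value reported for sales and number of dealers.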
FACTOR ANALYSIS
OBJECTIVE:
To find a smaller number of underlying factors that are linear combinations of the original 10 variables.
PROBLEM:
A two-wheeler manufacturer is interested in determining which variables potential customers think about when they consider his product. Twenty two-wheeler owners were surveyed by the manufacturer. They were asked to indicate, on a 7-point scale from 1 (completely agree) to 7 (completely disagree), their agreement or disagreement with a set of 10 statements relating to their perceptions and some attributes of the two-wheeler. Use factor analysis to find underlying factors which are fewer in number but are linear combinations of the original 10 variables.
TEN STATEMENTS:
1. I use a two-wheeler because it is affordable. 2. It gives me a sense of freedom to own a two-wheeler. 3. Low maintenance cost makes a two-wheeler very economical in the long run. 4. A two-wheeler is essentially a man's vehicle. 5. I feel very powerful when I am on my two-wheeler. 6. Some of my friends who don't have their own vehicle are jealous of me. 7. I feel good whenever I see the ad of my two-wheeler. 8. My vehicle gives me a comfortable ride. 9. I think two-wheelers are a safe way to travel. 10. Three people should be legally allowed to travel on a two-wheeler.
S.No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Q1 1 2 2 5 1 3 2 4 2 1 1 1 3 2 2 5 1 2 3 4
Q2 4 3 2 1 2 2 2 4 3 4 5 6 1 2 5 6 4 3 3 3
Q3 1 2 2 4 2 3 5 3 2 2 1 1 4 2 1 3 2 1 2 2
Q4 6 4 1 2 5 3 1 4 6 2 3 1 4 2 3 2 2 1 3 7
Q5 5 3 2 2 4 3 2 4 5 1 2 1 4 2 2 1 1 2 4 6
Q6 6 3 1 2 4 3 1 5 6 2 3 1 3 2 3 3 2 2 3 6
Q7 5 3 1 2 4 3 2 3 5 1 2 1 3 2 2 2 1 2 4 6
Q8 2 5 7 3 1 6 4 2 1 4 2 1 6 1 2 5 1 3 3 2
Q9 3 5 6 2 1 5 4 3 4 4 2 2 5 3 1 5 1 2 3 3
Q10 2 2 2 3 2 3 5 3 1 1 1 2 3 2 6 4 3 2 3 6
PROCEDURE:
1. The variables are defined in the Variable View of the SPSS data editor.
2. Enter the data in the Data View.
3. Choose Analyze > Data Reduction > Factor from the main menu and enter the variables.
4. In the Factor Analysis dialogue box, select the analysis variables and check the options as required.
OUTPUT:
Descriptive Statistics
                                     Mean   Std. Deviation   Analysis N
It is affordable                     2.35   1.309            20
Gives sense of freedom               3.25   1.482            20
Economical                           2.25   1.118            20
Man's vehicle                        3.10   1.804            20
Feel powerful                        2.80   1.508            20
Friends would be jealous             3.05   1.605            20
Feel good to see ad of my vehicle    2.70   1.455            20
Comfortable driving                  3.05   1.905            20
Safe way to travel                   3.20   1.508            20
3 people should be legally allowed   2.80   1.473            20

Communalities
Initial is 1.000 for all ten variables. The Extraction values that survive in the output are:
Feel good to see ad of my vehicle    .955
Comfortable driving                  .799
Safe way to travel                   .777
3 people should be legally allowed   .789
Extraction Method: Principal Component Analysis.

Component Matrix (a)
                                     Component 1   Component 2   Component 3
It is affordable                     .176          .670          .493
Gives sense of freedom               -.136         -.608         .254
Economical                           -.107         .820          .218
Man's vehicle                        .966          -.036         -.097
Feel powerful                        .951          .166          -.136
Friends would be jealous             .952          -.084         -.025
Feel good to see ad of my vehicle    .971          .096          -.046
Comfortable driving                  -.322         .775          -.308
Safe way to travel                   -.069         .735          -.482
3 people should be legally allowed   .161          .319          .814
Extraction Method: Principal Component Analysis. a. 3 components extracted.

Component Score Coefficient Matrix
                                     Component 1   Component 2   Component 3
It is affordable                     .004          .023          .434
Gives sense of freedom               -.063         -.278         .043
Economical                           -.041         .176          .283
Man's vehicle                        .256          .003          -.042
Feel powerful                        .257          .081          -.030
Friends would be jealous             .245          -.038         -.007
Feel good to see ad of my vehicle    .253          .026          .014
Comfortable driving                  -.047         .360          -.057
Safe way to travel                   .033          .406          -.166
3 people should be legally allowed   -.032         -.203         .568
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Component Scores.

Component Score Covariance Matrix
Component   1       2       3
1           1.000   .000    .000
2           .000    1.000   .000
3           .000    .000    1.000
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Component Scores.
INFERENCE:
Factor analysis is primarily used for data reduction or structure detection. 1. Communalities indicate the amount of variance in each variable that is accounted for by the extracted components. 2. The Component Matrix table reports the factor loadings for each variable on the unrotated components or factors. 3. The Rotated Component Matrix table (called the Pattern Matrix for oblique rotations) reports the factor loadings for each variable on the components or factors after rotation. 4. Group the variables which have high loadings on the same component. 5. Here man's vehicle, feel powerful, friends would be jealous and feel good to see ad of my vehicle have high values (greater than 0.5) on component 1, so they are grouped into component 1. 6. Similarly, economical, comfortable driving and safe way to travel have high values on component 2 and hence are grouped in component 2. 7. Finally, it is affordable and three people should be legally allowed are grouped into component 3. 8. Sense of freedom has no high positive loading on any of the three components (its only large loading, -.608 on component 2, is negative), so this variable is eliminated.
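The link between the Component Matrix and the Communalities table can be checked by hand: with principal component extraction, a variable's extraction communality is the sum of its squared loadings on the retained components. The Python sketch below applies this to the four variables whose extraction communalities survive in the output. The loadings are the three-decimal values printed above, so small rounding differences from the reported figures are expected.

```python
# Unrotated loadings on components 1, 2 and 3 (from the Component Matrix)
# paired with the extraction communality reported by SPSS.
loadings_and_reported = {
    "Feel good to see ad of my vehicle": ((0.971, 0.096, -0.046), 0.955),
    "Comfortable driving": ((-0.322, 0.775, -0.308), 0.799),
    "Safe way to travel": ((-0.069, 0.735, -0.482), 0.777),
    "3 people should be legally allowed": ((0.161, 0.319, 0.814), 0.789),
}

communalities = {}
for name, (loadings, reported) in loadings_and_reported.items():
    # Communality = sum of squared loadings across the retained components.
    communalities[name] = sum(value ** 2 for value in loadings)
    print(f"{name}: computed {communalities[name]:.3f}, reported {reported:.3f}")
```

The computed values agree with the reported ones to within rounding, which also confirms how the flattened loading columns map onto components 1, 2 and 3.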
DISCRIMINANT ANALYSIS
OBJECTIVE:
To conduct Discriminant Analysis for the given data using SPSS software
PROBLEM:
Conduct a discriminant analysis that predicts membership of two groups (the categories of the dependent variable) and creates the discriminant equation from 17 independent variables, selected by a stepwise procedure based on minimization of Wilks' Lambda at each step.
NULL HYPOTHESIS There is no discrimination in membership of two groups.
ALTERNATE HYPOTHESIS There is discrimination in membership of two groups.
PROCEDURE:
Choose Analyze > Classify > Discriminant.
3. Define the range of the grouping variable as min: 1 and max: 2.
4. Select all the variables as independent variables and select Use stepwise method.
5. Click Statistics, check Means, Univariate ANOVAs, Box's M and Unstandardized, then click Continue.
6. Click Method, select Wilks' Lambda, enter F value Entry: 1.15 and Exit: 1, then click Continue.
7. Click Classify, check All groups equal, Casewise results, Summary table and Combined-groups, then click Continue.
8. Click OK.
OUTPUT:
Analysis Case Processing Summary
Unweighted Cases                                                                     N    Percent
Valid                                                                                50   100.0
Excluded   Missing or out-of-range group codes                                       0    .0
           At least one missing discriminating variable                              0    .0
           Both missing or out-of-range group codes and at least one missing
           discriminating variable                                                   0    .0
           Total                                                                     0    .0
Total                                                                                50   100.0

Log Determinants
1 = COMPLETED PHD, 2 = DID NOT COMPLETE PHD   Rank   Log Determinant
FINISH                                        6      3.031
NOT FINISH                                    6      2.918
Pooled within-groups                          6      3.800
The ranks and natural logarithms of determinants printed are those of the group covariance matrices.

Test Results
Box's M             39.633
F         Approx.   1.633
          df1       21
          df2       8474.108
          Sig.      .034
Tests of Equality of Group Means
                                                 Wilks' Lambda   F        df1   df2   Sig.
OVERALL COLLEGE GPA                              .795            12.356   1     48    .001
MAJOR AREA GPA                                   .951            2.493    1     48    .121
GRE SCORE ON SPECIALITY EXAM                     .998            .113     1     48    .738
GRE SCORE ON QUANTITATIVE                        .628            28.393   1     48    .000
GRE SCORE ON VERBAL                              .974            1.283    1     48    .263
FIRST LETTER OF RECOMMENDATION                   .650            25.889   1     48    .000
SECOND LETTER OF RECOMMENDATION                  .756            15.476   1     48    .000
THIRD LETTER OF RECOMMENDATION                   .534            41.969   1     48    .000
STUDENTS MOTIVATION                              .679            22.722   1     48    .000
STUDENTS EMOTIONAL STABILITY                     1.000           .007     1     48    .934
FINANCIAL/PERSONAL RESOURCES TO COMPLETE         .993            .337     1     48    .564
AGE IN YEARS AT ENTRY                            .768            14.526   1     48    .000
ABILITY TO INTERACT EASILY                       .904            5.079    1     48    .029
RATING OF STUDENT HOSTILITY                      .787            13.017   1     48    .001
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT .972            1.378    1     48    .246
Stepwise Statistics:
Variables in the Analysis
Step   Variable                                           Tolerance   F to Remove   Wilks' Lambda
1      THIRD LETTER OF RECOMMENDATION                     1.000       41.969
2      THIRD LETTER OF RECOMMENDATION                     .987        23.774        .679
       STUDENTS MOTIVATION                                .987        8.633         .534
3      THIRD LETTER OF RECOMMENDATION                     .957        15.183        .552
       STUDENTS MOTIVATION                                .934        4.617         .457
       FIRST LETTER OF RECOMMENDATION                     .909        3.943         .451
4      THIRD LETTER OF RECOMMENDATION                     .955        12.842        .503
       STUDENTS MOTIVATION                                .913        3.122         .419
       FIRST LETTER OF RECOMMENDATION                     .908        3.418         .421
       AGE IN YEARS AT ENTRY                              .969        2.733         .415
5      THIRD LETTER OF RECOMMENDATION                     .935        13.679        .481
       STUDENTS MOTIVATION                                .898        3.642         .397
       FIRST LETTER OF RECOMMENDATION                     .897        2.417         .387
       AGE IN YEARS AT ENTRY                              .943        3.452         .396
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT   .926        2.941         .391
6      THIRD LETTER OF RECOMMENDATION                     .935                      .448
       STUDENTS MOTIVATION                                .882                      .381
       FIRST LETTER OF RECOMMENDATION                     .892                      .369
       AGE IN YEARS AT ENTRY                              .885                      .385
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT   .926                      .370
       FINANCIAL/PERSONAL RESOURCES TO COMPLETE           .897        2.372         .367
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          1.876(a)     100.0           100.0          .808

Wilks' Lambda
Test of Function(s)   Chi-square   Sig.
1                     47.541       .000
Structure Matrix
                                                     Function 1
THIRD LETTER OF RECOMMENDATION                       .683
GRE SCORE ON QUANTITATIVE (a)                        .547
FIRST LETTER OF RECOMMENDATION                       .536
STUDENTS MOTIVATION                                  .502
AGE IN YEARS AT ENTRY                                .402
RATING OF STUDENT HOSTILITY (a)                      -.335
SECOND LETTER OF RECOMMENDATION (a)                  .278
OVERALL COLLEGE GPA (a)                              .237
ABILITY TO INTERACT EASILY (a)                       .178
MAJOR AREA GPA (a)                                   .129
GRE SCORE ON SPECIALITY EXAM (a)                     -.126
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT
GRE SCORE ON VERBAL (a)
FINANCIAL/PERSONAL RESOURCES TO COMPLETE
STUDENTS EMOTIONAL STABILITY (a)
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
a. This variable not used in the analysis.
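Canonical Discriminant Function Coefficients (unstandardized; the coefficient values are not legible in the output)
Variables in the function: FIRST LETTER OF RECOMMENDATION, THIRD LETTER OF RECOMMENDATION, STUDENTS MOTIVATION, FINANCIAL/PERSONAL RESOURCES TO COMPLETE, AGE IN YEARS AT ENTRY, MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT, (Constant)

Functions at Group Centroids (values not legible in the output)
Groups: FINISH, NOT FINISH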
Casewise Statistics
Case Number: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
Actual Group: 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 2 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2
Highest Group, P(D>d | G=g), p: .422 .698 .934 .699 .755 .178 .947 .765 .439 .795 .345 .620 .668 .381 .573 .239 .804 .065 .231 .970 .708 .677 .583 .290 .980 .644 .854 .598 .580 .277 .903 .555 .322 .999 .432 .401 .345 .649 .800 .701 .326 .152 .986 .202 .973 .709 .892 .426
Highest Group, P(D>d | G=g), df: 1 for every case
Highest Group, P(G=g | D=d): .997 .990 .967 .929 .941 .999 .968 .988 .997 .987 .998 .906 .991 .777 .994 .609 .986 1.000 .595 .976 .931 .991 .994 .998 .975 .992 .957 .993 .893 .664 .964 .883 .998 .973 .997 .997 .744 .992 .949 .990 .724 .999 .972 .999 .976 .990 .962 .812
Highest Group, Squared Mahalanobis Distance to Centroid: .645 .150 .007 .149 .097 1.812 .004 .090 .599 .068 .891 .246 .184 .769 .318 1.386 .062 3.393 1.436 .001 .141 .173 .301 1.118 .001 .214 .034 .278 .306 1.184 .015 .348 .982 .000 .617 .706 .893 .207 .064 .148 .966 2.054 .000 1.629 .001 .140 .018 .634
Second Highest Group, Group: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Second Highest Group, P(G=g | D=d): .003 .010 .033 .071 .059 .001 .032 .012 .003 .013 .002 .094 .009 .223 .006 .391 .014 .000 .405 .024 .069 .009 .006 .002 .025 .008 .043 .007 .107 .336 .036 .117 .002 .027 .003 .003 .256 .008 .051 .010 .276 .001 .028 .001 .024 .010 .038 .188
Second Highest Group, Squared Mahalanobis Distance to Centroid: 12.160 9.436 6.766 5.281 5.629 16.244 6.851 8.901 11.958 8.669 13.165 4.788 9.693 3.266 10.551 2.271 8.601 20.487 2.207 7.408 5.332 9.611 10.450 13.998 7.340 9.902 6.251 10.316 4.541 2.547 6.564 4.385 13.508 7.195 12.037 12.422 3.025 9.852 5.909 9.414 2.894 16.953 7.108 15.687 7.385 9.350 6.495 3.565
INFERENCE:
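1. The F values and their significance values in the Tests of Equality of Group Means identify the variables on which the two groups differ significantly. 2. The canonical correlation coefficient is .808, which shows a strong correlation between the discriminant function and group membership. 3. The significance value for the Wilks' Lambda test of the function (.000) is less than 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted: there is discrimination in membership of the two groups.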
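RESULT:
Classification Results (a)
                                 Predicted Group Membership
                                 FINISH   NOT FINISH   Total
Original   Count   FINISH        22       3            25
                   NOT FINISH    2        23           25
           %       FINISH        88.0     12.0         100.0
                   NOT FINISH    8.0      92.0         100.0
a. Percentage of original grouped cases correctly classified.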
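Two figures in the discriminant output can be checked with simple arithmetic: the overall hit rate of the classification table, and the canonical correlation, which for a single function equals the square root of eigenvalue / (1 + eigenvalue). The Python sketch below assumes the classification counts as read from the partly garbled table (22 and 3 for FINISH, 2 and 23 for NOT FINISH) and the eigenvalue 1.876 from the Eigenvalues table.

```python
import math

# Rows: actual group, columns: predicted group (FINISH, NOT FINISH).
# Counts as read from the Classification Results table (an assumption,
# since the table is only partly legible).
classification_counts = [
    [22, 3],   # actual FINISH
    [2, 23],   # actual NOT FINISH
]

correct = sum(classification_counts[i][i] for i in range(len(classification_counts)))
total = sum(sum(row) for row in classification_counts)
percent_correct = 100 * correct / total

# Per-group hit rates, in percent.
group_hit_rates = [100 * row[i] / sum(row) for i, row in enumerate(classification_counts)]

# Canonical correlation implied by the eigenvalue of the single function.
eigenvalue = 1.876
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

print(f"correctly classified: {correct} of {total} ({percent_correct:.1f}%)")
print(f"hit rate by group: {group_hit_rates}")
print(f"canonical correlation = {canonical_correlation:.3f}")
```

The recomputed canonical correlation (.808) agrees with the Eigenvalues table, and the misclassified counts (3 and 2) are consistent with the cases in the casewise listing whose second-highest group equals their actual group.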
CLUSTER ANALYSIS
PROBLEM: Brands of 21 VCRs are given along with their attributes. Perform a K-means cluster analysis on the brands.
PROCEDURE: 1. Select Analyze > Classify > K-Means Cluster. 2. Select all variables and move them into the Variables box. 3. Label cases by brand. 4. Enter the number of clusters as 3. 5. Select the required options from Save, then click Continue. 6. Click Options, check Initial cluster centers, click Continue and then OK.
Iteration History (a)
            Change in Cluster Centers
Iteration   1        2        3
1           52.791   70.267   80.393
2           14.037   12.547   .000
3           .000     .000     .000
a. Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 3. The minimum distance between initial centers is 335.404.

Number of Cases in each Cluster
Cluster   1   8.000
          2   8.000
          3   5.000
Valid         21.000
Missing       .000

Cluster Centers
The output contains two cluster-center tables (initial and final), shown here in the order in which they appear; the column for cluster 2 is missing from both.
Variable   Table 1, Cluster 1   Table 1, Cluster 3   Table 2, Cluster 1   Table 2, Cluster 3
price      200                  380                  239                  460
pictur1    3                    2                    3                    4
pictur2    3                    2                    3                    4
pictur3    3                    2                    3                    4
pictur4    3                    2                    3                    4
pictur5    3                    2                    3                    4
program    3                    3                    3                    3
recept1    4                    3                    4                    3
recept3    2                    3                    2                    3
audio1     4                    4                    4                    4
audio2     4                    4                    4                    3
audio3     4                    4                    4                    4
features   4                    4                    4                    4
events     8                    4                    8                    6
days       365                  30                   365                  30
remote1    3                    4                    3                    4
remote2    3                    4                    3                    4
remote3    3                    4                    3                    4
extras1    3                    6                    3                    10
extras2    3                    6                    3                    10
extras3    3                    6                    3                    10
INFERENCE: Three clusters were formed, containing 8, 8 and 5 of the 21 brands, and convergence was reached at the third iteration.
RESULT: Thus cluster analysis was done using SPSS.
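The iteration history above reflects the usual K-means loop: assign each case to its nearest center, recompute each center as the mean of its cases, and stop when the centers no longer move. The Python sketch below illustrates that loop. It is not SPSS's implementation; the six two-dimensional points and the two starting centers are hypothetical values chosen only to keep the example small, and k_means is an illustrative name.

```python
import math

def k_means(points, centers, max_iterations=10):
    """Plain K-means: returns the final centers and the clusters of points."""
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))
            clusters[nearest].append(point)
        # Update step: each center becomes the mean of its cluster.
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centers.append(tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:  # no change: converged
            break
        centers = new_centers
    return centers, clusters

# Hypothetical data (NOT the VCR data set): two obvious groups of three points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
initial_centers = [(1, 1), (8, 8)]

final_centers, clusters = k_means(points, initial_centers)
print("final centers:", final_centers)
print("cluster sizes:", [len(c) for c in clusters])
```

In the SPSS run, the same stopping rule is what produces the row of .000 changes at iteration 3.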
REGRESSION
OBJECTIVE:
To find the dependency of the variables with respect to the sales of the company.
PROBLEM:
A manufacturer and the marketer of electric motors would like to build a regression model consisting of 5 or 6 independent variables to sales. Past data has been collected for 15 sales territories, on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.
Dependent Variable: Sales (Y), in Rs. lakhs in the territory.
Independent Variables:
X1 Market potential of the territory.
X2 No. of dealers of the company in the territory.
X3 No. of sales people in the territory.
X4 Index of the competitor activity in the territory on a 5-point scale.
X5 No. of service people in the territory.
X6 No. of existing customers in the territory.
S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sales (Y) 5 60 20 11 45 6 15 22 29 3 16 8 18 23 81
Dealers (X2) 1 12 5 2 12 3 5 7 4 1 4 2 7 10 15
People (X3) 6 30 15 10 20 8 18 16 15 6 11 9 14 10 35
Competition (X4) 5 4 3 3 2 2 4 3 2 5 4 3 3 4 4
Service (X5) 2 5 2 2 4 3 5 6 5 2 2 3 4 3 7
Customers (X6) 20 50 25 20 30 16 30 40 39 5 17 10 31 43 70
Null Hypothesis There is no dependence between the independent variables, market potential, no. of dealers, no. of sales person, index of the competitor activity, no. of service people, no. of existing customers and the dependent variable sales at 95% confidence interval.
Alternative Hypothesis There is dependence between the independent and dependent variables at 95% confidence interval.
PROCEDURE: Let the estimating equation be Y = a + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6. 1. The variables are defined in the Variable View of the SPSS data editor. 2. Enter the data in the Data View. 3. Choose Analyze > Regression > Linear from the main menu. 4. In the Linear Regression dialogue box, enter sales as the dependent variable and all the other variables as the independent variables. 5. Click the Statistics button and check the Estimates, Model fit and Descriptives boxes. 6. The output is generated and analyzed, and the inference is obtained.
OUTPUT:
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .989(a)    .977       .960                4.391

ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   6609.485         6    1101.581      57.133   .000(a)
Residual     154.249          8    19.281
Total        6763.733         14

a Predictors: (Constant), No. of existing customers in the territory, Index of competitor activity in the territory, No. of service people in the territory, Market potential in the territory (in Rs. lakhs), No. of dealers of the company in the territory, No. of sales people in the territory
b Dependent Variable: Sales in Rs. lakhs in the territory
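Coefficients (a)
Model 1 (B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, and Lower Bound and Upper Bound give the 95% Confidence Interval for B)
(Constant): t = -.546, Sig. = .600, Lower Bound = -16.579, Upper Bound = 10.233
Market potential in the territory (in Rs. lakhs): B = .227, Std. Error = .075, Beta = .439, t = 3.040, Sig. = .016, Lower Bound = .055, Upper Bound = .399
No. of dealers of the company in the territory: B = .819, Std. Error = .631
No. of sales people in the territory: B = 1.091, Std. Error = .418
Index of competitor activity in the territory: B = -1.893, Std. Error = 1.340
No. of service people in the territory: B = -.549, Std. Error = 1.568, Beta = -.041, t = -.350, Sig. = .735, Lower Bound = -4.166, Upper Bound = 3.067
No. of existing customers in the territory: B = .066, Std. Error = .195, Beta = .050, t = .338, Sig. = .744, Lower Bound = -.384, Upper Bound = .516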
a Dependent Variable: Sales in Rs. lakhs in the territory
Therefore the estimating equation (written with the standardized Beta coefficients) is: Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.05X6
INFERENCE:
1. The variables are entered using the Enter method.
2. R ranges from 0 to 1, and larger values indicate a stronger relationship. Here R = .989 and R Square = .977.
3. The significance value (.000) from the ANOVA table is less than 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted. Hence the independent variables together explain the variation in the dependent variable.
4. The t statistic indicates the relative importance of each variable in the model. As a rule of thumb, variables with t values below -2 or above +2 are good predictors.
5. The t statistic and its significance value are used to test the null hypothesis that the regression coefficient is zero (that is, that there is no linear relationship between the dependent and the independent variable).
6. The significance levels of market potential (0.016) and no. of sales people (0.031) are less than 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted. Hence these variables are linearly related to sales.
7. The significance levels of no. of dealers (0.230), index of competitor activity (0.195), no. of service people (0.735) and no. of existing customers (0.744) are greater than 0.05, so the null hypothesis is not rejected. Hence these variables do not show a significant linear relationship with sales.
8. Residuals are estimates of the true errors in the model. Each residual is the difference between the observed value of the dependent variable and the value predicted by the model.
9. The residual sum of squares (154.249) is small relative to the regression sum of squares (6609.485), which indicates that the estimating equation fits the data well.
10. The significance value is less than 0.05, which also indicates that the estimating equation fits the data well.
11. The histogram of the residuals indicates that they follow a normal distribution, which supports the appropriateness of the model for the data.
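The t statistic used in the inference is the unstandardized coefficient B divided by its standard error, and the 95% confidence interval is B plus or minus the critical t value times the standard error. The sketch below is an illustration in Python, assuming the B and Std. Error values from the Coefficients table and a critical value of 2.306 (the two-tailed 5% point of t for 8 degrees of freedom, taken from a standard t table rather than from the SPSS output). Because the inputs are rounded, the results can differ from SPSS in the third decimal place.

```python
# Unstandardized coefficients (B) and standard errors from the Coefficients table.
coefficients = {
    "Market potential (X1)": (0.227, 0.075),
    "Dealers (X2)": (0.819, 0.631),
    "Sales people (X3)": (1.091, 0.418),
    "Competitor activity (X4)": (-1.893, 1.340),
    "Service people (X5)": (-0.549, 1.568),
    "Existing customers (X6)": (0.066, 0.195),
}

# Assumption: two-tailed 5% critical value of t for 8 residual degrees of
# freedom (15 territories - 6 predictors - 1), from a standard t table.
T_CRITICAL = 2.306


def t_and_interval(b, se, t_crit=T_CRITICAL):
    """Return the t statistic and the 95% confidence bounds for one coefficient."""
    t = b / se
    return t, b - t_crit * se, b + t_crit * se


results = {name: t_and_interval(b, se) for name, (b, se) in coefficients.items()}

for name, (t, lower, upper) in results.items():
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: t = {t:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}] ({verdict})")
```

A coefficient is significant at the 5% level exactly when its confidence interval excludes zero, which is the same decision as comparing the Sig. value with 0.05.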
RESULT There is dependency between sales (the dependent variable) and two of the independent variables: the market potential of the territory and the number of sales people in the territory. The other independent variables (number of dealers, number of service people, index of competitor activity and number of existing customers in the territory) do not show a significant linear relationship with sales. The estimating equation is: Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.05X6
CORRELATION
OBJECTIVE:
To find the interrelationship between the dependent and the independent variables.
PROBLEM:
A manufacturer and marketer of electric motors would like to build a regression model relating 5 or 6 independent variables to sales. Past data has been collected for 15 sales territories on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.
Dependent Variable:
Y   Sales in Rs. lakhs in the territory.
Independent Variables:
X1  Market potential of the territory.
X2  No. of dealers of the company in the territory.
X3  No. of sales people in the territory.
X4  Index of the competitor activity in the territory on a 5 point scale.
X5  No. of service people in the territory.
X6  No. of existing customers in the territory.
S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sales (Y) 5 60 20 11 45 6 15 22 29 3 16 8 18 23 81
Dealers (X2) 1 12 5 2 12 3 5 7 4 1 4 2 7 10 15
People (X3) 6 30 15 10 20 8 18 16 15 6 11 9 14 10 35
Competition (X4) 5 4 3 3 2 2 4 3 2 5 4 3 3 4 4
Service (X5) 2 5 2 2 4 3 5 6 5 2 2 3 4 3 7
Customers (X6) 20 50 25 20 30 16 30 40 39 5 17 10 31 43 70
Null Hypothesis
There is no significant relationship between the independent and the dependent variables at the 95% confidence level.
Alternative Hypothesis
There is a significant relationship between the independent and the dependent variables at the 95% confidence level.
PROCEDURE: Let the estimating equation be Y = a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6
1. Define the variables in the Variable View of the SPSS Data Editor.
2. Enter the data in the Data View.
3. Choose Analyze > Correlate > Bivariate from the main menu.
4. In the Bivariate Correlations dialogue box, select all the dependent and independent variables.
5. Select Pearson's correlation coefficient with a one-tailed test of significance.
6. Also include the statistics for mean and standard deviation.
7. The output is generated and analyzed, and the inference is obtained.
OUTPUT:
Descriptive Statistics
Variable                                           Mean    Std. Deviation   N
Sales in Rs. lakhs in the territory                24.13   21.980           15
Market potential in the territory (in Rs. lakhs)   55.80   42.543           15
No. of dealers of the company in the territory     6.00    4.408            15
No. of sales people in the territory               14.87   8.340            15
Index of competitor activity in the territory      3.40    .986             15
No. of service people in the territory             3.67    1.633            15
No. of existing customers in the territory         29.73   16.829           15
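The descriptive statistics and a single Pearson correlation can be reproduced outside SPSS as a check. The sketch below is an illustration in plain Python, assuming the Sales (Y) and Dealers (X2) columns from the data table above; the helper names `mean`, `sample_std` and `pearson_r` are hypothetical and not part of SPSS.

```python
import math

# Sales (Y) and Dealers (X2) for the 15 territories, copied from the data table.
sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]
dealers = [1, 12, 5, 2, 12, 3, 5, 7, 4, 1, 4, 2, 7, 10, 15]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (divides by n - 1, as SPSS does)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_sales_dealers = pearson_r(sales, dealers)

print(f"Sales: mean = {mean(sales):.2f}, std = {sample_std(sales):.3f}")
print(f"Dealers: mean = {mean(dealers):.2f}, std = {sample_std(dealers):.3f}")
print(f"Pearson r (sales, dealers) = {r_sales_dealers:.3f}")
```

The means and standard deviations printed for these two variables should match the corresponding rows of the Descriptive Statistics table; the same function can be applied to any other pair of columns.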
CHI-SQUARE TEST
OBJECTIVE:
To find out whether there is a significant association between the income of the individuals and intention to purchase.
PROBLEM:
A customer survey was conducted for a brand of detergent. One of the questions dealt with the income category and the other asked the respondent to rate his purchase intention. These two variables are listed in the table below. Both variables are coded as follows:
INCOME TABLE
Income        Code
<5000         1
5001-10000    2
10001-20000   3
>20000        4
PURCHASE TABLE
Intent      Code
None        1
Low         2
High        3
Very high   4
Certain     5
S. No.   Income        Code   Intent      Intent Code
1        <5000         1      None        1
2        <5000         1      Low         2
3        <5000         1      Low         2
4        <5000         1      None        1
5        <5000         1      High        3
6        5001-10000    2      Low         2
7        5001-10000    2      High        3
8        5001-10000    2      Very high   4
9        5001-10000    2      High        3
10       5001-10000    2      Low         2
11       10001-20000   3      High        3
12       10001-20000   3      Very high   4
13       10001-20000   3      Certain     5
14       10001-20000   3      High        3
15       10001-20000   3      Very high   4
16       >20000        4      High        3
17       >20000        4      Certain     5
18       >20000        4      Very high   4
19       >20000        4      Certain     5
20       >20000        4      Certain     5
Null Hypothesis There is no significant association between income and purchase intention.
Alternate Hypothesis There is a significant association between income and purchase intention.
PROCEDURE:
1. Enter the field names and the corresponding data types in the Variable View, with income and purchase intention set to the nominal measure.
2. Enter the given data in the Data View.
3. Choose Analyze > Descriptive Statistics > Crosstabs from the main menu and click the Statistics button.
4. Select the Chi-square test and check the required cells.
5. Move income of the respondent (the independent variable) into the rows box and intention of the respondent (the dependent variable) into the columns box.
6. Compare the value and significance columns in the output and make the inference.
OUTPUT: CHI-SQUARE TEST
Case Processing Summary
Cases: N = 20
Missing: N = 0, Percent = .0%
Income of the respondent * Intention to purchase Cross tabulation
Count
Income of the respondent   None   Low   High   Very high   Certain   Total
<5000                      2      2     1      0           0         5
5001-10000                 0      2     2      1           0         5
10001-20000                0      0     2      2           1         5
>20000                     0      0     1      1           3         5
Total                      2      4     6      4           4         20
Chi-Square Tests
                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             18.667(a)   12   .097
Likelihood Ratio               21.134      12   .048
Linear-by-Linear Association   11.790      1    .001
N of Valid Cases               20
a. 20 cells (100.0%) have expected count less than 5. The minimum expected count is .50.
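The Pearson Chi-Square value and the expected-count footnote can be verified directly from the cross tabulation. The following is a minimal sketch in plain Python, assuming the observed counts shown in the cross tabulation above. The p-value uses a closed form for the chi-square upper tail that is valid only for an even number of degrees of freedom (12 here); this is a convenience for the illustration and not necessarily how SPSS computes it.

```python
import math

# Observed counts from the cross tabulation: rows are income categories,
# columns are intention to purchase (none, low, high, very high, certain).
observed = [
    [2, 2, 1, 0, 0],  # <5000
    [0, 2, 2, 1, 0],  # 5001-10000
    [0, 0, 2, 2, 1],  # 10001-20000
    [0, 0, 1, 1, 3],  # >20000
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi_square_p_value_even_df(x, dof):
    """Upper-tail chi-square probability; this closed form needs an even dof."""
    if dof % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))


p_value = chi_square_p_value_even_df(chi_square, df)
min_expected = min(min(row) for row in expected)
cells_below_5 = sum(e < 5 for row in expected for e in row)

print(f"Chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
print(f"Cells with expected count < 5: {cells_below_5}, minimum expected = {min_expected:.2f}")
```

Comparing `p_value` with 0.05 gives the same decision as reading the Asymp. Sig. column of the Chi-Square Tests table.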
Phi Cramer's V
a. Not assuming the null hypothesis. b. Using the asymptotic standard error assuming the null hypothesis.
[Bar Chart: count of respondents in each income category (<5000, 5001-10000, 10001-20000, >20000), grouped by intention to purchase (none, low, high, very high, certain)]
INFERENCE:
1. The Chi-square test checks the hypothesis that the row and column variables in a cross tabulation are independent.
2. The significance value of the Pearson Chi-Square (0.097) is greater than 0.05.
3. So the null hypothesis is not rejected and the alternate hypothesis is not accepted. Hence there is no significant association between the two variables, income and purchase intention.
4. The nominal directional measures indicate both the strength and the significance of the relationship between the row and column variables of the cross tabulation.
5. The value of each statistic can range from 0 to 1 and indicates the proportional reduction in error in predicting the value of one variable based on the value of the other variable.
6. The significance value of these measures is also greater than 0.05, indicating that there is no significant relationship between the two variables.
7. Hence the two attributes are independent.
RESULT:
There is no association between the income and the purchase intention of the individual.