Project of Harish

Factors Affecting GDP, using 3 country data
A minor research project

(In partial fulfilment of the requirements of the course: MSTA 427)
Submitted by HARISH KUMAR
Under the supervision of Dr. R.N. Rattihali
Department of Statistics
Central University of Rajasthan
Kishangarh-305801
April 2012
CERTIFICATE
This is to certify that the minor research project report entitled ........... submitted by ..........., a student of M.A./M.Sc. Statistics (Actuarial), IV semester, is a record of bona fide work carried out by him under my guidance as part of the course MSTA 427. To the best of my knowledge, the report presented in the project has not been submitted earlier for the award of any degree/diploma.
Place: Kishangarh
Date:
Project supervisor
Name:
INDEX
1) Introduction and description of the problem
2) Objectives
3) Formulation: Model, Hypothesis, etc.
4) Definitions
5) Illustrations
6) Review of the work with references
7) Methodology (Statistical Tools, packages, graphs, etc. used)
8) Data: Main source (which has been analysed), other sources
9) Data Analysis
10) Conclusions
11) Major findings
12) References
Introduction to the problem: Gross domestic product (GDP) refers to the market value of all officially recognized final goods and services produced within a country in a given period. GDP is an economic indicator used to gauge the economic health of a country. The objective of this project is to analyse the factors affecting GDP in India.
Formulation and model: Although many factors affect the GDP of any country, the factors used in this project are
1) Crop production
2) Percentage change in industrial production
3) Inflation rate
4) Interest rate
5) Taxes on goods and services
Methodology and statistical tools used: The methodology and statistical tools used were regression analysis, testing of hypotheses and some nonparametric tests.
A brief introduction to regression analysis: Regression analysis is one of the most often applied techniques of statistical analysis and modelling. It is widely used for analysing multifactor data and for modelling the relationship between a response variable (dependent variable) and regressor variables (independent variables). In general, it is used to model a response variable (Y) as a linear function of one or more regressor variables (X1, X2, ..., Xp). Here linearity means linearity in the parameters. If there is only one regressor variable the model is called simple linear regression; otherwise it is called multiple linear regression. The notation for the linear regression model is generally given as
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (simple linear regression)
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_p X_{pi} + \varepsilon_i$ (multiple linear regression)
The $\varepsilon_i$ term in the model is referred to as a random error term, which may be due to various causes and may follow some particular statistical distribution.
Model assumptions: We assume that the error terms follow a normal distribution with common mean zero and common variance $\sigma^2$, and are independent of each other.
Estimation of the parameters: In regression analysis our interest lies in estimating the best-fitting model. Ordinarily the regression coefficients (the $\beta$s) are unknown and must be estimated from sample information. There are well-established statistical/mathematical methods for determining these estimates. The generally used methods are
1) Method of least squares
2) Method of maximum likelihood
The resulting estimated model is
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \dots + \hat{\beta}_p X_{pi}$
The random error is then estimated by the residual
$e_i = Y_i - \hat{Y}_i$
Estimation of parameters in multiple linear regression: Method of least squares: Let us assume that n observations were taken on p regressor variables, with the errors assumed to be i.i.d. $N(0, \sigma^2)$. Then the model can be expressed as
$Y_1 = \beta_0 + \beta_1 X_{11} + \beta_2 X_{21} + \dots + \beta_p X_{p1} + \varepsilon_1$
$Y_2 = \beta_0 + \beta_1 X_{12} + \beta_2 X_{22} + \dots + \beta_p X_{p2} + \varepsilon_2$
$\vdots$
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_p X_{pi} + \varepsilon_i$
$\vdots$
$Y_n = \beta_0 + \beta_1 X_{1n} + \beta_2 X_{2n} + \dots + \beta_p X_{pn} + \varepsilon_n$
The above model can be written in matrix form as
$Y = X\beta + \varepsilon$
where $Y$ is an $n \times 1$ vector, $\beta$ is a $(p+1) \times 1$ vector, $X$ is an $n \times (p+1)$ matrix and $\varepsilon$ is an $n \times 1$ vector. Then the least squares estimate of the regression coefficient vector is given by
$\hat{\beta} = (X'X)^{-1} X'Y$
Analysis of variance for regression: The ANOVA is used to test whether there is a linear relationship between the response variable $Y$ and the regressor variables $X_1, \dots, X_p$. The ANOVA table in multiple linear regression is as follows.

Source of variation   Sum of squares   Degrees of freedom   Mean square   F0
Regression            SSR              p                    MSR           MSR/MSRES
Residuals             SSRES            n-(p+1)              MSRES
Total                 SST              n-1

The above notation is explained as follows:
SSR = sum of squares due to the regressor variables
SSRES = sum of squares due to residuals
SST = total sum of squares
MSR = SSR/p = mean square due to regression
MSRES = SSRES/(n-(p+1)) = mean square due to residuals
Here
$\mathrm{SSRES} = Y'Y - \hat{\beta}' X'Y$
$\mathrm{SST} = Y'Y - n\bar{Y}^2$
where $\bar{Y}$ is the mean of the observed responses, and SSR = SST - SSRES.
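The least squares estimate and the ANOVA quantities described above can be sketched in Python as follows. This is only an illustrative sketch, not the SPSS procedure used in the project: the function names (`transpose`, `solve`, `ols_anova`) and the small data set are hypothetical, the data are made-up numbers rather than the GDP data, and the design matrix is assumed to have full column rank. Only the standard library is used.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix [a | b]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols_anova(rows, y):
    """Fit Y = X beta + error by least squares and build the ANOVA quantities.

    rows: list of n observations, each a sequence of p regressor values.
    y:    list of n responses.
    """
    n, p = len(y), len(rows[0])
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    Xt = transpose(X)                            # rows of Xt are columns of X
    XtX = [[sum(u * v for u, v in zip(ci, cj)) for cj in Xt] for ci in Xt]
    Xty = [sum(u * v for u, v in zip(ci, y)) for ci in Xt]
    beta = solve(XtX, Xty)                       # normal equations X'X beta = X'Y

    yty = sum(v * v for v in y)
    ybar = sum(y) / n
    ss_res = yty - sum(b * v for b, v in zip(beta, Xty))  # Y'Y - beta' X'Y
    ss_t = yty - n * ybar ** 2                            # Y'Y - n * Ybar^2
    ss_r = ss_t - ss_res
    ms_r = ss_r / p
    ms_res = ss_res / (n - (p + 1))
    return {"beta": beta, "SSR": ss_r, "SSRES": ss_res, "SST": ss_t,
            "MSR": ms_r, "MSRES": ms_res, "F0": ms_r / ms_res}


# Made-up illustrative data with p = 2 regressors and n = 8 observations.
x1 = [-1, 1, -1, 1, -1, 1, -1, 1]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [6, 8, 10, 16, 4, 10, 12, 14]
result = ols_anova(list(zip(x1, x2)), y)
for name, value in result.items():
    print(name, value)
```

The computed F0 would then be compared with the tabulated F value on p and n-(p+1) degrees of freedom; the table lookup is left out because the standard library has no F distribution.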
Another measure of model adequacy is $R^2$, which is also known as the coefficient of determination. The value of $R^2$ lies between 0 and 1. The higher the value of $R^2$, the better the fitted model is considered to be. The value of $R^2$ is given as
$R^2 = \mathrm{SSR}/\mathrm{SST} = 1 - \mathrm{SSRES}/\mathrm{SST}$
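As a small illustration of the coefficient of determination, the sketch below computes R2 as 1 - SSRES/SST from observed and fitted values. The function name `r_squared` and the numbers are hypothetical and are not taken from the project data.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSRES / SST."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))  # residual sum of squares
    ss_t = sum((o - mean_y) ** 2 for o in observed)               # total sum of squares
    return 1.0 - ss_res / ss_t


# Made-up observed responses and fitted values from some regression model.
observed = [6, 8, 10, 16, 4, 10, 12, 14]
fitted = [5, 9, 11, 15, 5, 9, 11, 15]
print(r_squared(observed, fitted))
```

A value close to 1 indicates that the fitted values reproduce most of the variation in the observed responses.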
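Are the residuals (or errors) approximately normally distributed and uncorrelated? A variety of methods are available for checking these regression assumptions:
1) Durbin-Watson test
2) Anderson-Darling test
3) Chi-square test
The Anderson-Darling and chi-square tests check the normality of the residuals. The assumption that the errors are uncorrelated (independent) can be tested using the Durbin-Watson test.
Durbin-Watson test: The underlying hypotheses are
$H_0: \rho = 0$
$H_1: \rho > 0$
where $\rho$ is the first-order autocorrelation of the errors. The test statistic is
$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
where $e_t$ are the residuals. The decision rule is
If $d < d_L$, reject $H_0$
If $d > d_U$, do not reject $H_0$
If $d_L \le d \le d_U$, the test is inconclusive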
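The Durbin-Watson statistic and its decision rule can be sketched as below. The function names `durbin_watson` and `dw_decision`, the residuals, and the bounds passed as `d_lower` and `d_upper` are all hypothetical; in particular the bounds shown are placeholders, and in practice dL and dU must be read from a Durbin-Watson table for the given sample size and number of regressors.

```python
def durbin_watson(residuals):
    """d = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2."""
    numerator = sum((cur - prev) ** 2
                    for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Decision for H0: rho = 0 against H1: rho > 0."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Made-up residuals; the bounds 1.0 and 1.5 are placeholders, not table values.
residuals = [1, -1, -1, 1, -1, 1, 1, -1]
d = durbin_watson(residuals)
print(d, dw_decision(d, d_lower=1.0, d_upper=1.5))
```

Values of d near 2 are consistent with uncorrelated errors, while values well below 2 point towards positive autocorrelation.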
The packages and statistical software used are SPSS and MS Excel.
Data collection: The secondary data for 10 years (1999-2009) were collected from the following sources:
www.data.wordbank.com
www.tradingeconomics.com
www.rbi.org.in
Objectives:
1) To fit a multiple linear regression model considering GDP as the dependent variable.
2) To test the significance of the regression by constructing the ANOVA table, i.e. testing the hypothesis
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_1: \beta_j \neq 0$ for at least one $j$
3) To assess the adequacy of the model by computing $R^2$.
4) To construct 95% confidence intervals for the estimates.
5) To test the normality assumption for the errors.
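For reference, the 95% confidence intervals mentioned in the objectives are usually built with the standard textbook formula below. It is given here as a sketch under the normal-error assumption stated earlier, not as the exact SPSS output used in the project; the symbol $C_{jj}$ is introduced only for this illustration.

```latex
% 100(1 - \alpha)% confidence interval for the j-th regression coefficient,
% with n observations and p regressor variables (\alpha = 0.05 for 95%).
\hat{\beta}_j \;\pm\; t_{\alpha/2,\; n-(p+1)} \sqrt{\mathrm{MSRES}\; C_{jj}},
\qquad C_{jj} = \text{the diagonal element of } (X'X)^{-1} \text{ corresponding to } \hat{\beta}_j
```

Here $t_{\alpha/2,\, n-(p+1)}$ is the upper $\alpha/2$ point of the t distribution with $n-(p+1)$ degrees of freedom, and MSRES is the residual mean square from the ANOVA table.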
DATA ANALYSIS
The 15-year data (1994-2009) of the three countries are as follows
