
Dependent variable: Population in Ukraine, 1996-2014

Explanatory variables: Migration, GDP, Number of hospitals established, Number of medical personnel, Criminality, Total number of deaths, Morbidity, Average salary, Number of births, Employment

Multiple regression analysis allows us to measure how much of the variation in one series can be attributed to other series.

Goodness-of-fit measures:
Multiple R:
Tells how strong the linear relationship is; R is the square root of R². In this case it approaches 1 (0,99), so the relationship is strong and positive.
Adjusted R square:
Adjusted R-squared incorporates the model's degrees of freedom. It is interpreted as the proportion of total variance that is explained by the model after accounting for the number of predictors: 97,55% of total variance is explained by the regression model.
R Square:
Coefficient of determination. This shows how well our model predicts the quantity of population: 98,91% of the variation in population quantity is explained by the regression model.
Standard error:
The observed values of population quantity differ from the theoretical (or estimated) values, on average, by 285,27 thousand people.
Observations:
Number of observations used in the regression (n) = 19
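The adjusted R-squared above can be reproduced from R², the sample size n, and the number of predictors k. A minimal sketch using the figures quoted in this section:

```python
# Adjusted R-squared: adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)
def adjusted_r_squared(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = 0.9891    # R Square from the regression output
n, k = 19, 10  # 19 observations, 10 explanatory variables

adj = adjusted_r_squared(r2, n, k)
print(round(adj, 4))  # → 0.9755, i.e. the 97,55% quoted above
```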
The ANOVA (analysis of variance) table splits the total sum of squares into its components.
SS = Sum of Squares.
Regression MS = Regression SS / Regression degrees of freedom.
Residual MS = mean squared error (Residual SS / Residual degrees of freedom).
F: the overall F statistic for testing the null hypothesis that all regression coefficients are zero.
Significance F: the p-value associated with the F statistic.
These are the categories we will examine: Regression, Residual, and Total.
The total variance is partitioned into the variance that can be explained by the independent variables (Regression) and the variance that is not explained by the independent variables (Residual, sometimes called Error).
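This partition can be checked numerically. The sketch below fits a one-variable least-squares line to made-up data (illustrative only, not the Ukraine series) and verifies that Total SS = Regression SS + Residual SS:

```python
# Illustrative data (NOT the Ukraine series): y fitted against one x
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Least-squares slope and intercept for a single predictor
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
     / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - my) ** 2 for yi in y)                  # Total SS
ssr = sum((fi - my) ** 2 for fi in fitted)             # Regression SS
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # Residual SS

# For a least-squares fit, Total SS = Regression SS + Residual SS
assert abs(sst - (ssr + sse)) < 1e-9
```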

Regression degrees of freedom (df) = 10: the number of explanatory variables (X).
Df is the number of independent values or quantities that can be assigned to a statistical distribution. If the amount of data isn't sufficient for the number of terms in the model, there may not even be enough degrees of freedom (DF) for the error term, and no p-values or F-values can be calculated at all. You'll get output something like this: Error 0 0 0
Residual degrees of freedom = n - k - 1 = 19 - 10 - 1 = 8
Total degrees of freedom = n - 1 = 19 - 1 = 18
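The degrees-of-freedom identities are easy to keep straight in code (n observations, k predictors):

```python
# Degrees of freedom for a multiple regression with n observations, k predictors
n, k = 19, 10

df_regression = k        # one df per explanatory variable
df_residual = n - k - 1  # the extra 1 accounts for the intercept
df_total = n - 1

print(df_regression, df_residual, df_total)  # → 10 8 18
assert df_regression + df_residual == df_total
```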

Regression Sum of Squares (SS) - quantifies the variability in the dependent variable that is explained by the model.
Residual Sum of Squares (SS) - the Error Sum of Squares; quantifies the variability left unexplained by the model.
Total Sum of Squares (SS) - the total variability in the dependent variable: Total SS = Regression SS + Residual SS.

Regression mean square (MS) - the Regression Sum of Squares divided by its degrees of freedom.
Residual mean square (MS), the mean squared error - the Residual Sum of Squares divided by its degrees of freedom.

F test
The F column, not surprisingly, contains the F-statistic. Because we want to compare the "average" variability explained by the model to the "average" unexplained variability, we take the ratio of the Regression Mean Square to the Residual Mean Square. That is, the F-statistic is calculated as F = MSR/MSE.
Since the p-value (Significance F) = 0,00000094 < 0,05 = α, we conclude that the regression model is a significantly good fit; i.e. there is only a 0,000094% probability of getting a fit this good assuming that the null hypothesis is true.
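The same F statistic can also be recovered from R² alone, since F = (R²/k) / ((1 − R²)/(n − k − 1)). Using the figures quoted in this section:

```python
# F statistic from R-squared: F = (R2 / k) / ((1 - R2) / (n - k - 1))
def f_from_r2(r2, n, k):
    msr = r2 / k                  # explained variance per regression df
    mse = (1 - r2) / (n - k - 1)  # unexplained variance per residual df
    return msr / mse

f = f_from_r2(0.9891, 19, 10)  # R2, n, k from the output above
print(round(f, 1))  # → 72.6
```

An F of roughly 72.6 on (10, 8) degrees of freedom is what produces a Significance F on the order of 10⁻⁷.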

The Coefficients column gives the least-squares estimates: the parameters of the regression equation.
Equation: y = 51663,4267 − 0,00084·x1 − 0,00180·x2 + 2687,1508·x3 − 24,9280·x4 + 2,0763·x5 − 7,1064·x6 − 127,2518·x7 + 600,9752·x8 − 9,5511·x9 + 0,3225·x10
Using this equation, we can predict the quantity of population in Ukraine.
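As a sketch, the fitted equation can be evaluated directly. The coefficients are copied from the output above; the input values in the usage line are made up, purely to demonstrate the call:

```python
# Coefficients of the fitted equation (intercept, then x1..x10)
INTERCEPT = 51663.4267
COEFS = [-0.00084, -0.00180, 2687.1508, -24.9280, 2.0763,
         -7.1064, -127.2518, 600.9752, -9.5511, 0.3225]

def predict_population(x):
    """Predicted population for explanatory values x1..x10."""
    assert len(x) == len(COEFS)
    return INTERCEPT + sum(c * xi for c, xi in zip(COEFS, x))

# Hypothetical inputs, only to demonstrate the call:
# with all x = 0 the prediction is just the intercept
print(predict_population([0] * 10))  # → 51663.4267
```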

The intersection of the Coefficients column and the Intercept row indicates the value y takes if all the variables in the model equal 0. In our case the intercept coefficient equals 51663,4267: the predicted quantity of population in Ukraine if all the other factors equal 0.
Standard error:
The standard error of the intercept estimate: the observed values of population quantity differ from the estimated value, on average, by 5 266,5056 thousand people when all the variables equal 0.

For every unit increase in migration, a 0,000847-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in GDP, a 0,001802-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in the number of hospitals, a 2687,15083-unit increase in population is predicted, holding all other variables constant.
For every unit increase in the number of medical personnel, a 24,9280-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in the number of crimes, a 2,07631-unit increase in population is predicted, holding all other variables constant.
For every unit increase in deaths, a 7,10640-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in morbidity, a 127,2518-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in average salary, a 600,975-unit increase in population is predicted, holding all other variables constant.
For every unit increase in natality (births), a 9,551121-unit decrease in population is predicted, holding all other variables constant.
For every unit increase in employment, a 0,32252-unit increase in population is predicted, holding all other variables constant.
Standard error - these are the standard errors associated with the coefficients. Each shows by how much the observed values differ, on average, from the theoretical (or estimated) values.

t Stat and P-value - the t statistic and its 2-tailed p-value, used in testing the null hypothesis that the coefficient (parameter) is 0. Using an alpha of 0,05:
The coefficient for the intercept is significantly different from 0, because its p-value is 0,00000094, which is smaller than 0,05.
The coefficient for Number of hospitals established is significantly different from 0, because its p-value is 0,042, which is smaller than 0,05.
The remaining coefficients are not statistically significant at the 0,05 level, since their p-values are greater than 0,05.
The p-values for all the coefficients, with the exception of the coefficient for Number of hospitals established, are bigger than 0,05. This means that we cannot reject the hypothesis that they are zero (and so they can be eliminated from the model). This is also confirmed by the fact that 0 lies in the interval between the Lower 95% and Upper 95% bounds (i.e. the 95% confidence interval) for each of these coefficients.

95% Confidence Interval - these are the 95% confidence intervals for the coefficients. Confidence intervals help put the coefficient estimate into perspective by showing how much the value could vary.
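A coefficient's 95% confidence interval is estimate ± t_crit × standard error, where t_crit comes from the t distribution with the residual degrees of freedom (here df = 8, so t_crit ≈ 2,306). The standard error in the sketch below is hypothetical, since the per-coefficient standard errors are not quoted in this section; it only illustrates the "does 0 lie inside?" check:

```python
# 95% CI for a coefficient: estimate ± t_crit * SE
# Two-tailed 95% critical t value for 8 residual degrees of freedom
T_CRIT_DF8 = 2.306

def conf_interval(coef, se, t_crit=T_CRIT_DF8):
    return coef - t_crit * se, coef + t_crit * se

# Hospitals coefficient from the output; the SE here is HYPOTHETICAL,
# chosen only to demonstrate the significance check.
lo, hi = conf_interval(2687.1508, 1100.0)
print(lo, hi)
print("significant at 5%:", not (lo <= 0 <= hi))
```

If 0 falls inside the interval, the coefficient is not significantly different from 0 at the 5% level, which is exactly the conclusion drawn from the p-values above.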
