
NAME: Pooja Dutt
ROLL NO: 1411008323

1 Statistics plays a vital role in almost every facet of human life. Describe
the functions of Statistics. Explain the applications of statistics.
Meaning of statistics
Functions of statistics
Applications of statistics
Answer: According to Seligman, Statistics is a science which deals with the
method of collecting, classifying, presenting, comparing and interpreting the
numerical data to throw light on enquiry.
Functions of Statistics
To represent facts in the form of numerical data: The first function of statistics
is to present a given problem in terms of numerical figures. We know that the
numerical presentation helps in having a better understanding of the nature of
problem.
To condense and summarize a mass of data: Generally, the problem to be
investigated is represented by a large mass of numerical figures which are very
difficult to understand and remember. Using various statistical methods, this large
mass of data can be reduced to totals, averages, percentages, etc., and presented
either graphically or diagrammatically.
To facilitate comparison of data: Many times, the purpose of undertaking a
statistical analysis is to compare various phenomena by computing one or more
measures like mean, variance, ratios, percentages and various types of coefficients
because the absolute figures do not convey any significant meaning. But their
comparison helps us to draw the conclusion.
To formulate and test hypothesis for the purpose of co-relation: A
hypothesis is a statement about some characteristics of a population.
To forecast future trend: The success of planning by the Government or of an
organization depends to a large extent upon the accuracy of their forecasts.
Statistical method provides a scientific basis for making such forecasts.
Application of Statistics
In the field of medicine, statistical tools like t-tests are used to test the efficacy of
a new drug or medicine. In the field of economics, statistical tools such as index
numbers, estimation theory and time series analysis are used in solving economic
problems related to wages, price, production and distribution of income. In the field
of agriculture, an important statistical technique, analysis of variance
(ANOVA), is used in experiments to test the significance of the differences
among several sample means. Insurance companies decide on the insurance
premiums based on the age composition of the population and the mortality rates.
Actuarial science is used for the calculation of insurance premiums and dividends.
Statistics is a part of Economics, Commerce and Business. Statistical analysis of the
variations in price, demand and production is helpful to both businessmen and
economists. Cost of living index numbers help governments in economic planning
and fixation of wages. A government's administrative system is fully dependent on
production statistics, income statistics, labour statistics, and economic indices of cost
and price. Economic planning of any nation is entirely based on statistical facts.
Cost of living index numbers are also used to estimate the value of money. In
business activities, analysis of demand, price, production cost, and inventory costs
help in decision making. Management of limited resources and labour needs
statistical methods to maximize profit.

2 a) Explain the approaches to define probability.


b) State the addition and multiplication rules of probability giving an
example of each case.
a) Explanation of the approaches to define probability
b) Addition and multiplication rules of probability giving an example of each
Answer: a) There are four approaches to define Probability. They are:
1) Classical / mathematical / a priori approach
Under this approach the probability of an event is known before conducting the
experiment. In this case, each of the possible outcomes is associated with an equal
probability of occurrence, and the number of outcomes favorable to the concerned event
is known.
Let a random experiment have n equally likely, mutually exclusive and exhaustive
outcomes. Let m of these outcomes be favorable to an event A.
Then, the probability of A is
P(A) = (Number of favorable outcomes) / (Total number of outcomes)
P(A) = m / n
where m is the number of favorable outcomes and n is the total number of
outcomes of the experiment.
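The classical formula P(A) = m / n can be sketched in a few lines of Python. The die example and the function name classical_probability are illustrative assumptions, not part of the original answer.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """Return P(A) = m / n for n equally likely outcomes, m of them favorable."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need n > 0 and 0 <= m <= n")
    return Fraction(favorable, total)


# Hypothetical experiment: roll a fair six-sided die; event A = "an even number".
outcomes = [1, 2, 3, 4, 5, 6]
favorable = [x for x in outcomes if x % 2 == 0]

p_even = classical_probability(len(favorable), len(outcomes))
print(p_even)  # 1/2
```

Using Fraction keeps the result exact, which matches the way the ratio m / n is written by hand.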
2) Statistical / relative frequency / empirical / a posteriori approach
Under this approach the probability of an event is arrived at after conducting an
experiment. If we want to know the probability that a particular household in an
area will have two earning members, then we have to gather data on all households
in that area and then arrive at the probability. The greater the number of households
surveyed, the greater the accuracy of the probability arrived at.
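As a rough illustration of the relative frequency approach, the probability can be estimated as the share of surveyed households in which the event occurred. The survey figures below are made up for the sketch, and relative_frequency is a hypothetical helper name.

```python
def relative_frequency(observations, event):
    """Estimate P(event) as (times the event occurred) / (number of observations)."""
    if not observations:
        raise ValueError("need at least one observation")
    hits = sum(1 for obs in observations if event(obs))
    return hits / len(observations)


# Hypothetical survey: number of earning members in each of 20 households.
earning_members = [1, 2, 2, 1, 3, 2, 1, 1, 2, 2,
                   1, 2, 3, 1, 2, 1, 2, 2, 1, 2]

# Event of interest: the household has exactly two earning members.
p_two = relative_frequency(earning_members, lambda k: k == 2)
print(p_two)  # 0.5
```

With only 20 households the estimate is crude; surveying more households would be expected to bring it closer to the true proportion.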
3) Subjective approach
Under this approach the investigator or researcher assigns probability to the events
either from his experience or from past records. It is more suitable when the sample
size is ten or less than ten. The investigator has full knowledge about the
characteristics of each and every individual. However, there is a chance of personal
bias being introduced in such probability.
4) Axiomatic approach
Let S be a sample space consisting of all outcomes of a random experiment and A ⊆ S;
then the probability of an event A is a set function satisfying the following axioms:
i) Axiom of positivity, P(A) ≥ 0
ii) Axiom of certainty, P(S) = 1
iii) Axiom of additivity, P(A ∪ B) = P(A) + P(B) for mutually exclusive events A and B
b) Addition rule
The addition rule of probability states that:
i) If A and B are any two events then the probability of the occurrence of either A or B is
given by:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
ii) If A and B are two mutually exclusive events then the probability of occurrence of either A
or B is given by:
P(A ∪ B) = P(A) + P(B)
iii) If A, B and C are any three events then the probability of occurrence of either A or B or
C is given by:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(B ∩ C) - P(A ∩ C) + P(A ∩ B ∩ C)
Multiplication rule
If A and B are two independent events then the probability of occurrence of A and B is given
by:
P(A ∩ B) = P(A) × P(B)
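The question asks for an example of each rule. One possible set of examples is sketched below by enumerating equally likely outcomes. The card and dice experiments and the helper prob are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def prob(event, space):
    """Classical probability of `event` (a predicate) over equally likely outcomes."""
    return Fraction(sum(1 for s in space if event(s)), len(space))


# Hypothetical experiment 1: draw one card from a standard 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))


def is_heart(card):
    return card[1] == "hearts"


def is_king(card):
    return card[0] == "K"


def is_queen(card):
    return card[0] == "Q"


# Addition rule, any two events: P(heart or king) = 13/52 + 4/52 - 1/52.
p_heart_or_king = prob(lambda c: is_heart(c) or is_king(c), deck)
by_rule = (prob(is_heart, deck) + prob(is_king, deck)
           - prob(lambda c: is_heart(c) and is_king(c), deck))
print(p_heart_or_king, by_rule)  # 4/13 4/13

# Addition rule, mutually exclusive events: P(king or queen) = 4/52 + 4/52.
p_king_or_queen = prob(lambda c: is_king(c) or is_queen(c), deck)
print(p_king_or_queen)  # 2/13

# Hypothetical experiment 2: roll two fair dice (independent events).
rolls = list(product(range(1, 7), repeat=2))
p_both_six = prob(lambda r: r[0] == 6 and r[1] == 6, rolls)
by_product = prob(lambda r: r[0] == 6, rolls) * prob(lambda r: r[1] == 6, rolls)
print(p_both_six, by_product)  # 1/36 1/36
```

In each case the probability found by direct counting agrees with the value given by the corresponding rule.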
3 a) The procedure of testing hypothesis requires a researcher to adopt
several steps. Describe in brief all such steps.
b) Explain the components of time series.
a) Hypothesis testing procedure
b) Components of time series
Answer: a) Procedure of testing hypothesis
Null and Alternate hypothesis
In hypothesis testing, we must state the assumed or hypothesized value of the
population parameter before we begin sampling. The assumption we wish to test is
called the null hypothesis and is symbolized by H0.
Interpreting the level of significance
The purpose of hypothesis testing is not to question the computed value of the
sample statistic but to make a judgment about the difference between that sample
statistic and a hypothesised value for population parameter.
Hypotheses are accepted, not proved
Even if our sample statistic does fall in the non-shaded region, this does not prove
that our null hypothesis (H0) is true; it simply does not provide statistical evidence
to reject it. Why? It is because the only way in which the hypothesis can be
accepted with certainty is for us to know the population parameter; unfortunately,
this is not possible. Therefore, whenever we say that we accept the null hypothesis,
we actually mean that there is not sufficient statistical evidence to reject it. Use of
the term accept, instead of do not reject, has become standard practice. It means
that when the sample data do not lead us to reject a null hypothesis, we behave as if
the hypothesis is true.
b) Components of time series
i) Long term trend or secular trend: It can be defined as a consistent long term
change in the average level of the forecast variable per unit of time. The steady
increase in the population of India recorded by the census department is an
example of secular trend.

ii) Seasonal variations: Seasonal variations are caused by the influence of the seasons
(spring, summer, autumn and winter) on business and economic activities. Seasonal
variations involve patterns of change within a year that tend to be repeated from
year to year. For example, the hotel industry can expect a substantial increase in the
number of tourists during the spring and autumn every year. Similarly, physicians can
expect an increase in the number of flu cases during the summer. Because these
patterns are regular, they are useful in forecasting the future.
iii) Cyclic variations: Cyclic variations are fluctuations, generally associated with
business cycles, in which the values of the variable under study tend to rise and
fall in line with the fluctuations of the business cycle.
iv) Random variations: This is the fourth type of change in time series analysis. In
many situations the value of a variable may be completely unpredictable, changing
in a random manner. Irregular variations describe such movements. They can occur
due to strikes, breakdown of plants, non-seasonal illness, bad weather, etc.
These variations appear as abrupt, deep troughs or sharp peaks in the series.
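To make the components concrete, the sketch below separates the trend from the seasonal pattern using a centered moving average. It assumes an additive model (value = trend + seasonal effect), which the answer above does not state, and uses invented quarterly sales figures.

```python
def centered_moving_average(series, period):
    """Estimate the trend with a centered moving average over one seasonal cycle.

    For an even period the two end points of each window get half weight,
    so the window is centered on an actual observation.
    """
    if period % 2 != 0:
        raise ValueError("this sketch handles even periods only")
    half = period // 2
    trend = [None] * len(series)  # no estimate at the first and last `half` points
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


# Hypothetical quarterly sales: a linear trend plus a fixed seasonal pattern.
seasonal = [5.0, -3.0, -4.0, 2.0]              # sums to zero over one year
true_trend = [100.0 + 2.0 * t for t in range(12)]
sales = [true_trend[t] + seasonal[t % 4] for t in range(12)]

trend = centered_moving_average(sales, period=4)

# Subtracting the estimated trend leaves the seasonal (and any random) part.
detrended = [y - m for y, m in zip(sales, trend) if m is not None]
print(trend[2:10])
print(detrended)
```

Because the made-up series has no cyclic or random component, the moving average recovers the trend exactly; with real data the detrended values would also contain cyclic and irregular movements.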
4 a) What is a Chi-square test? Point out its applications. Under what
conditions is this test applicable?
b) Discuss the types of measurement scales with examples.
a) Meaning, applications and conditions
b) Types of measurement scales with examples
Answer: a) Meaning of Chi-square test
The Chi-square test is one of the most commonly used non-parametric tests in
statistical work. The Greek letter χ² is used to denote this test. χ² describes the
magnitude of discrepancy between the observed and the expected frequencies. The
value of χ² is calculated as:
χ² = Σ (O - E)² / E
where O is an observed frequency and E is the corresponding expected frequency.
Applications of Chi-square test


Tests for independence of attributes: In the test for independence, the null
hypothesis is that the row and column variables are independent of each other. The
hypothesis testing is done under the assumption that the null hypothesis is true.
Test of goodness of fit: The test of goodness of fit of a statistical model measures
how well the model fits a set of observations. This test measures and
summarises the differences, if any, between the observed and expected values of
the considered statistical model. These test results are helpful to know whether the
samples are drawn from identical distributions or not.
Test for comparing variance: When we have to use χ² as a test of population
variance, then,
χ² = (σs² / σp²) (n - 1)
where σs² = variance of the sample
σp² = variance of the population
(n - 1) = degrees of freedom, n being the number of items in the sample.
Conditions for applying the Chi-Square test

1. The frequencies used in the Chi-square test must be absolute, not relative.
2. The total number of observations collected for this test must be large.
3. The observations which make up the sample for this test must be independent
of each other.
4. As the χ² test is based wholly on sample data, no assumption is made concerning
the population distribution. In other words, it is a non-parametric test.
5. The χ² test is wholly dependent on degrees of freedom. As the degrees of freedom
increase, the Chi-square distribution curve becomes more symmetrical.
6. The expected frequency of any item or cell must not be less than 5. If it is, the
frequencies of adjacent items or cells should be pooled together in order to make it
5 or more.
7. The data should be expressed in original units for convenience of comparison and
the given distribution should not be replaced by relative frequencies or proportions.
8. This test is used only for drawing inferences through tests of hypotheses, so it
cannot be used for estimation of parameter values.
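A goodness-of-fit calculation along these lines can be sketched as follows. The die-roll counts are invented for illustration, chi_square_statistic is a hypothetical helper, and 11.07 is the standard table value of χ² for 5 degrees of freedom at the 5% level.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all cells of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e < 5 for e in expected):
        # Condition 6 above: pool adjacent cells before applying the test.
        raise ValueError("every expected frequency should be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: a die rolled 60 times; H0 says the die is fair.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6                      # 60 rolls / 6 faces

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1   # 5
critical_value = 11.07                   # table value, 5 df, 5% level

print(chi2)                              # 1.0
print(chi2 > critical_value)             # False: H0 is not rejected
```

Note that the counts are absolute frequencies and every expected frequency is at least 5, in line with conditions 1 and 6.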

b) Types of measurement scales


Qualitative (categorical) data
Qualitative, also known as categorical data, cannot be measured on a numerical
scale (quantified). Examples of categorical variables are gender (male or female)
and size of T-shirt (XXS, XS, S, M, L, XL and XXL); yet, these two variables differ in a
sense; the first is said to be nominal or purely categorical whereas the second is
known as ordinal.
Nominal (purely categorical) data
Nominal variables allow for only qualitative classification. They can be measured
only in terms of whether the individual items belong to some distinctively different
categories; however, we cannot quantify or even rank order these categories. For
example, if two individuals differ in terms of a certain variable (for example, they
are of different races), we cannot say which one has more of the quality represented
by the variable. Typical examples of nominal variables are gender, race, colour, city,
marital status, etc.
Ordinal data
Ordinal variables allow us to rank order the items we measure in terms of which has
less and which has more of the quality represented by the variable, however they
do not allow us to say how much more. A typical example of an ordinal variable is
the socioeconomic status of families. For example, we know that upper-middle is
higher than middle but we cannot say that it is, for example, 18% higher. Also, this
very distinction between nominal, ordinal, and interval scales itself represents a
good example of an ordinal variable. For example, we can say that nominal
measurement provides less information than ordinal measurement, but we cannot
say how much less or how this difference compares to the difference between
ordinal and interval scales.
Quantitative (numerical) data

Quantitative data can be easily measured on a numerical scale; variables which can
be quantified in terms of units are all quantitative. Examples of quantitative
variables are number of students per class and height (measured in centimetres).
Again, these two variables differ in their nature; the first is said to be discrete
whereas the second is continuous.
Discrete data: Discrete data occur as definite and separate values; a discrete
variable assumes values which are countable so that there are gaps between its
successive values. For example, when counting the number of children in a class,
we use whole numbers (0, 1, 2, ..., n).
Continuous data: Continuous data occur as the whole set of real numbers or a
subset of it. In other words, there are no gaps between successive values so that a
continuous variable assumes all the values (including all the decimals) between
given boundaries. Temperature is a good example of a continuous variable: although
thermometer readings are recorded to the nearest tenth of a degree (Centigrade or
Fahrenheit), temperature does not jump from, for example, 17.1 °C to 17.2 °C; it
passes through all the real numbers between these two values. Height, weight and
speed are also continuous variables. Continuous data can be measured on an interval
scale or a ratio scale.
5 Business forecasting acquires an important place in every field of the
economy. Explain the objectives and theories of Business forecasting.
Meaning of Business forecasting
Objectives of Business forecasting
Theories of Business forecasting
Answer: Business forecasting refers to the analysis of past and present economic
conditions with the object of drawing inferences about probable future business
conditions. The process of making definite estimates of the future course of events is
referred to as forecasting, and the figures or statements obtained from the process are
known as forecasts. The future course of events is rarely known with certainty, so an
organised system of forecasting helps a business to prepare for it.
Objectives of Business forecasting
To a very large extent, success or failure would depend upon the ability to
successfully forecast the future course of events. Without some element of
continuity between past, present and future, there would be little possibility of
successful prediction. But history is not likely to repeat itself exactly, and we would hardly
expect economic conditions next year or over the next 10 years to follow a clear-cut
pattern. Yet, past patterns prevail sufficiently to justify using the past as a basis
for predicting the future. A businessman cannot afford to base his decisions on
guesses. Forecasting helps a businessman in reducing the areas of uncertainty that
surround management decision making with respect to costs, sales, production,
profits, capital investment, pricing, expansion of production, extension of credit,
development of markets, increase of inventories and curtailment of loans. These
decisions are to be based on present indications of future conditions. However, we
know that it is impossible to forecast the future precisely. There is a possibility of
occurrence of some range of error in the forecast. Statistical forecasts are the
methods in which we can use the mathematical theory of probability to measure the
risks of errors in predictions.

Theories of Business Forecasting


Sequence or time-lag theory: This is the most important theory of business
forecasting. It is based on the assumption that most of the business data have the
lag and lead relationships, that is, changes in business are successive and not
simultaneous. There is time-lag between different movements.
Action and reaction theory: This theory is based on the following two
assumptions.
Every action has a reaction
Magnitude of the original action influences the reaction
Economic Rhythm Theory: The basic assumption of this theory is that history
repeats itself and hence assumes that all economic and business events behave in
a rhythmic order. According to this theory, the speed and time of all business cycles
are more or less the same and by using statistical and mathematical methods, a
trend is obtained which will represent a long term tendency of growth or decline. It
is done on the basis of the assumption that the trend line denotes the normal
growth or decline of business events.
Specific historical analogy: "History repeats itself" is the main foundation of this
theory. If conditions are the same, whatever happened in the past under a set of
circumstances is likely to happen in future also. A time series relating to the data in
question is thoroughly scrutinised, and a period is selected in which conditions were
similar to those prevailing at the time of making the forecast. However, this theory
depends largely on past data.
Cross-cut analysis theory: This theory proceeds on the analysis of interplay of
current economic forces. In this method, the combined effects of various factors are
not studied. The effect of each factor is studied independently. Under this theory,
forecasting is made on the basis of analysis and interpretation of present conditions
because past events are assumed to have no relevance to present conditions.

6 a) What is analysis of variance? What are the assumptions of this
technique?
b) Three samples below have been obtained from normal populations with
equal variances. Test the hypothesis at 5% level that the population
means are equal.
A    B    C
8    7    12
10   5    9
7    10   13
14   9    12
11   9    14
[The table value of F at 5% level of significance for ν1 = 2 and ν2 = 12 is
3.88]

a) Meaning and Assumptions


b) Formulas/Calculation/Solution to the problem
Answer: a) Meaning of Analysis of Variance
Analysis of Variance (ANOVA) is useful in such situations as comparing the mileage
achieved by five different brands of gasoline, testing which of four different training
methods produce the fastest learning record, or comparing the first-year earnings of
the graduates of half a dozen different business schools. In each of these cases, we
would compare the means of more than two samples. Hence, the concept of ANOVA is
used in most fields, such as agriculture, medicine, finance, banking, insurance and
education. In statistical terms, ANOVA compares the variation between the sample
means with the variation within the samples.
Assumptions of Analysis of Variance
The samples are simple random samples.
The samples are independent of each other
The parent populations from which they are drawn are normally distributed.
b) Let H0: There is no significant difference in the means of the three samples.
The three samples
X1   X2   X3
8    7    12
10   5    9
7    10   13
14   9    12
11   9    14
ΣX1 = 50, ΣX2 = 40, ΣX3 = 60
T = Sum of all observations = 50 + 40 + 60 = 150
Correction factor = (T^2)/N = (150^2)/15 = 1500
SST (Total Sum of Squares) = Sum of squares of all observations - (T^2)/N
= 8^2 + 7^2 + 12^2 + 10^2 + ... + 14^2 - 1500 = 1600 - 1500 = 100
Sum of squares between the columns (samples):
SSC = (50^2)/5 + (40^2)/5 + (60^2)/5 - 1500 = 1540 - 1500 = 40
Sum of squares of the error within columns (samples):
SSE = SST - SSC = 100 - 40 = 60
Variance between samples:
MSC = SSC/(k - 1) = 40/(3 - 1) = 20
Variance within samples:
MSE = SSE/(n - k) = 60/(15 - 3) = 5
F = MSC/MSE = 20/5 = 4
The degrees of freedom = (k - 1, n - k) = (2, 12).
[k is the number of columns and n is the total number of observations]

ANOVA Table
Source of variation   Sum of squares   Df   Mean square   F-value
Between samples       SSC = 40         2    MSC = 20      Fcal = 20/5 = 4
Within samples        SSE = 60         12   MSE = 5
Total                 SST = 100        14
The F table value for degrees of freedom (2, 12) [ν1 = 2, ν2 = 12] at the 5% level of
significance is 3.88. Since the calculated F value (4) is greater than the F table value, we
reject the null hypothesis and conclude that the population means are not equal.
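The hand calculation above can be checked with a short Python sketch that follows the same correction-factor method. The function name one_way_anova is a hypothetical helper, and the critical value 3.88 is the one given in the question.

```python
def one_way_anova(groups):
    """One-way ANOVA using the correction-factor method used in the solution."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)                      # total number of observations
    k = len(groups)                          # number of samples (columns)
    correction = sum(all_values) ** 2 / n    # T^2 / N
    sst = sum(x * x for x in all_values) - correction
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - ssc
    msc = ssc / (k - 1)                      # variance between samples
    mse = sse / (n - k)                      # variance within samples
    return {"SST": sst, "SSC": ssc, "SSE": sse,
            "MSC": msc, "MSE": mse, "F": msc / mse,
            "df": (k - 1, n - k)}


# The three samples from the question.
sample_a = [8, 10, 7, 14, 11]
sample_b = [7, 5, 10, 9, 9]
sample_c = [12, 9, 13, 12, 14]

result = one_way_anova([sample_a, sample_b, sample_c])
f_table = 3.88                               # given in the question for (2, 12) df

print(result)
print(result["F"] > f_table)                 # True: reject H0
```

The output reproduces SST = 100, SSC = 40, SSE = 60 and F = 4 from the ANOVA table.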
