
Chapter 1: INTRODUCTION TO STATISTICS

The word ‘statistics’ seems to have been derived from the Latin word ‘Status’, the Italian
word ‘Statista’, the German word ‘Statistik’ or the French word ‘Statistique’, each of
which means a political state. It is not a new discipline; it is as old as human
society. In earlier times, the term statistics was applied to a branch of statecraft, the
science of statecraft. As such, the term meant the facts and
figures needed by the state in its day-to-day working. Statistics was regarded
as a by-product of the administrative activities of the State. Now statistics is usually not
studied for its own sake (as a separate branch); it is employed as a tool in
solving or analyzing the problems of the State.
In the present age, statistics is regarded as one of the most important tools for taking
decisions. All branches of science make use of statistics. It helps in
forming suitable policies and is therefore used in every field. In research work, it has
its own status as a tool of research. Thus in
every situation there is a demand for statistics. Sampling techniques further
reduce the cost of statistics, because by studying a part of the population the
characteristics of the whole population can be estimated. The increasing demand for
statistics and its decreasing cost thus lead to its growth.
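As a rough illustration of the sampling idea described above, the following Python sketch draws a simple random sample from a made-up population of family incomes and compares the sample mean with the population mean. The population values, the sample size of 200 and the random seed are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: monthly incomes (in rupees) of 10,000 families.
population = [random.gauss(25000, 6000) for _ in range(10000)]

# Study only a part of the population: a simple random sample of 200 families.
sample = random.sample(population, 200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.0f}")
print(f"Sample mean:     {sample_mean:.0f}")
print(f"Units studied:   {len(sample)} of {len(population)}")
```

Only 2 per cent of the units are studied here, yet the sample mean would normally fall close to the population mean, which is why sampling lowers the cost of an enquiry.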
Planning and control are the twin-babies of management. Whenever we think of a
plan we have to think of statistics. Planning cannot be devised without statistics. In
this technically advanced and competitive world, a producer has to make a number of
decisions such as what to produce, where to produce, how to produce, where to sell,
at what price to sell etc. Such decisions depend upon sound forecasting and
forecasting cannot be made without statistics. Prof. Marshall observed that “statistics
are the straw out of which I, like any other economist, have had to make bricks”.
Statistics helps in formulating suitable policies and as such its need is increasingly felt
in all the fields. A businessman needs information on daily demand of the products,
seasonal changes in demand, prices of competitive products etc. All these problems
are resolved in the light of factual information and hence the need for statistics.
“By statistics we mean aggregates of facts affected to a marked extent by
multiplicity of causes, numerically expressed, enumerated or estimated
according to reasonable standard of accuracy, collected in a systematic
manner for a predetermined purpose and placed in relation to each other.” - Horace Secrist
Numerical data alone constitute statistics. Students can be classified as very good, good,
average, poor, etc. on the basis of their performance in tests, but these are
qualitative expressions and are not statistics. In particular, qualitative
characteristics such as honesty, beauty and intelligence, which cannot be measured
numerically, are not statistics. If they are expressed by assigning scores (marks)
as numerical standards, then they can be called statistics. For example, in a
beauty competition, once ranks or scores are assigned to the contestants, these
numerical measures can be regarded as statistics.
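The step from qualitative labels to numerical data can be sketched in a few lines of Python. The scoring scheme and the six ratings below are hypothetical and serve only to show the idea.

```python
import statistics

# Hypothetical scoring scheme: each qualitative rating gets a numerical score.
SCORES = {"poor": 1, "average": 2, "good": 3, "very good": 4}

# Made-up ratings for six students.
ratings = ["good", "very good", "average", "good", "poor", "very good"]

# Convert the labels to numbers; only now can they be summarised numerically.
scores = [SCORES[r] for r in ratings]
average_score = statistics.mean(scores)

print(scores)
print(round(average_score, 2))
```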
The numerical data pertaining to any field of enquiry can be obtained either by
enumeration (actual counting) or by estimation. If the field of enquiry is not large,
enumeration can be conducted. If the field of enquiry is wide and
large, enumeration is out of the question, and in such cases the data can be estimated. For
instance, saying that there are 60 students in the MBA class is a case of enumeration (we
count the number of students). Saying that 1,00,000 people
attended the Independence Day celebration is a case of estimation
(approximation).

A reasonable standard of accuracy is needed in both enumeration and estimation. For
instance, if the weights of students are being measured, fractions of a kilogram (say
1/10th or 1/20th) can be ignored; when measuring the distance from Chennai to
Kanyakumari, a fraction of a kilometer can easily be ignored. No hard and fast rule can
be laid down for all cases. Hence perfect mathematical accuracy cannot be attained in
statistical studies.
“Statistics is the science of estimates and probabilities”
This definition is narrow, as other methods like enumeration, classification,
analysis, etc. have been ignored. It therefore narrows down the scope of
the science of statistics.
“Statistics may be defined as the collection, presentation, analysis and
interpretation of numerical data”.
This definition is clear and concise. The data are collected to study a particular
problem. The collected mass of data may then be presented in the form of diagrams,
graphs, etc. According to this definition, there are four stages.
a. Collection of data: The first step of an investigation is the collection of data.
Careful collection is needed, because further analysis is based on this. There
are different methods of collection of data (Census, sampling, primary,
secondary etc.) and they must be reliable. If the collected data are faulty,
results will also be faulty. Therefore, the investigator must take special care in
collection.
b. Presentation of data: The collected data are generally in an unintelligible
form and need to be classified and tabulated before they can be analyzed. For
example, suppose the investigator wants to know the average income of 1000
families of a village. The mass of data collected would be difficult to understand
and analyze. Therefore, the collected data are to be presented in tabular,
diagrammatic or graphic form. Data presented in a systematic order will
facilitate further analysis.
c. Analysis of data: After the presentation of data, the next step is to analyse
the presented data. Analysis includes condensation, summarization, conclusion,
etc. by means of measures of central tendency, dispersion, skewness,
kurtosis, correlation, regression, etc.
d. Interpretation of data: Figures do not speak for themselves. The duty of
the statistician is not complete with the mere collection and analysis of data;
valid conclusions must be drawn on the basis of the analysis. A high degree of skill
and experience is necessary for interpretation. Correct interpretation leads
to valid conclusions.
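The four stages above can be sketched on a toy data set. The twelve income figures, the class width of 5000 and the helper name income_class are assumptions made for illustration; a real enquiry would involve far more units and a more careful choice of classes.

```python
import statistics
from collections import Counter

# a. Collection: hypothetical monthly incomes (rupees) of 12 families.
incomes = [8200, 12500, 9100, 15300, 11000, 8700,
           22400, 9800, 13600, 10400, 17900, 12100]

# b. Presentation: classify the raw figures into income classes of width 5000.
def income_class(value, width=5000):
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"

table = Counter(income_class(v) for v in incomes)
for label in sorted(table, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>12}  {table[label]}")

# c. Analysis: measures of central tendency and a measure of dispersion.
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
spread = statistics.stdev(incomes)  # sample standard deviation

# d. Interpretation: a mean above the median suggests that a few high incomes
# pull the average up; the analyst still has to judge what that means.
print(f"mean={mean_income:.0f} median={median_income:.0f} stdev={spread:.0f}")
print("mean exceeds median:", mean_income > median_income)
```

The code can only carry out stages a to c mechanically; the last stage remains a matter of judgment, as the text notes.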
Without an adequate understanding of statistical methods, the investigator in the
social sciences may be like the blind man groping in a dark room for a black cat that is
not there. The methods of statistics are useful in an ever-widening range of human
activities, in any field of thought in which numerical data may be had.
The real purpose of statistical methods is to make sense out of facts and figures, to
probe the unknown and to cast light upon the situation.
Broadly speaking, statistical methods can be fruitfully applied to
any problem of decision making where numerical data are available or can be made
available. In business, industry and economics, statistical techniques
are applicable to problems such as studying trends of population, agricultural and
industrial production, prices, internal and external trade, gross national product,
and taxation laws and rates; preparation of budgets; computation of consumer price
indices from time to time to revise wage structures; preparation of price policies for
new products; scheduling of projects and exercising control over
operations till completion; resource allocation for any job; carrying out inquiries to
identify potential markets; stock control; quality control; and maintenance and
replacement of equipment.
In research and technology, statistical techniques are used to develop optimum
designs of experiments that yield the relevant information with
the highest precision at minimum cost. In the social sciences, statistics helps in studying the
distribution of wealth, intelligence, etc. It is also used in studying changes in
standards of living, food habits and attitudes of people.
Functions of Statistics:
In the various fields discussed above and in many others, the science of statistics is used to
perform the following functions:
1. Statistics helps in developing sound methods of collecting data so that the data
collected can be used to draw valid inferences regarding the desired
objectives.
2. It presents information in numerical form.
3. It helps in simplifying complex data by way of classification, tabulation or
graphical representation.
4. The tabular or graphical representation of data and other statistical measures help
in comparison.
5. Statistics can be used to study the relationship between two or more factors.
Such a relationship can be used to estimate one factor when the others
are known.
6. The data regarding a characteristic for a series of past periods can be used to
forecast its value for a future period.
7. The powerful function of forecasting supports planning: it
helps in formulating policies and in planning how to implement those
policies.
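Functions 5 and 6 can be illustrated with a small Python sketch that measures the relationship between two factors and then uses a fitted straight line to estimate one from the other. The advertising and sales figures are invented, and the choice of Pearson's correlation coefficient and a least-squares line is an assumption; the text does not prescribe a particular method.

```python
import statistics

# Hypothetical data: advertising spend and sales for six past periods.
advertising = [10, 12, 15, 17, 20, 22]
sales = [110, 118, 131, 140, 152, 161]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

r = pearson_r(advertising, sales)
slope, intercept = fit_line(advertising, sales)

# Estimate one factor (sales) when the other (advertising = 25) is known.
estimated_sales = intercept + slope * 25

print(f"r = {r:.3f}")
print(f"estimated sales at advertising 25: {estimated_sales:.1f}")
```

Replacing the advertising figures with period numbers (1, 2, 3, ...) would turn the same fitted line into a simple trend forecast for a future period.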
Limitations of Statistics:
Statistics is a very powerful science for studying quantitative data. Qualitative data
cannot be studied with the help of statistics unless they are converted into
quantitative form by defining suitable variates.
More often than not, statistics is used to draw conclusions regarding a group of units
rather than a single unit. For an individual unit, any inference drawn holds only on
average and always carries an element of chance.
Sometimes, owing to bias in the collection of data, the inference drawn is a
biased one.
A potential danger in the use of statistics is its misuse. It is easy to misuse
it to support or contradict any proposition or conjecture. For instance, a
statement like “During the last month, six street accidents were recorded in the
middle of the road compared to twenty-one accidents recorded on the sides of the
road in busy streets of Mumbai” may lead one to a conclusion like “It is safer to walk
in the middle of the road”. The conclusion is unsound because the counts ignore how
many people walk in each place.
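A short sketch shows why the accident counts in the statement above mislead. The two counts come from the text; the numbers of pedestrians are purely hypothetical, invented here only to show that a count needs a base before two groups can be compared.

```python
# Accident counts quoted in the text.
accidents = {"middle of road": 6, "sides of road": 21}

# Hypothetical exposure: how many pedestrians walked in each place that month.
# These figures are invented purely to show why a base is needed.
pedestrians = {"middle of road": 300, "sides of road": 210000}

# Accidents per 1,000 pedestrians in each place.
rates = {place: accidents[place] / pedestrians[place] * 1000
         for place in accidents}

for place, rate in rates.items():
    print(f"{place}: {rate:.2f} accidents per 1,000 pedestrians")
```

Under these assumed exposure figures the rate is far higher in the middle of the road, the opposite of what the raw counts suggest.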
SOURCES OF DATA
An application of statistics involves data and therefore the foremost question that
arises is from where to get the data or what the sources of data are.

We have seen that collection of the data is always the first step in any statistical
enquiry. Before starting any enquiry, the following concepts must be clearly defined.
POPULATION:
Any finite or infinite aggregate of all possible objects under study, not necessarily
animate, is called a POPULATION. In a statistical study we may have a population of
students at a university, employees of a company,
misprints in a book, units produced by a factory, cheques cleared in a
month, etc.
SAMPLE:
Any finite set of objects selected from a population is called a SAMPLE. The objects
included in the sample should be representative of the items in the population, so that by
studying the sample values in detail an idea about the characteristics of the objects in
the population can be obtained.
VARIATE:
A characteristic from the population which can be expressed numerically and which
varies from object to object is called a VARIATE. For example, the wages of persons or
the heights of students can be measured quantitatively and so these are variates.
ATTRIBUTE:
Certain characteristics cannot be expressed quantitatively but can be described
qualitatively, for example beauty, intelligence, skill, talent, etc. These are called
ATTRIBUTES.
PARAMETER:
A statistical measure like the mean or the standard deviation, calculated for all the
objects included in the population, is called a PARAMETER. It is usually denoted by
Greek letters, such as µ for the mean and σ for the standard deviation.
STATISTIC:
A statistical measure calculated for all units in the sample is called a STATISTIC. It is
denoted by letters of the English alphabet. For example, the sample mean is denoted by x̄
and the sample standard deviation by S.
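The difference between a parameter and a statistic can be sketched with Python's standard statistics module. The population of heights, its size and the sample size of 100 are hypothetical. The sketch assumes the common convention that the population standard deviation uses divisor N (pstdev) and the sample standard deviation uses divisor n - 1 (stdev).

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of all 5,000 students at a university.
population = [random.gauss(165, 8) for _ in range(5000)]

# Parameters: computed from every unit in the population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Statistics: computed from a sample of 100 units only.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

print(f"mu    = {mu:.2f}   sigma = {sigma:.2f}")
print(f"x_bar = {x_bar:.2f}   S     = {s:.2f}")
```

The parameters are fixed for a given population, whereas the statistics would change from one sample to the next.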
The following points must be decided before collection of data begins:
1. The purpose or objective of the collection must be precisely defined. The
type of data to be included, the characteristics to be considered, the sources
from which the data are to be obtained and the steps to be followed to collect the
data should all be worked out in advance.
2. The scope of the enquiry with respect to the time and the places to be covered
should be decided first. There are different types of enquiries: official or
non-official, regular or ad hoc, direct or indirect, etc. The type which
suits the purpose and the scope should be chosen.
3. The values of a variable are measured in a particular unit, which is
called the statistical unit. For example, for incomes of employees the unit is the
rupee; for heights of persons, the centimeter. Along with the unit, the degree
of accuracy should also be decided.
After considering the above-mentioned points, the type of data, whether primary or
secondary, is to be decided.
METHODS OF COLLECTING PRIMARY DATA

Primary data are the information collected for the first time by an enumerator or
investigator for the purpose of the enquiry. The following methods can be used to
collect primary data.
a. DIRECT PERSONAL INVESTIGATION: Here the investigator meets the
informants personally and collects the information by asking questions. The
questions should be simple and short, and should be framed so as to get brief and
unambiguous answers. The enumerator must be trained and specially hired for the
job. His observation should be keen and he should be well acquainted with
local conditions. He should have sufficient knowledge of the tastes and preferences
of the informants. The investigator should be polite and courteous, yet
firm and determined to get answers tactfully from the respondents.
This type of investigation, though very costly and time-consuming, is the best
method available as far as accuracy is concerned. If the scope of the enquiry is very
wide, this method cannot be used. Also, care has to be taken to avoid personal
bias entering the answers of the respondents; otherwise it will affect the validity of
the data collected.
b. INDIRECT ORAL INVESTIGATION: If the persons directly concerned with
the investigation are not willing to supply the necessary information, it is
obtained by questioning witnesses who are expected to know the situation,
the persons concerned or the problem involved.
This method is adopted by Inquiry Committees or Commissions. It is applicable in
those situations where indirect informants can give more reliable and accurate
information than the persons involved. This method can be successful only when
the witnesses are honest and are not hostile towards the persons concerned. They
should be able to express themselves precisely, accurately, without exaggerating
the situation. The investigators should be able to judge whether the information
provided by the witness is correct and without bias.
c. QUESTIONNAIRES AND SCHEDULES: In this method, a list of questions is
prepared and it is sent by post to various informants. Usually, a sample of
informants is selected from the concerned population. Sometimes the schedules
are filled in by the enumerators who question the people and write down the
necessary information. If the questionnaire is sent by mail, then a forwarding
letter, explaining the objective of the survey and requesting co-operation, should
accompany the form. The advantage of this method is that the respondents can
write their answers at their convenience and may not hesitate
to give confidential information asked for in the questionnaire. This method has
wide coverage and is quick and inexpensive. Still, the response is often not very
good. If possible, there should be some incentive, such as a small prize, a lucky-number
draw or a concession at some shops, to get a better response. Every questionnaire
must be accompanied by an addressed and stamped envelope.
If a schedule is to be filled in by an enumerator, he should be a trained, qualified
person. The enumerator should be a person of unquestioned integrity. He must be
patient and tactful with the respondents. He must explain the purpose of the
investigation and also the questions in detail. While writing the answers, he has to
take care that personal bias does not affect the investigation. The reports of the
enumerators should be periodically checked by the supervisors. Now, let us see
how a good questionnaire should be prepared.

REQUISITES OF A GOOD QUESTIONNAIRE OR A SCHEDULE:

1. The number of questions should be as few as possible but at the same time, the
questions should cover all the essential topics on which information is required.
2. The questions should be short, simple and unambiguous. Clarity is essential in
forming the questions.
3. The questions should be drafted in such a way that the answers to them are of
objective type and brief in nature, for example, the answers printed should be
‘yes’ or ‘no’ or multiple-choice answers of the type ‘single, married, widowed,
divorced’.
4. It is possible that some questions cannot be answered accurately by the
respondents. So the degree of accuracy for a statistical unit should be mentioned
with the question itself. For example, for age – the answer is expected in
completed years or the monthly income is to be expressed in hundreds of rupees
etc.
5. Questions which are unduly inquisitive or which are likely to offend the
respondents should not be included in a questionnaire. Questions regarding
personal habits, behaviour with family members or income should be asked
tactfully. Leading questions that hint at the expected answer should be
avoided.
6. The questions should be so worded that personal bias of an investigator is not
reflected.
7. The arrangement of the questions should be carefully planned. Proper space for
answers must be kept and there should be logical flow from one question to the
other.
8. The questionnaire should be neatly printed on high-quality paper so as to create a
good impression on the respondents.
9. If possible, the questionnaire should be tried on a small sample before applying
it to a large group so that some revision or amendment of the questionnaire can
be made, if necessary.
EDITING OF THE PRIMARY DATA
The collected data should be edited before they are processed further. While
editing the data, the following points must be remembered.
i. The data should be consistent. That is, the answers obtained should not
contradict one another.
ii. The answers should be complete and uniform in all respects. If some important
questions are left unanswered, the respondent should be contacted again
to complete the questionnaire.
iii. The answers should be checked for accuracy. Inaccuracy due to mathematical
errors is to be corrected.
iv. The data must be checked for homogeneity of answers. For example, if one
respondent has mentioned the gross pay and if the other has mentioned net
pay after tax deduction, then these cannot be compared.
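Some of the editing checks listed above can be automated once the answers are in machine-readable form. The following Python sketch is only an illustration: the field names, the four records and the helper edit_checks are all hypothetical, and it covers points i, ii and iv (consistency, completeness and homogeneity) but not the accuracy check in point iii.

```python
# Hypothetical questionnaire returns; None marks an unanswered question.
responses = [
    {"id": 1, "age": 34, "years_employed": 10, "pay": 42000, "pay_basis": "gross"},
    {"id": 2, "age": 25, "years_employed": 30, "pay": 30000, "pay_basis": "gross"},
    {"id": 3, "age": 41, "years_employed": 15, "pay": None, "pay_basis": "gross"},
    {"id": 4, "age": 29, "years_employed": 5, "pay": 27000, "pay_basis": "net"},
]

def edit_checks(record, expected_basis="gross"):
    """Return a list of problems found in one filled-in questionnaire."""
    problems = []
    # ii. Completeness: every question should have an answer.
    missing = [key for key, value in record.items() if value is None]
    if missing:
        problems.append(f"incomplete: {', '.join(missing)}")
    # i. Consistency: answers should not contradict one another.
    if record["age"] is not None and record["years_employed"] is not None:
        if record["years_employed"] > record["age"]:
            problems.append("inconsistent: employed longer than age")
    # iv. Homogeneity: all pay figures should be on the same basis.
    if record["pay_basis"] != expected_basis:
        problems.append(f"not homogeneous: pay is {record['pay_basis']}")
    return problems

report = {r["id"]: edit_checks(r) for r in responses}
for rid, problems in report.items():
    print(rid, problems or "ok")
```

Records flagged in this way would still need to be followed up by hand, for instance by contacting the respondent again.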
SECONDARY DATA
The data compiled through various published or unpublished sources is known as
Secondary Data. The following are the main sources of the secondary data.
a) Various Central or State Government publications supply reliable data on many
social and economic activities, for example Census reports, Pay Commission
reports, and monthly or annual publications like the Bulletin on Index of Industrial
Production, the Retail Price Bulletin, Estimates of National Product, etc.
b) Various international institutions publish reports on matters of international
importance. Organizations like W.H.O., I.M.F., U.N.O. and I.B.R.D. regularly publish
official reports.
c) Semi-official publications of corporations like municipal corporations, Life
Insurance Corporation of India, etc.
d) Publications of private bodies like Chambers of Commerce, the Institute of
Chartered Accountants and the Institute of Bankers provide secondary data on various
issues.
e) Periodicals like Economic Weekly, Commerce, Economic Times supply reliable
information.
f) Various universities, research organizations collect data in different fields which
can be used as Secondary data.
g) Some reference books also supply information over a long period.
h) There are also sources like records of government departments, trade union
offices, railways, state transport offices which can be used as secondary data.
The secondary data should be carefully checked before using it in any investigation.
The data should be suitable and adequate for the investigation. The information
should be checked for the reliability and accuracy of data. The integrity of the
investigators or enumerators should be ascertained. The secondary data should never
be accepted at its face value without checking.
We have seen different methods of data collection. If the data is collected for all units
of the population, it is called Census and if it is collected only for a sample then it is
called a Sample Survey.
