1. What Is Data Processing? Explain The Stages Of Data Processing?
Introduction
Data processing is simply the conversion of raw data into meaningful
information through a process. Data are manipulated to produce results
that lead to the resolution of a problem or the improvement of an
existing situation. Like a production process, it follows a cycle in
which inputs (raw data) are fed to a process (computer systems,
software, etc.) to produce output (information and insights).
Generally, organizations employ computer systems to carry out a
series of operations on the data in order to present, interpret, or
obtain information. The process includes activities such as data entry,
summarization, calculation, and storage. Useful and informative output
is presented in various appropriate forms such as diagrams, reports,
and graphics.
Stages of Data Processing
There are five main stages in data processing, as listed below:
Collection of data
Preparation of data
Input of data
Processing of data
Output of data
Collection of data
Here data are obtained or gathered from the various sources available.
The two main sources are as follows:
Primary Data
Primary data are data collected for the first time; they are first-hand
data.
Secondary Data
Secondary data are data extracted from primary data; they are
second-hand data.
Preparation of Data
In this stage data are made available for further use by performing
various operations on them, that is:
Classifying the data,
Rearranging the data,
Editing the raw data, etc.
Here the data are filtered for further use. Preparation of data is the
stage where researchers make sure that they have sufficient raw data
with which to address the research subject. The researcher converts the
raw data into a usable form so that it can be used in later stages of
the investigation.
Input of Data
Input of data is the stage where the prepared data are fed into the
data processing system to obtain information. Here the data are passed
to the person or department responsible for processing them. For
instance, if a computer is used, the data are recorded into the
computer.
Processing of Data
In this stage the data are manipulated by sorting, studying, analysing,
calculating, updating, etc., so that the researcher can obtain answers
to the research questions. Usually a set of working procedures or
instructions is followed.
Output of Information
This is the final stage, where the information is made available for
future use. The end result is produced in a suitable format.
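As a rough illustration, all five stages can be sketched in a few lines
of Python (the salary records and figures below are invented for
illustration):

# 1. Collection: raw data gathered from a hypothetical source.
raw_records = ["  42000 ", "38,500", "", "51000", "abc", "47500"]

# 2. Preparation: classify, clean, and filter the raw records.
prepared = []
for record in raw_records:
    cleaned = record.strip().replace(",", "")
    if cleaned.isdigit():  # drop blank and malformed entries
        prepared.append(int(cleaned))

# 3. Input: the prepared data are fed to the processing step.
# 4. Processing: calculate a summary figure (here, the average).
average = sum(prepared) / len(prepared)

# 5. Output: present the information in a readable form.
print(f"Processed {len(prepared)} of {len(raw_records)} records")
print(f"Average salary: {average:.2f}")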
2. Explain In Brief The Measures Of Central Tendency?
Introduction
A measure of central tendency is a single value that describes the way
in which a group of data cluster around a central value. In other
words, it is a way to describe the centre of a data set. There are
three measures of central tendency: the mean, the median, and the mode.
Importance of Central Tendency
A measure of central tendency condenses an entire data set into a
single representative value, which makes large bodies of data easier to
summarize and compare.
Mean
The mean is the arithmetic average: the sum of all the values in the
data set divided by the number of values. For example, the mean of the
data set {1, 2, 3, 4, 5} is 15 / 5 = 3.
Median
The median is the middle value of a data set that has been arranged in
order of magnitude. In the data set {1, 2, 3, 4, 5}, the median is 3:
there are two data points greater than this value and two data points
less than this value. In this case, the median is equal to the mean.
But consider the
data set {1, 2, 3, 4, 10}. In this data set the median is still 3, but
the mean is equal to 4. If there is an even number of data points in
the set, then there is no single point in the middle, and the median is
calculated by taking the mean of the two middle points.
The median can be determined for ordinal data as well as interval and
ratio data. Unlike the mean, the median is not influenced by outliers at
the extremes of the data set. For this reason, the median often is used
when there are a few extreme values that could greatly influence the
mean and distort what might be considered typical. This is often the
case with home prices and with income data for a group of people,
which are often highly skewed. For such data, the median is often
reported instead of the mean. For example, in a group of people, if the
salary of one person is 10 times the mean, the mean salary of the
group will be higher because of the unusually large salary. In this
case, the median may better represent the typical salary level of the
group.
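A short Python sketch of this outlier effect, using the standard
statistics module (the salary figures are made up for illustration):

import statistics

salaries = [30_000, 32_000, 35_000, 38_000, 300_000]  # one unusually large salary
print(statistics.mean(salaries))    # 87000 - pulled up by the outlier
print(statistics.median(salaries))  # 35000 - unaffected by the outlier
print(statistics.median([1, 2, 3, 4]))  # 2.5 - mean of the two middle points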
Mode
The mode is the most frequently occurring value in the data set. For
example, in the data set {1, 2, 3, 4, 4}, the mode is equal to 4. A data
set can have more than a single mode, in which case it is multimodal.
In the data set {1, 1, 2, 3, 3} there are two modes: 1 and 3.
The mode can be very useful for dealing with categorical data. For
example, if a sandwich shop sells 10 different types of sandwiches,
the mode would represent the most popular sandwich. The mode also
can be used with ordinal, interval, and ratio data. However, in interval
and ratio scales, the data may be spread thinly with no data points
having the same value. In such cases, the mode may not exist or may
not be very meaningful.
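The same ideas can be illustrated with Python's statistics.multimode
function, available in Python 3.8 and later (the sandwich names and
sales below are invented):

import statistics

# Categorical data: one day of sales at a hypothetical sandwich shop
sales = ["club", "blt", "club", "veggie", "club", "blt"]
print(statistics.multimode(sales))            # ['club'] - the most popular sandwich
print(statistics.multimode([1, 1, 2, 3, 3]))  # [1, 3] - a multimodal data set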
3. What Is A Hypothesis? Explain The Steps In Hypothesis Testing?
Introduction
A hypothesis is a specific, testable prediction. It describes in
concrete terms what you expect will happen in a certain circumstance. A
hypothesis is not a written conclusion; it is a presumption that is
made before any research is done.
Definition
William Goode and Paul Hatt define hypothesis as a proposition,
which can be put to a test to determine its validity.
G.A. Lundberg defines hypothesis as a tentative generalization, the
validity of which remains to be tested.
A hypothesis can also be defined as an unproved theory, proposition,
supposition, etc., tentatively accepted to explain certain facts or to
provide a basis for further investigation, argument, etc.
PURPOSE OF HYPOTHESIS
A hypothesis is used in an experiment to define the relationship
between two variables. The purpose of a hypothesis is to find the
answer to a question. A formalized hypothesis will force us to think
about what results we should look for in an experiment. The first
variable is called the independent variable. This is the part of the
experiment that can be changed and tested. The independent variable
happens first and can be considered the cause of any changes in the
outcome. The outcome is called the dependent variable. For example,
suppose you predict that if you do not study for a test, you will
receive a low score. The independent variable in this example is not
studying for the test. The dependent variable that you are using to
measure the outcome is your test score.
Let's use the previous example again to illustrate these ideas. The
hypothesis is testable because you will receive a score on your test
performance. It is measurable because you can compare test scores
received from when you did study and test scores received from when
you did not study.
A hypothesis should always:
Explain what you expect to happen
Be clear and understandable
Be testable
Be measurable
Steps in Hypothesis Testing
For example, suppose the manager of a pipe manufacturing
facility must ensure that the diameters of its pipes equal 5 cm. The
manager follows the basic steps for doing a hypothesis test.
NOTE
We should determine the criteria for the test and the required sample
size before we collect the data.
1. Specify the hypotheses.
First, the manager formulates the hypotheses. The null hypothesis is:
the population mean of all the pipes is equal to 5 cm. Formally, this
is written as: H0: μ = 5
Then, the manager chooses from the following alternative hypotheses:
The condition to test and the corresponding alternative hypothesis are:
The population mean is less than the target: H1: μ < 5
The population mean is greater than the target: H1: μ > 5
The population mean differs from the target: H1: μ ≠ 5
Because they need to ensure that the pipes are neither larger nor
smaller than 5 cm, the manager chooses the two-sided alternative
hypothesis, which states that the population mean of all the pipes is
not equal to 5 cm. Formally, this is written as H1: μ ≠ 5
2. Determine the power and sample size for the test.
Continuous Data
You will have continuous data when you evaluate the mean, median,
standard deviation, or variance.
When you measure a characteristic of a part or process, such as
length, weight, or temperature, you usually obtain continuous data.
Continuous data often includes fractional (or decimal) values.
For example, a quality engineer wants to determine whether the mean
weight differs from the value stated on the package label (500 g). The
engineer samples cereal boxes and records their weights.
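A sketch of how such a test might be run in Python, using scipy's
one-sample t-test (the box weights below are invented):

from scipy import stats

# Hypothetical weights (in grams) of sampled cereal boxes
weights = [498.2, 501.5, 499.8, 502.1, 497.6, 500.4, 499.1, 501.0]
result = stats.ttest_1samp(weights, popmean=500)
# A p-value below the chosen significance level (e.g. 0.05) would
# suggest that the mean weight differs from the labelled 500 g.
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.3f}")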
Binomial Data
You will have binomial data when you evaluate a proportion or a
percentage.
When you classify an item, event, or person into one of two
categories, you obtain binomial data. The two categories should be
mutually exclusive, such as yes/no, pass/fail, or
defective/non-defective.
For example, engineers examine a sample of bolts for severe cracks
that make the bolts unusable. They record the number of bolts that are
inspected and the number of bolts that are rejected. The engineers
want to determine whether the percentage of defective bolts is less
than 0.2%.
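A sketch of this one-sided test of a proportion, using scipy's exact
binomial test (the inspection counts below are invented):

from scipy.stats import binomtest

# Hypothetical results: 3 rejected bolts out of 5000 inspected
result = binomtest(k=3, n=5000, p=0.002, alternative="less")
# A small p-value would support the claim that the defective
# rate is below 0.2%.
print(f"p = {result.pvalue:.4f}")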
Poisson Data
You will have Poisson data when you evaluate a rate of occurrence.
When you count the presence of a characteristic, result, or activity
over a certain amount of time, area, or other length of observation,
you obtain Poisson data. Poisson data are evaluated in counts per unit,
with the units the same size.
For example, inspectors at a bus company count the number of bus
breakdowns each day for 30 days. The company wants to determine
the daily rate of bus breakdowns.
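For this example the estimated rate is simply the mean count per unit
of time; a minimal Python sketch (the breakdown counts below are
invented):

import statistics

# Hypothetical number of bus breakdowns on each of 30 days
breakdowns = [2, 0, 1, 3, 1, 0, 2, 1, 1, 0, 2, 3, 1, 0, 1,
              2, 1, 0, 1, 2, 0, 1, 3, 1, 0, 2, 1, 1, 0, 2]
rate = statistics.mean(breakdowns)  # breakdowns per day
print(f"Estimated daily breakdown rate: {rate:.2f}")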
About the Null and Alternative hypotheses
A hypothesis test examines two opposing hypotheses about a
population: the null hypothesis and the alternative hypothesis. How
you set up these hypotheses depends on what you are trying to show.
Null hypothesis (H0)
The null hypothesis states that a population parameter is equal to a
value. The null hypothesis is often an initial claim that researchers
specify using previous research or knowledge.
Alternative Hypothesis (H1)
The alternative hypothesis states that the population parameter is
different from the value specified in the null hypothesis. The
alternative hypothesis is what you might believe to be
true or hope to prove true.
When you do a hypothesis test, two types of errors are possible; they
are as follows:
Type I
Type II
The risks of these two errors are inversely related and determined by
the level of significance and the power for the test. Therefore, you
should determine which error has more severe consequences for your
situation before you define their risks.
No hypothesis test is 100% certain. Because the test is based on
probabilities, there is always a chance of drawing an incorrect
conclusion.
Type I error
When the null hypothesis is true and you reject it, you make a type I
error. The probability of making a type I error is α (alpha), which is
the level of significance you set for your hypothesis test. An α of
0.05 indicates that you are willing to accept a 5% chance of being
wrong when you reject the null hypothesis. To lower this risk, you must
use a lower value for α.
Type II error
When the null hypothesis is false and you fail to reject it, you make a
type II error. The probability of making a type II error is β (beta),
which depends on the power of the test. You can decrease your risk of
committing a type II error by ensuring that the test has enough power,
for example by using a sample size large enough to detect a practical
difference when one truly exists.
The four possible outcomes of a test can be summarized as follows:
When the null hypothesis is true:
Fail to reject: correct decision (probability = 1 - α)
Reject: type I error (probability = α)
When the null hypothesis is false:
Fail to reject: type II error (probability = β)
Reject: correct decision (probability = 1 - β, the power of the test)
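As a quick numerical illustration (a sketch that assumes normally
distributed measurements; the mean, standard deviation, and sample size
are invented), a simulation shows that a test at α = 0.05 wrongly
rejects a true null hypothesis about 5% of the time:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trials = 10_000
rejections = 0
for _ in range(trials):
    # H0 is true here: the population mean really is 5
    sample = rng.normal(loc=5, scale=0.1, size=20)
    result = stats.ttest_1samp(sample, popmean=5)
    rejections += result.pvalue < 0.05
print(rejections / trials)  # close to 0.05, the type I error rate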
4. What Is Interpretation Of Data?
Introduction
Interpretation is part of daily life for most people. Interpretation is
the process of making sense of numerical data that has been collected,
analyzed, and presented. There are two types of data interpretation:
Quantitative data interpretation
Qualitative data interpretation
Quantitative Interpretation
Quantitative interpretation deals with numerical data, typically
gathered through measurements, surveys, or controlled experiments and
summarized with statistical techniques such as the measures of central
tendency discussed above.
Qualitative Interpretation
Certain academic disciplines, such as sociology, anthropology, and
women's studies, rely heavily on the collection and interpretation of
qualitative data. Researchers seek new knowledge and insight into
phenomena such as the stages of grief following a loss.
Instead of controlled experiments, data are collected through
techniques such as field observations or personal interviews of
research subjects, which are recorded and transcribed. Social
scientists study field notes or look for themes in transcriptions to
make meaning out of the data.
Cultural Background
In 1968, researchers Marshall Segall et al. presented a series of optical
illusions to people of different cultural groups. The conclusion
reached was that different groups perceived the illusions in various
ways. This experiment illustrated that a person's cultural background
influences how data is interpreted.
Significance of interpretation of data
Interpretation is essential for the simple reason that the usefulness
and utility of research findings lie in proper interpretation. It is
considered a basic component of the research process for the following
reasons:
1. It is through interpretation that the researcher can properly
understand the abstract principle that works beneath his
findings. Through this he can link his findings with those of
other studies having the same abstract principle, and thereby
make predictions about the concrete world of events. Fresh
inquiries can test these predictions later on. In this way the
continuity of research can be maintained.
2. Interpretation leads to the establishment of explanatory concepts
that can serve as a guide for future research studies; it opens
new avenues of intellectual adventure and stimulates the quest
for more knowledge.
Precautions in Interpretation
The researcher should observe the following precautions while
interpreting the results of a study:
1. At the outset, the researcher must satisfy himself that the data
are appropriate, trustworthy and adequate for drawing inferences,
that they reflect good homogeneity, and that they have been
properly analysed.
2. The researcher must remain cautious about the errors that can
arise in the process of interpreting results. Errors can
arise due to false generalization and/or due to wrong
interpretation of statistical measures, such as the application of
findings beyond the range of observations, the identification of
correlation with causation, and the like. Another major pitfall is
the tendency to affirm that definite relationships exist on the
basis of confirmation of particular hypotheses. In fact,
positive test results accepting the hypothesis must be interpreted
as being in accord with the hypothesis, rather than as
confirming the validity of the hypothesis. The researcher must
remain vigilant about all such things so that false generalization
does not take place. He should be well equipped with, and must
know the correct use of, statistical measures for drawing
inferences concerning his study.
3. He must always keep in view that the task of interpretation is
very much intertwined with analysis and cannot be distinctly
separated. As such he must take the task of interpretation as a
special aspect of analysis and accordingly must take all those
precautions that one usually observes while going through the
process of analysis, viz., precautions concerning the reliability of
data, computational checks, validation and comparison of
results.
4. He must never lose sight of the fact that his task is not only to
make sensitive observations of relevant occurrences, but also to
identify and disengage the factors that are initially hidden to the
eye. This will enable him to do his job of interpretation on
proper lines. Broad generalization should be avoided, as most
research is not amenable to it because the coverage may be
restricted to a particular time, a particular area and particular
conditions. Such restrictions, if any, must invariably be specified
and the results must be framed within their limits.
5. The researcher must remember that, ideally, in the course of a
research study there should be constant interaction between the
initial hypothesis, empirical observation and theoretical
conceptions. It is in this area of interaction between theoretical
orientation and empirical observation that opportunities for
originality and creativity lie.