
SH1606

PRESENTATION AND ANALYSIS OF BUSINESS DATA


Data refers to factual information in raw or unorganized form. There are two (2) types of
data, namely:

Qualitative Data is a categorical measurement expressed not in terms of numbers, but
rather by means of a natural language description.

Quantitative Data is information about quantities; that is, information that can be
measured and written down with numbers.

Statistics is the branch of mathematics that focuses on collecting, organizing, analyzing, and
interpreting data.
The initial step in analyzing business data is gathering data. Data can be gathered from either
primary sources or secondary sources. Gathering data from primary sources means the person
doing the research gathers the data him/herself, while gathering from secondary sources means
that the person doing the research uses data gathered by somebody else.
Data in their original form, before being organized, are called raw data. Once gathered, the data are
assembled and presented in a way that makes them easier to handle and interpret. They may be shown in a graph, a
chart, or a table, whichever is applicable and can be easily understood.

I. Table
A table is a set of facts or figures systematically displayed or organized, especially in
columns. Tables are easily constructed using a word processor's table function or a
spreadsheet program like Excel.
Tables have distinct parts. A statistical table has at least four (4) major parts and
some other minor parts.
1. A title is the main heading shown at the top of the table. It must explain the contents
of the table and throw light on the table as a whole. Different parts of the heading can
be separated by commas.
2. The vertical headings and subheadings of the columns are called column captions. The
space where these column headings are written is called the box head.
3. The horizontal headings and subheadings of the rows are called row captions, and the
space where these row headings are written is called the stub.
4. The main part of the table that contains the numerical information classified with
respect to row and column captions is called the body.
5. (Optional) A statement given below the title and enclosed in brackets, which usually
describes the units of measurement, is called the prefatory notes.
6. (Optional) Footnotes appear immediately below the body of the table and provide
additional explanations.
7. (Optional) The Source Note is given at the end of the table indicating the source from
where information has been taken. It includes the information about compiling
agency, publication, etc.
05 Handout 1

*Property of STI

[Fig. 1.1: Parts of a statistical table. The title and prefatory notes sit above the table; the box head holds the column captions; the stub holds the row captions and stub entries; the body holds the data; footnotes and source notes sit below the table.]
General Rules of Tabulation:

A table should be simple and attractive. There should be no need for further explanation.

There should be proper and clear headings for columns and rows.

Suitable approximation may be adopted and figures may be rounded.

The unit of measurement should be well defined.

If the observations are large in number, they can be broken into two (2) or three (3)
tables.

Thick lines should be used to separate the data under big classes and thin lines to separate
the subclasses of data.

II. Frequency Distribution


The frequency ($f$) of a particular observation is the number of times the observation
occurs in the data. A frequency distribution table is the organization of raw data in table
form, using classes and frequencies. A frequency distribution shows the actual
number of observations in each class, while a relative frequency ($rf$) distribution shows the observations
as a percentage of the total number of observations. The total number of observations is
referred to as the sample size ($n$). A categorical frequency distribution table is used to
record frequencies of qualitative data. An ungrouped frequency distribution table is used for
quantitative data that can be enumerated with a sufficiently small number of categories.
If there is a large number of categories, a grouped frequency distribution table is constructed to
make the task more manageable and to save time in calculating different statistics.
Here, data are clustered into intervals.
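As a small illustration of frequency and relative frequency for qualitative data, the sketch below tallies a made-up list of payment methods. The data and the variable names (`responses`, `relative_frequency`) are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter

# Hypothetical survey responses (qualitative data); values are illustrative only.
responses = ["cash", "card", "cash", "e-wallet", "card", "cash", "cash", "e-wallet"]

n = len(responses)              # sample size
frequency = Counter(responses)  # number of times each observation occurs

# Relative frequency: each frequency expressed as a percentage of the sample size.
relative_frequency = {category: 100 * f / n for category, f in frequency.items()}

for category, f in frequency.most_common():
    print(f"{category:10s} f = {f}   relative frequency = {relative_frequency[category]:.1f}%")
```

The relative frequencies always add up to 100%, which is a quick check that no observation was missed in the tally.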
In creating a grouped frequency distribution, the following steps are followed:
1. Determine the range. The range ($R$) is the highest data value ($x_{\max}$) minus the lowest data value ($x_{\min}$).
$$R = x_{\max} - x_{\min}$$ (Eq. 2.1)

2. Determine the width of the intervals.
$$w = \frac{R}{k}$$; where $w$ = width of each interval and $k$ = number of intervals (Eq. 2.2)

3. Construct the intervals.


4. Tally the data and find the numerical frequencies.
5. Find the frequency for each interval by adding the frequencies of categories falling under
the interval.
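The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are invented, the number of intervals `k = 5` is an arbitrary choice, the width is taken as the range divided by `k` and rounded up to a whole number, and the intervals use inclusive integer limits.

```python
import math

# Hypothetical raw data (e.g., units sold per day); values are illustrative only.
data = [12, 15, 21, 24, 27, 30, 33, 35, 38, 41,
        44, 46, 18, 22, 29, 36, 25, 31, 40, 28]
k = 5  # assumed number of intervals; the handout does not prescribe a value

# Step 1: range = highest value minus lowest value (Eq. 2.1).
data_range = max(data) - min(data)

# Step 2: interval width, taken here as range / k rounded up to a whole number.
width = math.ceil(data_range / k)

# Step 3: construct intervals with inclusive integer limits, starting at the minimum.
intervals = []
lower = min(data)
while lower <= max(data):
    intervals.append((lower, lower + width - 1))
    lower += width

# Steps 4-5: tally the observations that fall inside each interval.
frequencies = [sum(1 for x in data if lo <= x <= hi) for lo, hi in intervals]

for (lo, hi), f in zip(intervals, frequencies):
    print(f"{lo:>3} - {hi:<3} {'|' * f:<6} f = {f}")
```

If the range happens to be an exact multiple of `k`, the loop adds one more interval so that the highest value is still covered.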

III. Graphs
A table can be presented in a way that the data can be easily recognized even when
not in the form of numbers. There are several types of charts/graphs, among which are:
1. Line Chart/Graph. Line graphs are most suitable when you are comparing one
value as it changes with another value.
2. Bar Chart/Graph. Bar charts show data in the form of bars that illustrate the
relationship between the items of information in terms of size: the bars get taller as
the amounts being shown increase. When the bars touch, they show continuous data.
In such cases, these charts are called histograms.
3. Pie Chart. A pie chart is a diagram in the form of a circle, with proportions of the
circle clearly marked. A pie chart is a good method of representation if we wish to
compare a part of a group with the whole group. It gives an immediate idea of the
relative sizes of the shares.
Frequency distributions can be well-represented with the use of graphs and charts.
A histogram is a type of vertical bar graph in which the bars represent grouped
continuous data. A histogram has no spaces between the bars. The quantitative data is
grouped according to a determined interval. The width of each bar represents the width of
the intervals.
Another type of graph that can represent the same set of data as a
histogram is a frequency polygon. A frequency polygon is a graph
constructed by using lines to join the midpoints of the intervals. The heights of the points
represent the frequencies. A frequency polygon can be created by calculating the
midpoints of the intervals, referred to as classmarks, from the frequency distribution table.
The midpoint ($x_i$) of the $i$-th interval is calculated by adding the upper and lower boundary
values of the interval and dividing the sum by 2.
$$x_i = \frac{l_i + u_i}{2}$$; where $l_i$ = lower boundary value and $u_i$ = upper boundary value of the interval (Eq. 3.1)
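To make the midpoint calculation concrete, the sketch below computes the classmark of each interval and pairs it with the interval's frequency, which gives the points that a frequency polygon would join. The intervals and frequencies are hypothetical.

```python
# Hypothetical grouped frequency distribution: (lower, upper) values and frequencies.
intervals = [(12, 18), (19, 25), (26, 32), (33, 39), (40, 46)]
frequencies = [3, 4, 5, 4, 4]

# Classmark (Eq. 3.1): half the sum of the lower and upper values of the interval.
class_marks = [(lower + upper) / 2 for lower, upper in intervals]

# Each point of the frequency polygon is (classmark, frequency).
polygon_points = list(zip(class_marks, frequencies))

for x, f in polygon_points:
    print(f"classmark = {x:5.1f}   frequency = {f}")
```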
A qualitative variable can be plotted using a bar graph. A bar graph is a plot made of
bars whose heights represent the frequencies of each category. While a bar graph and a
histogram are similar in that there is one (1) bar for each
category, a bar graph has space between the bars, and the data plotted is discrete
data (data which can be counted). Each category is represented by intervals of the same
width. When constructing a bar graph, the category is usually placed on the horizontal
axis, and the frequency is usually placed on the vertical axis.
General Rules in Making a Chart/Graph
1. Check the data. If your data is weak, the graph is weak, so make sure your data makes
sense.
2. Explain the encodings. Provide a legend, label shapes directly, or describe your
graphic in a lead-in paragraph.
3. Label axes. Labels must communicate to readers what scale points are plotted on.
4. Include units. Indicate the unit of numbers (example: hours, thousands of pesos, units
of length, etc.)
5. Keep your geometry in check. Use drawing and measuring tools (like compasses,
protractors, rulers, etc.) in forming shapes and lines.
6. Include your sources. Include where the data is from.
7. Consider your audience. Take into account who and what your graphs and charts are
for.

IV. Measures of Central Tendency

Besides graphs, another summary measure that attempts to describe a whole set of
data with a single value is the measure of central tendency. There are three (3)
measures of central tendency that represent the middle or the center of a distribution of
frequencies, namely the mean, the median, and the mode.
Ungrouped Data
Ungrouped data can easily give the measures of central tendency. The mean ($\bar{x}$) is
also known as the arithmetic average. Assuming a sample size $n$, to find the mean ($\bar{x}$),
the items or observations are added and then the sum is divided by the sample size $n$.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$ (Eq. 4.1)

When $n$ is odd, the median ($\tilde{x}$) is the middle value in a distribution when values are
arranged in ascending order. If $n$ is even, $\tilde{x}$ is the average of the two middle values.
The mode ($\hat{x}$) is the observation that appears the most number of times in a
distribution, or basically, the data with the highest frequency. There can be more than one
(1) mode. If the frequencies of the observations are all equal, then a mode does not
exist.
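For ungrouped data, the three measures can be checked quickly with Python's standard `statistics` module. The `sales` list below is made up for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical ungrouped observations; values are illustrative only.
sales = [12, 15, 15, 18, 20, 22, 15, 20]

n = len(sales)             # sample size
x_bar = mean(sales)        # sum of the observations divided by n (Eq. 4.1)
x_med = median(sales)      # n is even here, so the two middle values are averaged
modes = multimode(sales)   # list of the most frequent value(s)

print(f"n = {n}, mean = {x_bar}, median = {x_med}, mode(s) = {modes}")
```

One caveat: when every observation has the same frequency, `multimode` returns all of the values, whereas the convention in this handout is that no mode exists in that case.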


Grouped Data
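For grouped data, to find the mean, the classmark ($x_i$) of each interval ($l_i$ to $u_i$)
is first determined using Eq. 3.1. Suppose there are $k$ intervals and $f_i$ is the frequency of the $i$-th interval. The mean
is then computed as follows:
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$$; where $n = \sum_{i=1}^{k} f_i$ (Eq. 4.2)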
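A short worked example may help here. The sketch below weights each classmark by its interval's frequency, adds the products, and divides by the total frequency. The grouped table is hypothetical.

```python
# Hypothetical grouped frequency distribution; values are illustrative only.
intervals = [(12, 18), (19, 25), (26, 32), (33, 39), (40, 46)]
frequencies = [3, 4, 5, 4, 4]

# Classmark of each interval (Eq. 3.1).
class_marks = [(lower + upper) / 2 for lower, upper in intervals]

# n is the sum of the frequencies; the mean weights each classmark by its frequency.
n = sum(frequencies)
weighted_sum = sum(f * x for f, x in zip(frequencies, class_marks))
grouped_mean = weighted_sum / n

print(f"n = {n}, sum of f*x = {weighted_sum}, grouped mean = {grouped_mean}")
```

Because every observation in an interval is represented by the classmark, the grouped mean is an approximation of the mean of the raw data.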

The median is in the class where the cumulative frequency reaches half the sum of
the absolute frequencies. The cumulative frequency ($cf_i$) for the $i$-th interval is the sum of the
frequency of that interval and the frequencies of the intervals preceding it. That is,
$$cf_1 = f_1$$ (Eq. 4.3)
$$cf_i = f_i + cf_{i-1}; \quad i > 1$$ (Eq. 4.4)

To determine the median class, $n$ is divided by 2. The interval with cumulative
frequency immediately above the quotient obtained is the median class.
Suppose the median class is the $i$-th interval, that is, ($l_i$ to $u_i$). Before finding the
median $\tilde{x}$, the lower boundary of the median class ($L_B$) is first computed as follows:
$$L_B = \frac{u_{i-1} + l_i}{2}$$ (Eq. 4.5)
Assuming that the width of each interval is equal to $w$, the median is then given by:
$$\tilde{x} = L_B + \left( \frac{\frac{n}{2} - cf_{i-1}}{f_i} \right) w$$ (Eq. 4.6)

To find the mode ($\hat{x}$) of grouped data, the modal class is first identified. It is the
interval with the highest frequency.
Suppose the modal class is the $i$-th interval, that is, ($l_i$ to $u_i$). Then $\hat{x}$ is the classmark
of the modal class ($x_i$) and is solved using Eq. 3.1.
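The grouped median and mode can be traced step by step in code. The sketch below is an illustration under stated assumptions: the table is hypothetical, and Eq. 4.6 is read as the usual grouped-median formula, that is, the lower boundary plus ((n/2 minus the cumulative frequency before the median class) divided by the frequency of the median class) times the interval width. For a median class that is the first interval, a gap of 1 before it is assumed.

```python
from itertools import accumulate

# Hypothetical grouped frequency distribution; values are illustrative only.
intervals = [(12, 18), (19, 25), (26, 32), (33, 39), (40, 46)]
frequencies = [3, 4, 5, 4, 4]

n = sum(frequencies)
cumulative = list(accumulate(frequencies))   # running totals (Eq. 4.3 and Eq. 4.4)
width = intervals[1][0] - intervals[0][0]    # common interval width

# Median class: first interval whose cumulative frequency reaches n / 2.
i = next(idx for idx, cf in enumerate(cumulative) if cf >= n / 2)
lower_i = intervals[i][0]
# Upper value of the preceding interval; a gap of 1 is assumed for the first interval.
upper_prev = intervals[i - 1][1] if i > 0 else lower_i - 1
cf_prev = cumulative[i - 1] if i > 0 else 0

lower_boundary = (upper_prev + lower_i) / 2                                      # Eq. 4.5
grouped_median = lower_boundary + ((n / 2 - cf_prev) / frequencies[i]) * width   # Eq. 4.6

# Modal class: interval with the highest frequency; the mode is its classmark.
m = frequencies.index(max(frequencies))
grouped_mode = (intervals[m][0] + intervals[m][1]) / 2

print(f"median class = {intervals[i]}, lower boundary = {lower_boundary}")
print(f"grouped median = {grouped_median:.2f}, grouped mode = {grouped_mode}")
```

Note that `frequencies.index(max(frequencies))` picks only the first modal class when several intervals share the highest frequency.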

V. Measures of Variability
Variability provides a quantitative measure of the degree to which scores in a
distribution are spread out or clustered together.
The range, as defined by Eq. 2.1, is one gross measure of variability (it relies on
only two scores from the distribution).
The interquartile range ($IQR$) is a measure of variability based on dividing a data set into
quartiles. Quartiles divide a rank-ordered data set into four equal parts. The values that
divide each part are called the first, second, and third quartiles, denoted as $Q_1$, $Q_2$, and $Q_3$,
respectively.


$Q_1$ (lower quartile) is the middle value in the first half of the rank-ordered data set.

$Q_2$ (middle quartile) is the median value in the set.

$Q_3$ (upper quartile) is the middle value in the second half of the rank-ordered data set.

The interquartile range is given by:
$$IQR = Q_3 - Q_1$$ (Eq. 5.1)
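The quartile definitions above translate directly into code: sort the data, split it into a lower and an upper half, and take the median of each half. The helper name `quartiles` and the data are hypothetical. The sketch assumes at least two observations and, when the number of observations is odd, leaves the middle value out of both halves.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves definition."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # first half of the rank-ordered data
    upper_half = ordered[-half:]   # second half (middle value excluded when n is odd)
    return median(lower_half), median(ordered), median(upper_half)

# Hypothetical data set; values are illustrative only.
data = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1   # Eq. 5.1

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Other quartile conventions exist (for example, the interpolation methods used by `statistics.quantiles`), so results from software may differ slightly from this definition.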

The variance is a measure of how close the scores in the data set are to the middle of the
distribution. It is important to distinguish between the variance of a population ($\sigma^2$) and the
variance of a sample ($s^2$). The population includes all members of a defined group. A
part of the population is called a sample.
The variance of the population is the average squared deviation from the population
mean and is given by the formula:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$; where $\mu$ = population mean and $N$ = population size (Eq. 5.2)


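The variance of a sample is defined with a slight difference from Eq. 5.2: the squared deviations are taken from the sample mean, and their sum is divided by $n - 1$.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$; where $\bar{x}$ = sample mean and $n$ = sample size (Eq. 5.3)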
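The only difference between the two variance formulas is the divisor, which the sketch below makes visible by computing both from the same sum of squared deviations. The `scores` list is hypothetical and is treated first as a whole population and then as a sample purely for illustration. Taking the square root of each variance gives the corresponding standard deviation.

```python
from math import sqrt
from statistics import pvariance, variance

# Hypothetical scores; values are illustrative only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean_value = sum(scores) / n
sum_of_squares = sum((x - mean_value) ** 2 for x in scores)  # squared deviations from the mean

population_variance = sum_of_squares / n        # divide by N (Eq. 5.2)
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1 (Eq. 5.3)

population_sd = sqrt(population_variance)       # square root of the variance
sample_sd = sqrt(sample_variance)

print(f"population variance = {population_variance}, sample variance = {sample_variance:.4f}")
print(f"population sd = {population_sd}, sample sd = {sample_sd:.4f}")

# Cross-check against the standard library implementations.
assert abs(pvariance(scores) - population_variance) < 1e-12
assert abs(variance(scores) - sample_variance) < 1e-12
```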


The standard deviation is the square root of the variance. Therefore, for the
population, the standard deviation is
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$ (Eq. 5.4)

and the standard deviation of a sample is
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$ (Eq. 5.5)

VI. Tests of Significant Difference


Steps in Hypothesis Testing
1. Formulate the null and alternative hypotheses
Every test of significance begins with a null hypothesis $H_0$. $H_0$ represents a
theory that has been put forward, either because it is believed to be true or because it
is to be used as a basis for argument, but has not been proven. For example, in a
clinical trial of a new drug, the null hypothesis might be that the new drug is no
better, on average, than the current drug. We would write $H_0$: there is no difference
between the two (2) drugs on average.
The alternative hypothesis, $H_a$, is a statement of what a statistical hypothesis test
is set up to establish. For example, in a clinical trial of a new drug, the alternative
hypothesis might be that the new drug has a different effect, on average, compared to
that of the current drug. We would write $H_a$: the two (2) drugs have different effects,
on average. The alternative hypothesis might also be that the new drug is better, on
average, than the current drug. In this case we would write $H_a$: the new drug is better
than the current drug, on average.
Hypotheses are always stated in terms of a population parameter, such as the
population mean $\mu$. An alternative hypothesis may be one-tailed or two-tailed. A
one-tailed hypothesis claims that a parameter is either larger or smaller than the value
given by the null hypothesis. A two-tailed hypothesis claims that a parameter is
simply not equal to the value given by the null hypothesis; the direction does not
matter.
Hypotheses for a one-tailed test for a population mean take the following form:

$H_0: \mu \le \mu_0$
$H_a: \mu > \mu_0$ (upper-tailed)

or

$H_0: \mu \ge \mu_0$
$H_a: \mu < \mu_0$ (lower-tailed)

Hypotheses for a two-tailed test for a population mean take the following form:

$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$

2. Collect data and decide on an appropriate statistical testing procedure. Specify the
level of significance $\alpha$.
A researcher who uses sample data to weigh the evidence against $H_0$ arrives
at one of these two (2) conclusions:

reject $H_0$ in favor of $H_a$

do not reject $H_0$

A statistical test of hypothesis must first be conducted. It is a method used to
decide whether to reject or fail to reject $H_0$. A one-tailed test is used to test $H_0$
against a one-tailed $H_a$. A two-tailed test is used to test $H_0$ against a two-tailed $H_a$.


In a statistical test, four (4) possibilities may exist when a researcher makes a
decision:

Decision             | Fact: $H_0$ is true | Fact: $H_0$ is false
---------------------|---------------------|---------------------
Do not reject $H_0$  | Correct decision    | Type II error
Reject $H_0$         | Type I error        | Correct decision

Table 6.1
A type I error is the error of rejecting $H_0$ when, in fact, it is true. A type II error
is the error of not rejecting $H_0$ when, in fact, it is false.
The probability of committing a type I error is given by
$$\alpha = P(\text{Type I error})$$ (Eq. 6.1)

known as the level of significance. Relating it to estimation, a confidence level of
$1 - \alpha$ is the probability of making the correct decision of not rejecting a true $H_0$.
The probability of committing a type II error is given by
$$\beta = P(\text{Type II error})$$ (Eq. 6.2)

which measures the risk of accepting a false $H_0$.
The researcher should ensure that these errors are small. Usually, the level of
significance is set to 0.05 (a 5 in 100 chance) or 0.01 (a 1 in 100 chance).
To reduce the probabilities of committing both a type I and a type II error, the
sample size must be increased.
3. Compute the test statistic or the probability value ($p$-value).
A test statistic is a numerical value computed from sample data that is sensitive
to the differences between $H_0$ and $H_a$.
The significance probability, or $p$-value, is the probability of obtaining a test
statistic value at least as extreme as the one observed, in the direction of $H_a$, assuming that $H_0$ is true. It is the lowest level of
significance at which the test statistic value is significant.
z-test
Suppose a random sample of size $n$ is drawn from a population with an unknown
population mean $\mu$ and a known standard deviation $\sigma$, or suppose the sample size is
large ($n > 30$). You want to test the null hypothesis $H_0$, which assumes that the
unknown population mean $\mu$ is equal to some hypothesized population mean $\mu_0$. That
is, you test
$H_0: \mu = \mu_0$
against one (1) of the three (3) possible alternatives:
a. $H_a: \mu \neq \mu_0$
b. $H_a: \mu > \mu_0$
c. $H_a: \mu < \mu_0$
The test statistic $z$ is given by
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$; where $\bar{x}$ = sample mean (Eq. 6.3)

When the sample size is large, $\sigma$ can be estimated by the sample standard deviation $s$.
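As a rough illustration of the z-test, the sketch below computes the test statistic and the corresponding p-value from the standard normal distribution in Python's `statistics` module. The function name `z_test`, its parameters, and the numbers in the example call are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu_0, sigma, n, tail="two"):
    """Return (z, p_value) for a one-sample z-test; tail is 'two', 'upper', or 'lower'."""
    # Test statistic: distance of the sample mean from mu_0 in standard-error units.
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        p_value = 1 - standard_normal.cdf(z)
    elif tail == "lower":
        p_value = standard_normal.cdf(z)
    else:
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return z, p_value

# Hypothetical example: n = 36, sample mean 52, hypothesized mean 50, known sigma 6.
z, p = z_test(sample_mean=52.0, mu_0=50.0, sigma=6.0, n=36, tail="two")
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {p < alpha}")
```

Comparing the p-value with the level of significance is equivalent to checking whether the test statistic falls in the critical region.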


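t-test
Suppose a random sample of size $n$ is drawn from a population with an unknown
population mean $\mu$ and an unknown population standard deviation $\sigma$, and $n$ is small ($n \le 30$).
The test statistic for the population mean, with $n - 1$ degrees of freedom, is
given by
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$; where $s$ = sample standard deviation (Eq. 6.4)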
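The t statistic can be computed the same way, with the sample standard deviation in place of the population value. In the sketch below, the sample and the hypothesized mean `mu_0` are invented, and the critical value 2.262 is the tabulated two-tailed value for a 0.05 level of significance with 9 degrees of freedom, entered by hand because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical small sample (n <= 30); values are illustrative only.
sample = [48.2, 51.5, 49.8, 52.1, 50.6, 47.9, 53.0, 50.2, 49.5, 51.2]
mu_0 = 49.0                      # hypothesized population mean (assumed)

n = len(sample)
x_bar = mean(sample)             # sample mean
s = stdev(sample)                # sample standard deviation (divides by n - 1)
t = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1                       # degrees of freedom

critical_t = 2.262               # t-table value for alpha = 0.05, two-tailed, df = 9
reject_h0 = abs(t) > critical_t

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up numbers the statistic comes out at roughly 2.72, which lies beyond the critical value, so the null hypothesis would be rejected at the 0.05 level.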


4. Determine the critical region or the rejection region.
The distribution of the entire set of possible values of the test statistic is divided
into two (2) regions: the critical or rejection region/s and the non-rejection region. A
critical or rejection region consists of values that support $H_a$ and lead to the
rejection of $H_0$. A non-rejection region consists of values that support $H_0$ and lead
to its non-rejection.
In a two-tailed test, the critical region consists of values with absolute value
greater than $|z_{\alpha/2}|$ or $|t_{\alpha/2}|$, whichever test is used.
In a one-tailed test, if $H_a$ involves the quantifier >, the critical region consists of
values greater than $|z_{\alpha}|$ or $|t_{\alpha}|$, whichever test is used. If $H_a$ involves the quantifier
<, then the critical region consists of values less than $-|z_{\alpha}|$ or $-|t_{\alpha}|$, whichever test is
used.
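The table below summarizes the critical regions and decision rules for one-tailed
and two-tailed z-tests:

Hypothesis $H_a$:   | Decision Rule: Reject $H_0$ if
--------------------|------------------------------------------
$\mu \neq \mu_0$    | $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
$\mu > \mu_0$       | $z > z_{\alpha}$
$\mu < \mu_0$       | $z < -z_{\alpha}$
Otherwise, fail to reject $H_0$.

Table 6.2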
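The table below summarizes the critical regions and decision rules for one-tailed
and two-tailed t-tests:

Hypothesis $H_a$:   | Decision Rule: Reject $H_0$ if
--------------------|------------------------------------------
$\mu \neq \mu_0$    | $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
$\mu > \mu_0$       | $t > t_{\alpha}$
$\mu < \mu_0$       | $t < -t_{\alpha}$
Otherwise, fail to reject $H_0$.

Table 6.3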
5. Make a decision or conclusion about the hypothesis.
The test statistic, or the $p$-value, serves as the basis for deciding whether to
reject or not to reject $H_0$.
If the test value falls in the critical region, then it leads to the rejection of $H_0$.
Otherwise, it leads to non-rejection of $H_0$.
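The decision step can be written as a small function. The sketch below handles the z-test only, because the standard normal critical value is available from `statistics.NormalDist`; for a t-test the critical value would come from a t-table instead. The function name `reject_null` and its arguments are hypothetical.

```python
from statistics import NormalDist

def reject_null(z, alpha, tail):
    """Return True when z falls in the critical region; tail is 'two', 'upper', or 'lower'."""
    standard_normal = NormalDist()
    if tail == "two":
        critical = standard_normal.inv_cdf(1 - alpha / 2)   # z value for alpha / 2
        return z < -critical or z > critical
    critical = standard_normal.inv_cdf(1 - alpha)           # z value for alpha
    if tail == "upper":
        return z > critical
    if tail == "lower":
        return z < -critical
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

# Hypothetical test values checked against a 0.05 level of significance.
for z_value, tail in [(2.0, "two"), (1.8, "two"), (1.5, "upper"), (-1.7, "lower")]:
    decision = "reject H0" if reject_null(z_value, 0.05, tail) else "fail to reject H0"
    print(f"z = {z_value:5.2f}, {tail}-tailed: {decision}")
```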


VII. Tables for z-test and t-test


