
Mathematics

Grade 12

Mathematics
Copyright 2013 by 3G Elearning FZ LLC

3G Elearning FZ LLC
UAE
www.3gelearning.com
ISBN: 978-93-5115-015-2

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise without prior written permission of the publisher.
Reasonable efforts have been made to publish reliable data and information, but the authors, editors,
and the publisher cannot assume responsibility for the legality of all materials or the consequences of
their use. The authors, editors, and the publisher have attempted to trace the copyright holders of all
material in this publication and express regret to copyright holders if permission to publish has not
been obtained. If any copyright material has not been acknowledged, let us know so we may rectify in
any future reprint. Registered trademark of products or corporate names are used only for explanation
and identification without intent to infringe.
*Case Studies and/or Images presented in the book are the proprietary information of the respective
organizations, and have been used here specifically and only for educational purposes.
For more information visit:

www.3gu.edu.in

www.iimts.com

3G LEARNING
www.3gelearning.com

Preface
From prekindergarten through grade 12, the school mathematics curriculum includes important topics that are pivotal in students' development. Students who understand these ideas cross smoothly into
new mathematical terrain and continue moving forward with assurance. However, many of these
topics have traditionally been challenging to teach as well as learn, and they often prove to be barriers rather than gateways to students' progress. Students who fail to get a solid grounding in them
frequently lose momentum and struggle in subsequent work in mathematics and related disciplines.
The Essential Understanding Series identifies such topics at all levels. Teachers who engage students in
these topics play critical roles in students' mathematical achievement. Each volume in the series invites
teachers who aim to be not just proficient but outstanding in the classroom (teachers like you) to enrich their understanding of one or more of these topics to ensure students' continued development in
mathematics.
To teach these challenging topics effectively, you must draw on a mathematical understanding that
is both broad and deep. The challenge is to know considerably more about the topic than you expect
your students to know and learn.
Why does your knowledge need to be so extensive? Why must it go above and beyond what you
need to teach and your students need to learn? The answer to this question has many parts. To plan
successful learning experiences, you need to understand different models and representations and, in
some cases, emerging technologies as you evaluate curriculum materials and create lessons. As you
choose and implement learning tasks, you need to know what to emphasize and why those ideas are
mathematically important.
While engaging your students in lessons, you must anticipate their perplexities, help them avoid
known pitfalls, and recognize and dispel misconceptions. You need to capitalize on unexpected classroom opportunities to make connections among mathematical ideas. If assessment shows that students
have not understood the material adequately, you need to know how to address weaknesses that you
have identified in their understanding. Your understanding must be sufficiently versatile to allow you
to represent the mathematics in different ways to students who don't understand it the first time.

How to use this Book


This book has been divided into many chapters. Chapter gives the motivation for this book and the use
of templates.
1. Each chapter in the book includes a number of pedagogical aids. The text is presented in the simplest language. Each paragraph has been arranged under a suitable heading for easy retention of concepts. All important formulae, figures and practical steps have been presented with better visibility to grab attention. Each unit has been uniformly organized.

2. Objectives at the beginning of the chapter provide a glimpse of the related issues which have been discussed in the chapter.

3. Key Vocabulary is a technique designed to use the most meaningful words in a child's world to develop literacy. It is a structured process that can be used with individuals or classes to expand reading vocabulary. As a student accumulates a bank of key words, he/she develops confidence as a reader. While they are used primarily for rhetoric, they are also used in a strictly grammatical sense for structural composition, reasoning, and comprehension. Indeed, they are an essential part of any language.

4. Multiple choice questions provide a set of answers from which the respondent must choose. Multiple choice questions are closed questions. It is a form of assessment in which respondents are asked to select the best possible answer (or answers) out of the choices from a list.

5. Review questions at the end of each chapter ask students to review or explain the concepts.

For an easier navigation and understanding, this book contains the complete 3G curriculum of this
subject and the topics.

Introduction
An introduction is a beginning of
section which states the purpose
and goals of the topics which are
discussed in the chapter. It also
starts the topics in brief.

Objectives
Objectives in the beginning of
the chapter provide a glimpse
of related issues which have been
discussed in the chapter.

Key Vocabulary
Key Vocabulary is a technique
designed to use the most meaningful words in a child's world to
develop literacy. It is a structured
process that can be used with individuals or classes to expand
reading vocabulary.

Multiple Choice questions


Multiple choice questions provide a set of answers from which
the respondent must choose. It is
a form of assessment in which respondents are asked to select the
best possible answer (or answers)
out of the choices from a list.

Review Questions
Review questions at the end of
each chapter ask students to review or explain the concepts.

Table of Contents

1. Relations and Functions
1.1 Types of Relations
1.2 Types of Functions
1.3 Composition of Functions and Invertible Function
1.4 Binary Operation

2. Inverse Trigonometric Functions
2.1 Definition to Inverse Trigonometric Functions
2.2 Identify the Domains and Ranges of Inverse Trigonometry Functions
2.3 Graphs of Inverse Trigonometry
2.4 Properties of Inverse Trigonometric Functions

3. Matrices
3.1 Sum of the Matrix
3.2 Matrix Algebra Determination
    3.2.1 Matrix Multiplication
    3.2.2 Inverses of Matrices
    3.2.3 Matrix Transpose
    3.2.4 Determinant of a Matrix
    3.2.5 Using the Inverse Matrix to Solve Equations
3.3 Adjoint of a Matrix
    3.3.1 Finding the Adjoint Matrix

4. Determinant of a Matrix
4.1 The Determinant of a Square Matrix
4.2 Properties of Determinants
4.3 Minors
    4.3.1 Finding the Minor for R2C1
    4.3.2 Finding the Minor for R3C2
4.4 Cofactors
    4.4.1 Matrix of Cofactors
4.5 Expanding to Find the Determinant
4.6 Applications of Matrices and Determinants
    4.6.1 Geometric Technique
    4.6.2 Determinants
4.7 Adjoint of a Matrix
4.8 Inverse of a Square Matrix
4.9 Consistent and Inconsistent Systems
4.10 System of Linear Equations

5. Limit, Continuity, and Differentiability
5.1 Limit
5.2 Neighborhood of a Point on the Real Line
    5.2.1 Limit of a Function
5.3 Different Types of Limits
5.4 Continuity of a Function
5.5 Derivability of a Function
5.6 Differential of a Function

6. Applications of Derivatives
6.1 Rate of Change of Quantities
6.2 Increasing and Decreasing Functions
    6.2.1 Decreasing Functions
    6.2.2 Increasing and Decreasing Function Theorem
    6.2.3 Increasing and Decreasing Function Rules
6.3 First Derivative Test for Increasing and Decreasing Functions
    6.3.1 Increasing and Decreasing Functions Graph
    6.3.2 Increasing and Decreasing Behavior of Some Known Functions
6.4 Tangent and Normal Lines
6.5 Angle between Two Curves
6.7 Approximation of Applications of Derivatives
6.8 Maxima and Minima
    6.8.1 Stationary Points
    6.8.2 Turning Points
    6.8.3 Distinguishing Maximum Points from Minimum Points

7. Vectors
7.1 Scalars and Vectors
    7.1.1 Geometrical Representation of Vectors
    7.1.2 Multiplying a Vector by a Scalar
7.2 Coordinate Systems
    7.2.1 Right-handed Cartesian Coordinate System
    7.2.2 The Dot Product (Scalar Product)
    7.2.3 Distance between a Point and a Line in R2
    7.2.4 The Cross Product (Vector Product)
7.3 Direction Ratios and Direction Cosines
    7.3.1 Direction Ratios and Cosines in Three Dimensions
    7.3.2 Components of a Vector
7.4 Magnitude and Direction of a Vector
    7.4.1 Adding Vectors
    7.4.2 Multiplication by a Scalar
    7.4.3 Subtracting Vectors
    7.4.4 Relative Position Vectors

8. Three-Dimensional Geometry
8.1 Three-Dimensional Space
8.2 Dot Product
8.3 Lines
8.4 The Angle between Two Vectors
8.5 The Cross-product of Two Vectors
8.6 Planes

9. Linear Programming
9.1 Assumptions of Linear Programming
9.2 Formulation of Linear Programming Problems
    9.2.1 Structure of Linear Programming Model
    9.2.2 General Mathematical Model of an LPP
    9.2.3 Guidelines for Formulating Linear Programming Model
9.3 Graphical Method
9.4 Simplex Method
9.5 Two Phase Method
    9.5.1 Elementary Ideas about Duality

10. Probability and Probability Distribution
10.1 Probability
10.2 Probability Distribution
    10.2.1 Calculating Probabilities
10.3 Conditional Probability and Bayes Theorem
    10.3.1 Bayes Theorem

Chapter 1

Relations and Functions

Objectives
After studying this chapter, you will be able to:
Define types of relations
Discuss the types of functions
Discuss composition of functions and invertible function
Define binary operation

INTRODUCTION

Recall that the notion of relations and functions, domain, co-domain and range has been introduced along with different types of specific real valued functions and their graphs. The concept of the term relation in mathematics has been drawn from the meaning of relation in the English language, according to which two objects or quantities are related if there is a recognizable connection or link between the two objects or quantities. Let A be the set of students of Class XII of a school and B be the set of students of Class XI of the same school. Then some of the examples of relations from A to B are
(i) {(a, b) ∈ A × B : a is brother of b},
(ii) {(a, b) ∈ A × B : a is sister of b},
(iii) {(a, b) ∈ A × B : age of a is greater than age of b},
(iv) {(a, b) ∈ A × B : total marks obtained by a in the final examination is less than the total marks obtained by b in the final examination},
(v) {(a, b) ∈ A × B : a lives in the same locality as b}.
However, abstracting from this, we define mathematically a relation R from A to B as an arbitrary subset of A × B.
If (a, b) ∈ R, we say that a is related to b under the relation R and we write as a R b. In general, when (a, b) ∈ R, we do not bother whether there is a recognizable connection or link between a and b. Functions are a special kind of relation.
Here, we will study different types of relations and functions, composition of functions, invertible functions and binary operations.

1.1 TYPES OF RELATIONS


Here, we would like to study different types of relations. We know that a relation in a set A is a subset of A × A. Thus, the empty set φ and A × A are two extreme relations. For illustration, consider a relation R in the set A = {1, 2, 3, 4} given by
R = {(a, b) : a − b = 10}. This is the empty set, as no pair (a, b) satisfies the condition a − b = 10. Similarly, R′ = {(a, b) : |a − b| ≥ 0} is the whole set A × A, as all pairs (a, b) in A × A satisfy |a − b| ≥ 0. These two extreme examples lead us to the following definitions.


Definition 1 A relation R in a set A is called empty relation, if no element of A is related to any element of A, i.e., R = φ ⊂ A × A.
Definition 2 A relation R in a set A is called universal relation, if each element of A is related to every element of A, i.e., R = A × A.
Both the empty relation and the universal relation are sometimes called trivial relations.
Example 1: Let A be the set of all students of a boys' school. Show that the relation R in A given by R = {(a, b) : a is sister of b} is the empty relation and R′ = {(a, b) : the difference between heights of a and b is less than 3 meters} is the universal relation.

Key Vocabulary
Associative: A binary operation ∗ : A × A → A is said to be associative if (a ∗ b) ∗ c = a ∗ (b ∗ c), ∀ a, b, c ∈ A.

Solution: Since the school is a boys' school, no student of the school can be sister of any student of the school. Hence, R = φ, showing that R is the empty relation. It is also obvious that the difference between heights of any two students of the school has to be less than 3 meters. This shows that R′ = A × A is the universal relation.
Remark: We have seen two ways of representing a relation, namely the roster method and the set builder method. However, a relation R in the set {1, 2, 3, 4} defined by R = {(a, b) : b = a + 1} is also expressed as a R b if and only if b = a + 1 by many authors. We may also use this notation, as and when convenient.
If (a, b) ∈ R, we say that a is related to b and we denote it as a R b.

One of the most important relations, which plays a significant role in Mathematics, is an equivalence relation. To study equivalence relation, we first consider three types of relations, namely reflexive, symmetric and transitive.
Definition 3 A relation R in a set A is called
(i) reflexive, if (a, a) ∈ R, for every a ∈ A,
(ii) symmetric, if (a1, a2) ∈ R implies that (a2, a1) ∈ R, for all a1, a2 ∈ A,
(iii) transitive, if (a1, a2) ∈ R and (a2, a3) ∈ R implies that (a1, a3) ∈ R, for all a1, a2, a3 ∈ A.

Definition 4 A relation R in a set A is said to be an equivalence relation if R is reflexive, symmetric and transitive.
Example 2: Let T be the set of all triangles in a plane with R a relation in T given by R = {(T1, T2) : T1 is congruent to T2}. Show that R is an equivalence relation.
Solution: R is reflexive, since every triangle is congruent to itself. Further, (T1, T2) ∈ R ⇒ T1 is congruent to T2 ⇒ T2 is congruent to T1 ⇒ (T2, T1) ∈ R. Hence, R is symmetric. Moreover, (T1, T2), (T2, T3) ∈ R ⇒ T1 is congruent to T2 and T2 is congruent to T3 ⇒ T1 is congruent to T3 ⇒ (T1, T3) ∈ R. Therefore, R is an equivalence relation.

Example 3: Let L be the set of all lines in a plane and R be the relation in L defined as R = {(L1, L2) : L1 is perpendicular to L2}. Show that R is symmetric but neither reflexive nor transitive.
Solution: R is not reflexive, as a line L1 cannot be perpendicular to itself, i.e., (L1, L1) ∉ R. R is symmetric as (L1, L2) ∈ R
⇒ L1 is perpendicular to L2
⇒ L2 is perpendicular to L1
⇒ (L2, L1) ∈ R.

[Figure: the lines L1, L2 and L3]

R is not transitive. Indeed, if L1 is perpendicular to L2 and L2 is perpendicular to L3, then L1 can never be perpendicular to L3. In fact, L1 is parallel to L3, i.e., (L1, L2) ∈ R, (L2, L3) ∈ R but (L1, L3) ∉ R.

Example 4: Show that the relation R in the set {1, 2, 3} given by R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)} is reflexive but neither symmetric nor transitive.
Solution: R is reflexive, since (1, 1), (2, 2) and (3, 3) lie in R. Also, R is not symmetric, as (1, 2) ∈ R but (2, 1) ∉ R. Similarly, R is not transitive, as (1, 2) ∈ R and (2, 3) ∈ R but (1, 3) ∉ R.
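The three conditions of Definition 3 can be checked mechanically when the set is finite. The following is a minimal sketch, assuming the relation is stored as a Python set of ordered pairs; the helper names `is_reflexive`, `is_symmetric` and `is_transitive` are illustrative and not part of the text. It is applied to the relation R of Example 4.

```python
def is_reflexive(A, R):
    # (a, a) must lie in R for every a in A
    return all((a, a) in R for a in A)

def is_symmetric(R):
    # (a, b) in R must imply (b, a) in R
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    # (a, b) in R and (b, d) in R must imply (a, d) in R
    return all((a, d) in R
               for (a, b) in R
               for (c, d) in R
               if b == c)

# The relation of Example 4 on the set {1, 2, 3}
A = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)}

print(is_reflexive(A, R))   # reflexive
print(is_symmetric(R))      # fails because (2, 1) is missing
print(is_transitive(R))     # fails because (1, 3) is missing
```

A relation is an equivalence relation exactly when all three checks succeed, so the same helpers can be reused for the later examples on finite sets.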
Example 5: Show that the relation R in the set Z of integers given by R = {(a, b) : 2 divides a − b} is an equivalence relation.

Solution: R is reflexive, as 2 divides (a − a) for all a ∈ Z. Further, if (a, b) ∈ R, then 2 divides a − b. Therefore, 2 divides b − a. Hence, (b, a) ∈ R, which shows that R is symmetric. Similarly, if (a, b) ∈ R and (b, c) ∈ R, then a − b and b − c are divisible by 2. Now, a − c = (a − b) + (b − c) is even (Why?). So, (a − c) is divisible by 2. This shows that R is transitive. Thus, R is an equivalence relation in Z.
In Example 5, note that all even integers are related to zero, as (0, 2),
(0, 4) etc., lie in R and no odd integer is related to 0, as (0, 1), (0, 3) etc.,
do not lie in R. Similarly, all odd integers are related to one and no even
integer is related to one. Therefore, the set E of all even integers and the set
O of all odd integers are subsets of Z satisfying following conditions:
(i) All elements of E are related to each other and all elements of O are
related to each other.
(ii) No element of E is related to any element of O and vice-versa.
(iii) E and O are disjoint and Z = E ∪ O.
The subset E is called the equivalence class containing zero and is denoted by [0]. Similarly, O is the equivalence class containing 1 and is denoted by [1]. Note that [0] ≠ [1], [0] = [2r] and [1] = [2r + 1], r ∈ Z. In fact, what we have seen above is true for an arbitrary equivalence relation R in a set X. Given an arbitrary equivalence relation R in an arbitrary set X, R divides X into mutually disjoint subsets Ai called partitions or subdivisions of X satisfying:
(i) All elements of Ai are related to each other, for all i.
(ii) No element of Ai is related to any element of Aj, i ≠ j.
(iii) ∪ Aj = X and Ai ∩ Aj = φ, i ≠ j.
The subsets Ai are called equivalence classes. The interesting part of
the situation is that we can go reverse also. For example, consider a subdivision of the set Z given by three mutually disjoint subsets A1, A2 and A3
whose union is Z with
A1 = {x ∈ Z : x is a multiple of 3} = {..., −6, −3, 0, 3, 6, ...}

A2 = {x ∈ Z : x − 1 is a multiple of 3} = {..., −5, −2, 1, 4, 7, ...}

Key Vocabulary
Commutative: A binary operation ∗ on the set X is called commutative, if a ∗ b = b ∗ a, for every a, b ∈ X.


A3 = {x ∈ Z : x − 2 is a multiple of 3} = {..., −4, −1, 2, 5, 8, ...}

Define a relation R in Z given by R = {(a, b) : 3 divides a − b}. Following arguments similar to those used in Example 5, we can show that R is an equivalence relation. Also, A1 coincides with the set of all integers in Z which are related to zero, A2 coincides with the set of all integers which are related to 1 and A3 coincides with the set of all integers in Z which are related to 2. Thus, A1 = [0], A2 = [1] and A3 = [2]. In fact, A1 = [3r], A2 = [3r + 1] and A3 = [3r + 2], for all r ∈ Z.
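The way an equivalence relation splits a set into classes can be sketched on a finite window of Z. The snippet below is only an illustration under stated assumptions: the window −6, ..., 8 and the helper name `equivalence_classes` are choices made here, and the relation is "3 divides a − b" as above.

```python
def equivalence_classes(elements, related):
    """Group elements into classes; `related` is assumed to be an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            # Comparing with one representative is enough for an equivalence relation
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])  # x starts a new class
    return classes

# A finite window of the integers, chosen only for illustration
window = range(-6, 9)
classes = equivalence_classes(window, lambda a, b: (a - b) % 3 == 0)

for cls in classes:
    print(cls)
```

The three lists produced correspond to the parts of A1, A2 and A3 that fall inside the chosen window.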

Key Vocabulary
Equivalence Relation:
A relation R in a set A is
said to be an equivalence
relation if R is reflexive,
symmetric and transitive.

Example 6: Let R be the relation defined in the set A = {1, 2, 3, 4, 5, 6, 7} by R = {(a, b) : both a and b are either odd or even}. Show that R is an equivalence relation. Further, show that all the elements of the subset {1, 3, 5, 7} are related to each other and all the elements of the subset {2, 4, 6} are related to each other, but no element of the subset {1, 3, 5, 7} is related to any element of the subset {2, 4, 6}.
Solution: Given any element a in A, both a and a must be either odd or even, so that (a, a) ∈ R. Further, (a, b) ∈ R ⇒ both a and b must be either odd or even ⇒ (b, a) ∈ R. Similarly, (a, b) ∈ R and (b, c) ∈ R ⇒ all elements a, b, c must be either even or odd simultaneously ⇒ (a, c) ∈ R. Hence, R is an equivalence relation.

Further, all the elements of {1, 3, 5, 7} are related to each other, as all the
elements of this subset are odd. Similarly, all the elements of the subset {2,
4, 6} are related to each other, as all of them are even. Also, no element of
the subset {1, 3, 5, 7} can be related to any element of {2, 4, 6}, as elements of
{1, 3, 5, 7} are odd, while elements of {2, 4, 6} are even.

1.2 TYPES OF FUNCTIONS


The notion of a function along with some special functions like identity
function, constant function, polynomial function, rational function, modulus function, etc. along with their graphs have been given.
Addition, subtraction, multiplication and division of two functions
have also been studied. As the concept of function is of paramount importance in mathematics and among other disciplines as well, we would like
to extend our study about function from where we finished earlier. In this
section, we would like to study different types of functions.
Consider the functions f1, f2, f3 and f4 given by the following diagrams.
In Figure 1.1, we observe that the images of distinct elements of X1 under the function f1 are distinct, but the image of two distinct elements 1 and 2 of X1 under f2 is the same, namely b. Further, there are some elements like e and f in X2 which are not images of any element of X1 under f1, while all elements of X3 are images of some elements of X1 under f3. The above observations lead to the following definitions:
Definition 1: A function f : X → Y is defined to be one-one (or injective), if the images of distinct elements of X under f are distinct, i.e., for every x1, x2 ∈ X, f(x1) = f(x2) implies x1 = x2. Otherwise, f is called many-one.

The functions f1 and f4 in Figure 1.1 (i) and (iv) are one-one and the functions f2 and f3 in Figure 1.1 (ii) and (iii) are many-one.

Definition 2: A function f : X → Y is said to be onto (or surjective), if every element of Y is the image of some element of X under f, i.e., for every y ∈ Y, there exists an element x in X such that f(x) = y.

The functions f3 and f4 in Figure 1.1 (iii), (iv) are onto and the function f1 in Figure 1.1 (i) is not onto as elements e, f in X2 are not the image of any element in X1 under f1.
Figure 1.1: Types of Functions. Panels (i), (ii), (iii) and (iv) show f1, f2, f3 and f4, respectively.


Remark: f : X → Y is onto if and only if Range of f = Y.
Definition 3: A function f : X → Y is said to be one-one and onto (or bijective), if f is both one-one and onto.
The function f4 in Figure 1.1 (iv) is one-one and onto.
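For a function on a finite set, the definitions of one-one, onto and bijective translate directly into checks on its table of values. The following is a minimal sketch, assuming the function is stored as a Python dict from domain elements to images; the helper names and the two sample maps `doubling` and `shift` are hypothetical illustrations, not the functions of Figure 1.1.

```python
def is_one_one(f):
    # Distinct elements must have distinct images
    images = list(f.values())
    return len(set(images)) == len(images)

def is_onto(f, codomain):
    # Every element of the co-domain must be an image
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_one_one(f) and is_onto(f, codomain)

# Hypothetical sample: x -> 2x from {1, 2, 3} into {1, ..., 6}
doubling = {1: 2, 2: 4, 3: 6}
print(is_one_one(doubling), is_onto(doubling, range(1, 7)))

# Hypothetical sample: a cyclic shift of {1, 2, 3}
shift = {1: 2, 2: 3, 3: 1}
print(is_bijective(shift, {1, 2, 3}))
```

The first map is one-one but misses 1, 3 and 5 in its co-domain, so it is not onto; the second is both one-one and onto.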
Example 1: Let A be the set of all 50 students of Class X in a school. Let f : A → N be the function defined by f(x) = roll number of the student x. Show that f is one-one but not onto.
Solution: No two different students of the class can have the same roll number. Therefore, f must be one-one. We can assume without any loss of generality that roll numbers of students are from 1 to 50. This implies that 51 in N is not the roll number of any student of the class, so that 51 cannot be the image of any element of A under f. Hence, f is not onto.
Example 2: Show that the function f : N → N, given by f(x) = 2x, is one-one but not onto.
Solution: The function f is one-one, for f(x1) = f(x2) ⇒ 2x1 = 2x2 ⇒ x1 = x2. Further, f is not onto, as for 1 ∈ N, there does not exist any x in N such that f(x) = 2x = 1.

Example 3: Prove that the function f : R → R, given by f(x) = 2x, is one-one and onto.
Solution: f is one-one, as f(x1) = f(x2) ⇒ 2x1 = 2x2 ⇒ x1 = x2. Also, given any real number y in R, there exists y/2 in R such that f(y/2) = 2 · (y/2) = y. Hence, f is onto.

Key Vocabulary
Function: A function is a
correspondence between
two sets (called the domain and the range) such
that to each element of
the domain, there is assigned exactly one element of the range.

[Figure: graph of y = f(x) = 2x, with axes X and Y]

Example 4: Show that the function f : N → N, given by f(1) = f(2) = 1 and f(x) = x − 1, for every x > 2, is onto but not one-one.

Solution: f is not one-one, as f(1) = f(2) = 1. But f is onto, as given any y ∈ N, y ≠ 1, we can choose x as y + 1 such that f(y + 1) = y + 1 − 1 = y. Also for 1 ∈ N, we have f(1) = 1.

Key Vocabulary
Relation: A relation is a correspondence between two sets (called the domain and the range) such that to each element of the domain, there is assigned one or more elements of the range.

Example 5: Show that the function f : R → R, defined as f(x) = x², is neither one-one nor onto.
Solution: Since f(−1) = 1 = f(1), f is not one-one. Also, the element −2 in the co-domain R is not the image of any element x in the domain R (Why?). Therefore f is not onto.

[Figure: graph of f(x) = x², showing f(−1) = 1 and f(1) = 1 at x = −1 and x = 1]

Example 6: Show that f : N → N, given by f(x) = x + 1, if x is odd, and f(x) = x − 1, if x is even, is both one-one and onto.

Solution: Suppose f(x1) = f(x2). Note that if x1 is odd and x2 is even, then we will have x1 + 1 = x2 − 1, i.e., x2 − x1 = 2 which is impossible. Similarly, the possibility of x1 being even and x2 being odd can also be ruled out, using a similar argument. Therefore, both x1 and x2 must be either odd or even. Suppose both x1 and x2 are odd. Then f(x1) = f(x2) ⇒ x1 + 1 = x2 + 1 ⇒ x1 = x2. Similarly, if both x1 and x2 are even, then also f(x1) = f(x2) ⇒ x1 − 1 = x2 − 1 ⇒ x1 = x2. Thus, f is one-one. Also, any odd number 2r + 1 in the co-domain N is the image of 2r + 2 in the domain N and any even number 2r in the co-domain N is the image of 2r − 1 in the domain N. Thus, f is onto.
Example 7: Show that an onto function f : {1, 2, 3} → {1, 2, 3} is always one-one.
Solution: Suppose f is not one-one. Then there exist two elements, say 1 and 2 in the domain whose image in the co-domain is the same. Also, the image of 3 under f can be only one element. Therefore, the range set can have at the most two elements of the co-domain {1, 2, 3}, showing that f is not onto, a contradiction. Hence, f must be one-one.
Example 8: Show that a one-one function f : {1, 2, 3} → {1, 2, 3} must be onto.

Solution: Since f is one-one, three elements of {1, 2, 3} must be taken to 3 different elements of the co-domain {1, 2, 3} under f. Hence, f has to be onto.
Remark: The results mentioned in Examples 7 and 8 are also true for an arbitrary finite set X, i.e., a one-one function f : X → X is necessarily onto and an onto map f : X → X is necessarily one-one, for every finite set X. In contrast to this, Examples 2 and 4 show that for an infinite set, this may not be true. In fact, this is a characteristic difference between a finite and an infinite set.
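For the finite set {1, 2, 3} the claim that one-one and onto coincide can also be confirmed by listing every function from the set to itself. The sketch below is an exhaustive check rather than a proof for general finite sets; the variable names are illustrative assumptions.

```python
from itertools import product

X = (1, 2, 3)

# Every function X -> X corresponds to one choice of images (f(1), f(2), f(3))
all_functions = [dict(zip(X, images)) for images in product(X, repeat=len(X))]

# One-one: the three images are pairwise distinct
one_one = [f for f in all_functions if len(set(f.values())) == len(X)]

# Onto: every element of X occurs as an image
onto = [f for f in all_functions if set(f.values()) == set(X)]

print(len(all_functions), len(one_one), len(onto))
print(one_one == onto)  # the one-one functions are exactly the onto functions
```

Changing `X` to another small tuple repeats the check for that set, although the number of functions grows quickly with the size of the set.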

1.3 COMPOSITION OF FUNCTIONS AND INVERTIBLE FUNCTION

Here, we will study composition of functions and the inverse of a bijective function. Consider the set A of all students, who appeared in Class X of a Board Examination in 2006. Each student appearing in the Board Examination is assigned a roll number by the Board which is written by the students in the answer script at the time of examination. In order to have confidentiality, the Board arranges to deface the roll numbers of students in the answer scripts and assigns a fake code number to each roll number. Let B ⊂ N be the set of all roll numbers and C ⊂ N be the set of all code numbers. This gives rise to two functions f : A → B and g : B → C given by f(a) = the roll number assigned to the student a and g(b) = the code number assigned to the roll number b. In this process each student is assigned a roll number through the function f and each roll number is assigned a code number through the function g. Thus, by the combination of these two functions, each student is eventually attached a code number.
This leads to the following definition:
Definition 1 Let f : A → B and g : B → C be two functions. Then the composition of f and g, denoted by gof, is defined as the function gof : A → C given by

gof(x) = g(f(x)), ∀ x ∈ A.

[Figure: x in A is mapped by f to f(x) in B, which is mapped by g to g(f(x)) in C; the composite is gof]
Example 1: Let f : {2, 3, 4, 5} → {3, 4, 5, 9} and g : {3, 4, 5, 9} → {7, 11, 15} be functions defined as f(2) = 3, f(3) = 4, f(4) = f(5) = 5 and g(3) = g(4) = 7 and g(5) = g(9) = 11. Find gof.
Solution: We have gof(2) = g(f(2)) = g(3) = 7, gof(3) = g(f(3)) = g(4) = 7, gof(4) = g(f(4)) = g(5) = 11 and gof(5) = g(f(5)) = g(5) = 11.
Example 2: Find gof and fog, if f : R → R and g : R → R are given by f(x) = cos x and g(x) = 3x². Show that gof ≠ fog.
Solution: We have gof(x) = g(f(x)) = g(cos x) = 3(cos x)² = 3 cos²x. Similarly, fog(x) = f(g(x)) = f(3x²) = cos(3x²). Note that 3 cos²x ≠ cos 3x², for x = 0. Hence, gof ≠ fog.
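Both examples can be reproduced with a short sketch of composition. The code below assumes finite functions are stored as dicts and real functions as ordinary Python callables; the helper name `compose` and the variable names are illustrative choices.

```python
import math

def compose(g, f):
    """Return the function gof, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))

# Example 1: finite functions stored as tables of values
f_table = {2: 3, 3: 4, 4: 5, 5: 5}
g_table = {3: 7, 4: 7, 5: 11, 9: 11}
gof_table = {x: g_table[f_table[x]] for x in f_table}
print(gof_table)

# Example 2: f(x) = cos x and g(x) = 3x^2
f = math.cos
g = lambda x: 3 * x ** 2
gof = compose(g, f)
fog = compose(f, g)

# At x = 0 the two composites take different values, so gof != fog
print(gof(0), fog(0))
```

A single point where the values differ is enough to conclude that the two composites are different functions.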

Example 3: Show that if f : R − {7/5} → R − {3/5} is defined by f(x) = (3x + 4)/(5x − 7) and g : R − {3/5} → R − {7/5} is defined by g(x) = (7x + 4)/(5x − 3), then fog = IA and gof = IB, where A = R − {3/5} and B = R − {7/5}; IA(x) = x, ∀ x ∈ A, and IB(x) = x, ∀ x ∈ B, are called identity functions on sets A and B, respectively.

Solution: We have

gof(x) = g((3x + 4)/(5x − 7))
       = [7(3x + 4)/(5x − 7) + 4] / [5(3x + 4)/(5x − 7) − 3]
       = (21x + 28 + 20x − 28)/(15x + 20 − 15x + 21)
       = 41x/41 = x.

Similarly,

fog(x) = f((7x + 4)/(5x − 3))
       = [3(7x + 4)/(5x − 3) + 4] / [5(7x + 4)/(5x − 3) − 7]
       = (21x + 12 + 20x − 12)/(35x + 20 − 35x + 21)
       = 41x/41 = x.

Thus, gof(x) = x, ∀ x ∈ B and fog(x) = x, ∀ x ∈ A, which implies that gof = IB and fog = IA.

Example 4: Show that if f : A → B and g : B → C are one-one, then gof : A → C is also one-one.
Solution: Suppose gof(x1) = gof(x2)
⇒ g(f(x1)) = g(f(x2))
⇒ f(x1) = f(x2), as g is one-one
⇒ x1 = x2, as f is one-one.
Hence, gof is one-one.



Example 5: Show that if f : A → B and g : B → C are onto, then gof : A → C is also onto.
Solution: Given an arbitrary element z ∈ C, there exists a pre-image y of z under g such that g(y) = z, since g is onto. Further, for y ∈ B, there exists an element x in A with f(x) = y, since f is onto. Therefore, gof(x) = g(f(x)) = g(y) = z, showing that gof is onto.
Example 6: Consider functions f and g such that the composite gof is defined and is one-one. Are f and g both necessarily one-one?
Solution: Consider f : {1, 2, 3, 4} → {1, 2, 3, 4, 5, 6} defined as f(x) = x, ∀ x and g : {1, 2, 3, 4, 5, 6} → {1, 2, 3, 4, 5, 6} as g(x) = x, for x = 1, 2, 3, 4 and g(5) = g(6) = 5. Then, gof(x) = x ∀ x, which shows that gof is one-one. But g is clearly not one-one.
Example 7: Are f and g both necessarily onto, if gof is onto?

Solution: Consider f : {1, 2, 3, 4} → {1, 2, 3, 4} and g : {1, 2, 3, 4} → {1, 2, 3} defined as f(1) = 1, f(2) = 2, f(3) = f(4) = 3, g(1) = 1, g(2) = 2 and g(3) = g(4) = 3. It can be seen that gof is onto but f is not onto.
Remark: It can be verified in general that gof is one-one implies that f is
one-one. Similarly, gof is onto implies that g is onto.
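The counterexamples of Examples 6 and 7 are small enough to verify by direct computation. The following is a sketch under the assumption that each function is stored as a dict; the helper names and the suffixed variable names (`f6`, `g6`, `f7`, `g7`) are introduced here only for illustration.

```python
def compose(g, f):
    # gof as a table: x -> g[f[x]] for every x in the domain of f
    return {x: g[f[x]] for x in f}

def is_one_one(f):
    return len(set(f.values())) == len(f)

def is_onto(f, codomain):
    return set(f.values()) == set(codomain)

# Example 6: gof is one-one although g is not
f6 = {x: x for x in (1, 2, 3, 4)}
g6 = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5}
gof6 = compose(g6, f6)
print(is_one_one(gof6), is_one_one(f6), is_one_one(g6))

# Example 7: gof is onto although f is not
f7 = {1: 1, 2: 2, 3: 3, 4: 3}
g7 = {1: 1, 2: 2, 3: 3, 4: 3}
gof7 = compose(g7, f7)
print(is_onto(gof7, {1, 2, 3}), is_onto(f7, {1, 2, 3, 4}), is_onto(g7, {1, 2, 3}))
```

In both cases the function that keeps the property of the composite is the one named in the Remark: f stays one-one in Example 6 and g stays onto in Example 7.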


Now, we would like to have a close look at the functions f and g described in the beginning in reference to a Board Examination. Each student
appearing in Class X Examination of the Board is assigned a roll number
under the function f and each roll number is assigned a code number under
g. After the answer scripts are examined, examiner enters the mark against
each code number in a mark book and submits to the office of the Board.
The Board officials decode by assigning roll number back to each code
number through a process reverse to g and thus mark gets attached to roll
number rather than code number. Further, the process reverse to f assigns a
roll number to the student having that roll number. This helps in assigning
mark to the student scoring that mark. We observe that while composing f
and g, to get gof, first f and then g was applied, while in the reverse process
of the composite gof, first the reverse process of g is applied and then the
reverse process of f.
Example 8: Let f : {1, 2, 3} → {a, b, c} be a one-one and onto function given by f(1) = a, f(2) = b and f(3) = c. Show that there exists a function g : {a, b, c} → {1, 2, 3} such that gof = IX and fog = IY, where X = {1, 2, 3} and Y = {a, b, c}.
Solution: Consider g : {a, b, c} → {1, 2, 3} as g(a) = 1, g(b) = 2 and g(c) = 3. It is easy to verify that the composite gof = IX is the identity function on X and the composite fog = IY is the identity function on Y.
Remark: The interesting fact is that the result mentioned in the above example is true for an arbitrary one-one and onto function f : X → Y. Not only this, even the converse is also true, i.e., if f : X → Y is a function such that there exists a function g : Y → X such that gof = IX and fog = IY, then f must be one-one and onto.
Definition 2: A function f : X → Y is defined to be invertible, if there exists a function g : Y → X such that gof = IX and fog = IY. The function g is called the inverse of f and is denoted by f⁻¹.
Thus, if f is invertible, then f must be one-one and onto and conversely,
if f is one-one and onto, then f must be invertible. This fact significantly
helps for proving a function f to be invertible by showing that f is one-one
and onto, specially when the actual inverse of f is not to be determined.
Example 9: Let f: N → Y be a function defined as f(x) = 4x + 3, where
Y = {y ∈ N: y = 4x + 3 for some x ∈ N}. Show that f is invertible. Find the
inverse.
Solution: Consider an arbitrary element y of Y. By the definition of Y,
y = 4x + 3 for some x in the domain N. This shows that x = (y − 3)/4. Define
g: Y → N by g(y) = (y − 3)/4.
Now, gof(x) = g(f(x)) = g(4x + 3) = (4x + 3 − 3)/4 = x
and fog(y) = f(g(y)) = f((y − 3)/4) = 4(y − 3)/4 + 3 = y − 3 + 3 = y.
This shows that gof = I_N and fog = I_Y, which implies that f is invertible
and g is the inverse of f.
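The computation in Example 9 can be spot-checked numerically. The following is a minimal sketch, not a proof: it tests only the first 100 natural numbers (an arbitrary cutoff chosen for illustration), and the names f, g, sample and image are illustrative.

```python
def f(x: int) -> int:
    """f: N -> Y with f(x) = 4x + 3."""
    return 4 * x + 3


def g(y: int) -> int:
    """Candidate inverse g: Y -> N with g(y) = (y - 3) / 4."""
    # Elements of Y are exactly the numbers of the form 4x + 3.
    assert (y - 3) % 4 == 0, "y must lie in Y, the range of f"
    return (y - 3) // 4


sample = range(1, 101)          # first 100 natural numbers (arbitrary cutoff)
image = [f(x) for x in sample]  # the corresponding elements of Y

gof_is_identity = all(g(f(x)) == x for x in sample)
fog_is_identity = all(f(g(y)) == y for y in image)
print(gof_is_identity, fog_is_identity)
```

A finite check like this can only support the algebraic argument; the proof itself is the computation gof(x) = x and fog(y) = y given in the solution.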
Example 10: Consider f: N → N, g: N → N and h: N → R defined as
f(x) = 2x, g(y) = 3y + 4 and h(z) = sin z, ∀ x, y and z in N. Show that
ho(gof) = (hog)of.
Solution: We have
(ho(gof))(x) = h((gof)(x)) = h(g(f(x))) = h(g(2x))
= h(3(2x) + 4) = h(6x + 4) = sin(6x + 4), ∀ x ∈ N.
Also, ((hog)of)(x) = (hog)(f(x)) = (hog)(2x) = h(g(2x))
= h(3(2x) + 4) = h(6x + 4) = sin(6x + 4), ∀ x ∈ N.
This shows that ho(gof) = (hog)of.
This result is true in the general situation as well.

Theorem 1: If f: X → Y, g: Y → Z and h: Z → S are functions, then
ho(gof) = (hog)of.
Proof: We have
(ho(gof))(x) = h((gof)(x)) = h(g(f(x))), ∀ x in X
and
((hog)of)(x) = (hog)(f(x)) = h(g(f(x))), ∀ x in X.
Hence, ho(gof) = (hog)of.
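Associativity of composition can be illustrated with the three functions f(x) = 2x, g(y) = 3y + 4 and h(z) = sin z used in Example 10. This is a small sketch under the assumption that a finite sample of inputs is enough for illustration; the helper name compose is not from the text.

```python
import math


def compose(outer, inner):
    """Return the composite 'outer o inner', i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


f = lambda x: 2 * x        # f(x) = 2x
g = lambda y: 3 * y + 4    # g(y) = 3y + 4
h = lambda z: math.sin(z)  # h(z) = sin z

left = compose(h, compose(g, f))   # h o (g o f)
right = compose(compose(h, g), f)  # (h o g) o f

# Both composites evaluate h(g(f(x))), so they agree on every input tried.
agree = all(left(x) == right(x) for x in range(1, 51))
print(agree)
```

Both sides perform exactly the same sequence of evaluations, which is the content of the proof of Theorem 1.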

Example 11: Consider f: {1, 2, 3} → {a, b, c} and g: {a, b, c} → {apple,
ball, cat} defined as f(1) = a, f(2) = b, f(3) = c, g(a) = apple, g(b) = ball
and g(c) = cat. Show that f, g and gof are invertible. Find f⁻¹, g⁻¹ and
(gof)⁻¹ and show that (gof)⁻¹ = f⁻¹og⁻¹.
Solution: Note that by definition, f and g are bijective functions. Let
f⁻¹: {a, b, c} → {1, 2, 3} and g⁻¹: {apple, ball, cat} → {a, b, c} be defined
as f⁻¹(a) = 1, f⁻¹(b) = 2, f⁻¹(c) = 3, g⁻¹(apple) = a, g⁻¹(ball) = b and
g⁻¹(cat) = c.
It is easy to verify that f⁻¹of = I_{1, 2, 3}, fof⁻¹ = I_{a, b, c},
g⁻¹og = I_{a, b, c} and gog⁻¹ = I_D, where D = {apple, ball, cat}. Now,
gof: {1, 2, 3} → {apple, ball, cat} is given by gof(1) = apple, gof(2) = ball,
gof(3) = cat. We can define (gof)⁻¹: {apple, ball, cat} → {1, 2, 3} by
(gof)⁻¹(apple) = 1, (gof)⁻¹(ball) = 2 and (gof)⁻¹(cat) = 3. It is easy to see
that (gof)⁻¹o(gof) = I_{1, 2, 3} and (gof)o(gof)⁻¹ = I_D. Thus, we have seen
that f, g and gof are invertible.
Now, f⁻¹og⁻¹(apple) = f⁻¹(g⁻¹(apple)) = f⁻¹(a) = 1 = (gof)⁻¹(apple),
f⁻¹og⁻¹(ball) = f⁻¹(g⁻¹(ball)) = f⁻¹(b) = 2 = (gof)⁻¹(ball) and
f⁻¹og⁻¹(cat) = f⁻¹(g⁻¹(cat)) = f⁻¹(c) = 3 = (gof)⁻¹(cat).
Hence (gof)⁻¹ = f⁻¹og⁻¹.
The above result is true in the general situation also.
Theorem 2: Let f: X → Y and g: Y → Z be two invertible functions. Then
gof is also invertible with (gof)⁻¹ = f⁻¹og⁻¹.
Proof: To show that gof is invertible with (gof)⁻¹ = f⁻¹og⁻¹, it is enough
to show that
(f⁻¹og⁻¹)o(gof) = I_X and (gof)o(f⁻¹og⁻¹) = I_Z.
Now, (f⁻¹og⁻¹)o(gof) = ((f⁻¹og⁻¹)og)of, by Theorem 1
= (f⁻¹o(g⁻¹og))of, by Theorem 1
= (f⁻¹oI_Y)of, by definition of g⁻¹
= f⁻¹of, since f⁻¹oI_Y = f⁻¹
= I_X, by definition of f⁻¹.
Similarly, it can be shown that (gof)o(f⁻¹og⁻¹) = I_Z.
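For finite sets, the statement (gof)⁻¹ = f⁻¹og⁻¹ can be checked directly by storing each function as a dictionary. The sketch below reuses the functions of Example 11; the helper names compose and inverse are illustrative and not part of the text.

```python
f = {1: "a", 2: "b", 3: "c"}
g = {"a": "apple", "b": "ball", "c": "cat"}


def compose(outer, inner):
    """Return the composite 'outer o inner' as a dictionary."""
    return {x: outer[inner[x]] for x in inner}


def inverse(func):
    """Invert a one-one function stored as a dictionary."""
    inv = {value: key for key, value in func.items()}
    # If two keys shared a value, the inverse would have fewer entries.
    assert len(inv) == len(func), "function is not one-one"
    return inv


gof = compose(g, f)
lhs = inverse(gof)                       # (g o f)^-1
rhs = compose(inverse(f), inverse(g))    # f^-1 o g^-1
print(lhs == rhs, lhs)
```

Note the order on the right-hand side: g⁻¹ is applied first and f⁻¹ second, mirroring the "decode in reverse order" description of the roll-number illustration.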
Example 12: Let S = {1, 2, 3}. Determine whether the functions f: S → S
defined as below have inverses. Find f⁻¹, if it exists.
(a) f = {(1, 1), (2, 2), (3, 3)}
(b) f = {(1, 2), (2, 1), (3, 1)}
(c) f = {(1, 3), (3, 2), (2, 1)}
Solution:
(a) It is easy to see that f is one-one and onto, so that f is invertible with
the inverse f⁻¹ of f given by f⁻¹ = {(1, 1), (2, 2), (3, 3)} = f.
(b) Since f(2) = f(3) = 1, f is not one-one, so that f is not invertible.
(c) It is easy to see that f is one-one and onto, so that f is invertible with
f⁻¹ = {(3, 1), (2, 3), (1, 2)}.

1.4 BINARY OPERATION


Right from the school days, you must have come across four fundamental
operations, namely addition, subtraction, multiplication and division. The
main feature of these operations is that given any two numbers a and b, we
associate another number a + b or a − b or ab or a/b, b ≠ 0. It is to be
noted that only two numbers can be added or multiplied at a time. When we
need to add three numbers, we first add two numbers and the result is then
added to the third number. Thus, addition, multiplication, subtraction and
division are examples of binary operations, as 'binary' means two. If we want
to have a general definition which can cover all these four operations, then
the set of numbers is to be replaced by an arbitrary set X, and then a general
binary operation is nothing but an association of any pair of elements a, b
from X to another element of X.
This gives rise to a general definition as follows:
Definition 1: A binary operation ∗ on a set A is a function ∗ : A × A → A.
We denote ∗(a, b) by a ∗ b.

Example 1: Show that addition, subtraction and multiplication are binary
operations on R, but division is not a binary operation on R. Further,
show that division is a binary operation on the set R∗ of nonzero real
numbers.
Solution: + : R × R → R is given by (a, b) → a + b,
− : R × R → R is given by (a, b) → a − b,
× : R × R → R is given by (a, b) → ab.
Since +, − and × are functions, they are binary operations on R.
But ÷ : R × R → R, given by (a, b) → a/b, is not a function and hence not a
binary operation, as for b = 0, a/b is not defined. However, ÷ : R∗ × R∗ → R∗,
given by (a, b) → a/b, is a function and hence a binary operation on R∗.


Example 2: Show that subtraction and division are not binary operations
on N.
Solution: − : N × N → N, given by (a, b) → a − b, is not a binary operation,
as the image of (3, 5) under − is 3 − 5 = −2 ∉ N. Similarly, ÷ : N × N → N,
given by (a, b) → a ÷ b, is not a binary operation, as the image of (3, 5)
under ÷ is 3 ÷ 5 = 3/5 ∉ N.

Example 3: Show that ∗ : R × R → R given by (a, b) → a + 4b² is a binary
operation.
Solution: Since ∗ carries each pair (a, b) to a unique element a + 4b² in
R, ∗ is a binary operation on R.
Example 4: Let P be the set of all subsets of a given set X. Show that
∪ : P × P → P given by (A, B) → A ∪ B and ∩ : P × P → P given by
(A, B) → A ∩ B are binary operations on the set P.
Solution: Since the union operation ∪ carries each pair (A, B) in P × P to a
unique element A ∪ B in P, ∪ is a binary operation on P. Similarly, the
intersection operation ∩ carries each pair (A, B) in P × P to a unique element
A ∩ B in P, so ∩ is a binary operation on P.
Example 5: Show that ∨ : R × R → R given by (a, b) → max {a, b} and
∧ : R × R → R given by (a, b) → min {a, b} are binary operations.
Solution: Since ∨ carries each pair (a, b) in R × R to a unique element,
namely the maximum of a and b, lying in R, ∨ is a binary operation. Using a
similar argument, one can say that ∧ is also a binary operation.
Remark: ∨(4, 7) = 7, ∨(4, −7) = 4, ∧(4, 7) = 4 and ∧(4, −7) = −7.
When the number of elements in a set A is small, we can express a binary
operation ∗ on the set A through a table called the operation table for the
operation ∗. For example, consider A = {1, 2, 3}. Then the operation ∨ on A
defined in Example 5 can be expressed by the following operation table
(Table 1.1). Here, ∨(1, 3) = 3, ∨(2, 3) = 3, ∨(1, 2) = 2.
Table 1.1
 ∨    1    2    3
 1    1    2    3
 2    2    2    3
 3    3    3    3

Here, we have 3 rows and 3 columns in the operation table, with the
(i, j)th entry of the table being the maximum of the ith and jth elements of
the set A. This can be generalised to a general operation ∗ : A × A → A. If
A = {a1, a2, ..., an}, then the operation table will have n rows and n columns
with the (i, j)th entry being ai ∗ aj. Conversely, given any operation table
having n rows and n columns with each entry being an element of
A = {a1, a2, ..., an}, we can define a binary operation ∗ : A × A → A given by
ai ∗ aj = the entry in the ith row and jth column of the operation table.
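The two directions described above (operation to table, and table back to operation) can be sketched in a few lines. The example uses the maximum operation ∨ on A = {1, 2, 3}; the function names op and from_table are illustrative only.

```python
A = [1, 2, 3]


def op(a, b):
    """The operation 'maximum of a and b' on A."""
    return max(a, b)


# Build the operation table: row i, column j holds A[i] op A[j].
table = [[op(a, b) for b in A] for a in A]


def from_table(table, elements):
    """Recover a binary operation from an n x n operation table."""
    index = {element: i for i, element in enumerate(elements)}
    return lambda a, b: table[index[a]][index[b]]


star = from_table(table, A)

for row_label, row in zip(A, table):
    print(row_label, row)
```

Every entry of the table lies in A, which is exactly the condition for the table to define a binary operation on A.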

One may note that 3 and 4 can be added in any order and the result is the
same, i.e., 3 + 4 = 4 + 3, but subtraction of 3 and 4 in different orders
gives different results, i.e., 3 − 4 ≠ 4 − 3. Similarly, in the case of
multiplication of 3 and 4, order is immaterial, but division of 3 and 4 in
different orders gives different results. Thus, 'addition of 3 and 4' and
'multiplication of 3 and 4' are meaningful, but 'subtraction of 3 and 4' and
'division of 3 and 4' are meaningless. For subtraction and division we have
to write 'subtract 3 from 4', 'subtract 4 from 3', 'divide 3 by 4' or
'divide 4 by 3'.
This leads to the following definition:
Definition 2: A binary operation ∗ on the set X is called commutative, if
a ∗ b = b ∗ a, for every a, b ∈ X.

Example 1: Show that + : R × R → R and × : R × R → R are commutative
binary operations, but − : R × R → R and ÷ : R∗ × R∗ → R∗ are not commutative.
Solution: Since a + b = b + a and a × b = b × a, ∀ a, b ∈ R, + and × are
commutative binary operations. However, − is not commutative, since
3 − 4 ≠ 4 − 3. Similarly, 3 ÷ 4 ≠ 4 ÷ 3 shows that ÷ is not commutative.
Example 2: Show that ∗ : R × R → R defined by a ∗ b = a + 2b is not
commutative.
Solution: Since 3 ∗ 4 = 3 + 8 = 11 and 4 ∗ 3 = 4 + 6 = 10, we have
3 ∗ 4 ≠ 4 ∗ 3, showing that the operation ∗ is not commutative.

If we want to associate three elements of a set X through a binary operation
on X, we encounter a natural problem. The expression a ∗ b ∗ c may be
interpreted as (a ∗ b) ∗ c or a ∗ (b ∗ c) and these two expressions need not
be the same. For example, (8 − 5) − 2 ≠ 8 − (5 − 2). Therefore, association of
the three numbers 8, 5 and 2 through the binary operation 'subtraction' is
meaningless, unless brackets are used. But in the case of addition, 8 + 5 + 2
has the same value whether we look at it as (8 + 5) + 2 or as 8 + (5 + 2).
Thus, association of 3 or even more than 3 numbers through addition is
meaningful without using brackets. This leads to the following:
Definition 3: A binary operation ∗ : A × A → A is said to be associative if
(a ∗ b) ∗ c = a ∗ (b ∗ c), ∀ a, b, c ∈ A.

Example 3: Show that addition and multiplication are associative binary
operations on R. But subtraction is not associative on R. Division is not
associative on R∗.
Solution: Addition and multiplication are associative, since
(a + b) + c = a + (b + c) and (a × b) × c = a × (b × c), ∀ a, b, c ∈ R.
However, subtraction and division are not associative, as
(8 − 5) − 3 ≠ 8 − (5 − 3) and (8 ÷ 5) ÷ 3 ≠ 8 ÷ (5 ÷ 3).
Example 4: Show that ∗ : R × R → R given by a ∗ b = a + 2b is not
associative.
Solution: The operation ∗ is not associative, since
(8 ∗ 5) ∗ 3 = (8 + 10) ∗ 3 = (8 + 10) + 6 = 24,
while 8 ∗ (5 ∗ 3) = 8 ∗ (5 + 6) = 8 ∗ 11 = 8 + 22 = 30.
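Commutativity and associativity can be tested mechanically on a finite sample of numbers. The sketch below is only an illustration: a failed check yields a genuine counterexample, while a passed check on a finite sample does not prove the property for all of R. The sample range and function names are assumptions made for the example.

```python
from itertools import product


def is_commutative(op, sample):
    """Check a * b == b * a for every pair in the sample."""
    return all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))


def is_associative(op, sample):
    """Check (a * b) * c == a * (b * c) for every triple in the sample."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(sample, repeat=3)
    )


sample = range(-5, 6)            # a small, arbitrary sample of integers
add = lambda a, b: a + b         # ordinary addition
sub = lambda a, b: a - b         # ordinary subtraction
star = lambda a, b: a + 2 * b    # the operation of Examples 2 and 4

print(is_commutative(add, sample), is_associative(add, sample))
print(is_commutative(sub, sample), is_associative(sub, sample))
print(is_commutative(star, sample), is_associative(star, sample))
print(star(star(8, 5), 3), star(8, star(5, 3)))
```

The last line reproduces the two bracketings computed in Example 4.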

Remark: The associative property of a binary operation is very important
in the sense that with this property of a binary operation, we can write
a1 ∗ a2 ∗ ... ∗ an, which is not ambiguous. But in the absence of this
property, the expression a1 ∗ a2 ∗ ... ∗ an is ambiguous unless brackets are
used. Recall that in the earlier classes brackets were used whenever
subtraction or division operations or more than one operation occurred.
For the binary operation + on R, the interesting feature of the number
zero is that a + 0 = a = 0 + a, i.e., any number remains unaltered by adding
zero. But in the case of multiplication, the number 1 plays this role, as
a × 1 = a = 1 × a, ∀ a in R. This leads to the following definition:
Definition 4: Given a binary operation ∗ : A × A → A, an element e ∈ A,
if it exists, is called an identity for the operation ∗, if
a ∗ e = a = e ∗ a, ∀ a ∈ A.

Example 5: Show that zero is the identity for addition on R and 1 is
the identity for multiplication on R. But there is no identity element for the
operations − : R × R → R and ÷ : R∗ × R∗ → R∗.
Solution: a + 0 = 0 + a = a and a × 1 = a = 1 × a, ∀ a ∈ R, implies that 0
and 1 are identity elements for the operations + and × respectively. Further,
there is no element e in R with a − e = e − a, ∀ a. Similarly, we cannot find
any element e in R∗ such that a ÷ e = e ÷ a, ∀ a in R∗. Hence, − and ÷ do not
have an identity element.
Remark: Zero is the identity for the addition operation on R but it is not
an identity for the addition operation on N, as 0 ∉ N. In fact, the addition
operation on N does not have any identity.
One further notices that for the addition operation + : R × R → R, given
any a ∈ R, there exists −a in R such that a + (−a) = 0 (identity for +)
= (−a) + a.
Similarly, for the multiplication operation on R, given any a ≠ 0 in R,
we can choose 1/a in R such that a × (1/a) = 1 (identity for ×) = (1/a) × a.
This leads to the following definition:
Definition 5: Given a binary operation ∗ : A × A → A with the identity
element e in A, an element a ∈ A is said to be invertible with respect
to the operation ∗, if there exists an element b in A such that
a ∗ b = e = b ∗ a, and b is called the inverse of a and is denoted by a⁻¹.
Example 6: Show that −a is the inverse of a for the addition operation
+ on R and 1/a is the inverse of a ≠ 0 for the multiplication operation ×
on R.
Solution: As a + (−a) = a − a = 0 and (−a) + a = 0, −a is the inverse of a
for addition. Similarly, for a ≠ 0, a × (1/a) = 1 = (1/a) × a implies that
1/a is the inverse of a for multiplication.
Example 7: Show that −a is not the inverse of a ∈ N for the addition
operation + on N and 1/a is not the inverse of a ∈ N for the multiplication
operation × on N, for a ≠ 1.
Solution: Since −a ∉ N, −a cannot be the inverse of a for the addition
operation on N, although −a satisfies a + (−a) = 0 = (−a) + a.
Similarly, for a ≠ 1 in N, 1/a ∉ N, which implies that other than 1 no
element of N has an inverse for the multiplication operation on N.
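On a finite set, Definitions 4 and 5 can be applied by exhaustive search. The sketch below uses the maximum operation ∨ on {1, 2, 3} from the operation-table discussion as a test case; the function names find_identity and find_inverses are illustrative, not from the text.

```python
def find_identity(op, elements):
    """Return e with a * e == a == e * a for all a, or None if none exists."""
    for e in elements:
        if all(op(a, e) == a and op(e, a) == a for a in elements):
            return e
    return None


def find_inverses(op, elements, e):
    """Map each invertible a to an element b with a * b == e == b * a."""
    return {
        a: b
        for a in elements
        for b in elements
        if op(a, b) == e and op(b, a) == e
    }


A = [1, 2, 3]
identity = find_identity(max, A)          # max(a, 1) == a for every a in A
inverses = find_inverses(max, A, identity)
print(identity, inverses)
```

Here 1 is the identity for ∨ on A, and only 1 itself is invertible, since max(a, b) = 1 forces a = b = 1. This parallels Example 7, where most elements of N fail to have inverses.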

Example 8: If R1 and R2 are equivalence relations in a set A, show that
R1 ∩ R2 is also an equivalence relation.
Solution: Since R1 and R2 are equivalence relations, (a, a) ∈ R1 and
(a, a) ∈ R2, ∀ a ∈ A. This implies that (a, a) ∈ R1 ∩ R2, ∀ a, showing that
R1 ∩ R2 is reflexive. Further,
(a, b) ∈ R1 ∩ R2 ⇒ (a, b) ∈ R1 and (a, b) ∈ R2 ⇒ (b, a) ∈ R1 and (b, a) ∈ R2
⇒ (b, a) ∈ R1 ∩ R2; hence, R1 ∩ R2 is symmetric. Similarly,
(a, b) ∈ R1 ∩ R2 and (b, c) ∈ R1 ∩ R2 ⇒ (a, c) ∈ R1 and (a, c) ∈ R2
⇒ (a, c) ∈ R1 ∩ R2. This shows that R1 ∩ R2 is transitive. Thus, R1 ∩ R2 is
an equivalence relation.

Example 9: Let R be a relation on the set A of ordered pairs of positive
integers defined by (x, y) R (u, v) if and only if xv = yu. Show that R is an
equivalence relation.
Solution: Clearly, (x, y) R (x, y), ∀ (x, y) ∈ A, since xy = yx. This shows
that R is reflexive. Further, (x, y) R (u, v) ⇒ xv = yu ⇒ uy = vx and hence
(u, v) R (x, y). This shows that R is symmetric. Similarly, (x, y) R (u, v) and
(u, v) R (a, b) ⇒ xv = yu and ub = va ⇒ xv(a/u) = yu(a/u)
⇒ xv(b/v) = yu(a/u) ⇒ xb = ya, and hence (x, y) R (a, b). Thus, R is
transitive. Thus, R is an equivalence relation.

Example 10: Let X = {1, 2, 3, 4, 5, 6, 7, 8, 9}. Let R1 be a relation in X
given by R1 = {(x, y) : x − y is divisible by 3} and R2 be another relation on
X given by R2 = {(x, y) : {x, y} ⊂ {1, 4, 7} or {x, y} ⊂ {2, 5, 8} or
{x, y} ⊂ {3, 6, 9}}. Show that R1 = R2.
Solution: Note that the characteristic of the sets {1, 4, 7}, {2, 5, 8} and
{3, 6, 9} is that the difference between any two elements of these sets is a
multiple of 3. Therefore, (x, y) ∈ R1 ⇒ x − y is a multiple of 3
⇒ {x, y} ⊂ {1, 4, 7} or {x, y} ⊂ {2, 5, 8} or {x, y} ⊂ {3, 6, 9}
⇒ (x, y) ∈ R2. Hence, R1 ⊂ R2. Similarly, (x, y) ∈ R2
⇒ {x, y} ⊂ {1, 4, 7} or {x, y} ⊂ {2, 5, 8} or {x, y} ⊂ {3, 6, 9}
⇒ x − y is divisible by 3 ⇒ (x, y) ∈ R1. This shows that R2 ⊂ R1.
Hence, R1 = R2.
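Because X is finite, the equality R1 = R2 can also be confirmed by listing both relations as sets of ordered pairs. This is a minimal sketch; the variable name classes for the three subsets is illustrative.

```python
X = range(1, 10)

# R1: pairs whose difference is divisible by 3.
R1 = {(x, y) for x in X for y in X if (x - y) % 3 == 0}

# R2: pairs whose two members lie together in one of the three subsets.
classes = [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}]
R2 = {(x, y) for x in X for y in X if any({x, y} <= c for c in classes)}

print(R1 == R2, len(R1))
```

Each of the three subsets contributes 3 × 3 = 9 ordered pairs, so both relations contain 27 pairs.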

Example 11: Let f: X → Y be a function. Define a relation R in X given by
R = {(a, b) : f(a) = f(b)}. Examine whether R is an equivalence relation or
not.
Solution: For every a ∈ X, (a, a) ∈ R, since f(a) = f(a), showing that R is
reflexive. Similarly, (a, b) ∈ R ⇒ f(a) = f(b) ⇒ f(b) = f(a) ⇒ (b, a) ∈ R.
Therefore, R is symmetric. Further, (a, b) ∈ R and (b, c) ∈ R ⇒ f(a) = f(b)
and f(b) = f(c) ⇒ f(a) = f(c) ⇒ (a, c) ∈ R, which implies that R is
transitive. Hence, R is an equivalence relation.

Example 12: Find the number of all one-one functions from set A = {1,
2, 3} to itself.
Solution: A one-one function from {1, 2, 3} to itself is simply a
permutation of the three symbols 1, 2, 3. Therefore, the total number of
one-one maps from {1, 2, 3} to itself is the same as the total number of
permutations of the three symbols 1, 2, 3, which is 3! = 6.
Example 13: Let A = {1, 2, 3}. Then show that the number of relations
containing (1, 2) and (2, 3) which are reflexive and transitive but not
symmetric is three.
Solution: The smallest relation R1 containing (1, 2) and (2, 3) which is
reflexive and transitive but not symmetric is {(1, 1), (2, 2), (3, 3), (1, 2),
(2, 3), (1, 3)}. Now, if we add the pair (2, 1) to R1 to get R2, then the
relation R2 will be reflexive and transitive but not symmetric. Similarly, we
can obtain R3 by adding (3, 2) to R1 to get another desired relation.
However, we cannot add the two pairs (2, 1), (3, 2) together, or the single
pair (3, 1), to R1, as by doing so we will be forced to add the remaining
pairs in order to maintain transitivity, and in the process the relation will
become symmetric also, which is not required. Thus, the total number of
desired relations is three.
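Counting arguments of this kind on a three-element set can be cross-checked by brute force, since there are only 2⁹ = 512 relations on {1, 2, 3}. The sketch below enumerates all of them; the helper names are illustrative. The same loop also counts the equivalence relations containing (1, 2) and (2, 1).

```python
from itertools import combinations, product

A = [1, 2, 3]
pairs = list(product(A, repeat=2))   # all 9 ordered pairs of A x A


def is_reflexive(R):
    return all((a, a) in R for a in A)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


preorders_not_symmetric = []   # reflexive, transitive, not symmetric
equivalences = []              # reflexive, symmetric, transitive

for size in range(len(pairs) + 1):
    for subset in combinations(pairs, size):
        R = set(subset)
        if not (is_reflexive(R) and is_transitive(R)):
            continue
        if {(1, 2), (2, 3)} <= R and not is_symmetric(R):
            preorders_not_symmetric.append(R)
        if {(1, 2), (2, 1)} <= R and is_symmetric(R):
            equivalences.append(R)

print(len(preorders_not_symmetric), len(equivalences))
```

The first count matches the three relations R1, R2 and R3 described in the solution.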
Example 14: Show that the number of equivalence relations in the set
{1, 2, 3} containing (1, 2) and (2, 1) is two.
Solution: The smallest equivalence relation R1 containing (1, 2) and
(2, 1) is {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}. Now we are left with only
4 pairs, namely (2, 3), (3, 2), (1, 3) and (3, 1). If we add any one, say
(2, 3), to R1, then for symmetry we must add (3, 2) also, and now for
transitivity we are forced to add (1, 3) and (3, 1). Thus, the only
equivalence relation bigger than R1 is the universal relation. This shows
that the total number of equivalence relations containing (1, 2) and (2, 1)
is two.
Example 15: Show that the number of binary operations on {1, 2} having 1
as identity and having 2 as the inverse of 2 is exactly one.
Solution: A binary operation ∗ on {1, 2} is a function from {1, 2} × {1, 2}
to {1, 2}, i.e., a function from {(1, 1), (1, 2), (2, 1), (2, 2)} → {1, 2}.
Since 1 is the identity for the desired binary operation ∗, ∗(1, 1) = 1,
∗(1, 2) = 2, ∗(2, 1) = 2 and the only choice left is for the pair (2, 2).
Since 2 is the inverse of 2, ∗(2, 2) must be equal to 1. Thus, the number of
desired binary operations is only one.
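Since a binary operation on {1, 2} is just a choice of value for each of the four pairs, all 2⁴ = 16 operations can be listed and filtered. The following sketch does this; the names ops, has_identity and wanted are illustrative.

```python
from itertools import product

S = [1, 2]
keys = list(product(S, repeat=2))   # (1, 1), (1, 2), (2, 1), (2, 2)

# Every binary operation on S, stored as a dictionary (a, b) -> a * b.
ops = [dict(zip(keys, values)) for values in product(S, repeat=len(keys))]


def has_identity(op, e):
    """True if e is an identity element for the operation op."""
    return all(op[(a, e)] == a and op[(e, a)] == a for a in S)


# Keep operations with identity 1 in which 2 is its own inverse: 2 * 2 = 1.
wanted = [op for op in ops if has_identity(op, 1) and op[(2, 2)] == 1]
print(len(ops), len(wanted), wanted)
```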
Example 16: Consider the identity function I_N : N → N defined as
I_N(x) = x, ∀ x ∈ N. Show that although I_N is onto, I_N + I_N : N → N
defined as (I_N + I_N)(x) = I_N(x) + I_N(x) = x + x = 2x is not onto.
Solution: Clearly I_N is onto. But I_N + I_N is not onto, as we can find an
element 3 in the co-domain N such that there does not exist any x in the
domain N with (I_N + I_N)(x) = 2x = 3.

1.5 MULTIPLE CHOICE QUESTIONS


1. Find the domain for the relation: {(1, 2), (5, 3), (3, 6), (2, 4)}
(a) {2, 3, 4, 6}
(b) {1, 2, 3, 5}
(c) {1, 2, 4, 6}
(d) none of these
2. Find the range for the relation: {(3, 5), (2, 5), (2, 6), (3, 7)}
(a) {2, 3}
(b) {5, 6, 7}
(c) {5, 5, 6, 7}
(d) none of these
3. Find the domain of {(1, 2), (2, 3), (3, 5), (4, 5), (5, 6)}. Is it a
function?
(a) {1, 2, 3, 4, 5}, yes
(b) {1, 2, 3, 4, 5}, no
(c) {2, 3, 5, 6}, yes
(d) {2, 3, 5, 6}, no
4. {(1, 2), (2, 3), (−1, 4), (−3, 7), (7, 2), (1, −4)} is a function.
(a) True
(b) False
5. Evaluate f(x) = 3x − 5 for x = 2
(a) f(2) = 4
(b) f(2) = 27
(c) f(2) = 1
(d) none of these
6. Find the range for f(x) = −x² + 3, given the domain {−3, 0, 4}
(a) {12, −3, 19}
(b) {−6, 3, −13}
(c) {9, −3, −5}
(d) none of these
7. Find the range for g(x) = x² − 4x, given the domain {−2, 1, 3}
(a) {6, 3, −3}
(b) {12, −2, −3}
(c) {6, −2, −3}
(d) {12, −3}
8. A relation is ...... a function.
(a) never
(b) always
(c) sometimes
(d) none of these
9. The domain of a relation is the set of x-coordinates.
(a) False
(b) True
10. The range is the set of first coordinates of a relation.
(a) False
(b) True

1.6 REVIEW QUESTIONS


1. Determine whether each of the following relations is reflexive,
symmetric and transitive:
(i) Relation R in the set A = {1, 2, 3, ..., 13, 14} defined as
R = {(x, y) : 3x − y = 0}
(ii) Relation R in the set N of natural numbers defined as
R = {(x, y) : y = x + 5 and x < 4}
(iii) Relation R in the set A = {1, 2, 3, 4, 5, 6} as
R = {(x, y) : y is divisible by x}
(iv) Relation R in the set Z of all integers defined as
R = {(x, y) : x − y is an integer}
(v) Relation R in the set A of human beings in a town at a particular time
given by
(a) R = {(x, y) : x and y work at the same place}
(b) R = {(x, y) : x and y live in the same locality}
(c) R = {(x, y) : x is exactly 7 cm taller than y}
(d) R = {(x, y) : x is wife of y}
(e) R = {(x, y) : x is father of y}
2. Give an example of a relation which is
(i) Symmetric but neither reflexive nor transitive.
(ii) Transitive but neither reflexive nor symmetric.
(iii) Reflexive and symmetric but not transitive.
(iv) Reflexive and transitive but not symmetric.
(v) Symmetric and transitive but not reflexive.
3. Show that the Signum Function f: R → R, given by
f(x) = 1, if x > 0
f(x) = 0, if x = 0
f(x) = −1, if x < 0
is neither one-one nor onto.

4. Let A = R − {3} and B = R − {1}. Consider the function f: A → B defined
by f(x) = (x − 2)/(x − 3). Is f one-one and onto? Justify your answer.
5. State with reason whether the following functions have an inverse:
(i) f: {1, 2, 3, 4} → {10} with
f = {(1, 10), (2, 10), (3, 10), (4, 10)}
(ii) g: {5, 6, 7, 8} → {1, 2, 3, 4} with
g = {(5, 4), (6, 3), (7, 4), (8, 2)}
(iii) h: {2, 3, 4, 5} → {7, 9, 11, 13} with
h = {(2, 7), (3, 9), (4, 11), (5, 13)}

6. Show that f: [−1, 1] → R, given by f(x) = x/(x + 2), is one-one. Find
the inverse of the function f: [−1, 1] → Range f.
7. Consider f: R → R given by f(x) = 4x + 3. Show that f is invertible.
Find the inverse of f.
8. For each operation ∗ defined below, determine whether ∗ is binary,
commutative or associative.
(i) On Z, define a ∗ b = a − b
(ii) On Q, define a ∗ b = ab + 1
(iii) On Q, define a ∗ b = ab/2
(iv) On Z+, define a ∗ b = 2^(ab)
(v) On Z+, define a ∗ b = a^b
(vi) On R − {−1}, define a ∗ b = a/(b + 1)
9. Consider a binary operation ∗ on the set {1, 2, 3, 4, 5} given by the
following multiplication table.
[Multiplication table not recovered]
(i) Compute (2 ∗ 3) ∗ 4 and 2 ∗ (3 ∗ 4)
(ii) Is ∗ commutative?
(iii) Compute (2 ∗ 3) ∗ (4 ∗ 5).
10. Let ∗′ be the binary operation on the set {1, 2, 3, 4, 5} defined by
a ∗′ b = H.C.F. of a and b. Is the operation ∗′ the same as the operation ∗
defined in Exercise 9 above? Justify your answer.
11. Let ∗ be the binary operation on N given by a ∗ b = L.C.M. of a and b.
Find
(i) 5 ∗ 7, 20 ∗ 16
(ii) Is ∗ commutative?
(iii) Is ∗ associative?
(iv) Find the identity of ∗ in N
(v) Which elements of N are invertible for the operation ∗?

ANSWERS TO MULTIPLE CHOICE QUESTIONS

1. (b)   2. (b)   3. (a)   4. (b)   5. (c)
6. (b)   7. (d)   8. (c)   9. (b)   10. (a)

Chapter 2

Inverse Trigonometric Functions

Objectives
After studying this chapter, you will be able to:
Define inverse trigonometric functions
Discuss the domains and ranges of inverse trigonometry functions
Describe the graphs of inverse trigonometry functions
Explain the properties of inverse trigonometric functions

INTRODUCTION

An inverse trigonometric function is a function that is a solution of the
problem of finding an arc (number) from a given value of its trigonometric
function. The six inverse trigonometric functions correspond to the six
trigonometric functions:
Arc sin x, the inverse sine of x;
Arc cos x, the inverse cosine of x;
Arc tan x, the inverse tangent of x;
Arc cot x, the inverse cotangent of x;
Arc sec x, the inverse secant of x; and
Arc csc x, the inverse cosecant of x.
As an example, according to these definitions, x = Arc sin a is any solution
of the equation sin x = a; that is, sin Arc sin a = a. The functions Arc sin x
and Arc cos x are defined in the real domain for |x| ≤ 1, the functions
Arc tan x and Arc cot x are defined for all real x, and the functions
Arc sec x and Arc csc x are defined for |x| ≥ 1. The last two functions are
rarely used.
Since trigonometric functions are periodic, their inverse functions are
multiple-valued. Certain single-valued branches (the principal branches) of
these functions are designated as arc sin x, arc cos x, ..., arc csc x.
Specifically, arc sin x is the branch of the function Arc sin x for which
−π/2 ≤ arc sin x ≤ π/2. Similarly, the functions arc cos x, arc tan x, and
arc cot x are determined, respectively, from the conditions
0 ≤ arc cos x ≤ π, −π/2 < arc tan x < π/2, and 0 < arc cot x < π. The
inverse trigonometric functions Arc sin x, ... can easily be expressed in
terms of arc sin x, ...; for example,
Arc sin x = (−1)^n arc sin x + πn
Arc cos x = ± arc cos x + 2πn
Arc tan x = arc tan x + πn
Arc cot x = arc cot x + πn
where n = 0, ±1, ±2, ....
The well-known relations between the trigonometric functions yield relations
between the inverse trigonometric functions; for example, the formula
tan x = (sin x)/√(1 − sin² x), −π/2 < x < π/2
implies that
arc sin a = arc tan(a/√(1 − a²)), |a| < 1.
The derivatives of the inverse trigonometric functions have the form
(arc sin x)' = 1/√(1 − x²)
(arc cos x)' = −1/√(1 − x²)
(arc tan x)' = 1/(1 + x²)
(arc cot x)' = −1/(1 + x²)
The inverse trigonometric functions can be represented as power series, for
example,
arc sin x = x + (1/2)(x³/3) + (1·3/(2·4))(x⁵/5) + (1·3·5/(2·4·6))(x⁷/7) + ...
arc tan x = x − x³/3 + x⁵/5 − ...

These series converge for −1 ≤ x ≤ 1.
The inverse trigonometric functions can be determined for arbitrary
complex values of the independent variable, but the values of these functions
will be real only for the values of the independent variable indicated
above. Inverse trigonometric functions of a complex independent variable
can be expressed by means of a logarithmic function, for example,
Arc tan z = (1/(2i)) Ln((i − z)/(i + z))

Key Vocabulary
Domain: A function that has an inverse has exactly one output (belonging
to the range) for every input.
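The relation arc sin a = arc tan(a/√(1 − a²)) and the two power series above can be checked numerically against Python's standard math module, which returns principal values. This is an illustrative sketch: the function names arcsin_series and arctan_series and the number of terms are assumptions made for the example.

```python
import math


def arcsin_series(x, terms=200):
    """Partial sum of x + (1/2)x^3/3 + (1*3/(2*4))x^5/5 + ..."""
    total, coeff = 0.0, 1.0
    for k in range(terms):
        total += coeff * x ** (2 * k + 1) / (2 * k + 1)
        coeff *= (2 * k + 1) / (2 * k + 2)   # next ratio (1*3*...)/(2*4*...)
    return total


def arctan_series(x, terms=200):
    """Partial sum of x - x^3/3 + x^5/5 - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


a = 0.5
print(math.asin(a), math.atan(a / math.sqrt(1 - a * a)))
print(arcsin_series(a), math.asin(a))
print(arctan_series(a), math.atan(a))
```

Convergence is fast for small |x| and becomes slow near the endpoints x = ±1, so more terms would be needed there.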

2.1 DEFINITION OF INVERSE TRIGONOMETRIC FUNCTIONS

The inverse sine function, denoted by sin⁻¹ x (or arcsin x), is defined to be
the inverse of the restricted sine function
sin x, −π/2 ≤ x ≤ π/2
[Figure: graphs of y = sin x; y = sin x, −π/2 ≤ x ≤ π/2; and y = sin⁻¹ x]

The inverse cosine function, denoted by cos⁻¹ x (or arccos x), is defined
to be the inverse of the restricted cosine function
cos x, 0 ≤ x ≤ π
[Figure: graphs of y = cos x; y = cos x, 0 ≤ x ≤ π; and y = cos⁻¹ x]

The inverse tangent function, denoted by tan⁻¹ x (or arctan x), is defined
to be the inverse of the restricted tangent function
tan x, −π/2 < x < π/2
[Figure: graphs of y = tan x; y = tan x, −π/2 < x < π/2; and y = tan⁻¹ x]

Key Vocabulary
Graph: A graph is a representation of a set of objects where some pairs of
the objects are connected by links.

The inverse cotangent function, denoted by cot⁻¹ x (or arccot x), is defined
to be the inverse of the restricted cotangent function
cot x, 0 < x < π
[Figure: graphs of y = cot x; y = cot x, 0 < x < π; and y = cot⁻¹ x]

The inverse secant function, denoted by sec⁻¹ x (or arcsec x), is defined
to be the inverse of the restricted secant function
sec x, x ∈ [0, π/2) ∪ [π, 3π/2)
or x ∈ [0, π/2) ∪ (π/2, π]
[Figure: graphs of y = sec x; y = sec x, 0 ≤ x < π/2, π ≤ x < 3π/2; and
y = sec⁻¹ x]

The inverse cosecant function, denoted by csc⁻¹ x (or arccsc x), is defined
to be the inverse of the restricted cosecant function
csc x, x ∈ (0, π/2] ∪ (π, 3π/2]
or x ∈ [−π/2, 0) ∪ (0, π/2]

2.2 IDENTIFY THE DOMAINS AND RANGES OF INVERSE TRIGONOMETRY FUNCTIONS

A function that has an inverse has exactly one output (belonging to the
range) for every input (belonging to the domain), and vice versa. To keep
inverse trig functions consistent with this definition, you have to designate
ranges for them that will take care of all the possible input values and not
have any duplication.
The output values of the inverse trig functions are all angles, in either
degrees or radians, and they are the answer to the question, "Which angle
gives me this number?" In general, the output angles for the individual
inverse functions are paired up as angles in Quadrants I and II or angles
in Quadrants I and IV. The quadrants are selected this way for the inverse
trig functions because the pairs are adjacent quadrants, allowing for both
positive and negative entries. The notation for these inverse functions uses
capital letters.

Domain and Range of Inverse Sine Function

The domain for Sin⁻¹ x, or Arcsin x, is from −1 to 1. In mathematical
notation, the domain or input values, the x's, fit into the expression
−1 ≤ x ≤ 1,
because no matter what angle measure you put into the sine function, the
output is restricted to these values. The range, or output, for Sin⁻¹ x is all
angles from −90 to 90 degrees or, in radians, from −π/2 to π/2. If the output
is the angle θ, then you write these expressions as
−90° ≤ θ ≤ 90° or −π/2 ≤ θ ≤ π/2.
The outputs are angles in the adjacent Quadrants I and IV, because the
sine is positive in the first quadrant and negative in the fourth quadrant.
Those angles cover all the possible input values.

Key Vocabulary
Inverse Sine Function: denoted by sin⁻¹ x (or arcsin x), is defined to
be the inverse of the restricted sine function.

Domain and Range of Inverse Cosine Function

The domain for Cos⁻¹ x, or Arccos x, is from −1 to 1, just like the inverse
sine function. So the x (or input) values are −1 ≤ x ≤ 1.
The range for Cos⁻¹ x consists of all angles from 0 to 180 degrees or, in
radians, 0 to π. If the output is the angle θ, then you write these
expressions as 0° ≤ θ ≤ 180° or 0 ≤ θ ≤ π.
The outputs are angles in the adjacent Quadrants I and II, because the
cosine is positive in the first quadrant and negative in the second quadrant.
Those angles cover all the possible input values for the function.

Key Vocabulary
Range: A set of values that a number can have. A range is usually
specified by its maximum and minimum value.

Domain and Range of Inverse Tangent Function

The domain for Tan⁻¹ x, or Arctan x, is all real numbers, from −∞ to ∞.
This is because the output of the tangent function, this function's inverse,
includes all numbers, without any bounds. The range, or output, of
Tan⁻¹ x is angles between −90 and 90 degrees or, in radians, between −π/2
and π/2.
One important note is that the range does not include those beginning
and ending angles; the tangent function is not defined for −90 or 90 degrees.
The range of Tan⁻¹ x includes all the angles in the adjacent Quadrants I and
IV, except for the two angles with terminal sides on the y-axis.

Domain and Range of Inverse Cotangent Function

The domain of Cot⁻¹ x, or Arccot x, is the same as that of the inverse
tangent function. The domain includes all real numbers. The range, though,
is different: it includes all angles between 0 and 180 degrees (between 0
and π).
So any angle in Quadrants I and II is included in the range, except
for those with terminal sides on the x-axis. Those two angles are not in
the domain of the cotangent function, so they are not in the range of the
inverse.

Domain and Range of Inverse Secant Function

The domain of Sec⁻¹ x, or Arcsec x, consists of all the numbers from 1 on up
plus all the numbers from −1 on down. Letting x be the input, you write this
expression as x ≥ 1 or x ≤ −1.
In other words, the domain includes all the numbers from −∞ to ∞
except for the numbers between −1 and 1. The range of Sec⁻¹ x is
all the angles from 0 to 180 degrees except for 90 degrees
(from 0 to π except for π/2), meaning all angles in Quadrants I and II,
with the exception of 90 degrees, or π/2.

Domain and Range of Inverse Cosecant Function

The domain of Csc⁻¹ x, or Arccsc x, is the same as that for the inverse
secant function: all the numbers from 1 on up plus all the numbers from −1
on down. The range is different, though: it includes all angles from −90
to 90 degrees except for 0 degrees or, in radians, from −π/2 to π/2 except
for 0 radians.
In short, the range is all the angles in Quadrants I and IV, with the
exception of 0 degrees, or 0 radians.
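The ranges described in this section can be observed with Python's math module, whose asin, acos and atan functions return principal values in radians. The math module has no inverse secant, cosecant or cotangent, so the helpers arcsec, arccsc and arccot below are assumptions written for this sketch, using sec⁻¹ x = cos⁻¹(1/x), csc⁻¹ x = sin⁻¹(1/x) and cot⁻¹ x = π/2 − tan⁻¹ x.

```python
import math


def arcsec(x):
    """Inverse secant with range [0, pi] except pi/2; needs |x| >= 1."""
    return math.acos(1 / x)


def arccsc(x):
    """Inverse cosecant with range [-pi/2, pi/2] except 0; needs |x| >= 1."""
    return math.asin(1 / x)


def arccot(x):
    """Inverse cotangent with range (0, pi); defined for all real x."""
    return math.pi / 2 - math.atan(x)


# Output angles in degrees at a few sample inputs.
for name, value in [
    ("arcsin(-1)", math.asin(-1)),
    ("arcsin(1)", math.asin(1)),
    ("arccos(-1)", math.acos(-1)),
    ("arctan(-1000)", math.atan(-1000)),
    ("arccot(-1)", arccot(-1)),
    ("arcsec(-2)", arcsec(-2)),
    ("arccsc(-2)", arccsc(-2)),
]:
    print(name, round(math.degrees(value), 3))
```

Negative inputs land in Quadrant IV for the sine-type functions (sin⁻¹, tan⁻¹, csc⁻¹) and in Quadrant II for the cosine-type functions (cos⁻¹, cot⁻¹, sec⁻¹), as described above.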

2.3 GRAPHS OF INVERSE TRIGONOMETRY

Trigonometric functions are all periodic functions. Thus the graphs of none
of them pass the Horizontal Line Test and so are not 1-to-1. This means
none of them have an inverse unless the domain of each is restricted to
make each of them 1-to-1.
Since the graphs are periodic, if we pick an appropriate domain we can
use all values of the range.
If we restrict the domain of f(x) = sin x to [−π/2, π/2], we have made the
function 1-to-1. The range is [−1, 1].
(Although there are many ways to restrict the domain to obtain a 1-to-1
function, this is the agreed-upon interval used.)
We denote the inverse function as y = sin⁻¹ x. It is read "y is the inverse
of sine x" and means y is the real number angle whose sine value is x. Be
careful of the notation used. The superscript "−1" is not an exponent. To
avoid this notation, some books use the notation y = arcsin x instead.
To graph the inverse of the sine function, remember the graph is a
reflection over the line y = x of the sine function.

25

26

Mathematics

Notice that the domain is now the range and the range is now the domain. Because the domain of the sine function was restricted, all positive values of x will yield a 1st quadrant angle and all negative values will yield a 4th quadrant angle.
Similarly, we can restrict the domains of the cosine and tangent functions to make them 1-to-1.

The domain of the inverse cosine function is $[-1, 1]$ and the range is $[0, \pi]$. That means a positive value will yield a 1st quadrant angle and a negative value will yield a 2nd quadrant angle.

The domain of the inverse tangent function is $(-\infty, \infty)$ and the range is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.

The inverse of the tangent function will yield values in the 1st and
4th quadrants.
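The same process is used to find the inverse functions for the remaining trigonometric functions: cotangent, secant and cosecant.

Function        Domain                              Range
$\sin^{-1}x$    $[-1, 1]$                           $[-\pi/2, \pi/2]$
$\cos^{-1}x$    $[-1, 1]$                           $[0, \pi]$
$\tan^{-1}x$    $(-\infty, \infty)$                 $(-\pi/2, \pi/2)$
$\cot^{-1}x$    $(-\infty, \infty)$                 $(0, \pi)$
$\sec^{-1}x$    $(-\infty, -1] \cup [1, \infty)$    $[0, \pi/2) \cup (\pi/2, \pi]$
$\csc^{-1}x$    $(-\infty, -1] \cup [1, \infty)$    $[-\pi/2, 0) \cup (0, \pi/2]$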
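The principal-value ranges for the inverse sine, cosine and tangent can be checked numerically. The following is a minimal sketch using Python's standard `math` module; the helper name `principal_values` is hypothetical and chosen only for this illustration.

```python
import math

def principal_values(x):
    """Return (asin x, acos x, atan x) in radians for -1 <= x <= 1."""
    return math.asin(x), math.acos(x), math.atan(x)

for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    s, c, t = principal_values(x)
    # asin stays in [-pi/2, pi/2], acos in [0, pi], atan in (-pi/2, pi/2)
    assert -math.pi / 2 <= s <= math.pi / 2
    assert 0.0 <= c <= math.pi
    assert -math.pi / 2 < t < math.pi / 2
    print(f"x={x:+.1f}  asin={s:+.4f}  acos={c:.4f}  atan={t:+.4f}")
```

A negative input gives a negative (4th quadrant) angle from `asin` but a 2nd quadrant angle from `acos`, which matches the quadrant remarks in this section.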

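2.4 PROPERTIES OF INVERSE TRIGONOMETRIC FUNCTIONS
Following are the properties of inverse trigonometric functions. Each holds for values of x (and y) for which both sides are defined and lie in the principal-value ranges given above.

(1) (a) $\sin(\sin^{-1} x) = x$
        $\cos(\cos^{-1} x) = x$
        $\tan(\tan^{-1} x) = x$
        $\sec(\sec^{-1} x) = x$
        $\operatorname{cosec}(\operatorname{cosec}^{-1} x) = x$
        $\cot(\cot^{-1} x) = x$
    (b) $\sin^{-1}(\sin x) = x$
        $\cos^{-1}(\cos x) = x$
        $\tan^{-1}(\tan x) = x$
        $\sec^{-1}(\sec x) = x$
        $\operatorname{cosec}^{-1}(\operatorname{cosec} x) = x$
        $\cot^{-1}(\cot x) = x$

(2) $\sin^{-1}\frac{1}{x} = \operatorname{cosec}^{-1} x$
    $\cos^{-1}\frac{1}{x} = \sec^{-1} x$
    $\tan^{-1}\frac{1}{x} = \cot^{-1} x$

(3) $\sin^{-1}(-x) = -\sin^{-1} x$
    $\cos^{-1}(-x) = \pi - \cos^{-1} x$
    $\tan^{-1}(-x) = -\tan^{-1} x$
    $\sec^{-1}(-x) = \pi - \sec^{-1} x$
    $\operatorname{cosec}^{-1}(-x) = -\operatorname{cosec}^{-1} x$
    $\cot^{-1}(-x) = \pi - \cot^{-1} x$

(4) $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$
    $\tan^{-1} x + \cot^{-1} x = \frac{\pi}{2}$
    $\sec^{-1} x + \operatorname{cosec}^{-1} x = \frac{\pi}{2}$

(5) $\sin^{-1} x \pm \sin^{-1} y = \sin^{-1}\left[x\sqrt{1 - y^2} \pm y\sqrt{1 - x^2}\right]$
    $\cos^{-1} x \pm \cos^{-1} y = \cos^{-1}\left[xy \mp \sqrt{1 - x^2}\,\sqrt{1 - y^2}\right]$
    $\tan^{-1} x \pm \tan^{-1} y = \tan^{-1}\frac{x \pm y}{1 \mp xy}$

(6) $2\tan^{-1} x = \sin^{-1}\frac{2x}{1 + x^2} = \cos^{-1}\frac{1 - x^2}{1 + x^2} = \tan^{-1}\frac{2x}{1 - x^2}$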
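Properties (4) and (6) can be spot-checked numerically. The sketch below is only an illustration, not a proof: the function name `check_identities` and the tolerance are assumptions, and the test values are restricted to $0 \le x < 1$, where all three forms in property (6) are defined and agree.

```python
import math

def check_identities(x, tol=1e-9):
    """Numerically compare both sides of properties (4) and (6) for 0 <= x < 1."""
    # Property (4): asin x + acos x = pi/2
    sum_ok = abs(math.asin(x) + math.acos(x) - math.pi / 2) < tol
    # Property (6): 2 atan x equals each of the three right-hand forms
    two_atan = 2 * math.atan(x)
    via_sin = math.asin(2 * x / (1 + x * x))
    via_cos = math.acos((1 - x * x) / (1 + x * x))
    via_tan = math.atan(2 * x / (1 - x * x))
    double_ok = all(abs(two_atan - v) < tol for v in (via_sin, via_cos, via_tan))
    return sum_ok and double_ok

for x in (0.0, 0.25, 0.5, 0.9):
    print(x, check_identities(x))
```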

Proof of Property Number 1):
(a) $\sin(\sin^{-1} x) = x$
Proof:
Let $\sin^{-1} x = y$.
Then $x = \sin y$.
Hence $\sin(\sin^{-1} x) = \sin y = x$.

Similarly $\cos(\cos^{-1} x) = x$, $\tan(\tan^{-1} x) = x$, $\sec(\sec^{-1} x) = x$, $\operatorname{cosec}(\operatorname{cosec}^{-1} x) = x$ and $\cot(\cot^{-1} x) = x$ can also be proved.
(b) $\sin^{-1}(\sin x) = x$
Proof:
Let $\sin x = y$.
Then, for x in the principal-value range, $\sin^{-1} y = x$.
Hence $\sin^{-1}(\sin x) = \sin^{-1} y = x$.

Similarly other parts of (b) can also be proved.
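Proof of Property Number 2):
$\sin^{-1}\frac{1}{x} = \operatorname{cosec}^{-1} x$
Proof:
Let $\operatorname{cosec}^{-1} x = y$.
Hence $x = \operatorname{cosec}\, y$, so $\frac{1}{x} = \sin y$ and $\sin^{-1}\frac{1}{x} = y$.
Hence, $\sin^{-1}\frac{1}{x} = \operatorname{cosec}^{-1} x$.

Similarly other parts can also be proved.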


Proof of Property Number 3):
$\sin^{-1}(-x) = -\sin^{-1} x$
Proof:
Let $\sin^{-1}(-x) = y$.
Hence $-x = \sin y$, or $x = -\sin y = \sin(-y)$.
Hence $\sin^{-1} x = -y$, or $y = -\sin^{-1} x$.

Similarly tan and cosec can also be proved.
$\cos^{-1}(-x) = \pi - \cos^{-1} x$
Proof:
Let $\cos^{-1}(-x) = y$.
Hence $-x = \cos y$, or $x = -\cos y = \cos(\pi - y)$.
Hence $\cos^{-1} x = \pi - y$, so $y = \pi - \cos^{-1} x$.
Hence, $\cos^{-1}(-x) = \pi - \cos^{-1} x$.

Similarly sec and cot can also be proved.


Proof of Property Number 4):
$\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$
Proof:
Let $\sin^{-1} x = y$.
Hence $x = \sin y = \cos\left(\frac{\pi}{2} - y\right)$.
Hence $\cos^{-1} x = \frac{\pi}{2} - y$, or $\cos^{-1} x = \frac{\pi}{2} - \sin^{-1} x$.
Hence, $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$.
Similarly the other 2 parts can also be proved.

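Proof of Property Number 5):
$\sin^{-1} x \pm \sin^{-1} y = \sin^{-1}\left[x\sqrt{1 - y^2} \pm y\sqrt{1 - x^2}\right]$
Proof:
Let $\sin^{-1} x = \alpha$ and $\sin^{-1} y = \beta$.
Hence $x = \sin\alpha$ and $y = \sin\beta$.
Then $\sin(\alpha \pm \beta) = \sin\alpha\cos\beta \pm \cos\alpha\sin\beta$
$= \sin\alpha\sqrt{1 - \sin^2\beta} \pm \sin\beta\sqrt{1 - \sin^2\alpha}$
$= x\sqrt{1 - y^2} \pm y\sqrt{1 - x^2}$
Hence, $\alpha \pm \beta = \sin^{-1}\left[x\sqrt{1 - y^2} \pm y\sqrt{1 - x^2}\right]$.
Thus, $\sin^{-1} x \pm \sin^{-1} y = \sin^{-1}\left[x\sqrt{1 - y^2} \pm y\sqrt{1 - x^2}\right]$.
Similarly, the other two parts can also be proved.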
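Proof of Property Number 6):
$2\tan^{-1} x = \sin^{-1}\frac{2x}{1 + x^2} = \cos^{-1}\frac{1 - x^2}{1 + x^2} = \tan^{-1}\frac{2x}{1 - x^2}$
Proof:
Let $\tan^{-1} x = y$; then $x = \tan y$.
Now we know that
$\sin 2y = \frac{2\tan y}{1 + \tan^2 y}$, so $2y = \sin^{-1}\frac{2\tan y}{1 + \tan^2 y}$,
$\cos 2y = \frac{1 - \tan^2 y}{1 + \tan^2 y}$, so $2y = \cos^{-1}\frac{1 - \tan^2 y}{1 + \tan^2 y}$,
and $\tan 2y = \frac{2\tan y}{1 - \tan^2 y}$, so $2y = \tan^{-1}\frac{2\tan y}{1 - \tan^2 y}$.
Hence $2y = \sin^{-1}\frac{2\tan y}{1 + \tan^2 y} = \cos^{-1}\frac{1 - \tan^2 y}{1 + \tan^2 y} = \tan^{-1}\frac{2\tan y}{1 - \tan^2 y}$.
Hence, $2\tan^{-1} x = \sin^{-1}\frac{2x}{1 + x^2} = \cos^{-1}\frac{1 - x^2}{1 + x^2} = \tan^{-1}\frac{2x}{1 - x^2}$.

Example 1: Evaluate:
$\cos\left[\sin^{-1}\frac{3}{5} + \sin^{-1}\frac{5}{13}\right]$
Solution: Let $\sin^{-1}\frac{3}{5} = \alpha$ and $\sin^{-1}\frac{5}{13} = \beta$; then
$\sin\alpha = \frac{3}{5}$ and $\sin\beta = \frac{5}{13}$,
$\cos\alpha = \frac{4}{5}$ and $\cos\beta = \frac{12}{13}$.
The given expression becomes $\cos[\alpha + \beta]$
$= \cos\alpha\cos\beta - \sin\alpha\sin\beta$
$= \frac{4}{5}\cdot\frac{12}{13} - \frac{3}{5}\cdot\frac{5}{13} = \frac{33}{65}$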
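Example 2: Prove that:
$\cos^{-1}\frac{4}{5} + \cos^{-1}\frac{12}{13} = \cos^{-1}\frac{33}{65}$

Solution: Applying the property $\cos^{-1} x + \cos^{-1} y = \cos^{-1}\left[xy - \sqrt{1 - x^2}\,\sqrt{1 - y^2}\right]$ with $x = \frac{4}{5}$ and $y = \frac{12}{13}$:
$\cos^{-1}\frac{4}{5} + \cos^{-1}\frac{12}{13} = \cos^{-1}\left[\frac{4}{5}\cdot\frac{12}{13} - \sqrt{1 - \frac{16}{25}}\,\sqrt{1 - \frac{144}{169}}\right]$
$= \cos^{-1}\left[\frac{48}{65} - \frac{3}{5}\cdot\frac{5}{13}\right]$
$= \cos^{-1}\frac{33}{65}$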
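The values obtained in Examples 1 and 2 can be confirmed numerically. This is a small sketch using only Python's standard `math` module; the variable names are chosen here for illustration.

```python
import math

# Example 1: cos[asin(3/5) + asin(5/13)] should equal 33/65
example1 = math.cos(math.asin(3 / 5) + math.asin(5 / 13))

# Example 2: acos(4/5) + acos(12/13) should equal acos(33/65)
example2_lhs = math.acos(4 / 5) + math.acos(12 / 13)
example2_rhs = math.acos(33 / 65)

print(math.isclose(example1, 33 / 65, rel_tol=1e-9))
print(math.isclose(example2_lhs, example2_rhs, rel_tol=1e-9))
```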

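Example 3: Prove that
$\tan^{-1}\sqrt{x} = \frac{1}{2}\cos^{-1}\frac{1 - x}{1 + x}$

Solution: Let $x = \tan^2\theta$; then
L.H.S. $= \tan^{-1}(\tan\theta) = \theta$ and
R.H.S. $= \frac{1}{2}\cos^{-1}\frac{1 - \tan^2\theta}{1 + \tan^2\theta} = \frac{1}{2}\cos^{-1}(\cos 2\theta) = \frac{1}{2}\cdot 2\theta = \theta$
L.H.S. = R.H.S.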

2.5 MULTIPLE CHOICE QUESTIONS


1. Use a right triangle to write the expression as an algebraic expression. Assume that x is positive and in the domain of the given inverse trigonometric function: $\cos(\sin^{-1} x)$
(a) 1 x 2

(b)

x2 + 1

x 2 + 1

(d)

x2 + 1

(c)
2.

Find the exact value of the expression.


sin 1

3
2


3

4

(a).

(c). 3
3.

(a).

6.

7.

6
)]
7
(b).

7

6

(d).

7
6

If $\pi \le x \le 2\pi$, then $\cos^{-1}(\cos x)$ is equal to

(a) $x$

(b) $2\pi + x$

(c) $-x$

(d) $2\pi - x$

The value of sin(cot )tan(cos ) x is equal to


-1

-1

(a).x

(b) 1

(c).


2
The number log 72 is..

(d). None of these

(a). an integer

(b) an irrational number

(c).a rational number

(d). a prime number

The domain of the inverse tangent function is .


(a).[1, 1]
(c).(, )

8.

9.

(d). 4

6

7

(c).

5.

Find the exact value of the expression, if possible.


sin 1 [sin(

4.

(b)


(b) ,
2 2
(d). 0,

cot-1(-3)= .
(a). - p /

(b). p / 3

(c). 5p / 6

(d). 2p / 3

tan(cos-1)x is equal to.


(a). (1 - x) / x

(b). (1 + x) / x

(c). x / (1 + x)

(d). (1 - x)

10. If ax = by = abxy, then x + y =


(a). 0

(b). xy

(c). 1

(d). None of these

2.6REVIEW QUESTIONS
1. Evaluate the following without using a calculator.


1
(a) arcsin

2
(b) sin 1 (1)

(c) arcsin ( 1)

2. Evaluate the following without using a calculator.


(a) arcsin ( 2 )

5
2
2
(b) If sin =
, then sin 1
?
2
2
4

3. What is the domain of arccos(x)?



4. What is the range of arccos(x)?
5. Graph arccos(x):
6. Evaluate:
2
1
arccos
, arccos ,arccos (0 ).
2
2

7. Evaluate

1
arctan (1) , tan 1
,a nd arctan (0 ).
3

8. Find the domain and range of arctan(x):


9. Evaluate the following

(a) cos arccos (.7 )


(b) tan 1 tan
8

(c) sin arctan (1)

5
(d) cos 1 cos
4

10. If 0 < x < 1, find sin(arccos(x)) and tan(arccos(x)).

ANSWERS FOR MULTIPLE CHOICE QUESTIONS


1. (a)

2.(c)

3.(c)

4.(d)

5.(a)

6. (b)

7.(c)

8.(c)

9.(a)

10.(c)

Chapter 3

Matrices

Objectives
After studying this
chapter, you will be
able to:
Explain the
sum of matrices
Explain the
product and
inverse of a
matrix
Describe the
adjoint of a
matrix

INTRODUCTION

A matrix is a collection of numbers ordered by rows and columns. It is customary to enclose the elements of a matrix in parentheses, brackets, or braces. For example, the following is a matrix:

$X = \begin{bmatrix} 5 & 8 & 2 \\ 1 & 0 & 7 \end{bmatrix}$

This matrix has two rows and three columns, so it is referred to as a "2 by 3" matrix. The elements of a matrix are numbered in the following way:

$X = \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{bmatrix}$

That is, the first subscript in a matrix refers to the row and the second
subscript refers to the column. It is important to remember this convention
when matrix algebra is performed.
A vector is a special type of matrix that has only one row (called a row vector) or one column (called a column vector). Below, a is a column vector while b is a row vector.

$a = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad b = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$

A scalar is a matrix with only one row and one column. It is customary to denote scalars by italicized, lower case letters (e.g., x), to denote vectors by bold, lower case letters (e.g., x), and to denote matrices with more than one row and one column by bold, upper case letters (e.g., X).
A square matrix has as many rows as it has columns. Matrix A is square but matrix B is not square:

$A = \begin{bmatrix} 1 & 4 \\ 3 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 4 \\ 0 & 3 \\ 7 & 2 \end{bmatrix}$

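A symmetric matrix is a square matrix in which $x_{ij} = x_{ji}$ for all i and j. Matrix A is symmetric; matrix B is not symmetric.

$A = \begin{bmatrix} 9 & 3 & 4 \\ 3 & 5 & 2 \\ 4 & 2 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 9 & 3 & 4 \\ 6 & 5 & 2 \\ 5 & 2 & 6 \end{bmatrix}$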

A diagonal matrix is a symmetric matrix where all the off-diagonal elements are 0. Matrix A is diagonal.

$A = \begin{bmatrix} 9 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 6 \end{bmatrix}$

An identity matrix is a diagonal matrix with 1s and only 1s on the diagonal. The identity matrix is almost always denoted as I.

$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Key Vocabulary
Column Vector: It (column matrix) is an m × 1 matrix, i.e. a matrix consisting of a single column of m elements.

Real and Complex Matrices


As in the case of vectors, the components of a matrix may be real or complex. If they are all real numbers, the matrix is called real; otherwise it is called complex. For the present exposition all matrices will be real.

Square Matrices
The case m = n is important in practical applications. Such matrices are called square matrices of order n. Matrices for which m ≠ n are called non-square. Square matrices enjoy certain properties not shared by non-square matrices, such as the symmetry and antisymmetry conditions defined below. Furthermore many operations, such as taking determinants and computing eigenvalues, are only defined for square matrices.

$C = \begin{bmatrix} 12 & 6 & 3 \\ 8 & 24 & 7 \\ 2 & 5 & 11 \end{bmatrix}$

Consider a square matrix $A = [a_{ij}]$ of order n × n. Its n components $a_{ii}$ form the main diagonal, which runs from top left to bottom right. The cross diagonal runs from the bottom left to upper right. The main diagonal of the example matrix is {12, 24, 11} and the cross diagonal is {2, 24, 3}. Entries that run parallel to and above (below) the main diagonal form super diagonals (sub diagonals). For example, {6, 7} is the first super diagonal of the example matrix.


3.1 SUM OF THE MATRIX


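If A and B are matrices of the same size, their sum A + B is the matrix formed by adding corresponding entries. If $A = [a_{ij}]$ and $B = [b_{ij}]$, this takes the form

$A + B = [a_{ij} + b_{ij}]$

Note that addition is not defined for matrices of different sizes.

Key Vocabulary
Row Vector: It is a 1 × m matrix, that is, a matrix consisting of a single row.

Example 1:
If $A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 2 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 0 & 6 \end{bmatrix}$, compute A + B.
Solution:
$A + B = \begin{bmatrix} 2 + 1 & 1 + 1 & 3 + 1 \\ 1 + 2 & 2 + 0 & 0 + 6 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 4 \\ 3 & 2 & 6 \end{bmatrix}$

Example 2:
Find a, b, and c if $\begin{bmatrix} a & b & c \end{bmatrix} + \begin{bmatrix} c & a & b \end{bmatrix} = \begin{bmatrix} 3 & 2 & -1 \end{bmatrix}$.
Solution: Add the matrices on the left side to obtain
$\begin{bmatrix} a + c & b + a & c + b \end{bmatrix} = \begin{bmatrix} 3 & 2 & -1 \end{bmatrix}$
Because corresponding entries must be equal, this gives three equations: a + c = 3, b + a = 2, and c + b = -1. Solving these yields a = 3, b = -1, c = 0.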
If A, B, and C are any matrices of the same size, then
A + B = B + A
A + (B + C) = (A + B) + C
In fact, if $A = [a_{ij}]$ and $B = [b_{ij}]$, then the (i, j)-entries of A + B and B + A are, respectively, $a_{ij} + b_{ij}$ and $b_{ij} + a_{ij}$. Since these are equal for all i and j, we get

$A + B = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A$

The associative law is verified similarly. The m × n matrix in which every entry is zero is called the zero matrix and is denoted as 0 (or $0_{mn}$ if it is important to emphasize the size). Hence,
0 + X = X
holds for all m × n matrices X. The negative of an m × n matrix A is defined to be the m × n matrix obtained by multiplying each entry of A by -1. If $A = [a_{ij}]$, this becomes $-A = [-a_{ij}]$. Hence,
A + (-A) = 0
holds for all matrices A where, of course, 0 is the zero matrix of the same size as A. A closely related notion is that of subtracting matrices. If A and B are two m × n matrices, their difference A - B is defined by A - B = A + (-B).
Note that if $A = [a_{ij}]$ and $B = [b_{ij}]$, then
$A - B = [a_{ij}] + [-b_{ij}] = [a_{ij} - b_{ij}]$
is the m × n matrix formed by subtracting corresponding entries.
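Entry-by-entry addition, negation and subtraction can be sketched in a few lines of Python. The function names `mat_add`, `mat_neg` and `mat_sub` are hypothetical helpers introduced only for this illustration; matrices are represented as lists of rows.

```python
def mat_add(A, B):
    """Add two matrices of the same size, entry by entry."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("addition is only defined for matrices of the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_neg(A):
    """Return -A, obtained by multiplying every entry by -1."""
    return [[-a for a in row] for row in A]

def mat_sub(A, B):
    """Return A - B, defined as A + (-B)."""
    return mat_add(A, mat_neg(B))

# The row matrices from Example 2 with a = 3, b = -1, c = 0
a, b, c = 3, -1, 0
print(mat_add([[a, b, c]], [[c, a, b]]))
```

Attempting `mat_add([[1, 2]], [[1], [2]])` raises a `ValueError`, reflecting the remark that addition is not defined for matrices of different sizes.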


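Example 3:
$A = \begin{bmatrix} 3 & -1 & 0 \\ 1 & 2 & -4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 & 1 \\ -2 & 0 & 6 \end{bmatrix}$, and $C = \begin{bmatrix} 1 & 0 & -2 \\ 3 & 1 & 1 \end{bmatrix}$.
Compute -A, A - B and A + B - C.
Solution:

$-A = \begin{bmatrix} -3 & 1 & 0 \\ -1 & -2 & 4 \end{bmatrix}$

$A - B = \begin{bmatrix} 3 - 1 & -1 - (-1) & 0 - 1 \\ 1 - (-2) & 2 - 0 & -4 - 6 \end{bmatrix} = \begin{bmatrix} 2 & 0 & -1 \\ 3 & 2 & -10 \end{bmatrix}$

$A + B - C = \begin{bmatrix} 3 + 1 - 1 & -1 - 1 - 0 & 0 + 1 + 2 \\ 1 - 2 - 3 & 2 + 0 - 1 & -4 + 6 - 1 \end{bmatrix} = \begin{bmatrix} 3 & -2 & 3 \\ -4 & 1 & 1 \end{bmatrix}$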

3.2 MATRIX ALGEBRA DETERMINATION


Key Vocabulary
Scalars Vector: In linear algebra, real numbers are called scalars and relate to vectors in a vector space through the operation of scalar multiplication.

Key Vocabulary
Invertible: In linear algebra an n-by-n (square) matrix A is called invertible.

3.2.1 Matrix Multiplication


If A is an m × n matrix and B is an n × p matrix, then their product C = AB is an m × p matrix. We use $c_{ij}$ to denote the entry in row i and column j of matrix C.

Standard (Row Times Column)
The standard way of describing a matrix product is to say that $c_{ij}$ equals the dot product of row i of matrix A and column j of matrix B. In other words,

$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$
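The row-times-column rule translates directly into code. The following is a minimal sketch in plain Python; `mat_mul` is a hypothetical helper name, and matrices are assumed to be given as lists of rows.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B using c_ij = sum_k a_ik * b_kj."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(len(A))
    ]

# A 3 x 1 column times a 1 x 2 row gives a 3 x 2 matrix
print(mat_mul([[2], [3], [4]], [[1, 6]]))  # [[2, 12], [3, 18], [4, 24]]
```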

Columns
The product of matrix A and column j of matrix B equals column j of matrix C. This tells us that the columns of C are combinations of columns of
A.

Rows
The product of row i of matrix A and matrix B equals row i of matrix C. So
the rows of C are combinations of rows of B.

Column Times Row
A column of A is an m × 1 vector and a row of B is a 1 × p vector. Their product is a matrix:

$\begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} \begin{bmatrix} 1 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 12 \\ 3 & 18 \\ 4 & 24 \end{bmatrix}$

The columns of this matrix are multiples of the column of A and the rows are multiples of the row of B. If we think of the entries in these rows as the coordinates (2, 12) or (3, 18) or (4, 24), all these points lie on the same line; similarly for the two column vectors. We will see that this is equivalent to saying that the row space of this matrix is a single line, as is the column space. The product of A and B is the sum of these "column times row" matrices:

$AB = \sum_{k=1}^{n} \begin{bmatrix} a_{1k} \\ \vdots \\ a_{mk} \end{bmatrix} \begin{bmatrix} b_{k1} & \cdots & b_{kp} \end{bmatrix}$

Key Vocabulary
Identity Matrix: Identity matrix or unit matrix of size n is the n × n square matrix with ones on the main diagonal and zeros elsewhere.

Blocks
Blocks
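If we subdivide A and B into blocks that match properly, we can write the product AB = C in terms of products of the blocks:

$\begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix} \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix} = \begin{bmatrix} C_1 & C_2 \\ C_3 & C_4 \end{bmatrix}$

Here $C_1 = A_1B_1 + A_2B_3$.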

3.2.2 Inverses of Matrices


If A is a square matrix, the most important question you can ask about it is whether it has an inverse $A^{-1}$. If it does, then $A^{-1}A = I = AA^{-1}$ and we say that A is invertible or nonsingular. If A is singular, that is, A does not have an inverse, its determinant is zero and we can find some non-zero vector x for which Ax = 0. For example:

$A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$

In this example, three times the first column minus one times the second column equals the zero vector; the two column vectors lie on the same line. Finding the inverse of a matrix is closely related to solving systems of linear equations. For the invertible matrix with rows (1, 3) and (2, 7), the equation

$\underbrace{\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} a & c \\ b & d \end{bmatrix}}_{A^{-1}} = \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{I}$

can be read as saying "A times column j of $A^{-1}$ equals column j of the identity matrix". This is just a special form of the equation Ax = b.

Gauss-Jordan Elimination
We can use the method of elimination to solve two or more linear equations at the same time. Just augment the matrix with the whole identity matrix I:

$\left[\begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array}\right] \rightarrow \left[\begin{array}{cc|cc} 1 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array}\right]$

(Once we have used the Gauss elimination method to convert the original matrix to upper triangular form, we go on to use Jordan's idea of eliminating entries in the upper right portion of the matrix.)

$A^{-1} = \begin{bmatrix} 7 & -3 \\ -2 & 1 \end{bmatrix}$

We can write the results of the elimination method as the product of a number of elimination matrices $E_{ij}$ with the matrix A. Letting E be the product of all the $E_{ij}$, we write the result of this Gauss-Jordan elimination using block matrices: $E[\,A \mid I\,] = [\,I \mid E\,]$. But if EA = I, then $E = A^{-1}$.
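The Gauss-Jordan procedure can be sketched as a short routine that augments a square matrix with the identity and row-reduces. This is an illustrative sketch, not a production solver: the function name `gauss_jordan_inverse` is hypothetical, exact rational arithmetic from the standard `fractions` module is assumed so that no rounding occurs, and the two test matrices (rows (1, 3), (2, 7) and rows (1, 3), (2, 6)) are chosen as one invertible and one singular case.

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Return the inverse of square matrix A, or None if A is singular."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(A)
    ]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot in this column: A is singular
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        # Eliminate this column from every other row (below and above).
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    # The right-hand block now holds the inverse.
    return [row[n:] for row in aug]

print(gauss_jordan_inverse([[1, 3], [2, 7]]))
print(gauss_jordan_inverse([[1, 3], [2, 6]]))
```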

3.2.3 Matrix Transpose


The transpose of a matrix is denoted by a prime ($A'$) or a superscript t or T ($A^t$ or $A^T$). The first row of a matrix becomes the first column of the transpose matrix; the second row of the matrix becomes the second column of the transpose, etc. Thus,

$A = \begin{bmatrix} 2 & 7 & 1 \\ 8 & 6 & 4 \end{bmatrix}$, and $A^t = \begin{bmatrix} 2 & 8 \\ 7 & 6 \\ 1 & 4 \end{bmatrix}$

The transpose of a row vector will be a column vector, and the transpose of a column vector will be a row vector. The transpose of a symmetric matrix is simply the original matrix.

3.2.4 Determinant of a Matrix


The determinant of a matrix is a scalar and is denoted as |A| or det(A). The determinant has very important mathematical properties, but it is very difficult to provide a substantive definition. For covariance and correlation matrices, the determinant is a number that is sometimes used to express the "generalized variance" of the matrix. That is, covariance matrices with small determinants denote variables that are redundant or highly correlated. Matrices with large determinants denote variables that are independent of one another. The determinant has several very important properties for some multivariate statistics; for example, the change in $R^2$ in multiple regression can be expressed as a ratio of determinants.

3.2.5 Using the Inverse Matrix to Solve Equations


One of the most important applications of matrices is to the solution of linear simultaneous equations.
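Writing simultaneous equations in matrix form. Consider the simultaneous equations
x + 2y = 4
3x - 5y = 1
These can be written in matrix form as

$\begin{bmatrix} 1 & 2 \\ 3 & -5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$

Writing

$A = \begin{bmatrix} 1 & 2 \\ 3 & -5 \end{bmatrix}$, $X = \begin{bmatrix} x \\ y \end{bmatrix}$ and $B = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$

we have

AX = B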

This is the matrix form of the simultaneous equations. Here the unknown is the matrix X, since A and B are already known. A is called
the matrix of coefficients.

Solving the simultaneous equations

Given AX = B
We can multiply both sides by the inverse of A, provided this exists, to give
$A^{-1}AX = A^{-1}B$
But $A^{-1}A = I$, the identity matrix. Furthermore, IX = X, because multiplying any matrix by an identity matrix of the appropriate size leaves the matrix unaltered. So
$X = A^{-1}B$
If AX = B, then $X = A^{-1}B$.
This result gives us a method for solving simultaneous equations. All we need do is write them in matrix form, calculate the inverse of the matrix of coefficients, and finally perform a matrix multiplication.

Example: 1

1 0 1 1 1

1 1 0 1 1
0 0 1 1 1

0 0 0 1 1
0 0 0 1 1
using positioning of
Find the inverse of the matrix
the matrix do the position in two different ways.


Solution:

1 0 1 1 1

1 1 0 1 1
0 0 1 1 1

0 0 0 1 1
0 0 0 1 1
1 way of partitioning partition P show that
Let P=
on the diagonals there are 3 3 and 2 2 matrices.

A B

P = 0 C

1 1

1 1
Where A= 1 1

1 1
1 1

1
1

1 1

, B=
,
C
=

1 1

1 0 1

A = 1 1 1
0 0 1

1 0 1 1 1
0
1

1 1
A BC = 1 1 1 1 1
1
1 1 =
2

0
0 0 1 1 1

1
0
1
P =
0

2 nd way of partitioning.

1
1

0 1 0
0

1 1 1 0
0 1 0 1

1
1
0 0
2
2

1 1
0 0

2
2

Partition P so that on the main diagonal there are 2 2 and 3 3 matrices respectively. Then

X Y
P=

o W,

1 1 1
1 0
1 1 1

Where X =
,Y =
, W = 0 1 1 then

1
1
0
1
1

0 1 1


X 1
P 1 =
O

X 1YW 1
1
To obtain the W we can partition it.
1
W

1 0 1 0
0

1 1 1 1 0
0 0 1 0 1
1

P =
1
0 0 0 1
1 1 1

2
2

0 1 1

1
1
0 1 1
0 0 0

Then we obtain

2
2
As W=

which is the same as before.


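Example 2: Solve the simultaneous equations
x + 2y = 4
3x - 5y = 1
Solution: We have already seen these equations in matrix form:

$\begin{bmatrix} 1 & 2 \\ 3 & -5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$

We need to calculate the inverse of $A = \begin{bmatrix} 1 & 2 \\ 3 & -5 \end{bmatrix}$:

$A^{-1} = \frac{1}{(1)(-5) - (2)(3)} \begin{bmatrix} -5 & -2 \\ -3 & 1 \end{bmatrix} = -\frac{1}{11} \begin{bmatrix} -5 & -2 \\ -3 & 1 \end{bmatrix}$

Then X is given by

$X = A^{-1}B = -\frac{1}{11} \begin{bmatrix} -5 & -2 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} = -\frac{1}{11} \begin{bmatrix} -22 \\ -11 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$

Hence x = 2, y = 1 is the solution of the simultaneous equations.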
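Example 3: Find the inverse of the matrix $A = \begin{bmatrix} 3 & 1 \\ 4 & 2 \end{bmatrix}$
Solution: Using the formula

$A^{-1} = \frac{1}{(3)(2) - (1)(4)} \begin{bmatrix} 2 & -1 \\ -4 & 3 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 2 & -1 \\ -4 & 3 \end{bmatrix}$

This could be written as

$A^{-1} = \begin{bmatrix} 1 & -\frac{1}{2} \\ -2 & \frac{3}{2} \end{bmatrix}$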
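Example 4: Find the inverse of the matrix $A = \begin{bmatrix} 2 & 4 \\ -3 & 1 \end{bmatrix}$
Solution:
Using the formula

$A^{-1} = \frac{1}{(2)(1) - (4)(-3)} \begin{bmatrix} 1 & -4 \\ 3 & 2 \end{bmatrix} = \frac{1}{14} \begin{bmatrix} 1 & -4 \\ 3 & 2 \end{bmatrix}$

This can be written

$A^{-1} = \begin{bmatrix} \frac{1}{14} & -\frac{2}{7} \\ \frac{3}{14} & \frac{1}{7} \end{bmatrix}$

although it is quite permissible to leave the factor 1/14 at the front of the matrix.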
Example 5: Find the inverse of the matrix $A = \begin{bmatrix} 3 & 2 \\ 6 & 4 \end{bmatrix}$

Solution: In this case the determinant of the matrix is zero:

$\begin{vmatrix} 3 & 2 \\ 6 & 4 \end{vmatrix} = 3 \times 4 - 2 \times 6 = 0$

Because the determinant is zero the matrix is singular and no inverse exists.
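The 2 × 2 inverse formula used in Examples 2 to 5 can be sketched as follows. The helper names `inverse_2x2` and `solve_2x2` are hypothetical, and exact fractions from the standard library are assumed so that results such as 1/14 are not rounded.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Return the inverse of [[a, b], [c, d]], or None when the determinant is zero."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None  # singular matrix: no inverse exists
    # Swap the diagonal entries, negate the off-diagonal ones, divide by det.
    return [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]

def solve_2x2(A, rhs):
    """Solve AX = B for a 2 x 2 system via X = A^{-1} B."""
    inv = inverse_2x2(A)
    if inv is None:
        raise ValueError("matrix of coefficients is singular")
    return [
        inv[0][0] * rhs[0] + inv[0][1] * rhs[1],
        inv[1][0] * rhs[0] + inv[1][1] * rhs[1],
    ]

print(solve_2x2([[1, 2], [3, -5]], [4, 1]))   # x = 2, y = 1
print(inverse_2x2([[3, 2], [6, 4]]))          # None, determinant is zero
```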

3.3 ADJOINT OF A MATRIX


In linear algebra, the adjoint matrix of a square matrix is something similar to the inverse of a square matrix:
When A is invertible, adj(A) and $A^{-1}$ differ only by a scalar factor, since $\operatorname{adj}(A) = \det(A)\,A^{-1}$. However, the adjoint matrix can be defined for any square matrix, not necessarily invertible.
Here is a list of properties of adjoint matrices:
1. adj(I) = I
2. adj(AB) = adj(B)adj(A), where A and B are square matrices of the same size.
3. $\operatorname{adj}(A^T) = (\operatorname{adj}(A))^T$
4. If A is an n × n nonsingular matrix, then $\det(\operatorname{adj}(A)) = (\det(A))^{n-1}$
5. If A is an n × n matrix and k is a scalar, then $\operatorname{adj}(kA) = k^{n-1}\operatorname{adj}(A)$
6. If A is an n × n matrix with n > 2, then $\operatorname{adj}(\operatorname{adj}(A)) = (\det(A))^{n-2} A$
7. If A is invertible, then $\operatorname{adj}(A^{-1}) = (\operatorname{adj}(A))^{-1} = \frac{1}{\det(A)} A$
8. If A is a symmetric matrix, then adj(A) is also symmetric.

Note that in this module, we do not define the adjoint of a 1 × 1 matrix. However, an interesting question would be:
How should the adjoint of a 1 × 1 matrix be defined so that all the properties above are still satisfied?

3.3.1 Finding the Adjoint Matrix


The adjoint of a matrix A is found in stages:
(1) Find the transpose of A, which is denoted by AT . The transpose is
found by interchanging the rows and columns of A. So, for example,
the first column of A is the first row of the transposed matrix; the
second column of A is the second row of the transposed matrix, and
so on.
(2) The minor of any element is found by covering up the elements in
its row and column and finding the determinant of the remaining
matrix. By replacing each element of AT by its minor, we can write
down a matrix of minors of AT .
(3) The cofactor of any element is found by taking its minor and imposing a place sign according to the following rule

$\begin{bmatrix} + & - & + & \cdots \\ - & + & - & \cdots \\ + & - & + & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$

This means, for example, that to find the cofactor of an element in the first row, second column, the sign of the minor is changed. On the other hand to find the cofactor of an element in the second row, second column, the sign of the minor is unaltered. This is equivalent to multiplying the minor by +1 or -1 depending upon its position. In this way we can form a matrix of cofactors of $A^T$. This matrix is called the adjoint of A, denoted adj A.
The matrix of cofactors of the transpose of A is called the adjoint matrix, adj A.
Example 1: Find the adjoint, and hence the inverse, of $A = \begin{bmatrix} 1 & -2 & 0 \\ 3 & 1 & 5 \\ -1 & 2 & 3 \end{bmatrix}$

Solution: Follow the stages outlined above. First find the transpose of A by taking the first column of A to be the first row of $A^T$, and so on:

$A^T = \begin{bmatrix} 1 & 3 & -1 \\ -2 & 1 & 2 \\ 0 & 5 & 3 \end{bmatrix}$

Now find the minor of each element in $A^T$. The minor of the element 1 in the first row, first column, is obtained by covering up the elements in its row and column to give $\begin{bmatrix} 1 & 2 \\ 5 & 3 \end{bmatrix}$ and finding the determinant of this, which is -7. The minor of the element 3 in the second column of the first row is found by covering up elements in its row and column to give $\begin{bmatrix} -2 & 2 \\ 0 & 3 \end{bmatrix}$, which has determinant -6. We continue in this fashion and form a new matrix by replacing every element of $A^T$ by its minor. Check for yourself that this process gives

Matrix of minors of $A^T = \begin{bmatrix} -7 & -6 & -10 \\ 14 & 3 & 5 \\ 7 & 0 & 7 \end{bmatrix}$

Then impose the place sign. This results in the matrix of cofactors, that is, the adjoint of A.

$\operatorname{adj} A = \begin{bmatrix} -7 & 6 & -10 \\ -14 & 3 & -5 \\ 7 & 0 & 7 \end{bmatrix}$

Notice that to complete this last stage, each element in the matrix of minors has been multiplied by 1 or -1 according to its position.
It is a straightforward matter to show that the determinant of A is 21.
Finally

$A^{-1} = \frac{\operatorname{adj} A}{|A|} = \frac{1}{21} \begin{bmatrix} -7 & 6 & -10 \\ -14 & 3 & -5 \\ 7 & 0 & 7 \end{bmatrix}$
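The staged procedure for the adjoint (transpose, minors, place signs) can be sketched compactly in Python. This is an illustrative sketch only: the function names `minor`, `det` and `adjoint` are hypothetical, the determinant is computed by cofactor expansion along the first row, and the test matrix has rows (1, -2, 0), (3, 1, 5), (-1, 2, 3).

```python
def minor(M, i, j):
    """Determinant of M with row i and column j deleted (0-based indices)."""
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]
    return det(sub)

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * minor(M, 0, j) for j in range(len(M)))

def adjoint(M):
    """Entry (i, j) of adj M is the cofactor of element (j, i) of M,
    i.e. the matrix of cofactors of the transpose."""
    n = len(M)
    return [[(-1) ** (i + j) * minor(M, j, i) for j in range(n)] for i in range(n)]

A = [[1, -2, 0], [3, 1, 5], [-1, 2, 3]]
print(det(A))      # 21
print(adjoint(A))  # [[-7, 6, -10], [-14, 3, -5], [7, 0, 7]]
```

Dividing every entry of `adjoint(A)` by `det(A)` gives the inverse, provided the determinant is not zero.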

3.4MULTIPLE CHOICE QUESTIONS


1.

0 1
1 1
If A =
and B =
which of the following is false?
2 3
5 2
(a) 3A 4B = 4

14 1

0 1
A2 =

(c)
4 9
2.

4 7
(b) 3A 4B =

14 1

0 1
A2 =

(d)
4 9
If A and B are n × n matrices, which of the following does not equal $(A + B)^2$?

3.

4.

5.

6.

7.

(a) (B + A)2

(b) A 2 + 2AB + B2

(c) (A + B)A + (A + B)B

(d) A 2 + AB + BA + B2

If A, B, and C are n n matrices, which of the following equalities is


invalid?
(a) (A + A) = 2A

(b) (ABC) = CBA

(c) (A + A + 2B) = 2B + 2A

(d) ((AB)2) = (B)2(A)2

If a = (3, 4, 0) and b = (0, 2, 3), then a b a is equal to:


(a) 3

(b) 0

(c) 3

(d) 2

The straight line in $R^3$ through the point (1, 3, 3) pointing in the direction of the vector (1, 2, 3) hits the $x_1x_2$-plane at the point:

(a) Never

(b) (2, 1, 0)

(c) (1, 3, 0)

(d)(2, 1, 0)

The plane in R3 through the point (1, 3, 3), which is orthogonal to the
vector (1, 2, 3), has the equation:
(a) x + 3y + 3z = 14

(b) x + 2y + 3z = 11

(c) x 2y 3z = 16

(d) x + 2y + 3z = 14

For which values of t does the following linear equation system have
infinitely many solutions?
tx +
y=1
6x + (t + 1)y = 3
(a) The system does not have infinitely many solutions for any value of
t
(b) t = 3
(c) t = 2
(d) t = 2 and t = 3

8.

If A, B and C are matrices with orders 3 × 3, 2 × 3 and 4 × 2 respectively, how many of the following matrix calculations are possible?
$4B$, $A + B$, $3B^T + C$, $AB$, $B^TA$, $(CB)^T$, $CBA$

9.

(a) 2

(b) 1

(c) 0

(d) 3

Find the determinant of the matrix


5 2 3

4 1 5
6 7 9
(a) 14

(b) 364

(c) 100

(d) 340


10. Find the cofactor, A23, of the matrix


5 2 7

6 1 9
4 3 8
(a) 0

(b) 23

(c) 7

(d) -23

3.5 REVIEW QUESTIONS


1. What are real and complex matrices?
2. What are square matrices?
3. Explain the sum of matrices with examples.
4. Explain matrix multiplication with examples.
5. What is an inverse matrix? Explain with an example.
6. Describe the Gauss-Jordan elimination method.
7. Explain the adjoint of a matrix with an example.
8. How is the adjoint matrix found?
9. Find the adjoint and the inverse of $A = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 3 \\ 4 & 2 & 2 \end{bmatrix}$.
10. Find, if possible, the inverse of the matrix $A = \begin{bmatrix} 4 & 6 \\ 2 & 3 \end{bmatrix}$.

ANSWER FOR MULTIPLE CHOICE QUESTIONS


1. (d)

2.(b)

3.(d)

4.(c)

5.(d)

6. (d)

7.(c)

8.(d)

9.(b)

10.(c)

Chapter 4

Determinant of a Matrix

Objectives
After studying this
chapter, you will be
able to:
Discuss the
determinant of
a square matrix
Explain the
properties of
determinants
Understand the
minors
Describe the
cofactors
Discuss the
applications of
matrices and
determinants
Explain the
adjoint of a
matrix
Understand the
expanding to
find the determinant
Discuss the inverse of a
square matrix
Understand consistent and
inconsistent
systems
Discuss the
system of linear
equations

INTRODUCTION

We need some background knowledge before we can discuss the definition of the determinant. We want to form a product by choosing n elements where A is an n by n matrix. There will only be one element from each row and one element from each column in this product. For example, if one element of the product is $a_{21}$, then no other element in this product will be from row 2 or column 1. Let us look at a 3 by 3 matrix for an example.

$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$

We know that we will use an element from each column, so, for consistency, we will order the product this way: $a_{\_1}a_{\_2}a_{\_3}$. We can fill in the blanks with row numbers. If we choose to begin with $a_{31}$, then we can choose from rows 1 and 2 for the remaining positions. One possible product formed by these rules is $a_{31}a_{12}a_{23}$. Another possible product is $a_{11}a_{32}a_{23}$. There are 3!, or 3*2*1 = 6, of these products. For our n by n matrix, there are n! possible products. All 6 possible products for this 3 by 3 matrix are: $a_{11}a_{22}a_{33}$, $a_{21}a_{32}a_{13}$, $a_{31}a_{12}a_{23}$, $a_{31}a_{22}a_{13}$, $a_{21}a_{12}a_{33}$, and $a_{11}a_{32}a_{23}$. Now, we need to determine which sign (+ or -) should be attached to each product. To do this, you need to order the product with the column numbers increasing as we did above and look at the sequences of row numbers. For the product $a_{11}a_{22}a_{33}$, we look at the row sequence (1,2,3). We are looking for inversions, or numbers that are out of order.
Since 1 comes before 2, 1 comes before 3, and 2 comes before 3 in the sequence, there are no inversions in this sequence. In the sequence (2,3,1), which comes from the product $a_{21}a_{32}a_{13}$, there are two inversions because 2 is placed before 1 and 3 is placed before 1. There are also two inversions for the sequence from the product $a_{31}a_{12}a_{23}$. There are three inversions for (3,2,1) and one inversion each for (2,1,3) and (1,3,2). If the number of inversions is even, then the sign attached to the product is positive. If the number of inversions is odd, then the sign attached to the product is negative. Notice that we did not say that the product was positive or negative. We simply are determining whether the product will be added or subtracted.


4.1 THE DETERMINANT OF A SQUARE MATRIX
The determinant of a square matrix is the sum of all the n! possible signed products formed from the matrix using each row and each column only once for each product. The sign to be attached to each product is given by the formula $(-1)^N$, where N is the number of inversions as described above.
The determinant of the generic 3 by 3 matrix is: $a_{11}a_{22}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13} - a_{21}a_{12}a_{33} - a_{11}a_{32}a_{23}$. For the matrix

$A = \begin{bmatrix} 0 & 2 & 4 \\ 4 & 2 & 3 \\ 1 & 3 & 6 \end{bmatrix}$,

the determinant is (0*2*6) + (4*3*4) + (1*2*3) - (1*2*4) - (4*2*6) - (0*3*3) = 0 + 48 + 6 - 8 - 48 - 0 = -2, which is the same as we calculated at the beginning of the chapter. Since this definition is cumbersome to follow, we generally do not compute the determinant by the definition, but it is good to know why the short cuts that we learned at the beginning of the chapter are valid. The determinant is also a good example of an abstract idea that has very important practical uses.

Key Vocabulary
Consistent System: A system is consistent if there is at least one solution.
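The definition by signed products can be written out directly, which makes the role of inversions concrete. The sketch below is for illustration only (it does n! work and is not how determinants are computed in practice); the names `inversions` and `det_by_definition` are hypothetical.

```python
from itertools import permutations

def inversions(seq):
    """Count pairs that are out of order in the row sequence."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )

def det_by_definition(A):
    """Sum of all n! signed products, one element from each row and each column."""
    n = len(A)
    total = 0
    for rows in permutations(range(n)):
        # rows[col] is the row used for column col, so columns increase left to right.
        product = 1
        for col, row in enumerate(rows):
            product *= A[row][col]
        total += (-1) ** inversions(rows) * product
    return total

A = [[0, 2, 4], [4, 2, 3], [1, 3, 6]]
print(det_by_definition(A))  # -2
```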

4.2 PROPERTIES OF DETERMINANTS
The determinant has many properties. Some basic properties of determinants are:
The determinant is a real number; it is not a matrix.
The determinant can be a negative number.
It is not associated with absolute value at all except that they both use vertical lines.
The determinant only exists for square matrices (2×2, 3×3, ..., n×n). The determinant of a 1×1 matrix is that single value in the determinant.
The inverse of a matrix will exist only if the determinant is not zero.

$\det(I_n) = 1$, where $I_n$ is the n × n identity matrix.

$\det(A^T) = \det(A)$

$\det(A^{-1}) = \frac{1}{\det(A)}$

4.3 MINORS
A minor for any element is the determinant that results when the row and
column that element is in are deleted.
The notation Mij is used to stand for the minor of the element in row i
and column j. So M21 would mean the minor for the element in row 2, column 1.
Consider the 3×3 determinant below. I have included headers so that you can
keep the rows and columns straight, but you would not normally include
those. We are going to find some of the minors.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & 1 & 3 & 2 \\
R2  & 4 & 1 & 3 \\
R3  & 2 & 5 & 2
\end{array}
\]

4.3.1 Finding the Minor for R2C1

The minor is the determinant that remains when you delete the row
and column of the element you are trying to find the minor for. That
means we should delete row 2 and column 1 and then find the determinant.
\[
M_{21} = \begin{vmatrix} 3 & 2 \\ 5 & 2 \end{vmatrix} = 3(2) - 5(2) = 6 - 10 = -4
\]
As you can see, the minor for row 2 and column 1 is M21 = -4.
Let us try another one.

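4.3.2 Finding the Minor for R3C2

This time we delete row 3 and column 2 and find the determinant that remains.
\[
M_{32} = \begin{vmatrix} 1 & 2 \\ 4 & 3 \end{vmatrix} = 1(3) - 4(2) = 3 - 8 = -5
\]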
When you are just trying to find the determinant of a matrix, this is overkill.
But there is one extremely useful application for it and it will give us practice finding minors.
The matrix of minors is the square matrix where each element is the
minor for the number in that position.
Here is a generic matrix of minors for a 3×3 determinant.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & M_{11} & M_{12} & M_{13} \\
R2  & M_{21} & M_{22} & M_{23} \\
R3  & M_{31} & M_{32} & M_{33}
\end{array}
\]

Let us find the matrix of minors for our original determinant. Here is
the determinant.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & 1 & 3 & 2 \\
R2  & 4 & 1 & 3 \\
R3  & 2 & 5 & 2
\end{array}
\]

Here is the work to find each minor in the matrix of minors.

Key Vocabulary
Inconsistent: A system is
inconsistent if it has no
solutions.

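Row 1:
\[
M_{11} = \begin{vmatrix} 1 & 3 \\ 5 & 2 \end{vmatrix} = 2 - 15 = -13, \quad
M_{12} = \begin{vmatrix} 4 & 3 \\ 2 & 2 \end{vmatrix} = 8 - 6 = 2, \quad
M_{13} = \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} = 20 - 2 = 18
\]
Row 2:
\[
M_{21} = \begin{vmatrix} 3 & 2 \\ 5 & 2 \end{vmatrix} = 6 - 10 = -4, \quad
M_{22} = \begin{vmatrix} 1 & 2 \\ 2 & 2 \end{vmatrix} = 2 - 4 = -2, \quad
M_{23} = \begin{vmatrix} 1 & 3 \\ 2 & 5 \end{vmatrix} = 5 - 6 = -1
\]
Row 3:
\[
M_{31} = \begin{vmatrix} 3 & 2 \\ 1 & 3 \end{vmatrix} = 9 - 2 = 7, \quad
M_{32} = \begin{vmatrix} 1 & 2 \\ 4 & 3 \end{vmatrix} = 3 - 8 = -5, \quad
M_{33} = \begin{vmatrix} 1 & 3 \\ 4 & 1 \end{vmatrix} = 1 - 12 = -11
\]

Key Vocabulary
Inverse Matrix: It is used
to solve a system of two
linear simultaneous
equations.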

Finally, here is the matrix of minors. Again, you do not need to put the
labels for the rows and columns on there, but it may help you.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & -13 & 2 & 18 \\
R2  & -4 & -2 & -1 \\
R3  & 7 & -5 & -11
\end{array}
\]

4.4 COFACTORS
A cofactor for any element is either the minor or the opposite of the minor,
depending on where the element is in the original determinant. If the row
and column of the element add up to be an even number, then the cofactor
is the same as the minor. If the row and column of the element add up to be
an odd number, then the cofactor is the opposite of the minor.

Sign Chart
Rather than adding up the row and column of the element to see whether it
is odd or even, many people prefer to use a sign chart. A sign chart is either
a + or - for each element in the matrix. The first element (row 1, column 1)
is always a + and it alternates from there.
Note: The + does not mean positive and the - negative. The + means the
same sign as the minor and the - means the opposite of the minor. Think of
it as addition and subtraction rather than positive or negative.

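Here is the sign chart for a 2×2 determinant.
\[
\begin{array}{c|cc|}
    & C1 & C2 \\
R1  & + & - \\
R2  & - & +
\end{array}
\]
Here is the sign chart for a 3×3 determinant.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & + & - & + \\
R2  & - & + & - \\
R3  & + & - & +
\end{array}
\]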


4.4.1 Matrix of Cofactors


Again, if all you are trying to do is find the determinant, you do not need to
go through this much work.
The matrix of cofactors is the matrix found by replacing each element of
a matrix by its cofactor. This is the matrix of minors with the signs changed
on the elements in the - positions.
\[
\begin{array}{c|ccc|}
    & C1 & C2 & C3 \\
R1  & -13 & -2 & 18 \\
R2  & 4 & -2 & 1 \\
R3  & 7 & 5 & -11
\end{array}
\]
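The matrix of minors and the matrix of cofactors for the running 3×3 example can also be computed mechanically. The following is a minimal sketch; the function names (`submatrix`, `det2`, `matrix_of_minors`, `matrix_of_cofactors`) are hypothetical helpers introduced only for illustration.

```python
def submatrix(m, i, j):
    """Copy of m with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matrix_of_minors(m):
    """Replace each element of a 3x3 matrix by its minor."""
    return [[det2(submatrix(m, i, j)) for j in range(3)] for i in range(3)]


def matrix_of_cofactors(m):
    """Apply the sign chart to the matrix of minors.

    With 0-based indices, i + j has the same parity as the 1-based
    row + column sum, so (-1) ** (i + j) reproduces the sign chart.
    """
    minors = matrix_of_minors(m)
    return [[(-1) ** (i + j) * minors[i][j] for j in range(3)] for i in range(3)]


A = [[1, 3, 2],
     [4, 1, 3],
     [2, 5, 2]]

print(matrix_of_minors(A))     # [[-13, 2, 18], [-4, -2, -1], [7, -5, -11]]
print(matrix_of_cofactors(A))  # [[-13, -2, 18], [4, -2, 1], [7, 5, -11]]
```

Only the entries in the - positions of the sign chart change between the two printed matrices.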

4.5 EXPANDING TO FIND THE DETERMINANT
Here are the steps to go through to find the determinant.
Pick any row or column in the matrix. It does not matter which row
or which column you use; the answer will be the same for any row or column.
There are some rows or columns that are easier than others, but
we will get to that later.
Multiply every element in that row or column by its cofactor and
add. The result is the determinant.
Let us expand our matrix along the first row.
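\[
\begin{vmatrix} 1 & 3 & 2 \\ 4 & 1 & 3 \\ 2 & 5 & 2 \end{vmatrix}
\]
From the sign chart, we see that 1 is in a positive position, 3 is in a negative position and 2 is in a positive position. By putting the + or - in front of
the element, it takes care of the sign adjustment when going from the minor
to the cofactor.
\[
+1 \begin{vmatrix} 1 & 3 \\ 5 & 2 \end{vmatrix}
- 3 \begin{vmatrix} 4 & 3 \\ 2 & 2 \end{vmatrix}
+ 2 \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix}
\]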

Key Vocabulary
Linear Equation in Three Variables: If a, b, c and
r are real numbers (and if a, b, and c are not all
equal to 0) then ax + by + cz = r is called a linear
equation in three variables. (The three variables are the x, the y,
and the z.)

= 1 (2 - 15) - 3 (8 - 6) + 2 (20 - 2)
= 1 (-13) - 3 (2) + 2 (18)
= -13 - 6 + 36
= 17
The determinant of this matrix is 17.
Let us try it again, but this time expand along the second column. To
save time, we take the minors for that column from the matrix of minors:
they were 2, -2, and -5. The original elements were 3, 1, and 5. The 3 and 5 are in
negative positions.
Determinant = - 3 (2) + 1 (-2) - 5 (-5) = -6 - 2 + 25 = 17

Key Vocabulary
Uniquely Determined:
A system is uniquely
determined if there is
exactly one solution to
the system.

Expand on any row or any column, and you will get 17.
However, you cannot do diagonals. If we try the main diagonal, we get
+ 1 (-13) + 1 (-2) + 2 (-11) = -13 - 2 - 22 = -37
Some Rows or Columns are Better than Others
Pick the row or column with the most zeros in it.
Since each minor or cofactor is multiplied by the element in the matrix, picking a row or column with lots of zeros in it means that you
will be multiplying by a lot of zeros. Multiplying by zero does not
take very long at all. In fact, if the element is zero, you do not need to
even find the minor or cofactor.
Pick the row or column with the largest numbers (or variables) in it.
The elements in the row or column that you expand along are not
used to find the minors. The only place that they are multiplied
is once, in the expansion. If you pick the row or column with the
smallest numbers, then every minor will be the product of larger
numbers. If you pick a row or column that has variables in it, then
you will only have to multiply by the variables once, during the
expansion.
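The expansion steps described in this section can be sketched as a short recursive routine. This is a minimal illustration, assuming expansion along the first row only; the names `submatrix` and `det` are hypothetical. It skips zero entries, which is the saving described above.

```python
def submatrix(m, i, j):
    """Copy of m with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant of a square matrix by cofactor expansion along row 1."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue  # a zero element contributes nothing, so its minor is never computed
        sign = -1 if j % 2 else 1  # sign chart along the first row: +, -, +, ...
        total += sign * entry * det(submatrix(m, 0, j))
    return total


A = [[1, 3, 2],
     [4, 1, 3],
     [2, 5, 2]]
print(det(A))  # 17
```

Expanding along a different row or column would give the same value; the first row is used here only to keep the sketch short.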

4.6 APPLICATIONS OF MATRICES AND DETERMINANTS
Matrices and determinants are used in many fields; some of these
applications are described here.

Area of a Triangle
Consider a triangle with vertices at (x1,y1), (x2,y2), and (x3,y3). If the triangle
was a right triangle, it would be pretty easy to compute the area of the triangle by finding one-half the product of the base and the height.
However, when the triangle is not a right triangle, there are a couple of
other ways that the area can be found.

[Figure: a triangle with vertices (x1,y1), (x2,y2), and (x3,y3) and sides a, b, and c.]

Heron's Formula
If you know the lengths of the three sides of the triangle, you can use
Heron's Formula to find the area of the triangle.
In Heron's formula, s is the semi-perimeter (one-half the perimeter of
the triangle).
s = 1/2 (a + b + c)
Area = sqrt (s (s - a) (s - b) (s - c) )
Consider the triangle with vertices at (-2, 2), (1, 5), and (6, -1).
Using the distance formula, we can find that the lengths of the sides
(arbitrarily assigning a, b, and c) are a = 3 sqrt(2), b = sqrt(61), and c =
sqrt(73).
[Figure: the triangle with vertices (-2, 2), (1, 5), and (6, -1), with side lengths 3 sqrt(2), sqrt(61), and sqrt(73).]

Using those values gives
s = 1/2 (3 sqrt(2) + sqrt(61) + sqrt(73) )
s - a = 1/2 (- 3 sqrt(2) + sqrt(61) + sqrt(73) )
s - b = 1/2 (3 sqrt(2) - sqrt(61) + sqrt(73) )
s - c = 1/2 (3 sqrt(2) + sqrt(61) - sqrt(73) )
s (s - a) (s - b) (s - c) = 1089 / 4
When you take the square root of that, you get 33/2, so the area of that
triangle is 16.5.

Problems with Heron's Formula include
You must know the lengths of the sides of the triangle. If you do not, then
you have to use the distance formula to find the lengths of the sides
of the triangle.
You have to compute the semi-perimeter, so chances are you will
have fractions to work with.
Lots of square roots are involved, both for the lengths of the sides of the
triangle and for the area of the triangle.
It is not the easiest thing in the world to work with.

4.6.1 Geometric Technique


The triangle can be enclosed in a rectangle. The vertices of the triangle will
intersect the rectangle in three places, forming three right triangles. These
triangles are denoted A, B, and C in the picture.
The area of the triangle we desire will be the area of the rectangle minus the areas of the three triangles.
The legs of the three triangles can be found by simple subtraction of
coordinates and then used to find the area, since the area of a triangle is one-half the base times the height.
Area of triangle A = 3 (3) / 2 = 9/2.
Area of triangle B = 5 (6) / 2 = 15.
Area of triangle C = 8 (3) / 2 = 12.
The sum of the areas of the triangles is 9/2 + 15 + 12 = 63 / 2 or 31.5.
The area of a rectangle is base times height, so the bounding rectangle
has area = 8 (6) = 48.
The area of the triangle in the middle is the difference between the rectangle and the sum of the areas of the three outer triangles.
Area of triangle = 48 - 31.5 = 16.5.
Wow, that was much easier.

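[Figure: the triangle enclosed in its bounding rectangle, which has a corner at (6, 5); the three right triangles A, B, and C fill the space between the triangle and the rectangle.]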

4.6.2 Determinants
It turns out that the area of a triangle can also be found using determinants.
The derivation of the formula is kind of long and most of you do not care to
see it, so it is on a separate page.
What you do is form a 3×3 determinant where the first column holds the
x's for all the points, the second column holds the y's for all the points, and
the last column is all ones.
\[
\begin{array}{c|ccc|}
\text{point 1} & -2 & 2 & 1 \\
\text{point 2} & 1 & 5 & 1 \\
\text{point 3} & 6 & -1 & 1
\end{array}
\]

Evaluate that determinant. I will expand on column 1.
\[
\begin{vmatrix} -2 & 2 & 1 \\ 1 & 5 & 1 \\ 6 & -1 & 1 \end{vmatrix}
= +(-2) \begin{vmatrix} 5 & 1 \\ -1 & 1 \end{vmatrix}
- 1 \begin{vmatrix} 2 & 1 \\ -1 & 1 \end{vmatrix}
+ 6 \begin{vmatrix} 2 & 1 \\ 5 & 1 \end{vmatrix}
\]
= -2 ( 5 + 1 ) - 1 ( 2 + 1 ) + 6 ( 2 - 5 ) = -2 ( 6 ) - 1 ( 3 ) + 6 ( -3 ) = -12 - 3 - 18
= -33.
It is possible that you will get a negative determinant, like we did here.
Do not worry about that. The sign is determined by the order you put the
points in and can be easily changed just by switching two rows of the determinant. Area, on the other hand, cannot be negative, so if you get a negative, just drop the sign and make it positive. Finally, divide it by 2 to find
the area.
| -33 | = 33
33 / 2 = 16.5, which was the area.
Formula for the Area of a Triangle using Determinants
\[
\text{Area} = \pm \frac{1}{2} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}
\]
The plus/minus in this case is meant to take whichever sign is needed
so the answer is positive (non-negative). Do not say the area is both positive
and negative.
Why not use absolute value? Well, think how confusing it would be to
have the absolute value of a determinant.
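To compare the two approaches, the determinant formula and Heron's formula can both be evaluated for the triangle with vertices (-2, 2), (1, 5), and (6, -1). The following is a minimal sketch; the function names are hypothetical, the determinant is expanded along its first row, and `math.dist` assumes Python 3.8 or later.

```python
import math


def area_by_determinant(p1, p2, p3):
    """Half the magnitude of the determinant | x y 1 | built from the three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Cofactor expansion along the first row (x1, y1, 1).
    d = x1 * (y2 - y3) - y1 * (x2 - x3) + 1 * (x2 * y3 - x3 * y2)
    return abs(d) / 2


def area_by_heron(p1, p2, p3):
    """Heron's formula, with side lengths from the distance formula."""
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p1, p3)
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


P1, P2, P3 = (-2, 2), (1, 5), (6, -1)
print(area_by_determinant(P1, P2, P3))  # 16.5
print(area_by_heron(P1, P2, P3))        # approximately 16.5, up to floating-point rounding
```

The determinant version uses only integer arithmetic until the final division by 2, while Heron's formula goes through several square roots, which is why its result is only approximately 16.5.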

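4.7 ADJOINT OF A MATRIX
Definition 1: (Minor, Cofactor of a Matrix) The number $\det(A(i|j))$ is called
the (i, j)th minor of A, where $A(i|j)$ is the matrix A with its ith row and jth column removed. We write $A_{ij} = \det(A(i|j))$. The (i, j)th cofactor of A,
denoted $C_{ij}$, is the number $(-1)^{i+j} A_{ij}$.
Example: Let
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}. \quad \text{Then} \quad
\operatorname{Adj}(A) = \begin{bmatrix} 4 & 2 & -7 \\ -3 & -1 & 5 \\ 1 & 0 & -1 \end{bmatrix},
\]
as $C_{11} = (-1)^{1+1} A_{11} = 4$, $C_{12} = (-1)^{1+2} A_{12} = -3$, $C_{13} = (-1)^{1+3} A_{13} = 1$, and so on.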
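Theorem: Let A be an n × n matrix. Then
1. for $1 \le i \le n$, $\sum_{j=1}^{n} a_{ij} C_{ij} = \sum_{j=1}^{n} a_{ij} (-1)^{i+j} A_{ij} = \det(A)$,
2. for $i \ne l$, $\sum_{j=1}^{n} a_{ij} C_{lj} = \sum_{j=1}^{n} a_{ij} (-1)^{l+j} A_{lj} = 0$, and
3. $A(\operatorname{Adj}(A)) = \det(A) I_n$. Thus,
\[
\det(A) \ne 0 \;\Rightarrow\; A^{-1} = \frac{1}{\det(A)} \operatorname{Adj}(A).
\]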

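Proof: Let $B = [b_{ij}]$ be a square matrix with
the lth row of B as the ith row of A,
the other rows of B the same as those of A.
By the construction of B, two rows (the ith and the lth) are equal, so $\det(B) = 0$, and
$\det(A(l|j)) = \det(B(l|j))$ for $1 \le j \le n$.
We have
\[
0 = \det(B) = \sum_{j=1}^{n} (-1)^{l+j} b_{lj} \det(B(l|j)) = \sum_{j=1}^{n} (-1)^{l+j} a_{ij} \det(B(l|j))
= \sum_{j=1}^{n} (-1)^{l+j} a_{ij} \det(A(l|j)) = \sum_{j=1}^{n} a_{ij} C_{lj}.
\]
Now,
\[
(A(\operatorname{Adj}(A)))_{ij} = \sum_{k=1}^{n} a_{ik} (\operatorname{Adj}(A))_{kj} = \sum_{k=1}^{n} a_{ik} C_{jk}
= \begin{cases} 0 & \text{if } i \ne j \\ \det(A) & \text{if } i = j \end{cases}
\]
Thus, $A(\operatorname{Adj}(A)) = \det(A) I_n$. Since $\det(A) \ne 0$,
\[
A \, \frac{1}{\det(A)} \operatorname{Adj}(A) = I_n.
\]
Therefore, A has a right inverse. Hence,
\[
A^{-1} = \frac{1}{\det(A)} \operatorname{Adj}(A).
\]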

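Example: Let
\[
A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 1 \\ 1 & 2 & 1 \end{bmatrix}. \quad \text{Then} \quad
\operatorname{Adj}(A) = \begin{bmatrix} -1 & 1 & -1 \\ 1 & 1 & -1 \\ -1 & -3 & 1 \end{bmatrix}
\]
and $\det(A) = -2$. By the theorem,
\[
A^{-1} = \begin{bmatrix} 1/2 & -1/2 & 1/2 \\ -1/2 & -1/2 & 1/2 \\ 1/2 & 3/2 & -1/2 \end{bmatrix}.
\]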

4.8 INVERSE OF A SQUARE MATRIX

The inverse of a square matrix A with a non-zero determinant is the adjoint
matrix divided by the determinant. This can be written as
\[
A^{-1} = \frac{1}{\det(A)} \operatorname{Adj}(A)
\]
The adjoint matrix is the transpose of the cofactor matrix. The cofactor
matrix is the matrix of the minors Aij, each multiplied by (-1)^(i+j).
The (i, j)th minor of A is the determinant of the matrix A without the ith row and the jth column.

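Example (3×3 Matrices)

The following example illustrates each matrix type, and at 3×3 the steps can
be readily calculated on paper. It can also be verified that the original matrix A multiplied by its inverse gives the identity matrix (all zeros except
along the diagonal, which are ones).
\[
A = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 1 & 1 \\ 1 & 2 & 3 \end{bmatrix}
\]
Determinant = 9
\[
\text{Cofactor matrix} =
\begin{bmatrix}
+\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} & -\begin{vmatrix} -1 & 1 \\ 1 & 3 \end{vmatrix} & +\begin{vmatrix} -1 & 1 \\ 1 & 2 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 2 & 0 \\ 2 & 3 \end{vmatrix} & +\begin{vmatrix} 1 & 0 \\ 1 & 3 \end{vmatrix} & -\begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix} \\[2ex]
+\begin{vmatrix} 2 & 0 \\ 1 & 1 \end{vmatrix} & -\begin{vmatrix} 1 & 0 \\ -1 & 1 \end{vmatrix} & +\begin{vmatrix} 1 & 2 \\ -1 & 1 \end{vmatrix}
\end{bmatrix}
= \begin{bmatrix} 1 & 4 & -3 \\ -6 & 3 & 0 \\ 2 & -1 & 3 \end{bmatrix}
\]
\[
\text{Adjoint matrix} = \text{Transpose of cofactor matrix} = \begin{bmatrix} 1 & -6 & 2 \\ 4 & 3 & -1 \\ -3 & 0 & 3 \end{bmatrix}
\]
\[
A^{-1} = \text{Inverse of } A = \begin{bmatrix} 1/9 & -6/9 & 2/9 \\ 4/9 & 3/9 & -1/9 \\ -3/9 & 0/9 & 3/9 \end{bmatrix}
\]
\[
A A^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]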

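Inverse of a 2×2 matrix

The inverse of a 2×2 matrix can be written explicitly, namely
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]
Example:
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^{-1}
= \frac{1}{1 \cdot 4 - 2 \cdot 3} \begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix}
= \begin{bmatrix} -2 & 1 \\ 3/2 & -1/2 \end{bmatrix}
\]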
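The adjoint method for the inverse can be sketched in code. This is a minimal illustration under stated assumptions, not a recommended numerical method: all function names are hypothetical, the entries are assumed to be integers, and exact fractions are used so that the product of A and its inverse is exactly the identity.

```python
from fractions import Fraction


def submatrix(m, i, j):
    """Copy of m with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def adjoint(m):
    """Transpose of the cofactor matrix."""
    n = len(m)
    cof = [[(-1) ** (i + j) * det(submatrix(m, i, j)) for j in range(n)] for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]


def inverse(m):
    """Adjoint divided by the determinant; fails when the determinant is zero."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix has determinant 0, so it has no inverse")
    return [[Fraction(x, d) for x in row] for row in adjoint(m)]


def multiply(a, b):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[1, 2, 0],
     [-1, 1, 1],
     [1, 2, 3]]
print(det(A))      # 9
print(adjoint(A))  # [[1, -6, 2], [4, 3, -1], [-3, 0, 3]]
print(multiply(A, inverse(A)) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```

The same functions applied to the 2×2 matrix [[1, 2], [3, 4]] reproduce the explicit formula above.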

4.9 CONSISTENT AND INCONSISTENT SYSTEMS
When you solve a system of linear equations, what does your solution set
(all of your solutions) describe geometrically? In each problem involving 2
by 2 matrices that we solved, our solution set was the point of intersection
of the two lines represented by the equations in our system. In each problem with 3 by 3 matrices, the solution set was the point of intersection of 3
planes. However, a system of linear equations does not always have a point
as the solution set.
If you solve the system
x1 + 2x2 = 4
2x1 + 4x2 = 8
algebraically, what do you get? You have an infinite number of solutions
along the line x1 = 4 - 2x2 because any solution to the first equation also
solves the second equation. Therefore, you can choose any value for one of
the variables, and you will be able to find a value for the other variable so
that both equations are satisfied. This is called a consistent system because
there is a solution. It is further categorized as an underdetermined consistent system because there is not enough information to determine a unique
solution.
Definition: A system is consistent if there is at least one solution.
Definition: A system is underdetermined when there are an infinite
number of solutions.

For a linear system, if there are two or more solutions, then there are an
infinite number of solutions. These solutions all lie on the same line. Geometrically, this system represents a line because both equations are representations of the same line. Try to solve this system with Gaussian elimination. What do you get? We get
\[
\left[\begin{array}{cc|c} 1 & 2 & 4 \\ 0 & 0 & 0 \end{array}\right],
\]
because the second equation is a
multiple of the first. The second equation requires that 0x1 + 0x2 = 0, which
is always true. Therefore, our second equation made no additional requirements beyond what the first equation requires. When Gaussian elimination
on a system with a square coefficient matrix leaves you with zeros across an
entire row of the augmented matrix and there are no rows with zeros to the
left of the bar and a non-zero number to the right of the bar, you know that
you have a consistent system that is underdetermined. We move the zero
row or rows to the bottom of the matrix. When we try to get zeros above the
main diagonal in Gauss-Jordan elimination, we do not try to get zeros in a
column if the diagonal element of that column is zero.
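What is the solution to the system?
2x1 + 4x2 + 5x3 = 47
3x1 + 10x2 + 11x3 = 104
3x1 + 2x2 + 4x3 = 37
Our work and solution are below.
Original augmented matrix:
\[
\left[\begin{array}{ccc|c} 2 & 4 & 5 & 47 \\ 3 & 10 & 11 & 104 \\ 3 & 2 & 4 & 37 \end{array}\right]
\]
After r1 ÷ 2:
\[
\left[\begin{array}{ccc|c} 1 & 2 & 2.5 & 23.5 \\ 3 & 10 & 11 & 104 \\ 3 & 2 & 4 & 37 \end{array}\right]
\]
After -3 * r1 + r2 and -3 * r1 + r3:
\[
\left[\begin{array}{ccc|c} 1 & 2 & 2.5 & 23.5 \\ 0 & 4 & 3.5 & 33.5 \\ 0 & -4 & -3.5 & -33.5 \end{array}\right]
\]
After r2 ÷ 4:
\[
\left[\begin{array}{ccc|c} 1 & 2 & 2.5 & 23.5 \\ 0 & 1 & .875 & 8.375 \\ 0 & -4 & -3.5 & -33.5 \end{array}\right]
\]
After -2 * r2 + r1 and 4 * r2 + r3:
\[
\left[\begin{array}{ccc|c} 1 & 0 & .75 & 6.75 \\ 0 & 1 & .875 & 8.375 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
This tells us that x1 = 6.75 - .75x3 and x2 = 8.375 - .875x3. That means that
we can choose whatever number we want for one element of x and get corresponding valid solutions for the other two. For instance, if we choose x3 to
be 1, then x1 = 6 and x2 = 7.5. Instead, we may choose x1 = 6.75; then x2 = 8.375
and x3 = 0. There are an infinite number of solutions that we can find in this
manner. Notice that this system is also consistent and underdetermined.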
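Definition: A system is inconsistent if it has no solutions.
Consider the system x1 + 2x2 = 4 and 2x1 + 4x2 = 9. If you graph these lines, you will see that they are parallel. Try to solve
this system using Gaussian elimination. Our work follows:
Original augmented matrix:
\[
\left[\begin{array}{cc|c} 1 & 2 & 4 \\ 2 & 4 & 9 \end{array}\right]
\]
After -2 * r1 + r2:
\[
\left[\begin{array}{cc|c} 1 & 2 & 4 \\ 0 & 0 & 1 \end{array}\right]
\]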

This requires that 0x1 + 0x2 = 1. We know that this cannot be true. When
using Gaussian elimination on a system, if you have zeros in an entire row
to the left of the bar and do not have a zero to the right of the bar on that
row, you know that you have an inconsistent system. Therefore, there is no
point where the lines (or planes if you are in higher dimensions) intersect,
so the system does not have a solution.

With underdetermined and inconsistent systems, you will never be
able to get an identity matrix to the left of the bar of the augmented matrix.
Therefore, we will not be able to find an inverse for the coefficient matrix.
The only coefficient matrices that have inverses are those that have a unique
point as the solution to the system, and the only coefficient matrices that
have a unique point as the solution to the system are those that have inverses. These systems are consistent because they have a solution and uniquely
determined because there is exactly one solution.
Definition: A system is uniquely determined if there is exactly one solution to the system.
Here is a visual breakdown of the information that you have been given. This breakdown assumes a square coefficient matrix.

Note: A system is inconsistent if any row of the matrix has zeros left of
the bar and a non-zero number right of the bar. Therefore, the matrix
\[
\left[\begin{array}{ccc|c} 1 & 0 & 3 & 6 \\ 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
would be inconsistent even though the entire last row contains only
zeros.
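The classification rules of this section can be sketched as a small Gauss-Jordan routine on the augmented matrix. This is a minimal illustration under stated assumptions: the function names `rref` and `classify` are hypothetical, the last column is taken to be the right-hand side, and exact fractions replace the decimals used in the worked example.

```python
from fractions import Fraction


def rref(aug):
    """Gauss-Jordan elimination on an augmented matrix (last column = right of the bar)."""
    rows = [[Fraction(x) for x in row] for row in aug]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot here; this column belongs to a free variable
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


def classify(aug):
    """Return 'inconsistent', 'underdetermined', or 'uniquely determined'."""
    rows = rref(aug)
    n_vars = len(rows[0]) - 1
    pivots = 0
    for row in rows:
        if all(x == 0 for x in row[:-1]):
            if row[-1] != 0:
                return "inconsistent"  # zeros left of the bar, non-zero right of it
        else:
            pivots += 1
    return "uniquely determined" if pivots == n_vars else "underdetermined"


print(classify([[1, 2, 4], [2, 4, 8]]))                               # underdetermined
print(classify([[1, 2, 4], [2, 4, 9]]))                               # inconsistent
print(classify([[2, 4, 5, 47], [3, 10, 11, 104], [3, 2, 4, 37]]))     # underdetermined
print(classify([[1, 0, 3, 6], [0, 0, 0, 2], [0, 0, 0, 0]]))           # inconsistent
```

The inconsistency test runs before the count of pivots, which mirrors the note above: one bad row makes the system inconsistent even if other rows are entirely zero.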

4.10 SYSTEM OF LINEAR EQUATIONS

In mathematics, a system of linear equations is a collection of two or more
linear equations with the same set of variables in all the equations.
In other words, a system of linear equations is two or more equations
that are being solved simultaneously.
A linear equation can be in 2 dimensions (such as x and y) or in 3 dimensions (such as x, y and z).
Note: A linear equation has no exponent on a variable.
Systems of equations are often used in business to make predictions. A
real-world situation is modelled as a system of equations, and the solution
of the system is used to guide decisions. How accurate the prediction is
depends on how well the equations model the situation.
The solution of a system of equations in two variables is an ordered pair that satisfies
each equation in the system.
Consider the two equations,
x + y = 2
x - y = 2
They form a system of equations in two variables. The solution of this
system is an ordered pair (x, y).
Consider the equations given below.
-3a + 2b - 6c = 6
5a + 7b - 5c = 6
a + 4b - 2c = 8
They form a system of equations in three variables. The solution of this
system is an ordered triple (a, b, c).
The solution of a system of linear equations can be of three types. They are:
(i) One solution (ii) Infinitely many solutions and (iii) No solution

If the graphs of the equations intersect each other at a point, then the ordered pair corresponding to the point of intersection is the solution to
that system. In this case, the system has exactly one solution.
If a system has exactly one solution, then the equations are said to be
independent.
If the graphs of the equations coincide, then all the points on the line
are solutions to that system. In this case, the system has an infinite
number of solutions.
If the system has an infinite number of solutions, then the equations are
said to be dependent.
If the graphs of the equations are parallel, then the system of equations
has no solution, because parallel lines never intersect each other.
If the system has at least one solution (one solution or infinitely many
solutions), then it is said to be a consistent system.
If the system has no solution, then it is said to be an inconsistent system.
The following figure gives a clear picture of what we have learnt.

Examples: Discuss the number of solutions and type of the system of
equations given in the graphs.
(i)

Solution:
Step 1:
The system of equations is
4x - 6y = -4
8x + 2y = 48
Step 2:
From the graph, it is clear that the two lines 4x - 6y = -4 and 8x + 2y =
48 intersect at exactly one point.
The point of intersection is (5, 4).
Step 3:
So, the system of equations has only one solution and hence it is consistent and independent.
(ii)

Solution
Step 1:
The system of equations is
3x + 3y = 15
2y = -2x + 6

Step 2:
From the graph, it is clear that the two lines are parallel. Parallel lines
never intersect.
Step 3:
So, there is no solution for this system of equations and the system is
inconsistent.


(iii)

Step 1:
From the graph, it is clear that the two equations y = 3x + 2 and 6x - 2y
+ 4 = 0 coincide with each other.
Step 2:
All the points on the line are the solutions to both the equations. So, the
system of equations has infinite number of solutions.
Step 3:
So, the system of equations is consistent and the equations are dependent.
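The three graph-based examples can also be classified without a graph, using the determinant of the coefficient matrix. The following is a minimal sketch, assuming each equation is written as ax + by = c with integer coefficients and that a and b are not both zero; the function name `classify_pair` is hypothetical. When the determinant is non-zero, the unique solution is computed with Cramer's rule.

```python
from fractions import Fraction


def classify_pair(eq1, eq2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    d = a1 * b2 - a2 * b1  # determinant of the coefficient matrix
    if d != 0:
        # Cramer's rule: replace one column by the constants and divide by d.
        x = Fraction(c1 * b2 - c2 * b1, d)
        y = Fraction(a1 * c2 - a2 * c1, d)
        return "consistent and independent", (x, y)
    # d == 0: the lines are parallel or identical.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "consistent and dependent", None
    return "inconsistent", None


# (i)   4x - 6y = -4 and 8x + 2y = 48
print(classify_pair((4, -6, -4), (8, 2, 48)))
# (ii)  3x + 3y = 15 and 2y = -2x + 6, rewritten as 2x + 2y = 6
print(classify_pair((3, 3, 15), (2, 2, 6)))
# (iii) y = 3x + 2 rewritten as 3x - y = -2, and 6x - 2y + 4 = 0 rewritten as 6x - 2y = -4
print(classify_pair((3, -1, -2), (6, -2, -4)))
```

Each equation has to be rearranged into the form ax + by = c before it is passed in, as the comments show for examples (ii) and (iii).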

4.11 MULTIPLE CHOICE QUESTIONS


1.

Find the matrix product AB, if it is defined.


3 0
1 3 3

A=
, B = 3 1
3
0
5

0 5

12 6
(a)

25 9

(c) AB is undefined.
2.

3 9 0
(b)

0 0 25

6 12
(d) 9 25

Perform the matrix operation.


Let A =
5 2 and B = 1 0 . Find 2A + 3B.
(a)
10 4

3.

(b) 7 4

(c)



(d) 7 4
7 4
Find the inverse of the matrix, if it exists.

5 4
A =

0 4


1
5

(a)

4.

1

5
1
4

1
5

(b)

1
5
1
4

1 1
4 5

0 1

5
(c)

(d) 5

1
4
1
5

Decide whether or not the matrices are inverses of each other.


1
2
2 4

and
1
4 4
2

1
4
1
4

(a)Yes
5.

(b) No

Determine whether the matrix is invertible.


6 1

3 4
(a)Yes

6.

(b) No

Compute the determinant of the matrix by cofactor expansion.


4 2 7

9 3 5
7 9 4

7.

(a)1048

(b)-286

(c)286

(d)146

Compute the determinant of the matrix by cofactor expansion.


5

0
0

0
0

8.

2 2 5 4

4 2 1 0
0 2 3 7

0 0 1 5
0 0 0 2

(a)0

(b)-80

(c)80

(d)-40

Determine which of the sets of vectors is linearly independent.

A: The set {p1, p2, p3} where p1(t) = 1, p2(t) = t^2, p3(t) = 3 + 3t
B: The set {p1, p2, p3} where p1(t) = t, p2(t) = t^2, p3(t) = 2t + 3t^2
C: The set {p1, p2, p3} where p1(t) = 1, p2(t) = t^2, p3(t) = 3 + 3t + t^2
(a) C only

(b) All of them

(c) A only

(d) B only

(e) A and C
9.

Determine the values of the parameter s for which the system has a
unique solution, and describe the solution.

sx1 5sx 2 = 3

3x1 15sx 2 = 5

4
9 5s
and x 2 =
15(s 1)(s + 1)
15(s 1)(s + 1)
14
9 + 5s
and x 2 =
(b) s 1; x1 =
3(s + 1)
15s(s + 1)
4
9 5s
and x 2 =
(c) s 0; x1 =
3(s 1)
15s(s 1)
4
9 5s
(d) s 1; x1 = 15(s 1)(s + 1) and x 2 = 15s(s + 1)
10. If A, B and C are matrices with orders 3×3, 2×3 and 4×2 respectively,
how many of the following matrix calculations are possible?
(a) s 1; x1 =

4B, A + B, 3B^T + C, AB, B^T A, (CB)^T, CBA


(a)0

(b) 2

(c) 3

(d) 4

(e) 1

4.12 REVIEW QUESTIONS


1.

Find the determinants of these matrices.


3 4
a.

1 2
5 2
b.

3 6
3 0
4

c. 6 2 1
5 7 3
2 3 1

d. 7
0 5
5 2 4

2.

Use Cramer's rule to solve these systems of equations:

x1 5x 2 + 7x 3 = 10

3.

x1 5x 2 + 7x 3 = 10

9x 2 + 2x 3 = 7
9x 2 + 2x 3 = 7
(a). x + 3x x = 6
(b). x1 + 3x 2 x 3 = 6
1
2
3
Find the inverse of matrix A using cofactors and determinants. Verify
that you found the inverse by checking that I is the product matrix of
AA^(-1) or A^(-1)A. Remember, since A is square, you do not have to check
both, because if either AA^(-1) or A^(-1)A is the identity matrix, then so is the other.
1 3 1

A = 4 1 3
3 2 1


4.

Classify these systems as either consistent or inconsistent. If the system


is consistent, further categorize it as underdetermined or uniquely determined. Explain why the system fits into that category. Also, explain
what this means graphically for each system.
(a)

2x1 + 3x 2 = 9 and 3x1 + 4 x 2 = 13

(b)

3x1 + 4x 2 = 7 and 9x1 + 12x 2 = 21

(c)

2x1 + 3x 2 = 8 and 3x1 + 4x 2 = 11

1 3 4
5


A = 2 1 6 and b = 6
1 7 0
3
2 3
4
5


(d) Ax = b where
A = 5 2 1 and b = 7
0 11 18
11
5.

6.

4 3 9
9 2 6

Consider the matrices: A = 0 1 5 and B = 8 0 1


10 2 8
3 12 7
(a) What is A + B?
(b) What is AT + B?

Consider the following matrices:


4 7 8
A=

3 1 2

5 2 6
B=

0 9 3

(a) Is AB defined? If yes, what is it? If no, why not?


(b) Using two of A, AT, B, and BT form a product that is a 2 by 2 matrix.
For instance, AAT is a 2 by 2 matrix. Find two more examples of
this.
7.

For
5
2 5 7 8

A = 0 8 3 1 and B =
0
9 1 2 4

7 10

3 1
4 2

8 9

find AB.
8.

For Matrix A
4 3 9

0 1 5
10 2 8

(a) Find the determinant of A.
(b) Form the matrix of minors of A.
(c) Form the cofactor matrix of A.
9.

What is an adjoint matrix?

10. For Matrix A


4 3 9

0 1 5
10 2 8

(a) Form the adjoint of A.
(b) Find the inverse matrix of A.

ANSWERS FOR THE MULTIPLE CHOICE QUESTIONS

(1) d  (2) d  (3) b  (4) b  (5) a  (6) c  (7) c  (8) e  (9) c  (10) c

Chapter 5

Limit, Continuity, and Differentiability

Objectives
After studying this chapter, you will be able to:
Understand the limit
Discuss the neighborhood of a point on the real line
Explain the different types of limits
Understand continuity of a function
Discuss the derivability of a function
Describe the differential of a function

INTRODUCTION

The basic concepts of the theory of calculus of real variables are limit, continuity and differentiability of a function of real variables. Here we give
an intuitive idea of limit and then its analytical definition. The rest of
this chapter deals with the continuity and differentiability of a real valued
function.

5.1 LIMIT
Meaning of $x \to a$ ($x$ tends to $a$): Let $x$ be a real variable and $a$ be a fixed
real number. Suppose that $x$ assumes successive values $a + 0.1$, $a + 0.01$, $a + 0.001$, $a + 0.0001$, ... (Figure 5.1).

Figure 5.1: ($x \to a+$)

Obviously, as $x$ passes through these successive values, the numerical
difference between $x$ and the real number $a$, i.e., $|x - a|$, gradually becomes less and
less, and becomes so small that we can write $|x - a| < \delta$ for every
given $\delta > 0$. We express this situation symbolically as $x \to a+$ to mean $x$ tends
(or approaches) to $a$ from the R.H.S. Thus, $x \to a+$ whenever the successive
values of $x$ ultimately satisfy $a < x < a + \delta$, $\delta$ being any positive real number,
no matter however small.
Again, suppose the real variable $x$ assumes successive values $a - 0.1$, $a - 0.01$, $a - 0.001$, $a - 0.0001$, ... (Figure 5.2). As $x$ passes through these successive
values, the numerical difference between
$x$ and the real number $a$, i.e., $|x - a|$, gradually becomes less and less, and
becomes so small that we can write $|x - a| < \delta$ for every pre-assigned
$\delta > 0$. We express this situation symbolically as $x \to a-$ and read it as $x$ tends
(or approaches) to $a$ from the L.H.S. Thus, $x \to a-$ whenever the successive
values of $x$ ultimately satisfy $a - \delta < x < a$, $\delta$ being any positive real number,
no matter however small.

Figure 5.2: ($x \to a-$)

Key Vocabulary
Continuous Function: It is a function
for which, intuitively,
small changes in the
input result in small
changes in the output.
Otherwise, a function is
said to be a discontinuous function.

By the expression "$x$ tends (or approaches) to $a$", or symbolically by
$x \to a$, we mean that, given any $\delta > 0$, no matter however small, the successive
values of $x$ ultimately satisfy the inequality $0 < |x - a| < \delta$. Note that $x \to a$
implies $x \ne a$.
Meanings of $x \to \infty$ and $x \to -\infty$. If a real variable $x$, assuming positive
values only, increases without limit, i.e., it assumes values in such a way
that the successive values ultimately become and remain greater than any
pre-assigned arbitrary positive real number $M$, no matter however large,
then we say that $x$ tends (or approaches) to infinity. This is denoted symbolically as $x \to +\infty$, or $x \to \infty$.
If a real variable $x$, assuming negative values only, increases numerically without limit, i.e., it assumes values in such a way that the successive
values become and remain less than $-M$, where $M$ is any pre-assigned arbitrary positive real number, no matter however large, then we say that $x$
tends (or approaches) to minus infinity. This is denoted symbolically as $x \to -\infty$.
Note that in reality there exists no number such as $+\infty$ or $-\infty$
towards which $x$ approaches.
These are only symbols to mean that the value of $x$ increases (for $+\infty$)
or decreases (for $-\infty$) without limit.

5.2 NEIGHBORHOOD OF A POINT ON THE REAL LINE
Let $a$ be a point on the real line (i.e., a line on which points are denoted by
real numbers and vice versa in an orderly manner) and $\delta > 0$ be a real number. The $\delta$-neighborhood of a point $a$, denoted by $N_\delta(a)$ or simply by $N(a)$,
is defined as the interval:
$a - \delta < x < a + \delta$, or $|x - a| < \delta$.
The deleted $\delta$-neighborhood of the point $a$ is obtained by deleting the
point $a$ from $N_\delta(a)$. It is denoted by $N_\delta(a) - \{a\}$, and is defined by
$0 < |x - a| < \delta$, i.e., $a - \delta < x < a + \delta$, $x \ne a$.

5.2.1 Limit of a Function

Definition: A real valued function $f(x)$ of a real variable $x$ is said to
have a limit $l$ as $x \to a$ if for any pre-assigned arbitrary positive number $\varepsilon$, no
matter however small, there corresponds a positive number $\delta$ such
that $|f(x) - l| < \varepsilon$, i.e., $l - \varepsilon < f(x) < l + \varepsilon$, whenever $0 < |x - a| < \delta$,
i.e., $a - \delta < x < a + \delta$, $x \ne a$. This situation is denoted by
writing
\[
\lim_{x \to a} f(x) = l, \quad \text{or} \quad f(x) \to l \ \text{as} \ x \to a.
\]
The meaning is that for every neighborhood $(l - \varepsilon, l + \varepsilon)$ of $l$, there exists
a deleted neighborhood $(a - \delta, a + \delta)$ of $a$ such that $f(x)$ is in $(l - \varepsilon, l + \varepsilon)$ for
every $x$ in $(a - \delta, a + \delta)$ excluding $x = a$. This definition does not depend on the
behavior of $f(x)$ at $x = a$.
Geometrically this means that for every $x$ in the two open intervals
$a - \delta < x < a$ and $a < x < a + \delta$,
the graph of $f(x)$ lies between the horizontal lines
$y = l - \varepsilon$ and $y = l + \varepsilon$.

Figure 5.3: Limit of a function.

Notes:
(i) Here the graph of $y = f(x)$ has been assumed to be without any
break in the interval under consideration. For the determination of
$\lim_{x \to a} f(x)$ we are not at all concerned with the point on the graph corresponding to $x = a$. The point on the graph corresponding to $x = a$
may not belong to the curve $y = f(x)$ or may not even exist at all.
(ii) A limit, if it exists, is necessarily finite and unique.
(iii) To prove the existence of the limit of $f(x)$ it is sufficient to
show that the inequality $|f(x) - l| < \varepsilon$ follows from the inequality $0 < |x - a| < \delta$; here $\varepsilon$ is given and $\delta$ is to be found. If we can find $\delta > 0$
for each given $\varepsilon > 0$ so that the two inequalities are related in this way, we write
\[
\lim_{x \to a} f(x) = l.
\]

Key Vocabulary
Differentiable Function:
It is a function whose
derivative exists at each
point in its domain.

(iv) The existence of $\lim_{x \to a} f(x)$ depends completely on the values of $f(x)$
for $x$ near $a$ (not for $x$ at $a$).

A function $f(x)$ is said to have a limit $l_1$ as $x \to a$ from the left, if for
given $\varepsilon > 0$, no matter however small, we can find a $\delta > 0$ such that
$|f(x) - l_1| < \varepsilon$ for $a - \delta < x < a$.
This is denoted by
\[
\mathrm{L}\lim_{x \to a} f(x) = l_1 \quad \text{or} \quad \lim_{x \to a-} f(x) = l_1 \quad \text{or} \quad f(a - 0) = l_1
\]
and we say that $l_1$ is the left-hand limit of $f(x)$.
Similarly, if for given $\varepsilon > 0$, no matter however small, we can find a $\delta > 0$
such that
$|f(x) - l_2| < \varepsilon$ for $a < x < a + \delta$,
then we say that $f(x)$ has the limit $l_2$ as $x \to a$ from the right.
This is symbolically expressed as
\[
\mathrm{R}\lim_{x \to a} f(x) = l_2 \quad \text{or} \quad \lim_{x \to a+} f(x) = l_2 \quad \text{or} \quad f(a + 0) = l_2
\]
where $l_2$ is called the right-hand limit of $f(x)$.

Key Vocabulary
Function: A function is
a relation between a set
of inputs and a set of
permissible outputs with
the property that each
input is related to exactly
one output.


From the definition of limit, it follows that $\lim_{x \to a} f(x) = l$ if and only if
\[
\lim_{x \to a-} f(x) = l_1 = l = l_2 = \lim_{x \to a+} f(x)
\]

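Example 1: Using the definition, show that $\lim_{x \to 2} 3x = 6$.

Solution: Given $\varepsilon > 0$, we have to find $\delta > 0$ such that

$$|3x - 6| < \varepsilon \quad \text{for } 0 < |x - 2| < \delta,$$

i.e., $|x - 2| < \varepsilon/3$ for $0 < |x - 2| < \delta$.

Therefore, if we choose $\delta = \varepsilon/3$, our definition is satisfied, since

$$|3x - 6| < \varepsilon \quad \text{for } 0 < |x - 2| < \varepsilon/3.$$

Hence the result.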
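
The choice $\delta = \varepsilon/3$ in Example 1 can be spot-checked numerically. The sketch below is only an illustration, not a proof: the function names and the sampling scheme are assumptions made here, and a finite sample cannot replace the algebraic argument.

```python
def delta_for(epsilon: float) -> float:
    """Choice of delta used in Example 1: delta = epsilon / 3."""
    return epsilon / 3.0


def check_limit_3x(epsilon: float, samples: int = 1000) -> bool:
    """Sample points with 0 < |x - 2| < delta and test |3x - 6| < epsilon."""
    delta = delta_for(epsilon)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (2.0 - offset, 2.0 + offset):
            if not abs(3.0 * x - 6.0) < epsilon:
                return False
    return True


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.001):
        print(eps, delta_for(eps), check_limit_3x(eps))
```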


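Example 2: Using the definition, show that $\lim_{x \to 2} \dfrac{x^2 - 4}{x - 2} = 4$.

Solution: Given $\varepsilon > 0$, we have to find $\delta > 0$ such that

$$\left| \frac{x^2 - 4}{x - 2} - 4 \right| < \varepsilon \quad \text{for } 0 < |x - 2| < \delta,$$

i.e.,

$$\left| \frac{(x - 2)(x + 2)}{x - 2} - 4 \right| < \varepsilon \quad \text{for } 0 < |x - 2| < \delta.$$

Since we are considering the limit as $x \to 2$, we have $x - 2 \ne 0$, and hence we may cancel the factor $(x - 2)$ from both numerator and denominator. Thus,

$$|x + 2 - 4| < \varepsilon \quad \text{for } 0 < |x - 2| < \delta,$$

i.e., $|x - 2| < \varepsilon$ for $0 < |x - 2| < \delta$.

Therefore, if we choose $\delta = \varepsilon$, our definition is fulfilled, since

$$\left| \frac{x^2 - 4}{x - 2} - 4 \right| < \varepsilon \quad \text{for } 0 < |x - 2| < \varepsilon.$$

Hence the result.

$$\lim_{x \to a} p(x) = p(a)$$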

5.3 DIFFERENT TYPES OF LIMITS


Definition 1: A function $f(x)$ is said to tend to $+\infty$ as $x \to a$ if, for any pre-assigned positive number $N$, however large it may be, there exists a $\delta > 0$ such that

$$f(x) > N \quad \text{for } 0 < |x - a| < \delta.$$

This is symbolically expressed as

$$\lim_{x \to a} f(x) = +\infty.$$

Definition 2: A function $f(x)$ is said to tend to $-\infty$ as $x \to a$ if, for any pre-assigned positive number $N$, however large it may be, there exists a $\delta > 0$ such that

$$-f(x) > N \quad \text{for } 0 < |x - a| < \delta.$$

This is symbolically written as

$$\lim_{x \to a} f(x) = -\infty.$$

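Example 1: Using the definition, show that $\lim_{x \to 0} \dfrac{1}{x^2} = \infty$.

Solution: Given $N > 0$, however large, we have to find a $\delta > 0$ such that

$$\frac{1}{x^2} > N \quad \text{for } 0 < |x - 0| < \delta,$$

or

$$x^2 < \frac{1}{N}, \ \text{i.e., } |x| < \frac{1}{\sqrt{N}}, \quad \text{for } 0 < |x| < \delta.$$

Therefore, if we choose $\delta = \dfrac{1}{\sqrt{N}}$, our definition is satisfied, since

$$\frac{1}{x^2} > N \quad \text{for } 0 < |x - 0| < \frac{1}{\sqrt{N}}.$$

Hence the result.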


Definition 3: A function $f(x)$ is said to have a limit $l$ as $x$ tends to $\infty$ if, for any pre-assigned positive number $\varepsilon$, however small, there exists a positive number $m$, however large it may be, such that

$$|f(x) - l| < \varepsilon \quad \text{for } x > m.$$

This is denoted by $\lim_{x \to \infty} f(x) = l$.

Definition 4: A function $f(x)$ is said to have a limit $l$ as $x$ tends to $-\infty$ if, for any pre-assigned positive number $\varepsilon$, however small it may be, there exists a positive number $m$, however large, such that

$$|f(x) - l| < \varepsilon \quad \text{for } x < -m.$$

This is expressed by writing $\lim_{x \to -\infty} f(x) = l$.

Key Vocabulary
Limit: A limit is the value that a function or sequence approaches as the input or index approaches some value.

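Example 2: Using the definition, show that $\lim_{x \to \infty} \dfrac{1}{x^2} = 0$.

Solution: Given $\varepsilon > 0$, however small, we have to find an $m > 0$ such that

$$\left| \frac{1}{x^2} - 0 \right| < \varepsilon \quad \text{for } x > m,$$

or

$$\frac{1}{x^2} < \varepsilon \quad \text{for } x > m,$$

or

$$x > \frac{1}{\sqrt{\varepsilon}} \quad \text{for } x > m.$$

Therefore, if we choose $m = \dfrac{1}{\sqrt{\varepsilon}}$, our definition is satisfied, since

$$\left| \frac{1}{x^2} - 0 \right| < \varepsilon \quad \text{for } x > \frac{1}{\sqrt{\varepsilon}}.$$

Hence the result.

Key Vocabulary
Real Line: The real line or real number line is the line whose points are the real numbers. That is, the real line is the set R of all real numbers, viewed as a geometric space, namely the Euclidean space of dimension one.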


Definition 5: A function $f(x)$ is said to tend to $\infty$ as $x$ tends to $\infty$ if, for any given positive number $N$, there exists a positive number $m$ such that $f(x) > N$ for $x > m$.
This is symbolically expressed as

$$\lim_{x \to \infty} f(x) = \infty.$$

Example 3: Show that $\lim_{x \to \infty} x^2 = \infty$.

Solution: Given $N > 0$, we have to find an $m > 0$ such that

$$x^2 > N \quad \text{for } x > m,$$

or

$$x > \sqrt{N} \quad \text{for } x > m.$$

Therefore, if we choose $m = \sqrt{N}$, our definition is satisfied, since

$$x^2 > N \quad \text{for } x > \sqrt{N}.$$

Hence the result.

Some Standard Limits

(i) $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$
(ii) $\lim_{x \to \infty} \left(1 + \dfrac{1}{x}\right)^x = e$
(iii) $\lim_{x \to 0} (1 + x)^{1/x} = e$
(iv) $\lim_{x \to 0} \dfrac{1}{x}\log_e(1 + x) = 1$
(v) $\lim_{x \to 0} \dfrac{e^x - 1}{x} = 1$
(vi) $\lim_{x \to 0} \dfrac{a^x - 1}{x} = \log_e a,\ a > 0$
(vii) $\lim_{x \to a} \dfrac{x^n - a^n}{x - a} = n a^{n-1}$
(viii) $\lim_{x \to 0} \dfrac{(1 + x)^n - 1}{x} = n$
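
A few of the standard limits can be illustrated numerically by evaluating the expressions close to the limiting point. This is a rough sketch for intuition only; the helper names are chosen here for illustration, and floating-point evaluation merely suggests the limiting values, it does not prove them.

```python
import math


def sin_over_x(x: float) -> float:
    """sin(x)/x, whose limit as x -> 0 is 1 (standard limit (i))."""
    return math.sin(x) / x


def compound(x: float) -> float:
    """(1 + 1/x)**x, whose limit as x -> infinity is e (standard limit (ii))."""
    return (1.0 + 1.0 / x) ** x


def exp_quotient(x: float) -> float:
    """(e**x - 1)/x, whose limit as x -> 0 is 1 (standard limit (v))."""
    return (math.exp(x) - 1.0) / x


if __name__ == "__main__":
    for x in (1e-1, 1e-2, 1e-4):
        print(x, sin_over_x(x), exp_quotient(x))
    for x in (1e2, 1e4, 1e6):
        print(x, compound(x))
```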

Fundamental Theorems
Let $f(x)$ and $g(x)$ be two real single-valued functions of the real variable $x$, and let $\lim_{x \to a} f(x) = l$ and $\lim_{x \to a} g(x) = m$, where $l$ and $m$ are finite. Then

(i) $\lim_{x \to a} c f(x) = c \lim_{x \to a} f(x) = cl$, where $c$ is a constant.

(ii) $\lim_{x \to a} \{c_1 f(x) + c_2 g(x)\} = c_1 l + c_2 m$, where $c_1, c_2$ are constants.

(iii) $\lim_{x \to a} \{f(x) \cdot g(x)\} = l \cdot m$

(iv) $\lim_{x \to a} \dfrac{f(x)}{g(x)} = \dfrac{l}{m}$, provided $m \ne 0$.

(v) If $\varphi(x) \le f(x) \le \psi(x)$ in $(a - h, a + h)$, $h > 0$, and $\lim_{x \to a} \varphi(x) = \lim_{x \to a} \psi(x) = l$, then $\lim_{x \to a} f(x) = l$.

(vi) If $f(x) \le g(x)$ in $(a - h, a + h)$, $h > 0$, and $\lim_{x \to a} f(x) = l$, $\lim_{x \to a} g(x) = m$, then $l \le m$.

(vii) If $\lim_{x \to a} \varphi(x) = b$ and $\lim_{y \to b} f(y) = f(b)$, then

$$\lim_{x \to a} f\{\varphi(x)\} = f\left(\lim_{x \to a} \varphi(x)\right) = f(b).$$

5.4 CONTINUITY OF A FUNCTION


Continuity at a Point
A real-valued function $f(x)$ of a real variable $x$ is said to be continuous at $x = a$ provided the following three requirements are fulfilled:
(i) $f(a)$ exists, i.e., the function has a value at $x = a$,
(ii) $\lim_{x \to a} f(x)$ exists, and
(iii) $\lim_{x \to a} f(x) = f(a)$.

More precisely, we may come to the following analytical definition.

Definition: A function $f(x)$ defined in a neighborhood of $a$, including $a$ itself, is said to be continuous at $x = a$ if, given $\varepsilon > 0$, however small, there exists a $\delta > 0$ such that

$$|f(x) - f(a)| < \varepsilon \quad \text{for } |x - a| < \delta,$$

i.e., $f(a) - \varepsilon < f(x) < f(a) + \varepsilon$ for $a - \delta < x < a + \delta$.
Note: A point where $f(x)$ is not continuous is called a point of discontinuity.

Continuity from the Left

Definition: A function $f(x)$ is said to be continuous from the left at a point $x = a$ if $\lim_{x \to a^-} f(x) = f(a)$, i.e., given $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|f(x) - f(a)| < \varepsilon \quad \text{for } a - \delta < x \le a.$$

Continuity from the Right

Definition: A function $f(x)$ is said to be continuous from the right at a point $x = a$ if $\lim_{x \to a^+} f(x) = f(a)$, i.e., given $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|f(x) - f(a)| < \varepsilon \quad \text{for } a \le x < a + \delta.$$

Note: It is obvious that if a function $f(x)$ is continuous both from the left and from the right at a point $x = a$, then it is continuous at $x = a$.

Continuity in an Interval
A function $f(x)$ is said to be continuous in an interval if it is continuous at every point of the interval.
Note: Geometrically, the function $f(x)$ is said to be continuous at the point $x = a$ if there is no break in the graph of the function at the point whose abscissa is $a$ and in an arbitrarily small neighborhood of $a$.
Example 1: A function $f(x)$ is defined as follows:

$$f(x) = \begin{cases} x^2, & 0 < x < 1 \\ x, & 1 \le x < 2 \\ \dfrac{1}{4}x^2, & 2 \le x < 3 \end{cases}$$

Show that it is continuous at $x = 1$ and discontinuous at $x = 2$.

Solution:
(i) Here,

$$f(1 + 0) = \lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} x = 1 \quad (\text{as } x \to 1^+ \Rightarrow 1 < x < 2)$$

and

$$f(1 - 0) = \lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} x^2 = 1 \quad (\text{as } x \to 1^- \Rightarrow 0 < x < 1).$$

Also $f(1) = 1$.
$\therefore f(1 - 0) = f(1) = f(1 + 0)$, i.e., $f(x)$ is continuous at $x = 1$.

(ii) Now,

$$f(2 - 0) = \lim_{x \to 2^-} f(x) = \lim_{x \to 2^-} x = 2 \quad (\text{as } x \to 2^- \Rightarrow 1 < x < 2)$$

and

$$f(2 + 0) = \lim_{x \to 2^+} f(x) = \lim_{x \to 2^+} \frac{1}{4}x^2 = 1 \quad (\text{as } x \to 2^+ \Rightarrow 2 < x < 3).$$

$\therefore f(2 - 0) \ne f(2 + 0)$.
So $f(x)$ is discontinuous at $x = 2$.
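
The one-sided limits in this example can be approximated by evaluating the function slightly to the left and right of the point in question. The sketch below assumes the piecewise definition $x^2$, $x$, $\tfrac{1}{4}x^2$ on $(0,1)$, $[1,2)$, $[2,3)$; the helper `one_sided` and the step size are illustrative choices, not part of the text.

```python
def f(x: float) -> float:
    """Piecewise function of Example 1 (assumed pieces: x^2, x, x^2/4)."""
    if 0 < x < 1:
        return x ** 2
    if 1 <= x < 2:
        return x
    if 2 <= x < 3:
        return x ** 2 / 4
    raise ValueError("f is only defined for 0 < x < 3")


def one_sided(func, a: float, side: str, h: float = 1e-6) -> float:
    """Approximate a one-sided limit by evaluating func at a -/+ h."""
    return func(a - h) if side == "left" else func(a + h)


if __name__ == "__main__":
    for a in (1.0, 2.0):
        print(a, one_sided(f, a, "left"), one_sided(f, a, "right"), f(a))
```

At $x = 1$ the two approximations and $f(1)$ agree closely, while at $x = 2$ the left and right values stay near 2 and 1 respectively, matching the conclusion of the example.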
Example 2: Show that the function $f(x) = x - [x]$, where $[x]$ denotes the greatest integer not greater than $x$, is discontinuous at $x = 0$.

Solution: Here

$$f(x) = \begin{cases} x - (-1) = x + 1, & \text{when } -1 \le x < 0 \\ x - 0 = x, & \text{when } 0 \le x < 1 \end{cases}$$

$$f(0 - 0) = \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} (x + 1) = 1 \quad (\text{since } -1 < x < 0)$$

and

$$f(0 + 0) = \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x = 0.$$

$\therefore f(0 - 0) \ne f(0 + 0)$.

Therefore $f(x)$ is discontinuous at $x = 0$.

Note: This function is discontinuous at all integral values of $x$.
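Example 3: Show that the function

$$f(x) = \begin{cases} x\cos\dfrac{1}{x}, & \text{when } x \ne 0 \\ 0, & \text{when } x = 0 \end{cases}$$

is continuous at $x = 0$.
Solution: Here

$$|f(x) - f(0)| = \left| x\cos\frac{1}{x} - 0 \right| = |x|\left|\cos\frac{1}{x}\right| \le |x| \quad \left(\text{since } \left|\cos\frac{1}{x}\right| \le 1 \text{ for all non-zero real } x\right).$$

Therefore, given $\varepsilon > 0$, there exists $\delta = \varepsilon > 0$ such that $|f(x) - f(0)| < \varepsilon$ whenever $|x - 0| < \delta$.
Hence $f(x)$ is continuous at $x = 0$.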
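Example 4: Show that the function

$$f(x) = \begin{cases} x^2\sin\dfrac{1}{x}, & \text{when } x \ne 0 \\ 0, & \text{when } x = 0 \end{cases}$$

is continuous at $x = 0$.
Solution: Here

$$|f(x) - f(0)| = \left| x^2\sin\frac{1}{x} - 0 \right| = x^2\left|\sin\frac{1}{x}\right| \le x^2 \quad \left(\text{since } \left|\sin\frac{1}{x}\right| \le 1 \text{ for all non-zero real } x\right).$$

Therefore, given $\varepsilon > 0$, there exists a $\delta = \sqrt{\varepsilon} > 0$ such that

$$|f(x) - f(0)| < \varepsilon \quad \text{whenever } |x - 0| < \delta.$$

Therefore $f(x)$ is continuous at $x = 0$.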

Theorem of Continuity
Let $f(x)$ and $g(x)$ both be continuous at $x = a$. Then
(i) $f(x) + g(x)$, $f(x) - g(x)$ and $f(x) \cdot g(x)$ are continuous at $x = a$.
(ii) $\dfrac{f(x)}{g(x)}$ is continuous at $x = a$, provided $g(a) \ne 0$.
(iii) $|f(x)|$ and $|g(x)|$ are continuous at $x = a$.
(iv) Any constant function is continuous at any point.
(v) Let $f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n$, $a_0 \ne 0$, be a polynomial in $x$ of degree $n$; then $f(x)$ is continuous for all real values of $x$.
(vi) If $g(x)$ is continuous at $f(a)$, then $g\{f(x)\}$ is continuous at $x = a$.

5.5 DERIVABILITY OF A FUNCTION


Definition: Let $y = f(x)$ be a real single-valued function defined in the closed interval $[a, b]$ and let $c$ be a point in $(a, b)$, i.e., $a < c < b$. If

$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$

exists and is finite, then the limit is called the derivative of $f(x)$ at $x = c$, denoted by $f'(c)$ or $\dfrac{dy}{dx}$ at $x = c$, and we say that $f$ is derivable or differentiable at $x = c$.

Right-hand and Left-hand Derivatives

If $\lim_{h \to 0^+} \dfrac{f(c + h) - f(c)}{h}$ exists, it is called the right-hand derivative of $f(x)$ at $x = c$, denoted by $Rf'(c)$ or $f'(c + 0)$. Also, if $\lim_{h \to 0^-} \dfrac{f(c + h) - f(c)}{h}$ exists, it is called the left-hand derivative of $f(x)$ at $x = c$, denoted by $Lf'(c)$ or $f'(c - 0)$.

Notes: (i) $f'(c)$ exists if and only if $Lf'(c)$ and $Rf'(c)$ both exist and are equal.
(ii) If either one fails to exist, or if both exist but are unequal, then $f'(c)$ does not exist.

Derivability in an Interval
Let $f(x)$ be a function defined in $[a, b]$. Then $f(x)$ is said to be derivable in $[a, b]$ if
(i) $f'(c)$ exists for all $c$ such that $a < c < b$,
(ii) $f'(a + 0)$ exists,
(iii) $f'(b - 0)$ exists.
Theorem: If a function $f(x)$ has a finite derivative at $x = c$, then it is continuous at $x = c$.
Proof: Let $f(x)$ have a finite derivative $f'(c)$ at $x = c$. For $h \ne 0$,

$$f(c + h) - f(c) = \frac{f(c + h) - f(c)}{h} \cdot h.$$

Therefore

$$\lim_{h \to 0} \{f(c + h) - f(c)\} = \lim_{h \to 0} \left\{ \frac{f(c + h) - f(c)}{h} \cdot h \right\} = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h} \cdot \lim_{h \to 0} h \quad (\text{since both the limits exist})$$

$$= f'(c) \cdot 0 = 0.$$

Thus for any $\varepsilon > 0$, there exists a $\delta > 0$ such that
$|f(c + h) - f(c)| < \varepsilon$ whenever $|h - 0| < \delta$,
or $|f(x) - f(c)| < \varepsilon$ whenever $|x - c| < \delta$ (putting $x = c + h$).
This proves that $f(x)$ is continuous at $x = c$.

Remark: The converse of the above theorem is not true in general; that is, a function may be continuous at a point but not derivable there. This is illustrated in the following example.
Example: Show that the function $f(x) = |x|$, $x$ real, is continuous but not derivable at $x = 0$.
Solution: Here

$$f(x) = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases}$$

$$\lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x = 0 \quad (\text{since } x > 0)$$

and

$$\lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} (-x) = 0 \quad (\text{since } x < 0).$$

Also, $f(0) = 0$. Thus,

$$\lim_{x \to 0^-} f(x) = \lim_{x \to 0^+} f(x) = f(0).$$

Hence $f(x)$ is continuous at $x = 0$.

Now,

$$f'(0 - 0) = \lim_{h \to 0^-} \frac{f(h) - f(0)}{h} = \lim_{h \to 0^-} \frac{-h - 0}{h} \quad (\text{since } h < 0)$$

$$= \lim_{h \to 0^-} (-1) = -1 \quad (\text{since } h \ne 0).$$

Also,

$$f'(0 + 0) = \lim_{h \to 0^+} \frac{f(h) - f(0)}{h} = \lim_{h \to 0^+} \frac{h - 0}{h} \quad (\text{since } h > 0)$$

$$= \lim_{h \to 0^+} (1) = 1 \quad (\text{since } h \ne 0).$$

$\therefore f'(0 - 0) \ne f'(0 + 0)$; therefore, $f'(0)$ does not exist though $f(x)$ is continuous at $x = 0$. Hence the result.
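
The mismatch between the left-hand and right-hand derivatives of $|x|$ at $0$ can be seen from one-sided difference quotients. The following is a small illustrative sketch; `one_sided_derivative` is a helper name introduced here, and a fixed small step only approximates the limits.

```python
def one_sided_derivative(func, c: float, side: str, h: float = 1e-6) -> float:
    """Difference quotient (f(c + s) - f(c)) / s with s = -h (left) or +h (right)."""
    step = -h if side == "left" else h
    return (func(c + step) - func(c)) / step


if __name__ == "__main__":
    left = one_sided_derivative(abs, 0.0, "left")
    right = one_sided_derivative(abs, 0.0, "right")
    print("left-hand quotient:", left)
    print("right-hand quotient:", right)
```

Because the two quotients settle on different values, no single derivative exists at $x = 0$, which is the point of the remark.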
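Example 1: Let

$$f(x) = \begin{cases} x, & 0 < x < 1 \\ 2 - x, & 1 \le x \le 2 \\ x - x^2, & x > 2 \end{cases}$$

Show that $f(x)$ is discontinuous at $x = 2$ and also verify that $f'(2)$ does not exist.
Solution: Here

$$\lim_{x \to 2^-} f(x) = \lim_{x \to 2^-} (2 - x) = 0 \quad (\text{since } 1 < x < 2)$$

and

$$\lim_{x \to 2^+} f(x) = \lim_{x \to 2^+} (x - x^2) = -2 \quad (\text{since } x > 2).$$

Hence $\lim_{x \to 2^-} f(x) \ne \lim_{x \to 2^+} f(x)$, and consequently $f(x)$ is discontinuous at $x = 2$.
Now,

$$Lf'(2) = \lim_{h \to 0^-} \frac{f(2 + h) - f(2)}{h} = \lim_{h \to 0^-} \frac{2 - (2 + h) - 0}{h} \quad (\text{since } h < 0)$$

$$= \lim_{h \to 0^-} \frac{-h}{h} = \lim_{h \to 0^-} (-1) = -1 \quad (\text{since } h \ne 0)$$

and

$$Rf'(2) = \lim_{h \to 0^+} \frac{f(2 + h) - f(2)}{h} = \lim_{h \to 0^+} \frac{(2 + h) - (2 + h)^2 - 0}{h} \quad (\text{since } h > 0)$$

$$= \lim_{h \to 0^+} \frac{-(h^2 + 3h + 2)}{h} = \lim_{h \to 0^+} \left\{ -\left( h + 3 + \frac{2}{h} \right) \right\}.$$

This limit does not exist (since it tends to $-\infty$). Hence $f'(2)$ does not exist.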

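Example 2: Let $f(x)$ be the function defined by $f(x) = 2|x| + |x - 2|$, $x$ real. Show that $f(x)$ is not derivable at $x = 0, 2$ and is derivable at every other point.
Solution: Here

$$f(x) = \begin{cases} -3x + 2, & \text{when } x < 0 \\ x + 2, & \text{when } 0 \le x \le 2 \\ 3x - 2, & \text{when } x > 2 \end{cases}$$

Now,

$$Lf'(0) = \lim_{h \to 0^-} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^-} \frac{-3h + 2 - 2}{h} \quad (\text{since } h < 0)$$

$$= \lim_{h \to 0^-} (-3) = -3 \quad (\text{since } h \ne 0)$$

and

$$Rf'(0) = \lim_{h \to 0^+} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^+} \frac{h + 2 - 2}{h} \quad (\text{since } 0 < h < 2)$$

$$= \lim_{h \to 0^+} 1 = 1 \quad (\text{since } h \ne 0).$$

$\therefore Lf'(0) \ne Rf'(0)$.

Hence $f'(0)$ does not exist, i.e., $f(x)$ is not derivable at $x = 0$.

Again,

$$Lf'(2) = \lim_{h \to 0^-} \frac{f(2 + h) - f(2)}{h} = \lim_{h \to 0^-} \frac{2 + h + 2 - 4}{h} \quad (\text{since } 0 < 2 + h < 2)$$

$$= \lim_{h \to 0^-} 1 = 1 \quad (\text{since } h \ne 0)$$

and

$$Rf'(2) = \lim_{h \to 0^+} \frac{f(2 + h) - f(2)}{h} = \lim_{h \to 0^+} \frac{3(2 + h) - 2 - 4}{h} = \lim_{h \to 0^+} 3 = 3 \quad (\text{since } h \ne 0).$$

$\therefore Lf'(2) \ne Rf'(2)$.

Therefore $f'(2)$ does not exist, i.e., $f(x)$ is not derivable at $x = 2$.

It is obvious that $f(x)$ is derivable at every point other than $x = 0, 2$.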

Example 3: Prove that the function $f(x) = |x - 1|$, $0 < x < 2$, is continuous at $x = 1$ but not differentiable there.
Solution: Here

$$f(x) = \begin{cases} -x + 1, & \text{when } 0 < x < 1 \\ x - 1, & \text{when } 1 \le x < 2 \end{cases}$$

Now,

$$\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (-x + 1) = -1 + 1 = 0 \quad (\text{since } 0 < x < 1)$$

and

$$\lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (x - 1) = 1 - 1 = 0 \quad (\text{since } 1 < x < 2).$$

$$\therefore \lim_{x \to 1^-} f(x) = \lim_{x \to 1^+} f(x) = 0 = f(1).$$

Hence $f(x)$ is continuous at $x = 1$.

Again,

$$Lf'(1) = \lim_{h \to 0^-} \frac{f(1 + h) - f(1)}{h} = \lim_{h \to 0^-} \frac{-(1 + h) + 1 - 0}{h} \quad (\text{since } 0 < 1 + h < 1)$$

$$= \lim_{h \to 0^-} (-1) = -1$$

and

$$Rf'(1) = \lim_{h \to 0^+} \frac{f(1 + h) - f(1)}{h} = \lim_{h \to 0^+} \frac{(1 + h) - 1 - 0}{h} \quad (\text{since } 1 < 1 + h < 2)$$

$$= \lim_{h \to 0^+} 1 = 1 \quad (\text{since } h \ne 0).$$

$\therefore Lf'(1) \ne Rf'(1)$.

Therefore we conclude that $f(x)$ is continuous at $x = 1$ but not differentiable there.


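Example 4: Examine the continuity and differentiability of the function

$$f(x) = \begin{cases} 1, & \text{when } x \le 0 \\ 1 + \sin x, & \text{when } x > 0 \end{cases}$$

at $x = 0$.
Solution: Here,

$$f(0 - 0) = \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} 1 = 1 \quad (\text{since } x < 0)$$

$$f(0 + 0) = \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} (1 + \sin x) = 1 \quad (\text{since } x > 0)$$

and $f(0) = 1$.

$\therefore f(0 - 0) = f(0 + 0) = f(0)$.
Hence $f(x)$ is continuous at $x = 0$.
Again,

$$f'(0 - 0) = \lim_{h \to 0^-} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^-} \frac{1 - 1}{h} = 0 \quad (\text{since } h < 0)$$

and

$$f'(0 + 0) = \lim_{h \to 0^+} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^+} \frac{1 + \sin h - 1}{h} = \lim_{h \to 0^+} \frac{\sin h}{h} = 1 \quad (\text{since } h > 0).$$

Since $f'(0 - 0) \ne f'(0 + 0)$, $f(x)$ is not derivable (or differentiable) at $x = 0$.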


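Example 5: A function $f(x)$ is defined by

$$f(x) = \begin{cases} x\sin\dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Prove that $f(x)$ is continuous but not derivable at $x = 0$.

Solution: Here

$$|f(x) - f(0)| = \left| x\sin\frac{1}{x} - 0 \right| = |x|\left|\sin\frac{1}{x}\right| \le |x| \quad \left(\text{since } \left|\sin\frac{1}{x}\right| \le 1 \text{ for all non-zero real } x\right).$$

Therefore, given $\varepsilon > 0$, there exists a $\delta = \varepsilon > 0$ such that

$$|f(x) - f(0)| < \varepsilon \quad \text{whenever } |x - 0| < \delta.$$

Hence $f(x)$ is continuous at $x = 0$.
For the derivability of $f(x)$ at $x = 0$, we have from the definition

$$f'(0) = \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{h\sin\dfrac{1}{h} - 0}{h} = \lim_{h \to 0} \sin\frac{1}{h},$$

which does not exist, since $\sin\dfrac{1}{h}$ oscillates finitely near $h = 0$.

Therefore, $f(x)$ is continuous at $x = 0$ but not derivable at $x = 0$.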
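
The oscillation of the difference quotient $\sin(1/h)$ can be made concrete by evaluating it along two sequences of $h$ that both shrink to zero. The sketch below uses the sequences $h = 1/((2k + \tfrac12)\pi)$ and $h = 1/((2k + \tfrac32)\pi)$, chosen here for illustration because the quotient is close to $+1$ on the first and $-1$ on the second.

```python
import math


def quotient(h: float) -> float:
    """Difference quotient of f(x) = x*sin(1/x), f(0) = 0, at the point 0."""
    return (h * math.sin(1.0 / h) - 0.0) / h


def h_plus(k: int) -> float:
    """Step sizes on which sin(1/h) is (approximately) +1."""
    return 1.0 / ((2 * k + 0.5) * math.pi)


def h_minus(k: int) -> float:
    """Step sizes on which sin(1/h) is (approximately) -1."""
    return 1.0 / ((2 * k + 1.5) * math.pi)


if __name__ == "__main__":
    for k in (1, 10, 1000):
        print(h_plus(k), quotient(h_plus(k)), h_minus(k), quotient(h_minus(k)))
```

Since the quotient keeps returning to values near $+1$ and $-1$ however small $h$ becomes, it cannot settle on a single limit.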
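Example 6: Show that the function $f(x)$ defined by

$$f(x) = \begin{cases} x^2\sin\dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

is derivable at $x = 0$.

Solution: Here,

$$f'(0) = \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{h^2\sin\dfrac{1}{h} - 0}{h} = \lim_{h \to 0} h\sin\frac{1}{h} \quad (\text{since } h \ne 0)$$

$$= 0 \quad \left(\text{since } -1 \le \sin\frac{1}{h} \le 1\right).$$

Hence $f(x)$ is derivable at $x = 0$.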


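Example 7: If

$$f(x) = \begin{cases} \dfrac{x}{1 + e^{1/x}}, & x \ne 0 \\ 0, & x = 0 \end{cases}$$

then show that $f(x)$ is continuous at $x = 0$ but not derivable at $x = 0$.

Solution: We have

$$f(0 + 0) = \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} \frac{x}{1 + e^{1/x}} = \lim_{x \to 0^+} \frac{x e^{-1/x}}{e^{-1/x} + 1} = 0 \quad (\text{since } e^{-1/x} \to 0 \text{ as } x \to 0^+)$$

and

$$f(0 - 0) = \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} \frac{x}{1 + e^{1/x}} = 0 \quad (\text{since } e^{1/x} \to 0 \text{ as } x \to 0^-).$$

Also $f(0) = 0$.
$\therefore f(0 - 0) = f(0 + 0) = f(0)$.

Hence $f(x)$ is continuous at $x = 0$.

Now,

$$f'(0 - 0) = \lim_{h \to 0^-} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^-} \frac{\dfrac{h}{1 + e^{1/h}}}{h} = \lim_{h \to 0^-} \frac{1}{1 + e^{1/h}} = 1$$

(since $\dfrac{1}{h} \to -\infty$ as $h \to 0^-$, and so $e^{1/h} \to 0$ as $h \to 0^-$), and

$$f'(0 + 0) = \lim_{h \to 0^+} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^+} \frac{1}{1 + e^{1/h}} = 0$$

(since $\dfrac{1}{h} \to \infty$ as $h \to 0^+$, and so $e^{1/h} \to \infty$ as $h \to 0^+$).

$\therefore f'(0 - 0) \ne f'(0 + 0)$.

Therefore $f(x)$ is not derivable at $x = 0$. Hence we conclude that $f(x)$ is continuous at $x = 0$ but not derivable at $x = 0$.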
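Example 8: Examine the continuity and derivability of the function $f(x)$ defined by

$$f(x) = \begin{cases} x^2 + x + 2 & \text{for } 0 \le x < 1 \\ 3x + 1 & \text{for } 1 \le x \le 2 \end{cases}$$

at $x = 1$.
Solution: Here

$$f(1 - 0) = \lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (x^2 + x + 2) = 1 + 1 + 2 = 4 \quad (\text{since } 0 < x < 1)$$

and

$$f(1 + 0) = \lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (3x + 1) = 3 + 1 = 4 \quad (\text{since } 1 < x < 2).$$

Also $f(1) = 3 + 1 = 4$.
$\therefore f(1 - 0) = f(1 + 0) = f(1)$.

Hence $f(x)$ is continuous at $x = 1$.

Again,

$$f'(1 - 0) = \lim_{h \to 0^-} \frac{f(1 + h) - f(1)}{h} = \lim_{h \to 0^-} \frac{(1 + h)^2 + (1 + h) + 2 - 4}{h} \quad (\text{since } 0 < 1 + h < 1)$$

$$= \lim_{h \to 0^-} \frac{h^2 + 3h}{h} = \lim_{h \to 0^-} (h + 3) = 3 \quad (\text{since } h \ne 0)$$

and

$$f'(1 + 0) = \lim_{h \to 0^+} \frac{f(1 + h) - f(1)}{h} = \lim_{h \to 0^+} \frac{3(1 + h) + 1 - 4}{h} \quad (\text{since } 1 < 1 + h < 2)$$

$$= \lim_{h \to 0^+} \frac{3h}{h} = \lim_{h \to 0^+} 3 = 3 \quad (\text{since } h \ne 0).$$

$\therefore f'(1 - 0) = f'(1 + 0) = 3$.

Therefore, $f(x)$ is derivable (or differentiable) at $x = 1$.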


Example 9: It is given that $f(x + y) = f(x) f(y)$, $f(x) \ne 0$, for all real $x, y$, and $f'(0) = 2$. Prove that for all real $x$, $f'(x) = 2f(x)$. Hence find the value of $f(x)$.
Solution: It is given that

$$f(x + y) = f(x) f(y) \text{ for all real } x, y. \qquad \ldots(1)$$

For $x = y = 0$, we get from (1)

$$f(0) = f(0) f(0) \Rightarrow f(0) = 1 \quad (\text{since } f(0) \ne 0). \qquad \ldots(2)$$

Now,

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{f(x) f(h) - f(x)}{h} \quad [\text{by } (1)]$$

$$= f(x) \lim_{h \to 0} \frac{f(h) - 1}{h} = f(x) \lim_{h \to 0} \frac{f(h) - f(0)}{h} \quad [\text{by } (2)]$$

$$= f(x) f'(0) = 2f(x) \quad [\text{since } f'(0) = 2].$$

Thus $f'(x) = 2f(x)$ for all real $x$.

$$\therefore \frac{df}{f} = 2\,dx.$$

Integrating both sides, we have

$$\int \frac{df}{f} = 2\int dx, \quad \text{or} \quad \log f(x) = 2x + \log c.$$

When $x = 0$, $\log f(0) = \log c$, or $\log c = 0$ (since $f(0) = 1$, by (2)).

$$\therefore \log f(x) = 2x, \quad \text{or} \quad f(x) = e^{2x}.$$
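
As a sanity check on the result of Example 9, one can verify numerically that $f(x) = e^{2x}$ satisfies both the functional equation and $f'(x) = 2f(x)$. This is only a sketch: the central-difference helper `derivative` and the sample points are assumptions introduced here, not part of the example.

```python
import math


def f(x: float) -> float:
    """Candidate solution from Example 9: f(x) = e^(2x)."""
    return math.exp(2.0 * x)


def derivative(func, x: float, h: float = 1e-6) -> float:
    """Central-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    x, y = 0.3, 0.5
    print("f(x + y)     :", f(x + y))
    print("f(x) * f(y)  :", f(x) * f(y))
    print("f'(0) approx :", derivative(f, 0.0))
    print("f'(x) approx :", derivative(f, x), " 2 f(x):", 2.0 * f(x))
```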

Example 10: Let $f(x)$ be a continuous function and $g(x)$ be a discontinuous function. Prove that $f(x) + g(x)$ is a discontinuous function.
Solution: Let $F(x) = f(x) + g(x)$, where $f(x)$ is continuous and $g(x)$ is discontinuous. Suppose that $F(x)$ is continuous, so that $g(x) = F(x) - f(x)$ is also continuous, being the difference of two continuous functions. This is a contradiction, as $g(x)$ is given to be discontinuous. Therefore $F(x) = f(x) + g(x)$ must be discontinuous.
Example 11: Let $f(x)$ be a function satisfying the condition $f(-x) = f(x)$ for all real $x$. If $f'(0)$ exists, find its value.
Solution: It is given that

$$f(-x) = f(x) \text{ for all real } x. \qquad \ldots(1)$$

Also $f'(0)$ exists. Therefore

$$\lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{f(0 - h) - f(0)}{-h}$$

or

$$\lim_{h \to 0} \frac{f(h) - f(0)}{h} = -\lim_{h \to 0} \frac{f(h) - f(0)}{h} \quad [\text{by } (1)]$$

or

$$2\lim_{h \to 0} \frac{f(h) - f(0)}{h} = 0$$

or $2f'(0) = 0$, i.e., $f'(0) = 0$.

5.6 DIFFERENTIAL OF A FUNCTION


If the derivative of a function $f(x)$ exists, the relation

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x)$$

defining the derivative is equivalent to the relation

$$\frac{f(x + h) - f(x)}{h} = f'(x) + \varepsilon, \quad \text{or} \quad f(x + h) - f(x) = h f'(x) + \varepsilon h, \qquad \ldots(1)$$

where $\varepsilon \to 0$ as $h \to 0$.
Here $f(x + h) - f(x)$ denotes the increment of the function $f(x)$. Therefore, when the derivative of a function $f(x)$ exists, the increment of $f(x)$ consists of two parts: one part $h f'(x)$, called the principal part or linear part, and the other part $\varepsilon h$, called the error. The principal part $h f'(x)$ is known as the differential of the function $f(x)$, denoted by $df(x)$:

$$df(x) = h f'(x) = h\frac{df(x)}{dx}. \qquad \ldots(2)$$

In particular, if $f(x) = x$, then (2) gives

$$dx = h. \qquad \ldots(3)$$

Therefore,

$$df(x) = \frac{df(x)}{dx}\,dx. \qquad \ldots(4)$$

Notes:
(i) The differential of an independent variable is the same as the increment of that variable.
(ii) The differential of a dependent variable is not identical with its increment (compare (1) and (4)).
Example: Obtain $dx$, $dy$ and $\Delta y - dy$, given $y = x^2 + x$, $x = 1$, $\Delta x = 0.1$.
Solution:

$$dx = \Delta x = 0.1; \quad \Delta y = \{(1.1)^2 + 1.1\} - (1^2 + 1) = 0.31$$

$$dy = \frac{dy}{dx}\,dx = (2x + 1)\,dx = (2 + 1)(0.1) = 0.3$$

$$\Delta y - dy = 0.31 - 0.3 = 0.01.$$
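
The distinction between the increment $\Delta y$ and the differential $dy$ in this example can be reproduced with a few lines of code. The sketch assumes $y = x^2 + x$ as in the example; the function name `increment_and_differential` is introduced here for illustration.

```python
def y(x: float) -> float:
    """The function of the example: y = x^2 + x."""
    return x ** 2 + x


def dy_dx(x: float) -> float:
    """Its derivative: dy/dx = 2x + 1."""
    return 2.0 * x + 1.0


def increment_and_differential(x: float, dx: float):
    """Return (delta_y, dy): the true increment and the linear (principal) part."""
    delta_y = y(x + dx) - y(x)
    dy = dy_dx(x) * dx
    return delta_y, dy


if __name__ == "__main__":
    delta_y, dy = increment_and_differential(1.0, 0.1)
    print("increment    :", delta_y)
    print("differential :", dy)
    print("error        :", delta_y - dy)
```

For this quadratic the error term equals $(\Delta x)^2$, which is why it shrinks much faster than $\Delta x$ itself.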


5.7 MULTIPLE CHOICE QUESTIONS


1.

2.

3.

If f (x) =

(a) equal to 1

(b) equal to 1

(c) equal to 0

(d) does not exist

lim

1 cos 2 x

7.

is

(a) 0

(b) 1 /2

(c) 1 / 4

(d) 1

lim

sin ( cos 2 )
x2

x 0

equals

(a) p

(b) p /2

(c) p

(d) 1

6.

x2

x 0

4. lim

5.

x 1
, x 1, then lim f(x)
x 1
x 1

a x bx
a x + bx

, where a > b > 1 is equal to

(a) 1

(b) 1

(c) 0

(d) none of these

lim

ex + 1
2

ex 1

is equal to

(a) 0

(b) 1

(c) 1

(d) does not exist

lim x x 2 + x is equal to
x

(a) -1 / 2

(b) 1 / 2

(c) 0

(d) None of these

ax + 3, 0 < x < 1

If f(x) = x 2
and lim f(x) exists, then the value of a is
x 1
+ 1, x 1
a
(a) 0
(b) 1
(c) 2

8.

(d) 1

12 + 2 2 + 32 + ... + n 2
is equal to
n
n3

lim

(a) 1 / 2

(b) 1 / 6

(c) 1 / 3

(d) None of these


3x + 1, when x 1
If f(x) =
then lim f(x)
x 1
4 x, when x > 1

9.

10.

(a) is equal to 4

(b) is equal to 3

(c) is equal to 1

(d) does not exist

(i) lim
x 1

x2 1
=2
x1

(iii) lim (x + 1)sin


x 0

(ii) lim x cos


x 0

1
does not exist.
x

1
=0
x

(a) is equal to 1

(b) is equal to 2

(c) is equal to 0

(d) does not exist

5.8 REVIEW QUESTIONS
1. Using the definition of limit, show that
(i) $\lim_{x \to 1} \dfrac{x^2 - 1}{x - 1} = 2$
(ii) $\lim_{x \to 0} x\cos\dfrac{1}{x} = 0$
(iii) $\lim_{x \to 0} (x + 1)\sin\dfrac{1}{x}$ does not exist.

2. Is the function $\dfrac{\sqrt{1 + x} - \sqrt{1 - x}}{x}$ defined for all values of $x$? Indicate the values of $x$ for which it is defined and real. Find the limit as $x \to 0$.

3. Show that the following limits do not exist:


(i) lim e1/ x
x 0

x2
(iii) lim
x2 x 2
(v) lim

x 1

(ii) lim

x
sin x
(iv) lim
x 0
x
x 0

x sin(x 1)
x 1

4. A function $f(x)$ is defined as under:

$$f(x) = \begin{cases} x + 2, & x < 1 \\ 4x - 1, & 1 \le x < 3 \\ x^2 + 5, & x \ge 3. \end{cases}$$

Show that $\lim_{x \to 1} f(x) = 3$ and $\lim_{x \to 3} f(x)$ does not exist.

5. A function is defined as under:

$$f(x) = \begin{cases} \dfrac{\sin x}{x}, & x < 0 \\ ax + b, & x > 0. \end{cases}$$

If $\lim_{x \to 0} f(x)$ exists, then find $a$, $b$ and the limit.

6. A function is defined as under:

$$f(x) = \begin{cases} x^4, & x^2 < 1 \\ x, & x^2 \ge 1. \end{cases}$$

Show that $\lim_{x \to 1} f(x) = 1$ and $\lim_{x \to -1} f(x)$ does not exist.

7. Find $f(0)$ so that $f(x) = (1 + 2x)^{1/x}$ for $x \ne 0$ may be continuous at $x = 0$.

8. Examine the continuity and differentiability of $f(x)$ at $x = 0$, where

$$f(x) = \begin{cases} x\cos\dfrac{1}{x}, & x \ne 0 \\ 0, & x = 0. \end{cases}$$

9. Show that the function $f(x)$ defined by

$$f(x) = \begin{cases} x^2\cos\dfrac{1}{x}, & \text{if } x \ne 0 \\ 0, & x = 0 \end{cases}$$

is derivable at $x = 0$.

10. Let $f(x) = x^2$ if $x \le 1$ and $f(x) = ax + b$ if $x > 1$. Find the coefficients $a$ and $b$ for which the function is continuous and has a derivative at $x = 1$.

ANSWER FOR MULTIPLE CHOICE QUESTIONS


1. (d)

2. (b)

3. (c)

4. (a)

5. (b)

6. (a)

7. (d)

8.(c)

9.(d)

10. (d)

Chapter 6

Applications of Derivatives

Objectives
After studying this chapter, you will be able to:
Discuss the determinant of a square matrix
Explain the properties of determinants
Understand the minors
Describe the cofactors
Discuss the applications of matrices and determinants
Explain the adjoint of a matrix
Understand the expanding to find the determinant
Discuss the inverse of a square matrix
Understand consistent and inconsistent systems
Discuss the system of linear equations

INTRODUCTION

We have learnt how to find the derivative of composite functions, inverse trigonometric functions, implicit functions, exponential
functions and logarithmic functions. In this chapter, we will study applications of the derivative in various disciplines, e.g., in engineering,
science, social science, and many other fields. For instance, we will learn
how the derivative can be used (i) to determine rate of change of quantities, (ii) to find the equations of tangent and normal to a curve at a point,
(iii) to find turning points on the graph of a function which in turn will
help us to locate points at which largest or smallest value (locally) of a
function occurs. We will also use derivative to find intervals on which
a function is increasing or decreasing. Finally, we use the derivative to
find approximate value of certain quantities.

6.1 RATE OF CHANGE OF QUANTITIES


By the derivative $\dfrac{ds}{dt}$ we mean the rate of change of distance $s$ with respect to the time $t$. In a similar fashion, whenever one quantity $y$ varies with another quantity $x$, satisfying some rule $y = f(x)$, then $\dfrac{dy}{dx}$ (or $f'(x)$) represents the rate of change of $y$ with respect to $x$, and $\left(\dfrac{dy}{dx}\right)_{x = x_0}$ (or $f'(x_0)$) represents the rate of change of $y$ with respect to $x$ at $x = x_0$. Further, if two variables $x$ and $y$ are varying with respect to another variable $t$, i.e., if $x = f(t)$ and $y = g(t)$, then by the Chain Rule

$$\frac{dy}{dx} = \frac{dy}{dt} \Big/ \frac{dx}{dt}, \quad \text{if } \frac{dx}{dt} \ne 0.$$

Thus, the rate of change of $y$ with respect to $x$ can be calculated using the rate of change of $y$ and that of $x$, both with respect to $t$. Let us consider some examples.
Example 1: Find the rate of change of the area of a circle per second with respect to its radius $r$ when $r = 5$ cm.
Solution: The area $A$ of a circle with radius $r$ is given by $A = \pi r^2$. Therefore, the rate of change of the area $A$ with respect to its radius $r$ is given by

$$\frac{dA}{dr} = \frac{d}{dr}(\pi r^2) = 2\pi r.$$

When $r = 5$ cm, $\dfrac{dA}{dr} = 10\pi$. Thus, the area of the circle is changing at the rate of $10\pi$ cm²/s.

$$\frac{dV}{dt} = 9\ \text{cm}^3/\text{s}$$

$$9 = \frac{dV}{dt} = \frac{d}{dt}(x^3) = \frac{d}{dx}(x^3) \cdot \frac{dx}{dt} \quad (\text{By Chain Rule})$$

$$= 3x^2 \cdot \frac{dx}{dt}$$

or

$$\frac{dx}{dt} = \frac{3}{x^2} \qquad \ldots(1)$$

Now

$$\frac{dS}{dt} = \frac{d}{dt}(6x^2) = \frac{d}{dx}(6x^2) \cdot \frac{dx}{dt} \quad (\text{By Chain Rule})$$

$$= 12x \cdot \frac{3}{x^2} = \frac{36}{x}.$$

Hence, when $x = 10$ cm, $\dfrac{dS}{dt} = 3.6$ cm²/s.

Key Vocabulary
Decreasing Function: A function is said to be decreasing in an interval if f(x+h) < f(x) for all x belonging to the interval when h is positive.

Example 2: A stone is dropped into a quiet lake and waves move in circles at a speed of 4 cm per second. At the instant when the radius of the circular wave is 10 cm, how fast is the enclosed area increasing?
Solution: The area $A$ of a circle with radius $r$ is given by $A = \pi r^2$. Therefore, the rate of change of area $A$ with respect to time $t$ is

$$\frac{dA}{dt} = \frac{d}{dt}(\pi r^2) = \frac{d}{dr}(\pi r^2) \cdot \frac{dr}{dt} = 2\pi r\frac{dr}{dt} \quad (\text{By Chain Rule}).$$

It is given that $\dfrac{dr}{dt} = 4$ cm/s. Therefore, when $r = 10$ cm,

$$\frac{dA}{dt} = 2\pi(10)(4) = 80\pi.$$

Thus, the enclosed area is increasing at the rate of $80\pi$ cm²/s when $r = 10$ cm.
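
The chain-rule formula $dA/dt = 2\pi r \, dr/dt$ used in Example 2 can be cross-checked against a direct finite-difference estimate of the area's growth. The sketch below is illustrative: `area_rate` and `area_rate_numeric` are names chosen here, and the numeric version assumes the radius grows linearly at the given speed over a short time step.

```python
import math


def area_rate(r: float, dr_dt: float) -> float:
    """Chain-rule formula from Example 2: dA/dt = 2*pi*r*dr/dt."""
    return 2.0 * math.pi * r * dr_dt


def area_rate_numeric(r: float, dr_dt: float, dt: float = 1e-6) -> float:
    """Finite-difference estimate: area change over a short time dt, divided by dt."""
    area_now = math.pi * r ** 2
    area_later = math.pi * (r + dr_dt * dt) ** 2
    return (area_later - area_now) / dt


if __name__ == "__main__":
    print("formula :", area_rate(10.0, 4.0))
    print("80*pi   :", 80.0 * math.pi)
    print("numeric :", area_rate_numeric(10.0, 4.0))
```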
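Example 3: The length $x$ of a rectangle is decreasing at the rate of 3 cm/minute and the width $y$ is increasing at the rate of 2 cm/minute. When $x = 10$ cm and $y = 6$ cm, find the rates of change of (a) the perimeter and (b) the area of the rectangle.
Solution: Since the length $x$ is decreasing and the width $y$ is increasing with respect to time, we have

$$\frac{dx}{dt} = -3\ \text{cm/min} \quad \text{and} \quad \frac{dy}{dt} = 2\ \text{cm/min}.$$

(a) The perimeter $P$ of a rectangle is given by $P = 2(x + y)$. Therefore

$$\frac{dP}{dt} = 2\left(\frac{dx}{dt} + \frac{dy}{dt}\right) = 2(-3 + 2) = -2\ \text{cm/min}.$$

(b) The area $A$ of the rectangle is given by $A = x \cdot y$. Therefore

$$\frac{dA}{dt} = \frac{dx}{dt} \cdot y + x \cdot \frac{dy}{dt} = -3(6) + 10(2) \quad (\text{as } x = 10 \text{ cm and } y = 6 \text{ cm})$$

$$= 2\ \text{cm}^2/\text{min}.$$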
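
The two rates in Example 3 follow from differentiating $P = 2(x + y)$ and $A = xy$ with respect to time. A minimal sketch, assuming the rates $dx/dt = -3$ cm/min and $dy/dt = 2$ cm/min stated in the example (the function names are illustrative):

```python
def perimeter_rate(dx_dt: float, dy_dt: float) -> float:
    """dP/dt for P = 2(x + y)."""
    return 2.0 * (dx_dt + dy_dt)


def area_rate(x: float, y: float, dx_dt: float, dy_dt: float) -> float:
    """dA/dt for A = x*y, by the product rule: (dx/dt)*y + x*(dy/dt)."""
    return dx_dt * y + x * dy_dt


if __name__ == "__main__":
    # Length shrinking at 3 cm/min, width growing at 2 cm/min, at x = 10, y = 6.
    print("dP/dt (cm/min)  :", perimeter_rate(-3.0, 2.0))
    print("dA/dt (cm^2/min):", area_rate(10.0, 6.0, -3.0, 2.0))
```

The perimeter rate is negative while the area rate is positive, so at that instant the rectangle is shrinking in perimeter but growing in area.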

6.2 INCREASING AND DECREASING FUNCTIONS

Let $f(x)$ be a function defined on $(a, b)$ and let $x_0 \in (a, b)$ be any point. Then $f(x)$ is said to be an increasing function at $x_0$ if $f(x_0) > f(x)$ for all $x \in (a, b)$ to the left of $x_0$ and $f(x_0) < f(x)$ for all $x \in (a, b)$ to the right of $x_0$. Similarly, $f(x)$ is said to be a decreasing function at $x_0$ if $f(x_0) < f(x)$ for all $x \in (a, b)$ to the left of $x_0$ and $f(x_0) > f(x)$ for all $x \in (a, b)$ to the right of $x_0$.
In other words, a function is increasing if the y-value increases when $x$ increases, i.e., $x_1 < x_2 \Rightarrow f(x_1) < f(x_2)$, and decreasing if the y-value decreases when $x$ increases, i.e., $x_1 < x_2 \Rightarrow f(x_1) > f(x_2)$. A function that either increases or decreases everywhere in an interval is called monotonic on the interval.

Increasing Functions
A function is increasing if the y-value increases as the x-value increases, like this:

It is easy to see that y = f (x) tends to go up as it goes along.

6.2.1 Decreasing Functions


The y-value decreases as the x-value increases:

Key Vocabulary
Increasing Function: A
function is said to be
an increasing function
in an interval if f(x+h) >
f(x) for all x belonging
to the interval when h is
positive.

Key Vocabulary
Normal Line: It is defined as the line which is perpendicular to the tangent at the point of its contact to the curve.

For a function $y = f(x)$:

Decreasing: when $x_1 < x_2$ then $f(x_1) \ge f(x_2)$
Strictly Decreasing: when $x_1 < x_2$ then $f(x_1) > f(x_2)$

Notice that $f(x_1)$ is now larger than (or equal to) $f(x_2)$.
Example 1: Let us try to find where the function $f(x) = x^3 - 4x$ is increasing or decreasing, for $x$ in the interval $[-1, 2]$.
Solution: Let us plot it, including the interval $[-1, 2]$.

Starting from $-1$ (the beginning of the interval $[-1, 2]$):

at $x = -1$ the function is decreasing,
it continues to decrease until about 1.2,
it then increases from there, past $x = 2$.
Without exact analysis we cannot pinpoint where the curve turns from decreasing to increasing, so let us just say:

Within the interval $[-1, 2]$:

The curve decreases in the interval $[-1, \text{approx } 1.2]$
The curve increases in the interval $[\text{approx } 1.2, 2]$
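
The approximate turning point read off the graph can be pinned down with the derivative. For $f(x) = x^3 - 4x$ we have $f'(x) = 3x^2 - 4$, which vanishes in $[-1, 2]$ at $x = 2/\sqrt{3}$. The sketch below (names chosen here for illustration) evaluates this value and checks the sign of $f'$ on either side.

```python
import math


def f(x: float) -> float:
    """The function of the example: f(x) = x^3 - 4x."""
    return x ** 3 - 4.0 * x


def f_prime(x: float) -> float:
    """Its derivative: f'(x) = 3x^2 - 4."""
    return 3.0 * x ** 2 - 4.0


# Positive root of 3x^2 - 4 = 0; the only critical point inside [-1, 2].
turning_point = 2.0 / math.sqrt(3.0)

if __name__ == "__main__":
    print("turning point:", turning_point)
    print("f'(-1):", f_prime(-1.0), " f'(2):", f_prime(2.0))
```

The computed value is consistent with the "about 1.2" estimate in the text: $f'$ is negative to its left and positive to its right within the interval.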

6.2.2 Increasing and Decreasing Function Theorem


Let $f$ be continuous on $[a, b]$ and differentiable on the open interval $(a, b)$. Then,
$f$ is increasing on $[a, b]$ if $f'(x) > 0$ for each $x \in (a, b)$;
$f$ is decreasing on $[a, b]$ if $f'(x) < 0$ for each $x \in (a, b)$.

This theorem can be proved by using the Mean Value Theorem. We shall prove the theorem after learning the Mean Value Theorem. This theorem is applied in various problems to check whether a function is increasing or decreasing.

6.2.3 Increasing and Decreasing Function Rules


Let the given function be $f(x)$ on the real number line R.
Differentiate the function $f(x)$ with respect to $x$ and equate the derivative to zero, i.e., put $f'(x) = 0$. Solve for $x$. The values of $x$ which satisfy $f'(x) = 0$ are called critical values of the function.
Arrange these critical values in ascending order and partition the domain of $f(x)$ into intervals using the critical values.
Check the sign of $f'(x)$ in each open interval.
If $f'(x) > 0$ in a particular interval, then the function is increasing in that interval. If $f'(x) < 0$ in a particular interval, then the function is decreasing in that interval.
Example 2: Find the intervals on which the function $f(x) = \dfrac{x}{1 + x^2}$ is (a) increasing (b) decreasing.

Solution: Let $f(x) = \dfrac{x}{1 + x^2}$. Differentiating the function, we have

$$f'(x) = \frac{1 \cdot (1 + x^2) - x \cdot 2x}{(1 + x^2)^2} = \frac{1 - x^2}{(1 + x^2)^2}.$$

Setting $f'(x) = 0$ gives $1 - x^2 = 0$, or $x = \pm 1$.

Key Vocabulary
Stationary Point: Any point at which the tangent to the graph is horizontal is called a stationary point.

The critical values in ascending order are $-1, 1$. We divide the real numbers into the intervals $(-\infty, -1)$, $(-1, 1)$ and $(1, \infty)$.

If $x \in (-\infty, -1)$:

$$f'(x) = \frac{(1 - x)(1 + x)}{(1 + x^2)^2} = \frac{(+\text{ve})(-\text{ve})}{+\text{ve}} = -\text{ve}.$$

Since $f'(x) < 0$, $f(x)$ is decreasing in the interval $(-\infty, -1)$.

If $x \in (-1, 1)$:

$$f'(x) = \frac{(1 - x)(1 + x)}{(1 + x^2)^2} = \frac{(+\text{ve})(+\text{ve})}{+\text{ve}} = +\text{ve}.$$

Since $f'(x) > 0$ in the interval $(-1, 1)$, the function is increasing in this interval.

If $x \in (1, \infty)$:

$$f'(x) = \frac{(1 - x)(1 + x)}{(1 + x^2)^2} = \frac{(-\text{ve})(+\text{ve})}{+\text{ve}} = -\text{ve}.$$

Since $f'(x) < 0$, $f(x)$ is decreasing in the interval $(1, \infty)$.

Key Vocabulary
Real Line: The real line or real number line is the line whose points are the real numbers. That is, the real line is the set R of all real numbers, viewed as a geometric space, namely the Euclidean space of dimension one.

6.3 FIRST DERIVATIVE TEST FOR INCREASING AND DECREASING FUNCTIONS

Let the function $f(x)$ be continuous on $[a, b]$ and differentiable on $(a, b)$. If $f'(x)$ is positive ($> 0$) at every point in $(a, b)$, then $f(x)$ is increasing in the interval $[a, b]$. If $f'(x)$ is negative ($< 0$) at every point in $(a, b)$, then $f(x)$ is decreasing in the interval $[a, b]$.
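
The sign test above can be tried on the function $f(x) = x/(1 + x^2)$ from the preceding example by evaluating $f'(x) = (1 - x^2)/(1 + x^2)^2$ at one sample point in each interval cut out by the critical values $-1$ and $1$. This is a rough sketch; the helper `behaviour_at` and the chosen sample points are illustrative assumptions.

```python
def f_prime(x: float) -> float:
    """Derivative of f(x) = x / (1 + x^2): (1 - x^2) / (1 + x^2)^2."""
    return (1.0 - x ** 2) / (1.0 + x ** 2) ** 2


def behaviour_at(x: float) -> str:
    """Classify f near x from the sign of f'(x)."""
    slope = f_prime(x)
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "stationary"


if __name__ == "__main__":
    # One sample point from each of (-inf, -1), (-1, 1), (1, inf).
    for sample in (-2.0, 0.0, 2.0):
        print(sample, f_prime(sample), behaviour_at(sample))
```

Since $f'$ is continuous and vanishes only at the critical values, its sign at one sample point fixes its sign on the whole interval.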

6.3.1 Increasing and Decreasing Functions Graph


When the function is increasing, the graph of the function rises from left to right. On the other hand, while the function is decreasing, its graph falls from left to right.
The function $y = x^2$ is decreasing in the interval $(-\infty, 0)$ and is increasing in the interval $(0, \infty)$. But the cubic function is increasing in the entire domain $(-\infty, \infty)$. From the graph of an increasing and decreasing function, we come to know the following things about a function:

If the function is increasing to the left and decreasing to the right of a critical point, then the point corresponds to a local maximum. If the function is decreasing to the left and increasing to the right of a critical point, then the point corresponds to a local minimum. Based on this behavior, the first derivative test for determining the local maximum and local minimum can be stated as follows:
If the first derivative is positive to the left and negative to the right of a critical point, then the point corresponds to a local maximum. If the derivative is negative to the left and positive to the right of a critical point, then the point corresponds to a local minimum. If the critical point happens to be an inflection point, then no change in the behavior of the function is seen: the function continues to be either increasing or decreasing.

The cubic graph has a point of inflection at $x = 0$. The graph is seen to be increasing on either side of the critical point $(0, 0)$. Indeed, the function is monotonic increasing in the entire domain $(-\infty, \infty)$.

6.3.2 Increasing and Decreasing Behavior of Some Known Functions

$f(x) = e^x$. The exponential function is monotonic increasing in its entire domain $(-\infty, \infty)$.
$f(x) = \ln x$. The logarithm function is monotonic increasing in its domain $(0, \infty)$.
$f(x) = 1/x$. The reciprocal function is decreasing on each of the intervals $(-\infty, 0)$ and $(0, \infty)$ that make up its domain $(-\infty, 0) \cup (0, \infty)$.
In the case of the periodic functions $f(x) = \sin x$ and $f(x) = \cos x$, the increasing or decreasing behavior also changes periodically between two extreme values. $f(x) = \tan x$ is monotonic increasing within each successive period, while $f(x) = \cot x$ is decreasing within each period.

6.4 TANGENT AND NORMAL LINES


The adjoining figure shows the graph of the curve $y = f(x)$, and P is a point on it. The line TPQ is the tangent to the curve at P. It is inclined at an angle $\varphi$ to the positive direction of the x-axis.

Then, as seen previously, $\tan\varphi = \left(\frac{dy}{dx}\right)_P$ represents the slope of the tangent line TPQ to the curve $y = f(x)$ at the point P.

Thus, the equation of the tangent line to the curve at $P(x_1, y_1)$ is

$$y - y_1 = \left(\frac{dy}{dx}\right)_P (x - x_1)$$

The normal line is defined as the line which is perpendicular to the tangent at its point of contact with the curve. Therefore the slope of the normal line at P is

$$-\frac{1}{\left(\frac{dy}{dx}\right)_P}$$

Therefore, the equation of the normal line at P to the curve $y = f(x)$ is

$$y - y_1 = -\frac{1}{\left(\frac{dy}{dx}\right)_P}(x - x_1)$$

6.5 ANGLE BETWEEN TWO CURVES


The angle between two curves is the angle between the two tangent lines drawn to the curves at their point of intersection. If $m_1$ and $m_2$ are the slopes of these two tangents and $\theta$ is the acute angle between them, then we have

$$\tan\theta = \left|\frac{m_1 - m_2}{1 + m_1 m_2}\right|$$

Let $C_1$ and $C_2$ be two curves intersecting at a point P. These curves are said to intersect orthogonally at P if the tangents at P to $C_1$ and $C_2$ are at right angles, i.e. $m_1 m_2 = -1$.

Definition
Let the tangent and normal at $P(x_1, y_1)$ to the curve $y = f(x)$ meet the x-axis at T and G respectively, and let PN be perpendicular to the x-axis at N. Then
(1) PT is called the tangent length at P,
(2) PG is called the normal length at P,
(3) the segment TN is known as the sub-tangent at P, and
(4) the segment NG is known as the sub-normal at P.
Writing $y'$ for $\left(\frac{dy}{dx}\right)_P$, we have
(1) $PT = \dfrac{y_1\sqrt{1 + (y')^2}}{y'}$
(2) $PG = y_1\sqrt{1 + (y')^2}$
(3) $NT = \dfrac{y_1}{y'}$
(4) $NG = y_1 y'$
The lengths themselves are the absolute values of these expressions.

Example 1: Find the equation of the tangent and normal to the curve $y = 3x^2 - 4x - 2$ at the point $(1, -3)$.
Solution: The curve is $y = 3x^2 - 4x - 2$.
Differentiating with respect to x, we get
$$\frac{dy}{dx} = 6x - 4$$
Let $P = (1, -3)$. Then
$$\left(\frac{dy}{dx}\right)_{P(1,-3)} = 6(1) - 4 = 2$$
The equation of the tangent at the point $P(1, -3)$ is
$y - (-3) = 2(x - 1)$
i.e. $y + 3 = 2x - 2$
i.e. $y = 2x - 5$
Next, the slope of the normal at P is
$$-\frac{1}{\left(\frac{dy}{dx}\right)_P} = -\frac{1}{2}$$
The equation of the normal at $P(1, -3)$ is
$y - (-3) = -\frac{1}{2}(x - 1)$
i.e. $2(y + 3) = -(x - 1)$
i.e. $x + 2y + 5 = 0$
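The result of Example 1 can be checked numerically. The sketch below is an illustration under assumptions: the helper `tangent_and_normal` is a hypothetical name, and the slope is estimated by a central difference rather than by symbolic differentiation.

```python
def tangent_and_normal(f, x1, dx=1e-6):
    """Return (tangent slope, normal slope, y1) for the curve y = f(x) at x = x1."""
    y1 = f(x1)
    m = (f(x1 + dx) - f(x1 - dx)) / (2 * dx)  # central-difference slope estimate
    return m, -1.0 / m, y1

def curve(x):
    return 3 * x**2 - 4 * x - 2

m_t, m_n, y1 = tangent_and_normal(curve, 1.0)
print(round(m_t, 6), round(m_n, 6), y1)

# The point (1, -3) should satisfy both lines found in Example 1
x1 = 1.0
print(y1 - (2 * x1 - 5))      # tangent: y = 2x - 5
print(x1 + 2 * y1 + 5)        # normal:  x + 2y + 5 = 0
```

The estimated slopes should agree with 2 and -1/2 to within rounding, and both residuals should be zero.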
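Example 2: Find the equation of the tangent and normal line to the curve $y = \sqrt{x^2 + 8}$ at the point $(-1, 3)$.
Solution: The curve is $y = \sqrt{x^2 + 8}$ and $P = (-1, 3)$.
Differentiating with respect to x, we get
$$\frac{dy}{dx} = \frac{2x}{2\sqrt{x^2 + 8}} = \frac{x}{\sqrt{x^2 + 8}}$$
$$\left(\frac{dy}{dx}\right)_P = \frac{-1}{\sqrt{1 + 8}} = -\frac{1}{3}$$
The equation of the tangent line at P is
$$y - y_1 = \left(\frac{dy}{dx}\right)_P (x - x_1)$$
$3(y - 3) = -(x + 1)$
i.e. $x + 3y - 8 = 0$
Also, the slope of the normal is
$$-\frac{1}{\left(\frac{dy}{dx}\right)_P} = -\frac{1}{-\frac{1}{3}} = 3$$
Hence the equation of the normal at P is $y - 3 = 3(x + 1)$, i.e. $3x - y + 6 = 0$.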

Example 3: Find the co-ordinates of the point of contact of the tangent line to the curve $y = x \log x$ which is inclined at an angle of $45^\circ$ with the x-axis.
Solution: Let $P(x, y)$ be the point of contact.
The curve is $y = x \log x$.
$$\frac{dy}{dx} = x\left[\frac{1}{x}\right] + \log x\,[1]$$
$$\left(\frac{dy}{dx}\right)_P = 1 + \log x$$
Slope of the tangent at P $= 1 + \log x$ ... (1)
Since this tangent has an inclination of $45^\circ$ with the x-axis, its slope is
$\tan 45^\circ = 1$ ... (2)
$1 + \log x = 1$ [from (1) and (2)]
$\log x = 0$, so $x = 1$
But P lies on the curve $y = x \log x$, so
$y = 1 \cdot \log 1 = 0$
Therefore P has co-ordinates $(1, 0)$.
Example 4: Find the angle between the two curves $2y^2 = x^3$ and $y^2 = 32x$ at their point of intersection in the first quadrant.
Solution: The curves are
$2y^2 = x^3$ ... (1)
$y^2 = 32x$ ... (2)
Solving (1) and (2), we get
$2(32x) = x^3$
$x^3 - 64x = 0$
$x(x^2 - 64) = 0$
$x = 0$ or $x = \pm 8$
As the point of intersection is in the first quadrant, x must be positive and non-zero, so $x = 8$.
Putting $x = 8$ in $y^2 = 32x$, we get
$y^2 = 32(8) = 256$
$y = \pm 16$
Accepting $y = 16$, the point of intersection is $P = (8, 16)$, which lies in the first quadrant.
Now differentiating (1) and (2) with respect to x, we get from (1)
$$4y\frac{dy}{dx} = 3x^2, \qquad \frac{dy}{dx} = \frac{3x^2}{4y}$$
$$m_1 = \left(\frac{dy}{dx}\right)_P = \frac{3(8)^2}{4(16)} = 3$$
so the slope of the tangent $PT_1$ is $m_1 = 3$.
Again, from (2),
$$2y\frac{dy}{dx} = 32, \qquad \frac{dy}{dx} = \frac{16}{y}$$
$$m_2 = \left(\frac{dy}{dx}\right)_P = \frac{16}{16} = 1$$
so the slope of the tangent $PT_2$ is $m_2 = 1$.
Let $\theta$ be the acute angle between curves (1) and (2) at P. Then
$$\tan\theta = \left|\frac{m_1 - m_2}{1 + m_1 m_2}\right| = \left|\frac{3 - 1}{1 + 3(1)}\right| = \frac{2}{4} = \frac{1}{2}$$
$$\theta = \tan^{-1}\left(\frac{1}{2}\right)$$
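The angle formula used in Example 4 can be evaluated directly. The following is a minimal sketch; the function name `angle_between` and the special handling of perpendicular tangents are assumptions made for illustration.

```python
import math

def angle_between(m1, m2):
    """Acute angle, in degrees, between two lines with slopes m1 and m2."""
    if 1 + m1 * m2 == 0:
        return 90.0  # m1 * m2 = -1: the tangents are perpendicular
    return math.degrees(math.atan(abs((m1 - m2) / (1 + m1 * m2))))

# Slopes at P = (8, 16):
#   2y^2 = x^3  gives dy/dx = 3x^2 / (4y)
#   y^2 = 32x   gives dy/dx = 16 / y
x, y = 8.0, 16.0
m1 = 3 * x**2 / (4 * y)
m2 = 16 / y
theta = angle_between(m1, m2)
print(m1, m2, round(theta, 2))
```

The printed angle is the value of $\tan^{-1}(1/2)$ expressed in degrees, roughly 26.57.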


Example 5: Prove that the curves $y^2 = 16x$ and $2x^2 + y^2 = 4$ cut each other orthogonally.
Solution: Let $P(x_1, y_1)$ be a point of intersection of the curves
$y^2 = 16x$ ... (1) and
$2x^2 + y^2 = 4$ ... (2)
Then
$y_1^2 = 16x_1$ ... (3) and
$2x_1^2 + y_1^2 = 4$ ... (4)
Let $m_1$ and $m_2$ be the slopes of the tangents to the curves (1) and (2) at P respectively. Differentiating equation (1) with respect to x, we get
$$2y\frac{dy}{dx} = 16, \qquad \frac{dy}{dx} = \frac{8}{y}$$
$$m_1 = \left(\frac{dy}{dx}\right)_{P(x_1, y_1)} = \frac{8}{y_1}$$ ... (5)
Differentiating equation (2) with respect to x, we get
$$4x + 2y\frac{dy}{dx} = 0, \qquad \frac{dy}{dx} = -\frac{2x}{y}$$
$$m_2 = \left(\frac{dy}{dx}\right)_{P(x_1, y_1)} = -\frac{2x_1}{y_1}$$ ... (6)
From (5) and (6), we have
$$m_1 m_2 = -\frac{16x_1}{y_1^2} = -\frac{16x_1}{16x_1} = -1$$ [from equation (3)]
Thus the curves intersect orthogonally at P.

6.7 APPROXIMATION OF APPLICATIONS OF DERIVATIVES

The graph of f(x) can be approximated by studying only its first and second derivatives. We will now build on the definition of the derivative to understand how we can obtain accurate approximations of f(x) using its first derivative only. Before continuing, let me address a suspicion that may be growing in your mind. You are probably wondering why we care about using a function's derivatives to determine the behavior of the original function. If we want to analyze how f(x) behaves, we can go and graph it directly, without having to look at its derivative.
To answer this, recall that Calculus was defined as the study of mathematically defined change. In science, changing situations are defined in terms of several conditions or dimensions, where one or more dimensions change with respect to another dimension. We need Calculus to be able to define an instantaneous rate of change for the situation. In such a case we need to work backwards from the derivative that we defined to its anti-derivative.
The anti-derivative of the function will give us the exact relationship between the situation and the changing dimensions. In order to understand how to define an instantaneous rate of change and then calculate the net change, the relationship between the function and its derivative must be fully grasped. Therefore this section takes a look at the definition of the derivative to see how it defines f(x).
Let us begin with the definition of the derivative:

$$\frac{dy}{dx} = f'(x)$$

This gives us the exact change in f(x) over the infinitely small interval dx. Remember that the instantaneous rate of change was defined by taking the limit as a discrete change in x, $\Delta x$, goes to zero:

$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}$$

The derivative gives us the instantaneous rate of change of a function over an infinitely small interval dx. If we multiply both sides by dx, we get

$$df = f'(x)\,dx$$

If we replace df and dx with $\Delta f$ and $\Delta x$ in the above equation, then we get an equation that allows us to approximate the change in a function, $\Delta f$. Since the derivative is only defined over an infinitely small interval, we cannot use its value to give us the exact change in a function over a finite interval $\Delta x$. This is because the derivative of a function changes from one interval dx to the next and is constant only over a single such interval.
Therefore the following equation represents only an approximation to the net change in f over a discrete interval, as it assumes that the rate of change is constant:

$$\Delta f \approx f'(x)\,\Delta x$$

The approximation gets less and less accurate as $\Delta x$ increases.
For example, take $f(x) = x^2$ from $x = 4$ to $x = 6$, so that $\Delta x = 2$. The change in f(x) over this interval is, by definition,

$$\Delta f = f(6) - f(4) = 36 - 16 = 20$$
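To see how the approximation $\Delta f \approx f'(x)\,\Delta x$ behaves for $f(x) = x^2$ starting at $x = 4$, the short sketch below compares the exact change with the linear estimate. The extra step sizes 0.5 and 0.1 are assumptions added for comparison; only $\Delta x = 2$ comes from the text.

```python
def f(x):
    return x**2

def f_prime(x):
    return 2 * x  # derivative of x^2

x0 = 4.0
results = []
for dx in (2.0, 0.5, 0.1):
    exact = f(x0 + dx) - f(x0)   # true change in f over the interval
    approx = f_prime(x0) * dx    # linear estimate f'(x0) * dx
    results.append((dx, exact, approx, exact - approx))
    print(dx, exact, approx, exact - approx)
```

For $\Delta x = 2$ the exact change is 20 while the estimate is 16. The gap shrinks as the step gets smaller, in line with the remark that the approximation worsens as $\Delta x$ increases.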

6.8 MAXIMA AND MINIMA


This section deals with the maximum and minimum values of a function. Because the derivative provides information about the gradient or slope of the graph of a function, we can use it to locate points on a graph where the gradient is zero. We shall see that such points are often associated with the largest or smallest values of the function, at least in their immediate locality.

6.8.1 Stationary Points

When using mathematics to model the physical world in which we live, we frequently express physical quantities in terms of variables. Then, functions are used to describe the ways in which these variables change. A scientist or engineer will be interested in the ups and downs of a function, its maximum and minimum values, and its turning points. Drawing a graph of a function using a graphical calculator or computer graph plotting package will reveal this behavior, but if we want to know the precise location of such points we need to turn to algebra and differential calculus. In this section we look at how we can find maximum and minimum points in this way.
Consider the graph of the function y(x) shown in Figure 6.1. If, at the points marked A, B and C, we draw tangents to the graph, note that these are parallel to the x-axis. They are horizontal. This means that at each of the points A, B and C the gradient of the graph is zero.

Figure 6.1: The gradient of this graph is zero at each of the points A, B and C.

6.8.2 Turning Points


At points A and B the curve actually turns. These two stationary points are referred to as turning points. Point C is not a turning point because, although the graph is flat for a short time, the curve continues to go down as we look from left to right. So, all turning points are stationary points, but not all stationary points are turning points (e.g., point C). In other words, there are points for which $\frac{dy}{dx} = 0$ which are not turning points.

Point A in Figure 6.1 is called a local maximum because in its immediate area it is the highest point, and so represents the greatest or maximum value of the function. Point B in Figure 6.1 is called a local minimum because in its immediate area it is the lowest point, and so represents the least, or minimum, value of the function. Loosely speaking, we refer to a local maximum as simply a maximum. Similarly, a local minimum is often just called a minimum.

6.8.3 Distinguishing Maximum Points from Minimum Points

Think about what happens to the gradient of the graph as we travel through the minimum turning point, from left to right, that is as x increases.

Figure 6.2: $\frac{dy}{dx}$ goes from negative through zero to positive as x increases.

Notice that to the left of the minimum point, $\frac{dy}{dx}$ is negative because the tangent has negative gradient. At the minimum point, $\frac{dy}{dx} = 0$. To the right of the minimum point $\frac{dy}{dx}$ is positive, because here the tangent has a positive gradient. So, $\frac{dy}{dx}$ goes from negative, to zero, to positive as x increases. In other words, $\frac{dy}{dx}$ must be increasing as x increases. In fact, we can use this observation, once we have found a stationary point, to check if the point is a minimum. If $\frac{dy}{dx}$ is increasing near the stationary point then that point must be a minimum.

Now, if the derivative of $\frac{dy}{dx}$ is positive then we will know that $\frac{dy}{dx}$ is increasing; so we will know that the stationary point is a minimum. The derivative of $\frac{dy}{dx}$, called the second derivative, is written $\frac{d^2y}{dx^2}$. We conclude that if $\frac{d^2y}{dx^2}$ is positive at a stationary point, then that point must be a minimum turning point. It is important to realize that this test for a minimum is not conclusive. It is possible for a stationary point to be a minimum even if $\frac{d^2y}{dx^2}$ equals 0, although we cannot be certain: other types of behavior are possible. (However, we cannot have a minimum if $\frac{d^2y}{dx^2}$ is negative.) To see this, consider the example of the function $y = x^4$. A graph of this function is shown in Figure 6.3. There is clearly a minimum point when $x = 0$. But $\frac{dy}{dx} = 4x^3$, and this is clearly zero when $x = 0$. Differentiating again, $\frac{d^2y}{dx^2} = 12x^2$, which is also zero when $x = 0$.

Figure 6.3: The function $y = x^4$ has a minimum at the origin where $x = 0$, but $\frac{d^2y}{dx^2} = 0$ there and so is not greater than 0.

Now think about what happens to the gradient of the graph as we travel through the maximum turning point, from left to right, that is as x increases.

Figure 6.4: $\frac{dy}{dx}$ goes from positive through zero to negative as x increases.

Notice that to the left of the maximum point, $\frac{dy}{dx}$ is positive because the tangent has positive gradient. At the maximum point, $\frac{dy}{dx} = 0$. To the right of the maximum point $\frac{dy}{dx}$ is negative, because here the tangent has a negative gradient. So, $\frac{dy}{dx}$ goes from positive, to zero, to negative as x increases. In other words, $\frac{dy}{dx}$ must be decreasing as x increases. In fact, we can use this observation to check if a stationary point is a maximum. If $\frac{dy}{dx}$ is decreasing near a stationary point then that point must be a maximum. Now, if the derivative of $\frac{dy}{dx}$ is negative then we will know that $\frac{dy}{dx}$ is decreasing; so we will know that the stationary point is a maximum. As before, the derivative of $\frac{dy}{dx}$, the second derivative, is $\frac{d^2y}{dx^2}$. We conclude that if $\frac{d^2y}{dx^2}$ is negative at a stationary point, then that point must be a maximum turning point. It is important to realise that this test for a maximum is not conclusive. It is possible for a stationary point to be a maximum even if $\frac{d^2y}{dx^2} = 0$, although we cannot be certain: other types of behavior are possible. But we cannot have a maximum if $\frac{d^2y}{dx^2} > 0$, because, as we have already seen, the point would be a minimum.
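The second derivative test described above can be written as a small decision rule. The sketch below is illustrative only: the function name is hypothetical, and the first example, $y = x^3 - x$, is an assumed test case whose derivatives were worked out by hand in the comments.

```python
import math

def second_derivative_test(d2y):
    """Classify a stationary point from the value of d2y/dx2 at that point."""
    if d2y > 0:
        return "minimum"
    if d2y < 0:
        return "maximum"
    return "inconclusive"

# y = x^3 - x: dy/dx = 3x^2 - 1 = 0 at x = -1/sqrt(3) and x = 1/sqrt(3); d2y/dx2 = 6x
for x in (-1 / math.sqrt(3), 1 / math.sqrt(3)):
    print(round(x, 4), second_derivative_test(6 * x))

# y = x^4: dy/dx = 4x^3 = 0 at x = 0; d2y/dx2 = 12x^2 = 0 there
print(0.0, second_derivative_test(12 * 0.0**2))
```

The last case reports "inconclusive" even though $y = x^4$ does have a minimum at the origin, which is exactly the limitation the text points out.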

6.9 MULTIPLE CHOICE QUESTIONS


1. The non-zero real number k such that the graph of $f(x) = e^x$ is tangent to the curve $g(x) = kx^2$ is:
(a) $e$   (b) $e^2$   (c) $e^2/4$   (d) $e^4/16$

2. The real number a having the property that $f(a) = a$ is a local minimum of $f(x) = x^4 - x^3 - x^2 + ax + 1$ is:
(a) 1   (b) 2   (c) 3   (d) 4

3. A non-zero polynomial f(x) with real coefficients has the property that $f(x) = f'(x) \cdot f''(x)$. The leading coefficient, i.e. the coefficient of the highest power of x, is:
(a) 1   (b) 1/6   (c) 321/2   (d) 1/18

4. A point in the interior of the domain of a function f at which $f' = 0$ or $f'$ does not exist is a ........ of f.
(a) critical point   (b) exit point   (c) normal point   (d) angular point

5. ........ connects the average rate of change of a function over an interval with the instantaneous rate of change of the function at a point within the interval.
(a) The critical point theorem   (b) The normal point theorem   (c) The normal point theorem   (d) The Mean Value theorem

6. Find the function f(x) whose derivative is $\sin x$ and whose graph passes through the point (0, 2).
(a) $f(x) = -\cos x + 3$   (b) $f(x) = -\cos x + 2$   (c) $f(x) = -\cos x + 5$   (d) $f(x) = -\cos x + 1$

7. A point where the graph of a function has a tangent line and where the concavity changes is a
(a) point of inflection   (b) point of deviation   (c) point of interaction   (d) point of summation

8. The Second Derivative Test does not apply at $x = 0$ because
(a) $f''(0) = 1$   (b) $f''(0) = 0$   (c) $f''(0) = 2$   (d) $f''(0) = 6$

9. The only information we cannot get from the derivative is how to place the graph in the xy-plane.
(a) True   (b) False

10. How many critical points does the function $f(x) = (x - 2)^5 (x + 3)^4$ have?
(a) One   (b) Two   (c) Three   (d) Five

6.10 REVIEW QUESTIONS

1. What are increasing/decreasing functions?
2. Find the equation of the tangent and normal to the curve $y = 3x^2 - 4x - 2$ at the point $(1, -3)$.
3. Explain the first derivative test as motivated geometrically.
4. Describe the second derivative test as a provable tool.
5. Find the dimensions of the rectangle of largest area having fixed perimeter 100.
6. What do you understand by applications of derivatives?
7. Let $f(x) = x^2 + 4x - 3$. Find the maximum value of f(x) on the interval $[-1, 1]$.
8. Find the area of the largest rectangle that fits inside a semicircle of radius 10 (one side of the rectangle is along the diameter of the semicircle).
9. You are inflating a spherical balloon at the rate of 7 cm^3/sec. How fast is its radius increasing when the radius is 4 cm?
10. Find all local maxima and minima for $f(x) = x^3 - x$, and determine whether there is a global maximum or minimum on the open interval $(-2, 2)$.

ANSWERS TO MULTIPLE CHOICE QUESTIONS

(1) (c)   (2) (a)   (3) (d)   (4) (a)   (5) (d)   (6) (a)   (7) (a)   (8) (b)   (9) (a)   (10) (c)

Chapter 7

Vectors

Objectives
After studying this chapter, you will be able to:
Define the scalars and vectors
Discuss the coordinate systems
Discuss direction ratios and direction cosines
Explain the magnitude and direction of a vector

INTRODUCTION

Vectors are first introduced as geometric objects, namely as directed line segments, or arrows. The operations of addition, subtraction, and multiplication by a scalar (real number) are defined for these directed line segments. Two- and three-dimensional rectangular Cartesian coordinate systems are then introduced and used to give an algebraic representation for the directed line segments (or vectors). Two new operations on vectors, called the dot product and the cross product, are introduced. Some familiar theorems from Euclidean geometry are proved using vector methods.

7.1 SCALARS AND VECTORS


Some physical quantities such as length, area, volume and mass can be completely described by a single real number. Because these quantities are describable by giving only a magnitude, they are called scalars. The word scalar means representable by position on a line; having only magnitude. On the other hand, physical quantities such as displacement, velocity, force and acceleration require both a magnitude and a direction to completely describe them. Such quantities are called vectors.
If you say that a car is traveling at 90 km/hr, you are using a scalar quantity, namely the number 90 with no direction attached, to describe the speed of the car. On the other hand, if you say that the car is traveling due north at 90 km/hr, your description of the car's velocity is a vector quantity since it includes both magnitude and direction.
To distinguish between scalars and vectors we will denote scalars by lower case italic type such as a, b, c etc. and denote vectors by lower case boldface type such as u, v, w etc. In handwritten script, this way of distinguishing between vectors and scalars must be modified. It is customary to leave scalars as regular handwritten script and modify the symbols used to represent vectors either by underlining, such as $\underline{v}$, or by placing an arrow above the symbol, such as $\vec{u}$ or $\vec{v}$.

Problems
1. Determine whether a scalar quantity, a vector quantity or neither would be appropriate to describe each of the following situations.
a. The outside temperature is 15 °C.
b. A truck is traveling at 60 km/hr.
c. The water is flowing due north at 5 km/hr.
d. The wind is blowing from the south.
e. A vertically upwards force of 10 Newtons is applied to a rock.
f. The rock has a mass of 5 kilograms.
g. The box has a volume of 0.25 m^3.
h. A car is speeding eastward.
i. The rock has a density of 5 gm/cm^3.
j. A bulldozer moves the rock eastward 15 m.
k. The wind is blowing at 20 km/hr from the south.
l. A stone dropped into a pond is sinking at the rate of 30 cm/sec.

Key Vocabulary
Euclidean Vector: It is frequently represented by a line segment with a definite direction, or graphically as an arrow, connecting an initial point A with a terminal point B, and denoted by $\vec{AB}$.

7.1.1 Geometrical Representation of Vectors


Because vectors are determined by both a magnitude and a direction, they are represented geometrically in 2- or 3-dimensional space as directed line segments or arrows. The length of the arrow corresponds to the magnitude of the vector while the direction of the arrow corresponds to the direction of the vector. The tail of the arrow is called the initial point of the vector while the tip of the arrow is called the terminal point of the vector. If the vector v has the point P as its initial point and the point Q as its terminal point we will write $\mathbf{v} = \vec{PQ}$.

Equal Vectors
Two vectors u and v, which have the same length and same direction, are said to be equal vectors even though they have different initial points and different terminal points. If u and v are equal vectors we write u = v.

Sum of Two Vectors
The sum of two vectors u and v, written u + v, is the vector determined as follows. Place the vector v so that its initial point coincides with the terminal point of the vector u. The vector u + v is the vector whose initial point is the initial point of u and whose terminal point is the terminal point of v.

Key Vocabulary
Vector: It is a mathematical object that has a size, called the magnitude, and a direction.

Zero Vector
The zero vector, denoted 0, is the vector whose length is 0. Since a vector of length 0 does not have any direction associated with it, we shall agree that its direction is arbitrary; that is to say, it can be assigned any direction we choose. The zero vector satisfies the property v + 0 = 0 + v = v for every vector v.

Negative of a Vector
If u is a nonzero vector, we define the negative of u, denoted $-\mathbf{u}$, to be the vector whose magnitude (or length) is the same as the magnitude (or length) of the vector u, but whose direction is opposite to that of u.

If $\vec{AB}$ is used to denote the vector from point A to point B, then the vector from point B to point A is denoted by $\vec{BA}$, and $\vec{BA} = -\vec{AB}$.

Difference of Two Vectors

If u and v are any two vectors, we define the difference of u and v, denoted $\mathbf{u} - \mathbf{v}$, to be the vector $\mathbf{u} + (-\mathbf{v})$. To construct the vector $\mathbf{u} - \mathbf{v}$ we can either
construct the sum of the vector u and the vector $-\mathbf{v}$; or
position u and v so that their initial points coincide; then the vector from the terminal point of v to the terminal point of u is the vector $\mathbf{u} - \mathbf{v}$.

Key Vocabulary
Direction Cosines: The direction cosines of a vector are the cosines of the angles between the vector and the three coordinate axes.

7.1.2 Multiplying a Vector by a Scalar

If v is a nonzero vector and c is a nonzero scalar, we define the product of c and v, denoted cv, to be the vector whose length is $|c|$ times the length of v and whose direction is the same as that of v if c > 0 and opposite to that of v if c < 0. We define cv = 0 if c = 0 or if v = 0.

Parallel Vectors

The vectors v and cv are parallel to each other. Their directions coincide if c > 0 and are opposite to each other if c < 0. If u and v are parallel vectors, then there exists a scalar c such that u = cv. Conversely, if u = cv and $c \neq 0$, then u and v are parallel vectors.
Example:
Let O, A and B be 3 points in the plane. Let $\vec{OA} = \mathbf{a}$ and let $\vec{OB} = \mathbf{b}$. Find an expression for the vector $\vec{BA}$ in terms of the vectors a and b.

Solution:
$$\vec{BA} = \vec{BO} + \vec{OA} = -\vec{OB} + \vec{OA} = \vec{OA} - \vec{OB} = \mathbf{a} - \mathbf{b}$$

7.2 COORDINATE SYSTEMS


In order to further our study of vectors it will be necessary to consider vectors as algebraic entities by introducing a coordinate system for the vectors. A coordinate system is a frame of reference that is used as a standard for measuring distance and direction. If we are working with vectors in two-dimensional space we will use a two-dimensional rectangular Cartesian coordinate system. If we are working with vectors in three-dimensional space, the coordinate system that we use is a three-dimensional rectangular Cartesian coordinate system. To understand these two- and three-dimensional rectangular coordinate systems we first introduce a one-dimensional coordinate system, also known as a real number line.
Let R denote the set of all real numbers. Let l be a given line. We can set up a one-to-one relationship between the real numbers R and the points on l as follows. Select a point O, which will be called the origin, on the line l. To this point we associate the number 0. Select a unit of length and use it to mark off equidistantly placed points on either side of O. The points on one side of O, called the positive side, are assigned the numbers 1, 2, 3 etc., while the points on the other side of O, called the negative side, are assigned the numbers -1, -2, -3 etc. A one-to-one correspondence now exists between all the real numbers R and the points on l. The resulting line is called a real number line, or more simply a number line, and the number associated with any given point on the line is called its coordinate. We have just constructed a one-dimensional coordinate system.

Two-dimensional rectangular Cartesian coordinate system
The two-dimensional Cartesian coordinate system has as its frame of reference two number lines that intersect at right angles. The horizontal number line is called the x-axis and the vertical number line is the y-axis.
The point of intersection of the two axes is called the origin and is denoted by O. To each point P in two-dimensional space we associate an ordered pair of real numbers (x, y) called the coordinates of the point. The number x is called the x-coordinate of the point and the number y is the y-coordinate of the point. The x-coordinate x is the horizontal distance of the point P from the y-axis, while the y-coordinate y is the vertical distance of the point P from the x-axis. The set of all ordered pairs of real numbers is denoted $R^2$.

Key Vocabulary
Dot Product: It is an algebraic operation that takes two equal-length sequences of numbers and returns a single number.

Key Vocabulary
Coordinate System: It is a system which uses one or more numbers, or coordinates, to uniquely determine the position of a point or other geometric element on a manifold such as Euclidean space.

Three-dimensional rectangular Cartesian coordinate system
The three-dimensional Cartesian coordinate system has as its frame of reference three number lines that intersect at right angles at a point O called the origin. The number lines are called the x-axis, the y-axis and the z-axis.
To each point P in three-dimensional space we associate an ordered triple of real numbers (x, y, z) called the coordinates of the point. The number x is the distance of the point P from the yz-coordinate plane. The number y is the distance of the point P from the xz-coordinate plane. The number z is the distance of the point P from the xy-coordinate plane. The set of all ordered triples of real numbers is denoted by $R^3$. When the coordinate axes are labeled as shown in the following diagrams, the coordinate system is said to be a right-handed Cartesian coordinate system.

7.2.1 Right-handed Cartesian Coordinate System


A right-handed Cartesian coordinate system is one in which the coordinate axes are so labeled that if we curl the fingers of our right hand so as to point from the positive x-axis towards the positive y-axis, the thumb will point in the direction of the positive z-axis.
[If the thumb is pointing in the direction opposite to the direction of the positive z-axis, the coordinate system is a left-handed coordinate system.]

7.2.2 The Dot Product (Scalar Product)


The dot product is a method for multiplying two vectors. Because the product of the multiplication is a scalar, the dot product is sometimes referred to as the scalar product. The dot product will be used to find the angle between two vectors and will have applications in finding distances between points and lines, points and planes, etc.
If $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ are two vectors in $R^2$, we define their dot product, denoted $\mathbf{u} \cdot \mathbf{v}$, as follows: $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2$.
If $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ are two vectors in $R^3$, we define their dot product to be $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3$.
Example 1: Let u = (1, 2, 3) and v = (4, 5, 6).
Then $\mathbf{u} \cdot \mathbf{v} = (1)(4) + (2)(5) + (3)(6) = 4 + 10 + 18 = 32$.
The following theorem relates the length of a vector to the dot product of the vector with itself.
Theorem: For any vector u in $R^2$ or in $R^3$, $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$.

Proof: The following proof is for $R^2$. The proof for $R^3$ is similar.
Let $\mathbf{u} = (u_1, u_2)$. Then $\mathbf{u} \cdot \mathbf{u} = (u_1, u_2) \cdot (u_1, u_2) = u_1^2 + u_2^2 = \|\mathbf{u}\|^2$.
Taking square roots gives $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$.
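The dot product and the length formula from the theorem translate directly into a few lines of code. This is a minimal sketch using plain tuples; the function names `dot` and `norm` are assumptions chosen for illustration.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Length of u, computed as the square root of u . u."""
    return math.sqrt(dot(u, u))

print(dot((1, 2, 3), (4, 5, 6)))  # 32, as in Example 1
print(norm((3, 4)))               # 5.0
```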

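Projections
Let u and v be two given vectors with $\mathbf{v} \neq \mathbf{0}$. The projection of u along v, denoted $\mathrm{proj}_{\mathbf{v}}\mathbf{u}$, is the vector p found as follows. Drop a perpendicular from the terminal point of u that intersects the line through v at the point P. Then $\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \mathbf{p} = \vec{OP}$.

We find p as follows. Since p lies along v, there is a scalar k such that p = kv. Now $\mathbf{u} - \mathbf{p}$ is orthogonal to v, so $(\mathbf{u} - \mathbf{p}) \cdot \mathbf{v} = 0$. But

$$(\mathbf{u} - \mathbf{p}) \cdot \mathbf{v} = 0 \;\Rightarrow\; \mathbf{u} \cdot \mathbf{v} - k\,\mathbf{v} \cdot \mathbf{v} = 0 \;\Rightarrow\; k = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{v}\|^2}$$

Hence

$$\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \mathbf{p} = k\mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\,\mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{v}\|^2}\,\mathbf{v}$$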

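Example 2: Let u = (8, 1, 4) and let v = (1, 2, 2). Find $\mathrm{proj}_{\mathbf{v}}\mathbf{u}$.

Solution:
$$\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\,\mathbf{v} = \frac{(8, 1, 4) \cdot (1, 2, 2)}{(1, 2, 2) \cdot (1, 2, 2)}\,(1, 2, 2) = \frac{8 + 2 + 8}{1 + 4 + 4}\,(1, 2, 2) = 2(1, 2, 2) = (2, 4, 4)$$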
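The projection formula can be checked on Example 2 with a short sketch. Exact fractions are used so that no rounding occurs; this choice, the integer-component restriction it implies, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """Projection of u along v, for vectors with integer components."""
    k = Fraction(dot(u, v), dot(v, v))  # k = (u . v) / (v . v)
    return tuple(k * c for c in v)

u, v = (8, 1, 4), (1, 2, 2)
p = proj(u, v)
print([int(c) for c in p])

# u - p should be orthogonal to v
residual = tuple(a - b for a, b in zip(u, p))
print(dot(residual, v))
```

The second printed value is the dot product of $\mathbf{u} - \mathbf{p}$ with v, which should be zero by the orthogonality condition used in the derivation.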

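7.2.3 Distance between a Point and a Line in R2

To find the distance D between a point P and a line l in $R^2$, we select a point Q on the line l. Then the distance D is the length of the projection of $\vec{QP}$ on n, a normal vector to the line l.

$$D = \left\|\mathrm{proj}_{\mathbf{n}}\vec{QP}\right\| = \left\|\frac{\vec{QP} \cdot \mathbf{n}}{\mathbf{n} \cdot \mathbf{n}}\,\mathbf{n}\right\| = \frac{|\vec{QP} \cdot \mathbf{n}|}{\|\mathbf{n}\|^2}\,\|\mathbf{n}\| = \frac{|\vec{QP} \cdot \mathbf{n}|}{\|\mathbf{n}\|} = \frac{|\vec{PQ} \cdot \mathbf{n}|}{\|\mathbf{n}\|}$$

Note that $|\vec{QP} \cdot \mathbf{n}| = |\vec{PQ} \cdot \mathbf{n}|$, and so either of the last two forms for the distance D can be used interchangeably.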
Example 3: Find the distance between the point P = (9, 1) and the line 3x + 4y = 6.
Solution: The point Q = (2, 0) lies on the line 3x + 4y = 6, so $\vec{QP} = (9, 1) - (2, 0) = (7, 1)$.
Since n = (3, 4), the distance is
$$D = \frac{|\vec{QP} \cdot \mathbf{n}|}{\|\mathbf{n}\|} = \frac{|(7, 1) \cdot (3, 4)|}{\|(3, 4)\|} = \frac{21 + 4}{\sqrt{9 + 16}} = \frac{25}{5} = 5$$
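For a line written as $ax + by = c$, the normal vector is n = (a, b), and for any point Q on the line $\vec{QP} \cdot \mathbf{n} = ax_0 + by_0 - c$, so the choice of Q drops out. The sketch below uses that closed form; the function name and argument order are assumptions for illustration.

```python
import math

def distance_point_line(p, a, b, c):
    """Distance from the point p = (x0, y0) to the line a*x + b*y = c."""
    x0, y0 = p
    # |QP . n| / ||n|| with n = (a, b); QP . n reduces to a*x0 + b*y0 - c
    return abs(a * x0 + b * y0 - c) / math.hypot(a, b)

print(distance_point_line((9, 1), 3, 4, 6))  # 5.0, as in Example 3
```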

7.2.4 The Cross Product (Vector Product)

We were introduced to the dot product of two vectors. The result of taking the dot product of two vectors is a scalar quantity. We now introduce a second method of multiplying two vectors from $R^3$ that results in a vector quantity. The symbol used to denote this product is a cross, $\times$, hence the name cross product. Because the result is a vector, the term vector product is sometimes used for this product.
The cross product has a number of applications. We will use the cross product to find the areas of triangles and parallelograms. It will also be used to calculate the volume of a parallelepiped and later to find the distance between a point and a line in $R^3$.

Cross Product (Vector Product)

If $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ are two vectors in $R^3$, the cross product $\mathbf{u} \times \mathbf{v}$ is the vector in $R^3$ defined as follows.
$$\mathbf{u} \times \mathbf{v} = (u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1)$$
Example 4: Let u = (3, 1, 2) and let v = (4, 6, 5).
Then $\mathbf{u} \times \mathbf{v} = (1 \cdot 5 - 2 \cdot 6,\; 2 \cdot 4 - 3 \cdot 5,\; 3 \cdot 6 - 1 \cdot 4) = (-7, -7, 14)$.
Although the definition of the cross product as given above may be difficult to remember, the concept of a 2×2 determinant can be used to simplify the process.
Consider the 2×2 array of numbers $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The determinant of this array, written $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ or $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$, is defined to be the number $ad - bc$. Then the cross product of $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$, using determinants, can be written as the vector

$$\mathbf{u} \times \mathbf{v} = \left( \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix},\; -\begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix},\; \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \right)$$

We remember the components of $\mathbf{u} \times \mathbf{v}$ as follows. Form the 2×3 rectangular array

$$\begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix}$$

where the first row consists of the components of the vector u and the second row consists of the components of the vector v.
To find the first component of $\mathbf{u} \times \mathbf{v}$, delete the first column and take the determinant of the remaining 2×2 array; to find the second component of $\mathbf{u} \times \mathbf{v}$, delete the second column and take the negative of the determinant of the remaining 2×2 array; to find the third component of $\mathbf{u} \times \mathbf{v}$, delete the third column and take the determinant of the remaining 2×2 array.
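Example 5: Find $\mathbf{u} \times \mathbf{v}$ if u = (2, 3, 4) and v = (5, 6, 7).
Solution: Construct the rectangular array $\begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \end{pmatrix}$. Then
$$\mathbf{u} \times \mathbf{v} = \left( \begin{vmatrix} 3 & 4 \\ 6 & 7 \end{vmatrix},\; -\begin{vmatrix} 2 & 4 \\ 5 & 7 \end{vmatrix},\; \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} \right)$$
$= (3 \cdot 7 - 4 \cdot 6,\; -(2 \cdot 7 - 4 \cdot 5),\; 2 \cdot 6 - 3 \cdot 5)$
$= (21 - 24,\; -(14 - 20),\; 12 - 15)$
$= (-3, 6, -3)$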
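The component formula for the cross product can be coded directly and checked against Examples 4 and 5. The sketch below is illustrative; the function names are assumptions, and the orthogonality check relies on the standard fact that $\mathbf{u} \times \mathbf{v}$ is perpendicular to both u and v.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

print(cross((3, 1, 2), (4, 6, 5)))  # (-7, -7, 14), Example 4
print(cross((2, 3, 4), (5, 6, 7)))  # (-3, 6, -3), Example 5

# The result is orthogonal to both factors
w = cross((2, 3, 4), (5, 6, 7))
print(dot(w, (2, 3, 4)), dot(w, (5, 6, 7)))  # 0 0
```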

7.3 DIRECTION RATIOS AND DIRECTION COSINES

Direction ratios provide a convenient way of specifying the direction of a line in three-dimensional space. Direction cosines are the cosines of the angles between a line and the coordinate axes. In this section we show how these quantities are calculated.

The Direction Ratio and Direction Cosines

Consider the point P(4, 5) and its position vector $4\mathbf{i} + 5\mathbf{j}$ shown in Figure 1.

Figure 1.

The direction ratio of the vector $\vec{OP}$ is defined to be 4:5. We can interpret this as stating that to move in the direction of the line OP we must move 4 units in the x direction for every 5 units in the y direction. The direction cosines of the vector $\vec{OP}$ are the cosines of the angles between the vector and each of the axes. Specifically, referring to Figure 1, these are $\cos\alpha$ and $\cos\beta$. Noting that the length of $\vec{OP}$ is $|\vec{OP}| = \sqrt{4^2 + 5^2} = \sqrt{41}$, we can write

$$\cos\alpha = \frac{4}{\sqrt{41}}, \qquad \cos\beta = \frac{5}{\sqrt{41}}$$

It is conventional to label the direction cosines as l and m, so that

$$l = \frac{4}{\sqrt{41}}, \qquad m = \frac{5}{\sqrt{41}}$$
7.3.1 Direction Ratios and Cosines in Three Dimensions
The concepts of direction ratio and direction cosines extend naturally to three dimensions.
Consider Figure 2.
Given a vector r = ai + bj + ck, its direction ratios are a:b:c. This means that to move in the direction of the vector we must move a units in the x direction and b units in the y direction for every c units in the z direction.
The direction cosines are the cosines of the angles $\alpha$, $\beta$, $\gamma$ between the vector and each of the x, y and z axes. It is conventional to label direction cosines as l, m and n and they are given by
\[ l = \cos\alpha = \frac{a}{\sqrt{a^2 + b^2 + c^2}}, \quad m = \cos\beta = \frac{b}{\sqrt{a^2 + b^2 + c^2}}, \quad n = \cos\gamma = \frac{c}{\sqrt{a^2 + b^2 + c^2}}. \]

Figure 2.
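
As a hedged illustration of the formulas for l, m and n, the following Python sketch computes the direction cosines of a vector ai + bj + ck. The function name `direction_cosines` and the sample vector i + 2j + 2k are assumptions made for this example only.

```python
import math


def direction_cosines(a, b, c):
    """Return (l, m, n) for the vector a*i + b*j + c*k."""
    r = math.sqrt(a * a + b * b + c * c)  # length of the vector
    if r == 0:
        raise ValueError("the zero vector has no direction cosines")
    return (a / r, b / r, c / r)


# Hypothetical sample vector i + 2j + 2k, whose length is 3.
l, m, n = direction_cosines(1, 2, 2)
print(l, m, n)
print(l * l + m * m + n * n)  # the squares of the direction cosines sum to 1

# Angles with the x, y and z axes, in degrees.
print([math.degrees(math.acos(value)) for value in (l, m, n)])
```

Because each direction cosine is a component divided by the length, the sum of their squares is always 1, which gives a quick check on hand calculations.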

7.3.2 Components of a Vector
A displacement of one unit in the positive x direction is labelled i and a displacement of one unit in the positive y direction is labelled j. Because each has length one unit, they are called unit vectors.
So a displacement $\overrightarrow{PQ}$ made up of 3 units in the positive x direction and 4 units in the positive y direction can be written as
\[ \overrightarrow{PQ} = 3\mathbf{i} + 4\mathbf{j}. \]
3i and 4j are known as the components of the vector $\overrightarrow{PQ}$.

7.4 MAGNITUDE AND DIRECTION OF A VECTOR
Consider the vector $\overrightarrow{AB} = 2\mathbf{i} - 5\mathbf{j}$. The magnitude or modulus of the vector is represented geometrically by the length of the line AB.
Using Pythagoras' theorem,
\[ |\overrightarrow{AB}| = \sqrt{2^2 + (-5)^2} = \sqrt{4 + 25} = \sqrt{29} = 5.39 \text{ units.} \]
Its direction is defined by the angle AB makes with the positive x direction. This angle is $-\theta$, where
\[ \tan\theta = \frac{5}{2}, \qquad \theta = 68.2^\circ. \]
$\overrightarrow{AB}$ has magnitude 5.39 units and its direction makes an angle of $-68.2^\circ$ with the x-axis. The convention used to define direction in the example above is that angles are measured positive anticlockwise from Ox up to and including 180° and negative clockwise from Ox up to, but not including, −180°.
How many other ways can you find of uniquely defining the direction of a vector?
In general, for a vector r = ai + bj, its magnitude is given by
\[ r = |\mathbf{r}| = \sqrt{a^2 + b^2} \]
and its direction is given by $\theta$, where
\[ \cos\theta = \frac{a}{r}, \qquad \sin\theta = \frac{b}{r} \qquad\text{and}\qquad \tan\theta = \frac{b}{a}. \]
The angle can be found from one of these together with a sketch. The vector 0 = 0i + 0j has magnitude zero and is called the zero vector.
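
A short Python sketch of the magnitude and direction calculation follows. It is an illustration only: the function name is an assumption, and `math.atan2` is used in place of a sketch to pick the correct quadrant, which for ordinary inputs matches the convention of angles between −180° and 180°.

```python
import math


def magnitude_and_direction(a, b):
    """Magnitude and direction (in degrees) of the vector a*i + b*j."""
    r = math.hypot(a, b)                     # sqrt(a^2 + b^2)
    theta = math.degrees(math.atan2(b, a))   # signed angle from the positive x direction
    return r, theta


r, theta = magnitude_and_direction(2, -5)    # the vector 2i - 5j
print(round(r, 2), round(theta, 1))
```

Using both components in `atan2(b, a)` avoids the ambiguity of working from tan θ = b/a alone, where two opposite directions give the same ratio.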

7.4.1 Adding Vectors
Vectors have magnitude and direction. To complete the definition of a vector, it is necessary to know how to add two vectors.
When vectors are added, it is equivalent to one displacement followed by another.
When a = 3i + 2j is added to b = 2i − j, it is the same as displacement $\overrightarrow{XY}$ followed by displacement $\overrightarrow{YZ}$. From the diagrams opposite,
\[ \mathbf{R} = \mathbf{a} + \mathbf{b} = (3\mathbf{i} + 2\mathbf{j}) + (2\mathbf{i} - \mathbf{j}) = (3 + 2)\mathbf{i} + (2 - 1)\mathbf{j} = 5\mathbf{i} + \mathbf{j}. \]
The components are added independently of each other. R is called the resultant of a and b and this property of vectors is called the triangle law of addition.
In general, in component form, if p = di + ej and q = fi + gj then p + q = (d + f)i + (e + g)j.
Adding vectors can be considered in terms of a parallelogram law as well as a triangle law. In fact, the parallelogram law includes the triangle law
\[ \overrightarrow{XZ} = \overrightarrow{XY} + \overrightarrow{YZ} \quad\text{or}\quad \mathbf{R} = \mathbf{a} + \mathbf{b}. \]
From the lower triangle XWZ the result is obviously just as valid. This shows that the resultant vector R of a and b is either a followed by b or b followed by a.
A vector is any quantity possessing the properties of magnitude and direction, which obeys the triangle law of addition.
7.4.2 Multiplication by a Scalar
Suppose a displacement 2i + j is repeated three times. This is equivalent to adding three equal vectors:
(2i + j) + (2i + j) + (2i + j).
The result is 6i + 3j or 3(2i + j), and the scale factor of 3 scales each component separately.
The vector 2i + j is said to have been multiplied by the scalar 3.
In general, for two non-zero vectors a and b, if a = sb where s is a scalar, then a is parallel to b.
If s > 0, a and b are in the same direction, but if s < 0 then a and b are in opposite directions.
In general, in component form, if vector a is given as
a = xi + yj
and s is a scalar, then the vector sa is
sa = s(xi + yj) = sxi + syj.
7.4.3 Subtracting Vectors
You have seen so far that vectors can be added and multiplied by scalars. What then of subtracting vectors?
Take a = 3i + 2j and b = i + 4j.
Operating as with addition,
\[ \mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}) = 3\mathbf{i} + 2\mathbf{j} + (-\mathbf{i} - 4\mathbf{j}) = (3 - 1)\mathbf{i} + (2 - 4)\mathbf{j} = 2\mathbf{i} - 2\mathbf{j}. \]
Geometrically, −b is equal in magnitude but opposite in direction to b, so while a + b is shown in (i) on the right, a − b = a + (−b) is shown in (ii).
You can see that both addition and subtraction of vectors use the triangle law of addition.
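
The component rules for adding, scaling and subtracting vectors can be sketched in Python as follows. Vectors are represented here as plain tuples of components, and the helper names `add`, `scale` and `subtract` are assumptions made for illustration.

```python
def add(p, q):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(p, q))


def scale(s, p):
    """Multiply every component of p by the scalar s."""
    return tuple(s * x for x in p)


def subtract(p, q):
    """Compute p - q as p + (-1)q, mirroring a - b = a + (-b)."""
    return add(p, scale(-1, q))


print(add((3, 2), (2, -1)))      # (3i + 2j) + (2i - j)
print(scale(3, (2, 1)))          # 3(2i + j)
print(subtract((3, 2), (1, 4)))  # (3i + 2j) - (i + 4j)
```

The same three functions work unchanged for vectors with three components, since each operates on whatever components the tuples contain.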

7.4.4 Relative Position Vectors
The position vector of B relative to A is simply $\overrightarrow{AB}$. Then, using the triangle law of addition,
\[ \overrightarrow{AB} = \overrightarrow{AO} + \overrightarrow{OB} = -\mathbf{a} + \mathbf{b} = \mathbf{b} - \mathbf{a}, \]
where a and b are the position vectors of A and B, respectively, relative to the origin O.
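Example 1: The point P has position vector −i + j, the point Q, i + 6j and the point R, −j. Find the magnitudes and directions of the vectors $\overrightarrow{PQ}$ and $\overrightarrow{QR}$.

Solution:
\[ \overrightarrow{PQ} = \overrightarrow{PO} + \overrightarrow{OQ} = -\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OQ} - \overrightarrow{OP} = (\mathbf{i} + 6\mathbf{j}) - (-\mathbf{i} + \mathbf{j}) = 2\mathbf{i} + 5\mathbf{j}. \]
The magnitude of $\overrightarrow{PQ}$ is given by
\[ |\overrightarrow{PQ}| = \sqrt{2^2 + 5^2} = \sqrt{29} = 5.39. \]
The direction is given by
\[ \tan\theta = \frac{5}{2}, \]
so that $\theta = 68.2^\circ$.
Similarly,
\[ \overrightarrow{QR} = \overrightarrow{OR} - \overrightarrow{OQ} = -\mathbf{j} - (\mathbf{i} + 6\mathbf{j}) = -\mathbf{i} - 7\mathbf{j}. \]
The magnitude of $\overrightarrow{QR}$ is given by
\[ |\overrightarrow{QR}| = \sqrt{(-1)^2 + (-7)^2} = \sqrt{50} = 7.07. \]
The direction of $\overrightarrow{QR}$ is $-180^\circ + \theta$, where
\[ \tan\theta = \frac{7}{1}, \]
giving $\theta = 81.9^\circ$.
So the direction of $\overrightarrow{QR}$ is $-180^\circ + 81.9^\circ = -98.1^\circ$.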

Unit Vectors
You should be getting used to using i and j, which are unit vectors in the directions Ox and Oy. A unit vector is simply a vector having magnitude one, and can be in any direction. To find a unit vector in the direction of c = 3i − 4j, you multiply by a scalar, so that its direction is unchanged but its magnitude is altered to one.
The magnitude of c is
\[ |\mathbf{c}| = \sqrt{3^2 + (-4)^2} = 5, \]
which is five times as big as the magnitude of a unit vector. c must be multiplied by $\frac{1}{5}$ or divided by 5 to make a unit vector in its direction.
A unit vector in the direction of c is
\[ \frac{1}{5}(3\mathbf{i} - 4\mathbf{j}) = \frac{3}{5}\mathbf{i} - \frac{4}{5}\mathbf{j}. \]
In general, if a vector a has magnitude $|\mathbf{a}|$, then a unit vector in the direction of a is denoted $\hat{\mathbf{a}}$ and
\[ \hat{\mathbf{a}} = \frac{\mathbf{a}}{|\mathbf{a}|}. \]
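
The rule of dividing a vector by its own magnitude can be written as a small Python function. This is a minimal sketch; the name `unit_vector` is an assumption, and the zero vector is rejected because it has no direction to preserve.

```python
import math


def unit_vector(v):
    """Return the vector of magnitude one in the direction of v."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / length for x in v)


c_hat = unit_vector((3, -4))   # the vector c = 3i - 4j
print(c_hat)
print(math.hypot(*c_hat))      # magnitude of the result
```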

Equal Vectors
If the vectors ci + dj and ei + fj are equal, then
ci + dj = ei + fj
and it follows that
c = e and d = f.
These are the only possible conclusions if the vectors are equal.
Note that c = e comes from equating the i components of the equal vectors and d = f comes from equating the j components.
You will see in later chapters that the technique of equating components is very useful in the solution of problems.
Example 2: Vectors p and q are defined in terms of x and y as
p = 3i + (y − 2)j and q = 2xi − 7j.
If p = 2q, find the values of x and y.
Solution: Since p = 2q,
3i + (y − 2)j = 2(2xi − 7j),
giving 3i + (y − 2)j = 4xi − 14j.
Equating i components gives
3 = 4x
and so
$x = \frac{3}{4}$.
Equating j components gives
y − 2 = −14
and so
y = −12.

Vectors in Three Dimensions
The results obtained so far have all been applied to vectors in one or two dimensions. However, the power of vectors is that they can be applied in one, two or three dimensions. Although the applications of mechanics in this book will be restricted to one or two dimensions, by taking a vector approach, the extensions to three-dimensional applications will be easier.
In order to work in three dimensions, it is necessary to define a third axis Oz, so that Ox, Oy and Oz form a right-handed set as in the diagram opposite. An ordered trio of numbers such as (2, 3, 4) is necessary to define the coordinates of a point and a vector must have 3 components. For example, the position vector of the point (2, 3, 4) is 2i + 3j + 4k, where k is a unit vector in the direction Oz.
The properties of vectors considered so far are all defined in three dimensions, as the following examples show.
Example 3: If p = 3i − 2j + k and q = i + 3j − 2k, find p + q and 4q.
Solution:
\[ \mathbf{p} + \mathbf{q} = (3\mathbf{i} - 2\mathbf{j} + \mathbf{k}) + (\mathbf{i} + 3\mathbf{j} - 2\mathbf{k}) = (3 + 1)\mathbf{i} + (-2 + 3)\mathbf{j} + (1 - 2)\mathbf{k} = 4\mathbf{i} + \mathbf{j} - \mathbf{k}. \]
\[ 4\mathbf{q} = 4(\mathbf{i} + 3\mathbf{j} - 2\mathbf{k}) = 4\mathbf{i} + 12\mathbf{j} - 8\mathbf{k}. \]
Scalar Products
So far, vectors have been added, subtracted and multiplied by a scalar. Just as the addition of two vectors is a different operation from the addition of two real numbers, the product of two vectors has its own definition.
The scalar product of two vectors, a and b, is defined as $ab\cos\theta$, where $\theta$ is the angle between the two vectors, and a and b are the moduli (or magnitudes) of a and b. The scalar product is usually written as a.b, read as "a dot b", so
\[ \mathbf{a}.\mathbf{b} = ab\cos\theta. \]
If the vectors a and b are perpendicular, the scalar product a.b is 0 since cos 90° = 0.
So
a.b = 0 when a is perpendicular to b.
Also, if the vectors a and b are parallel, the scalar product a.b is given by ab, since cos 0° = 1.
So
a.b = ab when a is parallel to b.
The scalar product follows both the commutative and distributive laws.
\[ \mathbf{a}.\mathbf{b} = ab\cos\theta = ba\cos\theta = \mathbf{b}.\mathbf{a} \]
This shows that the scalar product is commutative, i.e. a.b = b.a.
The scalar product a.(b + c) can be found by considering the diagram opposite. Let $\theta$ be the angle between a and b + c, $\alpha$ the angle between a and b, and $\beta$ the angle between a and c.
Since OQ = OP + PQ,
\[ |\mathbf{b} + \mathbf{c}|\cos\theta = b\cos\alpha + c\cos\beta. \]
Multiplying by a (= |a|) gives
\[ |\mathbf{a}||\mathbf{b} + \mathbf{c}|\cos\theta = ab\cos\alpha + ac\cos\beta. \]
So
a.(b + c) = a.b + a.c
This result shows that the scalar product is distributive over addition.

7.5 MULTIPLE CHOICE QUESTIONS

1. A zero vector has
(a) any direction
(b) many directions
(c) no direction
(d) None of these

2. A 200-lb force is pulling on an object, as shown below. The signs of the x and y components of the force are
(a) x (positive), y (positive)
(b) x (positive), y (negative)
(c) x (negative), y (positive)
(d) x (negative), y (negative)

3. Three boys each pull with a 20-N force on the same object as shown below. The resultant force will be
(a) zero
(b) 20 N to the left
(c) 20 N up
(d) 20 N down

4. A vector of magnitude 10 has an angle with the positive x axis (East) of 120 degrees. What are its components?
(a) 5 and 8.7
(b) 5 and 8.7
(c) 5 and 8.7
(d) 5 and 8.7

5. A vector has components x = 6 m and y = 8 m. What are its magnitude and direction?
(a) 10 m and 30 degrees
(b) 14 m and 37 degrees
(c) 10 m and 53 degrees
(d) 14 m and 53 degrees

6. A vector has components x = 2 m and y = 2 m. What is its direction (angle with respect to East)?
(a) 45 degrees
(b) 135 degrees
(c) 225 degrees
(d) 45 degrees

7. Three vectors have components (x, y): A = (2, 3), B = (4, 7), and C = (6, 3). What is the magnitude of the resultant vector (D = A + B + C)?
(a) 7
(b) 17.7
(c) 15
(d) 7

8. Three vectors have components (x, y): A = (2, 3), B = (4, 7), and C = (6, 3). What is the magnitude of the resultant vector (D = A − B − C)?
(a) 13.6
(b) 15
(c) 52
(d) 7

9. The system of vectors i, j, k is
(a) Orthogonal
(b) Collinear
(c) Coplanar
(d) None of these

10. The points having position vectors 2i + 3j + 4k, 3i + 4j + 2k, 4i + 2j + 3k are the vertices of
(a) Right angled triangle
(b) Equilateral triangle
(c) Isosceles triangle
(d) Collinear
7.6 REVIEW QUESTIONS

1. Find the direction ratios, the direction cosines and the angles that the vector $\overrightarrow{OP}$ makes with each of the axes when P is the point with coordinates (2, 4, 3).
2. A line is inclined at 60° to the x axis and 45° to the y axis. Find its inclination to the z axis.
3. P and Q have coordinates (2, 4) and (7, 8) respectively.
(a) Find the direction ratio of the vector $\overrightarrow{PQ}$.
(b) Find the direction cosines of $\overrightarrow{PQ}$.
4. Points A and B have position vectors a = 3i + 2j + 7k and b = 3i + 4j − 5k respectively. Find
(a) $\overrightarrow{AB}$
(b) $|\overrightarrow{AB}|$
(c) the direction ratios of $\overrightarrow{AB}$
(d) the direction cosines (l, m, n) of $\overrightarrow{AB}$
(e) Show that $l^2 + m^2 + n^2 = 1$.
5. Write in the form ai + bj the vectors:
(a) $\overrightarrow{OA}$ (b) $\overrightarrow{OB}$ (c) $\overrightarrow{AB}$
(d) $\overrightarrow{BA}$ (e) $\overrightarrow{BC}$ (f) $\overrightarrow{CD}$
(g) $\overrightarrow{BD}$ (h) $\overrightarrow{DA}$ (i) $\overrightarrow{DA}$
(j) $\overrightarrow{EA}$ (k) $\frac{1}{2}\overrightarrow{EC}$ (l) $5\,\overrightarrow{CA}$
6. Find the magnitudes and directions of:
a = 3i − 9j, b = 2i − j, c = 3i + 3j, d = 5i + 4j
7. What do you understand by geometrical representation of vectors?
8. Explain the cross product with an example.
9. What is the difference between equal vectors and unit vectors?
10. Define the multiplication of a vector by a scalar.

ANSWERS FOR MULTIPLE CHOICE QUESTIONS

1. (a)  2. (c)  3. (b)  4. (b)  5. (c)  6. (c)  7. (d)  8. (a)  9. (a)  10. (b)

Chapter 8

Three-Dimensional Geometry

Objectives
After studying this chapter, you will be able to:
Understand three-dimensional space
Understand the dot product
Explain the Cauchy-Schwarz inequality
Explain the triangle inequality for distance
Understand Joachimsthal's ratio formulae

INTRODUCTION
The aim of the vector algebra approach to three-dimensional geometry is to present standard properties of lines and planes, with minimum use of complicated three-dimensional diagrams such as those involving similar triangles.
Points are defined as ordered triples of real numbers and the distance between points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ is defined by the formula
\[ P_1P_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]
Directed line segments $\overrightarrow{AB}$ are introduced as three-dimensional column vectors: if $A = (x_1, y_1, z_1)$ and $B = (x_2, y_2, z_2)$, then
\[ \overrightarrow{AB} = \begin{pmatrix} x_2 - x_1 \\ y_2 - y_1 \\ z_2 - z_1 \end{pmatrix}. \]
If P is a point, we let $\mathbf{P} = \overrightarrow{OP}$ and call $\mathbf{P}$ the position vector of P.
With suitable definitions of lines and parallel lines, there are important geometrical interpretations of equality, addition and scalar multiplication of vectors.
Equality of Vectors: Suppose A, B, C, D are distinct points such that no three are collinear. Then $\overrightarrow{AB} = \overrightarrow{CD}$ if and only if $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$. (See Figure 8.1.)

Key Vocabulary
Cauchy-Schwarz Inequality: It states that by taking limits one can obtain an integral form of Cauchy's inequality.

Figure 8.1: Equality and addition of vectors.

Addition of vectors obeys the parallelogram law: Let A, B, C be non-collinear. Then
\[ \overrightarrow{AB} + \overrightarrow{AC} = \overrightarrow{AD}, \]
where D is the point such that $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$. (See Figure 8.1.)
Scalar multiplication of vectors: Let $\overrightarrow{AP} = t\,\overrightarrow{AB}$, where A and B are distinct points. Then P is on the line AB,
\[ \frac{AP}{AB} = |t| \]
and
P = A if t = 0, P = B if t = 1;
P is between A and B if 0 < t < 1;
B is between A and P if 1 < t;
A is between P and B if t < 0.
(See Figure 8.2.)

Figure 8.2: Scalar multiplication of vectors.


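The dot product $X \cdot Y$ of vectors
\[ X = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \end{pmatrix} \quad\text{and}\quad Y = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \end{pmatrix} \]
is defined by
\[ X \cdot Y = a_1a_2 + b_1b_2 + c_1c_2. \]
The length $\lVert X \rVert$ of a vector X is defined by
\[ \lVert X \rVert = (X \cdot X)^{1/2} \]
and the Cauchy-Schwarz inequality holds:
\[ |X \cdot Y| \le \lVert X \rVert\, \lVert Y \rVert. \]
The triangle inequality for vector length now follows as a simple deduction:
\[ \lVert X + Y \rVert \le \lVert X \rVert + \lVert Y \rVert. \]
Using the equation
\[ AB = \lVert \overrightarrow{AB} \rVert, \]
we deduce the corresponding familiar triangle inequality for distance:
\[ AB \le AC + CB. \]
The angle $\theta$ between two non-zero vectors X and Y is then defined by
\[ \cos\theta = \frac{X \cdot Y}{\lVert X \rVert\, \lVert Y \rVert}, \qquad 0 \le \theta \le \pi. \]
This definition makes sense, for by the Cauchy-Schwarz inequality,
\[ -1 \le \frac{X \cdot Y}{\lVert X \rVert\, \lVert Y \rVert} \le 1. \]
Vectors X and Y are said to be perpendicular or orthogonal if $X \cdot Y = 0$.
Vectors of unit length are called unit vectors. The vectors
\[ \mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \]
are unit vectors and every vector is a linear combination of i, j and k:
\[ \begin{pmatrix} a \\ b \\ c \end{pmatrix} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}. \]
Non-zero vectors X and Y are parallel or proportional if the angle between X and Y equals 0 or $\pi$; equivalently if X = tY for some real number t. Vectors X and Y are then said to have the same or opposite direction, according as t > 0 or t < 0.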

Key Vocabulary
Equality of Vectors: If A, B, C, D are distinct points such that no three are collinear, then $\overrightarrow{AB} = \overrightarrow{CD}$ if and only if $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$.

We are then led to study straight lines. If A and B are distinct points, it is easy to show that AP + PB = AB holds if and only if
\[ \overrightarrow{AP} = t\,\overrightarrow{AB}, \quad\text{where } 0 \le t \le 1. \]
A line is defined as a set consisting of all points P satisfying
\[ \mathbf{P} = \mathbf{P}_0 + tX, \; t \in \mathbb{R}, \quad\text{or equivalently}\quad \overrightarrow{P_0P} = tX, \]
for some fixed point $P_0$ and fixed non-zero vector X called a direction vector for the line.
Equivalently, in terms of coordinates,
\[ x = x_0 + ta, \quad y = y_0 + tb, \quad z = z_0 + tc, \]
where not all of a, b, c are zero.

Key Vocabulary
Non-Zero Vectors: Non-zero vectors X and Y are parallel or proportional if the angle between X and Y equals 0 or $\pi$.

There is then one and only one line passing through two distinct points A and B. It consists of the points P satisfying
\[ \overrightarrow{AP} = t\,\overrightarrow{AB}, \]
where t is a real number.
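The cross-product $X \times Y$ provides us with a vector which is perpendicular to both X and Y. It is defined in terms of the components of X and Y:
Let $X = a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $Y = a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$.
Then
\[ X \times Y = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}, \]
where
\[ a = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}, \quad b = -\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}, \quad c = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}. \]
The cross-product enables us to derive elegant formulae for the distance from a point to a line, the area of a triangle and the distance between two skew lines.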
Finally we turn to the geometrical concept of a plane in three-dimensional space.
A plane is a set of points P satisfying an equation of the form
\[ \mathbf{P} = \mathbf{P}_0 + sX + tY, \quad s, t \in \mathbb{R}, \qquad (1) \]
where X and Y are non-zero, non-parallel vectors. In terms of coordinates, equation 1 takes the form
\[ x = x_0 + sa_1 + ta_2, \quad y = y_0 + sb_1 + tb_2, \quad z = z_0 + sc_1 + tc_2, \]
where $P_0 = (x_0, y_0, z_0)$.
There is then one and only one plane passing through three non-collinear points A, B, C. It consists of the points P satisfying
\[ \overrightarrow{AP} = s\,\overrightarrow{AB} + t\,\overrightarrow{AC}, \]
where s and t are real numbers.
The cross-product enables us to derive a concise equation for the plane through three non-collinear points A, B, C, namely
\[ \overrightarrow{AP} \cdot (\overrightarrow{AB} \times \overrightarrow{AC}) = 0. \]
When expanded, this equation has the form
\[ ax + by + cz = d, \]
where ai + bj + ck is a non-zero vector which is perpendicular to $\overrightarrow{P_1P_2}$ for all points $P_1$, $P_2$ lying in the plane. Any vector with this property is said to be a normal to the plane.
It is then easy to prove that two planes with non-parallel normal vectors must intersect in a line. We conclude the chapter by deriving a formula for the distance from a point to a plane.

8.1 THREE-DIMENSIONAL SPACE
Three-dimensional space is the set $E^3$ of ordered triples (x, y, z), where x, y, z are real numbers. The triple (x, y, z) is called a point P in $E^3$ and we write P = (x, y, z). The numbers x, y, z are called, respectively, the x, y, z coordinates of P.
The coordinate axes are the sets of points:
{(x, 0, 0)} (x-axis), {(0, y, 0)} (y-axis), {(0, 0, z)} (z-axis).
The only point common to all three axes is the origin O = (0, 0, 0). The coordinate planes are the sets of points:
{(x, y, 0)} (xy-plane), {(0, y, z)} (yz-plane), {(x, 0, z)} (xz-plane).
The positive octant consists of the points (x, y, z), where x > 0, y > 0, z > 0.
We think of the points (x, y, z) with z > 0 as lying above the xy-plane, and those with z < 0 as lying beneath the xy-plane. A point P = (x, y, z) will be represented as in Figure 8.3. The point illustrated lies in the positive octant.
The distance $P_1P_2$ between points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ is defined by the formula
\[ P_1P_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]
For example, if P = (x, y, z),
\[ OP = \sqrt{x^2 + y^2 + z^2}. \]

Key Vocabulary
Points: Points are defined as ordered triples of real numbers, and the distance between points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ is given by the formula above.

Key Vocabulary
Scalar Multiplication of Vectors: Let $\overrightarrow{AP} = t\,\overrightarrow{AB}$, where A and B are distinct points. Then P is on the line AB. This is scalar multiplication of a vector.

Figure 8.3: Representation of three-dimensional space.

Figure 8.4: The vector $\overrightarrow{AB}$.

If $A = (x_1, y_1, z_1)$ and $B = (x_2, y_2, z_2)$ we define the symbol $\overrightarrow{AB}$ to be the column vector
\[ \overrightarrow{AB} = \begin{pmatrix} x_2 - x_1 \\ y_2 - y_1 \\ z_2 - z_1 \end{pmatrix}. \]
Let $\mathbf{P} = \overrightarrow{OP}$ and call $\mathbf{P}$ the position vector of P.
The components of $\overrightarrow{AB}$ are the coordinates of B when the axes are translated to A as origin of coordinates. We think of $\overrightarrow{AB}$ as being represented by the directed line segment from A to B, that is, as an arrow whose tail is at A and whose head is at B. (See Figure 8.4.)
Some mathematicians think of $\overrightarrow{AB}$ as representing the translation of space which takes A into B. The following simple properties of $\overrightarrow{AB}$ are easily verified and correspond to how we intuitively think of directed line segments:
$\overrightarrow{AB} = 0 \Leftrightarrow A = B$;
$\overrightarrow{BA} = -\overrightarrow{AB}$;
$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$ (the triangle law);
$\overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB} = \mathbf{C} - \mathbf{B}$;
if X is a vector and A a point, there is exactly one point B such that $\overrightarrow{AB} = X$, namely that defined by $\mathbf{B} = \mathbf{A} + X$.
To derive properties of the distance function and the vector function $\overrightarrow{P_1P_2}$, we need to introduce the dot product of two vectors in $\mathbb{R}^3$.

8.2 DOT PRODUCT
If
\[ X = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \end{pmatrix} \quad\text{and}\quad Y = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \end{pmatrix}, \]
then $X \cdot Y$, the dot product of X and Y, is defined by
\[ X \cdot Y = a_1a_2 + b_1b_2 + c_1c_2. \]

Figure 8.5: The negative of a vector.

Figure 8.6: (a) Equality of vectors; (b) Addition and subtraction of vectors.

The dot product has the following properties:
$X \cdot (Y + Z) = X \cdot Y + X \cdot Z$;
$X \cdot Y = Y \cdot X$;
$(tX) \cdot Y = t(X \cdot Y)$;
$X \cdot X = a^2 + b^2 + c^2$ if $X = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$;
$X \cdot Y = X^tY$;
$X \cdot X = 0$ if and only if X = 0.
The length of X is defined by
\[ \lVert X \rVert = \sqrt{a^2 + b^2 + c^2} = (X \cdot X)^{1/2}. \]
We see that $\lVert \mathbf{P} \rVert = OP$ and more generally $\lVert \overrightarrow{P_1P_2} \rVert = P_1P_2$, the distance between $P_1$ and $P_2$.

Figure 8.7: Position vector as a linear combination of i, j and k.

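
Vectors having unit length are called unit vectors.
The vectors
\[ \mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \]
are unit vectors. Every vector is a linear combination of i, j and k:
\[ \begin{pmatrix} a \\ b \\ c \end{pmatrix} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}. \]
(See Figure 8.7.)
It is easy to prove that
\[ \lVert tX \rVert = |t|\, \lVert X \rVert, \]
if t is a real number. Hence if X is a non-zero vector, the vectors
\[ \pm\frac{1}{\lVert X \rVert} X \]
are unit vectors.
A useful property of the length of a vector is
\[ \lVert X \pm Y \rVert^2 = \lVert X \rVert^2 \pm 2X \cdot Y + \lVert Y \rVert^2. \qquad (2) \]
The following important property of the dot product is widely used in mathematics:
Theorem 1: (The Cauchy-Schwarz inequality) If X and Y are vectors in $\mathbb{R}^3$, then
\[ |X \cdot Y| \le \lVert X \rVert\, \lVert Y \rVert. \qquad (3) \]
Moreover if $X \ne 0$ and $Y \ne 0$, then
\[ X \cdot Y = \lVert X \rVert\, \lVert Y \rVert \Leftrightarrow Y = tX,\; t > 0, \]
\[ X \cdot Y = -\lVert X \rVert\, \lVert Y \rVert \Leftrightarrow Y = tX,\; t < 0. \]
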
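Proof: If X = 0, then inequality 3 is trivially true. So assume $X \ne 0$.
Now if t is any real number, by equation 2,
\[ 0 \le \lVert tX - Y \rVert^2 = \lVert tX \rVert^2 - 2(tX) \cdot Y + \lVert Y \rVert^2 = t^2\lVert X \rVert^2 - 2(X \cdot Y)t + \lVert Y \rVert^2 = at^2 - 2bt + c, \]
where $a = \lVert X \rVert^2 > 0$, $b = X \cdot Y$, $c = \lVert Y \rVert^2$.
Hence
\[ a\left(t^2 - \frac{2b}{a}t + \frac{c}{a}\right) \ge 0, \]
\[ \left(t - \frac{b}{a}\right)^2 + \frac{ca - b^2}{a^2} \ge 0. \]
Substituting t = b/a in the last inequality then gives
\[ \frac{ac - b^2}{a^2} \ge 0, \]
so
\[ |b| \le \sqrt{ac} = \sqrt{a}\sqrt{c} \]
and hence inequality 3 follows.
To discuss equality in the Cauchy-Schwarz inequality, assume $X \ne 0$ and $Y \ne 0$.
Then if $X \cdot Y = \lVert X \rVert\, \lVert Y \rVert$, we have for all t
\[ \lVert tX - Y \rVert^2 = t^2\lVert X \rVert^2 - 2tX \cdot Y + \lVert Y \rVert^2 = t^2\lVert X \rVert^2 - 2t\lVert X \rVert\, \lVert Y \rVert + \lVert Y \rVert^2 = (t\lVert X \rVert - \lVert Y \rVert)^2. \]
Taking $t = \lVert Y \rVert / \lVert X \rVert$ then gives $\lVert tX - Y \rVert^2 = 0$ and hence tX − Y = 0. Hence Y = tX, where t > 0. The case $X \cdot Y = -\lVert X \rVert\, \lVert Y \rVert$ is proved similarly.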

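Corollary: (The triangle inequality for vectors)
If X and Y are vectors, then
\[ \lVert X + Y \rVert \le \lVert X \rVert + \lVert Y \rVert. \qquad (4) \]
Moreover if $X \ne 0$ and $Y \ne 0$, then equality occurs in inequality 4 if and only if Y = tX, where t > 0.
Proof:
\[ \lVert X + Y \rVert^2 = \lVert X \rVert^2 + 2X \cdot Y + \lVert Y \rVert^2 \le \lVert X \rVert^2 + 2\lVert X \rVert\, \lVert Y \rVert + \lVert Y \rVert^2 = (\lVert X \rVert + \lVert Y \rVert)^2 \]
and inequality 4 follows.
If $\lVert X + Y \rVert = \lVert X \rVert + \lVert Y \rVert$, then the above proof shows that
\[ X \cdot Y = \lVert X \rVert\, \lVert Y \rVert. \]
Hence if $X \ne 0$ and $Y \ne 0$, the first case of equality in the Cauchy-Schwarz inequality shows that Y = tX with t > 0.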
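
The Cauchy-Schwarz and triangle inequalities can be spot-checked numerically. The Python sketch below is only an illustration, not a proof; the helper names and the sample vectors X = (1, 2, −2) and Y = (3, 0, 4) are assumptions chosen for the example.

```python
import math


def dot(x, y):
    """Dot product of two vectors given as tuples of components."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length of a vector, defined as the square root of x . x."""
    return math.sqrt(dot(x, x))


X = (1, 2, -2)   # hypothetical sample vectors
Y = (3, 0, 4)
X_plus_Y = tuple(a + b for a, b in zip(X, Y))

print(abs(dot(X, Y)) <= norm(X) * norm(Y))    # Cauchy-Schwarz inequality
print(norm(X_plus_Y) <= norm(X) + norm(Y))    # triangle inequality

# Equality case: Y = tX with t > 0 turns Cauchy-Schwarz into an equality.
Z = tuple(2 * a for a in X)
print(math.isclose(dot(X, Z), norm(X) * norm(Z)))
```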
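The triangle inequality for vectors gives rise to a corresponding inequality for the distance function:
Theorem 2: (The triangle inequality for distance)
If A, B, C are points, then
\[ AC \le AB + BC. \qquad (5) \]
Moreover if $B \ne A$ and $B \ne C$, then equality occurs in inequality 5 if and only if $\overrightarrow{AB} = r\,\overrightarrow{AC}$, where 0 < r < 1.
Proof:
\[ AC = \lVert \overrightarrow{AC} \rVert = \lVert \overrightarrow{AB} + \overrightarrow{BC} \rVert \le \lVert \overrightarrow{AB} \rVert + \lVert \overrightarrow{BC} \rVert = AB + BC. \]
Moreover if equality occurs in inequality 5 and $B \ne A$ and $B \ne C$, then $X = \overrightarrow{AB} \ne 0$ and $Y = \overrightarrow{BC} \ne 0$, and the equation AC = AB + BC becomes $\lVert X + Y \rVert = \lVert X \rVert + \lVert Y \rVert$. Hence the case of equality in the vector triangle inequality gives
\[ Y = \overrightarrow{BC} = tX = t\,\overrightarrow{AB}, \quad\text{where } t > 0. \]
Then
\[ \overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB} = t\,\overrightarrow{AB}, \]
\[ \overrightarrow{AC} = (1 + t)\,\overrightarrow{AB}, \]
\[ \overrightarrow{AB} = r\,\overrightarrow{AC}, \]
where r = 1/(t + 1) satisfies 0 < r < 1.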

8.3 LINES
A line in $E^3$ is the set $L(P_0, X)$ consisting of all points P satisfying
\[ \mathbf{P} = \mathbf{P}_0 + tX, \; t \in \mathbb{R}, \quad\text{or equivalently}\quad \overrightarrow{P_0P} = tX, \qquad (6) \]
for some fixed point $P_0$ and fixed non-zero vector X. (See Figure 8.8.) Equivalently, in terms of coordinates, equation 6 becomes
\[ x = x_0 + ta, \quad y = y_0 + tb, \quad z = z_0 + tc, \]
where not all of a, b, c are zero. The following familiar property of straight lines is easily verified.
Theorem 1: If A and B are distinct points, there is one and only one line containing A and B, namely $L(A, \overrightarrow{AB})$ or more explicitly the line defined by $\overrightarrow{AP} = t\,\overrightarrow{AB}$, or equivalently, in terms of position vectors:
\[ \mathbf{P} = (1 - t)\mathbf{A} + t\mathbf{B} \quad\text{or}\quad \mathbf{P} = \mathbf{A} + t\,\overrightarrow{AB}. \qquad (7) \]
Equations 7 may be expressed in terms of coordinates: if $A = (x_1, y_1, z_1)$ and $B = (x_2, y_2, z_2)$, then
\[ x = (1 - t)x_1 + tx_2, \quad y = (1 - t)y_1 + ty_2, \quad z = (1 - t)z_1 + tz_2. \]

Figure 8.8: Representation of a line.

Figure 8.9: The line segment AB.

There is an important geometric significance in the number t of the above equation of the line through A and B. The proof is left as an exercise:
Theorem 2: (Joachimsthal's ratio formulae)
If t is the parameter occurring in theorem 1, then
\[ |t| = \frac{AP}{AB}; \qquad \left|\frac{t}{1 - t}\right| = \frac{AP}{PB} \quad\text{if } P \ne B. \]
Also
P is between A and B if 0 < t < 1;
B is between A and P if 1 < t;
A is between P and B if t < 0.
(See Figure 8.9.)
For example, $t = \frac{1}{2}$ gives the midpoint P of the segment AB:
\[ \mathbf{P} = \frac{1}{2}(\mathbf{A} + \mathbf{B}). \]
Example 1: L is the line AB, where A = (−4, 3, 1), B = (1, 1, 0); M is the line CD, where C = (2, 0, 2), D = (1, 3, 2); N is the line EF, where E = (1, 4, 7), F = (−4, −3, −13). Find which pairs of lines intersect and also the points of intersection.
Solution: In fact only L and N intersect, in the point $\left(-\frac{2}{3}, \frac{5}{3}, \frac{1}{3}\right)$. For example, to determine if L and N meet, we start with vector equations for L and N:
\[ \mathbf{P} = \mathbf{A} + t\,\overrightarrow{AB}, \quad \mathbf{Q} = \mathbf{E} + s\,\overrightarrow{EF}, \]
equate P and Q and solve for s and t:
\[ (-4\mathbf{i} + 3\mathbf{j} + \mathbf{k}) + t(5\mathbf{i} - 2\mathbf{j} - \mathbf{k}) = (\mathbf{i} + 4\mathbf{j} + 7\mathbf{k}) + s(-5\mathbf{i} - 7\mathbf{j} - 20\mathbf{k}), \]
which on simplifying, gives
\[ 5t + 5s = 5 \]
\[ -2t + 7s = 1 \]
\[ -t + 20s = 6 \]
This system has the unique solution $t = \frac{2}{3}$, $s = \frac{1}{3}$ and this determines a corresponding point P where the lines meet, namely $P = \left(-\frac{2}{3}, \frac{5}{3}, \frac{1}{3}\right)$.
The same method yields inconsistent systems when applied to the other pairs of lines.
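
The method of equating P and Q and solving for s and t can be sketched in Python with exact fractions. This is an illustrative sketch under stated assumptions: the function name `intersect_lines` is invented here, two of the three coordinate equations are solved by Cramer's rule, and the third is used as a consistency check. The lines used are L and N from Example 1.

```python
from fractions import Fraction


def intersect_lines(a, ab, e, ef):
    """Intersection of P = a + t*ab and Q = e + s*ef, or None if they do not meet.

    Solves t*ab - s*ef = e - a. Lines with parallel direction vectors are
    not handled and also return None.
    """
    rhs = [Fraction(e[i] - a[i]) for i in range(3)]
    # Look for a pair of coordinate equations with a non-zero determinant.
    for i, j in ((0, 1), (0, 2), (1, 2)):
        det = ab[i] * (-ef[j]) - (-ef[i]) * ab[j]
        if det != 0:
            t = (rhs[i] * (-ef[j]) - (-ef[i]) * rhs[j]) / det
            s = (ab[i] * rhs[j] - rhs[i] * ab[j]) / det
            # The remaining equation must also hold for the lines to meet.
            if all(t * ab[k] - s * ef[k] == rhs[k] for k in range(3)):
                return tuple(a[k] + t * ab[k] for k in range(3))
            return None
    return None


# Line L through A with direction AB, line N through E with direction EF.
A, AB = (-4, 3, 1), (5, -2, -1)
E, EF = (1, 4, 7), (-5, -7, -20)
print(intersect_lines(A, AB, E, EF))
```

Exact fractions are used so that the consistency check on the third equation is not disturbed by rounding.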

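
Example 2: If A = (5, 0, 7) and B = (2, 3, 6), find the points P on the line AB which satisfy AP/PB = 3.
Solution: Use the formulae
\[ \mathbf{P} = \mathbf{A} + t\,\overrightarrow{AB} \quad\text{and}\quad \left|\frac{t}{1 - t}\right| = \frac{AP}{PB} = 3. \]
Then
\[ \frac{t}{1 - t} = 3 \quad\text{or}\quad \frac{t}{1 - t} = -3, \]
so
\[ t = \frac{3}{4} \quad\text{or}\quad t = \frac{3}{2}. \]
The corresponding points are
\[ \left(\frac{11}{4}, \frac{9}{4}, \frac{25}{4}\right) \quad\text{and}\quad \left(\frac{1}{2}, \frac{9}{2}, \frac{11}{2}\right). \]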
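
The two points of Example 2 can be recomputed from P = A + t AB with a short Python sketch using exact fractions. The function name `point_on_line` is an assumption made for this illustration.

```python
from fractions import Fraction


def point_on_line(a, b, t):
    """Point P = A + t(B - A) on the line through points a and b."""
    return tuple(Fraction(x) + t * (y - x) for x, y in zip(a, b))


A = (5, 0, 7)
B = (2, 3, 6)
for t in (Fraction(3, 4), Fraction(3, 2)):
    P = point_on_line(A, B, t)
    ratio = abs(t / (1 - t))   # AP/PB from Joachimsthal's ratio formula
    print(t, P, ratio)
```

The value t = 3/4 lies between 0 and 1, so that point is between A and B, while t = 3/2 exceeds 1, so B lies between A and that point.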

Let X and Y be non-zero vectors. Then X is parallel or proportional to Y if X = tY for some $t \in \mathbb{R}$. We write $X \parallel Y$ if X is parallel to Y.
If X = tY, we say that X and Y have the same or opposite direction, according as t > 0 or t < 0.
If A and B are distinct points on a line L, the non-zero vector $\overrightarrow{AB}$ is called a direction vector for L.
It is easy to prove that any two direction vectors for a line are parallel.
Let L and M be lines having direction vectors X and Y, respectively. Then L is parallel to M if X is parallel to Y. Clearly any line is parallel to itself.
It is easy to prove that the line through a given point A and parallel to a given line CD has an equation $\mathbf{P} = \mathbf{A} + t\,\overrightarrow{CD}$.
Theorem 3: Let $X = a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $Y = a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$ be non-zero vectors. Then X is parallel to Y if and only if
\[ \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} = 0. \qquad (8) \]
Proof: The case of equality in the Cauchy-Schwarz inequality (theorem 1) shows that X and Y are parallel if and only if
\[ |X \cdot Y| = \lVert X \rVert\, \lVert Y \rVert. \]
Squaring gives the equivalent equality
\[ (a_1a_2 + b_1b_2 + c_1c_2)^2 = (a_1^2 + b_1^2 + c_1^2)(a_2^2 + b_2^2 + c_2^2), \]
which simplifies to
\[ (a_1b_2 - a_2b_1)^2 + (b_1c_2 - b_2c_1)^2 + (a_1c_2 - a_2c_1)^2 = 0, \]
which is equivalent to
\[ a_1b_2 - a_2b_1 = 0, \quad b_1c_2 - b_2c_1 = 0, \quad a_1c_2 - a_2c_1 = 0, \]
which is equation 8.
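Equality of geometrical vectors has a fundamental geometrical interpretation:
Theorem 4: Suppose A, B, C, D are distinct points such that no three are collinear. Then $\overrightarrow{AB} = \overrightarrow{CD}$ if and only if $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$. (See Figure 8.1.)
Proof: If $\overrightarrow{AB} = \overrightarrow{CD}$ then
\[ \mathbf{B} - \mathbf{A} = \mathbf{D} - \mathbf{C}, \]
\[ \mathbf{C} - \mathbf{A} = \mathbf{D} - \mathbf{B} \]
and so $\overrightarrow{AC} = \overrightarrow{BD}$. Hence $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$.
Conversely, suppose that $\overrightarrow{AB} \parallel \overrightarrow{CD}$ and $\overrightarrow{AC} \parallel \overrightarrow{BD}$. Then
\[ \overrightarrow{AB} = s\,\overrightarrow{CD} \quad\text{and}\quad \overrightarrow{AC} = t\,\overrightarrow{BD}, \]
or
\[ \mathbf{B} - \mathbf{A} = s(\mathbf{D} - \mathbf{C}) \quad\text{and}\quad \mathbf{C} - \mathbf{A} = t(\mathbf{D} - \mathbf{B}). \]
We have to prove s = 1 or equivalently, t = 1.
Now subtracting the second equation above from the first gives
\[ \mathbf{B} - \mathbf{C} = s(\mathbf{D} - \mathbf{C}) - t(\mathbf{D} - \mathbf{B}), \]
so
\[ (1 - t)\mathbf{B} = (1 - s)\mathbf{C} + (s - t)\mathbf{D}. \]
If $t \ne 1$, then
\[ \mathbf{B} = \frac{1 - s}{1 - t}\mathbf{C} + \frac{s - t}{1 - t}\mathbf{D} \]
and B would lie on the line CD. Hence t = 1.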

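8.4 THE ANGLE BETWEEN TWO VECTORS
Let X and Y be non-zero vectors. Then the angle between X and Y is the unique value of $\theta$ defined by
\[ \cos\theta = \frac{X \cdot Y}{\lVert X \rVert\, \lVert Y \rVert}, \qquad 0 \le \theta \le \pi. \]
By Cauchy's inequality,
\[ -1 \le \frac{X \cdot Y}{\lVert X \rVert\, \lVert Y \rVert} \le 1, \]
so the above equation does define an angle $\theta$.
In terms of components, if $X = [a_1, b_1, c_1]^t$ and $Y = [a_2, b_2, c_2]^t$, then
\[ \cos\theta = \frac{a_1a_2 + b_1b_2 + c_1c_2}{\sqrt{a_1^2 + b_1^2 + c_1^2}\,\sqrt{a_2^2 + b_2^2 + c_2^2}}. \qquad (9) \]
The next result is the well-known cosine rule for a triangle.
Theorem 1: (Cosine rule) If A, B, C are points with $A \ne B$ and $A \ne C$, then the angle $\theta$ between vectors $\overrightarrow{AB}$ and $\overrightarrow{AC}$ satisfies
\[ \cos\theta = \frac{AB^2 + AC^2 - BC^2}{2\,AB \cdot AC}, \qquad (10) \]
or equivalently
\[ BC^2 = AB^2 + AC^2 - 2\,AB \cdot AC\cos\theta. \]
(See Figure 8.10.)
Proof: Let $A = (x_1, y_1, z_1)$, $B = (x_2, y_2, z_2)$, $C = (x_3, y_3, z_3)$. Then
\[ \overrightarrow{AB} = a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}, \]
\[ \overrightarrow{AC} = a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}, \]
\[ \overrightarrow{BC} = (a_2 - a_1)\mathbf{i} + (b_2 - b_1)\mathbf{j} + (c_2 - c_1)\mathbf{k}, \]
where
\[ a_i = x_{i+1} - x_1, \quad b_i = y_{i+1} - y_1, \quad c_i = z_{i+1} - z_1, \quad i = 1, 2. \]

Figure 8.10: The cosine rule for a triangle.

Now by equation 9,
\[ \cos\theta = \frac{a_1a_2 + b_1b_2 + c_1c_2}{AB \cdot AC}. \]
Also
\[ AB^2 + AC^2 - BC^2 = (a_1^2 + b_1^2 + c_1^2) + (a_2^2 + b_2^2 + c_2^2) - \left((a_2 - a_1)^2 + (b_2 - b_1)^2 + (c_2 - c_1)^2\right) = 2a_1a_2 + 2b_1b_2 + 2c_1c_2. \]
Equation 10 now follows, since
\[ \overrightarrow{AB} \cdot \overrightarrow{AC} = a_1a_2 + b_1b_2 + c_1c_2. \]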
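Example 1: Let A = (2, 1, 0), B = (3, 2, 0), C = (5, 0, 1). Find the angle $\theta$ between vectors $\overrightarrow{AB}$ and $\overrightarrow{AC}$.
Solution:
\[ \cos\theta = \frac{\overrightarrow{AB} \cdot \overrightarrow{AC}}{AB \cdot AC}. \]
Now
\[ \overrightarrow{AB} = \mathbf{i} + \mathbf{j} \quad\text{and}\quad \overrightarrow{AC} = 3\mathbf{i} - \mathbf{j} + \mathbf{k}. \]
Hence
\[ \cos\theta = \frac{1 \times 3 + 1 \times (-1) + 0 \times 1}{\sqrt{1^2 + 1^2 + 0^2}\,\sqrt{3^2 + (-1)^2 + 1^2}} = \frac{2}{\sqrt{2}\sqrt{11}} = \frac{\sqrt{2}}{\sqrt{11}}. \]
Hence
\[ \theta = \cos^{-1}\frac{\sqrt{2}}{\sqrt{11}}. \]

Figure 8.11: Pythagoras' theorem for a right-angled triangle.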
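If X and Y are vectors satisfying $X \cdot Y = 0$, we say X is orthogonal or perpendicular to Y.
If A, B, C are points forming a triangle and $\overrightarrow{AB}$ is orthogonal to $\overrightarrow{AC}$, then the angle $\theta$ between $\overrightarrow{AB}$ and $\overrightarrow{AC}$ satisfies $\cos\theta = 0$ and hence $\theta = \frac{\pi}{2}$ and the triangle is right-angled at A.
Then we have Pythagoras' theorem:
\[ BC^2 = AB^2 + AC^2. \qquad (11) \]
We also note that $BC \ge AB$ and $BC \ge AC$ follow from equation 11. (See Figure 8.11.)
Example 2: Let A = (2, 9, 8), B = (6, 4, −2), C = (7, 15, 7). Show that $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are perpendicular and find the point D such that ABDC forms a rectangle.
Solution:
\[ \overrightarrow{AB} \cdot \overrightarrow{AC} = (4\mathbf{i} - 5\mathbf{j} - 10\mathbf{k}) \cdot (5\mathbf{i} + 6\mathbf{j} - \mathbf{k}) = 20 - 30 + 10 = 0. \]
Hence $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are perpendicular. Also, the required fourth point D clearly has to satisfy the equation
\[ \overrightarrow{BD} = \overrightarrow{AC}, \quad\text{or equivalently}\quad \mathbf{D} - \mathbf{B} = \overrightarrow{AC}. \]
Hence
\[ \mathbf{D} = \mathbf{B} + \overrightarrow{AC} = (6\mathbf{i} + 4\mathbf{j} - 2\mathbf{k}) + (5\mathbf{i} + 6\mathbf{j} - \mathbf{k}) = 11\mathbf{i} + 10\mathbf{j} - 3\mathbf{k}, \]
so D = (11, 10, −3).

Figure 8.12: Distance from a point to a line.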

153

ThreeDimensional Geometry

Theorem 2: (Distance from a point to a line) If $C$ is a point and $L$ is the line through $A$ and $B$, then there is exactly one point $P$ on $L$ such that $\overrightarrow{CP}$ is perpendicular to $\overrightarrow{AB}$, namely
\[ P = A + t\,\overrightarrow{AB}, \quad t = \frac{\overrightarrow{AC} \cdot \overrightarrow{AB}}{AB^2}. \tag{12} \]
Moreover if $Q$ is any point on $L$, then $CQ \ge CP$ and hence $P$ is the point on $L$ closest to $C$.
The shortest distance $CP$ is given by
\[ CP = \frac{\sqrt{AC^2\,AB^2 - (\overrightarrow{AC} \cdot \overrightarrow{AB})^2}}{AB}. \tag{13} \]
(See Figure 8.12)

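Proof: Let $P = A + t\,\overrightarrow{AB}$ and assume that $\overrightarrow{CP}$ is perpendicular to $\overrightarrow{AB}$. Then
\[ \overrightarrow{CP} \cdot \overrightarrow{AB} = 0 \]
\[ (P - C) \cdot \overrightarrow{AB} = 0 \]
\[ (A + t\,\overrightarrow{AB} - C) \cdot \overrightarrow{AB} = 0 \]
\[ (\overrightarrow{CA} + t\,\overrightarrow{AB}) \cdot \overrightarrow{AB} = 0 \]
\[ \overrightarrow{CA} \cdot \overrightarrow{AB} + t\,(\overrightarrow{AB} \cdot \overrightarrow{AB}) = 0 \]
\[ -\overrightarrow{AC} \cdot \overrightarrow{AB} + t\,(\overrightarrow{AB} \cdot \overrightarrow{AB}) = 0, \]
so equation 12 follows.
The inequality $CQ \ge CP$, where $Q$ is any point on $L$, is a consequence of Pythagoras' theorem.
Finally, as $\overrightarrow{CP}$ and $\overrightarrow{PA}$ are perpendicular, Pythagoras' theorem gives
\[ CP^2 = AC^2 - PA^2 = AC^2 - \|t\,\overrightarrow{AB}\|^2 = AC^2 - t^2 AB^2 \]
\[ = AC^2 - \left(\frac{\overrightarrow{AC} \cdot \overrightarrow{AB}}{AB^2}\right)^2 AB^2 = \frac{AC^2\,AB^2 - (\overrightarrow{AC} \cdot \overrightarrow{AB})^2}{AB^2}, \]
as required.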
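Equations 12 and 13 translate directly into a short computation. The sketch below is illustrative only: the function names and the sample points are assumptions made here, and integer coordinates are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def sub(p, q):
    """Return the vector p - q."""
    return tuple(a - b for a, b in zip(p, q))

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def foot_of_perpendicular(A, B, C):
    """Return (t, P) from equation 12: P = A + t*AB with CP perpendicular to AB."""
    AB, AC = sub(B, A), sub(C, A)
    t = Fraction(dot(AC, AB), dot(AB, AB))
    P = tuple(a + t * d for a, d in zip(A, AB))
    return t, P

def squared_distance_to_line(A, B, C):
    """Return CP^2 from equation 13 (squared to stay in exact arithmetic)."""
    AB, AC = sub(B, A), sub(C, A)
    return Fraction(dot(AC, AC) * dot(AB, AB) - dot(AC, AB) ** 2, dot(AB, AB))

# Hypothetical sample points (not from the text).
A, B, C = (0, 0, 0), (1, 1, 0), (1, 0, 0)
t, P = foot_of_perpendicular(A, B, C)
CP = sub(P, C)

print("t =", t, "P =", P)
print("CP . AB =", dot(CP, sub(B, A)))           # perpendicularity check
print("CP^2 from P:", dot(CP, CP))
print("CP^2 from equation 13:", squared_distance_to_line(A, B, C))
```

Computing $CP^2$ both from the point $P$ and from equation 13 gives a quick consistency check on the two formulas.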
Theorem 3: (The projection of a line segment onto a line)
Let $C_1$, $C_2$ be points and $P_1$, $P_2$ be the feet of the perpendiculars from $C_1$ and $C_2$ to the line $AB$. Then
\[ P_1P_2 = |\overrightarrow{C_1C_2} \cdot \hat{n}|, \]
where
\[ \hat{n} = \frac{1}{AB}\,\overrightarrow{AB}. \]
Also
\[ C_1C_2 \ge P_1P_2. \tag{14} \]
(See Figure 8.13)


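Proof: Using equation 12, we have
\[ P_1 = A + t_1\overrightarrow{AB}, \quad P_2 = A + t_2\overrightarrow{AB}, \]
where
\[ t_1 = \frac{\overrightarrow{AC_1} \cdot \overrightarrow{AB}}{AB^2}, \quad t_2 = \frac{\overrightarrow{AC_2} \cdot \overrightarrow{AB}}{AB^2}. \]
Hence
\[ \overrightarrow{P_1P_2} = (A + t_2\overrightarrow{AB}) - (A + t_1\overrightarrow{AB}) = (t_2 - t_1)\overrightarrow{AB}. \]
So,
\[ P_1P_2 = \|\overrightarrow{P_1P_2}\| = |t_2 - t_1|\,AB = \left|\frac{\overrightarrow{AC_2} \cdot \overrightarrow{AB}}{AB^2} - \frac{\overrightarrow{AC_1} \cdot \overrightarrow{AB}}{AB^2}\right| AB = \frac{|\overrightarrow{C_1C_2} \cdot \overrightarrow{AB}|}{AB^2}\,AB = |\overrightarrow{C_1C_2} \cdot \hat{n}|, \]
where $\hat{n}$ is the unit vector
\[ \hat{n} = \frac{1}{AB}\,\overrightarrow{AB}. \]
Inequality 14 then follows from the Cauchy-Schwarz inequality 3.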


Two non-intersecting lines are called skew if they have non-parallel direction vectors. Theorem 8.5.3 has an application to the problem of showing that two skew lines have a shortest distance between them. (The reader is referred to problem 16 at the end of the chapter.) Before we turn to the study of planes, it is convenient to introduce the cross-product of two vectors.

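8.5 THE CROSS-PRODUCT OF TWO VECTORS
Let $X = a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $Y = a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$. Then $X \times Y$, the cross-product of $X$ and $Y$, is defined by
\[ X \times Y = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}, \]
where
\[ a = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}, \quad b = -\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}, \quad c = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}. \]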

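The vector cross-product has the following properties, which follow from properties of $2 \times 2$ and $3 \times 3$ determinants:
i. $\mathbf{i} \times \mathbf{j} = \mathbf{k}$, $\mathbf{j} \times \mathbf{k} = \mathbf{i}$, $\mathbf{k} \times \mathbf{i} = \mathbf{j}$;
ii. $X \times X = 0$;
iii. $Y \times X = -X \times Y$;
iv. $X \times (Y + Z) = X \times Y + X \times Z$;
v. $(tX) \times Y = t(X \times Y)$;
vi. (Scalar triple product formula) if $Z = a_3\mathbf{i} + b_3\mathbf{j} + c_3\mathbf{k}$, then
\[ X \cdot (Y \times Z) = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = (X \times Y) \cdot Z; \]
vii. $X \cdot (X \times Y) = 0 = Y \cdot (X \times Y)$;
viii. $\|X \times Y\| = \sqrt{\|X\|^2\|Y\|^2 - (X \cdot Y)^2}$;
ix. if $X$ and $Y$ are non-zero vectors and $\theta$ is the angle between $X$ and $Y$, then
\[ \|X \times Y\| = \|X\|\,\|Y\| \sin\theta. \]
(See Figure 8.14)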
From theorem 8.3.3 and the definition of cross-product, it follows that non-zero vectors $X$ and $Y$ are parallel if and only if $X \times Y = 0$; hence by (vii), the cross-product of two non-parallel, non-zero vectors $X$ and $Y$ is a non-zero vector perpendicular to both $X$ and $Y$.
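The definition of the cross-product and several of the listed properties can be checked on concrete vectors. This is a small sketch for illustration; the function names and the sample vectors `X` and `Y` are chosen here and are not taken from the text.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(x, y):
    """Cross-product via the three 2x2 determinants in the definition."""
    a1, b1, c1 = x
    a2, b2, c2 = y
    a = b1 * c2 - c1 * b2
    b = -(a1 * c2 - c1 * a2)
    c = a1 * b2 - b1 * a2
    return (a, b, c)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Arbitrary sample vectors (an assumption for illustration).
X, Y = (1, 2, 3), (4, -1, 2)
XY = cross(X, Y)

print("i x j =", cross(i, j))                      # property (i)
print("X x X =", cross(X, X))                      # property (ii)
print("Y x X =", cross(Y, X), " X x Y =", XY)      # property (iii)
print("X.(X x Y) =", dot(X, XY), " Y.(X x Y) =", dot(Y, XY))  # property (vii)
# Property (viii), compared in squared form to avoid square roots.
print(dot(XY, XY), dot(X, X) * dot(Y, Y) - dot(X, Y) ** 2)
```

Working with squared lengths keeps property (viii) in exact integer arithmetic.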
Let $X$ and $Y$ be non-zero, non-parallel vectors.
i. $Z$ is a linear combination of $X$ and $Y$ if and only if $Z$ is perpendicular to $X \times Y$;
ii. $Z$ is perpendicular to $X$ and $Y$ if and only if $Z$ is parallel to $X \times Y$.

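Proof: Let $X$ and $Y$ be non-zero, non-parallel vectors. Then
\[ X \times Y \ne 0. \]
Then if $X \times Y = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$, we have
\[ \det\,[X \times Y \,|\, X \,|\, Y]^t = \begin{vmatrix} a & b & c \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} = (X \times Y) \cdot (X \times Y) > 0. \]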


Figure 8.14: The vector crossproduct.


Hence the matrix $[X \times Y \,|\, X \,|\, Y]$ is non-singular. Consequently the linear system
\[ r(X \times Y) + sX + tY = Z \tag{15} \]
has a unique solution $r$, $s$, $t$.


i. Suppose $Z = sX + tY$. Then
\[ Z \cdot (X \times Y) = sX \cdot (X \times Y) + tY \cdot (X \times Y) = s0 + t0 = 0. \]
Conversely, suppose that
\[ Z \cdot (X \times Y) = 0. \tag{16} \]
Now from equation 15, $r$, $s$, $t$ exist satisfying
\[ Z = r(X \times Y) + sX + tY. \]
Then equation 16 gives
\[ 0 = (r(X \times Y) + sX + tY) \cdot (X \times Y) = r\|X \times Y\|^2 + sX \cdot (X \times Y) + tY \cdot (X \times Y) = r\|X \times Y\|^2. \]
Hence $r = 0$ and $Z = sX + tY$, as required.
ii. Suppose $Z = \lambda(X \times Y)$. Then clearly $Z$ is perpendicular to $X$ and $Y$.
Conversely suppose that $Z$ is perpendicular to $X$ and $Y$.
Now from equation 15, $r$, $s$, $t$ exist satisfying
\[ Z = r(X \times Y) + sX + tY. \]
Then
\[ sX \cdot X + tX \cdot Y = X \cdot Z = 0 \]
\[ sY \cdot X + tY \cdot Y = Y \cdot Z = 0, \]
from which it follows that
\[ (sX + tY) \cdot (sX + tY) = 0. \]
Hence $sX + tY = 0$ and so $s = 0$, $t = 0$. Consequently $Z = r(X \times Y)$, as required.
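The cross-product gives a compact formula for the distance from a point to a line, as well as the area of a triangle.
Theorem 1: (Area of a triangle)
If $A$, $B$, $C$ are distinct non-collinear points, then
1. the distance $d$ from $C$ to the line $AB$ is given by
\[ d = \frac{\|\overrightarrow{AB} \times \overrightarrow{AC}\|}{AB}, \tag{17} \]
2. the area of the triangle $ABC$ equals
\[ \frac{\|\overrightarrow{AB} \times \overrightarrow{AC}\|}{2} = \frac{\|A \times B + B \times C + C \times A\|}{2}. \tag{18} \]
Proof: The area $\Delta$ of triangle $ABC$ is given by
\[ \Delta = \frac{AB \cdot CP}{2}, \]
where $P$ is the foot of the perpendicular from $C$ to the line $AB$. Now by formula 13 and property (viii) of the cross-product, we have
\[ CP = \frac{\sqrt{AC^2\,AB^2 - (\overrightarrow{AC} \cdot \overrightarrow{AB})^2}}{AB} = \frac{\|\overrightarrow{AB} \times \overrightarrow{AC}\|}{AB}, \]
which gives formula 17. The second formula of equation 18 follows from the equations
\[ \overrightarrow{AB} \times \overrightarrow{AC} = (B - A) \times (C - A) \]
\[ = \{(B - A) \times C\} - \{(B - A) \times A\} \]
\[ = \{B \times C - A \times C\} - \{B \times A - A \times A\} \]
\[ = B \times C - A \times C - B \times A \]
\[ = B \times C + C \times A + A \times B, \]
as required.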
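Formulas 17 and 18 can be tried out on a concrete triangle. The sketch below is illustrative: the helper names and the sample points are assumptions made here, not taken from the text.

```python
import math

def sub(p, q):
    """Return the vector p - q."""
    return tuple(a - b for a, b in zip(p, q))

def add(*vectors):
    """Componentwise sum of any number of 3-vectors."""
    return tuple(sum(cs) for cs in zip(*vectors))

def cross(x, y):
    """Cross-product of two 3-vectors."""
    a1, b1, c1 = x
    a2, b2, c2 = y
    return (b1 * c2 - c1 * b2, -(a1 * c2 - c1 * a2), a1 * b2 - b1 * a2)

def norm(u):
    """Length of a vector."""
    return math.sqrt(sum(c * c for c in u))

def triangle_area(A, B, C):
    """Area of triangle ABC by the first formula of equation 18."""
    return norm(cross(sub(B, A), sub(C, A))) / 2

def distance_to_line(A, B, C):
    """Distance from C to the line AB by formula 17."""
    AB = sub(B, A)
    return norm(cross(AB, sub(C, A))) / norm(AB)

# Hypothetical sample triangle.
A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
first = cross(sub(B, A), sub(C, A))
second = add(cross(A, B), cross(B, C), cross(C, A))

print("AB x AC =", first)
print("A x B + B x C + C x A =", second)
print("area =", triangle_area(A, B, C))
print("distance from C to AB =", distance_to_line(A, B, C))
```

The two printed vectors should coincide, which is the identity used in the last step of the proof.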


8.6 PLANES
A plane is a set of points $P$ satisfying an equation of the form
\[ P = P_0 + sX + tY, \quad s, t \in \mathbb{R}, \tag{19} \]
where $X$ and $Y$ are non-zero, non-parallel vectors.

For example, the xy-plane consists of the points $P = (x, y, 0)$ and corresponds to the plane equation
\[ P = x\mathbf{i} + y\mathbf{j} = O + x\mathbf{i} + y\mathbf{j}. \]
In terms of coordinates, equation 19 takes the form
\[ x = x_0 + sa_1 + ta_2 \]
\[ y = y_0 + sb_1 + tb_2 \]
\[ z = z_0 + sc_1 + tc_2, \]
where $P_0 = (x_0, y_0, z_0)$ and $(a_1, b_1, c_1)$ and $(a_2, b_2, c_2)$ are non-zero and non-proportional.
Theorem 1: Let $A$, $B$, $C$ be three non-collinear points. Then there is one and only one plane through these points, namely the plane given by the equation
\[ P = A + s\,\overrightarrow{AB} + t\,\overrightarrow{AC}, \tag{20} \]
or equivalently
\[ \overrightarrow{AP} = s\,\overrightarrow{AB} + t\,\overrightarrow{AC}. \tag{21} \]
(See Figure 8.15.)

Figure 8.15: Vector equation for the plane ABC.


Proof: First note that equation 20 is indeed the equation of a plane through $A$, $B$ and $C$, as $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are non-zero and non-parallel and $(s, t) = (0, 0)$, $(1, 0)$ and $(0, 1)$ give $P = A$, $B$ and $C$, respectively. Call this plane $\mathcal{P}$.
Conversely, suppose $P = P_0 + sX + tY$ is the equation of a plane $\mathcal{Q}$ passing through $A$, $B$, $C$. Then $A = P_0 + s_0X + t_0Y$, so the equation for $\mathcal{Q}$ may be written
\[ P = A + (s - s_0)X + (t - t_0)Y = A + s'X + t'Y; \]
so in effect we can take $P_0 = A$ in the equation of $\mathcal{Q}$. Then the fact that $B$ and $C$ lie on $\mathcal{Q}$ gives equations
\[ B = A + s_1X + t_1Y, \quad C = A + s_2X + t_2Y, \]
or
\[ \overrightarrow{AB} = s_1X + t_1Y, \quad \overrightarrow{AC} = s_2X + t_2Y. \tag{22} \]
Then equations 22 and equation 20 show that
\[ \mathcal{P} \subseteq \mathcal{Q}. \]
Conversely, it is straightforward to show that because $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are not parallel, we have
\[ \begin{vmatrix} s_1 & t_1 \\ s_2 & t_2 \end{vmatrix} \ne 0. \]

Figure 8.16: Normal equation of the plane ABC.

Hence equations 22 can be solved for $X$ and $Y$ as linear combinations of $\overrightarrow{AB}$ and $\overrightarrow{AC}$, allowing us to deduce that
\[ \mathcal{Q} \subseteq \mathcal{P}. \]
Hence
\[ \mathcal{Q} = \mathcal{P}. \]

Theorem 2: (Normal equation for a plane)
Let
\[ A = (x_1, y_1, z_1), \quad B = (x_2, y_2, z_2), \quad C = (x_3, y_3, z_3) \]
be three non-collinear points. Then the plane through $A$, $B$, $C$ is given by
\[ \overrightarrow{AP} \cdot (\overrightarrow{AB} \times \overrightarrow{AC}) = 0, \tag{23} \]
or equivalently,
\[ \begin{vmatrix} x - x_1 & y - y_1 & z - z_1 \\ x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \end{vmatrix} = 0, \tag{24} \]
where $P = (x, y, z)$. (See Figure 8.16.)

Figure 8.17: The plane ax + by + cz = d.


Example 1: Show that the planes
\[ x + y - 2z = 1 \quad\text{and}\quad x + 3y - z = 4 \]
intersect in a line and find the distance from the point $C = (1, 0, 1)$ to this line.
Solution: Solving the two equations simultaneously gives
\[ x = -\tfrac{1}{2} + \tfrac{5}{2}z, \quad y = \tfrac{3}{2} - \tfrac{1}{2}z, \tag{25} \]
where $z$ is arbitrary. Hence
\[ x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = -\tfrac{1}{2}\mathbf{i} + \tfrac{3}{2}\mathbf{j} + z\left(\tfrac{5}{2}\mathbf{i} - \tfrac{1}{2}\mathbf{j} + \mathbf{k}\right), \]
which is the equation of a line $L$ through $A = \left(-\tfrac{1}{2}, \tfrac{3}{2}, 0\right)$ and having direction vector $\tfrac{5}{2}\mathbf{i} - \tfrac{1}{2}\mathbf{j} + \mathbf{k}$. We can now proceed in one of three ways to find the closest point on $L$ to $C$.
One way is to use equation 17 with $B$ defined by
\[ \overrightarrow{AB} = \tfrac{5}{2}\mathbf{i} - \tfrac{1}{2}\mathbf{j} + \mathbf{k}. \]
Another method minimizes the distance $CP$, where $P$ ranges over $L$.
A third way is to find an equation for the plane through $C$, having $\tfrac{5}{2}\mathbf{i} - \tfrac{1}{2}\mathbf{j} + \mathbf{k}$ as a normal. Such a plane has equation
\[ 5x - y + 2z = d, \]
where $d$ is found by substituting the coordinates of $C$ in the last equation:
\[ d = 5 \times 1 - 0 + 2 \times 1 = 7. \]
We now find the point $P$ where the plane intersects the line $L$. Then $\overrightarrow{CP}$ will be perpendicular to $L$ and $CP$ will be the required shortest distance from $C$ to $L$. We find using equations 25 that
\[ 5\left(-\tfrac{1}{2} + \tfrac{5}{2}z\right) - \left(\tfrac{3}{2} - \tfrac{1}{2}z\right) + 2z = 7, \]

Figure 8.18: Line of intersection of two planes.


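so $z = \tfrac{11}{15}$. Hence $P = \left(\tfrac{4}{3}, \tfrac{17}{15}, \tfrac{11}{15}\right)$, and the required distance is
\[ CP = \sqrt{\left(\tfrac{1}{3}\right)^2 + \left(\tfrac{17}{15}\right)^2 + \left(-\tfrac{4}{15}\right)^2} = \frac{\sqrt{330}}{15}. \]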
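The point $P$ in Example 1 can also be obtained from equation 12 rather than from the auxiliary plane. The sketch below uses exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

# Line L from equations 25: the point A (z = 0) and the direction vector.
A = (F(-1, 2), F(3, 2), F(0))
direction = (F(5, 2), F(-1, 2), F(1))
C = (F(1), F(0), F(1))

# Equation 12 with AB replaced by the direction vector of L.
AC = tuple(c - a for c, a in zip(C, A))
t = dot(AC, direction) / dot(direction, direction)
P = tuple(a + t * d for a, d in zip(A, direction))

CP = tuple(p - c for p, c in zip(P, C))
cp_squared = dot(CP, CP)

print("t =", t)
print("P =", P)
print("CP^2 =", cp_squared)
# P should satisfy both plane equations.
print(P[0] + P[1] - 2 * P[2], P[0] + 3 * P[1] - P[2])
```

Since the direction vector has $z$-component 1 and $A$ has $z = 0$, the parameter $t$ here coincides with the $z$-coordinate of $P$.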

It is clear that through a given line and a point not on that line, there passes exactly one plane. If the line is given as the intersection of two planes, each in normal form, there is a simple way of finding an equation for this plane. More explicitly we have the following result:
Theorem 3: Suppose the planes
\[ a_1x + b_1y + c_1z = d_1 \tag{26} \]
\[ a_2x + b_2y + c_2z = d_2 \tag{27} \]
have non-parallel normals. Then the planes intersect in a line $L$.
Moreover the equation
\[ \lambda(a_1x + b_1y + c_1z - d_1) + \mu(a_2x + b_2y + c_2z - d_2) = 0, \tag{28} \]
where $\lambda$ and $\mu$ are not both zero, gives all planes through $L$.
(See Figure 8.18)
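Proof: Assume that the normals $a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$ are non-parallel. Then by theorem 8.4.3, not all of
\[ \Delta_1 = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}, \quad \Delta_2 = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}, \quad \Delta_3 = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} \tag{29} \]
are zero. If, say, $\Delta_1 \ne 0$, we can solve equations 26 and 27 for $x$ and $y$ in terms of $z$, as we did in the previous example, to show that the intersection forms a line $L$.
We next have to check that if $\lambda$ and $\mu$ are not both zero, then equation 28 represents a plane. (Whatever set of points equation 28 represents, this set certainly contains $L$.) Rewriting equation 28 gives
\[ (\lambda a_1 + \mu a_2)x + (\lambda b_1 + \mu b_2)y + (\lambda c_1 + \mu c_2)z - (\lambda d_1 + \mu d_2) = 0. \]
Then we clearly cannot have all the coefficients
\[ \lambda a_1 + \mu a_2, \quad \lambda b_1 + \mu b_2, \quad \lambda c_1 + \mu c_2 \]
zero, as otherwise the vectors $a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$ would be parallel. Finally, if $\mathcal{P}$ is a plane containing $L$, let $P_0 = (x_0, y_0, z_0)$ be a point of $\mathcal{P}$ not on $L$. Then if we define $\lambda$ and $\mu$ by
\[ \lambda = -(a_2x_0 + b_2y_0 + c_2z_0 - d_2), \quad \mu = a_1x_0 + b_1y_0 + c_1z_0 - d_1, \]
then at least one of $\lambda$ and $\mu$ is non-zero. Then the coordinates of $P_0$ satisfy equation 28, which therefore represents a plane passing through $L$ and $P_0$ and hence identical with $\mathcal{P}$.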
Example 2: Find an equation for the plane through $P_0 = (1, 0, 1)$ and passing through the line of intersection of the planes
\[ x + y - 2z = 1 \quad\text{and}\quad x + 3y - z = 4. \]
Solution: The required plane has the form
\[ \lambda(x + y - 2z - 1) + \mu(x + 3y - z - 4) = 0, \]
where not both of $\lambda$ and $\mu$ are zero. Substituting the coordinates of $P_0$ into this equation gives
\[ -2\lambda + \mu(-4) = 0, \quad \lambda = -2\mu. \]
So the required equation is
\[ -2\mu(x + y - 2z - 1) + \mu(x + 3y - z - 4) = 0, \]
or
\[ -x + y + 3z - 2 = 0. \]
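The choice of $\lambda$ and $\mu$ from the proof of Theorem 3 can be applied mechanically to Example 2. In this sketch a plane $ax + by + cz = d$ is stored as the tuple `(a, b, c, d)`; that representation and the function names are assumptions made here for illustration.

```python
from fractions import Fraction as F

def residual(plane, point):
    """Value of ax + by + cz - d at the given point (zero when the point is on the plane)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z - d

def combine(lam, mu, plane1, plane2):
    """Coefficients of lam*(plane1) + mu*(plane2), as in equation 28."""
    return tuple(lam * p + mu * q for p, q in zip(plane1, plane2))

plane1 = (1, 1, -2, 1)   # x + y - 2z = 1
plane2 = (1, 3, -1, 4)   # x + 3y - z = 4
P0 = (1, 0, 1)

# lambda and mu as defined in the proof of Theorem 3.
lam = -residual(plane2, P0)
mu = residual(plane1, P0)
plane = combine(lam, mu, plane1, plane2)

# Two points on the line of intersection (z = 0 and z = 1 in equations 25).
on_line = [(F(-1, 2), F(3, 2), F(0)), (F(2), F(1), F(1))]

print("lambda =", lam, "mu =", mu)
print("combined plane (a, b, c, d) =", plane)
print("residual at P0:", residual(plane, P0))
print("residuals on L:", [residual(plane, q) for q in on_line])
```

The combined plane is a scalar multiple of $-x + y + 3z = 2$, so it describes the same set of points as the answer above.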
Our final result is a formula for the distance from a point to a plane.

Figure 8.19: Distance from a point P 0 to the plane ax + by + cz = d.

8.7 MULTIPLE CHOICE QUESTIONS

1. Identify the slope in the following equation: y = 9x - 12.
(a) -9
(b) 9
(c) 12
(d) -12
2. Which of the following is the equation of the line that passes through (0, 2) and (-3, -4)?
(a) y = (1/2)x + 2
(b) y = (1/2)x - 2
(c) y = 2x + 2
(d) y = 2x - 2
3. Which of the following is the equation of the line that passes through (0, -2) and (3, 4)?
(a) y = (1/2)x + 2
(b) y = (1/2)x - 2
(c) y = 2x + 2
(d) y = 2x - 2
4. Determine the volume of the parallelepiped determined by a, b, and c using the scalar triple product $V = |a \cdot (b \times c)|$.
(a) 29
(b) 95
(c) 22
(d) -12
5. Determine the dot product of the vectors: a = (1, 2, 3), b = (1, 2, 1).
(a) $a \cdot b = -2$
(b) $a \cdot b = -6$
(c) $a \cdot b = -4$
(d) $a \cdot b = 0$
6. Find a polar representation for the curve whose Cartesian equation is $x^2 + (y + 2)^2 = 4$.
(a) $r + 4\sin\theta = 0$
(b) $r = 8\cos\theta$
(c) $r = 2\cos\theta$
(d) $r = 5\cos\theta$
7. Given the vectors $\vec{a} = \langle 5, 7 \rangle$ and $\vec{b} = \langle 2, 3 \rangle$, an expression for $\vec{b}$ in terms of $\vec{i}$ and $\vec{j}$ is
(a) $4\vec{i} + 5\vec{j}$
(b) $4\vec{i} + 5\vec{j}$
(c) $\vec{i} + 2\vec{j}$
(d) $2\vec{j}$
8. Find the coordinates and the magnitude of the vector $\overrightarrow{AB}$, for A(1, 3) and B(7, 2).
(a) $\sqrt{67}$
(b) $\sqrt{33}$
(c) $\sqrt{37}$
(d) $\sqrt{11}$
9. Write a force of 200 N at 20° to the horizontal in Cartesian form.
(a) [157.9, 68.4]
(b) [197.9, 68.4]
(c) [177.9, 68.4]
(d) [187.9, 68.4]
10. The angle between the vectors $\vec{g} = \langle 5, 1 \rangle$ and $\vec{b} = \langle 3, 8 \rangle$ is
(a) 9.2°
(b) 99.2°
(c) 88.2°
(d) 56.2°

8.8 REVIEW QUESTIONS

1. Let $X = a_1\mathbf{i} + b_1\mathbf{j} + c_1\mathbf{k}$ and $Y = a_2\mathbf{i} + b_2\mathbf{j} + c_2\mathbf{k}$ be non-zero vectors. Then $X$ is parallel to $Y$ if and only if
\[ \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} = 0. \]
2. Define three-dimensional space.
3. Define the dot product.
4. Explain the Cauchy-Schwarz inequality.
5. Explain the triangle inequality for distance.
6. Define Joachimsthal's ratio formulae.
7. L is the line AB, where A = (4, 3, 1), B = (1, 1, 0); M is the line CD, where C = (2, 0, 2), D = (1, 3, 2); N is the line EF, where E = (1, 4, 7), F = (4, 3, 13). Find which pairs of lines intersect and also the points of intersection.
8. If A = (5, 0, 7) and B = (2, 3, 6), find the points P on the line AB which satisfy AP/PB = 3.
9. Find an equation for the plane through $P_0 = (1, 0, 1)$ and passing through the line of intersection of the planes
x + y - 2z = 1 and x + 3y - z = 4.
10. Show that the planes
x + y - 2z = 1 and x + 3y - z = 4
intersect in a line and find the distance from the point C = (1, 0, 1) to this line.

ANSWERS FOR MULTIPLE CHOICE QUESTIONS


(1). (b)

(2). (c)

(3). (d)

(4). (a)

(5). (d)

(6). (a)

(7). (a)

(8). (c)

(9). (d)

(10). (b)

Chapter 9

Linear Programming

Objectives
After studying this chapter, you will be able to:
Understand the assumptions of linear programming
Explain the process of formulation
Understand the graphical method
Discuss the simplex method
Understand the two-phase method for linear programming problems

INTRODUCTION
The immediate and more obvious LP results enable the mill operator to:
Minimize the cost of cotton blends
Minimize substandard blends
Maintain accurate inventory records
Purchase and sell most economically
The basis of the LP technique is the formulation of a mathematical model
of the allocation problem. For problems of any practical size, this model is entered into a computer, and the computer LP system rapidly calculates the optimal solution. The system may also produce reports which indicate the effect
on the optimal solutions of possible changes in the given prices, availabilities,
specifications, etc. Little mathematical knowledge or skill is required to formulate an LP model. Nor do the operation of the computer and the analysis
of computer results require any advanced technical skill.

9.1 ASSUMPTIONS OF LINEAR PROGRAMMING
The LP problems embody seven important assumptions relative to the problem being modeled. The first three involve the appropriateness of the formulation; the last four the mathematical relationships within the model.

Objective Function Appropriateness


This assumption means that within the formulation the objective function is the sole criterion for choosing among the feasible values of the decision variables. Satisfying this assumption can often be difficult; for example, Ram might base his van conversion plan not only on profit but also on risk exposure, availability of vacation time, etc.

Decision Variable Appropriateness


A key assumption is that the specification of the decision variables is appropriate. This assumption requires that
The decision variables are all fully manipulable within the feasible region and are under the control of the decision maker.
All appropriate decision variables have been included in the model.

Constraint Appropriateness
The third appropriateness assumption involves the constraints. Again, this is best expressed by identifying sub-assumptions:
The constraints fully identify the bounds placed on the decision variables by resource availability, technology, the external environment, etc. Thus, any choice of the decision variables which simultaneously satisfies all the constraints is admissible.
The resources used and/or supplied within any single constraint are homogeneous items that can be used or supplied by any decision variable appearing in that constraint.
Constraints have not been imposed which improperly eliminate admissible values of the decision variables.
The constraints are inviolate. No considerations involving model variables other than those included in the model can lead to the relaxation of the constraints.

Key Vocabulary
Column Vector: A column vector (column matrix) is an $m \times 1$ matrix, i.e. a matrix consisting of a single column of $m$ elements.

Proportionality
Variables in LP models are assumed to exhibit proportionality. Proportionality deals with the contribution per unit of each decision variable to the objective function. This contribution is assumed constant and independent of the variable level. Similarly, the use of each resource per unit of each decision variable is assumed constant and independent of variable level. There are no economies of scale.
For example, in the general LP problem, the net return per unit of $X_j$ produced is $c_j$. If the solution uses one unit of $X_j$, then $c_j$ units of return are earned, and if 100 units are produced, then returns are $100c_j$. Under this assumption, the total contribution of $X_j$ to the objective function is always proportional to its level. This assumption also applies to resource usage within the constraints. Ram's labor requirement for fine vans was 25 hours/van. If Ram converts one fine van he uses 25 hours of labor. If he converts 10 fine vans he uses 250 hours ($25 \times 10$). Total labor use from van conversion is always strictly proportional to the number of vans produced. Economists encounter several types of problems in which the proportionality assumption is grossly violated. In some contexts, product price depends upon the level of production. Thus, the contribution per unit of an activity varies with the level of the activity. Methods to relax the proportionality assumption include nonlinear approximation, price-endogenous modeling, and risk modeling. Another case occurs when fixed costs are to be modeled. Suppose there is a fixed cost associated with a variable having any non-zero value. In this case, total cost per unit of production is not constant.

Additivity
Additivity deals with the relationships among the decision variables. Simply put, their contributions to an equation must be additive. The total value of the objective function equals the sum of the contributions of each variable to the objective function. Similarly, total resource use is the sum of the resource use of each variable. This requirement rules out the possibility that interaction or multiplicative terms appear in the objective function or the constraints.
For example, in Ram's van problem, the value of the objective function is 2,000 times the fancy vans converted plus 1,700 times the fine vans converted. Converting fancy vans does not alter the per-van net margin of fine vans and vice versa. Similarly, total labor use is the sum of the hours of labor required to convert fancy vans and the hours of labor used to convert fine vans. Making a lot of one van does not alter the labor requirement for making the other.

Key Vocabulary
Feasible Region: The
region in which all the
constraints are satisfied.
All feasible solutions
must lie in this feasible
region.

In the general LP formulation, when considering variables $X_j$ and $X_k$, the value of the objective function must always equal $c_j$ times $X_j$ plus $c_k$ times $X_k$. Using $X_j$ does not affect the per-unit net return of $X_k$ and vice versa. Similarly, total use of resource $i$ is the sum of $a_{ij}X_j$ and $a_{ik}X_k$. Using $X_j$ does not alter the resource requirement of $X_k$. Nonlinear approximation, price-endogenous modeling, and risk modeling provide methods of relaxing this assumption.

Divisibility
The problem formulation assumes that all decision variables can take on any non-negative value, including fractional ones. In Ram's van shop example, this means that fractional vans can be converted; for example, Ram could convert 11.2 fancy vans and 0.8 fine vans. This assumption is violated when non-integer values of certain decision variables make little sense. A decision variable may correspond to the purchase of a tractor or the construction of a building, where it is clear that the variable must take on integer values. In this case, it is appropriate to use integer programming.

Certainty
The certainty assumption requires that the parameters cj, bi, and aij be
known constants. The optimum solution derived is predicated on perfect
knowledge of all the parameter values. Since all exogenous factors are assumed to be known and fixed, LP models are sometimes called non-stochastic as contrasted with models explicitly dealing with stochastic factors.
This assumption gives rise to the term deterministic analysis.
The exogenous parameters of an LP model are not usually known with certainty. In fact, they are usually estimated by statistical techniques. Thus, after developing an LP model, it is often useful to conduct sensitivity analysis by varying one of the exogenous parameters and observing the sensitivity of the optimal solution to that variation. For example, in the van shop problem the net return per fancy van is 2,000, but this value depends upon the van cost, the cost of materials and the sale price, all of which could be random variables.

9.2 FORMULATION OF LINEAR PROGRAMMING PROBLEMS
The following model structure is used to explain the formulation of LP problems.

9.2.1 Structure of Linear Programming Model

The general structure of the Linear Programming model essentially consists of three components:
(i) The activities (variables) and their relationships
(ii) The objective function and
(iii) The constraints
The activities are represented by $X_1, X_2, X_3, \ldots, X_n$. These are known as decision variables.
The objective function of an LPP (Linear Programming Problem) is a mathematical representation of the objective in terms of a measurable quantity such as profit, cost, revenue, etc.
Optimize (Maximize or Minimize) $Z = C_1X_1 + C_2X_2 + \cdots + C_nX_n$
where $Z$ is the measure-of-performance variable, $X_1, X_2, X_3, \ldots, X_n$ are the decision variables, and $C_1, C_2, \ldots, C_n$ are the parameters that give the contribution of each decision variable to $Z$.
The constraints: These are the set of linear inequalities and/or equalities which impose the restrictions of the limited resources.

Key Vocabulary
Optimal Solution: The solution that is determined to be the best among all feasible solutions.

9.2.2 General Mathematical Model of an LPP

Optimize (Maximize or Minimize) $Z = C_1X_1 + C_2X_2 + \cdots + C_nX_n$
Subject to constraints:
\[ a_{11}X_1 + a_{12}X_2 + \cdots + a_{1n}X_n \;(\le, =, \ge)\; b_1 \]
\[ a_{21}X_1 + a_{22}X_2 + \cdots + a_{2n}X_n \;(\le, =, \ge)\; b_2 \]
\[ a_{31}X_1 + a_{32}X_2 + \cdots + a_{3n}X_n \;(\le, =, \ge)\; b_3 \]
\[ \vdots \]
\[ a_{m1}X_1 + a_{m2}X_2 + \cdots + a_{mn}X_n \;(\le, =, \ge)\; b_m \]
and $X_1, X_2, \ldots, X_n \ge 0$.

9.2.3 Guidelines for Formulating Linear Programming Model
(i) Identify and define the decision variables of the problem.
(ii) Define the objective function.
(iii) State the constraints subject to which the objective function should be optimized.
(iv) Add the non-negativity constraints, on the grounds that negative values of the decision variables do not have any valid physical interpretation.
Example
A manufacturer produces two types of models, M1 and M2. Each model of type M1 requires 4 hours of grinding and 2 hours of polishing, whereas each model of type M2 requires 2 hours of grinding and 5 hours of polishing. The manufacturer has 2 grinders and 3 polishers. Each grinder works for 40 hours a week and each polisher works 60 hours a week. Profit on an M1 model is INR 15,000 and on an M2 model is INR 20,000. Whatever is produced in a week is sold in the market. How should the manufacturer allocate his production capacity to the two types of models so that he makes maximum profit in a week?

Key Vocabulary
Refinery Management: The management of industrial plants that use mechanical and chemical means to purify a substance, such as petroleum or sugar, or to convert it to a form that is more useful.

i) Identify and define the decision variables of the problem.
Let $X_1$ and $X_2$ be the number of units of the M1 and M2 models produced per week.
ii) Define the objective function.
Since the profits on both the models are given, the objective is to maximize the profit. Measuring profit in units of INR 5,000 (so that 15,000 becomes 3 and 20,000 becomes 4),
Max $Z = 3X_1 + 4X_2$
iii) State the constraints subject to which the objective function should be optimized.
There are two constraints, one for grinding and the other for polishing.
Each grinder is available for 40 hours per week and there are two grinders. Hence the total grinding time available is $40 \times 2 = 80$ hours, and the grinding constraint is given by
$4X_1 + 2X_2 \le 80$
Each polisher is available for 60 hours per week and there are three polishers. Hence the total polishing time available is $60 \times 3 = 180$ hours, and the polishing constraint is given by
$2X_1 + 5X_2 \le 180$
Finally we have,
Max $Z = 3X_1 + 4X_2$
Subject to constraints,
$4X_1 + 2X_2 \le 80$
$2X_1 + 5X_2 \le 180$
$X_1, X_2 \ge 0$

Example
A firm is engaged in producing two products, A and B. Each unit of product A requires 2 kg of raw material and 4 labor hours for processing, whereas each unit of B requires 3 kg of raw material and 3 labor hours for the same type of processing. Every week, the firm has an availability of 60 kg of raw material and 96 labor hours. One unit of product A sold yields INR 2,000 and one unit of product B sold yields INR 1,750 as profit.
Formulate this as a Linear Programming Problem to determine how many units of each of the products should be produced per week so that the firm can earn maximum profit.
i) Identify and define the decision variables of the problem.
Let $X_1$ and $X_2$ be the number of units of product A and product B produced per week.
ii) Define the objective function.
Since the profits of both the products are given, the objective is to maximize the profit. Measuring profit in units of INR 50 (so that 2,000 becomes 40 and 1,750 becomes 35),
Max $Z = 40X_1 + 35X_2$
iii) State the constraints subject to which the objective function should be optimized.
There are two constraints: one is the raw material constraint and the other is the labor constraint.
The raw material constraint is given by
$2X_1 + 3X_2 \le 60$
The labor hours constraint is given by
$4X_1 + 3X_2 \le 96$
Finally we have,
Max $Z = 40X_1 + 35X_2$
Subject to constraints,
$2X_1 + 3X_2 \le 60$
$4X_1 + 3X_2 \le 96$
$X_1, X_2 \ge 0$

Key Vocabulary
Risk Exposure: The amount of risk an investor has taken on in a particular investment or a portfolio, or the extent to which a business could be affected by certain factors that may have a negative impact on earnings.
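Once a model is formulated, any proposed production plan can be tested against the constraints and scored with the objective function. The sketch below does this for the firm's problem (raw material $2X_1 + 3X_2 \le 60$, labor $4X_1 + 3X_2 \le 96$, profit $40X_1 + 35X_2$ in units of INR 50); the function names and the candidate plans are hypothetical and chosen only for illustration.

```python
def is_feasible(x1, x2):
    """Check the raw material, labor and non-negativity constraints."""
    return (
        2 * x1 + 3 * x2 <= 60      # raw material (kg)
        and 4 * x1 + 3 * x2 <= 96  # labor (hours)
        and x1 >= 0
        and x2 >= 0
    )

def profit(x1, x2):
    """Objective function Z, in units of INR 50."""
    return 40 * x1 + 35 * x2

# Hypothetical candidate plans (units of A, units of B).
plans = [(10, 10), (18, 8), (24, 0), (20, 10)]

for plan in plans:
    if is_feasible(*plan):
        print(plan, "feasible, Z =", profit(*plan))
    else:
        print(plan, "violates at least one constraint")
```

Checking a handful of plans in this way does not prove optimality; it only shows how the constraints and the objective function are used together.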

Example
The agricultural research institute advised a farmer to spread at least 4,800 kg of a special phosphate fertilizer and not less than 7,200 kg of a special nitrogen fertilizer to raise the productivity of crops in his fields. There are two sources for obtaining these: mixture A and mixture B. Both are available in bags weighing 100 kg each and they cost INR 2,000 and INR 1,200 respectively. Mixture A contains phosphate and nitrogen equivalent of 20 kg and 80 kg respectively, while mixture B contains these ingredients equivalent of 50 kg each. Write this as an LPP and determine how many bags of each type the farmer should buy in order to obtain the required fertilizer at minimum cost.
i) Identify and define the decision variables of the problem.
Let $X_1$ and $X_2$ be the number of bags of mixture A and mixture B.
ii) Define the objective function.
The costs of mixture A and mixture B are given; the objective is to minimize the cost. Measuring cost in units of INR 50 (so that 2,000 becomes 40 and 1,200 becomes 24),
Min $Z = 40X_1 + 24X_2$
iii) State the constraints subject to which the objective function should be optimized.
The above objective function is subject to the following constraints.
$20X_1 + 50X_2 \ge 4800$ (Phosphate requirement)
$80X_1 + 50X_2 \ge 7200$ (Nitrogen requirement)
$X_1, X_2 \ge 0$
Finally we have,
Min $Z = 40X_1 + 24X_2$
subject to the constraints
$20X_1 + 50X_2 \ge 4800$
$80X_1 + 50X_2 \ge 7200$
$X_1, X_2 \ge 0$

Example
A retired person wants to invest up to INR 1,500,000 in fixed income securities. His broker recommends investing in two bonds: Bond A yielding 7% and Bond B yielding 10%. After some consideration, he decides to invest at most INR 600,000 in Bond B and at least INR 300,000 in Bond A. He also wants the amount invested in Bond A to be at least equal to the amount invested in Bond B. What should the broker recommend if the investor wants to maximize his return on investment? Solve graphically.
i) Identify and define the decision variables of the problem.
Let $X_1$ and $X_2$ be the amounts (in INR) invested in Bonds A and B.
ii) Define the objective function.
The yields of the two bonds are given; the objective is to maximize the return.
Max $Z = 0.07X_1 + 0.1X_2$
iii) State the constraints subject to which the objective function should be optimized.
The above objective function is subject to the following constraints.
$X_1 + X_2 \le 1{,}500{,}000$
$X_1 \ge 300{,}000$
$X_2 \le 600{,}000$
$X_1 - X_2 \ge 0$
$X_1, X_2 \ge 0$
Finally we have,
Max $Z = 0.07X_1 + 0.1X_2$
subject to the constraints
$X_1 + X_2 \le 1{,}500{,}000$
$X_1 \ge 300{,}000$
$X_2 \le 600{,}000$
$X_1 - X_2 \ge 0$
$X_1, X_2 \ge 0$

Minimization Problems
Example
A person requires 10, 12, and 12 units of chemicals A, B and C respectively for his garden. A liquid product contains 5, 2 and 1 units of A, B and C respectively per jar. A dry product contains 1, 2 and 4 units of A, B and C per carton. If the liquid product sells for INR 150 per jar and the dry product sells for INR 100 per carton, how many of each should be purchased in order to minimize the cost and meet the requirements?
i) Identify and define the decision variables of the problem.
Let $X_1$ and $X_2$ be the number of jars of the liquid product and cartons of the dry product.
ii) Define the objective function.
The costs of the liquid and dry products are given; the objective is to minimize the cost. Measuring cost in units of INR 50 (so that 150 becomes 3 and 100 becomes 2),
Min $Z = 3X_1 + 2X_2$
iii) State the constraints subject to which the objective function should be optimized.
The above objective function is subject to the following three constraints.
$5X_1 + X_2 \ge 10$
$2X_1 + 2X_2 \ge 12$
$X_1 + 4X_2 \ge 12$
$X_1, X_2 \ge 0$
Finally we have,
Min $Z = 3X_1 + 2X_2$
subject to the three constraints
$5X_1 + X_2 \ge 10$
$2X_1 + 2X_2 \ge 12$
$X_1 + 4X_2 \ge 12$
$X_1, X_2 \ge 0$
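Because jars and cartons are bought in whole numbers, this small problem can also be explored by simply trying every whole-number purchase within a reasonable range. This brute-force search is not a method described in the text; it is a sketch added here for illustration, and the search limit of 20 is an assumption that comfortably covers every purchase that could be cheapest.

```python
def meets_requirements(x1, x2):
    """Check the three chemical requirements for x1 jars and x2 cartons."""
    return (
        5 * x1 + x2 >= 10         # chemical A
        and 2 * x1 + 2 * x2 >= 12  # chemical B
        and x1 + 4 * x2 >= 12      # chemical C
    )

def cost(x1, x2):
    """Objective function Z, in units of INR 50."""
    return 3 * x1 + 2 * x2

LIMIT = 20  # assumed upper bound on each quantity searched

candidates = [
    (x1, x2)
    for x1 in range(LIMIT + 1)
    for x2 in range(LIMIT + 1)
    if meets_requirements(x1, x2)
]
best = min(candidates, key=lambda plan: cost(*plan))

print("cheapest whole-number purchase:", best)
print("Z =", cost(*best), "i.e. INR", 50 * cost(*best))
```

Enumeration like this is only practical for very small problems; the graphical and simplex methods are the systematic approaches.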

9.3 GRAPHICAL METHOD


The steps in solving an LP problem graphically are introduced below. We
will apply these steps to a simple LP problem.


Step 1 Formulate the LP Problem


Formulation refers to translating the real-world problem into a format of
mathematical equations that represent the objective function and the constraint set. Often, data gathering, problem definition, and problem formulation are the most important steps when using any tool.

Figure 9.1: Graphical Method.


A thorough understanding of the problem is necessary in order to formulate it correctly.

Step 2 Construct a Graph and Plot the Constraint Lines


Constraint lines represent the limitations on available resources. Usually,
constraint lines are drawn by connecting the horizontal and vertical intercepts found from each constraint equation.

Step 3 Determine the Valid Side of Each Constraint Line


The simplest way to start is to plug in the coordinates of the origin (0, 0) and
see whether this point satisfies the constraint. If it does, then all points on
the origin side of the line are feasible, and all points on the other side of the
line are infeasible. If (0, 0) does not satisfy the constraint, then all points on
the other side and away from the origin are feasible (valid), and all points
on the origin side of the constraint line are infeasible (invalid). There are
two exceptions.
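The origin test in Step 3 can be sketched in a few lines. The function name and the string encoding of the inequality direction are assumptions made for this illustration only.

```python
def origin_side_is_feasible(a, b, rhs, sense):
    """Test the origin (0, 0) against the constraint a*X1 + b*X2 <sense> rhs.

    Returns True when the origin satisfies the constraint, so the origin
    side of the line is the valid side. If rhs == 0 the origin lies on the
    line itself and another test point is needed to tell the sides apart.
    """
    lhs = a * 0 + b * 0  # value of the left-hand side at the origin
    if sense == "<=":
        return lhs <= rhs
    if sense == ">=":
        return lhs >= rhs
    raise ValueError("sense must be '<=' or '>='")

# A "<=" constraint with a positive right-hand side keeps the origin side.
print(origin_side_is_feasible(2, 1, 1000, "<="))
# A ">=" constraint with a positive right-hand side excludes the origin side.
print(origin_side_is_feasible(36, 6, 108, ">="))
```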

Step 4 Identify the Feasible Solution Region


The feasible solution region represents the area on the graph that is valid
for all constraints. Choosing any point in this area will result in a valid solution.


Step 5 Plot two Objective Function Lines to Determine the Direction of Improvement

Improvement is in the direction of greater value when the objective is to
maximize the objective function, and is in the direction of lesser value when
the objective is to minimize the objective function. The objective function
lines do not have to include any of the feasible regions to determine the
desirable direction to move.

Step 6 Find the Most Attractive Corner


If an optimal solution exists, at least one corner of the feasible region is optimal. The most attractive corner is the
last point in the feasible solution region touched by a line that is parallel to
the two objective function lines drawn in step 5 above. When more than one
corner corresponds to an optimal solution, each corner and all points along
the line connecting the corners correspond to optimal solutions.

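Step 7 Determine the Value of the Objective Function for the Optimal Solution

Substitute the coordinates of the most attractive corner into the objective function to obtain its optimal value.
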
Example
Solve the following LPP by the graphical method.

Maximize Z = 5X1 + 3X2
subject to the constraints
2X1 + X2 ≤ 1,000
X1 ≤ 400
X2 ≤ 700
X1, X2 ≥ 0

Solution
To plot the first constraint, 2X1 + X2 ≤ 1,000, we set 2X1 + X2 = 1,000.

When X1 = 0 in the above equation, we get
2(0) + X2 = 1,000
X2 = 1,000

Similarly, when X2 = 0 in the above equation, we get
2X1 + 0 = 1,000
X1 = 1,000 / 2 = 500

To plot the second constraint, X1 ≤ 400, we set X1 = 400.

To plot the third constraint, X2 ≤ 700, we set X2 = 700.

Figure 9.2: The constraints plotted on a graph.

Point   X1    X2    Z = 5X1 + 3X2
A       0     700   Z = 5(0) + 3(700) = 2,100
B       150   700   Z = 5(150) + 3(700) = 2,850* (maximum)
C       400   200   Z = 5(400) + 3(200) = 2,600
D       400   0     Z = 5(400) + 3(0) = 2,000

The maximum profit is at point B, where X1 = 150 and X2 = 700:
Z = 2,850
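The corner-point search used in this example can be checked with a short script. This is only a sketch: it assumes the constraints are the "less than or equal to" inequalities 2X1 + X2 ≤ 1,000, X1 ≤ 400, X2 ≤ 700 with X1, X2 ≥ 0, and the helper names are hypothetical. It intersects every pair of boundary lines, keeps the feasible intersections (the corners), and evaluates Z at each.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, rhs), meaning a*X1 + b*X2 <= rhs.
# Non-negativity is written as -X1 <= 0 and -X2 <= 0.
constraints = [
    (2, 1, 1000),
    (1, 0, 400),
    (0, 1, 700),
    (-1, 0, 0),
    (0, -1, 0),
]

def intersection(c1, c2):
    """Intersect two boundary lines by Cramer's rule; None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (Fraction(r1 * b2 - r2 * b1, det),
            Fraction(a1 * r2 - a2 * r1, det))

def is_feasible(point):
    """True when the point satisfies every constraint."""
    x1, x2 = point
    return all(a * x1 + b * x2 <= rhs for a, b, rhs in constraints)

def objective(point):
    """Z = 5*X1 + 3*X2."""
    return 5 * point[0] + 3 * point[1]

# Corners are the feasible intersections of pairs of boundary lines.
corners = set()
for c1, c2 in combinations(constraints, 2):
    p = intersection(c1, c2)
    if p is not None and is_feasible(p):
        corners.add(p)

best = max(corners, key=objective)
for corner in sorted(corners):
    print(tuple(int(v) for v in corner), int(objective(corner)))
print("best corner:", tuple(int(v) for v in best), "Z =", int(objective(best)))
```

The script also finds the corner at the origin, which the table omits because Z = 0 there.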
Example
Solve the following LPP by the graphical method.
Maximize Z = 400X1 + 200X2
subject to the constraints
18X1 + 3X2 ≤ 800
9X1 + 4X2 ≤ 600
X2 ≤ 150
X1, X2 ≥ 0

Solution
To plot the first constraint, 18X1 + 3X2 ≤ 800, we set 18X1 + 3X2 = 800.
When X1 = 0 in the above equation, we get
18(0) + 3X2 = 800
X2 = 800 / 3 = 266.67
Similarly, when X2 = 0 in the above equation, we get
18X1 + 3(0) = 800
X1 = 800 / 18 = 44.44

To plot the second constraint, 9X1 + 4X2 ≤ 600, we set 9X1 + 4X2 = 600.
When X1 = 0 in the above equation, we get
9(0) + 4X2 = 600
X2 = 600 / 4 = 150
Similarly, when X2 = 0 in the above equation, we get
9X1 + 4(0) = 600
X1 = 600 / 9 = 66.67

To plot the third constraint, X2 ≤ 150, we set X2 = 150.

Figure 9.3: The constraints plotted on a graph.

Point   X1      X2    Z = 400X1 + 200X2
A       0       150   Z = 400(0) + 200(150) = 30,000* (maximum)
B       31.11   80    Z = 400(31.11) + 200(80) ≈ 28,444.4
C       44.44   0     Z = 400(44.44) + 200(0) ≈ 17,777.8

The maximum profit is at point A, where X1 = 0 and X2 = 150:
Z = 30,000
Example
Solve the following LPP by the graphical method.
Minimize Z = 20X1 + 40X2
subject to the constraints
36X1 + 6X2 ≥ 108
3X1 + 12X2 ≥ 36
20X1 + 10X2 ≥ 100
X1, X2 ≥ 0

Solution
To plot the first constraint, 36X1 + 6X2 ≥ 108, we set 36X1 + 6X2 = 108.
When X1 = 0 in the above equation, we get
36(0) + 6X2 = 108
X2 = 108 / 6 = 18
Similarly, when X2 = 0 in the above equation, we get
36X1 + 6(0) = 108
X1 = 108 / 36 = 3

To plot the second constraint, 3X1 + 12X2 ≥ 36, we set 3X1 + 12X2 = 36.
When X1 = 0 in the above equation, we get
3(0) + 12X2 = 36
X2 = 36 / 12 = 3
Similarly, when X2 = 0 in the above equation, we get
3X1 + 12(0) = 36
X1 = 36 / 3 = 12

To plot the third constraint, 20X1 + 10X2 ≥ 100, we set 20X1 + 10X2 = 100.
When X1 = 0 in the above equation, we get
20(0) + 10X2 = 100
X2 = 100 / 10 = 10
Similarly, when X2 = 0 in the above equation, we get
20X1 + 10(0) = 100
X1 = 100 / 20 = 5

Figure 9.4: The constraints plotted on a graph.

Point   X1   X2   Z = 20X1 + 40X2
A       0    18   Z = 20(0) + 40(18) = 720
B       2    6    Z = 20(2) + 40(6) = 280
C       4    2    Z = 20(4) + 40(2) = 160* (minimum)
D       12   0    Z = 20(12) + 40(0) = 240

The minimum cost is at point C, where X1 = 4 and X2 = 2:
Z = 160
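The table for this minimization example can be double-checked numerically. The sketch below assumes the three constraints are "greater than or equal to" inequalities (the natural reading for a cost minimization) and uses the corner labels A to D in the order the rows appear; both are assumptions of this illustration. For each corner it confirms feasibility, reports which constraints are binding there, and computes the cost.

```python
# Constraints (a, b, rhs) meaning a*X1 + b*X2 >= rhs (assumed direction).
constraints = [(36, 6, 108), (3, 12, 36), (20, 10, 100)]

# Corner points as listed in the table (labels assumed).
corners = {"A": (0, 18), "B": (2, 6), "C": (4, 2), "D": (12, 0)}

def cost(x1, x2):
    """Objective Z = 20*X1 + 40*X2."""
    return 20 * x1 + 40 * x2

def is_feasible(x1, x2):
    """True when the point satisfies all constraints and non-negativity."""
    return x1 >= 0 and x2 >= 0 and all(
        a * x1 + b * x2 >= rhs for a, b, rhs in constraints
    )

def binding(x1, x2):
    """1-based indices of the constraints that hold with equality."""
    return [i + 1 for i, (a, b, rhs) in enumerate(constraints)
            if a * x1 + b * x2 == rhs]

for label, point in corners.items():
    print(label, point, "feasible:", is_feasible(*point),
          "binding:", binding(*point), "Z =", cost(*point))

cheapest = min(corners, key=lambda label: cost(*corners[label]))
print("minimum cost at point", cheapest)
```

Points B and C each lie on two constraint lines at once, which is how their coordinates are obtained: B solves the first and third equations together, and C solves the second and third.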

9.4 SIMPLEX METHOD


The subject of linear programming, sometimes called linear optimization,
concerns itself with the following problem: for N independent variables
$x_1, \ldots, x_N$, maximize the function

$z = a_{01} x_1 + a_{02} x_2 + \cdots + a_{0N} x_N$   (i)

subject to the primary constraints

$x_1 \ge 0,\ x_2 \ge 0,\ \ldots,\ x_N \ge 0$   (ii)

and simultaneously subject to $M = m_1 + m_2 + m_3$ additional constraints, $m_1$ of them of the form

$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{iN} x_N \le b_i \quad (b_i \ge 0), \qquad i = 1, \ldots, m_1$   (iii)

$m_2$ of them of the form

$a_{j1} x_1 + a_{j2} x_2 + \cdots + a_{jN} x_N \ge b_j \quad (b_j \ge 0), \qquad j = m_1 + 1, \ldots, m_1 + m_2$   (iv)

and $m_3$ of them of the form

$a_{k1} x_1 + a_{k2} x_2 + \cdots + a_{kN} x_N = b_k \quad (b_k \ge 0), \qquad k = m_1 + m_2 + 1, \ldots, m_1 + m_2 + m_3$   (v)

The various $a_{ij}$'s can have either sign, or be zero. The fact that the b's
must all be nonnegative (as indicated by the final inequality in the above
three equations) is a matter of convention only, since you can multiply any
contrary inequality by −1. There is no particular significance in the number
of constraints M being less than, equal to, or greater than the number of
unknowns N.

A set of values $x_1, \ldots, x_N$ that satisfies the constraints (ii)–(v) is called
a feasible vector. The function that we are trying to maximize is called the
objective function. The feasible vector that maximizes the objective function is called the optimal feasible vector. An optimal feasible vector can
fail to exist for two distinct reasons: (i) there are no feasible vectors, i.e., the
given constraints are incompatible, or (ii) there is no maximum, i.e., there
is a direction in N-space where one or more of the variables can be taken to
infinity while still satisfying the constraints, giving an unbounded value for
the objective function.
The subject of linear programming is surrounded by notational and
terminological thickets. Both of these thorny defenses are lovingly cultivated by a coterie of stern acolytes who have devoted themselves to the field.
Actually, the basic ideas of linear programming are quite simple. Avoiding
the shrubbery, we want to teach you the basics by means of a couple of specific examples; it should then be quite obvious how to generalize.
Why is linear programming so important?
(i) Because nonnegativity is the usual constraint on any variable $x_i$ that
represents the tangible amount of some physical commodity, like
guns, butter, dollars, units of vitamin E, food calories, kilowatt
hours, mass, etc.
(ii) Because one is often interested in additive (linear) limitations or
bounds imposed by man or nature: minimum nutritional requirement, maximum affordable cost, maximum on available labor or
capital, minimum tolerable level of voter approval, etc.
(iii) Because the function that one wants to optimize may be linear, or
else may at least be approximated by a linear function - since that is
the problem that linear programming can solve.


Here is a specific example of a problem in linear programming, which
has N = 4, m1 = 2, m2 = m3 = 1, hence M = 4:

Maximize $z = x_1 + x_2 + 3x_3 - \tfrac{1}{2} x_4$   (vi)

with all the x's nonnegative and also with

$x_1 + 2x_3 \le 740$
$2x_2 - 7x_4 \le 0$
$x_2 - x_3 + 2x_4 \ge \tfrac{1}{2}$
$x_1 + x_2 + x_3 + x_4 = 9$   (vii)

The answer turns out to be (to two decimals) x1 = 0, x2 = 3.33, x3 = 4.73, x4 = 0.95.
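The quoted answer can be checked against the constraints with exact arithmetic. This sketch assumes the example reads: maximize z = x1 + x2 + 3x3 − x4/2 subject to x1 + 2x3 ≤ 740, 2x2 − 7x4 ≤ 0, x2 − x3 + 2x4 ≥ 1/2 and x1 + x2 + x3 + x4 = 9. The exact fractions below are not from the text; they are obtained here by assuming x1 = 0 and that the last three constraints hold with equality, which agrees with the rounded values quoted.

```python
from fractions import Fraction as Fr

# Candidate solution: exact values behind the rounded answer (assumption).
x1, x2, x3, x4 = Fr(0), Fr(133, 40), Fr(189, 40), Fr(19, 20)

checks = {
    "all variables nonnegative": min(x1, x2, x3, x4) >= 0,
    "x1 + 2*x3 <= 740": x1 + 2 * x3 <= 740,
    "2*x2 - 7*x4 <= 0": 2 * x2 - 7 * x4 <= 0,
    "x2 - x3 + 2*x4 >= 1/2": x2 - x3 + 2 * x4 >= Fr(1, 2),
    "x1 + x2 + x3 + x4 == 9": x1 + x2 + x3 + x4 == 9,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")

# Objective value at this point.
z = x1 + x2 + 3 * x3 - x4 / 2
print("z =", float(z))
```

This confirms only that the point is feasible and gives its objective value; showing that no feasible point does better is the job of the simplex method itself.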


Figure 9.5 summarizes some of the terminology thus far.

Figure 9.5: Basic concepts of linear programming.

9.5 TWO PHASE METHOD


Artificial variables and auxiliary problem
Consider the LP

(P)  max $c^T x$
     subject to (S.T.)  $Ax = b$,  $x \ge 0$,

where the right-hand side satisfies $b \ge 0$. Adding one new variable $u_i \ge 0$ to each of the m equations gives the constraints $Ax + u = b$, $x, u \ge 0$.

The basis having $u_1, u_2, \ldots, u_m$ as basic variables is feasible; it determines
the bfs $(x^*, u^*) = (0, b)$.

These added variables are called artificial variables.
This new LP problem is not equivalent to (P).
BUT, if we can force all artificial variables to be zero, then the resulting
solution gives a feasible solution to (P).

So, we change the objective function:

(A)  max $-\sum_{i=1}^{m} u_i$
     S.T.  $Ax + u = b$,  $x, u \ge 0$

This is called an auxiliary problem.


Example
Given the LP problem

(P)  Max (z =) $x_1 - x_3 + 2x_4$
     S.T.  $x_1 + 2x_2 + x_4 = 4$
           $-x_2 + x_3 - x_4 = -1$
           $x_1, x_2, x_3, x_4 \ge 0$

First we make sure the right-hand side is nonnegative, multiplying the second equation by −1:

(P)  Max (z =) $x_1 - x_3 + 2x_4$
     S.T.  $x_1 + 2x_2 + x_4 = 4$
           $x_2 - x_3 + x_4 = 1$
           $x_1, x_2, x_3, x_4 \ge 0$

Adding artificial variables u1, u2 gives the auxiliary problem

(A)  Max (w =) $-u_1 - u_2$
     S.T.  $x_1 + 2x_2 + x_4 + u_1 = 4$
           $x_2 - x_3 + x_4 + u_2 = 1$
           $x_1, x_2, x_3, x_4, u_1, u_2 \ge 0$
Any feasible solution of (A) has objective value $\le 0$.

If $(x_1^*, x_2^*, x_3^*, x_4^*)^T$ is feasible for (P), then $(x_1^*, x_2^*, x_3^*, x_4^*, 0, 0)^T$ is feasible for (A); since its objective value is 0, it is optimal for (A) with value 0.

Conversely, if $(x_1^*, x_2^*, x_3^*, x_4^*, u_1^*, u_2^*)^T$ is optimal for (A) with value 0, then $u_1^* = u_2^* = 0$, so $(x_1^*, x_2^*, x_3^*, x_4^*)^T$ is feasible for (P).

So
(P) has a feasible solution if and only if (A) has optimal value 0.

In general, the auxiliary problem is never unbounded; its optimal value is at most 0, and it equals 0 exactly when (P) has a feasible solution.
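The role of the artificial variables can be illustrated numerically. The sketch below assumes the example's equality constraints read x1 + 2x2 + x4 = 4 and x2 − x3 + x4 = 1 after the sign change; the function names and the candidate point (2, 1, 0, 0) are hypothetical choices made for this illustration. For a given x it computes the artificial values u = b − Ax and the auxiliary objective w = −(u1 + u2).

```python
# Constraint data for the example (assumed reading of the equations).
A = [
    [1, 2, 0, 1],   # x1 + 2*x2 + x4 = 4
    [0, 1, -1, 1],  # x2 - x3 + x4 = 1
]
b = [4, 1]

def artificial_values(x):
    """Return u = b - A x, the values needed so that A x + u = b."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(A, b)]

def auxiliary_objective(x):
    """Return w = -(u1 + u2); raise if (x, u) is not feasible for (A)."""
    u = artificial_values(x)
    if min(x) < 0 or min(u) < 0:
        raise ValueError("(x, u) violates the non-negativity constraints")
    return -sum(u)

# Starting basic feasible solution of (A): x = 0, u = b.
start = [0, 0, 0, 0]
print(artificial_values(start), auxiliary_objective(start))

# A hypothetical point that satisfies both equations of (P) exactly.
candidate = [2, 1, 0, 0]
print(artificial_values(candidate), auxiliary_objective(candidate))
```

At the starting solution w is negative; at a point that is feasible for (P) every artificial variable is zero and w reaches its largest possible value, 0.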

9.5.1 Elementary Ideas about Duality


Consider the general linear program

Minimize $c^T x$
Subject to $Ax \ge b$, $x \ge 0$,

where $x = (x_1, \ldots, x_n)^T$ is a vector of variables, $A = (A_{ij})$ is an $m \times n$ matrix, $c = (c_1, \ldots, c_n)^T$ and $b = (b_1, \ldots, b_m)^T$. Recall that the dual of this linear program is
given by

Maximize $b^T y$
Subject to $A^T y \le c$, $y \ge 0$,

where $y = (y_1, \ldots, y_m)^T$ is a vector of variables. The variables in y
are in one-to-one correspondence with the constraints of the primal LP, and
the variables in x are in one-to-one correspondence with the constraints of
the dual LP. The dual of the dual of an LP is again the original LP.
Theorem 1 (Weak Duality Theorem): If x and y are feasible solutions to the primal and dual respectively, then $c^T x \ge b^T y$.

Proof: We have

$$c^T x = \sum_{i=1}^{n} c_i x_i \ge \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} A_{ji} y_j \Big) x_i = \sum_{j=1}^{m} \Big( \sum_{i=1}^{n} A_{ji} x_i \Big) y_j \ge \sum_{j=1}^{m} b_j y_j = b^T y,$$

where the first inequality holds by feasibility of y (together with $x \ge 0$) and the second by feasibility of x (together with $y \ge 0$).


We also stated, but did not prove, the following result, which
establishes an intimate connection between the primal and dual LPs.
Theorem 2 (Strong Duality Theorem): The optimal objective value of the primal is finite if and only if the
optimal objective value of the dual is finite, and in this case the optimal
objective values are equal.
If x and y are optimal solutions to the primal and dual LPs, then Theorem 2 tells us that $c^T x = b^T y$, and it follows that both inequalities in the
proof of Theorem 1 must hold with equality. That is,

$$\sum_{i=1}^{n} c_i x_i = \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} A_{ji} y_j \Big) x_i$$

and

$$\sum_{j=1}^{m} b_j y_j = \sum_{j=1}^{m} \Big( \sum_{i=1}^{n} A_{ji} x_i \Big) y_j.$$

Let us consider the first equation. Since $x_i \ge 0$ and $\sum_{j=1}^{m} A_{ji} y_j \le c_i$ for all i, the
only way the first equation can hold is if $c_i x_i = \big( \sum_{j=1}^{m} A_{ji} y_j \big) x_i$ for all i. This
is certainly true if $c_i = \sum_{j=1}^{m} A_{ji} y_j$ for all i, i.e. all dual constraints are tight.
However, this is not necessary; even if $c_i > \sum_{j=1}^{m} A_{ji} y_j$ for some i, we can
still have $c_i x_i = \big( \sum_{j=1}^{m} A_{ji} y_j \big) x_i$ provided $x_i = 0$. Similarly, for all j either $y_j = 0$
or $b_j = \sum_{i=1}^{n} A_{ji} x_i$. Conversely, if x and y are feasible solutions to the primal
and dual respectively such that these conditions hold, then equality holds
throughout the proof of Theorem 1, implying that x and y have the same
objective value and are thus both optimal. This proves the following result.
Lemma 3: Let x and y be feasible solutions to the primal and dual
respectively. Then x and y are both optimal if and only if the
following two conditions hold:

Primal complementary slackness conditions: For $i = 1, \ldots, n$, either $x_i = 0$ or $\sum_{j=1}^{m} A_{ji} y_j = c_i$.

Dual complementary slackness conditions: For $j = 1, \ldots, m$, either $y_j = 0$ or $\sum_{i=1}^{n} A_{ji} x_i = b_j$.
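Weak duality and the complementary slackness conditions can be checked on a small instance. The sketch below reuses the data of the graphical minimization example from Section 9.3 (minimize 20X1 + 40X2 subject to three "greater than or equal to" constraints, as assumed there). The dual solution y = (0, 20/7, 4/7) is not given in the text; it was worked out for this illustration by setting y1 = 0 and making both dual constraints tight. All function names are hypothetical.

```python
from fractions import Fraction as Fr

# Primal: minimize c.x subject to A x >= b, x >= 0.
A = [[36, 6], [3, 12], [20, 10]]
b = [108, 36, 100]
c = [20, 40]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def column(i):
    """Column i of A, i.e. the coefficients A[j][i] for j = 1..m."""
    return [row[i] for row in A]

def primal_feasible(x):
    return min(x) >= 0 and all(dot(row, x) >= bj for row, bj in zip(A, b))

def dual_feasible(y):
    return min(y) >= 0 and all(dot(column(i), y) <= c[i] for i in range(len(c)))

def complementary_slackness(x, y):
    """Check both conditions of the lemma for feasible x and y."""
    primal_cs = all(x[i] == 0 or dot(column(i), y) == c[i] for i in range(len(c)))
    dual_cs = all(y[j] == 0 or dot(A[j], x) == b[j] for j in range(len(b)))
    return primal_cs and dual_cs

# A feasible but non-optimal pair: weak duality gives c.x >= b.y.
x, y = [12, 0], [0, 0, 1]
print(dot(c, x), ">=", dot(b, y), complementary_slackness(x, y))

# The optimal pair: equal objective values, both conditions hold.
x_opt, y_opt = [4, 2], [Fr(0), Fr(20, 7), Fr(4, 7)]
print(dot(c, x_opt), "==", dot(b, y_opt), complementary_slackness(x_opt, y_opt))
```

For the first pair the objective values differ and complementary slackness fails; for the second pair the values coincide at 160 and both conditions hold, as the lemma predicts.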

This suggests an approach for finding optimal solutions to the primal
and dual LPs: search for feasible solutions satisfying both complementary
slackness conditions. We can use this idea to obtain approximation algorithms by searching for feasible solutions satisfying a relaxed version of
the complementary slackness conditions. We say that x and y satisfy the
α-approximate dual complementary slackness conditions if for $j = 1, \ldots, m$,
either $y_j = 0$ or $\sum_{i=1}^{n} A_{ji} x_i \le \alpha b_j$; that is, when $y_j > 0$ we do not require the
corresponding primal constraint to be tight, but it should not be too far
from being tight. The following lemma is useful for proving approximation
guarantees.

Lemma 4: Suppose x and y are feasible solutions to the primal and dual
respectively, satisfying the primal complementary slackness conditions and
the α-approximate dual complementary slackness conditions. Then x is an α-approximate solution to the primal LP.
Proof: We have

$c^T x = (A^T y)^T x = y^T A x \le \alpha \, y^T b,$

where the first equality follows from the primal complementary slackness
conditions and the inequality follows from the α-approximate dual complementary slackness conditions. The lemma follows since $y^T b$ is at most
the optimal objective value of the primal LP, by weak duality.
Our strategy is to construct feasible solutions x and y such that x is integral
and the primal complementary slackness conditions and α-approximate
dual complementary slackness conditions are satisfied. We do so without
actually solving the LP, which makes this approach appealing from a practical standpoint. Lemma 4 then guarantees that x is an α-approximate
solution to the LP relaxation and hence an α-approximate solution to the
problem at hand. This is the main idea behind the primal-dual method.
However, achieving these constraints simultaneously is not always
possible. For example, our primal-dual algorithm for the Steiner Forest
problem does not satisfy the 2-approximate complementary slackness conditions for every j; yet we can show that these conditions are satisfied on
average and this suffices to imply a 2-approximate solution.

9.6 MULTIPLE CHOICE QUESTIONS


1. What is the objective function (Z) to be maximized in this linear programming problem (where Z is total profit)?
(a) Z = 100X + 120Y
(b) Z = 120X + 100Y
(c) Z = 1500X + 1500Y
(d) Z = 2X + 3Y

2. Total profits are maximized when the objective function (as a straight line on a graph) is:
(a) Furthest from the origin irrespective of the feasible region
(b) Nearest to the origin and tangent to the feasible region
(c) Nearest to the origin irrespective of the feasible region
(d) Furthest from the origin and tangent to the feasible region

3. What is the equation of the labor constraint line for the welding department in this linear programme?
(a) 3X + 2Y = 1,500 hours
(b) 3X + 2Y = 550 hours
(c) 2X + 3Y = 1,500 hours
(d) 2X + 3Y = 550 hours

4. What is the equation of the labor constraint line for the assembly department in this linear programme?
(a) 2X + 2Y = 1,500 hours
(b) 1X + 1Y = 1,500 hours
(c) 3X + 2Y = 1,500 hours
(d) 1X + 1Y = 550 hours

5. What is the solution to this linear programming problem in terms of the respective quantities of X and Y to be produced if profits are to be maximized?
(a) X = 150, Y = 400
(b) X = 400, Y = 150
(c) X = 0, Y = 500
(d) X = 550, Y = 0

6. Which of the following is not an assumption of linear programming?
(a) Diminishing returns to the variable factors of production
(b) Prices of products remain the same no matter how high the consumer demand
(c) Prices of factor inputs remain the same no matter how high the firm demand
(d) Constant returns to the variable factors of production

7. Decision variables
(a) tell how much or how many of something to produce, invest, purchase, hire, etc.
(b) represent the values of the constraints.
(c) measure the objective function.
(d) must exist for each constraint.

8. Which of the following is not a part of every linear programming problem formulation?
(a) An objective function
(b) A set of constraints
(c) Non-negativity constraints
(d) A redundant constraint

9. Which of the following is not true about slack variables in a simplex tableau?
(a) They are used to convert ≤ constraint inequalities to equations
(b) They represent unused resources
(c) They require the addition of an artificial variable
(d) They yield no profit

10. Determining the most efficient allocation of people, machines, equipment, etc., is characteristic of the LP problem type known as
(a) Production scheduling
(b) Labor planning
(c) Assignment
(d) Blending

9.7 REVIEW QUESTIONS


1. Define the concepts of linear programming. Why are the assumptions of linear programming needed?
2. Describe the proportionality assumption of linear programming.
3. Explain the structure of a linear programming model.
4. What are the steps in the formulation of linear programming problems?
5. Explain the guidelines for formulating a linear programming model.
6. Explain the simplex method for linear programming problems with the help of an example.
7. Explain the elementary ideas about duality.
8. Write short notes on:
a. Feasible region
b. Optimal solution
9. Solve the following linear programming problem graphically:
Minimize cost = 4X1 + 5X2
Subject to:
X1 + 2X2 ≥ 80
3X1 + X2 ≥ 75
10. David Corporation makes three products, and it has three machines
available as resources as given in the following LP problem:
Maximize contribution = 4X1 + 4X2 + 7X3
Subject to:
1X1 + 7X2 + 4X3 ≤ 100 (hours on machine 1)
2X1 + 1X2 + 7X3 ≤ 110 (hours on machine 2)
8X1 + 4X2 + 1X3 ≤ 100 (hours on machine 3)
a. Determine the optimal solution using LP software.
b. Is there unused time available on any of the machines with the
optimal solution?
c. What would it be worth to the firm to make an additional hour
of time available on the third machine?
d. How much would the firm's profit increase if an extra 10 hours
of time were made available on the second machine at no extra
cost?

ANSWERS TO MULTIPLE CHOICE QUESTIONS

1. (a)   2. (d)   3. (c)   4. (c)   5. (a)   6. (a)   7. (a)   8. (d)   9. (c)   10. (c)

Chapter 10

Probability and Probability Distribution

Objectives
After studying this chapter, you will be able to:
Define probability
Describe the probability distribution
Understand conditional probability and Bayes' theorem

INTRODUCTION
Probability theory deals with situations in which there is an element of randomness or chance.
The mathematical theory of probability arose in consideration of games
of chance but is now widely used in far more practical and applied situations.
A random phenomenon is a situation in which we know what outcomes
could happen, but we do not know which particular outcome will happen.
For any random phenomenon, each attempt, or trial, generates an outcome. Something happens on each trial, and we call whatever happens the
outcome.
An experiment is a process of observation that leads to a single outcome
that cannot be predicted with certainty.

Learning Objectives
Compute probability in a situation where there are equally likely outcomes
Apply concepts to cards and dice
Compute the probability of two independent events both occurring
Compute the probability of either of two independent events occurring
Do problems that involve conditional probabilities
Compute the probability that in a room of N people, at least two share a birthday
Describe the gambler's fallacy

Probability of a Single Event


If we roll a six-sided die, there are six possible outcomes, and each of these
outcomes is equally likely. A six is as likely to come up as a three, and likewise for the other four sides of the die. What, then, is the probability that
a one will come up? Since there are six possible outcomes, the probability
is 1/6. What is the probability that either a one or a six will come up? The
two outcomes about which we are concerned are called favorable outcomes.
Given that all outcomes are equally likely, we can compute the probability
of a one or a six using the formula:

Probability = (Number of favorable outcomes) / (Number of possible equally likely outcomes)

Here there are two favorable outcomes out of six, so the probability of a one or a six is 2/6 = 1/3.

10.1 PROBABILITY
Probability is the branch of mathematics that studies the possible outcomes
of given events together with the outcomes' relative likelihoods and distributions. In common usage, the word "probability" is used to mean the
chance that a particular event (or set of events) will occur, expressed on a
linear scale from 0 (impossibility) to 1 (certainty), also expressed as a percentage between 0% and 100%. The analysis of events governed by probability is called statistics.
There are several competing interpretations of the actual meaning
of probabilities. Frequentists view probability simply as a measure of the
frequency of outcomes, while Bayesians treat probability more subjectively
as a statistical procedure that endeavours to estimate parameters of an underlying distribution based on the observed distribution.
A properly normalized function that assigns a probability density to
each possible outcome within some interval is called a probability density
function (or probability distribution function), and its cumulative value is
called a distribution function.
A variate is defined as the set of all random variables that obey a given
probabilistic law. It is common practice to denote a variate with a capital
letter, such as X. The set of all values that X can take is then called the range, denoted
$R_X$. Specific elements in the range of X are called quantiles and denoted x,
and the probability that a variate X assumes the element x is denoted P(X = x).
Probabilities are defined to obey certain assumptions, called the probability axioms. Let a sample space contain the union ($\cup$) of all possible
events $E_i$, so

$S \equiv \bigcup_{i=1}^{N} E_i,$

and let E and F denote subsets of S. Further, let F' = not-F be the complement
of F, so that

$F \cup F' = S.$

Then the set E can be written as

$E = E \cap S = E \cap (F \cup F') = (E \cap F) \cup (E \cap F'),$

where $\cap$ denotes the intersection. Then

$P(E) = P(E \cap F) + P(E \cap F') - P[(E \cap F) \cap (E \cap F')]$
$\phantom{P(E)} = P(E \cap F) + P(E \cap F') - P[(F \cap F') \cap (E \cap E)]$
$\phantom{P(E)} = P(E \cap F) + P(E \cap F') - P(\emptyset \cap E)$
$\phantom{P(E)} = P(E \cap F) + P(E \cap F') - P(\emptyset)$
$\phantom{P(E)} = P(E \cap F) + P(E \cap F'),$

where $\emptyset$ is the empty set.

Let P(E|F) denote the conditional probability of E given that F has already occurred. Then

$P(E) = P(E|F)P(F) + P(E|F')P(F')$
$\phantom{P(E)} = P(E|F)P(F) + P(E|F')[1 - P(F)]$

$P(A \cap B) = P(A)P(B|A) = P(B)P(A|B)$

$P(A' \cap B) = P(A')P(B|A')$

$P(E|F) = P(E \cap F) / P(F).$

The relationship

$P(A \cap B) = P(A)P(B)$

holds if A and B are independent events. A very important result states that

$P(E \cup F) = P(E) + P(F) - P(E \cap F),$

which can be generalized to

$$P\Big( \bigcup_{i=1}^{n} A_i \Big) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P\Big( \bigcap_{i=1}^{n} A_i \Big).$$

Key Vocabulary
Compound Event: A collection of more than one outcome for an experiment.
Probability: Probability is a measure or estimation of how likely it is that something will happen or that a statement is true.
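These identities can be verified on a small sample space. The sketch below uses one roll of a fair die with the events E = "even number" and F = "at most 4"; the events and names are chosen here purely for illustration and are not from the text.

```python
from fractions import Fraction

# Sample space for one roll of a fair die; all outcomes equally likely.
S = frozenset(range(1, 7))

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

E = frozenset({2, 4, 6})      # even number
F = frozenset({1, 2, 3, 4})   # at most 4
F_c = S - F                   # complement of F

# Splitting E over F and its complement.
split = P(E & F) + P(E & F_c)

# Conditional probabilities and the law of total probability.
P_E_given_F = P(E & F) / P(F)
P_E_given_Fc = P(E & F_c) / P(F_c)
total = P_E_given_F * P(F) + P_E_given_Fc * P(F_c)

# Addition rule for the union.
union = P(E) + P(F) - P(E & F)

print("P(E) =", P(E), "split:", split, "total probability:", total)
print("P(E or F) =", P(E | F), "addition rule:", union)
print("independent:", P(E & F) == P(E) * P(F))
```

In this particular case E and F happen to be independent, so the product rule P(E ∩ F) = P(E)P(F) also holds.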

10.2 PROBABILITY DISTRIBUTION


Probability distributions are a fundamental concept in statistics. They are
used both on a theoretical level and a practical level.
Some practical uses of probability distributions are:
To calculate confidence intervals for parameters and to calculate
critical regions for hypothesis tests.
For univariate data, it is often useful to determine a reasonable distributional model for the data.
Statistical intervals and hypothesis tests are often based on specific
distributional assumptions. Before computing an interval or test
based on a distributional assumption, we need to verify that the
assumption is justified for the given data set. The distribution does
not need to be the best-fitting distribution for the data, but an adequate enough model so that the statistical technique yields valid
conclusions.
Simulation studies with random numbers generated from using a
specific probability distribution are often needed.
A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence.

Probability Distribution Prerequisites


To understand probability distributions, it is important to understand variables, random variables and some notation.
A variable is a symbol (A, B, x, y, etc.) that can take on any of a
specified set of values.
When the value of a variable is the outcome of a statistical experiment, that variable is a random variable.


Generally, statisticians use a capital letter to represent a random variable and a lower-case letter, to represent one of its values. For example,
X represents the random variable X.
P(X) represents the probability of X.
P(X = x) refers to the probability that the random variable X is equal
to a particular value, denoted by x. As an example, P(X = 1) refers to
the probability that the random variable X is equal to 1.
An example will make clear the relationship between random variables and probability distributions. Suppose we flip a coin two times. This
simple statistical experiment can have four possible outcomes: HH, HT,
TH, and TT. Now, let the variable X represent the number of heads that
result from this experiment. The variable X can take on the values 0, 1, or 2.
In this example, X is a random variable, because its value is determined by
the outcome of a statistical experiment.
A probability distribution is a table or an equation that links each
outcome of a statistical experiment with its probability of occurrence.
Consider the coin flip experiment described above. Table 10.1, which associates each outcome with its probability, is an example of a probability
distribution.

Key Vocabulary
Simple Event: An event that includes one and only one of the (final) outcomes for an experiment and is denoted by Ei.

Table 10.1: Probability Distribution

Number of heads   Probability
0                 0.25
1                 0.50
2                 0.25

Table 10.1 represents the probability distribution of the random
variable X.

Cumulative Probability Distributions


A cumulative probability refers to the probability that the value of a random variable falls within a specified range.
Let us return to the coin flip experiment. If we flip a coin two times, we
might ask: What is the probability that the coin flips would result in one or
fewer heads? The answer would be a cumulative probability. It would be
the probability that the coin flip experiment results in zero heads plus the
probability that the experiment results in one head.
P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75
Like a probability distribution, a cumulative probability distribution
can be represented by a table or an equation. In Table 10.2, the cumulative probability refers to the probability that the random variable X is less
than or equal to x.

Key Vocabulary
Variate: It is a generalization of the concept of a random variable that is defined without reference to a particular type of probabilistic experiment. It is defined as the set of all random variables that obey a given probabilistic law.

Table 10.2: Cumulative probability distribution

Number of heads: x   Probability: P(X = x)   Cumulative Probability: P(X ≤ x)
0                    0.25                    0.25
1                    0.50                    0.75
2                    0.25                    1.00
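Both tables for the two-coin experiment can be reproduced by enumerating the four equally likely outcomes. This is a minimal sketch; the variable names `pmf` and `cdf` are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution P(X = x).
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

# Cumulative distribution P(X <= x): running total of the probabilities.
cdf = {}
running = Fraction(0)
for x, p in pmf.items():
    running += p
    cdf[x] = running

for x in pmf:
    print(x, float(pmf[x]), float(cdf[x]))
```

The last cumulative value is always 1, since X must take one of its possible values.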

Uniform Probability Distribution


The simplest probability distribution occurs when all of the values of a random variable occur with equal probability. This probability distribution is
called the uniform distribution.
Uniform Distribution: Suppose the random variable X can assume k
different values. Suppose also that the P(X = xk) is constant. Then,
P(X = xk) = 1/k
Example:
Suppose a die is tossed. What is the probability that the die will land
on 6?
Solution:
When a die is tossed, there are 6 possible outcomes represented by: S
= {1, 2, 3, 4, 5, 6}. Let the random variable X be the number that comes up; each
of its six values is equally likely to occur. Thus, we have a uniform distribution.
Therefore, P(X = 6) = 1/6.
Example:
Suppose we repeat the die-tossing experiment described in the example above. This time, we ask: what is the probability that the die will land on a
number that is smaller than 5?
Solution
When a die is tossed, there are 6 possible outcomes represented by: S
= {1, 2, 3, 4, 5, 6}. Each possible outcome is equally likely to occur. Thus, we
have a uniform distribution.
This problem involves a cumulative probability. The probability that
the die will land on a number smaller than 5 is equal to:
P(X < 5) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = 1/6 + 1/6 + 1/6
+ 1/6 = 2/3

Complementary Events
The complement of event E, denoted by E', is the set of outcomes in S
that are not included in event E.
Example
E = Head; E' = Tail
E = Spades; E' = not a spade
E = Account in error; E' = Account not in error


Simple Event: An event that includes one and only one of the (final)
outcomes for an experiment and is denoted by Ei.
Compound Event: A collection of more than one outcome for an experiment.

Example of Simple and Compound Events


Suppose we randomly select two people from the members of a club. Look
at whether the person selected each time is a man or a woman. Write all the
outcomes for this experiment.
Probability is the mathematical framework for describing (modeling)
uncertainty. It is a numerical measure of the likelihood that a specific event
will occur.

Two or more outcomes (or events) that have the same probability of
occurrence are said to be equally likely outcomes (or events). Probability is
denoted by P.
P(E) denotes the probability of the event E.
What is the probability of getting a spade when a card is drawn from a
well-shuffled deck?
Let E be the event that the card drawn is a spade.
Then we denote the probability of getting a spade by P(E) or simply
P(spade).

Key Vocabulary
Vertical Bar: The vertical bar (|) is a character with various uses in mathematics, where it can be used to represent absolute value.

10.2.1 Calculating Probabilities


1. Classical Approach:


P(D|P) = (0.99 × 0.001) / (0.99 × 0.001 + 0.05 × 0.999) ≈ 0.019

Assumes all outcomes of the experiment are equally likely:

P(E) = (Number of outcomes favorable to E) / (Total number of outcomes for the experiment)

Example:

Find the probability of obtaining an even number in one roll of a fair die.

E = even number

P(E) = (Number of outcomes included in E) / (Total number of outcomes) = 3/6 = 0.50

2. Relative Frequency Approach

The probability of E may be interpreted as the relative frequency of the
occurrence of E over a long series of experiments.

P(E) = (Number of times E occurs) / (Number of times experiment is repeated)

Example:

Roll a die a large number of times and observe the number of times an
even number occurs.

E = even number

P(E) = P(even number) = (Number of times an even number is observed) / (Number of times the die is rolled)

For example, if an even number is observed 500 times in 1,000 rolls, then P(E) = 500/1,000 = 1/2.

Law of Large Numbers: If an experiment is repeated again and again,


the probability of an event obtained from the relative frequency approaches
the actual or theoretical probability.
Probabilities must be between 0 and 1, inclusive.
A probability of 0 indicates impossibility.
A probability of 1 indicates certainty.
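The law of large numbers can be seen in a small simulation of the die-rolling example. This is a sketch only: the function name and the fixed seed are choices made here so that the run is repeatable, and the exact frequencies depend on the random number generator.

```python
import random

def relative_frequency_of_even(num_rolls, seed=0):
    """Roll a simulated fair die num_rolls times and return the
    fraction of rolls that show an even number."""
    rng = random.Random(seed)  # seeded generator for repeatable runs
    even_count = sum(1 for _ in range(num_rolls)
                     if rng.randint(1, 6) % 2 == 0)
    return even_count / num_rolls

# The relative frequency should settle near the theoretical value 0.5
# as the number of rolls grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_even(n))
```

With few rolls the relative frequency can be far from 1/2; with many rolls it stays close to it, although any single run still fluctuates slightly.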

Rounding rule for Probabilities


Probabilities should be expressed as reduced fractions or rounded to two or
three decimal places. When the probability of an event is an extremely small
decimal, it is permissible to round the decimal to the first nonzero digit after
the decimal point.
Examples

1. Toss a coin. What is the probability of getting a head?

2. Select a card from a deck. What is the probability that it is a diamond?

3. Toss two coins. What is the probability of getting at least one head?

4. A group of four firms consists of two firms which have serious financial problems and two which are in good financial condition. A buyer chooses two of these firms at random to supply a particular product. What is the probability that at least one of the selected firms is in poor condition? (E = at least one is in poor financial condition.)

5. Choose 2 cards from a deck without replacement. What is the probability that both are diamonds?
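The five examples above can be worked by direct counting. The sketch below is one possible way to do so, assuming a standard 52-card deck with 13 diamonds, fair coins, and a buyer who picks each pair of firms with equal probability; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations, product

# 1. One toss of a fair coin: one favorable outcome out of two.
p_head = Fraction(1, 2)

# 2. One card from a 52-card deck: 13 of the cards are diamonds.
p_diamond = Fraction(13, 52)

# 3. Two coins: enumerate HH, HT, TH, TT and count those with a head.
tosses = list(product("HT", repeat=2))
p_at_least_one_head = Fraction(sum("H" in t for t in tosses), len(tosses))

# 4. Four firms, two in poor condition; enumerate all pairs of firms.
firms = ["poor1", "poor2", "good1", "good2"]
pairs = list(combinations(firms, 2))
pairs_with_poor = sum(any(f.startswith("poor") for f in pair) for pair in pairs)
p_at_least_one_poor = Fraction(pairs_with_poor, len(pairs))

# 5. Two cards without replacement: multiply P(first is a diamond)
#    by P(second is a diamond, given the first was).
p_two_diamonds = Fraction(13, 52) * Fraction(12, 51)

print(p_head, p_diamond, p_at_least_one_head, p_at_least_one_poor, p_two_diamonds)
```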

10.3 CONDITIONAL PROBABILITY AND BAYES' THEOREM
The conditional probability of an event B is the probability that the event will occur given the knowledge that an event A has already occurred. This probability is written P(B|A), notation for the probability of B given A. In the case where events A and B are independent (where event A has no effect on the probability of event B), the conditional probability of event B given event A is simply the probability of event B, that is, P(B).
If events A and B are not independent, then the probability of the intersection of A and B (the probability that both events occur) is given by
P (A and B) = P (A) P (B|A).
From this relation, the conditional probability P (B|A) is obtained by dividing by P (A), provided P (A) > 0:

$$P(B|A) = \frac{P(A \text{ and } B)}{P(A)}$$
Often it is required to compute the probability of an event given that another event has occurred. For example, what is the probability that two cards drawn at random from a deck of playing cards will both be aces? It might seem that you could use the formula for the probability of two independent events and simply multiply 4/52 x 4/52 = 1/169. This would be incorrect, however, because the two events are not independent. If the first card drawn is an ace, then the probability that the second card is also an ace is lower because there are only three aces left in the deck.
Once the first card chosen is an ace, the probability that the second card chosen is also an ace is called the conditional probability of drawing an ace. In this case the condition is that the first card is an ace. Symbolically, we write this as:
P (ace on second draw | an ace on the first draw)
The vertical bar "|" is read as "given," so the above expression is short for "the probability that an ace is drawn on the second draw given that an ace was drawn on the first draw." What is this probability? After an ace is drawn on the first draw, there are 3 aces among the 51 cards left. This means that the probability that one of these aces will be drawn is 3/51 = 1/17.
If Events A and B are not independent, then P (A and B) = P (A) x
P(B|A).
Applying this to the problem of two aces, the probability of drawing
two aces from a deck is 4/52 x 3/51 = 1/221.
One more example: If you draw two cards from a deck, what is the probability that you will get the Ace of Diamonds and a black card? There are two ways you can satisfy this condition: (a) you can get the Ace of Diamonds first and then a black card, or (b) you can get a black card first and then the Ace of Diamonds. Let us calculate Case A. The probability that the first card is the Ace of Diamonds is 1/52. The probability that the second card is black given that the first card is the Ace of Diamonds is 26/51 because 26 of the remaining 51 cards are black. The probability is therefore 1/52 x 26/51 = 1/102. Now for Case B: the probability that the first card is black is 26/52 = 1/2. The probability that the second card is the Ace of Diamonds given that the first card is black is 1/51. The probability of Case B is therefore 1/2 x 1/51 = 1/102, the same as the probability of Case A. Recall that the probability of A or B is P(A) + P(B) - P(A and B). In this problem, P (A and B) = 0 since the two cases cannot both occur: the first card cannot be both the Ace of Diamonds and a black card. Therefore, the probability of Case A or Case B is 1/102 + 1/102 = 2/102 = 1/51. So, 1/51 is the probability that you will get the Ace of Diamonds and a black card when drawing two cards from a deck.
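Both card results above (two aces, and the Ace of Diamonds together with a black card) can be confirmed by brute-force enumeration of every ordered two-card draw. This is a sketch that assumes a standard 52-card deck; the card encoding as (rank, suit) tuples is an illustrative choice, not something taken from the text.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# Every ordered draw of two different cards: 52 * 51 equally likely outcomes.
draws = list(permutations(deck, 2))


def is_black(card):
    return card[1] in ("clubs", "spades")


# Two aces in a row.
both_aces = sum(1 for first, second in draws if first[0] == "A" and second[0] == "A")
p_two_aces = Fraction(both_aces, len(draws))

# Ace of Diamonds and a black card, in either order (Case A or Case B).
ace_of_diamonds = ("A", "diamonds")
hits = sum(
    1
    for first, second in draws
    if (first == ace_of_diamonds and is_black(second))
    or (second == ace_of_diamonds and is_black(first))
)
p_ace_diamonds_and_black = Fraction(hits, len(draws))

print(p_two_aces, p_ace_diamonds_and_black)
```

Enumeration avoids having to decide which multiplication rule applies, at the cost of listing all 2,652 ordered draws.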


Birthday Problem
If there are 25 people in a room, what is the probability that at least two of them share the same birthday? If your first thought is that it is 25/365 = 0.068, you will be surprised to learn it is much higher than that. This problem requires the application of the sections on P (A and B) and conditional probability.
This problem is best approached by asking what is the probability that no two people have the same birthday. Once we know this probability, we can simply subtract it from 1 to find the probability that at least two people share a birthday.
If we choose two people at random, what is the probability that they do not share a birthday? Of the 365 days on which the second person could have a birthday, 364 of them are different from the first person's birthday. Therefore the probability is 364/365. Let us define P2 as the probability that the second person drawn does not share a birthday with the person drawn previously. P2 is therefore 364/365. Now define P3 as the probability that the third person drawn does not share a birthday with anyone drawn previously given that there are no previous birthday matches. P3 is therefore a conditional probability. If there are no previous birthday matches, then two of the 365 days have been used up, leaving 363 non-matching days. Therefore P3 = 363/365. In like manner, P4 = 362/365, P5 = 361/365, and so on up to P25 = 341/365.
In order for there to be no matches, the second person must not match any previous person, the third person must not match any previous person, the fourth person must not match any previous person, and so on. Since each of these is a conditional probability given no earlier match, the rule P(A and B) = P(A)P(B|A) applies repeatedly, and all we have to do is multiply P2, P3, P4, ..., P25 together. The result is 0.431. Therefore the probability of at least one match is 0.569.
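The product P2 x P3 x ... x P25 is tedious by hand but short in code. The sketch below follows the same reasoning; the function name `prob_shared_birthday` is hypothetical, and it assumes 365 equally likely birthdays (no leap years), as the text does.

```python
def prob_shared_birthday(num_people, days=365):
    """Probability that at least two of num_people share a birthday."""
    p_no_match = 1.0
    # The person drawn after `already_drawn` others must avoid their birthdays,
    # which happens with probability (days - already_drawn) / days.
    for already_drawn in range(1, num_people):
        p_no_match *= (days - already_drawn) / days
    return 1.0 - p_no_match


p_25 = prob_shared_birthday(25)
print(round(1.0 - p_25, 3), round(p_25, 3))
```

Calling the function for other group sizes shows how quickly the probability grows; with 23 people it already exceeds one half.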

10.3.1 Bayes' Theorem

Bayes' theorem describes the relationships that exist within an array of simple and conditional probabilities. For example: Suppose there is a certain disease randomly found in one-half of one percent (.005) of the general population. A certain clinical blood test is 99 percent (.99) effective in detecting the presence of this disease; that is, it will yield an accurate positive result in 99 percent of the cases where the disease is actually present. But it also yields false-positive results in 5 percent (.05) of the cases where the disease is not present.

Bayes' Formula
Recall that the multiplication rule states:
P (AH) = P (A)P(H|A) = P (H)P(A|H).
This simple identity is the essence of Bayes' formula.
Let the event of interest A happen under any of the hypotheses $H_i$ with a known (conditional) probability $P(A|H_i)$. Assume, in addition, that the probabilities of the hypotheses $H_1, \ldots, H_n$ are known (prior probabilities). Then the conditional (posterior) probability of the hypothesis $H_i$, $i = 1, 2, \ldots, n$, given that event A happened, is

$$P(H_i|A) = \frac{P(A|H_i)\,P(H_i)}{P(A)}$$

where

$$P(A) = P(A|H_1)P(H_1) + \cdots + P(A|H_n)P(H_n).$$
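As an illustration of the formula, the sketch below applies it to the blood-test figures quoted at the start of this subsection (prevalence .005, detection rate .99, false-positive rate .05). The function name `posterior` and the two-hypothesis setup are assumptions made for this example; the text itself does not carry out this calculation.

```python
def posterior(priors, likelihoods):
    """Posterior P(H_i | A) for each hypothesis, from P(H_i) and P(A | H_i)."""
    joint = [prior * likelihood for prior, likelihood in zip(priors, likelihoods)]
    total = sum(joint)  # P(A), by the total probability formula
    return [value / total for value in joint]


# Hypotheses: H1 = disease present, H2 = disease absent.
priors = [0.005, 0.995]
# Event A = the test is positive.
likelihoods = [0.99, 0.05]

p_disease_given_positive, p_healthy_given_positive = posterior(priors, likelihoods)
print(round(p_disease_given_positive, 4), round(p_healthy_given_positive, 4))
```

Under these figures only about 9 percent of positive results correspond to actual disease, because the disease is rare compared with the false-positive rate.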
Assume that out of N coins in a box, one has heads on both sides. Such a two-headed coin can be purchased in Spencer stores. Assume that a coin is selected at random from the box and, without inspecting it, flipped k times. All k times the coin landed heads up. What is the probability that the two-headed coin was selected?
Denote by $A_k$ the event that the randomly selected coin lands heads up k times. The hypotheses are $H_1$, the coin is two-headed, and $H_2$, the coin is fair. It is easy to see that $P(H_1) = 1/N$ and $P(H_2) = (N - 1)/N$. The conditional probabilities are $P(A_k|H_1) = 1$ for any k, and $P(A_k|H_2) = 1/2^k$.
By the total probability formula,

$$P(A_k) = \frac{2^k + N - 1}{2^k N}$$

and, by Bayes' formula,

$$P(H_1|A_k) = \frac{2^k}{2^k + N - 1}$$

For N = 1,000,000 and k = 1, 2, ..., 30 the graph of posterior probabilities is given in Figure 10.1. It is interesting that our prior probability $P(H_1)$ = 0.000001 jumps to a posterior probability of 0.9991 after observing 30 heads in a row. The MATLAB code bayes1 1.m producing Figure 10.1 is given in the Programs/Codes on the GTBayes Page.

Figure 10.1: Posterior probability of the two-headed coin for N = 1,000,000 if k heads appeared.
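The posterior probabilities plotted in Figure 10.1 can be tabulated directly from the closed form 2^k / (2^k + N - 1). The following Python sketch is not the MATLAB program mentioned in the text; it is an independent illustration, and the function name `posterior_two_headed` is hypothetical.

```python
from fractions import Fraction


def posterior_two_headed(num_coins, num_heads):
    """P(two-headed coin | num_heads heads in a row), one two-headed coin among num_coins."""
    prior_two_headed = Fraction(1, num_coins)
    prior_fair = Fraction(num_coins - 1, num_coins)
    likelihood_fair = Fraction(1, 2**num_heads)  # a fair coin gives k heads with prob 1/2^k
    evidence = prior_two_headed * 1 + prior_fair * likelihood_fair  # total probability
    return prior_two_headed / evidence


N = 1_000_000
for k in (0, 10, 20, 30):
    print(k, float(posterior_two_headed(N, k)))
```

The posterior stays small until 2^k becomes comparable to N (around k = 20) and then rises quickly toward 1.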
Prosecutor's Fallacy: The prosecutor's fallacy is a fallacy commonly occurring in criminal trials but also in various other arguments involving rare events. It consists of subtly exchanging P (A|B) for P (B|A). A zealous prosecutor has collected evidence, say a fingerprint match, and has an expert testify that the probability of finding this evidence if the accused were innocent is tiny. The fallacy is committed when the prosecutor proceeds to claim that the probability of the accused being innocent is comparably tiny.


Why is this incorrect? Suppose there is a one-in-a-million chance of a match given that the accused is innocent. The prosecutor deduces that this means there is only a one-in-a-million chance of innocence. But in a community of 10 million people, one expects 10 matches, and the accused is just one of those ten. That would indicate only a one-in-ten chance of guilt, if no other evidence is available.
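The same argument can be phrased with Bayes' formula. The sketch below makes two assumptions that are not in the text: exactly one person in the community is guilty, and the guilty person matches the evidence with certainty. Under those assumptions the posterior comes out near one in eleven rather than exactly one in ten, because the guilty person's match is counted in addition to the roughly ten innocent matches; either way it is nowhere near one in a million.

```python
population = 10_000_000
p_match_if_innocent = 1 / 1_000_000
p_match_if_guilty = 1.0  # assumption: the true culprit always matches

# Prior probability that a randomly chosen member of the community is guilty,
# assuming exactly one culprit and no other evidence.
prior_guilt = 1 / population

# Number of innocent people expected to match purely by chance.
expected_innocent_matches = (population - 1) * p_match_if_innocent

# Bayes' formula: P(guilty | match).
numerator = p_match_if_guilty * prior_guilt
posterior_guilt = numerator / (numerator + p_match_if_innocent * (1 - prior_guilt))

print(expected_innocent_matches, posterior_guilt)
```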
Two Masked Robbers: Two masked robbers try to rob a crowded
bank during the lunch hour but the teller presses a button that sets
off an alarm and locks the front door. The robbers realizing they
are trapped throw away their masks and disappear into the chaotic
crowd. Confronted with 40 people claiming they are innocent, the
police give everyone a lie detector test. Suppose that guilty people
are detected with probability 0.85 and innocent people appear to be
guilty with probability 0.08. What is the probability that Mr. Smith
was one of the robbers given that the lie detector says he is?
Guessing: Subjects in an experiment are told that either a red or a green light will flash. Each subject is to guess which light will flash. The subject is told that the probability of a red light is 0.7, independent of guesses. Assume that the subject is a probability matcher, that is, guesses red with probability 0.70 and green with probability 0.30.
(i) What is the probability that the subject guesses correctly?
(ii) Given that a subject guesses correctly, what is the probability
that the light flashed red?
False Positives: False positives are a problem in any kind of test:
no test is perfect, and sometimes the test will incorrectly report a
positive result. For example, if a test for a particular disease is performed on a patient, then there is a chance (usually small) that the
test will return a positive result even if the patient does not have the
disease. The problem lies, however, not just in the chance of a false
positive prior to testing, but determining the chance that a positive
result is in fact a false positive. As we will demonstrate, using Bayes
theorem, if a condition is rare, then the majority of positive results
may be false positives, even if the test for that condition is (otherwise) reasonably accurate.
Suppose that a test for a particular disease has a very high success rate:
If a tested patient has the disease, the test accurately reports this, a positive, 99% of the time (or, with probability 0.99), and
If a tested patient does not have the disease, the test accurately reports that, a negative, 95% of the time (i.e., with probability 0.95).
Suppose also, however, that only 0.1% of the population have that disease (i.e., a randomly chosen patient has it with probability 0.001).
We now have all the information required to calculate the probability that, given the test was positive, it is a false positive.
Let D be the event that the patient has the disease, and P be the event that the test returns a positive result. The probability of a true positive is

$$P(D|P) = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.05 \times 0.999} \approx 0.019$$

and hence the probability of a false positive is about (1 - 0.019) = 0.981. Despite the apparent high accuracy of the test, the incidence of the disease is so low (one in a thousand) that the vast majority of patients who test positive (98 in a hundred) do not have the disease. Nonetheless, the proportion of positive-testing patients who do have the disease (about 2 in a hundred) is roughly 20 times the proportion before we knew the outcome of the test! The test is not useless, and retesting may improve the reliability of the result. In particular, a test must be very reliable in reporting a negative result when the patient does not have the disease, if it is to avoid the problem of false positives. In mathematical terms, this would ensure that the second term in the denominator of the above calculation is small relative to the first term. For example, if the test reported a negative result in patients without the disease with probability 0.999, then using this value in the calculation yields a probability of a false positive of roughly 0.5.
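The two false-positive figures in this example (about 0.981 and roughly 0.5) can be reproduced with a few lines of code. In this sketch the function name and the parameter names `prevalence`, `sensitivity`, and `specificity` are labels chosen for illustration; the text describes the same quantities in words.

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """P(D | positive test) by Bayes' theorem."""
    true_positive = sensitivity * prevalence                # has disease, tests positive
    false_positive = (1 - specificity) * (1 - prevalence)   # no disease, tests positive
    return true_positive / (true_positive + false_positive)


# Test that is 99% accurate on the sick and 95% accurate on the healthy.
p_true_95 = prob_disease_given_positive(0.001, 0.99, 0.95)

# Same test, but 99.9% accurate on the healthy.
p_true_999 = prob_disease_given_positive(0.001, 0.99, 0.999)

print(round(p_true_95, 3), round(1 - p_true_95, 3))
print(round(p_true_999, 3), round(1 - p_true_999, 3))
```

Raising the accuracy on healthy patients from 0.95 to 0.999 shrinks the second term of the denominator by a factor of fifty, which is what brings the false-positive share down from about 98% to about 50%.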
Multiple Choices: A student answers a multiple choice examination question that has 4 possible answers. Suppose that the probability that the student knows the answer to the question is 0.80 and the probability that the student guesses is 0.20. If the student guesses, the probability of a correct answer is 0.25.
(i) What is the probability that the fixed question is answered correctly?
(ii) If it is answered correctly, what is the probability that the student really knew the correct answer?
Manufacturing Bayes: A factory has three types of machines producing an item. The probabilities that an item is of first quality, given that it is produced on the i-th machine, are given in the following table:

Machine    Probability of first quality
I          0.8
II         0.7
III        0.9

The total production is done 30% on type I machine, 50% on type II, and 20% on type III.
One item is selected at random from the production.
(i) What is the probability that it is of first quality?
(ii) If it is of first quality, what is the probability that it was produced on machine I?
Two-Headed Coin: One out of 1000 coins has two tails. A coin is selected at random out of these 1000 coins and flipped 5 times. If tails appeared all 5 times, what is the probability that the selected coin was two-tailed?
Kokomo, Indiana: In Kokomo, IN, 65% are conservatives, 20% are liberals and 15% are independents. Records show that in a particular election 82% of conservatives voted, 65% of liberals voted and 50% of independents voted.
If a person from the city is selected at random and it is learned that he/she did not vote, what is the probability that the person is liberal?
Inflation and Unemployment: Businesses commonly project revenues under alternative economic scenarios. For a stylized example, inflation could be high or low and unemployment could be high or low. There are four possible scenarios, with the assumed probabilities:

Scenario   Inflation   Unemployment   Probability
1          high        high           0.24
2          high        low            0.16
3          low         high           0.24
4          low         low            0.36

(i) What is the probability of high inflation?
(ii) What is the probability of high inflation if unemployment is high?
(iii) Are inflation and unemployment independent?
Information Channel: One of three words AAAA, BBBB, and CCCC is transmitted via an information channel. The probabilities of these words are 0.3, 0.5, and 0.2, respectively. Each letter is transmitted independently of the other letters and it is received correctly with probability 0.6. Since the channel is not perfect, the letter can change to one of the other two letters with equal probabilities of 0.2. What is the probability that word AAAA was transmitted if word ABCA was received?
An automatic machine in a small factory produces metal parts. Most of the time (90% by long records), it produces 95% good parts and the remaining have to be scrapped. Other times, the machine slips into a less productive mode and only produces 70% good parts. The foreman observes the quality of parts that are produced by the machine and wants to stop and adjust the machine when she believes that the machine is not working well. Suppose that the first dozen parts produced are given by the sequence
SUSSSSSSSUSU
where S is satisfactory and U is unsatisfactory. After observing this sequence, what is the probability that the machine is in its good state? If the foreman wishes to stop the machine when the probability of good state is under 0.7, when should she stop?
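One way to approach the machine problem is to apply Bayes' formula once per observed part, using each posterior as the prior for the next part. The sketch below shows that mechanism only; it assumes the machine stays in a single state for the whole sequence and that parts are independent given the state, neither of which the exercise states explicitly. The function name `update` is hypothetical.

```python
def update(p_good, part, p_s_good=0.95, p_s_bad=0.70):
    """One Bayes update of P(machine in good state) after observing one part."""
    # Likelihood of the observed part under each state of the machine.
    like_good = p_s_good if part == "S" else 1 - p_s_good
    like_bad = p_s_bad if part == "S" else 1 - p_s_bad
    numerator = like_good * p_good
    return numerator / (numerator + like_bad * (1 - p_good))


p_good = 0.90  # prior from long-run records
history = []
for part in "SUSSSSSSSUSU":
    p_good = update(p_good, part)
    history.append(p_good)

# Print the running probability of the good state after each part.
for index, value in enumerate(history, start=1):
    print(index, round(value, 4))
```

Comparing each printed value with the 0.7 threshold indicates the first part at which the foreman's stopping rule would trigger.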

10.4 MULTIPLE CHOICE QUESTIONS


1. There are 6 numbers on a normal die. What is the probability that if you roll a pair of dice, you will get a sum of 20?
(a) 0
(b) 1/4
(c) 1/6
(d) 1/100

2. If we toss a coin 40 times, then heads will necessarily appear exactly 20 times.
(a) True
(b) False

3. We toss two dice 1,000 times. How many times do we expect to have the sum of the two dice equal to 4?
(a) about 250
(b) about 167
(c) about 83
(d) about 42

4. If we toss a coin 40 times, then we expect that heads will appear 20 times.
(a) True
(b) False

5. In a box, there are 8 red, 7 blue and 6 green balls. One ball is picked up randomly. What is the probability that it is neither red nor green?
(a) 1/3
(b) 3/4
(c) 2/3
(d) 3/8

6. Tickets numbered 1 to 20 are mixed up and then a ticket is drawn at random. What is the probability that the ticket drawn has a number which is a multiple of 3 or 5?
(a)
(b) 9/20
(c) 2/3
(d) None of these

7. A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at random. What is the probability that none of the balls drawn is blue?
(a) 11/21
(b) 10/21
(c) 12/21
(d) 3/5

8. What is the probability of getting a sum of 9 from two throws of a die?
(a) 1/6
(b) 1/9
(c) 1/8
(d) 1/7

9. Three unbiased coins are tossed. What is the probability of getting at most two heads?
(a) 1/8
(b) 3/8
(c) 1/4
(d) 7/8

10. Two dice are thrown simultaneously. What is the probability of getting two numbers whose product is even?
(a) 3/5
(b) 3/4
(c) 3/8
(d) 1/8

10.5 REVIEW QUESTIONS


1. Describe the basic concepts of probability.
2. A card is drawn from a pack of 52 cards. What is the probability of getting a queen of clubs or a king of hearts?
3. Explain the probability distribution.
4. In a lottery, there are 10 prizes and 25 blanks. A lottery is drawn at
random. What is the probability of getting a prize?
5. Discuss the complementary events.
6. From a pack of 52 cards, two cards are drawn together at random.
What is the probability of both the cards being kings?
7. Two dice are tossed. Find the probability that the total score is a
prime number.
8. A bag contains 6 black and 8 white balls. One ball is drawn at random. What is the probability that the ball drawn is white?
9. Explain Bayes' theorem with an example.
10. What do you mean by the relative frequency approach?

ANSWERS FOR MULTIPLE CHOICE QUESTIONS


1. (a)   2. (b)   3. (c)   4. (a)   5. (a)
6. (b)   7. (b)   8. (b)   9. (d)   10. (b)
