
Q.1. Explain the fundamental steps in digital image processing. Ans.

Fundamental steps in Digital Image Processing
There are several fundamental steps, and each of them may have sub-steps. The fundamental steps are described below with a neat diagram.
1. Image acquisition: Image acquisition is the first process shown in Figure 1. It requires an imaging sensor and the capability to digitize the signal produced by the sensor. The sensor could be a monochrome or color TV camera that produces an entire image of the problem domain every 1/30 of a second. The imaging sensor could also be a line-scan camera that produces a single image line at a time. If the output of the camera or other imaging sensor is not already in digital form, an analog-to-digital converter digitizes it. Note that acquisition could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage involves preprocessing, such as scaling.
2. Image enhancement: Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is increasing the contrast of an image because it "looks better." It is important to keep in mind that enhancement is a very subjective area of image processing.

Figure 1: Fundamental steps in digital image processing
3. Image restoration: Image restoration is an area that also deals with improving the appearance of an image. However, unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation.
4. Color image processing: Color image processing is an area that has been gaining in importance because of the significant increase in the use of digital images over the Internet. Color is used as the basis for extracting features of interest in an image.
5. Wavelets and multiresolution processing: Wavelets are the foundation for representing images in various degrees of resolution. In particular, this is used for image data compression and for pyramidal representation, in which images are subdivided successively into smaller regions.
6. Compression: Compression deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is true particularly in uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar (perhaps inadvertently) to most users of computers in the form of image file extensions.
7. Morphological processing: Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape.
8. Segmentation: Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually. On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual failure. In terms of character recognition, the key role of segmentation is to extract individual characters and words from the background. The output of a segmentation stage usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary.
9. Representation and description: The next stage is representation and description. Here, the first decision that must be made is whether the data should be represented as a boundary or as a complete region. Boundary representation

is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
10. Object recognition: Recognition is the process that assigns a label (e.g., "vehicle") to an object based on its descriptors. Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between the modules.

Q.2. Explain the concept of Image acquisition using sensor arrays.
Ans. Image acquisition is the first step in digital image processing. To acquire a digital image, two elements are required. The first is a physical device that is sensitive to a band in the electromagnetic energy spectrum and that produces an electrical signal output proportional to the level of energy sensed. The second, called a digitizer, is a device for converting the electrical output of the physical sensing device into digital form. For example, consider the basics of an X-ray system. The output of an X-ray source is directed at an object, and a medium sensitive to X-rays is placed on the other side of the object. The medium thus acquires an image of the materials having various degrees of X-ray absorption. Figure 2.1 below shows the three principal sensor arrangements used to transform illumination energy into digital images. The idea is simple: incoming energy is transformed into a voltage by the combination of input electrical power and sensor material that is responsive to the particular type of energy being detected. The output voltage waveform is the response of the sensor, and a digital quantity is obtained from each sensor by digitizing its response.

Image Acquisition Using Sensor Arrays
Figure 2.1(c) shows individual sensors arranged in the form of a 2-D array. Numerous electromagnetic and some ultrasonic sensing devices are frequently arranged in an array format. This is also the predominant arrangement found in digital cameras. A typical sensor for these cameras is a CCD (charge-coupled device) array, which can be manufactured with a broad range of sensing properties and can be packaged in rugged arrays. CCD sensors are used widely in digital cameras and other light-sensing instruments. The response of each sensor is proportional to the integral of the light energy projected onto the surface of the sensor, a property that is used in astronomical and other applications requiring low-noise images. Noise reduction is achieved by letting the sensor integrate the input light signal over minutes or even hours. Because the sensor array is two-dimensional, its key advantage is that a complete image can be obtained by focusing the energy pattern onto the surface of the array.

The principal manner in which array sensors are used is shown in Fig. 2.2. This figure shows the energy from an illumination source being reflected from a scene element; as mentioned at the beginning of this section, the energy could also be transmitted through the scene elements. The first function performed by the imaging system shown in Fig. 2.2(c) is to collect the incoming energy and focus it onto an image plane. If the illumination is light, the front end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane, as Fig. 2.2(d) shows. The sensor array, which is coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor. Digital and analog circuitry sweeps these outputs and converts them to a video signal, which is then digitized by another section of the imaging system. The output is a digital image, as shown diagrammatically in Fig. 2.2(e).
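The chain just described can be mimicked in a few lines of code. The sketch below is illustrative only: a random irradiance pattern stands in for the energy focused onto the sensor array, integration over several exposure frames plays the role of the light-gathering step, and an 8-bit quantizer plays the role of the analog-to-digital converter.

```python
import numpy as np

# Illustrative acquisition chain: a 2-D "sensor array" integrates incoming energy
# over several exposure frames, then an A/D step quantizes each response to 8 bits.
rng = np.random.default_rng(0)
irradiance = rng.random((480, 640))          # energy focused onto the array (arbitrary units)
exposure_frames = 30                         # integrating longer reduces the relative noise

voltage = sum(irradiance + rng.normal(0, 0.05, irradiance.shape)
              for _ in range(exposure_frames)) / exposure_frames
digital_image = np.clip(np.round(voltage * 255), 0, 255).astype(np.uint8)
print(digital_image.shape, digital_image.dtype)   # (480, 640) uint8
```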

Figure 2.2: An example of the digital image acquisition process (a) Energy (illumination) source (b) An element of a scene (c) Imaging system (d) Projection of the scene onto the image plane (e) Digitized image

Q.3. Explain the terms reflection, complement and translation with example diagrams.
Ans. Let A and B be sets in Z², with components a = (a1, a2) and b = (b1, b2), respectively. The translation of A by z = (z1, z2), denoted (A)z, is defined as
(A)z = { c | c = a + z, for a ∈ A }

The reflection of B, denoted B̂, is defined as
B̂ = { w | w = −b, for b ∈ B }
The complement of set A is
Ac = { x | x ∉ A }
Finally, the difference of two sets A and B, denoted A − B, is defined as
A − B = { x | x ∈ A, x ∉ B } = A ∩ Bc
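These set operations translate almost directly into code. The sketch below is a minimal illustration, assuming points are stored as (row, column) coordinate pairs; the particular sets A, B, the translation z and the 5 x 5 universe are made up for the example.

```python
import numpy as np

# Hypothetical point sets in Z^2, stored as arrays of (row, col) pairs.
A = np.array([[1, 1], [1, 2], [2, 1]])
B = np.array([[0, 0], [0, 1]])
z = np.array([3, 2])

A_translated = A + z         # (A)z = { a + z for a in A }
B_reflected = -B             # reflection of B: { -b for b in B }

# Complement and set difference on a small binary image (the 5x5 grid is the universe).
img_A = np.zeros((5, 5), dtype=bool)
img_A[tuple(A.T)] = True     # mark the members of A
complement_A = ~img_A        # Ac: every pixel not in A

img_B = np.zeros((5, 5), dtype=bool)
img_B[tuple(B.T)] = True
difference = img_A & ~img_B  # A - B = A intersected with Bc
```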

Figure 3.1: (a) Translation of A by z (b) Reflection of B

Q.4 Explain about Image Subtraction and Image Averaging.
Ans. Image Subtraction
The difference between two images f(x, y) and h(x, y), expressed as
g(x, y) = f(x, y) − h(x, y)
is obtained by computing the difference between all pairs of corresponding pixels from f and h. Image subtraction has numerous important applications in segmentation and enhancement. One of the most commercially successful and beneficial uses of image subtraction is in the area of medical imaging called mask mode radiography. In this case h(x, y), the mask, is an X-ray image of a region of a patient's body, captured by an intensified TV camera located opposite an X-ray source. The procedure consists of injecting a contrast medium into the patient's bloodstream, taking a series of images of the same anatomical region as h(x, y), and subtracting this mask from the series of incoming images after injection of the contrast medium. The net effect of subtracting the mask from each sample in the incoming stream of TV images is that the areas that are different between f(x, y)

and h(x, y) appear in the output image as enhanced detail. Because images can be captured at TV rates, this procedure in essence gives a movie showing how the contrast medium propagates through the various arteries in the area being observed.
Image Averaging
Consider a noisy image g(x, y) formed by the addition of noise η(x, y) to an original image f(x, y), that is
g(x, y) = f(x, y) + η(x, y)

where the assumption is that at every pair of coordinates (x, y) the noise is uncorrelated and has zero average value. The objective of the following procedure is to reduce the noise content by averaging a set of noisy images, { gi(x, y) }. If the noise has zero mean and is uncorrelated, and ḡ(x, y) is the image formed by averaging K different noisy images,
ḡ(x, y) = (1/K) Σ (i = 1 to K) gi(x, y)
then it follows that
E{ ḡ(x, y) } = f(x, y)  and  σ²ḡ(x, y) = (1/K) σ²η(x, y)
where E{ ḡ(x, y) } is the expected value of ḡ, and σ²ḡ(x, y) and σ²η(x, y) are the variances of ḡ and η at coordinates (x, y). The standard deviation at any point in the average image is
σḡ(x, y) = (1/√K) ση(x, y)
As K increases, the variability (noise) of the pixel value at each location (x, y) decreases; thus E{ ḡ(x, y) } = f(x, y), i.e., the expected value of the output after averaging is the original image f(x, y).
Note: The noisy images gi(x, y) must be registered (aligned) in order to avoid the introduction of blurring and other artifacts in the output image.
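A small sketch of both operations follows. It is illustrative only; the arrays, noise level and number of frames are made up, and the roughly 1/√K drop in the standard deviation is simply checked numerically.

```python
import numpy as np

def subtract_images(f, h):
    """Pixel-wise difference g(x, y) = f(x, y) - h(x, y), as in mask mode radiography."""
    return f.astype(np.int16) - h.astype(np.int16)

def average_images(noisy_images):
    """Average K registered noisy images; the noise variance drops roughly as 1/K."""
    stack = np.stack([g.astype(np.float64) for g in noisy_images])
    return stack.mean(axis=0)

# Toy demonstration with synthetic data.
rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                                  # "clean" image
noisy = [f + rng.normal(0, 20, f.shape) for _ in range(16)]   # K = 16 noisy frames
g_bar = average_images(noisy)
print(noisy[0].std(), g_bar.std())   # the second value is smaller by about sqrt(16) = 4
```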

Q.5 Explain in detail about Histogram Equalization.
Ans. Histogram Equalization
The histogram of a digital image with gray levels in the range [0, L−1] is a discrete function p(rk) = nk / n, where rk is the kth gray level, nk is the number of pixels in the image with that gray level, n is the total number of pixels in the image, and k = 0, 1, 2, ..., L−1. Let the variable r represent the gray levels in the image to be enhanced. Assume that the pixel values are continuous quantities that have been normalized so that they lie in the interval [0, 1], with r = 0 representing black and r = 1 representing white. Later, we consider a discrete formulation and allow pixel values to be in the interval [0, L−1]. For any r in the interval [0, 1], the transformations have the form
s = T(r)   [4.4.1-1]

which produces a level s for every pixel value r in the original image. It is assumed that the transformation function given in equation (4.4.1-1) satisfies the conditions:
(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.

Figure 4.1: A gray-level transformation function
Condition (a) preserves the order from black to white in the gray scale, whereas condition (b) guarantees a mapping that is consistent with the allowed range of pixel values. Figure 4.1 shows a transformation function satisfying these conditions. The inverse transformation from s back to r is denoted by
r = T⁻¹(s),  0 ≤ s ≤ 1   [4.4.1-2]
where the assumption is that T⁻¹(s) also satisfies conditions (a) and (b) with respect to the variable s. The gray levels in an image may be viewed as random quantities in the interval [0, 1]. If they are continuous variables, the original and transformed gray levels can be characterized by their probability density functions pr(r) and ps(s), respectively, where the subscripts of p are used to indicate that pr and ps are different functions. From elementary probability theory, if pr(r) and T(r) are known and T⁻¹(s) satisfies condition (a), the probability density function of the transformed gray levels is
ps(s) = [ pr(r) dr/ds ] evaluated at r = T⁻¹(s)   [4.4.1-3]

The following enhancement techniques are based on modifying the appearance of an image by controlling the probability density function of its gray levels via the transformation function T(r). Consider the transformation function
s = T(r) = ∫ (0 to r) pr(w) dw,  0 ≤ r ≤ 1   [4.4.1-4]

where w is a dummy variable of integration. The rightmost side of equation (4.4.1-4) is recognized as the cumulative distribution function (CDF) of r. Conditions (a) and (b) presented earlier are satisfied by this transformation function, because the CDF increases monotonically from 0 to 1 as a function of r. From equation (4.4.1-4), the derivative of s with respect to r is
ds/dr = pr(r)   [4.4.1-5]
Substituting dr/ds into equation (4.4.1-3) yields
ps(s) = [ pr(r) · 1/pr(r) ] evaluated at r = T⁻¹(s)
      = 1,  0 ≤ s ≤ 1   [4.4.1-6]

This is a uniform density in the interval of definition of the transformed variable s. This result is independent of the inverse transformation function, which is important, because obtaining T⁻¹(s) analytically is not always easy. The foregoing development indicates that using a transformation function equal to the cumulative distribution of r produces an image whose gray levels have a uniform density. In terms of enhancement, this result implies an increase in the dynamic range of the pixels, which can have a considerable effect on the appearance of an image. In order to be useful for digital image processing, these concepts must be formulated in discrete form. For gray levels that take discrete values, we deal with probabilities:
pr(rk) = nk / n,  0 ≤ rk ≤ 1, k = 0, 1, ..., L−1   [4.4.1-7]

where L is the number of levels, pr(rk) is the probability of the kth gray level, nk is the number of times this level appears in the image, and n is the total number of pixels in the image. A plot of pr(rk) versus rk is called a histogram, and the technique used for obtaining a uniform histogram is known as histogram equalization or histogram linearization. The discrete form of equation (4.4.1-4) is given by the relation
sk = T(rk) = Σ (j = 0 to k) nj / n = Σ (j = 0 to k) pr(rj),  k = 0, 1, ..., L−1   [4.4.1-8]
The inverse transformation is denoted
rk = T⁻¹(sk),  0 ≤ sk ≤ 1
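A minimal sketch of equation (4.4.1-8) in code is given below. It assumes an 8-bit input image (L = 256) and simply maps each gray level through the cumulative histogram, rescaled back to [0, L−1]; the synthetic test image is made up for illustration.

```python
import numpy as np

def histogram_equalize(image, levels=256):
    """Discrete histogram equalization: s_k = sum_{j<=k} n_j / n (equation 4.4.1-8)."""
    n = image.size
    hist = np.bincount(image.ravel(), minlength=levels)    # n_k for each gray level r_k
    cdf = np.cumsum(hist) / n                               # s_k in [0, 1]
    s = np.round(cdf * (levels - 1)).astype(image.dtype)    # rescale s_k to [0, L-1]
    return s[image]                                         # map every pixel r_k -> s_k

# Example on a synthetic low-contrast 8-bit image.
rng = np.random.default_rng(1)
img = np.clip(rng.normal(100, 10, (128, 128)), 0, 255).astype(np.uint8)
equalized = histogram_equalize(img)
print(img.min(), img.max(), equalized.min(), equalized.max())
```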

In equation (4.4.1-8) and the inverse relation above, both T(rk) and T⁻¹(sk) are assumed to satisfy conditions (a) and (b) stated earlier. The transformation function T(rk) may be computed directly from the image by using equation (4.4.1-8). The inverse function T⁻¹(sk) is not used in histogram equalization.

Q.6 Write the application of image processing with examples.

Ans. Applications of Image Processing
Digital image processing has a broad spectrum of applications, such as remote sensing via satellites and other spacecraft, image transmission and storage for business applications, medical processing, radar, sonar, robotics and automated inspection of industrial parts. Images acquired by satellites are useful in tracking of earth resources, geographical mapping, prediction of agricultural crops, urban growth, weather, flood and fire control, and many other environmental applications. Space image applications include recognition and analysis of objects contained in images obtained from deep space-probe missions. Image transmission and storage applications occur in broadcast television, teleconferencing, transmission of facsimile images (printed documents and graphics) for office automation, communication over computer networks, closed-circuit television based security monitoring systems, and in military communications. In medical applications one is concerned with the processing of chest X-rays,

(a) Space probe images of the Moon and Mars

(b) Medical images: X-ray of a normal cervical spine

(c) Television images of boys and girls
Figure 1.2: Examples of digital images
cineangiograms, projection images of transaxial tomography, and other medical images that occur in radiology, nuclear magnetic resonance (NMR) and ultrasonic scanning. These images may be used for patient screening and monitoring or for detection of tumors or other diseases in patients. Radar and sonar images are used for detection and recognition of various types of targets or in guidance and maneuvering of aircraft or missile systems. Figure 1.2 shows examples of several different types of images. Below are examples of applications of image processing:
Agricultural (fruit grading, harvest control, seeding, fruit picking, ...)
Communications (compression, video conferencing, television, ...)
Character recognition (printed and handwritten)
Commercial (bar code reading, bank cheques, signatures, ...)
Document processing (electronic circuits, mechanical drawings, music, ...)
Human (heads and faces, hands, body, ...)
Industrial (inspection, part pose estimation and recognition, control, ...)
Leisure and entertainment (museums, film industry, photography, ...)
Medical (X-rays, CT, NMR, ultrasound, intensity, ...)
Military (tracking, detection, etc.)
Police (fingerprints, surveillance, DNA analysis, biometry, ...)
Traffic and transport (road, airport, seaport, license identification, ...)

Q.7 Explain the properties of FFT.
Ans. 1. Separability
The discrete Fourier transform pair can be expressed in separable form, which (after some manipulation) can be written as
F(u, v) = (1/N) Σ (x = 0 to N−1) F(x, v) exp(−j2πux/N)
where
F(x, v) = N [ (1/N) Σ (y = 0 to N−1) f(x, y) exp(−j2πvy/N) ]

For each value of x, the expression inside the brackets is a 1-D transform, with frequency values v = 0, 1, ..., N−1. Thus, the 2-D function F(x, v) is obtained by taking a transform along each row of f(x, y) and multiplying the result by N. The desired result F(u, v) is then obtained by taking a transform along each column of F(x, v).
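Separability is easy to check numerically. The sketch below applies 1-D FFTs along the rows and then along the columns and compares the result with a direct 2-D FFT; note that NumPy's FFT omits the 1/N scaling used in the text, so both sides here follow NumPy's convention.

```python
import numpy as np

# 2-D DFT computed as 1-D transforms along rows, then along columns.
f = np.random.default_rng(0).random((8, 8))
rows_then_cols = np.fft.fft(np.fft.fft(f, axis=1), axis=0)
direct = np.fft.fft2(f)
assert np.allclose(rows_then_cols, direct)   # the two results agree
```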

2. Translation or Shifting

The Fourier transform pair satisfies
f(x, y) exp( j2π(u0 x + v0 y)/N ) ⇔ F(u − u0, v − v0)
and
f(x − x0, y − y0) ⇔ F(u, v) exp( −j2π(u x0 + v y0)/N )
These relations mean that multiplying f(x, y) by the indicated exponential term and taking the transform of the product results in a shift of the origin of the frequency plane to the point (u0, v0), and that multiplying F(u, v) by the exponential term shown and taking the inverse transform moves the origin of the spatial plane to (x0, y0). A shift in f(x, y) does not affect the magnitude of its Fourier transform.

3. Periodicity and Conjugate Symmetry
The DFT and its inverse are periodic with period N; that is,
F(u, v) = F(u + N, v) = F(u, v + N) = F(u + N, v + N)
In addition, for a real f(x, y) the spectrum is conjugate symmetric, F(u, v) = F*(−u, −v), so that |F(u, v)| = |F(−u, −v)|.

4. Rotation
Introducing the polar coordinates x = r cosθ, y = r sinθ, u = ω cosφ, v = ω sinφ, the functions f(x, y) and F(u, v) become f(r, θ) and F(ω, φ), and
f(r, θ + θ0) ⇔ F(ω, φ + θ0)

which means that rotating f(x, y) by an angle θ0 rotates F(u, v) by the same angle (and vice versa).
5. Distributivity
The Fourier transform and its inverse are distributive over addition but not over multiplication.

6. Scaling
This property is best summarized as "a contraction in one domain produces a corresponding expansion in the Fourier domain".

7. Convolution
The convolution of two functions f(x) and g(x) is defined by the integral
f(x) * g(x) = ∫ f(α) g(x − α) dα
where the integration is taken over all values of the dummy variable α.

The Convolution Theorem tells us that convolution in the spatial domain corresponds to multiplication in the frequency domain, and vice versa.
8. Correlation
One of the principal applications of correlation in image processing is in the area of template or prototype matching, i.e., finding the closest match between an unknown image and a set of known images.
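Returning to the Convolution Theorem of property 7, the identity is easy to verify numerically for discrete signals. The sketch below compares a direct circular convolution of two made-up 1-D signals with the inverse FFT of the product of their FFTs.

```python
import numpy as np

rng = np.random.default_rng(0)
f, g = rng.random(32), rng.random(32)

# Convolution theorem: circular convolution <-> point-wise product of the spectra.
via_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))
direct = np.array([sum(f[m] * g[(n - m) % 32] for m in range(32)) for n in range(32)])
assert np.allclose(via_fft, direct)
```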

Q.1 Explain the concept of point and line detection.
Ans. There are three basic types of discontinuities in a digital image: points, lines and edges. In practice, the most common way to look for discontinuities is to run a mask through the image. For the 3 x 3 mask shown in Figure 1, this procedure involves computing the sum of products of the coefficients with the gray levels contained in the region encompassed by the mask. That is, the response of the mask at any point in the image is
R = w1 z1 + w2 z2 + ... + w9 z9

w1 w2 w3
w4 w5 w6
w7 w8 w9

Figure 1: A general 3 x 3 mask
where zi is the gray level of the pixel associated with mask coefficient wi. As usual, the response of the mask is defined with respect to its center location. When the mask is centered on a boundary pixel, the response is computed by using the appropriate partial neighborhood.
Point Detection
The detection of isolated points in an image is straightforward. Using the mask shown in Figure 7.2 below, we say that a point has been detected at the location on which the mask is centered if
|R| ≥ T   [1]

where T is a nonnegative threshold and R is the response of the mask at any point in the image. Basically, all that this formulation does is measure the

weighted differences between the center point and its neighbors. The idea is that the gray level of an isolated point will be quite different from the gray level of its neighbors.
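A minimal sketch of this idea in code is given below; it assumes a grayscale array, slides the 3 x 3 mask of Figure 7.2 over the interior pixels, and flags those whose absolute response meets a made-up threshold T.

```python
import numpy as np

# Point-detection mask of Figure 7.2 (center weight 8, neighbors -1).
MASK = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]])

def detect_points(image, T):
    """Flag interior pixels where |R| >= T, with R = sum of mask * neighborhood."""
    image = image.astype(float)
    found = np.zeros(image.shape, dtype=bool)
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            R = np.sum(MASK * image[i - 1:i + 2, j - 1:j + 2])
            found[i, j] = abs(R) >= T
    return found

img = np.zeros((9, 9)); img[4, 4] = 255.0   # a single bright point on a dark background
print(detect_points(img, T=500).sum())      # -> 1 (only the isolated point responds)
```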

-1 -1 -1

-1 8 -1

-1 -1 -1

Figure 7.2: A mask used for detecting isolated points different from a constant background
Line Detection
Line detection is an important step in image processing and analysis. Lines and edges are features in any scene, from simple indoor scenes to noisy terrain images taken by satellite. Most of the earlier methods for detecting lines were based on pattern matching. The patterns directly follow from the definition of a line. These pattern templates are designed with suitable coefficients and are applied at each point in an image. A set of such templates is shown in Figure 7.3. If the first mask were moved around an image, it would respond more strongly to lines oriented horizontally. With a constant background, the maximum response would result when the line passed through the middle row of the mask. This is easily verified by sketching a simple array of 1s with a line of a different gray level running horizontally through the array. A simple experiment would reveal that the second mask in Figure 7.3 responds best to lines oriented at +45°, the third mask to vertical lines, and the fourth mask to lines in the −45° direction. These preferences can also be established by noting that the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions. Let R1, R2, R3 and R4 denote the responses of the masks in Figure 7.3, from left to right, where the Rs are given by the mask response equation above. Suppose that all four masks are run through an image. If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more likely associated with a line in the direction of mask i. For example, if at a point in the image |R1| > |Rj| for j = 2, 3, 4, that particular point is said to be more likely associated with a horizontal line.
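The sketch below spells this out for a single pixel, using the four masks laid out in Figure 7.3 below: it computes |R1| through |R4| at one location and reports the direction of the strongest response. The test image and pixel location are made up for the example.

```python
import numpy as np

# The four 3x3 line masks of Figure 7.3: horizontal, +45 degrees, vertical, -45 degrees.
MASKS = {
    "horizontal": np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]]),
    "+45":        np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]]),
    "vertical":   np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]]),
    "-45":        np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]]),
}

def line_direction_at(image, i, j):
    """Return the mask name with the largest |R| in the 3x3 neighborhood of (i, j)."""
    window = image[i - 1:i + 2, j - 1:j + 2].astype(float)
    responses = {name: abs(np.sum(m * window)) for name, m in MASKS.items()}
    return max(responses, key=responses.get)

img = np.ones((7, 7)); img[3, :] = 5.0    # a brighter horizontal line through row 3
print(line_direction_at(img, 3, 3))       # -> "horizontal"
```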

-1 -1 -1
 2  2  2
-1 -1 -1
(a)

-1 -1  2
-1  2 -1
 2 -1 -1
(b)

-1  2 -1
-1  2 -1
-1  2 -1
(c)

 2 -1 -1
-1  2 -1
-1 -1  2
(d)

Figure 7.3: Line masks (a) Horizontal (b) +45° (c) Vertical (d) −45°

Q.2 Explain the Local Processing.
Ans. One of the simplest approaches to linking edge points is to analyze the characteristics of pixels in a small neighborhood (say, 3 x 3 or 5 x 5) about every point (x, y) in an image that has undergone edge detection. All points that are similar are linked, forming a boundary of pixels that share some common properties. The two principal properties used for establishing similarity of edge pixels in this kind of analysis are (1) the strength of the response of the gradient operator used to produce the edge pixel, and (2) the direction of the gradient vector. The first property is given by the value of ∇f, the gradient. Thus an edge pixel with coordinates (x0, y0) in a predefined neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if
| ∇f(x, y) − ∇f(x0, y0) | ≤ E
where E is a nonnegative threshold. The direction (angle) of the gradient vector is given by α(x, y); an edge pixel at (x0, y0) in the predefined neighborhood of (x, y) has an angle similar to the pixel at (x, y) if
| α(x, y) − α(x0, y0) | < A

where A is a nonnegative angle threshold; note that the direction of the edge at (x, y) is perpendicular to the direction of the gradient vector at that point. A point in the predefined neighborhood of (x, y) is linked to the pixel at (x, y) if both the magnitude and the direction criteria are satisfied. This process is repeated at every location in the image. A record must be kept of linked points as the center of the neighborhood is moved from pixel to pixel. A simple bookkeeping procedure is to assign a different gray level to each set of linked edge pixels.

Q.3 Explain the fundamentals of compression.
Ans. Fundamentals of Compression
Image compression addresses the problem of reducing the amount of data required to represent a digital image. The underlying basis of the reduction

process is the removal of redundant data. From a mathematical point of view, this amounts to transforming a 2-D pixel array into a statistically uncorrelated data set. The transformation is applied prior to storage or transmission of the image. At some later time, the compressed image is decompressed to reconstruct the original image or an approximation of it. The term data compression refers to the process of reducing the amount of data required to represent a given quantity of information. Data are the means by which information is conveyed. Various amounts of data may be used to represent the same amount of information. If the data contain either irrelevant information or simply restate that which is already known, they are said to contain data redundancy. Data redundancy is a central issue in digital image compression. It is not an abstract concept but a mathematically quantifiable entity. If n1 and n2 denote the number of information-carrying units in two data sets that represent the same information, the relative data redundancy RD of the first data set can be defined as
RD = 1 − 1/CR   [12.2-1]
where CR, the compression ratio, is
CR = n1/n2   [12.2-2]

For the case n2 = n1, CR = 1 and RD = 0, indicating that the first representation of the information contains no redundant data. When n2 << n1, CR becomes large and RD approaches 1, implying significant compression and highly redundant data. Finally, when n2 >> n1, CR approaches 0 and RD becomes large and negative, indicating that the second data set contains much more data than the original representation. In general, CR and RD lie in the open intervals (0, ∞) and (−∞, 1), respectively. In digital image compression, three basic data redundancies can be identified and exploited: coding redundancy, interpixel redundancy and psychovisual redundancy. Data compression is achieved when one or more of these redundancies are reduced or eliminated.
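To make equations (12.2-1) and (12.2-2) concrete, here is a small helper; the bit counts in the example are made up purely for illustration.

```python
def compression_metrics(n1_bits, n2_bits):
    """Compression ratio C_R = n1/n2 and relative redundancy R_D = 1 - 1/C_R."""
    cr = n1_bits / n2_bits
    rd = 1 - 1 / cr
    return cr, rd

# Example: an 8-bit 256x256 image re-encoded at an average of 2 bits per pixel.
print(compression_metrics(256 * 256 * 8, 256 * 256 * 2))   # -> (4.0, 0.75)
```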

Image Compression Models
Figure 3 shows a compression system consisting of two distinct structural blocks: an encoder and a decoder. An input image f(x, y) is fed into the encoder, which creates a set of symbols from the input data. After transmission over the channel, the encoded representation is fed to the decoder, where a reconstructed output image f̂(x, y) is generated. In general, f̂(x, y) may or may not be an exact replica of f(x, y). If it is, the system is error free or information preserving; if not, some level of distortion is present in the reconstructed image. Both the encoder and decoder shown in Figure 3 consist of two relatively independent functions or sub-blocks. The encoder is made up of a source encoder, which removes input redundancies, and a channel encoder, which increases the noise immunity of the source encoder's output. The decoder includes a channel decoder followed by a source decoder. If the channel between the encoder and decoder is noise free (not prone to error), the channel encoder and decoder are omitted, and the general encoder and decoder become the source encoder and decoder, respectively.

Figure 3: General compression system model.

Q.4 Explain about the region splitting and merging with example.
Ans. Region Splitting and Merging
Subdivide an image into a set of disjoint regions and then merge and/or split the regions in an attempt to satisfy the segmentation conditions stated earlier. Let R represent the entire image and select a predicate P. One approach for segmenting R is to subdivide it successively into smaller and smaller quadrant regions so that, for any region Ri, P(Ri) = TRUE. We start with the entire region. If P(R) = FALSE, we divide the image into quadrants. If P is FALSE for any quadrant, we subdivide that quadrant into sub-quadrants, and so on. This particular splitting technique has a convenient representation in the form of a so-called quadtree (that is, a tree in which each node has exactly four descendants), as shown in Figure 4. The root of the tree corresponds to the entire image and each node corresponds to a subdivision. In this case, only R4 is subdivided further. If only splitting were used, the final partition would likely contain adjacent regions with identical properties. This drawback may be remedied by allowing merging as well as splitting, which requires merging only adjacent regions whose combined pixels satisfy the predicate P. That is, two adjacent regions Rj and Rk are merged only if P(Rj ∪ Rk) = TRUE. A code sketch of the splitting stage is given below, and the numbered procedure after Figure 4 summarizes the complete algorithm.
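The sketch below implements only the recursive splitting stage for a square image, with a made-up uniformity predicate P(R); the merging pass over adjacent leaves that jointly satisfy P(Rj ∪ Rk) is omitted for brevity.

```python
import numpy as np

def split_regions(image, predicate, min_size=2):
    """Recursively split a square image into quadrants until predicate(R) is TRUE
    (or the region reaches min_size); return a label image of the leaf regions."""
    labels = np.zeros(image.shape, dtype=int)
    counter = [0]

    def split(r, c, size):
        block = image[r:r + size, c:c + size]
        if predicate(block) or size <= min_size:
            counter[0] += 1
            labels[r:r + size, c:c + size] = counter[0]
            return
        half = size // 2
        for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
            split(r + dr, c + dc, half)

    split(0, 0, image.shape[0])
    return labels

# Example predicate P(R): TRUE when the region's gray levels span only a small range.
P = lambda block: int(block.max()) - int(block.min()) < 10
img = np.zeros((8, 8), dtype=np.uint8); img[4:, 4:] = 200   # one bright quadrant
print(np.unique(split_regions(img, P)))                     # four leaf regions: [1 2 3 4]
```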

Figure 4: (a) Partitioned image (b) Corresponding quadtree
The procedure may be summarized as follows:
1. Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.
2. Merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE.
3. Stop when no further merging or splitting is possible.

Q.5 Explain thresholding and its foundation.
Ans. Thresholding
Thresholding is one of the most important approaches to image segmentation. The threshold can be treated as the class boundary. The idea can be applied to images that contain more than two types of regions; evidently, the number of thresholds is equal to the number of classes minus one. It should be noted that though thresholding is the simplest method of image segmentation, it usually precedes the selection of appropriate features to obtain a useful result. Secondly, the selection of a threshold is also a non-trivial task. Any inappropriate threshold would incur significant and unacceptable classification error.
Foundation
Suppose that the gray-level histogram shown in Figure 5(a) corresponds to an image, f(x, y), composed of light objects on a dark background, in such a way that object and background pixels have gray levels grouped into two dominant modes. One obvious way to extract the objects from the background is to select a threshold T that separates these modes. Then, any point (x, y) for which f(x, y) > T is called an object point; otherwise, the point is called a background point. Figure 5(b) shows a slightly more general case of this approach. Here three dominant modes characterize the image histogram. The same basic approach classifies a point (x, y) as belonging to one object class if T1 < f(x, y) ≤ T2, to the other object class if f(x, y) > T2, and to the background if f(x, y) ≤ T1. This type of multilevel thresholding is generally less reliable than its single-threshold counterpart. The reason is the difficulty of establishing multiple thresholds that

effectively isolate regions of interest, especially when the number of corresponding histogram modes is large. Typically, problems of this nature, if handled by thresholding, are best addressed by a single, variable threshold. Based on the preceding discussion, thresholding may be viewed as an operation that involves a test against a function T of the form
T = T[x, y, p(x, y), f(x, y)]
where f(x, y) is the gray level of point (x, y) and p(x, y) denotes some local property of this point, for example, the average gray level of a neighborhood centered on (x, y). A thresholded image g(x, y) is defined as

g(x, y) = 1 if f(x, y) > T
g(x, y) = 0 if f(x, y) ≤ T
Thus pixels labeled 1 (or any other convenient intensity level) correspond to objects, whereas pixels labeled 0 correspond to the background. When T depends only on f(x, y), the threshold is called global. Figure 5(a) shows an example of such a threshold. If T depends on both f(x, y) and p(x, y), the threshold is called local. If T depends on the spatial coordinates x and y, the threshold is called dynamic.
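A minimal sketch of global thresholding follows. The first function applies the definition of g(x, y) above for a given T; the second estimates T automatically and anticipates the iterative procedure described in the next answer (Q.6). The synthetic bimodal image is made up for illustration.

```python
import numpy as np

def threshold_image(f, T):
    """g(x, y) = 1 where f(x, y) > T, else 0: objects labeled 1, background 0."""
    return (f > T).astype(np.uint8)

def iterative_global_threshold(f, eps=0.5):
    """Start T at the image mean, then repeatedly move it to the midpoint of the
    two group means until it changes by less than eps (see Q.6 below)."""
    T = f.mean()
    while True:
        above, below = f[f > T], f[f <= T]
        T_new = 0.5 * (above.mean() + below.mean())
        if abs(T_new - T) < eps:
            return T_new
        T = T_new

# Synthetic bimodal image: background around 50, a bright square of objects around 200.
rng = np.random.default_rng(0)
f = rng.normal(50, 10, (64, 64)); f[20:40, 20:40] = rng.normal(200, 10, (20, 20))
g = threshold_image(f, iterative_global_threshold(f))
```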


Figure 5: Gray-level histograms that can be partitioned by (a) a single threshold and (b) multiple thresholds

Q.6 Explain Global Thresholding.
Ans. Global Thresholding
Thresholding is one of the most important approaches to image segmentation. The threshold can be treated as the class boundary. The idea can be applied to images that contain more than two types of regions. Evidently, the number of

thresholds is equal to the number of classes minus one. It should be noted that though thresholding is the simplest method of image segmentation, it usually precedes the selection of appropriate features to obtain a useful result. Secondly, the selection of a threshold is also a non-trivial task. An inappropriate threshold would incur significant and unacceptable classification error. The simplest of all thresholding techniques is to partition the image histogram by using a single global threshold, T. Segmentation is then accomplished by scanning the image pixel by pixel and labeling each pixel as object or background, depending on whether the gray level of that pixel is greater or less than the value of T. The threshold can be specified by using a heuristic approach, based on visual inspection of the histogram. The following algorithm can be used to obtain T automatically:
1. Select an initial estimate for T.
2. Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels with gray-level values > T, and G2, consisting of pixels with values ≤ T.
3. Compute the average gray-level values μ1 and μ2 for the pixels in the regions G1 and G2.
4. Compute a new threshold value:
T = (μ1 + μ2) / 2
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
When there is reason to believe that the background and objects occupy comparable areas in the image, a good initial value for T is the average gray level of the image. When objects are small compared to the area occupied by the background, one group of pixels will dominate the histogram and the average gray level is not as good an initial choice. A more appropriate initial value for T in such cases is a value midway between the maximum and minimum gray levels. The parameter T0 is used to stop the algorithm after changes become small in terms of this parameter. This is used when speed of iteration is an important issue.

Q.7 Explain the HSI color model.
Ans. The HSI Color Model
HSI stands for hue, saturation and intensity. When humans view a color object, it is described by its hue, saturation and brightness. Hue is a color attribute that describes a pure color (pure yellow, orange, or red), whereas saturation gives a measure of the degree to which a pure color is diluted by white light. Brightness is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation. Thus, intensity or gray level is a most useful descriptor of monochromatic images. This quantity definitely is measurable and

easily interpretable. The HSI model decouples the intensity component from the color-carrying information (hue and saturation) in a color image. As a result, the HSI model is an ideal tool for developing image processing algorithms based on color descriptions that are natural and intuitive to humans, who are the developers and users of these algorithms. The RGB color model is ideal for image color generation, but its use for color description is much more limited. The saturation of a color increases as a function of distance from the intensity axis. In fact, the saturation of points on the intensity axis is zero, as evidenced by the fact that all points along this axis are gray. Figure 7(a) shows an RGB color cube, whereas Figure 7(b) describes the HSI color model. In order to see how hue can be determined from a given RGB point, consider Figure 7(b), which shows a plane defined by three points (black, white and cyan). The fact that the black and white points are contained in the plane tells us that the intensity axis is also contained in the plane. Furthermore, all points contained in the plane segment defined by the intensity axis and the boundaries of the cube have the same hue. All colors generated by three colors lie in the triangle defined by those colors. If two of those points are black and white and the third is a color point, all points on the triangle would have the same hue, because the black and white components cannot change the hue. By rotating the shaded plane about the vertical intensity axis, we would obtain different hues. From these concepts, the hue, saturation and intensity values required to form the HSI space can be obtained from the RGB color cube. That is, we can convert any RGB point to a corresponding point in the HSI color model by working out the geometrical formulas describing the reasoning outlined above. The key point to keep in mind regarding the cube arrangement in Figure 7 and its corresponding HSI color space is that the HSI space is represented by a vertical intensity axis and the locus of color points that lie on planes perpendicular to this axis. As the planes move up and down the intensity axis, the boundaries defined by the intersection of each plane with the faces of the cube have either a triangular or hexagonal shape.
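The text does not give the conversion formulas explicitly, so the sketch below uses one common formulation of the RGB-to-HSI geometry (intensity as the mean of R, G, B; saturation as the normalized distance from the intensity axis; hue from the angle between the color point and the red axis). Treat it as an assumed, illustrative version rather than the text's own derivation.

```python
import numpy as np

def rgb_to_hsi(r, g, b):
    """Convert one RGB triple (each component normalized to [0, 1]) to (H, S, I)."""
    i = (r + g + b) / 3.0                                    # intensity
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i            # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12  # avoid division by zero
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta                   # hue in degrees
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> hue 0 degrees, full saturation
```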

Figure 7: Conceptual relationships between the RGB and HSI color models
