
DIP

Lecture 2
Image Processing Fields
Computer Graphics: The creation of images
Image Processing: Enhancement or other manipulation of the image
Computer Vision: Analysis of the image content
Image Processing is defined as a discipline in which both
the input and output of a process are images
A digital image is nothing more than a data matrix that
assigns both a value and a location to each picture element
(pixel) in the image. When considering the resolution of a
digital image we need to consider both of these aspects: the value of
the pixel and its position in the data matrix.
Spatial resolution is defined as the number of pixels used to create the
image or more simply the total number of pixels in the matrix.
Pixels vs. Dots per Inch
Pixel resolution refers to the number of pixels that comprise the image
regardless of how large the image is. A typical computer monitor has a
1024 x 768 pixel resolution whether it is a 15-inch or a 21-inch
monitor.
Dots per Inch or DPI refers to the number of pixels per linear inch in the
final image. Thus the DPI of a given image decreases as the size of the final
image increases.
Grey level resolution refers to the range of values that each pixel might
have. The greater the value range, the greater the grey level resolution.
This is often referred to as the bit depth of an image and it limits the
range of pixel values that can be displayed in the image.
Digital Image Definition
An image can be defined as a two-dimensional function f(x, y), where:
x, y: spatial coordinates
f: the amplitude at any pair of coordinates (x, y), which is called the
intensity or gray level of the image at that point.
x, y, and f are all finite and discrete quantities.
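As a minimal sketch (assuming Python with NumPy; the pixel values are
invented), a digital image is exactly such a matrix of discrete values
indexed by spatial coordinates:

```python
import numpy as np

# A tiny 3x4 grayscale "image": each entry is f(x, y), the intensity
# at row x, column y (hypothetical values in the 8-bit range 0-255).
f = np.array([[ 0,  50, 100, 150],
              [30,  80, 130, 180],
              [60, 110, 160, 210]], dtype=np.uint8)

x, y = 1, 2
print(f[x, y])    # intensity at spatial coordinate (1, 2) -> 130
print(f.shape)    # matrix dimensions -> (3, 4)
```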
Fundamental Steps in DIP: (Description)
Step 1: Image Acquisition
The image is captured by a sensor (e.g. a camera) and digitized, if
the output of the camera or sensor is not already in digital form,
using an analogue-to-digital converter.
Step 2: Image Enhancement
The process of manipulating an image so that the result is more
suitable than the original for a specific application. The idea behind
enhancement techniques is to bring out details that are hidden, or
simply to highlight certain features of interest in an image.
Step 3: Image Restoration
- Improving the appearance of an image.
- Restoration techniques tend to be based on mathematical or
probabilistic models of image degradation. Enhancement, on the other
hand, is based on human subjective preferences regarding what
constitutes a good enhancement result.
Step 4: Colour Image Processing
Use the colour of the image to extract features of interest in an image
Step 5: Wavelets
Wavelets are the foundation of representing images in various degrees
of resolution. They are used for image data compression.
Step 6: Compression
Techniques for reducing the storage required to save an image or the
bandwidth required to transmit it.
Step 7: Morphological Processing
Tools for extracting image components that are useful in the
representation and description of shape.
In this step, there would be a transition from processes that output
images, to processes that output image attributes.
Step 8: Image Segmentation
Segmentation procedures partition an image into its constituent parts
or objects. Important tip: the more accurate the segmentation, the more
likely recognition is to succeed.
Step 9: Representation and Description
- Representation: Make a decision whether the data should be
represented as a boundary or as a complete region. It almost always
follows the output of a segmentation stage.
- Boundary Representation: Focus on external shape characteristics,
such as corners and inflections
- Region Representation: Focus on internal properties, such as
texture or skeleton shape
Step 10: Recognition and Interpretation
Recognition: the process that assigns a label to an object based on the
information provided by its description.
Step 11: Knowledge Base
Knowledge about a problem domain is coded into an image processing
system in the form of a knowledge database.
Components of an Image Processing System
1. Image Sensors: Two elements are required to acquire digital images. The
first is the physical device that is sensitive to the energy radiated
by the object we wish to image (Sensor). The second, called a
digitizer, is a device for converting the output of the physical
sensing device into digital form.
2. Specialized Image Processing Hardware
Usually consists of the digitizer, mentioned before, plus hardware that
performs other primitive operations, such as an arithmetic logic unit (ALU),
which performs arithmetic and logical operations in parallel on entire
images.
This type of hardware sometimes is called a front-end subsystem, and its most
distinguishing characteristic is speed. In other words, this unit performs
functions that require fast data throughputs that the typical main computer
cannot handle.
3. Computer
The computer in an image processing system is a general-purpose computer and
can range from a PC to a supercomputer. In dedicated applications, sometimes
specially designed computers are used to achieve a required level of
performance.
4. Image Processing Software
Software for image processing consists of specialized modules that
perform specific tasks. A well-designed package also includes the
capability for the user to write code that, as a minimum, utilizes the
specialized modules.
5. Mass Storage Capability
Mass storage capability is a must in image processing applications.
An image of size 1024 * 1024 pixels, in which each pixel is an 8-bit
quantity, requires one megabyte of storage space if the image is not
compressed. Digital storage for image processing applications falls
into three principal categories:
1. Short-term storage for use during processing.
2. On-line storage for relatively fast recall.
3. Archival storage, characterized by infrequent access.
One method of providing short-term storage is computer memory. Another
is by specialized boards, called frame buffers, that store one or more
images and can be accessed rapidly, allowing virtually instantaneous
image zoom, as well as scroll (vertical shifts) and pan (horizontal
shifts).
On-line storage generally takes the form of magnetic disks and
optical-media storage. The key factor characterizing on-line storage is
frequent access to the stored data. Finally, archival storage is
characterized by massive storage requirements but infrequent need for
access.
6. Image Displays
The displays in use today are mainly color (preferably flat screen) TV
monitors. Monitors are driven by the outputs of the image and graphics
display cards that are an integral part of a computer system.
7. Hardcopy devices
Used for recording images; they include laser printers, film cameras,
heat-sensitive devices, inkjet units and digital units, such as optical
and CD-ROM disks.
8. Networking
Networking is almost a default function in any computer system in use
today. Because of the large amount of data inherent in image processing
applications, the key consideration in image transmission is bandwidth.

In dedicated networks, this typically is not a
problem, but communications with remote sites via the
internet are not always as efficient.
Lecture 3
Image is a spatial representation of an object or a
scene. (image of a person, place, object)
Graphic is a broader and general definition which
includes:
Pictures or photographs, drawings or line art, clip art,
buttons and banners, charts and graphs, backgrounds, icons
Pictures:
Pictures are found in the world which is external to the computers.
Images / Graphics:
Images are the 2-Dimensional digital representations of pictures found
in computers.
Types of Graphics: There are two basic types of graphics:
1. Bitmapped Graphics
2. Vector Graphics
Bitmap graphics
The most common and comprehensive form of storage for images on
computers is the bitmap image.
Bitmaps use combinations of blocks of different colours (known as
pixels) to represent an image. Each pixel is assigned a specific
location and colour value.
These are also called pixelized graphics.
The bitmapped graphic is stored as an array of dots, or pixels. Each
pixel gets assigned a specific color.
The more pixels you have, the more detailed the image can be
Some common bitmap graphics programs are:
Photoshop, Paint Shop Pro, GIMP, Photo-Paint, Graphic Converter.
These are paint programs.
Bitmap graphics are very flexible: any image can be represented (with
enough pixels). They are created by scanners, digital cameras, and
other similar devices, and are used most commonly, especially on the
web. They take a lot of memory, since the color of each pixel must be
stored, but they can be compressed.
[Figure: exaggerated example of a bitmap image]
Bitmap graphics
Advantage
Can have different textures on the drawings; detailed and
comprehensive.
Disadvantage
Large file size.
Not easy to make modifications to objects/drawings.
Vector Graphics
The second major type of computer graphics. Vector graphics are
created and manipulated using drawing programs.
Instead of using pixels to describe the image, it describes the image
using shapes:
circles, lines, curves.
The colour of these shapes also has to be stored.
A verbal example would be something like:
A yellow circle with a center here and a radius of x, a purple
line from here to here
The programs used with vector graphics are drawing programs.
Some of these programs include:
Corel Draw, Adobe Illustrator, Acrobat.
Vector Images: Easy to change parts of the image since each part is
stored as a different shape
Can manipulate the image smoothly
Rotating, changing color, size, line width, etc.
Typically takes less memory and disk space than a bitmap
Must be converted to a bitmap to display
Vector Graphics
Advantage: Small file size.
Easy to edit the drawings as each object is independent of the
other.
Disadvantage
Objects/drawings cannot have texture; they can only have plain
colours, which limits the level of detail that can be presented in an
image.
Bitmap vs. Vector Images
Bitmap and vector images are obviously different
Both have strengths and weaknesses
They don't manipulate images in the same way.
They don't store images in the same way.
The images are edited differently
Color Depth
Image formats will only allow a certain number of different colors for
each pixel
Depends on the number of bits assigned to each pixel
The format stores the level of each of the three values for RGB to
specify color
R,G and B are referred to as components
Bit depth describes the number of colors that can be used in the image
Bit Depth: A 24-bit image allows you to specify up to 2^24 different
colors in your image.
The 24 in 24-bit says how many bits it takes to store a single color.
Transparency
Allows part of the image to be transparent; the background will show
through.
Often used with graphics on web pages. Not all image formats support
transparency.
Resolution: There are three types of resolution, measuring different
aspects of the quality, detail and size of an image:
Colour resolution, Image resolution, Display resolution
Image Resolution:
The term resolution is often associated with an image's degree of
detail or quality.
Display Resolution:
Resolution can also refer to quality capability of graphic output (monitor).
Colour Resolution / Colour Depth:
Colour depth describes the number of bits used to represent the colour
of a single pixel.
Image resolution measures the pixel dimensions of an overall image, or
how many pixels the image has. Image resolution is measured in width
and height. For example, a 100 * 100-pixel image has a total of 10,000
pixels.
Display resolution
Display resolution is also measured in pixels, in terms of height and
width. It simply means how many pixels can be displayed on the computer
screen. Display resolution normally uses a setting of 640x480 (VGA),
800x600 (SVGA), 1024x768, etc. You can change the display resolution
under Display Properties in the Control Panel. If your image resolution
is bigger than the display resolution, part of the image will fall
outside the display area.
Colour Resolution/Colour Depth
Each pixel can represent at least 2 possible colours or more. Colour
resolution (also called colour depth or channel depth) is measured in
bits.
Memory/Storage Requirements
Factors to consider:
The height of the graphic, the width of the graphic, and the colour
depth or bit depth.
The file size of a bitmap image (in bytes):
Height X Width X (Colour depth / 8)
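This formula transcribes directly into Python (a sketch; the function
name is ours):

```python
def bitmap_file_size_bytes(height_px, width_px, colour_depth_bits):
    """Uncompressed bitmap size in bytes: Height x Width x (Colour depth / 8)."""
    return height_px * width_px * colour_depth_bits // 8

# e.g. a 640 x 480 image at 24-bit colour depth:
print(bitmap_file_size_bytes(480, 640, 24))  # 921600 bytes (900 kB)
```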
Types of Images
Binary (Bitonal) Image
These images have two possible values of pixel intensity: black and
white. Also called 1-bit monochrome images, since they contain only
black and white. Typical applications of bitonal images include
office/business documents, handwritten text, line graphics, engineering
graphics, etc. The scanned output contains a sequence of black or white
pixels. Binary 1 represents a black pixel and binary 0 represents a
white pixel.
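A minimal sketch of producing a bitonal image from a grayscale one by
thresholding (assuming NumPy; the threshold of 128 is an arbitrary
choice, and we follow the 1-is-black convention above):

```python
import numpy as np

def to_bitonal(gray, threshold=128):
    """Map a grayscale image to binary: 1 for dark (black) pixels,
    0 for light (white) pixels."""
    return (gray < threshold).astype(np.uint8)

gray = np.array([[10, 200], [130, 90]], dtype=np.uint8)
print(to_bitonal(gray))   # [[1 0]
                          #  [0 1]]
```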
Grayscale Image
They contain several shades of grey. Typical applications of grayscale
images include newspaper photographs (non-colour), magnetic resonance
images and CAT scans. An uncompressed grayscale image can be
represented by n bits per pixel, so the number of gray levels supported
will be 2^n. For example, an 8-bit grayscale image consists of 256 gray
levels. A dark pixel might have a pixel value of 0, a bright one might
be 255.
Colour Image
They are characterized by the intensity of three primary colours
(RGB). For example, a 24-bit image has 24 bits per pixel: 8 bits for
R (Red), 8 bits for G (Green), 8 bits for B (Blue). Since each value is
in the range 0-255, this format supports 256 x 256 x 256 = 2^24 =
16,777,216 different colours.
RGBA / 32-bit images
An important point: many 24-bit colour images are actually stored as
32-bit images, with the extra byte of data for each pixel used to store
an alpha value representing the degree of transparency. This allows an
RGBA colour scheme: Red, Green, Blue, Alpha.
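One way to picture the 32-bit RGBA layout, as an illustrative sketch
only (the packing order is an assumption; real file formats may arrange
the four bytes differently):

```python
def pack_rgba(r, g, b, a):
    """Pack four 8-bit channels into one 32-bit value (R in the high
    byte here; actual formats may use another byte order)."""
    return (r << 24) | (g << 16) | (b << 8) | a

pixel = pack_rgba(255, 128, 0, 200)   # an orange pixel, mostly opaque
print(hex(pixel))                     # 0xff8000c8
```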
Colour Dithering
Usually, digitised images are 24-bit, with 16-million-colour depth. If
the display system is limited to fewer than 16 million colours, the
image must be transformed for display in the lesser colour environment
(colour dithering).
Colour dithering is the process through which colours are changed to
meet the closest available colour.
Colours are substituted with the closest colours available on the
output device. The quality of dithering will depend on the algorithm
used to select the closest colour, as the sketch below illustrates.
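As a minimal sketch of the simplest such substitution (assuming NumPy;
the four-colour palette is made up, and real dithering algorithms such
as error diffusion go further than plain nearest-colour matching):

```python
import numpy as np

def nearest_palette_colour(pixel, palette):
    """Substitute a pixel with the closest available palette colour,
    using squared Euclidean distance in RGB space."""
    d = np.sum((palette.astype(int) - np.asarray(pixel, dtype=int)) ** 2, axis=1)
    return palette[np.argmin(d)]

palette = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0], [0, 0, 255]])
print(nearest_palette_colour((200, 30, 40), palette))   # -> [255 0 0] (red)
```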
Graphic File Formats
Once we've chosen which type of graphics we're going to use, we have to
select a file format.
Each way of storing an image is called a file format. Each file format
converts the image to a corresponding string of bits differently in
order to store them on disk or transmit them over the Internet. It's
beyond the scope of this course to discuss how these images are
converted, but we will discuss the common file formats.
Common File Formats
Bitmap Formats
GIF: graphics interchange format, JPEG: joint photographic experts group,
PNG: portable network graphic, BMP: Windows bitmap, TIFF: tagged image
file format
Vector Formats
SVG: scalable vector graphics, EPS: encapsulated postscript,
CMX: Corel meta exchange, PICT: Macintosh Picture, WMF: Windows metafile
Other Graphic File formats
RAW Graphics File Format (.raw)
A flexible basic file format for transferring files between applications and
computer platforms. This format consists of a stream of bytes describing the
colour information in the file.
Tagged Image File Format (.tif, .tiff)
TIFF is mainly used for exchanging documents between different applications
and different computer platforms. It supports the LZW compression
method for its image types.
Truevision Targa (.tga)
Developed by Truevision Inc., TGA is a file format that supports
images suitable for display on Targa hardware, but it is supported by
many applications on a wide range of platforms.
Z Soft Paintbrush (.pcx)
Bitmap graphics file format, originally developed by Z-Soft for use with PC-
Paintbrush. This file format is now used and generated by many applications
and scanners.
Image Compression is used to make files smaller
Bitmap images, especially, can be quite large if stored without compression
Ex. A 640 x 480 pixel image with 24-bit color depth would take:
640 x 480 x 24 bits = 7,372,800 bits = 921,600 bytes = 900 kB
The Windows BMP format is uncompressed. It should NEVER be used online.
There are two main types of compression:
Lossy and Lossless
When an image is compressed and later uncompressed with a lossless
algorithm, the image is exactly the same. This is the most common type:
no information is lost.
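To make this concrete, here is a toy lossless scheme, run-length
encoding (illustrative only; it is not the algorithm PNG or GIF use):
decoding exactly reverses encoding, so no information is lost.

```python
def rle_encode(pixels):
    """Run-length encode a sequence of pixel values as (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

def rle_decode(runs):
    """Exactly reconstruct the original sequence: nothing is lost."""
    return [p for p, n in runs for _ in range(n)]

row = [0, 0, 0, 255, 255, 0]
assert rle_decode(rle_encode(row)) == row   # round-trip is exact
```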
Lossy algorithms produce images which may be slightly different before
compression and after uncompressing. Some information may be lost. They
can sometimes produce much smaller files than a lossless algorithm, and
can be used when the difference isn't very noticeable. They often take
human perception into account: we can't tell the difference between
certain values of color.
JPEG
The JPEG format was designed for photographs at high quality and in
full color.
Lossy compression format.
Normally changes can't be seen except at the lowest quality settings.
Also good for 8-bit grayscale.
GIF and PNG
GIF uses the LZW compression algorithm developed by Unisys. It is
patented (or was patented). PNG was created as an alternative to GIF:
free!
Both are used frequently on the web. For this course, you should try
and use PNG instead of GIF: it is considered a better algorithm, and
has no patent issues.
Common File Formats for the Web
The three most common file formats on the web are:
JPEG (.jpg) GIF, (.gif) ,PNG (.png)
Most programs have their own file format, called a native format.
These formats are unique to the program.
They are good because you can save without losing any information:
they use no compression or lossless compression.
JPEG: Joint Photographic Experts Group. .jpg is the file extension
(normally).
Bitmap format. Lossy compression. 24-bit color or 8-bit grayscale,
like you'd get from a digital camera or scanner.
GIF: Graphics Interchange Format. .gif file extension.
Bitmap format. Lossless compression.
Used a lot on the web. Good for simple animations.
Limited to 8-bit color (256 colors).
The compression algorithm is patented.
PNG: Portable Network Graphic.
Bitmap format.
Lossless compression. A replacement for GIF (no patent!).
Better compression algorithm than GIF. Supports 24-bit color.
How to Pick a File Format?
Type: does the file format store the type of image you want to use
(bitmap or vector)?
Portability: can other people use images in this format?
Color Depth: does the format support the number of colors you need?
Compression: can make the file smaller, but it takes time to compress
and decompress a file
Transparency: do parts of your image need to be transparent?
Summary
1. Types of graphics
- bitmap graphic
- vector graphic
2. Resolution
- Image resolution
- Display resolution
- Colour resolution
3. Graphic file size = Height X Width X (Colour depth / 8)
4. Types of images
- Binary/Bitonal image
- Grayscale image
- Colour image
- RGBA/32-bit image
Summary
6. Colour Dithering
7. Graphics File Formats
8. Image Compression
~END~
Lecture 4: Light and Electromagnetic Perception
What is perception? Perception is about perceiving.
What tools do we have for perceiving? We see, smell, hear, feel, taste.
What tools do we have for perceiving?
Eyes, ears, nose, tongue, skin:
eyes see,
ears hear,
the nose smells,
the tongue tastes,
skin feels (temperature),
skin feels (touch).
What do eyes, a nose, ears, a tongue, and skin do?
What does the eye do? Seeing.
Yes, but what is seeing?
Detecting light. What is light?
Electromagnetic energy.
What is electromagnetic energy?
Electro + magnetic + energy = electromagnetic energy.
Electromagnetic energy is a stream of photons. What are photons?
They are massless particles each traveling in a wave-like pattern and
moving at the speed of light. The smallest (quantum) unit of
light/electromagnetic energy. It is the carrier of electromagnetic
radiation of all wavelengths
such as gamma rays, X-rays, ultraviolet light, visible light, infrared
light, microwaves, and radio waves.
So what is seeing?
Detecting some form of the movement of photons (electromagnetic
radiation).
What does the ear do? It hears sound.
What is sound? The vibration of air.
What is the vibration of air? Where does it come from?
Air moves when something else moves.
What does the nose do? The nose is for smelling. Where does smelling
come from?
Smelling is a sensation caused by odorant molecules dissolved in the
air.
Taste: how do we get that?
From our tongues. From the foods we eat. From specific chemicals.
What does the skin do?
It detects temperature. What is temperature?
It is about how hot or cold something is.
How does something get hot or cold?
Temperature is the result of the motion of particles which
make up a substance. Temperature increases as the energy of
this motion increases.
What does the skin do?
It feels touch.
It responds to mechanical stimulation or pressure.
So, what is perception?
Eye, ear, nose, tongue & skin
They are sensors.
They are detecting some kind of changes in an environment.
What is perception?
Perception involves:
Detecting the information in the environment.
Sending the information to the brain, and interpreting it.
Perception is about
Detecting and interpreting
Measuring Perception
Psychophysical level of analysis
Description
Let a person describe what they see
Recognition
Detection
The absolute threshold is the smallest amount of stimulus energy
necessary to detect a stimulus. E.g., an eye exam.
Difference threshold
The difference threshold is the smallest difference between two stimuli
that a person can detect.
Electromagnetic perception
Electromagnetic perception is the ability to perceive the entire
electromagnetic spectrum, from radio waves to light waves to electrical
energy; one with this ability can even sense disturbances in the Sun's
and Earth's electromagnetic fields.
What do we know so far about electromagnetic waves?
Electromagnetic waves (EM waves) carry energy.
EM waves can travel through empty space OR matter.
ALL EM waves are transverse waves.
What more is there to know about EM waves?
EM waves have a wide range of wavelengths/frequencies.
Each wavelength/frequency of EM waves is a different type of energy.
gamma rays, X-rays, ultraviolet radiation, visible light, infrared
radiation, microwaves, television/radio waves
Most of the EM energy that makes it to the surface of Earth is in the
middle range of the electromagnetic spectrum.
What is ultraviolet radiation?
Ultraviolet radiation is the range of EM waves with frequencies higher
than violet on the visible spectrum. (ULTRA-violet)
They have shorter wavelengths, which means higher frequencies, so we
know they have more energy than visible light.
Because of the high energy of ultraviolet radiation, too much exposure
is damaging to the eyes and skin.
What is visible light?
Visible light is the range of EM waves that can be detected by the
human eye.
The entire range of visible light is called the visible light spectrum.
Violet (highest frequencies / shortest wavelengths / highest energy)
Indigo
Blue
Green (middle frequencies / middle wavelengths / middle energy)
Yellow
Orange
Red (lowest frequencies / longest wavelengths / lowest energy)
The visible light spectrum is the middle range of
wavelengths/frequencies of EM spectrum.
The longer the wavelength, the lower the energy; the shorter the
wavelength, the higher the energy.
Red has lowest energy; violet has the highest energy.
Why do we see color based on the visible light spectrum?
The human eye reacts to different energies and frequencies of light so
that different colors are seen.
Higher frequencies (shorter wavelengths) are perceived as colors toward
the blue-violet range and have higher energy.
Lower frequencies (longer wavelengths) are perceived as colors toward
the orange-red range and have lower energy.
The reason we see stuff in color:
Most materials absorb light of some frequencies and reflect the rest.
If a material absorbs a certain frequency of light, that frequency will
not be reflected, so its color will not be perceived by the observer.
If the material does not absorb a certain frequency of light, that
frequency will be reflected, so its color will be perceived by the
observer.
If all colors of light are reflected by a material, it will appear
white. If all colors of light are absorbed by a material, it will
appear black.
Color filters allow only certain colors of light to pass/transmit
through them; they absorb or reflect all other colors.
For example, a blue filter only transmits blue light.
Objects seen through a blue filter will look blue if the objects
reflect blue. Objects of other colors will appear black because the
other color wavelengths are being absorbed by the filter.
What is infrared radiation?
Infrared radiation is the range of electromagnetic waves with
frequencies lower than red on the visible spectrum, thereby having
longer wavelengths and less energy than red wavelengths. All objects
emit infrared radiation, and hotter objects emit more infrared
radiation than cooler objects. Heat energy is transmitted by infrared
radiation. When objects absorb infrared radiation, they become warmer.
Reflected Light
The colours that we perceive are determined by the nature of the light
reflected from an object. For example, if white light is shone onto a
green object, most wavelengths are absorbed, while green light is
reflected from the object.
Lecture 5: Visual Perception
CORNEA: light enters through the cornea and is bent onto the lens.
PUPIL: light passes through the SPACE called the pupil.
IRIS: a band of muscle that contracts/expands to manage the amount of
light entering the pupil and hitting the lens.
LENS: the lens bends the light further and focuses it on the retina,
in particular on the FOVEA. The lens can bulge or stretch to help the
light reach the fovea.
RETINA: registers the electromagnetic energy, processes the incoming
information and transduces the energy to a form the brain can
interpret.
BLIND SPOT: where the optic nerve attaches to the retina; there are no
photoreceptors here.
PHOTORECEPTORS: 2 types:
Rods: 125 million
Cones: 6.5 million
The lens focuses light from objects onto the retina.
The retina is covered with light receptors called cones (6-7 million)
and rods (75-150 million).
Cones are concentrated around the fovea and are very sensitive to
colour.
Rods are more spread out and are sensitive to low levels of
illumination.
Rods VS Cones
RODS
Black and white vision
Operate well in low-level light (night vision)
Sensitive to brightness, darkness and movement
Mainly located in the outer part of the retina
Low sharpness and focus
19x the number of cones
CONES
Daytime and colour vision
Excellent sharpness and clean images
Concentrated in the fovea
Not useful at night (can't discriminate colours)
Outnumbered by rods 19:1
Muscles within the eye can be used to change the shape of the lens,
allowing us to focus on objects that are near or far away.
An image is focused onto the retina, causing rods and cones to become
excited, which ultimately send signals to the brain.
Visual sensation is the (passive) process of detecting input.
Example: Eyes receive visual sensations and turn them into neural
impulses. The brain perceives or translates them into usable information.
Visual Sensation Stages
Reception: the eye senses a stimulus.
Transduction: changes it so the brain can understand it.
Transmission: sends it to the visual cortex.
Reception or capture of visual stimuli in the retinas of the eye by
sensory neurons called photoreceptors.
Electromagnetic energy (light) sensed from the environment
RODS (black and white) and CONES (C for colour) are
photoreceptors that pick up the electromagnetic energy
This energy must be CHANGED for the brain to be able to interpret
it.
Transduction or conversion of electromagnetic energy (light energy)
into electrochemical energy (or neural impulses) by photoreceptors
The electromagnetic energy is converted, changed or transduced
into electrochemical energy.
This means that information sensed by the rods and cones can be
sent as neural impulses to the visual cortex.
Transmission of neural impulses, via one neuron to another, through the
optic nerve to the brain.
Neural impulses once triggered are sent to visual cortex
They are sent via the optic nerve.
Visual Perception
This area of study focuses on perception and the general
characteristics of the visual sensory system.
It is the (active) ability to interpret input
Visual Perception Stages
Selection: aspects of the stimulus are selected.
Organisation: grouping of elements to form a whole.
Interpretation: given meaning with the aid of psychological factors.
Selection of features of a stimulus by specialised neurons in the visual
cortex called feature detectors.
This is where discrimination and identification of the FEATURES
of the stimulus takes place.
Feature detectors are cells in the retina, optic nerve
and visual cortex that respond to patterns, lines, edges and
angles.
Organisation of the stimulus features into patterns or groupings to
closely represent the original stimulus in a meaningful way.
This can only happen once the brain has received the neural
impulses.
Single elements are grouped to form a whole, using perceptual
principles that work like rules of organisation.
These principles are called Gestalt principles.
Interpretation or understanding of what the stimulus represents in the
external world.
Interpretation involves the brain using psychological factors in order to
make sense of what it is considering.
Absolute Threshold
- The absolute threshold is the lowest level of a stimulus (light,
sound, touch, etc.) that an organism can detect.
- For example, the absolute threshold for the visual sensory system
enables us to see a candle flame at 50 kilometres.
Visual Sensation vs. Visual Perception
Visual Sensation:
- Physiological
- Visual sensation is the same for everyone; it is determined by the
physiological make-up of the eye and the way it functions.
- Stages: Reception, Transduction, Transmission
Visual Perception:
- Psychological and physiological
- Visual perception differs, as everyone perceives and interprets
things differently.
- When studying visual perception it is difficult to say where
sensation ends and perception begins, so we see it as an interrelated
process.
- Stages: Selection (also occurs in sensation through the reception
process), Organisation, Interpretation
Differential Threshold
- The differential threshold is the smallest difference in the amount
of a given stimulus that a specific sense can detect.
- For example, the differential threshold enables us to perceive the
difference between two separate stimuli (e.g. when we go to an
optometrist). When a change can no longer be detected, the difference
has fallen below our threshold.
Interpretation
Visual cortex: the part of the brain responsible for processing visual
information.
Photoreceptors: a specialized type of neuron found in the eye's retina,
capable of phototransduction.
Neuron: an electrically excitable cell that processes and transmits
information through electrical and chemical signals.
Neural impulse: an electric discharge that travels along a nerve fiber.
Physiological: of the body.
Psychological: of the brain.
Stimulus: an energy pattern (light or sound) which is registered by the
senses.
~END~
Lecture 6: Image representation, sensing, acquisition, sampling and
quantization
Image representation Image sensing Image acquisition Image
sampling Image quantisation
A digital image is composed of M rows and N columns of pixels
each storing a value.
Pixel values are most often grey levels in the range 0-255 (black to
white).
Sampling and quantization convert a continuous image signal f to a
discrete digital form.
Digitizing the coordinate values is called sampling
Digitizing the amplitude is called quantization
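A compact sketch of both steps on a one-dimensional signal (assuming
NumPy; the sine function merely stands in for a continuous image
signal f):

```python
import numpy as np

# A stand-in for a continuous image signal f (values in [0, 1]).
def f(x):
    return 0.5 * (1 + np.sin(2 * np.pi * x))

# Sampling: digitize the coordinate values (here, 8 sample positions).
x_samples = np.linspace(0.0, 1.0, 8)
samples = f(x_samples)

# Quantization: digitize the amplitude to 256 grey levels (0-255).
quantized = np.round(samples * 255).astype(np.uint8)
print(quantized)
```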
Image Sensing
Process in which an optical image is converted into an electronic
signal.
Image sensors are used in digital cameras and other camera modules.
Incoming energy lands on a sensor material responsive to that type of
energy and this generates a voltage
Collections of sensors are arranged to capture images
Image Acquisition
Images are typically generated by illuminating a scene and absorbing
the energy reflected by the objects in that scene
The first stage of any vision system is the image acquisition stage.
After the image has been obtained, various methods of processing can be
applied to the image to perform the many different vision tasks
required today.
However, if the image has not been acquired satisfactorily then the
intended tasks may not be achievable, even with the aid of some form of
image enhancement
2D Image Input
2D Input Devices
Frame Stores
The 3D Image -- Depth Maps
Why use 3D data?
Methods of Acquisition
2D Image Input
The basic two-dimensional image is a monochrome (greyscale) image which
has been digitized.
Describe image as a two-dimensional light intensity function f(x,y)
where x and y are spatial coordinates and the value of f at any point
(x, y) is proportional to the brightness or grey value of the image at
that point.
A digitized image is one where spatial and greyscale values have been
made discrete.
intensity measured across a regularly spaced grid
in x and y directions
Each element in the array is called a pixel (picture element).
TV Camera
A first choice for a two-dimensional image input device may be a
television camera; its output is a video signal.
The image is focused onto a photoconductive target.
The target is scanned line by line horizontally by an electron beam.
An electric current is produced as the beam passes over the target.
This current is tapped to give a video signal.
Limited resolution and distortion problems.
CCD Camera
By far the most popular two-dimensional imaging device is
the charge-coupled device (CCD) camera.
A single IC device.
It consists of an array of photosensitive cells;
each cell produces an electric current dependent on the incident
light falling on it.
Video signal output.
Less geometric distortion.
More linear video output.
Frame Stores
The digitization of the video signal is usually performed by a device
known as a frame store. It:
- Digitizes the incoming video signal.
- Samples the signal into discrete pixels at appropriate intervals,
line by line.
- Quantizes each sample into a digital value.
- Stores the sampled frame in its own memory.
The frame is easily transferred to computer memory or a file.
3D Image -- Depth Maps
The simplest and most convenient way of representing and storing the
depth measurements taken from a scene is a depth map.
A depth map is a two-dimensional array where the x and y distance
information corresponds to the rows and columns of the array as in an
ordinary image, and the corresponding depth readings (z values) are
stored in the array's elements (pixels). A depth map is like a grey
scale image, except that the z information (e.g. a 32-bit float)
replaces the intensity information.
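A minimal representation sketch (assuming NumPy; the depth readings are
invented):

```python
import numpy as np

# A 2x3 depth map: rows and columns are positions as in an ordinary
# image, but each element stores a depth (z) reading in metres
# (hypothetical values), as a 32-bit float instead of an intensity.
depth = np.array([[1.20, 1.21, 3.50],
                  [1.19, 1.22, 3.48]], dtype=np.float32)

print(depth[0, 2])    # depth reading at row 0, column 2 -> 3.5
print(depth.dtype)    # float32 (32 bits per element)
```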
Why use 3D data?
A 3D image has many advantages over its 2D counterpart:
2D images give only limited information about the physical shape and
size of an object in a scene.
3D images express the geometry in terms of three-dimensional
coordinates, e.g. the size (and shape) of an object in a scene can be
straightforwardly computed from its three-dimensional coordinates.
Methods of Acquisition
Laser Ranging Systems
Laser ranging works on the principle that the
surface of the object reflects laser light
back towards a receiver which then measures the time (or phase
difference) between transmission and reception in order to
calculate the depth.
They work at long distances (greater than 15 m); consequently their
depth resolution is inadequate for detailed vision tasks.
Shorter-range systems exist but still have an inadequate depth
resolution (1 cm at best) for most practical industrial vision
purposes.
Image Sampling and Image quantization
In order to become suitable for computer processing, an image function
f(x,y) must be digitized both spatially and in amplitude.
Typically, a camera is used to digitize an image. A frame grabber or
digitizer is used to sample and quantize the analog video signal and
store it in the so-called frame buffer.
The sampling rate (or pixel clock) determines the spatial resolution of
the digitized image, while the quantization level determines the number
of grey levels in the digitized image
Spatial resolution:
- # of samples per unit length or area.
Gray level resolution:
- Number of bits per pixel
A continuous image f(x,y) is normally approximated by equally spaced
samples arranged in the form of an N x M array where each element of
the array is a discrete quantity.
The sampling rate (or pixel clock) of the digitizer determines
the spatial resolution of the digitized image.
The finer the sampling (i.e., the larger M and N) the better the
approximation of the continuous image function f(x,y).
A digital sensor can only measure a limited number of samples at a
discrete set of energy levels.
The sampling limit is established by the number of sensors.
Image quantization
The magnitude of the sampled image is expressed as a digital value in
image processing.
The transition between continuous values of the image function
(brightness) and its digital equivalent is called quantization.
The number of quantization levels should be high enough for human
perception of fine shading details in the image.
Image quantization
The occurrence of false contours is the main problem in images which
have been quantized with insufficient brightness levels.
This effect arises when the number of brightness levels is lower than
that which humans can easily distinguish.
This number is dependent on many factors -- for example, the average
local brightness -- but displays which avoid this effect will normally
provide a range of at least 100 intensity levels.
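A sketch of re-quantizing an 8-bit image to k equally spaced brightness
levels (assuming NumPy); applying it with a small k to a smooth grey
ramp produces exactly the banding (false contours) described above:

```python
import numpy as np

def requantize(img_u8, k):
    """Reduce an 8-bit image to k equally spaced brightness levels."""
    step = 256 / k
    levels = (img_u8 // step).astype(np.uint8)          # 0 .. k-1
    return (levels * step + step / 2).astype(np.uint8)  # back to 0-255 range

ramp = np.tile(np.arange(256, dtype=np.uint8), (4, 1))  # smooth grey ramp
coarse = requantize(ramp, 8)   # only 8 levels: visible banding
```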
Image quantization
Most digital image processing devices use quantization into k equal
intervals.
If b bits are used, the number of brightness levels is k = 2^b.
Eight bits per pixel are commonly used; specialized measuring devices
use 12 or more bits per pixel.
The quantization limit is established by the number of colour levels.
Quantisation is the process of converting a continuous analogue signal
into a digital representation of this signal.
Lecture 7: Basic Transformations
Table of Contents
1. General Concepts
2. Translation
3. Scaling
4. Rotation
5. Concatenation
Basic Idea
In many imaging systems, the acquired images are subject to geometric
distortion.
Applying a basic transformation to a uniformly distorted image can
correct for a range of perspective distortions by transforming the
measurements from the ideal coordinates to those actually used.
For example, this is useful in satellite imaging, where geometrically
correct ground maps are desired.
2D Geometric Transforms
An affine transformation is an important class of linear 2-D geometric
transformations which maps variables (e.g. pixel intensity values
located at position (x, y) in an input image) into new variables
(e.g. (x', y') in an output image) by applying a linear combination of
basic operations.
Two-dimensional geometric transforms are used to rotate, shear,
translate or zoom (scale) whole images or sometimes parts of them.
Remarks
These operations are very common in computer
graphics.
Any linear operation can be written in matrix form using homogeneous
coordinates.
We will consider the following basic transformations in a 3D Cartesian
coordinate system.
Translation
Suppose that the task is to translate a point with coordinates
(X, Y, Z) to a new location by using displacements (X0, Y0, Z0).
The translation is easily accomplished by using the equations:
X* = X + X0
Y* = Y + Y0
Z* = Z + Z0
The three equations may be expressed in matrix form. To be able to
concatenate several transformations, the use of square matrices and
homogeneous coordinates simplifies the notational representation of
this process. With this in mind, the above equations can be written as
follows:
[X*]   [1 0 0 X0] [X]
[Y*] = [0 1 0 Y0] [Y]
[Z*]   [0 0 1 Z0] [Z]
[1 ]   [0 0 0 1 ] [1]
This creates a unified matrix representation v* = Tv, where T is a
4 x 4 transformation matrix, v is the column vector containing the
original (homogeneous) coordinates, and v* is a column vector whose
components are the transformed coordinates. With this notation, the
matrix T for translation is:
T = [1 0 0 X0]
    [0 1 0 Y0]
    [0 0 1 Z0]
    [0 0 0 1 ]
Example of Translation
Scaling
The scale operator performs a geometric transformation which shrinks
or zooms the size of an image (or part of an image).
Image reduction and zooming are the two image scaling operations.
Image reduction, commonly known as subsampling, is performed by
replacement (of a group of pixel values by one arbitrarily chosen pixel
value from within this group) or by interpolating between pixel values
in a local neighborhood.
Image zooming is achieved by pixel replication or by interpolation.
The subsampling, for example, can be performed by (see the sketch
below):
(a) Replacement with the upper-left pixel.
(b) Interpolation using the mean value.
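Both reduction methods in miniature (a sketch assuming NumPy and even
image dimensions):

```python
import numpy as np

def subsample_replacement(img):
    """(a) Keep the upper-left pixel of every 2x2 block."""
    return img[::2, ::2]

def subsample_mean(img):
    """(b) Replace every 2x2 block by the mean of its pixel values."""
    h, w = img.shape
    blocks = img.reshape(h // 2, 2, w // 2, 2).astype(float)
    return blocks.mean(axis=(1, 3))

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(subsample_replacement(img))   # [[ 0  2]
                                    #  [ 8 10]]
print(subsample_mean(img))          # [[ 2.5  4.5]
                                    #  [10.5 12.5]]
```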
Scaling by factors Sx, Sy, and Sz along the X, Y, and Z axes is given
by the transformation matrix:
S = [Sx 0  0  0]
    [0  Sy 0  0]
    [0  0  Sz 0]
    [0  0  0  1]
Example of Scaling
Rotation
The transformations used for 3-D rotation are inherently more complex.
The simplest form of these transformations is for rotation of a point
about the coordinate axes.
Rotation of a point about the Z coordinate axis by an angle θ is
achieved by using the transformation:
R(θ) = [ cos θ   sin θ   0   0]
       [-sin θ   cos θ   0   0]
       [   0       0     1   0]
       [   0       0     0   1]
Rotation of a point about the X coordinate axis by an angle α is
performed by using the transformation:
R(α) = [1    0       0     0]
       [0  cos α   sin α   0]
       [0 -sin α   cos α   0]
       [0    0       0     1]
Finally, rotation of a point about the Y axis by an angle β is achieved
by using the transformation:
R(β) = [cos β   0  -sin β   0]
       [  0     1     0     0]
       [sin β   0   cos β   0]
       [  0     0     0     1]
Concatenation
The application of several transformations can be represented by a
single 4 x 4 transformation matrix.
For example, translation, scaling, and rotation about the Z axis of a
point v are given by:
v* = R(θ)(S(Tv)) = Av
where A is the 4 x 4 matrix A = R(θ)ST.
Matrices generally do not commute, so the order in which the
transformations are applied is important.
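A small NumPy sketch of these matrices and their concatenation (the
function names are ours; the rotation uses the about-Z convention
written above):

```python
import numpy as np

def translation(x0, y0, z0):
    """4x4 homogeneous translation matrix T."""
    T = np.eye(4)
    T[:3, 3] = [x0, y0, z0]
    return T

def scaling(sx, sy, sz):
    """4x4 scaling matrix S with factors along X, Y, Z."""
    return np.diag([sx, sy, sz, 1.0])

def rotation_z(theta):
    """4x4 rotation about the Z axis, in the convention above."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c,  s, 0, 0],
                     [-s,  c, 0, 0],
                     [ 0,  0, 1, 0],
                     [ 0,  0, 0, 1]])

v = np.array([1.0, 2.0, 3.0, 1.0])        # a point in homogeneous coordinates
A = rotation_z(np.pi / 2) @ scaling(2, 2, 2) @ translation(1, 0, 0)
v_star = A @ v                            # translate, then scale, then rotate
print(v_star)                             # approximately [ 4. -4.  6.  1.]

# Order matters: translation(1,0,0) @ scaling(2,2,2) @ rotation_z(np.pi/2)
# applied to v gives a different result in general.
```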
