You are on page 1of 52

Contents

1. ABSTRACT...............................................................................2
2. INTRODUCTION........................................................................3
3. FEASILBILITY STUDY...............................................................23
4. SYSTEM PROFILE....................................................................29
5. TOOLS AND TECHNOLOGY.......................................................30
6. SYSTEM FLOW DIAGRAM.........................................................33
7. TESTING................................................................................37
6. FORM LAYOUT........................................................................43
7. CONCLUSION..........................................................................49
8. FUTURE ENHANCEMENTS........................................................51
9. BIBLIOGRAPHY.......................................................................52

1. Abstract
What is Steganography?
Steganography is the practice of hiding private or sensitive information within something that
appears to be nothing out to the usual. Steganography is often confused with cryptology because
the two are similar in the way that they both are used to protect important information. The
difference between two is that steganography involves hiding information so it appears that no
information is hidden at all. If a person or persons views the object that the information is hidden
inside of he or she will have no idea that there is any hidden information, therefore the person
will not attempt to decrypt the information.
What steganography essentially does is exploit human perception, human senses are not trained
to look for files that have information inside of them, although this software is available that can
do what is called Steganography. The most common use of steganography is to hide a file inside
another file.

2. Introduction
One of the reasons that intruders can be successful is that most of the information they acquire
from a system is in a form that they can read and comprehend. Intruders may reveal the
information to others, modify it to misrepresent an individual or organization, or use it to launch
an attack. One solution to this problem is, through the use of steganography. Steganography is a
technique of hiding information in digital media. In contrast to cryptography, it is not to keep
others from knowing the hidden information but it is to keep others from thinking that the
information even exists.
Steganography is the art of hiding the fact that communication is taking place, by hiding
information in other information. Many different carrier file formats can be used, but digital
images are the most popular because of their frequency on the Internet. For hiding secret
information in images, there exists a large variety of steganographic techniques some are more
complex than others and all of them have respective strong and weak points. Different
applications have different requirements of the steganography technique used. For example,
some applications may require absolute invisibility of the secret information, while others
require a larger secret message to be hidden. This project intends to give an overview of image
steganography, its uses and techniques. It also attempts to identify the requirements of a good
steganographic algorithm and briefly reflects on which steganographic techniques are more
suitable for which applications.
Although, Steganography is not to be confused with Encryption, which is the process of making
a message unintelligibleSteganography attempts to hide the existence of communication. The
basic structure of Steganography is made up of three components: the carrier, the message,
and the key. Carrier is also known as cover-object, in which the message is embedded and serves
to hide the presence of the message. The carrier can be a painting, a digital image, an mp3, even
a TCP/IP packet among other things. It is the object that will carry the hidden message. A key is
used to decode/decipher/discover the hidden message. This can be anything from a password, a
pattern, a black-light, or even lemon juice. In this project we will focus on the use of

Steganography within digital images (BMP) using LSB Substitution, although the properties of
Image Steganography may be substituted with audio mp3s, zip archives, and any other digital
document format relatively easily.
Basically, the model for steganography is shown in Figure 1. Message is the data that the sender
wishes to remain it confidential. It can be plain text, ciphertext, other image, or anything that can
be embedded in a bit stream such as a copyright mark, a covert communication, or a serial
number. Password is known as stego-key, which ensures that only recipient who knows the
corresponding decoding key will be able to extract the message from a cover-object. The coverobject with the secretly embedded message is then called the stego-object.

Recovering message from a stego-object requires the cover-object itself and a corresponding
decoding key if a stego-key was used during the encoding process. The original image may or
may not be required in most applications to extract the message.

There are several suitable carriers below to be the cover-object:


Network Protocols such as TCP, IP and UDP
Audio that using digital audio formats such as wav, midi, avi, mpeg, mpi and voc
File and Disk that can hides and append files by using the slack space
Text such as null characters, just alike Morse code including html and java
Images file such as bmp, gif and jpg, where they can be both color and gray-scale.
In general, the information hiding process extracts redundant bits from cover-object. The process
consists of two steps:
Identification of redundant bits in a cover-object. Redundant bits are those bits that can be
modified without corrupting the quality or destroying the integrity of the cover object.
The embedding process then selects the subset of the redundant bits to be replaced with data
from a secret message. The stego-object is created by replacing the selected redundant bits with
message bits.

STEGANOGRAPHY vs. CRYPTOGRAPHY


Basically, the purpose of cryptography and steganography is to provide secret communication.
However, steganography is not the same as cryptography. Cryptography hides the contents of a
secret message from a malicious people, whereas steganography even conceals the existence of
the message. Steganography must not be confused with cryptography, where we transform the

message so as to make its meaning obscure to malicious people who intercept it. Therefore, the
definition of breaking the system is different. In cryptography, the system is broken when the
attacker can read the secret message. Breaking a steganographic system need the attacker to
detect that steganography has been used and he is able to read the embedded message.
In cryptography, the structure of a message is scrambled to make it meaningless and
unintelligible unless the decryption key is available. It makes no attempt to disguise or hide the
encoded message. Basically, cryptography offers the ability of transmitting information between
persons in a way that prevents a third party from reading it. Cryptography can also provide
authentication for verifying the identity of someone or something.
In contrast, steganography does not alter the structure of the secret message, but hides it inside a
cover-image so it cannot be seen. A message in cipher text, for instance, might arouse suspicion
on the part of the recipient while an invisible message created with steganographic methods
will not. In other word, steganography prevents an unintended recipient from suspecting that the
data exists. In addition, the security of classical steganography system relies on secrecy of the
data encoding system. Once the encoding system is known, the steganography system is
defeated.
It is possible to combine the techniques by encrypting message using cryptography and then
hiding the encrypted message using steganography. The resulting stego-image can be transmitted
without revealing that secret information is being exchanged. Furthermore, even if an attacker
were to defeat the steganographic technique and detect the message from the stego-object, he
would still require the cryptographic decoding key to decipher the encrypted message. Table 1
shows that both technologies have counter advantages and disadvantages.

HISTORY OF STEGANOGRAPHY
The word Steganography technically means covered or hidden writing. Its ancient origins
can be traced back to 440 BC. Although the term steganography was only coined at the end of
the 15th century, the use of steganography dates back several millennia.
Some examples of use of Steganography is past times are:
In ancient Greece, messages were hidden on the back of wax writing tables where someone
would peel off the wax that was or written on the stomachs of rabbits.
In Ancient Greece they used to select messengers and shave their head, they would then write a
message on their head. Once the message had been written the hair was allowed to grow back.
After the hair grew back the messenger was sent to deliver the message, the recipient would
shave off the messengers hair to see the secrete message.

During World War 2 invisible ink was used to write information on pieces of paper so that the
paper appeared to the average person as just being blank pieces of paper. Liquids such as milk,
vinegar and fruit juices were used, because when each one of these substances were heated they
darken and become visible to the human eye. Invisible ink has been in use for centuriesfor fun
by children and students and for serious espionage by spies and terrorists.
Cryptography became very common place in the middle ages. Secret writing was employed by
the Catholic Church in its various struggles down the ages and by the major governments of the
time. Steganography was normally used in conjunction with cryptography to further hide secret
information.

STEGANOGRAPHY TYPES
The majority of todays steganographic systems uses multimedia objects like image, audio, video
etc. as cover media because people often transmit digital pictures over email and other Internet
communication. In modern approach, depending on the nature of cover object, steganography
can be divided into five types:
Text Steganography
Image Steganography
Audio Steganography
Video Steganography
Protocol Steganography
So, in the modern age so many steganographic techniques have been designed which work with
the above concerned objects. More often in todays security advancement, we sometimes come
across a combination of Cryptography and Steganography to achieve data privacy over secrecy.
Various software tools are also available in this regard.

IMAGE AND TRANSFORM DOMAIN


Image steganography techniques can be divided into two groups: those in the Image Domain and
those in the Transform Domain. Image also known as spatial domain techniques embed
messages in the intensity of the pixels directly, while for transform also known as frequency
domain, images are first transformed and then the message is embedded in the image. Image
domain techniques encompass bit-wise methods that apply bit insertion and noise manipulation
and are sometimes characterized as simple systems. The image formats that are most suitable
for image domain steganography are lossless and the techniques are typically dependent on the
image format. Steganography in the transform domain involves the manipulation of algorithms
and image transforms. These methods hide messages in more significant areas of the cover
image, making it more robust. Many transform domain methods are independent of the image

format and the embedded message may survive conversion between lossy and lossless
compression.
In the next sections steganographic algorithms will be explained in categories according to image
file formats and the domain in which they are performed.

Image Domain

Least Significant Bit


Least significant bit (LSB) insertion is a common, simple approach to embedding information in
a cover image. The least significant bit (in other words, the 8th bit) of some or all of the bytes
inside an image is changed to a bit of the secret message. When using a 24-bit image, a bit of
each of the red, green and blue color components can be used, since they are each represented by
a byte. (It has been explained in detail later).
LSB and Palette Based Images
Palette based images, for example GIF images, are another popular image file format commonly
used on the Internet. By definition a GIF image cannot have a bit depth greater than 8, thus the
maximum number of colors that a GIF can store is 256. GIF images are indexed images where
the colors used in the image are stored in a palette, sometimes referred to as a color lookup table.
Each pixel is represented as a single byte and the pixel data is an index to the color palette. The
colors of the palette are typically ordered from the most used color to the least used colors to
reduce lookup time.
GIF images can also be used for LSB steganography, although extra care should be taken. The
problem with the palette approach used with GIF images is that should one change the least
significant bit of a pixel, it can result in a completely different color since the index to the color
palette is changed. If adjacent palette entries are similar, there might be little or no noticeable
change, but should the adjacent palette entries be very dissimilar, the change would be evident.

One possible solution is to sort the palette so that the color differences between consecutive
colors are minimized. Another solution is to add new colors which are visually similar to the
existing colors in the palette. This requires the original image to have less unique colors than the
maximum number of colors (this value depends on the bit depth used). Using this approach, one
should thus carefully choose the right cover image. Unfortunately any tampering with the palette
of an indexed image leaves a very clear signature, making it easier to detect. A final solution to
the problem is to use greyscale images. In an 8-bit greyscale GIF image, there are 256 different
shades of grey. The changes between the colors are very gradual, making it harder to detect.

Transform Domain
To understand the steganography algorithms that can be used when embedding data in the
transform domain, one must first explain the type of file format connected with this domain. The
JPEG file format is the most popular image file format on the Internet, because of the small size
of the images.
JPEG compression
To compress an image into JPEG format, the RGB color representation is first converted to a
YUV representation. In this representation the Y component corresponds to the luminance (or
brightness) and the U and V components stand for chrominance (or color).

According to research the human eye is more sensitive to changes in the brightness (luminance)
of a pixel than to changes in its color. This fact is exploited by the JPEG compression by downsampling the color data to reduce the size of the file. The color components (U and V) are halved
in horizontal and vertical directions, thus decreasing the file size by a factor of 2. The next step is

the actual transformation of the image. For JPEG, the Discrete Cosine Transform (DCT) is used,
but similar transforms are for example the Discrete Fourier Transform (DFT). These
mathematical transforms convert the pixels in such a way as to give the effect of spreading the
location of the pixel values over part of the image. The DCT transforms a signal from an image
representation into a frequency representation, by grouping the pixels into 8 8 pixel blocks and
transforming the pixel blocks into 64 DCT coefficients each. A modification of a single DCT
coefficient will affect all 64 image pixels in that block. The next step is the quantization phase of
the compression. Here another biological property of the human eye is exploited: The human eye
is fairly good at spotting small differences in brightness over a relatively large area, but not so
good as to distinguish between different strengths in high frequency brightness. This means that
the strength of higher frequencies can be diminished, without changing the appearance of the
image. JPEG does this by dividing all the values in a block by a quantization coefficient. The
results are rounded to integer values and the coefficients are encoded using Huffman coding to
further reduce the size.
JPEG steganography
Originally it was thought that steganography would not be possible to use with JPEG images,
since they use lossy compression which results in parts of the image data being altered. One of
the major characteristics of steganography is the fact that information is hidden in the redundant
bits of an object and since redundant bits are left out when using JPEG it was feared that the
hidden message would be destroyed. Even if one could somehow keep the message intact it
would be difficult to embed the message without the changes being noticeable because of the
harsh compression applied. However, properties of the compression algorithm have been
exploited in order to develop a steganographic algorithm for JPEGs.
One of these properties of JPEG is exploited to make the changes to the image invisible to the
human eye.
During the DCT transformation phase of the compression algorithm, rounding errors occur in the
coefficient data that are not noticeable. Although this property is what classifies the algorithm as
being lossy, this property can also be used to hide messages. It is neither feasible nor possible to
embed information in an image that uses lossy compression, since the compression would

destroy all information in the process. Thus it is important to recognize that the JPEG
compression algorithm is actually divided into lossy and lossless stages. The DCT and the
quantization phase form part of the lossy stage, while the Huffman encoding used to further
compress the data is lossless. Steganography can take place between these two stages. Using the
same principles of LSB insertion the message can be embedded into the least significant bits of
the coefficients before applying the Huffman encoding. By embedding the information at this
stage, in the transform domain, it is extremely difficult to detect, since it is not in the visual
domain.
Image or Transform domain
As seen in the diagram above, some steganographic algorithms can either be categorized as
being in the image domain or in the transform domain depending on the implementation.
Patchwork
Patchwork is a statistical technique that uses redundant pattern encoding to embed a message in
an image. The algorithm adds redundancy to the hidden information and then scatters it
throughout the image. A Pseudo random generator is used to select two areas of the image (or
patches), patch A and patch B. All the pixels in patch A is lightened while the pixels in patch B
are darkened. In other words the intensities of the pixels in the one patch are increased by a
constant value, while the pixels of the other patch are decreased with the same constant value.
The contrast changes in this patch subset encodes one bit and the changes are typically small and
imperceptible, while not changing the average luminosity.
A disadvantage of the patchwork approach is that only one bit is embedded. One can embed
more bits by first dividing the image into sub-images and applying the embedding to each of
them. The advantage of using this technique is that the secret message is distributed over the
entire image, so should one patch be destroyed, the others may still survive. This however,
depends on the message size, since the message can only be repeated throughout the image if it is
small enough. If the message is too big, it can only be embedded once. The patchwork approach
is used independent of the host image and proves to be quite robust as the hidden message can
survive conversion between lossy and lossless compression.

Spread Spectrum
In spread spectrum techniques, hidden data is spread throughout the cover-image making it
harder to detect. A system proposed by Marvel et al. combines spread spectrum communication,
error control coding and image processing to hide information in images. Spread spectrum
communication can be defined as the process of spreading the bandwidth of a narrowband signal
across a wide band of frequencies. This can be accomplished by adjusting the narrowband
waveform with a wideband waveform, such as white noise. After spreading, the energy of the
narrowband signal in any one frequency band is low and therefore difficult to detect. In spread
spectrum image steganography the message is embedded in noise and then combined with the
cover image to produce the stego-image. Since the power of the embedded signal is much lower
than the power of the cover image, the embedded image is not perceptible to the human eye or
by computer analysis without access to the original image.

ALGORITHM REQUIREMENTS
All steganographic algorithms have to comply with a few basic requirements. The most
important requirement is that a steganographic algorithm has to be imperceptible. These
requirements are as follows:
Invisibility The invisibility of a steganographic algorithm is the first and foremost
requirement, since the strength of steganography lies in its ability to be unnoticed by the human
eye. The moment that one can see that an image has been tampered with, the algorithm is
compromised.
Payload capacity Unlike watermarking, which needs to embed only a small amount of
copyright information, steganography in other hand requires sufficient embedding capacity.
Robustness against statistical attacks Statistical steganlysis is the practice of detecting
hidden information through applying statistical tests on image data. Many steganographic
algorithms leave a signature when embedding information that can be easily detected through
statistical analysis. To be able to pass by a warden without being detected, a steganographic
algorithm must not leave such a mark in the image as be statistically significant.
Robustness against image manipulation In the communication of a stego-image by trusted
systems, the image may undergo changes by an active warden in an attempt to remove hidden

information. Image manipulation, such as cropping or rotating, can be performed on the image
before it reaches its destination. Depending on the manner in which the message is embedded,
these manipulations may destroy the hidden message. It is preferable for steganographic
algorithms to be robust against either malicious or unintentional changes to the image.
Independent of file format With many different image file formats used on the Internet, it
might seem suspicious that only one type of file format is continuously communicated between
two parties. The most powerful steganographic algorithms thus possess the ability to embed
information in any type of file. This also solves the problem of not always being able to find a
suitable image at the right moment, in the right format to use as a cover image.
Unsuspicious files This requirement includes all characteristics of a steganographic algorithm
that may result in images that are not used normally and may cause suspicion. Abnormal file
size, for example, is one property of an image that can result in further investigation of the image
by a warden.

The levels at which the algorithms satisfy the requirements are defined as high, medium and low.
A high level means that the algorithm completely satisfies the requirement, while a low level
indicates that the algorithm has a weakness in this requirement. A medium level indicates that the
requirement depends on outside influences, for example the cover image used. LSB in GIF

images has the potential of hiding a large message, but only when the most suitable cover image
has been chosen.
The ideal, in other words a perfect steganographic algorithm would have a high level in every
requirement. Unfortunately it's hard to develop an algorithm that satisfies all of the requirements.
Thus a trade-off will exist in most cases, depending on which requirements are more important
for the specific application.

EVALUATION OF DIFFERENT TECHNIQUES ACCORDING TO TABLE-2


LSB in BMP When embedding a message in a raw image, which has not been changed with
compression, such as a BMP, there exists a trade-off between the invisibility of the message and
the amount of information that can be embedded. A BMP is capable of hiding quite a large
message, but the fact that more bits are altered results in a larger possibility that the altered bits
can be seen with the human eye. The main disadvantage regarding LSB in BMP images is surely
the suspicion that might arise from a very large BMP image being transmitted between parties,
since BMP is not widely used anymore.
Suggested applications: LSB in BMP is most suitable for applications where the focus is on the
amount of information to be transmitted and not on the secrecy of that information.
LSB in GIF The strong and weak points regarding embedding information in GIF images
using LSB are more or less the same as those of using LSB with BMP. The main difference is
that since GIF images only have a bit depth of 8, the amount of information that can be hidden is
less than with BMP. GIF images are especially vulnerable to statistical or visual attacks since
the palette processing that has to be done leaves a very definite signature on the image. This
approach is dependent on the file format as well as the image itself, since a wrong choice of
image can result in the message being visible.
Suggested applications: LSB in GIF is a very efficient algorithm to use when embedding a
reasonable amount of data in a greyscale image.
JPEG compression The process of embedding information during JPEG compression results
in a stego image with a high level of invisibility, since the embedding takes place in the

transform domain. JPEG is the most popular image file format on the Internet and the image
sizes are small because of the compression, thus making it the least suspicious algorithm to use.
However, the process of the compression is a very mathematical process, making it more
difficult to implement.
Suggested applications: The JPEG file format can be used for most applications of
steganography, but is especially suitable for images that have to be communicated over an open
systems environment like the Internet.
Patchwork The biggest disadvantage of the patchwork approach is the small amount of
information that can be hidden in one image. This property can be changed to accommodate
more information but one may have to sacrifice the secrecy of the information. Patchworks main
advantage, however, is its robustness against malicious or unintentional image manipulation.
Should a stego image using patchwork be cropped or rotated, some of the message data may be
lost but since the message is repeatedly embedded in the image, most of the information will
survive.
Suggested applications: Patchwork is most suitable for transmitting a small amount of very
sensitive information.
Spread spectrum Spread spectrum techniques satisfies most requirements and is especially
robust against statistical attacks, since the hidden information is scattered throughout the image,
while not changing the statistical properties.
Suggested applications: Spread spectrum techniques can be used for most steganography
applications, although it's highly mathematical and intricate approach may prove too much for
some.
IMAGE STEGANOGRAPHY TECHNIQUES
To hide information, straight message insertion may encode every bit of information in the image
or selectively embed the message in noisy areas that draw less attention- those areas where
there is a great deal of natural color variation. The message may also be scattered randomly
throughout the image. A number of ways exist to hide information in digital media. Common
techniques which with varying degrees of success include:

Least significant bit insertion


Masking and filtering
Redundant Pattern Encoding
Encrypt and Scatter
Algorithms and transformations

least significant bit insertion


Least significant bits (LSB) insertion is a simple approach to embedding information in image
file. The simplest steganographic techniques embed the bits of the message directly into least
significant bit plane of the cover-image in a deterministic sequence. Modulating the least
significant bit does not result in human-perceptible difference because the amplitude of the
change is small. In this method the LSB of a byte is replaced with an Ms bit. This technique
works well for image, audio and video steganography. To the human eye, the resulting image will
look identical to the cover object.
Masking and filtering
Masking and filtering techniques, usually restricted to 24 bits and gray scale images, hide
information by marking an image, in a manner similar to paper watermarks. The techniques
performs analysis of the image, thus embed the information in significant areas so that the hidden
message is more integral to the cover image than just hiding it in the noise level. They hide info
in a way similar to watermarks on actual paper and are sometimes used as digital watermarks.
Masking images entails changing the luminance of the masked area. The smaller the luminance
change, the less of a chance that it can be detected.
Masking is more robust than LSB insertion with respect to compression, cropping, and some
image processing. Masking techniques embed information in significant areas so that the hidden
message is more integral to the cover image than just hiding it in the noise level. This makes it
more suitable than LSB with, for instance, lossy JPEG images.
Redundant Pattern Encoding

Patchwork and other similar tools do redundant pattern encoding, which is a sort of spread
spectrum technique. It works by scattering the message throughout the picture. This makes the
image more resistant to cropping and rotation. Smaller secret images work better to increase the
redundancy embedded in the cover image, and thus make it easier to recover if the stego-image
is manipulated.
Encrypt and Scatter
The Encrypt and Scatter technique tries to emulate white noise. It is mostly used in image
steganography. White Noise Storm is one such program that employs spread spectrum and
frequency hopping. It does this by scattering the message throughout an image on eight channels
within a random number that is generated by the previous window size and data channel. The
channels then swap rotate, and interlace amongst each other. Each channel represents one bit and
as a result there are many unaffected bits in each channel. This technique is a lot harder to extract
a message out of than an LSB scheme because to decode you must first detect that a hidden
image exists and extract the bit pattern from the file. While that is true for any stego-image you
will also need the algorithm and stego key to decode the bit pattern, both of which are not
required to recover a message from LSB. Some people prefer this method due to the considerable
amount of extra effort that someone without the algorithm and stego-key would have to go
through to extract the message. Even though White Noise Storm provides extra security against
message extraction it is just as susceptible as straight LSB to image degradation due to image
processing.
Algorithms and transformations
Transform techniques embed the message by modulating coefficients in a transform domain,
such as the Discrete Cosine Transform (DCT) used in JPEG compression, Discrete Fourier
Transform, or Wavelet Transform. These methods hide messages in significant areas of the
cover-image, which make them more robust to attack. Transformations can be applied over the
entire image, to block throughout the image, or other variants. LSB modification technique for
images does hold good if any kind of compression is done on the resultant stego-image e.g.
JPEG, GIF etc.

JPEG images use the discrete cosine transform to achieve compression. DCT is a lossy
compression transform because the cosine values cannot be calculated exactly, and repeated
calculations using limited precision numbers introduce rounding errors into the final result.
Variances between original data values and restored data values depend on the method used to
calculate DCT.
DETECTION TECHNIQUE FOR IMAGE STEGANOGRAPHY
Even though stego-images can rarely be spotted by the naked eye, they usually leave behind
some type of fingerprint or statistical hint that they have been modified. It is those discrepancies
which an analysis tool may be able to detect. Since some techniques and their effects are
commonly known, a statistical analysis of an image can be performed to check for a hidden
message(s) in it.
A widely used technique for image scanning involves statistical analysis. Most steganographic
algorithms that work on images, assume that the least significant bit is more or less random. This
is however, an incorrect assumption. While the LSB might not seem to be of much importance,
applying a filter which only shows the least significant bits, will still produce a recognizable
image. Since this is the case, it can be concluded that the LSB are not random at all, but actually
contain information about the whole image. When inserting a hidden message into an image, this
property changes. Especially with encrypted data, which has very high entropy, the LSB of the
cover image will no longer contain information about the original, but because of the
modifications they will now be more or less random. With a statistical analysis on the LSB, the
difference between random values and real image values can easily be detected. Using this
technique, it is also possible to detect messages hidden inside JPEG files with the DCT method,
since this also involves LSB modifications, even though these take place in the frequency
domain.
IMPLEMENTATION
There are currently three effective methods in applying Image Steganography in spatial domain:
LSB Substitution
Blocking (DCT)

Palette Modification.
LSB (Least Significant Bit) Substitution is the process of modifying the least significant bit of
the pixels of the carrier image.
Blocking works by breaking up an image into blocks and using Discrete Cosine Transforms
(DCT). Each block is broken into 64 DCT coefficients that approximate luminance and color
the values of which are modified for hiding messages.
Palette Modification replaces the unused colors within an images color palette with colors that
represent the hidden message.

STEGANOGRAPHY APPLICATIONS
Image Steganography has many applications, especially in todays modern, high-tech world.
Steganography can be used anytime you want to hide data. There are many reasons to hide data
but they all boil down to the desire to prevent unauthorized persons from becoming aware of the
existence of a message. In the business world steganography can be used to hide a secret
chemical formula or plans for a new invention. Steganography can also be used for corporate
espionage by sending out trade secrets without anyone at the company being any the wiser.
Steganography can also be used in the noncommercial sector to hide information that someone
wants to keep private.
Privacy and anonymity is a concern for most people on the internet. Image Steganography allows
for two parties to communicate secretly and covertly. It allows for some morally-conscious
people to safely whistle blow on internal actions; it allows for copyright protection on digital
files using the message as a digital watermark. One of the other main uses for Image
Steganography is for the transportation of high-level or top-secret documents between
international governments. While Image Steganography has many legitimate uses, it can also be
quite nefarious. It can be used by hackers to send viruses and Trojans to compromise machines,
and also by terrorists and other organizations that rely on covert operations to communicate
secretly and safely.

Spies have used it since the time of the Greeks to pass messages undetected. Terrorists can also
use steganography to keep their communications secret and to coordinate attacks. It is exactly
this potential that we will investigate in the next section. Because you can hide information
without the cover source changing, steganography can also be used to implement watermarking.
Although the concept of watermarking is not necessarily steganography, there are several
steganographic techniques that are being used to store watermarks in data. The main difference is
on intent, while the purpose of steganography is hiding information, watermarking is merely
extending the cover source with extra information. Since people will not accept noticeable
changes in images, audio or video files because of a watermark, steganographic methods can be
used to hide this.
In feature tagging, captions, annotations, time stamps, and other descriptive elements can be
embedded inside an image. Copying the stegoimage also copies of the embedded features and
only parties who possess the decoding stego-key will be able to extract and view the features. On
the other hand, secret communication does not advertise a covert communication by using
steganography. Therefore, it can avoid scrutiny of the sender, message and recipient. This is
effective only if the hidden communication is not detected by the others people

3. FEASILBILITY STUDY
Technical Feasibility
Image Steganography is a defense application with a back-end coding done in C#.NET
that allows a user to hide data (documents or text files) in an image. The user friendly interface
requires no technical skills and is easy to operate on. Visual Studio 2010 using .NET 4.0
framework is used for the design and coding purposes. Adobe Photoshop CS5 has also been used
to design the header of the application with an animated sequence of images.

Economic Feasibility
The project is economic and highly beneficial project as far as the cost of development is
considered. No extra costs were incurred apart from the software used.

Operational Feasibility
The project is operationally very feasible as it is user-friendly, the user doesnt need any kind of
knowledge about the software used in the project. The project is also really helpful as the user
can use it to send encrypted data at any moment of time using the internet or the LAN system.
OBJECTIVE
This project comprehends the following objectives:
To produce security tool based on steganography techniques.
To explore LSB techniques of hiding data using steganography.

SCOPE
The scope of the project as follows:
Implementation of a variation of LSB technique for hiding information i.e. text in image files.
IMAGE DEFINITION

To a computer, an image is a collection of numbers that constitute different light intensities in


different areas of the image. This numeric representation forms a grid and the individual points
are referred to as pixels. Most images on the Internet consists of a rectangular map of the images
pixels (represented as bits) where each pixel is located and its color. These pixels are displayed
horizontally row by row. The number of bits in a color scheme, called the bit depth, refers to the
number of bits used for each pixel.
The smallest bit depth in current color schemes is 8, meaning that there are 8 bits used to
describe the color of each pixel. Monochrome and greyscale images use 8 bits for each pixel and
are able to display 256 different colors or shades of grey. Digital color images are typically
stored in 24-bit files and use the RGB color model, also known as true color. All color variations
for the pixels of a 24-bit image are derived from three primary colors: red, green and blue, and
each primary color is represented by 8 bits. Thus in one given pixel, there can be 256 different
quantities of red, green and blue, adding up to more than 16-million combinations, resulting in
more than 16-million colors. Not surprisingly the larger amount of colors that can be displayed,
the larger the file size. For this project, we are considering 8-bit images.
IMAGE COMPRESSION
When working with larger images of greater bit depth, the images tend to become too large to
transmit over a standard Internet connection. In order to display an image in a reasonable amount
of time, techniques must be incorporated to reduce the images file size. These techniques make
use of mathematical formulas to analyses and condense image data, resulting in smaller file
sizes. This process is called compression. In images there are two types of compression: lossy
and lossless. Both methods save storage space, but the procedures that they implement differ.
Lossy compression creates smaller files by discarding excess image data from the original image.
It removes details that are too small for the human eye to differentiate, resulting in close
approximations of the original image, although not an exact duplicate. An example of an image
format that uses this compression technique is JPEG (Joint Photographic Experts Group).
Lossless compression, on the other hand, never removes any information from the original
image, but instead represents data in mathematical formulas. The original images integrity is
maintained and the decompressed image output is bit-by-bit identical to the original image input.

The most popular image formats that use lossless compression is GIF (Graphical Interchange
Format) and 8-bit BMP (a Microsoft
Windows bitmap file).
Compression plays a very important role in choosing which steganographic algorithm to use.
Lossy compression techniques result in smaller image file sizes, but it increases the possibility
that the embedded message may be partly lost due to the fact that excess image data will be
removed. Lossless compression though, keeps the original digital image intact without the
chance of lost, although it does not compress the image to such a small file size. Different
steganographic algorithms have been developed for both of these compression types and will be
explained in the following sections.
LEAST SIGNIFICANT BIT
Least significant bit (LSB) insertion is a common, simple approach to embedding information in
a cover image. The least significant bit (in other words, the 8th bit) of some or all of the bytes
inside an image is changed to a bit of the secret message. When using a 24-bit image, a bit of
each of the red, green and blue color components can be used, since they are each represented by
a byte. In other words, one can store 3 bits in each pixel. An 800 600 pixel image, can thus
store a total amount of 1,440,000 bits or 180,000 bytes of embedded data. For example a grid for
3 pixels of a 24-bit image can be as follows:
(00101101 00011100 11011100)
(10100110 11000100 00001100)
(11010010 10101101 01100011)

When the number 200, which binary representation is 11001000, is embedded into the least
significant bits of this part of the image, the resulting grid is as follows:

(00101101 00011101 11011100)

(10100110 11000101 00001100)


(11010010 10101100 01100011)

Although the number was embedded into the first 8 bytes of the grid, only the 3 underlined bits
needed to be changed according to the embedded message. On average, only half of the bits in an
image will need to be modified to hide a secret message using the maximum cover size. Since
there are 256 possible intensities of each primary color, changing the LSB of a pixel results in
small changes in the intensity of the colors. These changes cannot be perceived by the human eye
- thus the message is successfully hidden. With a well-chosen image, one can even hide the
message in the least as well as second to least significant bit and still not see the difference.
In the above example, consecutive bytes of the image data from the first byte to the end of the
message are used to embed the information. This approach is very easy to detect. A slightly
more secure system is for the sender and receiver to share a secret key that specifies only certain
pixels to be changed. Should an adversary suspect that LSB steganography has been used, he has
no way of knowing which pixels to target without the secret key.
In its simplest form, LSB makes use of BMP images, since they use lossless compression.
Unfortunately to be able to hide a secret message inside a BMP file, one would require a very
large cover image. Nowadays, BMP images of 800 600 pixels are not often used on the
Internet and might arouse suspicion. For this reason, LSB steganography has also been
developed for use with other image file formats.
DETECTION/ ATTACKS
While the purpose of Steganography is to hide messages, it may not be very effective at doing so.
There are several attacks that one may execute to test for Steganography images. They are:
Visual Attacks
Enhanced LSB Attacks
Chi-Square Analysis, and

Other statistical analyses.


In performing a visual attack you must have the original virgin image to compare it the
Steganography image and visually compare the two for artifacts.
In the Enhanced LSB Attack, you process the image for the least significant bits and if the LSB
is equal to one, multiply it by 255 so that it becomes its maximum value.
Chi-Square Analysis calculates the average LSB and constructs a table of frequencies and Pair
of Values; it takes the data from these two tables and performs a chi-square test. It measures the
theoretical vs. calculated population difference. The Chi-Square Analysis calculates the chisquare for every 128 bytes of the image. As it iterates through, the chi-square value it calculates
becomes more and more accurate until too large of a dataset has been produced. Because this
attack relies on statistical analysis it cannot detect patterns or Steganography on very complex
images with lots of noise than one can detect through visualization of the Enhanced LSBs.
BENEFITS/ DRAWBACKS
The advantages of LSB are its simplicity to embed the bits of the message directly into the LSB
plane of cover-image and many techniques use these methods. Modulating the LSB does not
result in a human-perceptible difference because the amplitude of the change is small. Therefore,
to the human eye, the resulting stego-image will look identical to the cover-image. This allows
high perceptual transparency of LSB.
However, there are a few weaknesses of using LSB. It is very sensitive to any kind of filtering or
manipulation of the stego-image. Scaling, rotation, cropping, addition of noise, or lossy
compression to the stego-image will destroy the message.
On the other hand, for the hiding capacity, the size of information to be hidden relatively depends
to the size of the cover-image. The message size must be smaller than the image. A large capacity
allows the use of the smaller cover-image for the message of fixed size, and thus decreases the
bandwidth required to transmit the stego-image.
Another weakness is an attacker can easily destruct the message by removing or zeroing the
entire LSB plane with very little change in the perceptual quality of the modified stego-image.

Therefore, if this method causes someone to suspect something hidden in the stego-image, then
the method is not success.

4. System Profile
Hardware Requirements:

System

Pentium IV 2.4 GHz.

Hard Disk

40 GB.

Floppy Drive :

1.44 Mb.

Monitor

15 VGA Color.

Mouse

Logitech.

Ram

256 Mb.

Software Requirements:

Operating system

Windows XP Professional.

Front End

Visual Studio.Net 2010

Coding Language

Visual C# .Net.

Back End

SQL 2000.

Tool used

visual studio 2010

Visual Studio 2010, .NET Framework 4.0 and Adobe Photoshop CS5 installed on the system.

5. Tools and technology


SOFTWARE DESCRIPTION
C#.NET/ C++
C# is a relatively new language that was unveiled to the world when Microsoft announced the
first version of its .NET Framework in July 2000. Since then its popularity has rocketed, and it
has arguably become the language of choice for both Windows and Web developers who use
the .NET Framework. Part of the appeal of C# comes from its clear syntax, which derives from
C/C++ but simplifies some things that have previously discouraged some programmers. Despite
this simplification, C# has retained the power of C++, and there is now no reason not to move
into C#. The language is not difficult and its a great one to learn elementary programming
techniques with.
.

C#.NET a window application framework developed and marketed by Microsoft to allow


programmers to build dynamic web sites, web applications and web services is used for our
projects software coding.
By design, C# is the programming language that most directly reflects the underlying Common
Language Infrastructure (CLI). Most of its intrinsic types correspond to value-types implemented
by the CLI framework. However, the language specification does not state the code generation
requirements of the compiler: that is, it does not state that a C# compiler must target a Common
Language Runtime, or generate Common Intermediate Language (CIL), or generate any other
specific format. Theoretically, a C# compiler could generate machine code like traditional
compilers of C++ or FORTRAN.
Some notable distinguishing features of C# are:
There are no global variables or functions. All methods and members must be declared within
classes. Static members of public classes can substitute for global variables and functions.
Local variables cannot shadow variables of the enclosing block, unlike C and C++. Variable
shadowing is often considered confusing by C++ texts.

C# supports a strict Boolean data type, bool. Statements that take conditions, such as while and
if, require an expression of a type that implements the true operator, such as the Boolean type.
While C++ also has a Boolean type, it can be freely converted to and from integers, and
expressions such as if (a) require only that a is convertible to bool, allowing a to be an int, or a
pointer. C# disallows this "integer meaning true or false" approach on the grounds that forcing
programmers to use expressions that return exactly bool can prevent certain types of common
programming mistakes in C or C++ such as if (a = b) (use of assignment = instead of equality
==).
In C#, memory address pointers can only be used within blocks specifically marked as unsafe,
and programs with unsafe code need appropriate permissions to run. Most object access is done
through safe object references, which always either point to a "live" object or have the welldefined null value; it is impossible to obtain a reference to a "dead" object (one which has been
garbage collected), or to a random block of memory. An unsafe pointer can point to an instance
of a value-type, array, string, or a block of memory allocated on a stack. Code that is not marked
as unsafe can still store and manipulate pointers through the System.IntPtr type, but it cannot
dereference them.
Managed memory cannot be explicitly freed; instead, it is automatically garbage collected.
Garbage collection addresses the problem of memory leaks by freeing the programmer of
responsibility for releasing memory which is no longer needed.
In addition to the try...catch construct to handle exceptions, C# has a try...finally construct to
guarantee execution of the code in the finally block.
Multiple inheritance is not supported, although a class can implement any number of interfaces.
This was a design decision by the language's lead architect to avoid complication and simplify
architectural requirements throughout CLI.
C# is more type safe than C++. The only implicit conversions by default are those which are
considered safe, such as widening of integers. This is enforced at compile-time, during JIT, and,
in some cases, at runtime. There are no implicit conversions between Booleans and integers, nor
between enumeration members and integers (except for literal 0, which can be implicitly
converted to any enumerated type). Any user-defined conversion must be explicitly marked as

explicit or implicit, unlike C++ copy constructors and conversion operators, which are both
implicit by default.
Enumeration members are placed in their own scope.
C# provides properties as syntactic sugar for a common pattern in which a pair of methods,
accessor (getter) and mutator (setter) encapsulate operations on a single attribute of a class.
Full type reflection and discovery is available.
C# currently (as of version 4.0) has 77 reserved words.
Checked exceptions are not present in C# (in contrast to Java). This has been a conscious
decision based on the issues of scalability and version ability.

Microsoft Visual Studio 2010


Microsoft Visual Studio 2010 version 10.0.30319.1 RTMRel. 2010 Microsoft Corporation.

6. System flow diagram


DATA FLOW DIAGRAMS:
Data flow diagrams are the basic building blocks that define the flow of data in a system to the
particular destination and difference in the flow when any transformation happens. It makes
whole procedure like a good document and makes simpler and easy to understand for both
programmers and non-programmers by dividing into the sub process. The data flow diagrams are
the simple blocks that reveal the relationship between various components of the system and
provide high level overview, boundaries of particular system as well as provide detailed
overview of system elements.
Data Flow Diagram Level 0
"DFD level 0" is the highest level view of the system, contains only one process
Which represents whole function of the system. It doesn't contain any data stores
And the data is stored with in the process. For constructing DFD level 0 diagram for the
proposed approach we need two
Sources one is for "source" and another is for "destination" and a "process".

3.4.3 Data Flow Diagram Level 1

For constructing "DFD level 1", we need to identify and draw the process that make the level 0
process. In the project for transferring the personal data from source to destination, the personal
data is first encrypted and processed and latter decrypted.

3.4.4 Data Flow Diagram Level 2


The image and the text document are given to the encryption phase. The encryption algorithm is
used for embedding the data into the image. The resultant image acting as a carrier image is
transmitted to the decryption phase using the transmission medium. For extracting the message
from the carrier image, it's sent to the decryption section. The plain text is extracted from the
carrier image using the decryption algorithm.

3.5 ACTIVITY DIAGRAM


The sender sends the message to the receiver using three phases. Since we are using the
steganographic approach for transferring the message to the destination, the sender sends text as
well as image file to the primary phase i.e., to encryption phase. The encryption phase uses the
encryption algorithm by which the carrier image is generated. The encryption phase generates
the carrier image as output.

GANTT CHART

7. TESTING
Testing defines the status of the working functionalities of any particular system.
Through testing particular software one can't identify the defects in it but can
analyses the performance of software and its working behavior. By testing the
software we can find the limitations that become the conditions on which the
performance is measured on that particular level. In order to start the testing
process the primary thing is requirements of software development cycle. Using
this phase the testing phase will be easier for testers. The capacity of the software
can be calculated by executing the code and inspecting the code in different
conditions such as testing the software by subjecting it to different sources as input
and examining the results with respect to the inputs. After the designing phase, the
next phase is to develop and execute the code indifferent conditions for any errors
and progress to the developing phase. Without testing and execution, the software
cannot be moved to the developing phase. There are two types of testing. They are:
The functional testing, which defines the specified function of a particular code in
the program. This type of testing gives us a brief description about the program's
performance and security in the various functional areas. The other type of testing
is non-functional testing. Non-functional testing defines the capabilities of
particular software like its log data etc. It is opposite to functional testing and so
will not describe the specifications like security and performance. The performance
of the particular program not only depends on errors in coding. The errors in the
code can be noticed during execution, but the other types of errors can affect the
program performance like when the program is developed based on one platform
that may not perform well and give errors when executed in different platform. So,
compatibility is another issue that reduce the software performance. The code

tuning helps us in optimizing the program to perform at its best utilizing minimal
resources possible under varied conditions.
AIM OF TESTING
The main aim of testing is to analyses the performance and to evaluate the errors
that occur when the program is executed with different input sources and running
indifferent operating environments. In this project, I developed a steganographic
application based on Microsoft Visual Studio which focuses on data hiding based
on Least Significant Bit algorithm. The main aim of testing in this project is to find
the compatibility issues as well as the working performance when different sources
are given as the inputs.

There are different types of approaches for testing a JAVA framework based
application. The types of testing are
Unit testing
Validation testing
Integration testing
User acceptance testing
Output testing
Black box and white box testing.

UNIT TESTING
'Unit testing' is the approach of taking a small part of testable application and
executing it according to the requirements and testing the application behavior.
Unit testing is used for detecting the defects that occur during execution. When an
algorithm is executed, the integrity should be maintained by the data structures.

Unit testing is made use for testing the functionality of each algorithm during
execution. Unit testing can be used in the bottom up test approach which makes the
integration test much easier. Unit testing reduces the ambiguity in the units. Unit
testing uses regression testing, which makes the execution simpler. Using
regression testing, the fault can be easily identified and fixed. In this project, I have
developed an application using different phases like encryption, decryption, etc.
So, for getting the correct output all the functions that are used are executed and
tested at least once making sure that all the control paths, error handling and
control structures are in proper manner. Unit testing has it's applications for
extreme programming, testing unit frame works and good support for language
level unit testing.

VALIDATION TESTING
Validation is the process of finding whether the product is built correct or not. The
software application or product that is designed should fulfil the requirements and
reach the expectations set by the user. Validation is done while developing or at the
final stage of development process to determine whether it is satisfies the specified
requirements of user.
Using validation test the developer can qualify the design, performance and its
operations. Also the accuracy, repeatability, selectivity, Limit of detection and
quantification can be specified using 'Validation testing'.

OUTPUT TESTING
After completion of validation testing the next process is output testing. Output
testing is the process of testing the output generated by the application for the
specified inputs. This process checks weather the application is producing the
required output as per the user's specification or not. The 'output testing' can be
done by considering mainly by updating the test plans, the behavior of application
with different type of inputs and with produced outputs, making the best use of the
operating capacity and considering the recommendations for fixing the issues.

INTEGRATION TESTING
'Integration testing' is an extension to unit testing, after unit testing the units are
integrated with the logical program. The integration testing is the process of
examining the working behavior of the particular unit after embedding with
program. This procedure identifies the problems that occur during the combination
of units.
The integration testing can be commonly done in three approaches
Top-down approach
Bottom-up approach
Umbrella approach

Top-down approach:
In the top-down approach the highest level module should be considered first and
integrated. This approach makes the high level logic and data flow to test first and
reduce the necessity of drivers. One disadvantage with top-down approach is its
poor support and functionality is limited.

Bottom-up approach

Bottom-up approach is opposite to top-down approach. In this approach, the lowest


level units are considered and integrated first. Those units are known as utility
units. The utility units are tested first so that the usage of stubs is reduced. The
disadvantage in this method is that it needs the respective drivers which make the
test complicated, the support is poor and the functionality is limited.
Umbrella approach

The third approach is umbrella approach, which makes use of both the top - bottom
and bottom - top approaches. This method tests the integration of units along with
its functional data and control paths. After using the top - bottom and bottom-top
approaches, the outputs are integrated in top - bottom manner. The advantage of
this approach is that it provides good support for the release of limited
functionality as well as minimizing the needs of drivers and hubs. The main
disadvantage is that it is less systematic than the other two approaches.
USER ACCEPTANCE TESTING

'User acceptance testing' is the process of obtaining the confirmation from the user
that the system meets the set of specified requirements. It is the final stage of
project; the user performs various tests during the design of the applications and
makes further modifications according to the requirements to achieve the final
result. The user acceptance testing gives the confidence to the clients about the
performance of system

BLACK BOX AND WHITE BOX TESTING

'Black box testing' is the testing approach which tells us about the possible
combinations for the end-user action. Black box testing doesn't need the
knowledge about the interior connections or programming code. In the black box
testing, the user tests the application by giving different sources and checks
whether the output for the specified input is appropriate or not.
'White box testing' is also known as 'glass box' or 'clear box' or 'open box' testing.
It is opposite to the black box testing. In the white box testing, we can create test
cases by checking the code and executing in certain intervals and know the
potential errors.
The analysis of the code can be done by giving suitable inputs for the specified
applications and using the source code for the application blocks. The limitation
with the white box testing is that the testing only applies to unit testing, system
testing and integration testing

6. Form Layout

Fig 1:- login form

Fig: - Encrypt Image

fig : selection of image file and data

fig :- message after encryption done

fig : Location for Decrypted image and data

Fig :- decryption done successfully

7.

Conclusion
In the present world, the data transfers using internet is rapidly growing

because it is so easier as well as faster to transfer the data to destination.


So, many individuals and business people use to transfer business documents,
important information using internet.
Security is an important issue while transferring the data using internet because
any unauthorized individual can hack the data and make it useless or obtain
information un- intended to him.
The proposed approach in this project uses a new steganography approach called
image steganography. The application creates a stego image in which the personal
data is embedded and is protected with a password which is highly secured.
The main intention of the project is to develop a Stefano graphic application that
provides good security. The proposed approach provides higher security and can
protect the message from stego attacks.
The image resolution doesn't change much and is negligible when we embed the
message into the image and the image is protected with the personal password.
So, it is not possible to damage the data by unauthorized personnel. I used the
Least Significant Bit algorithm in this project for developing the application which
is faster and reliable and compression ratio is moderate compared to other
algorithms.

The major limitation of the application is designed for bit map images (.bmp). It
accepts only bit map images as a carrier file, and the compression depends on the
document size as well as the carrier image size.
The future work on this project is to improve the compression ratio of the image to
the text.
This project can be extended to a level such that it can be used for the different
types of image formats like .bmp, .jpeg, .tiff etc., in the future.
The security using Least Significant Bit Algorithm is good but we can improve the
level to a certain extent by varying the carriers as well as using different keys for
encryption and decryption.

8. Future Enhancements
In future this application has features of:Hiding data in to audio file.
Hiding data in to video file.
It will be password protected.

9. BIBLIOGRAPHY
http://www.engpaper.com/free-research-papers-steganography.htm

http://en.wikipedia.org/wiki/Steganography
www.ece.stevens-tech.edu/~mouli/lsbsteg.pdf
www.waset.org/journals/waset/v50/v50-74.pdf
mo.co.za/open/stegoverview.pdf
www.maths.nuigalway.ie/cstudents/mcomms/.../steganography.pdf
http://www.ijcaonline.org/archives/volume6/number2/1057-1378
ipublishing.co.in/jarvol1no12010/EIJAER1018.pdf
faculty.ksu.edu.sa/ghazy/Steg/References/ref26-2.pdf
www1.chapman.edu/~nabav100/.../ImageSteganography.pdf
Kesslet, Gary C. An Overview of Steganography for the Computer Forensics
Examiner, Burlington, 2004.

Hosmer, Chet. Discovering Hidden Evidence, Cortland, 2006.


N.F. Johnson, S. Jajodia, Staganalysis: The Investigation of Hiding
Information, IEEE, pp. 113-116, 1998.
N.F. Johnson & S. Jajodia, Steganalysis of Images Created Using Current
Steganography Software, in Proceeding for the Second Information Hiding
Workshop, Portland Oregon, USA, April 1998, pp. 273-289.

You might also like