
Instituto Nacional de Tecnologías de la Comunicación

OBSERVATORY Notebook

STEGANOGRAPHY, THE ART OF HIDING INFORMATION

From the Greek steganos (covered) and graphos (writing), steganography can be defined
as the hiding of information through a covert channel with the purpose of preventing the
detection of a hidden message.

Steganography studies the set of techniques used to embed sensitive information in another
file, known as the “container file” or “cover file” (images, documents, executable programs,
etc.). In this way, information passes third parties unnoticed and can only be retrieved by
a legitimate user who knows the specific algorithm needed to extract it.

Image 1: News stories: “Steganography is not just a theoretical invention”

Source: INTECO

This science has aroused great interest in recent years because it has been used by criminal
and terrorist organisations. However, it is not a new invention: it has been employed
since ancient times. This article introduces the reader to the field of
steganography, clarifies how it differs from cryptography and shows examples of the
software used for this technique.

INFORMATION SECURITY OBSERVATORY


I History and origins

More than 400 years before Christ, Herodotus already described the use of steganography in
ancient Greece in his book The Histories. He recounts how one character takes a small
book of two ‘leaves’, or wax-covered wooden tablets, scrapes off the wax,
engraves a message in the wood and covers it with wax again.

Another story in the same book describes how a character shaves the head of one of his
slaves and tattoos a message on his scalp. He then waits for the slave’s hair
to grow back and sends him to the recipient of the message with instructions to shave his
head.

Image 2: Small wooden board used to write a hidden message engraved in the wood under
the wax

Source: INTECO

Another historical example of the use of steganography is the book Hypnerotomachia
Poliphili by Francesco Colonna, dating from 1499. Taking the first letter of the titles of
its 38 chapters spells out “Poliam frater Franciscus Columna peramavit”, which
translates as “Brother Francesco Colonna loved Polia passionately”.

Likewise, during World War II, small holes were punched through some letters of a
newspaper so that, when it was held up to the light, those letters could be picked out
and read as a message.

The “invisible ink” example, however, is probably more familiar to the reader. Many children
play at sending each other messages written with lemon juice or similar substances
(rich in carbon), so that when the surface on which the message is written is heated,
the writing emerges in a shade of coffee brown. The technique can be made more complex by
using other chemical reactions.

Steganography has clearly existed in our civilization from time immemorial
and has traditionally been used by military and intelligence agencies, criminals and police,
as well as by civilians who want to circumvent government restrictions. However, whereas
traditional steganography relied solely on the adversary not knowing which covert channel
was in use, today digital channels (image, video, audio, communication protocols, etc.) are
used to achieve the same goal. In many cases the container object is known; what remains
unknown is the algorithm used to insert information into that object.

II Definitions and theoretical foundations

Steganography is a solution to the prisoners’ problem: two inmates of a high-security
prison, Romulo and Remo, who are held in separate cells, want to communicate with each other
to plan an escape. However, all information exchanged between them is examined
by a security guard who, at the slightest suspicion of covert communication, will isolate
them from one another. With steganography, the guard sees only seemingly innocuous
messages, which nonetheless carry a subliminal channel that the prisoners can use.

Different actors are involved in the field of steganography:

• Cover object (container object): it is the object used to carry the hidden
message. Going back to the example of the messages tattooed on the slave’s
scalp, the cover object is the slave himself.

• Stego-object: it is the cover object together with the hidden message. Continuing
with the previous example, the stego-object is the slave once the message has
been written on his scalp and his hair has grown back to normal.

• Adversary: any entity from whom the covert information is being
hidden. In the prisoners’ example, it is the guard who passes the messages from one
prisoner to the other. The adversary can be passive or active. A passive
adversary suspects that covert communication may be taking place and tries to
work out the hiding algorithm from the stego-object, but does not attempt to
alter that object. An active adversary, apart from trying to find out the covert
communication algorithm, modifies the stego-object in order to disrupt any
attempt at subliminal messaging.

• Steganalysis: the science that studies the detection (passive attacks) and/or
removal (active attacks) of information hidden in different covers, as well
as the possibility of locating the useful information inside them (its existence and size).

Considering that there can be active adversaries, a good steganographic technique must
be robust against distortions, be they accidental or a result of the interaction of an active
adversary.

Robustness against distortions is usually a goal of cryptography as well; however,
steganography and cryptography are different fields. In cryptography, the aim is to
guarantee the confidentiality of information in the face of an interceptor who can
see the cryptogram, even if that interceptor knows the algorithm which generated it.
Steganography, on the other hand, seeks to hide the very presence of the message: if the
location of the message is identified and the hiding algorithm is known, the communication
is directly exposed, which does not happen with cryptography.

Therefore, the use of steganography alone stands in contradiction to one of the basic
security principles: security through obscurity (ignorance) does not work.

At the end of the 19th century, Kerckhoffs formulated a series of principles which
have become key pillars in the field of security; one of them states: “assume that the
(malicious) user knows all encryption procedures”. Applied to
steganography, this means assuming that the security guard knows the algorithm that hides
the message in the cover object, which would lead to the immediate isolation of the prisoners.

For steganography to be more useful, it must be combined with cryptography. The
message to be exchanged must first be encrypted (in a robust way) and then embedded in the
cover object. As a result, even if the interceptor discovers the steganographic
pattern, the message itself stays unreadable without the decryption key.

The combination of both techniques has another advantage: when cryptography is used
alone, an observer knows that messages are being exchanged, which can act as a starting
point for an attack aiming to discover the message. With steganography added, in most
cases the observer does not even know that an encrypted communication is taking place.

III Functioning and examples

This article focuses on the most widely used cover object, digital images, and particularly
on images in BMP format because of its simplicity (it is an uncompressed file format). The
ideas presented can be applied to other formats (JPG, PNG, etc.) and other carrier objects
(videos, documents, etc.) as long as the specific characteristics of each format are
respected.

Replacement of bits in the cover object

This technique replaces certain bits of the cover file with those of the information to be
hidden. The advantage of this approach is that the size of the cover file does not change
and, in many cases, neither does its quality, thanks to the redundancy and/or excess detail
in such files.

For instance, in an audio file, it is possible to replace the bits which are not audible to
human ears with the bits of the message itself.

When working with images, the traditional method is to replace the least significant bits
(LSB) in a 24-bit colour scale (over 16 million colours). Changing the least significant
bit shifts a colour component by a single level out of 255, so a pixel in a shade of red
becomes, at most, very slightly darker or lighter. In many cases such changes are
imperceptible to human senses and can only be detected through computational analysis of
the file’s structure.

BMP is a standard bitmap image format in DOS and Windows operating systems
and can be read on both Mac and PC. It supports 24-bit (millions of colours) and 8-bit
(256 colours) images, and can work in grey scale, RGB and CMYK.

Image 3: Zoom on a pixel in an image

Source: INTECO

Every pixel of a 24-bit BMP file is represented by three bytes. Each of these bytes
holds the intensity of one colour component: red, green or blue (RGB). Combining
the values of the three bytes yields the 2^24 (more than 16 million) colours
that a pixel can take.

Each byte has a value between 0 and 255, that is, between 00000000
and 11111111 in binary, the leftmost bit being the most significant. Changing the rightmost
bit alters the value by only one unit, which is why the least significant bits of a pixel
can be modified without noticeably altering it.

Image 4: Visual effect of the modification of the least significant bits of the RGB
components in a pixel.

Source: INTECO

Each RGB component of the pixel has been increased by one unit and the effect is
unnoticeable to the human eye. In fact, since a pixel is surrounded
by other pixels, the change is even harder to notice when its surroundings are left
unmodified.

It follows that, by changing one bit in each component of a pixel, it is
possible to embed three bits of hidden information per pixel without producing noticeable
changes in the image. This can be done for every pixel of the image. Eight pixels are needed
to hide three bytes of information; in ASCII encoding this means three letters of hidden
information. Therefore, a BMP image of 502x126 pixels can hide a
message of 23,719 ASCII characters.
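
The replacement of least significant bits described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the image is represented as a flat list of R, G, B byte values rather than a real BMP file, and the function names (`capacity_bytes`, `embed`, `extract`) are invented for this example.

```python
def capacity_bytes(width, height):
    """Whole bytes that fit when one bit is stored per RGB component."""
    return (width * height * 3) // 8


def embed(components, message):
    """Return a copy of `components` (flat list of R, G, B byte values)
    with the bits of `message` written into the least significant bits."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(components):
        raise ValueError("message does not fit in the cover image")
    stego = list(components)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract(components, length):
    """Read `length` bytes back from the least significant bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for component in components[i * 8:(i + 1) * 8]:
            value = (value << 1) | (component & 1)
        out.append(value)
    return bytes(out)


cover = [200, 120, 37] * 8      # eight identical pixels = 24 colour components
stego = embed(cover, b"abc")    # three ASCII letters need 24 bits
print(extract(stego, 3))                               # b'abc'
print(max(abs(a - b) for a, b in zip(cover, stego)))   # 1
print(capacity_bytes(502, 126))                        # 23719
```

The last line reproduces the figure quoted in the text: 502 x 126 pixels x 3 bits per pixel, divided by 8 bits per character. The receiver has to know the message length (or agree on a terminator) in advance; that convention is left out of the sketch.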

Image 5: Image where information has been hidden in the least significant bits of its
pixels

Source: INTECO

For BMP images, steganography by replacement is quite simple; the technique becomes
more complex with other formats, but the basic idea is the same:

• Modification of the values in the colour palette of a GIF file.

• Replacement of quantised DCT coefficients in JPG files.

This technique has an underlying conceptual error: it assumes that the information
originally stored in the least significant bits is random and that, consequently, modifying it
to insert hidden information does not reveal that the image has been edited. This is not
true and may serve as a basis for a steganalysis mechanism, as explained later in section
IV.

Insertion of bits into the cover object

The information bits are added at some structural feature of the file (end of file, or EOF;
padding or alignment space; etc.). The disadvantage of this option is that the size of the
container object changes, which may raise suspicion.

In order to extrapolate this idea to the example of BMP images, we must first understand
how this format is structured. The first 54 bytes contain image metadata, which are
divided as follows:

• 2 bytes → always contain the string ‘BM’, which identifies a BMP file.

• 4 bytes → file size in bytes.

• 4 bytes → reserved (for future use), containing zeros.

• 4 bytes → offset: distance from the start of the file to the first pixel of the image.

• 4 bytes → metadata size (size of the information structure that starts at this field).

• 4 bytes → width (number of horizontal pixels).

• 4 bytes → height (number of vertical pixels).

• 2 bytes → number of colour planes.

• 2 bytes → colour depth.

• 4 bytes → compression type (zero, because the image is uncompressed).

• 4 bytes → image structure size.

• 4 bytes → pixels per horizontal metre.

• 4 bytes → pixels per vertical metre.

• 4 bytes → number of colours used.

• 4 bytes → number of significant colours.
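
As an illustration, the 54 bytes listed above can be read with Python's standard `struct` module. This is a minimal sketch: the field names are chosen for this example only, signed integers are assumed for width, height and resolution, and the sample header is built by hand rather than read from a real file.

```python
import struct

# Little-endian layout of the 54 bytes listed above. Field names are illustrative.
BMP_HEADER = struct.Struct("<2sIIIIiiHHIIiiII")
FIELDS = (
    "signature", "file_size", "reserved", "offset",
    "info_size", "width", "height", "planes", "bit_depth",
    "compression", "image_size", "x_pixels_per_metre",
    "y_pixels_per_metre", "colours_used", "colours_significant",
)


def parse_bmp_header(data):
    """Return the header fields of a BMP file as a dictionary."""
    if len(data) < BMP_HEADER.size:
        raise ValueError("too short to hold a BMP header")
    header = dict(zip(FIELDS, BMP_HEADER.unpack_from(data)))
    if header["signature"] != b"BM":
        raise ValueError("not a BMP file")
    return header


# Hand-built header for a 2x2, 24-bit image (rows padded to 4 bytes: 16 bytes of pixels).
sample = BMP_HEADER.pack(b"BM", 70, 0, 54, 40, 2, 2, 1, 24, 0, 16, 2835, 2835, 0, 0) + bytes(16)
info = parse_bmp_header(sample)
print(BMP_HEADER.size, info["width"], info["height"], info["offset"])  # 54 2 2 54
```

With a real file, the same function could be called on the bytes returned by `open(path, "rb").read()`, where `path` is whatever BMP file is being inspected.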

Given this structure, the trivial way of hiding data is to place them just after the metadata
(between the image metadata and the pixel data) and to increase the “offset” field (which
tells a viewer where the image pixels begin) accordingly. This leaves room for all the
additional content to be included.
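
A minimal sketch of this insertion technique follows. It assumes a 24-bit BMP with no colour palette, so that the pixel data of the untouched file starts at byte 54; the function names and the toy image are invented for this example. Besides the offset, the sketch also updates the file-size field so that the header stays consistent.

```python
import struct

HEADER_SIZE = 54  # assumption: 24-bit BMP without palette


def insert_after_header(bmp, payload):
    """Place `payload` just before the pixel data and fix the header fields."""
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")
    file_size, = struct.unpack_from("<I", bmp, 2)    # bytes 2-5: file size
    offset, = struct.unpack_from("<I", bmp, 10)      # bytes 10-13: offset to pixels
    out = bytearray(bmp[:offset] + payload + bmp[offset:])
    struct.pack_into("<I", out, 2, file_size + len(payload))
    struct.pack_into("<I", out, 10, offset + len(payload))
    return bytes(out)


def extract_inserted(stego):
    """Return whatever sits between the 54-byte header and the pixel data."""
    offset, = struct.unpack_from("<I", stego, 10)
    return stego[HEADER_SIZE:offset]


# Toy 2x2, 24-bit image: 54-byte header followed by 16 bytes of pixel data.
toy_bmp = struct.pack("<2sIIIIiiHHIIiiII", b"BM", 70, 0, 54, 40,
                      2, 2, 1, 24, 0, 16, 2835, 2835, 0, 0) + bytes(16)
stego_bmp = insert_after_header(toy_bmp, b"secret")
print(len(toy_bmp), len(stego_bmp))     # 70 76
print(extract_inserted(stego_bmp))      # b'secret'
```

The growth from 70 to 76 bytes for the same 2x2 picture is exactly the side effect that makes this technique easy to spot.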

Image 6: Diagram of the result obtained from steganography by insertion in BMP images.

Source: INTECO

Image 6 shows that this is not a very stealthy technique. If the data to be hidden are
large enough (several megabytes), a 10x10 pixel icon taking up 5 megabytes is bound to look
suspicious. For this reason, whoever hides the information
should spread it across several images so that the change is less obvious.

Ad-hoc creation of a cover object from the information to be hidden

This option is simply the generation of a container file with the very information to be
hidden, instead of obtaining the container file separately and manipulating it to include that
information.

For instance, given a specific algorithm to reorder the bytes of the data to be hidden, a
string of pixels of a BMP file can be generated with some visual meaning. If the receiver
knows the reordering algorithm, the transmission of information is possible.

IV Steganalysis

As mentioned above, steganalysis is the technique used to retrieve hidden messages or
to prevent communication via steganography. There are two main types of passive
steganalysis, which are briefly explained below:

Manual steganalysis

It is the manual search for differences between the cover object and the stego-object,
looking for changes in the structure in order to find hidden data. The main disadvantages
of this technique are that the cover object is needed and that, in many cases, one
can detect hidden information within an object but cannot retrieve it.

Nevertheless, when the container file is not available, it is still possible to look for
irregularities in the stego-file that point to the existence of hidden data.

Visual attacks make hidden information apparent to the human eye by
applying filters. Consider a BMP file in which the least significant bit of the
components of some of its pixels has been replaced with hidden information. In this
setting, manual steganalysis involves applying a filter that keeps only the least
significant bit of each RGB component of each pixel.

This is what Image 7 shows: the first image hides information and, when applying the
filter, a small uniform pattern is noticeable on the top of the image, apart from the overall
change of shade compared to the filtered image of the original file.

Image 7: Manual steganalysis of a BMP file containing information hidden through LSB

Source: INTECO

These differences are produced because the hiding of information in LSB is based on the
premise that the information originally stored in that bit is random. But this is not true and
the hiding of information in it provides additional clues to an analyst. It is precisely for this
reason that the images with little variability of colour and/or uniform areas are not good
candidates for LSB steganography. An image which is robust against an attack of this
type is a natural, not artificial, image with great variation of shades and/or colours.
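
The visual filter described in this section can be sketched as follows. The example assumes the image is already available as a list of (R, G, B) tuples; the function name `lsb_plane` is invented for this illustration. Each component is stretched to full black or full white according to its least significant bit, so that any structure in those bits becomes visible.

```python
def lsb_plane(pixels):
    """Keep only the least significant bit of every RGB component,
    mapping 0 to 0 (dark) and 1 to 255 (bright)."""
    return [tuple(255 * (component & 1) for component in pixel) for pixel in pixels]


pixels = [(200, 121, 37), (10, 11, 12)]
print(lsb_plane(pixels))  # [(0, 255, 255), (0, 255, 0)]
```

Writing the filtered pixels back into an image file and viewing the result is what produces pictures such as those in Image 7.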

Statistical steganalysis

It is the process of analysing the frequency distribution of colours in the stego-object.
This is a slow technique that requires specialised software. These programs usually
look for the message hiding patterns used by the most common steganography programs,
which makes them very effective against messages hidden with those
typical programs. However, it is almost impossible for them to find
messages that have been hidden manually.

The details of the statistical steganalysis techniques go beyond the scope of this article;
only one mechanism is briefly explained with the aim to provide the reader with a basic
reference model.

One of these techniques is the Chi-Square attack [1], which makes it possible to estimate
the size of the information possibly hidden in a stego-object. It can be applied when the
values in a fixed set of pairs of values (PoVs) flip from one value of the pair to the
other when the bits of the hidden message are inserted.
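
To give an idea of how pairs of values are used, the sketch below computes a chi-square statistic over the pairs (0, 1), (2, 3), ..., (254, 255) of byte values, which are the pairs that LSB replacement turns into one another. It is a simplified illustration, not the full attack from the cited paper: it stops at the statistic and the degrees of freedom and does not compute a probability or estimate the message size, and the function name is invented for this example.

```python
from collections import Counter


def pov_chi_square(values):
    """Chi-square statistic comparing the count of each even value with the
    mean count of its pair (2k, 2k+1). Returns (statistic, degrees_of_freedom)."""
    counts = Counter(values)
    chi2 = 0.0
    categories = 0
    for even in range(0, 256, 2):
        n_even, n_odd = counts[even], counts[even + 1]
        expected = (n_even + n_odd) / 2
        if expected > 0:  # skip pairs that never occur
            chi2 += (n_even - expected) ** 2 / expected
            categories += 1
    return chi2, max(categories - 1, 0)


balanced = [10] * 50 + [11] * 50   # pair frequencies equalised
skewed = [10] * 90 + [11] * 10     # pair frequencies far apart
print(pov_chi_square(balanced))    # (0.0, 0)
print(pov_chi_square(skewed))      # (32.0, 0)
```

A statistic close to zero means that the two values of each pair occur almost equally often, which is what embedding a random-looking message in the least significant bits tends to produce; a large value is more typical of untouched data.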

V Curious applications and implications of steganography

Some unusual situations in which steganography has been used or could be used are
described in this section. It is not a comprehensive list, but an attempt to show the
practical application of the previously stated theory.

Steganography using the TCP/IP protocol

The TCP/IP protocol suite is well suited to creating covert communication channels, since
two entities that agree on a covert protocol can send relevant data through the packet
headers. Using this approach, data can be embedded in initial connection requests,
established connections or other intermediate steps.

[1] Westfeld, A., Pfitzmann, A. Attacks on Steganographic Systems.
http://www.ece.cmu.edu/~adrian/487-s06/westfeld-pfitzmann-ihw99.pdf

Image 8: TCP protocol header

Source: RFC793 Transmission Control Protocol

For instance, considering the TCP header only, data can be hidden in the initial sequence
number of a connection. This provides 32 bits of hidden data per connection-opening
packet (SYN), i.e. 4 ASCII characters. In the same way, it is possible to hide
information in other header fields of the protocols that make up TCP/IP, as long
as the changes do not cause the exchanged packets to be rejected.
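
The arithmetic behind "32 bits, i.e. 4 ASCII characters" can be illustrated with a short sketch. It only shows the conversion between four characters and a 32-bit number in network byte order; it does not build or send any packet, and the function names are invented for this example.

```python
import struct


def chars_to_isn(chunk):
    """Pack exactly four bytes into a 32-bit sequence number (network byte order)."""
    if len(chunk) != 4:
        raise ValueError("a 32-bit sequence number holds exactly 4 bytes")
    return struct.unpack("!I", chunk)[0]


def isn_to_chars(isn):
    """Recover the four bytes carried by a 32-bit sequence number."""
    return struct.pack("!I", isn)


isn = chars_to_isn(b"HIDE")
print(hex(isn))            # 0x48494445
print(isn_to_chars(isn))   # b'HIDE'
```

A longer message would be split into 4-byte chunks, one per connection request, which is why the capacity of this channel is small.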

Malware control

Today’s malware usually communicates with an attacker’s control point in order to receive
commands to download additional modules, send stolen data, warn that a new victim has
been infected, etc.

The most widely used protocol for this type of communication is HTTP, since it generally
runs on a port that is not filtered by firewalls and can go unnoticed among the
rest of the network traffic generated by legitimate browsing.

However, a control channel built on ordinary HTTP GET/POST requests, easy as it is to set
up, is also easy to detect (if encryption is not used) by the companies managing the web
servers/DNS servers associated with the control link. Identifying and interpreting such
communication is even simpler for a malware analyst. As a result, when illegal activity
related to a specific control point is suspected, the infrastructure is shut down more
quickly by the companies managing it.

The implication for the attacker is a shorter average lifetime for the control channel and,
consequently, a lower return on investment.

To make identifying and shutting down the infrastructure associated with a
trojan less trivial, attackers have devised various techniques, from
simply encoding the instructions as seemingly meaningless strings to using
P2P communications.

In this effort to hide and strengthen malicious communication channels, steganography
is a highly attractive weapon. In fact, the authors of the Waledac worm [2]
have already used steganography by insertion for the download and installation of
additional modules used by the malware.

Image 9: Image used by Waledac containing an added executable

Source: INTECO

One of the functionalities of Waledac is its capacity to download and interpret a specially
manipulated JPEG image file. This file is an ordinary JPEG image to which an
executable has been appended after a certain JPEG marker, so that the file remains
consistent with the standard.

The executable is encrypted using a simple one-byte XOR operation. The result is an
image which can be viewed in most browsers and viewers, but which carries additional
malicious code to be installed by a computer already infected by Waledac. For instance,
Waledac has been seen using this technique to install a packet-capture library
in order to record the FTP, HTTP and other credentials used by the victim.
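
From an analyst's point of view, data appended to an image in this way can be looked for with a very small script. The sketch below rests on assumptions not stated in the text: it takes the marker to be the JPEG end-of-image marker (FF D9), it uses a toy byte string instead of a real image, and the key 0x5A and the function names are made up for the illustration.

```python
EOI = b"\xff\xd9"  # JPEG end-of-image marker (assumed to be the marker in question)


def xor_bytes(data, key):
    """Apply a one-byte XOR key; the same call encrypts and decrypts."""
    return bytes(b ^ key for b in data)


def trailing_data(jpeg):
    """Return whatever follows the first end-of-image marker, or b'' if nothing does."""
    end = jpeg.find(EOI)
    if end == -1:
        return b""
    return jpeg[end + len(EOI):]


# Toy stand-in for a JPEG: start-of-image marker, filler bytes, end-of-image marker.
toy_jpeg = b"\xff\xd8" + b"\x00\x11\x22" + EOI
suspicious = toy_jpeg + xor_bytes(b"MZ payload", 0x5A)

print(trailing_data(toy_jpeg))                          # b''
print(xor_bytes(trailing_data(suspicious), 0x5A))       # b'MZ payload'
```

Real images may contain an embedded thumbnail with its own end-of-image marker, so a practical tool would need a proper JPEG parser rather than a plain search; with only 256 possible keys, a one-byte XOR offers no real protection once the trailing data has been found.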

[2] Further information at http://alerta-antivirus.inteco.es/virus/detalle_virus.html?cod=8426 (in Spanish).

The picture in Image 9 was published on a web server. Convincing the administrator of the
company that manages the web server that the image is malicious code, deliberately uploaded
by one of its customers to be used together with another piece of malware, is not an easy
task. As a result, the time needed to shut down infrastructure like this increases
greatly, and so do the attackers’ profits.

Similarly, it is possible to create stego-objects containing orders for a trojan (attack the
subnet 217.140.16.0/24, click on the ads of a certain web page, etc.) and to upload them
to social network profiles. This hypothetical trojan would periodically visit the
profile and process the recently uploaded images in search of new instructions. If
the profile is believable and the photographs suggest that it belongs to a
real individual, the social network’s abuse department is very likely to ignore any
request to remove it: it is more reasonable to assume that the complaint comes from
an acquaintance who wants to annoy the owner by getting the profile closed
than to think that the profile is part of a trojan’s infrastructure.

Digital watermarks

A digital watermark is an identification code which is directly inserted into the content of a
multimedia file, usually with the aim to include information related to copyright or
intellectual property of the digital content in question.

The watermark must be imperceptible to the human perceptual system, yet
easily extractable by a software application that knows the algorithm to retrieve
it.

To achieve that purpose, a variety of techniques are currently in use, some of which can
be considered steganography.

The most common applications of watermarks are:

• Proof of ownership: identification of the source, author, owner, distributor and/or
consumer of a digital file.

• Fingerprinting: this includes the data associated with a transaction, such as the file
owner and buyer. It makes it possible to identify the person responsible for illegal
copies of copyrighted content.

• Classification of contents: watermarks can be used to indicate the type of content
of a file. For example, in an ideal world, websites with adult content would include
specific watermarks in all their images, videos, etc. Content-filtering programs
would detect them automatically and easily and prevent children from viewing
them.

• Restriction on the use of contents: used in combination with applications or
embedded systems programmed for that purpose, watermarks can be used to
prevent the display of some contents when these are copied more than a certain
number of times, after a specific date, etc.

In any case, a proper watermark must be robust against changes made to the original file
and against manipulation of the watermark itself, must not be perceptible to humans and must
have a minimal impact on the statistical properties of its carrier object. Depending on
how the watermark is used, ease of modification may also be a desirable property,
e.g. to count how many times a specific content has been
reproduced.

Further applications

Just like malware, people can also use steganography to create a covert
communication channel through social networks. In fact, according to the newspaper
“USA Today” [3], the FBI and the CIA discovered that Bin Laden used steganographic images
uploaded to public websites to communicate with his associates.

Another ethically questionable use of steganography is the leaking of information from
corporate, military, governmental and similar environments. Where the content
an employee takes out through digital media is monitored, steganography can be used to
remove diagrams, documents and other sensitive information without arousing the
supervisor’s suspicion.

Not all steganography applications are malicious, however. For instance, it can be
used to embed patient information in X-rays, CT scans and so on, to classify
multimedia contents or as part of authentication mechanisms.

VI Conclusions

Steganography is a constantly evolving technique, with a long history and the capacity to
adapt to new technologies. As steganography tools become more advanced,
the techniques and tools used in steganalysis become more complex too.

Cover files do not need to be images: any medium is valid (audio, video,
executable files, etc.). Far from being a purely theoretical invention, steganography has
been shown to be actively used by malware, and new uses can be expected to
arise that will make the work of analysts more difficult.

[3] Further information at http://www.usatoday.com/tech/news/2001-02-05-binladen.htm

The applications of this science are not limited to ethically questionable uses, however;
it can also help in fields such as medicine, child protection, etc.

In any case, the effectiveness of new steganalysis techniques makes it necessary to use
steganography in combination with cryptography in order to achieve a reasonable
level of security. Cryptography ensures the confidentiality of a conversation but does not
hide the fact that the conversation is taking place. Steganography alone, on the other hand,
can hide the fact that a conversation is taking place, but once the exchange is revealed, an
attacker can learn its content. Even when discovering the
original content is difficult, an attacker may modify the stego-object to prevent
communication (active attack). Combining both techniques makes them
complement each other and greatly strengthens the security of the message exchange.
