Instituto Nacional de Tecnologías de la Comunicación
Notebook
From the Greek steganos (covered) and graphos (writing), steganography can be defined
as the hiding of information through a covert channel with the purpose of preventing the
detection of a hidden message.
Source: INTECO
This discipline has attracted great interest in recent years because it has been used by
criminal and terrorist organisations. However, it is not a new invention: it has been
employed since ancient times. This article introduces the reader to the field of
steganography, clarifies its differences from cryptography and shows examples of the
software used for this technique.
More than 400 years before Christ, Herodotus described the use of steganography in
ancient Greece in his book The Histories. He tells how one character takes a small
book of two ‘leaves’ or small wooden boards, scrapes off the wax covering them,
engraves a message in the wood and covers it with wax again.
Another story in the same book describes how another character shaves the head of one
of his slaves and tattoos a message on his scalp. He then waits for the slave’s hair
to grow back and sends him to the recipient of the message with instructions to shave his
head.
Image 2: Small wooden board used to write a hidden message engraved in the wood under
the wax
Source: INTECO
Likewise, during World War II, small holes were punched through some letters of a
newspaper so that, when it was held up to the light, those letters could be seen and
read as a message.
The “invisible ink” example is, however, much more familiar to the reader. Many children
play the game of sending each other messages written with lemon juice or similar
substances (rich in carbon), so that when the surface on which the message is written is
heated, the message emerges in a shade of coffee brown. This technique can be made more
complex if further chemical reactions are applied.
• Cover object (container object): it is the object used to carry the hidden
message. Going back to the example of the messages tattooed on the slave’s
scalp, the cover object is the slave himself.
• Stego-object: it is the cover object together with the hidden message. Following
with the previous example, the stego-object is the slave once the message has
been written on his scalp and once his hair has grown back to normal.
• Adversary: these are all the entities from whom the covert information is being
hidden. In the classic scenario of two prisoners who exchange messages through a
guard, the adversary is the guard who delivers the messages. This adversary can be
passive or active. A passive adversary suspects that covert communication may be
taking place and tries to work out the hiding algorithm from the stego-object, but does
not attempt to alter that object. An active adversary, apart from trying to find out the
covert communication algorithm, modifies the stego-object with the aim of corrupting
any attempt at subliminal messaging.
Considering that there can be active adversaries, a good steganographic technique must
be robust against distortions, be they accidental or a result of the interaction of an active
adversary.
Used alone, steganography stands in contradiction to one of the basic security
principles: security through obscurity does not work.
At the end of the 19th century, Kerckhoffs formulated a series of principles which
have become key pillars in the field of security; one of them states: “assume that the
(malicious) user knows all encryption procedures”. Applied to steganography, this
principle means assuming that the security guard knows the algorithm that hides the
message in the cover object, which would lead to the immediate isolation of the prisoners.
For steganography to be more useful, it must be combined with cryptography. The
message to be exchanged must first be encrypted (in a robust way) and then embedded in
the cover object. As a result, even if an interceptor discovers the steganographic
pattern, the exchanged message cannot be read without the key.
The combination of both techniques has another advantage: when cryptography is used
alone, an observer knows that messages are being exchanged, which can act as a starting
point for an attack aiming to discover the message. With steganography, in most
cases the observer does not even know that an encrypted communication is taking place.
This article focuses on the most widely used cover object, digital images, and
particularly on images in BMP format because of its simplicity (it is an uncompressed
file format). The ideas presented can be applied to other formats (JPG, PNG, etc.) and
other carrier objects (videos, documents, etc.) as long as the specific characteristics of
each format are respected.
This technique replaces certain bits of the cover file with those of the information to be
hidden. The advantage of this approach is that the size of the cover file does not change
and, on many occasions, neither does its quality, thanks to the redundancy and/or excess
detail in such files.
For instance, in an audio file, it is possible to replace the bits which are not audible to
human ears with the bits of the message itself.
When working with images, the traditional method is to replace the least significant bits
(LSB) in a 24-bit colour scale (over 16 million colours). The only result is that, for
example, a pixel in a shade of red becomes very slightly darker or lighter, since each
colour component changes by at most one step out of 255 (less than 1%). In many cases
these changes are imperceptible to human senses and can only be detected through
computational analysis of the files’ structure.
BMP is a standard bitmap image format in DOS and Windows operating systems and can be
read on both Mac and PC. It supports 24-bit (millions of colours) and 8-bit (256 colours)
images, and can work in grey scale, RGB and CMYK.
Source: INTECO
Every pixel of a 24-bit BMP file is represented by three bytes. Each of these bytes
contains the intensity of one colour component: red, green or blue (RGB). By combining
the values in those positions it is possible to obtain the 2^24 (more than 16 million)
colours that a pixel can take.
Likewise, each byte has a value between 0 and 255, that is, between 00000000
and 11111111 in binary, the leftmost bit being the most significant bit. This means
that the least significant bits of a pixel can be modified without causing great alteration.
Source: INTECO
Each RGB component of the pixel has been given a one-unit higher value and the effect is
unnoticeable for the human eye. In fact, if we take into account that a pixel is surrounded
by other pixels, the visual effect goes even more unnoticed if its surroundings are not
modified.
The implication is that, by changing one bit in each component of a pixel, it is
possible to embed three bits of hidden information per pixel without producing noticeable
changes in the image. This may be done for every pixel of the image. Eight pixels are
needed to hide three bytes of information; in ASCII encoding this means three letters of
hidden information. Therefore, in a BMP image of 502x126 pixels, it is possible to hide a
message of 23,719 ASCII characters.
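The embedding described above can be sketched in Python as follows. This is a minimal illustration rather than the method of any particular tool: the function names `capacity_bytes`, `embed_lsb` and `extract_lsb` are hypothetical, and `pixels` is assumed to be the raw sequence of colour-component bytes, already separated from the BMP header and from any row padding.

```python
def capacity_bytes(width: int, height: int, bits_per_pixel: int = 3) -> int:
    """Whole bytes that fit when each pixel carries `bits_per_pixel` hidden bits."""
    return (width * height * bits_per_pixel) // 8


def embed_lsb(pixels: bytes, message: bytes) -> bytearray:
    """Return a copy of `pixels` whose least significant bits carry `message`."""
    if len(message) * 8 > len(pixels):
        raise ValueError("message does not fit in the cover data")
    stego = bytearray(pixels)
    for i in range(len(message) * 8):
        # Take the message bits one by one, most significant bit first.
        bit = (message[i // 8] >> (7 - i % 8)) & 1
        # Clear the LSB of the colour component and write the message bit into it.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Rebuild `length` bytes from the least significant bits of `stego`."""
    out = bytearray(length)
    for i in range(length * 8):
        out[i // 8] = ((out[i // 8] << 1) | (stego[i] & 1)) & 0xFF
    return bytes(out)
```

Under these assumptions, `capacity_bytes(502, 126)` gives the 23,719 characters mentioned above, and `extract_lsb(embed_lsb(cover, b"Hi!"), 3)` returns `b"Hi!"` while no byte of the cover changes by more than one unit. The receiver has to know the message length in advance, which a real scheme would need to communicate in some way.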
Image 5: Image where information has been hidden in the least significant bits of its
pixels
Source: INTECO
This technique has an underlying conceptual error: it assumes that the information
originally stored in the least significant bits is random and that, consequently, modifying it
to insert hidden information does not reveal that the image has been edited. This is not
true and may serve as a basis for a steganalysis mechanism, as explained later in section
IV.
The information bits are inserted at a certain structural point of the file (end of file,
EOF; padding or alignment spaces, etc.). This option has the disadvantage that the
size of the container object changes, which may raise suspicions.
In order to apply this idea to the example of BMP images, we must first understand
how this format is structured. The first 54 bytes contain image metadata, which include,
among other fields:
• 2 bytes → always containing the ‘BM’ string, which reveals it is a BMP file.
• 4 bytes → offset, distance between the start of the file and the first pixel of the image.
Given this structure, the trivial way of hiding data is to hide them just after the metadata
(between the image metadata and data) and change the “offset” field (distance between
metadata and image pixels). By doing this, it is possible to leave space for all the
additional content you want to include.
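A minimal sketch of this insertion is shown below. It assumes a 24-bit BMP whose metadata occupies exactly the first 54 bytes, with the file size stored at byte 2 and the pixel offset at byte 10, both as little-endian 32-bit integers. The helper names `insert_after_header` and `read_inserted` are hypothetical.

```python
import struct

HEADER_SIZE = 54  # assumption: 14-byte file header + 40-byte info header, no palette


def insert_after_header(bmp: bytes, payload: bytes) -> bytes:
    """Place `payload` between the metadata and the pixels and fix the header fields."""
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")
    (offset,) = struct.unpack_from("<I", bmp, 10)      # current start of pixel data
    stego = bytearray(bmp[:offset] + payload + bmp[offset:])
    struct.pack_into("<I", stego, 2, len(stego))        # update the file size field
    struct.pack_into("<I", stego, 10, offset + len(payload))  # move the pixel offset
    return bytes(stego)


def read_inserted(stego: bytes) -> bytes:
    """Return whatever lies between the 54-byte metadata and the first pixel."""
    (offset,) = struct.unpack_from("<I", stego, 10)
    return stego[HEADER_SIZE:offset]
```

An image viewer that honours the offset field jumps straight to the pixels and never shows the inserted bytes, but the file grows by exactly the size of the payload.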
Image 6: Diagram of the result obtained from steganography by insertion in BMP images.
Source: INTECO
Image 6 shows that this is not a very stealthy technique. If the data to be hidden are
large enough (several megabytes), it is somewhat suspicious to have a 10x10 pixel icon
taking up 5 megabytes. For this reason, the person hiding the information
must distribute it among different images so that the change is not so obvious.
This option consists of generating the container file from the very information to be
hidden, instead of obtaining a container file separately and manipulating it to include
that information.
For instance, given a specific algorithm to reorder the bytes of the data to be hidden, a
string of pixels of a BMP file can be generated with some visual meaning. If the receiver
knows the reordering algorithm, the transmission of information is possible.
Manual steganalysis
It is the manual search for differences between the cover object and the stego-object,
looking for changes in the structure in order to find hidden data. The main disadvantages
of this technique are that the cover object is needed and that, on many occasions, one
can detect hidden information within an object but is unable to retrieve it.
Nevertheless, when the container file is not available, it is possible to look for
irregularities in the stego-object in order to find signs of the existence of hidden data.
Visual attacks reveal the presence of hidden information to the human eye by
applying filters. Consider a BMP file in which the least significant bit of the
components of some of its pixels has been replaced by hidden information. In this
setting, manual steganalysis involves applying a filter that keeps only the least
significant bit of each RGB component of each pixel.
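Such a filter can be sketched in a few lines. The function name `lsb_plane` is hypothetical, and the input is assumed to be the raw colour-component bytes of the image without header or padding; each component is stretched to full black or full white according to its least significant bit, so that the LSB plane becomes visible.

```python
def lsb_plane(pixels: bytes) -> bytes:
    """Map every colour component to 0 or 255 according to its least significant bit."""
    return bytes(255 if value & 1 else 0 for value in pixels)
```

Written back as pixel data, the result of a uniform area of the original image appears as a flat block, whereas an area whose least significant bits were overwritten with message bits tends to look like noise or a regular pattern.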
This is what Image 7 shows: the first image hides information and, when the filter is
applied, a small uniform pattern is noticeable at the top of the image, apart from the
overall change of shade compared to the filtered image of the original file.
Image 7: Manual steganalysis of a BMP file containing information hidden through LSB
Source: INTECO
These differences are produced because the hiding of information in LSB is based on the
premise that the information originally stored in that bit is random. But this is not true and
the hiding of information in it provides additional clues to an analyst. It is precisely for this
reason that the images with little variability of colour and/or uniform areas are not good
candidates for LSB steganography. An image which is robust against an attack of this
type is a natural, not artificial, image with great variation of shades and/or colours.
Statistical steganalysis is the process of analysing the frequency distribution of colours
in the stego-object. This is a slow technique, for which specialised software is required.
These programs usually look for the message hiding patterns used by the most common
steganography programs. This approach makes them really effective against messages
hidden with these typical programs. However, it is almost impossible for them to find
messages that have been hidden manually.
The details of statistical steganalysis techniques go beyond the scope of this article;
only one mechanism is briefly explained, with the aim of providing the reader with a basic
reference model.
One of these techniques is the Chi-Square attack [1], which makes it possible to estimate
the size of the information possibly hidden in a stego-object. It can be applied when the
values in a fixed set of pairs of values (PoVs) switch from one value of the pair to the
other when the bits of the hidden message are inserted.
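The core statistic of this attack can be sketched as follows, under simplifying assumptions: the pairs of values are taken to be (2k, 2k+1) for each colour-component byte, which is the pairing that LSB replacement produces, and the function name `pov_chi_square` is hypothetical. The sketch returns only the statistic and the degrees of freedom; turning them into a probability of embedding requires the chi-square distribution, which is left out here.

```python
from collections import Counter


def pov_chi_square(values: bytes) -> tuple[float, int]:
    """Chi-square statistic of the even values against the mean of each pair (2k, 2k+1)."""
    counts = Counter(values)
    statistic = 0.0
    categories = 0
    for even in range(0, 256, 2):
        pair_total = counts[even] + counts[even + 1]
        if pair_total == 0:
            continue  # pairs that never occur carry no information
        # If message bits replaced the LSBs, both values of a pair are about equally frequent.
        expected = pair_total / 2
        statistic += (counts[even] - expected) ** 2 / expected
        categories += 1
    return statistic, max(categories - 1, 0)
```

A statistic that is small relative to the degrees of freedom suggests that the two values of each pair have been equalised, as happens after LSB replacement. Repeating the measurement over growing portions of the image is one way to approximate how much of it carries hidden data.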
Some unusual situations in which steganography has been used or could be used are
described in this section. It is not a comprehensive list, but an attempt to show the
practical application of the previously stated theory.
[1] Westfeld, A., Pfitzmann, A. Attacks on Steganographic Systems.
http://www.ece.cmu.edu/~adrian/487-s06/westfeld-pfitzmann-ihw99.pdf
For instance, considering the TCP header only, data can be hidden in the initial sequence
number of a connection. This provides 32 bits of hidden data per connection-opening
packet (SYN), i.e. 4 ASCII characters. Following this approach, it is possible to hide
information in other header fields of the different protocols which make up TCP/IP, as long
as the changes do not cause the exchanged packets to be rejected.
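The arithmetic behind the 32-bit figure can be illustrated with a short sketch. It only shows how four ASCII characters map to a 32-bit number and back; it does not build or send any packet, and the function names `text_to_isn` and `isn_to_text` are hypothetical.

```python
def text_to_isn(chunk: bytes) -> int:
    """Pack exactly four bytes into a 32-bit value usable as a sequence number."""
    if len(chunk) != 4:
        raise ValueError("exactly 4 bytes fit in a 32-bit sequence number")
    return int.from_bytes(chunk, "big")


def isn_to_text(isn: int) -> bytes:
    """Recover the four bytes from a 32-bit sequence number."""
    return isn.to_bytes(4, "big")
```

For example, `text_to_isn(b"STEG")` yields `0x53544547`, and `isn_to_text` applied to that value returns `b"STEG"`. A longer message would have to be split across several connections, four characters at a time.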
Malware control
Today’s malware usually communicates with an attacker’s control point in order to receive
commands to download additional modules, send stolen data, warn that a new victim has
been infected, etc.
The most used protocol for this type of communication is HTTP, since it generally takes
place on a port that is not filtered by firewalls and it is able to go unnoticed through the
rest of the network traffic generated by legitimate browsing.
However, the ease of establishing a control channel with traditional HTTP GET/POST
requests means that the communication is easily detectable (if encryption is not
used) by the enterprises managing the web servers/DNS servers associated with the control
link. Identifying and interpreting such communication is even simpler for a malware
analyst. This means that, when illegal activity related to a specific control point is
suspected, the infrastructures are shut down more rapidly by the companies managing
them.
The implication for the attacker is a shorter average lifetime of the control channel and,
consequently, a lower return on investment.
With the aim of hiding and strengthening malicious communication channels, steganography
becomes a highly interesting weapon. In fact, the authors of the Waledac worm [2]
have already used steganography by insertion in the download and installation of
additional modules used by the malicious specimen.
Source: INTECO
One of the functionalities of Waledac is its capacity to download and interpret a specially
manipulated JPEG image file. This file is an ordinary JPEG image to which an
executable has been appended after a certain JPEG marker, so that the file remains
consistent with the standard.
The executable is encrypted using a simple one-byte XOR operation. The result is an
image which can be viewed in most browsers and viewers, but which carries additional
malicious code to be installed by a computer already infected by Waledac. For instance,
Waledac has been seen to use this technique to install a network packet capture library
with the aim of recording all the FTP, HTTP, etc. passwords used by the victim.
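From the analyst's side, a one-byte XOR layer is easy to undo because there are only 256 possible keys. The sketch below assumes that the hidden blob has already been carved out of the image and that it is a Windows executable, which starts with the bytes `MZ`. The function names are hypothetical, and the key and file layout actually used by Waledac are not stated in this article.

```python
from typing import Optional


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def guess_xor_key(blob: bytes, magic: bytes = b"MZ") -> Optional[int]:
    """Try all 256 keys and return the one that makes `blob` start with `magic`."""
    for key in range(256):
        if xor_bytes(blob[: len(magic)], key) == magic:
            return key
    return None  # no single-byte key produces the expected header
```

Because XOR is its own inverse, the same `xor_bytes` call both encrypts and decrypts, which is why this kind of protection only hinders a casual look at the file.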
[2] Further information at http://alerta-antivirus.inteco.es/virus/detalle_virus.html?cod=8426 (Spanish).
Similarly, it is possible to create stego-objects containing orders for a trojan (to attack
the subnet 217.140.16.0/24, click on the ads of a certain web page, etc.) and to upload them
to social network profiles. This hypothetical trojan would periodically visit the social
network profile and process the recently uploaded images in search of new instructions. If
the social network profile is believable and the photographs seem to indicate that it
belongs to a real individual, it is very likely that the social network’s abuse department
will ignore any request for profile removal. It is more reasonable to think that such a
complaint comes from an acquaintance who wants to annoy the owner of the profile by
getting it shut down than to think that the profile is part of a trojan’s infrastructure.
Digital watermarks
A digital watermark is an identification code which is directly inserted into the content of a
multimedia file, usually with the aim to include information related to copyright or
intellectual property of the digital content in question.
The presence of this watermark must be imperceptible to the human perception system,
yet easily extractable by a software application that knows the algorithm to retrieve
it.
• Fingerprinting: this includes the data associated with a transaction, the file owner
and buyer. It allows identifying the person responsible for illegal copies of
copyrighted content.
In any case, a proper watermark must be robust against changes made to the original file
and manipulations of the watermark itself; it must not be perceptible to humans and must
have a minimal impact on the statistical properties of its carrier object. Depending on the
use made of the watermark, ease of modification may also be a desired property,
e.g. to count how many times a specific content has been reproduced.
Further applications
Just like malware, individuals can also use steganography to create a covert
communication channel through social networks. In fact, according to the newspaper
“USA Today” [3], the FBI and the CIA discovered that Bin Laden used steganographic
images uploaded to public websites to communicate with his associates.
Another not very ethical use of steganography is the leaking of information from
corporate, military, governmental and similar environments. In environments where the
content extracted by an employee through digital media is monitored, steganography can be
used to take out diagrams, documents and other sensitive information without raising the
supervisor’s suspicions.
Not all steganography applications have to be malicious, however. For instance, it can be
used to embed patient information into X-rays, CT scans and so on, to classify
multimedia content, or to be integrated into authentication mechanisms.
VI Conclusions
The cover files do not need to be images exclusively; any medium is valid (audio, video,
executable files, etc.). Far from being an exclusively theoretical invention, steganography
is actively used by malware, and it is to be expected that new uses will arise to make the
work of analysts more difficult.
[3] Further information at http://www.usatoday.com/tech/news/2001-02-05-binladen.htm
In any case, the efficiency of the new steganalysis techniques makes it necessary to use
steganography combined with cryptography in order to achieve a reasonable
security level. Cryptography ensures the confidentiality of a conversation but does not hide
the fact that the conversation is taking place. Steganography alone, on the other hand, can
hide the fact that a conversation is taking place, but once the interaction is revealed, an
attacker can learn the exchanged content. Even when discovering the
original content is difficult, an attacker may modify the stego-object to prevent
communication (active attack). Combining both techniques achieves a
complementarity that greatly increases the security of the message exchange.