
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY COCHIN 682022 2011

Seminar Report On Confidential Data Storage and Deletion Submitted By Mithun A V In partial fulfilment of the requirement for the award of Degree of Master of Technology (M.Tech) In Computer and Information Science

DEPARTMENT OF COMPUTER SCIENCE COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY COCHIN 682022

Certificate
This is to certify that the Seminar report entitled Confidential Data Storage and Deletion, submitted by Mithun A V, Semester II, in the partial fulfilment of the requirement for the award of M.Tech. Degree in Computer and Information Science is a bonafide record of the Seminar presented by him in the academic year 2011.

G Santhosh Kumar Seminar Guide

Dr. K Paulose Jacob Head of the Department

Department of Computer Science

CUSAT

ACKNOWLEDGEMENT
I express my profound gratitude to the Head of the Department, Dr. K Paulose Jacob, for allowing me to proceed with the seminar and for giving me full freedom to access the lab facilities. My heartfelt thanks go to my guide, Mr. G Santhosh Kumar, Lecturer, Department of Computer Science, for taking the time to help me through my seminar. He has been a constant source of encouragement, without which the seminar might not have been completed on time. I am very grateful for his guidance. I am also thankful to Dr. Sumam Mary Idicula for helping me with my seminar. Her ideas and thoughts have been of great importance.


ABSTRACT
With the decrease in the cost of electronic storage media, more and more sensitive data is stored on such media. Laptop computers regularly go missing, either because they are lost or because they are stolen. These laptops contain confidential information in the form of documents, presentations, emails, cached data, and network access credentials. This confidential information is typically far more valuable than the laptop hardware, especially if it falls into the wrong hands. There are two major aspects to safeguarding the privacy of data on these storage media and laptops. First, data must be stored in a confidential manner. Second, we must make sure that confidential data, once deleted, can no longer be restored. Various methods exist to store confidential data, such as encryption programs and encrypting file systems. Microsoft BitLocker Drive Encryption provides encryption for hard disk volumes and is available with the Windows Vista Ultimate and Enterprise editions. This seminar describes the most commonly used encryption algorithm, the Advanced Encryption Standard (AES), which is used in many confidential data storage methods. This seminar also describes some confidential data erasure methods, such as physical destruction, data overwriting and key erasure.

Keywords: Privacy of data, Confidential data storage, Encryption, Advanced Encryption Standard (AES), Microsoft BitLocker, Confidential data erasure, Data overwriting, Key erasure.


Confidential Data Storage and Deletion

Contents
1. Introduction
2. Encryption
   2.1 Advanced Encryption Standard (AES)
      2.1.1 Sub Bytes
      2.1.2 Shift Rows
      2.1.3 Mix Columns
      2.1.4 Add Round Key
      2.1.5 Key Expansion
3. Confidential Data Storage
   3.1 Block based Encryption System
      3.1.1 Microsoft Bit Locker
      3.1.2 User Space File System
      3.1.3 Encryption Programs
   3.2 Hardware Based Methods
4. Confidential Data Erasure
   4.1 Physical Destruction
   4.2 Data Overwriting
      4.2.1 Software Applications
      4.2.2 File System
   4.3 Encryption with Key Erasure
5. Other Challenges
   5.1 Hard Disk Issues
   5.2 Data Life Time Problem
6. Conclusion
7. References


1. Introduction
As the cost of electronic storage declines rapidly, more and more sensitive data is stored on media such as hard disks, CDs, and pen drives. Many computers store data about personal finances, online transactions, tax records, and passwords for bank accounts and emails. All of this sensitive information is vulnerable to theft. Sensitive data may also be leaked accidentally through improper disposal or resale of storage media. To protect the secrecy of data over its entire lifetime, we must have confidential ways to store and delete data.

Traditional methods for protecting confidential information rely on upholding system integrity. If a computer is safe from hackers and malicious software (malware), then so is its data. Ensuring integrity in today's interconnected world, however, is exceedingly difficult.

There are two major components to safeguarding the privacy of data on electronic storage media. First, the data must be stored confidentially without incurring much inconvenience during normal use. Second, data must be removed from the storage medium in an irrecoverable manner at the time of disposal.

The general concept of secure handling of data is composed of three aspects: confidentiality, integrity, and availability. Confidentiality involves ensuring that information is not read by unauthorized persons. Using encryption to store data and authenticating valid users are examples of means by which confidentiality is achieved. Integrity ensures that the information is not altered by unauthorized persons. Storing a message authentication code or a digital signature computed on encrypted data is a way to verify integrity. Finally, availability ensures that data is accessible when needed. Having multiple servers, so that the system can withstand a malicious shutdown of one server, is one way to improve availability.


2. Encryption
Encryption is the process of transforming information (referred to as plaintext) using an algorithm (called a cipher) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The unreadable text created is known as ciphertext. The reverse process is known as decryption. There are two basic techniques for encrypting information: symmetric encryption (also called secret key encryption) and asymmetric encryption (also called public key encryption). Symmetric encryption is the oldest and best-known technique. A secret key, which can be a number, a word, or just a string of random letters, is applied to the text of a message to change the content in a particular way. This might be as simple as shifting each letter by a number of places in the alphabet. As long as both sender and recipient know the secret key, they can encrypt and decrypt all messages that use this key. This is shown in Fig 1.
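The letter-shifting idea mentioned above can be sketched in a few lines of Python. This is only an illustration of a shared secret key; the function name `shift_cipher` and the sample message are assumptions made for this sketch, and a shift cipher offers no real security.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_cipher(text: str, key: int) -> str:
    """Shift every letter by `key` places; other characters pass through."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


secret_key = 3  # known to both sender and recipient
ciphertext = shift_cipher("MEET AT NOON", secret_key)
plaintext = shift_cipher(ciphertext, -secret_key)  # decryption shifts back
```

The same key is used in both directions, which is what makes the scheme symmetric.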

Fig 1. Symmetric Key Encryption

The problem with secret keys is exchanging them over the Internet or a large network while preventing them from falling into the wrong hands. Anyone who knows the secret key can decrypt the message. One answer is asymmetric encryption, in which there are two related keys, a key pair. A public key is made freely available to anyone who might want to send you a message. A second, private key is kept secret, so that only you know it. Any message (text, binary files, or documents) that is encrypted by using the public key can only be decrypted by applying the same algorithm with the matching private key. Any message that is encrypted by using the private key can only be decrypted by using the matching public key. Here, we do not have to worry about passing public keys over the Internet (the keys are supposed to be public). A problem with asymmetric encryption, however, is that it is slower than symmetric encryption. It requires far more processing power to both encrypt and decrypt the content of the message.

2.1 Advanced Encryption Standard (AES)
In cryptography, the Advanced Encryption Standard (AES) is a symmetric-key encryption standard adopted by the U.S. government. The standard comprises three block ciphers, AES-128, AES-192 and AES-256, adopted from a larger collection originally published as Rijndael. Each of these ciphers has a 128-bit block size, with key sizes of 128, 192 and 256 bits, respectively. The AES ciphers have been analysed extensively and are now used worldwide, as was the case with their predecessor, the Data Encryption Standard (DES). AES was announced by the National Institute of Standards and Technology (NIST) as U.S. FIPS PUB 197 (FIPS 197) on November 26, 2001. There are three versions of AES, with 10, 12 and 14 rounds, corresponding to key sizes of 128, 192 and 256 bits respectively. The general design of an AES encryption cipher is given in Fig 2.
[Fig 2. General design of AES Encryption. The 128-bit plaintext passes through a pre-round transformation (round key K0) and then Nr rounds (round keys K1 to KNr, the last round being slightly different) to produce the 128-bit ciphertext. The 128-bit round keys are produced from the cipher key (128, 192 or 256 bits) by Key Expansion.]

Nr    Key size (bits)
10    128
12    192
14    256


AES uses five units of measurement to refer to data: bits, bytes, words, blocks and states. A bit is a binary digit with a value of 0 or 1. A byte is a group of 8 bits that can be treated as a single entity, a row matrix (1 x 8) of 8 bits. A word is a group of 32 bits that can be treated as a single entity, a row matrix of 4 bytes. A block is a group of 128 bits. AES encrypts and decrypts data blocks. AES uses several rounds, each made of several stages, and the data block is transformed from one stage to the next. At the beginning and end of the cipher, AES uses the term data block; before and after each stage, the data block is referred to as a state.
[Fig 3. Structure of each round: State -> Sub Bytes -> State -> Shift Rows -> State -> Mix Columns -> State -> Add Round Key (with the Round Key as second input) -> State]


Fig 3 shows the structure of each round at the encryption side. Each round except the last uses four invertible transformations. The last round has only three: the third transformation, Mix Columns, is omitted. One additional Add Round Key is applied before the first round. At the decryption side, the inverse transformations are used.

2.1.1 Sub Bytes
The first transformation, Sub Bytes, is used at the encryption side. To substitute a byte, we interpret the byte as two hexadecimal digits. The left digit defines the row and the right digit defines the column of the substitution table. The two hexadecimal digits at the junction of the row and the column are the new byte. In the Sub Bytes transformation a state is treated as a 4 x 4 matrix of bytes. The transformation is done one byte at a time. The content of each byte is changed, but the arrangement of bytes in the matrix remains the same. Fig 4 shows this idea.
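The row and column lookup can be sketched as follows. The report does not say how the substitution table is filled; the sketch below assumes the construction given in FIPS 197 (multiplicative inverse in GF(2^8) followed by an affine transformation) so that the table need not be typed in by hand. The names `TABLE`, `sub_byte` and `sub_bytes` are chosen for this illustration only.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def rotl8(x: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x: int) -> int:
    """Inverse followed by the affine transformation (per FIPS 197)."""
    inv = gf_inverse(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


# 16 x 16 substitution table: TABLE[row][column]
TABLE = [[sbox_entry(16 * row + col) for col in range(16)] for row in range(16)]


def sub_byte(byte: int) -> int:
    """Left hex digit selects the row, right hex digit selects the column."""
    row, col = byte >> 4, byte & 0x0F
    return TABLE[row][col]


def sub_bytes(state):
    """Substitute every byte of a 4 x 4 state; positions are unchanged."""
    return [[sub_byte(b) for b in row] for row in state]
```

For example, the byte (53)16 is looked up at row 5, column 3 of the table.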

[Fig 4. Sub Bytes transformation: each byte of the state is replaced, through the substitution table, by a new byte at the same position in the new state]

2.1.2 Shift Rows
Shifting is a permutation of bytes. Unlike DES, in which permutation is done at the bit level, the shifting transformation in AES is done at the byte level; the order of bits within a byte is not changed. The number of shifts depends on the row number (0, 1, 2 or 3) of the state matrix. This means that row 0 is not shifted at all and the last row is shifted by three bytes. Fig 5 shows this idea. The Shift Rows transformation operates one row at a time.
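A minimal sketch of Shift Rows, assuming the state is held as a list of four rows of four bytes each (a representation chosen for this illustration):

```python
def shift_rows(state):
    """Rotate row r of the state left by r bytes (row 0 is left unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


state = [
    [0x00, 0x01, 0x02, 0x03],
    [0x10, 0x11, 0x12, 0x13],
    [0x20, 0x21, 0x22, 0x23],
    [0x30, 0x31, 0x32, 0x33],
]
shifted = shift_rows(state)
```

Only the positions of the bytes change; every byte value is carried over intact.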


[Fig 5. Shift Rows transformation (shift left). Row 0: no shift; Row 1: 1-byte shift; Row 2: 2-byte shift; Row 3: 3-byte shift]

2.1.3 Mix Columns
The Mix Columns transformation operates at the column level; it transforms each column of the state into a new column. The transformation is actually the matrix multiplication of a state column by a constant square matrix. The bytes in the state column and the constant matrix are interpreted as 8-bit words (or polynomials). Fig 6 shows this idea.

[Fig 6. Mix Columns transformation: each column of the state is multiplied by the constant matrix to give the corresponding column of the new state]
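The column-by-matrix multiplication can be sketched as below. The report does not list the entries of the constant matrix; the values used here are the ones specified in FIPS 197, and byte products are taken in GF(2^8) rather than with ordinary integer arithmetic. The names `MIX_MATRIX` and `mix_column` are assumptions of this sketch.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# Constant square matrix of Mix Columns (values from FIPS 197).
MIX_MATRIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]


def mix_column(column):
    """Multiply one 4-byte state column by the constant matrix."""
    new_column = []
    for matrix_row in MIX_MATRIX:
        value = 0
        for coefficient, byte in zip(matrix_row, column):
            value ^= gf_mul(coefficient, byte)  # addition in GF(2^8) is XOR
        new_column.append(value)
    return new_column


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```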


2.1.4 Add Round Key
Add Round Key adds a round key word to each state column; the addition is a bitwise XOR. Like Mix Columns, Add Round Key proceeds column by column. This idea is shown in the figure below. AES uses a process called Key Expansion that creates Nr + 1 round keys from the cipher key.

[Fig 6. Add Round Key transformation: each column of the state is combined with one word of the round key to give the corresponding column of the new state]
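A minimal sketch of Add Round Key, assuming the state and the round key are both held as four 4-byte columns (words). The sample values are the example input block and cipher key printed in FIPS 197; the helper `words_from_hex` is a name introduced for this sketch.

```python
def add_round_key(state_columns, round_key_words):
    """XOR each state column with the matching round key word."""
    return [
        [s ^ k for s, k in zip(column, word)]
        for column, word in zip(state_columns, round_key_words)
    ]


def words_from_hex(text: str):
    """Split 16 bytes given in hex into four 4-byte columns."""
    raw = bytes.fromhex(text)
    return [list(raw[i:i + 4]) for i in range(0, 16, 4)]


state = words_from_hex("3243f6a8885a308d313198a2e0370734")
round_key = words_from_hex("2b7e151628aed2a6abf7158809cf4f3c")
new_state = add_round_key(state, round_key)
```

Because XOR is its own inverse, applying the same round key a second time restores the original state, which is why the transformation is invertible.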

2.1.5 Key Expansion
Key expansion creates round keys word by word. The routine creates 4 x (Nr + 1) words, called $w_0, w_1, w_2, \ldots, w_{4(N_r+1)-1}$.

Round        Words
Pre-round    w0      w1        w2        w3
1            w4      w5        w6        w7
2            w8      w9        w10       w11
...
Nr           w4Nr    w4Nr+1    w4Nr+2    w4Nr+3

The first four words (w0, w1, w2, w3) are made from the cipher key. The remaining words (wi for i = 4 to 43 in AES-128) are made as follows:

$$ w_i = w_{i-1} \oplus w_{i-4} \qquad \text{if } i \bmod 4 \neq 0 $$

$$ w_i = t \oplus w_{i-4} \qquad \text{if } i \bmod 4 = 0 $$

$$ \text{where } t = \mathrm{SubWord}(\mathrm{RotWord}(w_{i-1})) \oplus \mathrm{RCon}_{i/4} $$

The RotWord transformation is similar to the Shift Rows transformation, but it is applied to only one row. The routine takes a word as an array of four bytes and shifts each byte to the left with wrapping. The SubWord routine is similar to the Sub Bytes transformation, but it is applied to only four bytes. The routine takes each byte in the word and substitutes another byte for it. Each round constant, RCon, is a 4-byte value in which the rightmost three bytes are always zero.

Round    Constant (RCon)      Round    Constant (RCon)
1        (01 00 00 00)16      6        (20 00 00 00)16
2        (02 00 00 00)16      7        (40 00 00 00)16
3        (04 00 00 00)16      8        (80 00 00 00)16
4        (08 00 00 00)16      9        (1B 00 00 00)16
5        (10 00 00 00)16      10       (36 00 00 00)16
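The key expansion rules can be sketched for the 128-bit case as follows. The substitution table is not printed in this report, so the sketch regenerates it using the construction from FIPS 197 (an assumption of this example); the function name `expand_key_128` is likewise chosen here for illustration.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x: int) -> int:
    """Substitution value: inverse in GF(2^8), then affine transformation."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)
    inv = inv if x else 0
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]

# Leftmost byte of each round constant; the other three bytes are zero.
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def rot_word(word):
    """Shift the four bytes one position to the left with wrapping."""
    return word[1:] + word[:1]


def sub_word(word):
    """Substitute each of the four bytes."""
    return [SBOX[b] for b in word]


def expand_key_128(key: bytes):
    """Return the 44 words w0..w43 for a 128-bit cipher key (Nr = 10)."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = sub_word(rot_word(temp))
            temp = [temp[0] ^ RCON[i // 4 - 1]] + temp[1:]
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words


words = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
```

The round key for round r is then the group of four words w4r to w4r+3.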


3. Confidential Data Storage


Achieving confidentiality means storing data in a way that can be read or deciphered only by authorized persons. No unauthorized persons should be able to read or otherwise obtain meaningful information from this data, even with physical access to the storage media (e.g., a stolen laptop).

Fig 7 shows the storage data paths for popular Unix-based and Windows operating systems. For both platforms, applications reside in user space. When a Unix application makes a call to a file system, the call crosses the kernel boundary and is handled by the Virtual File System (VFS) layer. VFS provides functions commonly used in various file systems to ease individual file system implementations, and allows different file systems to coexist, including local file systems such as ext3 and network file systems such as NFS. Local file systems then proceed to read and write to the block layer, which provides a unified API to access block-layer devices.

When a Windows application makes a file system call, that call gets passed to the I/O Manager. The I/O Manager translates application file system calls into I/O request packets, which it then translates into device-specific calls. The File System Drivers are high-level drivers such as FAT and NTFS. These drivers rely on the Storage Device Drivers, which are lower-level drivers that directly access the storage media.

Note that the UNIX and Windows storage data paths share an almost one-to-one mapping in terms of their internal structures. Thus, a confidential storage solution designed for one can be generalized to both platforms.
[Fig 7. Unix and Windows storage path.
UNIX storage path: Application (user space) -> VFS -> File System -> Block Layer -> Storage Media.
Windows storage path: Application (user space) -> I/O Manager -> File System Driver -> Storage Device Drivers -> Storage Media.]



3.1 Block based Encryption System
Block-based encryption systems work at a lower layer of abstraction than file systems. In other words, these systems work transparently below file systems to encrypt data at the disk-block level. Examples of block-based encryption systems include dm-crypt, BestCrypt, the CryptoGraphic Disk driver, the Encrypted Volume and File System, and Microsoft BitLocker Drive Encryption.

3.1.1 Microsoft Bit Locker
BitLocker is not a software-only technology. Every software-only solution is vulnerable to software-only attacks. BitLocker makes use of the Trusted Platform Module (TPM) security chip, which will be incorporated in most PCs in the near future. The TPM is a tamper-resistant chip mounted on the motherboard. Though the TPM has many functions, BitLocker uses only a few basic ones.

The TPM keeps several Platform Configuration Registers, or PCRs. At power-up the PCRs are set to zero. PCRs are only modified by the extend function, which sets a PCR to the hash of its old value and a supplied data string. We can think of a PCR as a hash over all the data strings provided in extend function calls for that PCR. There is no other way to set the value of a PCR, so if a PCR has value x after a sequence of extends, then the only way to reach the value x again is to perform the exact same sequence of extends after a power-up.

The seal/unseal functions of the TPM allow selective access to cryptographic keys based on PCR values. The seal function is used to encrypt a key into a string which can only be decrypted by that same TPM. Furthermore, the TPM will decrypt the string if and only if the selected PCRs have the value that was specified during the seal operation. In other words, we can store a key in an encrypted string so that it can only be accessed when selected PCRs have a particular value. During the boot process the PCRs are used to keep track of the code that runs. The key used to encrypt the disk is sealed against a particular set of PCR values.
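The extend and seal/unseal behaviour described above can be modelled with a short sketch. This is a toy simulation, not TPM code: the choice of SHA-1, the 20-byte register size, the `SealedKey` class and the sample boot stages are all assumptions made for illustration, and a real TPM encrypts the sealed key inside the chip rather than simply comparing values.

```python
import hashlib

PCR_SIZE = 20  # digest length of SHA-1 (assumed hash for this sketch)


def extend(pcr: bytes, data: bytes) -> bytes:
    """New PCR value = hash(old PCR value || supplied data string)."""
    return hashlib.sha1(pcr + data).digest()


def measure_boot(stages) -> bytes:
    """Start from zero at power-up and extend with each piece of boot code."""
    pcr = bytes(PCR_SIZE)
    for code in stages:
        pcr = extend(pcr, code)
    return pcr


class SealedKey:
    """Toy stand-in for a sealed key (hypothetical; no real encryption)."""

    def __init__(self, key: bytes, expected_pcr: bytes):
        self._key = key
        self._expected_pcr = expected_pcr

    def unseal(self, current_pcr: bytes):
        """Release the key only if the PCR matches the value used at sealing."""
        return self._key if current_pcr == self._expected_pcr else None


trusted_boot = [b"BIOS code", b"MBR boot sector", b"OS loader"]
other_boot = [b"BIOS code", b"other boot sector", b"other OS loader"]

sealed = SealedKey(b"disk-encryption-key", measure_boot(trusted_boot))
key_on_normal_boot = sealed.unseal(measure_boot(trusted_boot))
key_on_other_boot = sealed.unseal(measure_boot(other_boot))
```

Because each extend hashes the previous value, the final PCR depends on every measured stage and on their order, so booting different code cannot reproduce the sealed value.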
During a normal boot the PCRs reach the same values, and the key can be unsealed by the TPM. If an attacker boots into any other operating system, the machine will be fully functional but the PCR values will be different and the TPM will not unseal the key. Thus, other operating systems cannot read the data on the disk, or find out how to modify the disk to reset the Administrator password.

At power-up the processor starts running the BIOS from ROM. The first part of the BIOS cannot be modified. This part extends the BIOS PCR with the entire BIOS code and proceeds with the rest of the BIOS start-up. After BIOS initialization the BIOS reads the Master Boot Record (MBR) of the hard disk, extends the boot sector PCR with the sector's data, and then executes the code in the boot sector. The boot sequence of a PC contains several more iterations, but in each case the newly loaded code is first measured using an extend function before it is executed. These functions do not interfere with the boot process of another operating system. Other operating systems can boot normally, but the TPM PCRs will have a different value. The PCRs merely report what software was run during the boot process.

Fig 8 gives an overview of the encryption scheme. There are four separate operations in each encryption. The plaintext is XORed with a sector key, then run through two (unkeyed) diffusers, and finally encrypted with AES in CBC mode.

[Fig 8. An overview of AES-CBC + Diffuser: the plaintext (512 to 8192 bytes) is XORed with a sector key derived from the key (512 bits), passed through the A diffuser and the B diffuser, and encrypted with AES-CBC to give the ciphertext]

The AES-CBC component is straightforward. The AES key $K_{AES}$ is either 128 bits or 256 bits, depending on the selected version. The block size is always a multiple of 16 bytes, so no padding is necessary. The IV for sector s is computed as:

$$ IV_s = E(K_{AES}, e(s)) $$

where E() is the AES encryption function, and e() is an encoding function that maps each sector number s into a unique 16-byte value. The plaintext is encrypted using AES-CBC and the IV for the sector. Decryption is the obvious inverse function.

The sector key for sector s is defined by:

$$ K_s = E(K_{sec}, e(s)) \,\|\, E(K_{sec}, e'(s)) $$

where E() is the AES encryption function, $K_{sec}$ is the 128 or 256-bit key for this component, e() is the encoding function used in the AES-CBC layer, and e'(s) is the same as e(s) except that the last byte of the result has the value 128.

The A and B diffusers are very similar, but work in opposite directions. The core diffuser design has good diffusion properties in one direction and bad diffusion properties in the other direction. Having two diffusers provides good diffusion in both directions. The diffusers have been designed in the decryption direction, as decryption is the more common operation. Each diffuser interprets the sector data as an array of 32-bit words, where each word is encoded using the least-significant-byte-first convention. Let n be the number of words in the sector, and $(d_0, d_1, \ldots, d_{n-1})$ be the words of the sector. For index values outside this range, define $d_i = d_{i \bmod n}$ to allow easy wrap-around without confusing notation. The decryption function of the A diffuser is given by:

for $i = 0, 1, 2, \ldots, n \cdot A_{cycles} - 1$:

$$ d_i \leftarrow d_i + \left( d_{i-2} \oplus \left( d_{i-5} \lll R^{(a)}_{i \bmod 4} \right) \right) $$

The value i is a loop counter that goes around the data array $A_{cycles}$ times. The addition is modulo $2^{32}$, $\lll$ is the rotate-left operator, and $R^{(a)} = [9, 0, 13, 0]$ is an array of 4 constants that specify the rotation amounts. The corresponding encryption function of the A diffuser is given by:

for $i = n \cdot A_{cycles} - 1, \ldots, 2, 1, 0$:

$$ d_i \leftarrow d_i - \left( d_{i-2} \oplus \left( d_{i-5} \lll R^{(a)}_{i \bmod 4} \right) \right) $$

The B diffuser is very similar. It has good diffusion in the encryption direction. The B diffuser decryption function is given by:

for $i = 0, 1, 2, \ldots, n \cdot B_{cycles} - 1$:

$$ d_i \leftarrow d_i + \left( d_{i+2} \oplus \left( d_{i+5} \lll R^{(b)}_{i \bmod 4} \right) \right) $$

where $R^{(b)} = [0, 10, 0, 25]$. The B diffuser encryption function is:

for $i = n \cdot B_{cycles} - 1, \ldots, 2, 1, 0$:

$$ d_i \leftarrow d_i - \left( d_{i+2} \oplus \left( d_{i+5} \lll R^{(b)}_{i \bmod 4} \right) \right) $$

The constants $A_{cycles}$ and $B_{cycles}$ define how many times each of the diffusers loops around the sector, and are chosen as $A_{cycles} = 5$ and $B_{cycles} = 3$.

3.1.2 User Space File System
User-space file systems take advantage of the Filesystem in Userspace (FUSE) module, which is a Unix kernel module that allows a virtual file system to be built inside a user-space program without having to write any kernel-level code. FUSE intercepts VFS calls and directs them to a user-space file system with added security features before forwarding requests to an underlying legacy file system in the kernel space. Two examples of FUSE-based secure storage file systems are EncFS and CryptoFS. Both systems are similar in:
i. Storing encrypted files and file names in encrypted directories;
ii. Requiring users to mount encrypted directories onto a special mount point with the correct key to see decrypted files and file names;
iii. Prompting users for a password to generate the encryption key;
iv. Typically supporting common encryption algorithms such as AES, DES, Blowfish and Twofish, based on what is available in external encryption libraries; and
v. Encrypting files on a per-block basis.

[Fig 9. User Space File System. User space: Application (glibc) and User Space File System (libfuse). Kernel space: VFS, FUSE, Lower File System (such as ext3), Storage Media.]

3.1.3 Encryption Programs
Software encryption programs come in two flavors: generalized encryption programs and built-in encryption mechanisms in applications. Generalized encryption programs can encrypt and decrypt files using a variety of ciphers and encryption modes; several examples are mcrypt, openssl, and gpg. Many applications also include cryptographic options to protect the confidentiality of files. Examples include the text editor vim and Microsoft Office products such as Word and Excel. These applications either derive the key from the user's system information (such as a password) or prompt for a key or passphrase at the beginning of the session.

3.2 Hardware Based Methods
The secure flash drive is a relatively new phenomenon on the market today, apparently in response to data and identity theft. Two example products are Ironkey and the Kingston Data Traveler Secure. Ironkey uses hardware-based AES encryption in CBC mode with 128-bit randomly generated keys. These keys are generated inside the flash drive in a CryptoChip and are unlocked by a user password. Ten wrong attempts will trigger a self-destruct of the encryption keys. Two volumes become available when the Ironkey is inserted: one software volume and one encrypted volume where user data is stored. Password entering (or unlocking) software is installed on the software volume, and not on the host computer, yet it must be executed on the flash drive using the host operating system. The Kingston Data Traveler Secure is similar to Ironkey, except that it uses 256-bit keys with AES encryption and allows users to store data on either the encrypted or the non-encrypted partition.

Hardware-based encryption flash drives can employ good encryption techniques and, similarly to software block-based encryption systems, do not reveal the directory structure. In terms of overhead, all encryption operations are performed on the flash drives themselves and do not consume CPU cycles and memory on the host machine. Therefore, the performance depends on the speed of the on-flash cryptographic processing, flash data access times, and the interface used to access secure flash drives.

Hard disk enclosures and extension cards (either PCI or PCMCIA) have been used for several years as a fast, transparent encryption mechanism for sensitive data. Examples include SecureDisk Hardware and RocSecure Hard Drives. These solutions intercept and encrypt or decrypt data going to and from the hard drive in real time and use a specialized USB thumb drive as the encryption key. The encryption key is generated and placed on the thumb drive by the manufacturer, and often the method of key generation (i.e., the randomness technique) is not disclosed. Enclosures and extension cards can employ good encryption techniques and do not divulge any information about files or the structure of the file system on disk. Changing the confidentiality policy is not supported, since keys, encryption algorithms, and the mode of operation cannot be changed. Secure storage is not performed on a per-file level, so the entire hard disk must be encrypted. This characteristic may limit flexibility with regard to security policy.


4. Confidential Data Erasure


When confidential data has to be removed, we must be sure that once deleted, the data can no longer be restored. A full secure data lifecycle implies that data is not only stored securely, but deleted in a secure manner as well. However, typical file deletion (encrypted or not) only removes a file name from its directory or folder, while the file's content is still stored on the physical media until the data blocks are overwritten. Many forensic techniques are available to the determined (and well-funded) attacker to recover the data. CMRR scanning microscopes can recover data from a piece of a destroyed disk if any remaining piece is larger than a single 512-byte record block, which is about 1/125 inch on today's drives. Magnetic force microscopy and magnetic force scanning tunneling microscopy analyze the polarity of the magnetic domains of the electronic storage medium and can recover data in minutes. A less well-funded attacker can resort to many drive-independent data recovery techniques, which may be used on most hard drives independently of their make. The existence of these recovery techniques makes it mandatory that sensitive data be securely deleted from its storage media.

Confidential data deletion can be accomplished in three ways: physical destruction of the storage medium, overwriting all of the sensitive data, and securely overwriting the key of encrypted sensitive data. Each method has its relative strengths and will be addressed in the following sections.

4.1 Physical Destruction
One way of deleting sensitive data is through physical destruction. For example, a U.S. Department of Defense (DoD) document states that classified material may be destroyed by numerous methods including smelting, shredding, sanding, pulverization, or acid bath. Needless to say, these methods will leave the storage medium unusable. With smelting, the hard drive is melted down into liquid metal, effectively destroying any data contained therein. Shredding grinds the hard drive down into small pieces of scrap metal that cannot be reconstructed. The sanding process grinds the hard drive platter down with an emery wheel or disk sander until the recordable surface is removed completely. Pulverization is the act of pounding or crushing a hard drive into smaller pieces through a mechanical process. An acid bath can be used for destruction of data on hard drive platters. A 58% concentration of hydriodic acid will remove the recordable surface of the platter.

Magnetic degaussing is another option that erases data by exposing a hard drive platter to an inverted magnetic field, which leaves data unrecoverable by software or laboratory attempts. This method also renders the storage media unusable. Physical destruction methods provide great confidentiality. On the other hand, the granularity of data destruction is the entire drive. For example, we cannot securely delete only one file using these methods. Therefore, this method does not support flexible security policies. Many of the discussed physical destruction methods require specialized equipment (which may not be easy to obtain) and potential physical removal of the storage media (which
may not be easy to perform), so physical destruction may not be straightforward to perform. Conversely, since physical destruction can destroy large amounts of data in a relatively short amount of time, the performance in this sense is quite good.

4.2 Data Overwriting
Another way to remove confidential data is to overwrite the data. NIST recommends that magnetic media be degaussed or overwritten at least three times.

Pass    Data Written (binary)         Hex Code
1-4     Random                        Random
5       01010101 01010101 01010101    0x55 0x55 0x55
6       10101010 10101010 10101010    0xAA 0xAA 0xAA
7       10010010 01001001 00100100    0x92 0x49 0x24
8       01001001 00100100 10010010    0x49 0x24 0x92
9       00100100 10010010 01001001    0x24 0x92 0x49
10      00000000 00000000 00000000    0x00 0x00 0x00
11      00010001 00010001 00010001    0x11 0x11 0x11
12      00100010 00100010 00100010    0x22 0x22 0x22
13      00110011 00110011 00110011    0x33 0x33 0x33
14      01000100 01000100 01000100    0x44 0x44 0x44
15      01010101 01010101 01010101    0x55 0x55 0x55
16      01100110 01100110 01100110    0x66 0x66 0x66
17      01110111 01110111 01110111    0x77 0x77 0x77
18      10001000 10001000 10001000    0x88 0x88 0x88
19      10011001 10011001 10011001    0x99 0x99 0x99
20      10101010 10101010 10101010    0xAA 0xAA 0xAA
21      10111011 10111011 10111011    0xBB 0xBB 0xBB
22      11001100 11001100 11001100    0xCC 0xCC 0xCC
23      11011101 11011101 11011101    0xDD 0xDD 0xDD
24      11101110 11101110 11101110    0xEE 0xEE 0xEE
25      11111111 11111111 11111111    0xFF 0xFF 0xFF
26      10010010 01001001 00100100    0x92 0x49 0x24
27      01001001 00100100 10010010    0x49 0x24 0x92
28      00100100 10010010 01001001    0x24 0x92 0x49
29      01101101 10110110 11011011    0x6D 0xB6 0xDB
30      10110110 11011011 01101101    0xB6 0xDB 0x6D
31      11011011 01101101 10110110    0xDB 0x6D 0xB6
32-35   Random                        Random

Table 1. Peter Gutmann's 35-Pass Overwrite Technique
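The pass sequence in Table 1 is regular enough to be generated rather than typed out. The following is a minimal sketch, assuming the passes are applied in the order listed in the table; the names FIXED_PATTERNS, gutmann_passes, and fill_buffer are hypothetical and chosen only for illustration.

```python
import os

# Fixed three-byte patterns for passes 5-31, transcribed from Table 1.
FIXED_PATTERNS = (
    [bytes([0x55] * 3), bytes([0xAA] * 3)]                        # passes 5-6
    + [bytes.fromhex(h) for h in ("924924", "492492", "249249")]  # passes 7-9
    + [bytes([0x11 * i] * 3) for i in range(16)]                  # passes 10-25
    + [bytes.fromhex(h) for h in ("924924", "492492", "249249")]  # passes 26-28
    + [bytes.fromhex(h) for h in ("6DB6DB", "B6DB6D", "DB6DB6")]  # passes 29-31
)


def gutmann_passes():
    """Return 35 entries: None marks a random pass, otherwise a 3-byte pattern."""
    return [None] * 4 + FIXED_PATTERNS + [None] * 4


def fill_buffer(pattern, size):
    """Build `size` bytes for one pass (random bytes when pattern is None)."""
    if pattern is None:
        return os.urandom(size)
    repeats = size // len(pattern) + 1
    return (pattern * repeats)[:size]
```

A caller would iterate over gutmann_passes() and write fill_buffer(pattern, size) over the target region once per pass.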

The Department of Defense document DoD 5220.22-M (the National Industrial Security Program Operating Manual) suggests an overwrite with a character, its complement, then a random character, as well as other software-based methods that apply to nonvolatile electronic storage, as listed in Table 2.

ID   Erasure Method
C    Overwrite all addressable locations with a character.
D    Overwrite all addressable locations with a character, its complement, then a random character and verify.
E    Overwrite all addressable locations with a character, its complement, then a random character.
H    Overwrite all locations with a random pattern, with binary zeros, and then with binary ones.

Table 2. Software-Based Methods of Erasing Data on Nonvolatile Storage, defined in the National Industrial Security Program Operating Manual

Peter Gutmann developed a 35-pass data overwriting scheme to work on older disks that use data-encoding schemes such as run-length-limited (RLL) encodings. The basic idea is to flip each magnetic domain on the disk back and forth as much as possible without writing the same pattern twice in a row, and to saturate the disk surface to the greatest depth possible. Peter Gutmann's 35-pass overwrite technique is shown in Table 1.

Three main methods exist to delete data securely from electronic storage media. These methods involve software applications, file systems, and hard disk mechanisms. Their characteristics and relative strengths are discussed in the following sections.

4.2.1 Software Applications
Three main software methods exist for overwriting sensitive data:
- Overwrite the contents of a file.
- Delete the file normally, and then overwrite all free space in the partition.
- Erase the entire partition or disk.
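Overwriting the contents of a single file can be sketched in a few lines. The example below is an illustration only, not the implementation of any particular utility: it applies the character, complement, random sequence of method E in Table 2, and it assumes a file system that updates data blocks in place. The function name overwrite_file and its parameters are hypothetical.

```python
import os


def overwrite_file(path, char=0x55, chunk_size=64 * 1024):
    """Overwrite a file in place: a character, its complement, then random data."""
    size = os.path.getsize(path)
    complement = char ^ 0xFF
    passes = [
        lambda n: bytes([char]) * n,        # pass 1: fixed character
        lambda n: bytes([complement]) * n,  # pass 2: bitwise complement
        os.urandom,                         # pass 3: random bytes
    ]
    with open(path, "r+b") as f:
        for make_data in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(make_data(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device
```

The sketch touches only the file's data blocks. The file name and other metadata are left as they were, and the file itself still has to be unlinked afterwards.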

The first method is probably the quickest if only a few small files are to be securely overwritten. Many utilities, both free and commercial, are available to perform this operation. Two common UNIX utilities are shred, made by the Free Software Foundation, Inc., and wipe. The shred utility overwrites a file's content with random data for a configurable number of passes (25 by default in older releases; current GNU coreutils releases default to 3). However, shred does not work on file systems that do not overwrite data in place, and it does not overwrite a file's metadata. In contrast, the wipe utility writes over file data using the 35 overwrite patterns recommended by Peter Gutmann. It also attempts to remove filenames by renaming them, although this does not guarantee that the old filename (or metadata) will be overwritten. The wipe utility has file system limitations similar to those of shred. Overwriting all the free space in the partition is more of an afterthought method and might be employed after files have been deleted the normal way. One example is scrub, an open-source Unix utility, which erases free space in a partition by creating a file that extends
to all the free space. The user needs to remember to remove the file after the application is done. Erasing the entire partition or disk securely deletes all confidential information on the partition or disk, such as data, metadata, and directory structures. One such software utility is Darik's Boot and Nuke, or DBAN, which is a self-contained boot floppy that wipes a hard drive by filling it with random data. Depending on the size of the drive, the erasure process can take a long time.

Neither the software file-erasure nor the free-space-erasure methods will write over previously deleted metadata. Therefore, these methods can still leak confidential information. On the other hand, partition overwriting software erases all data and metadata, as well as the structure of the file system. Flexibility of confidentiality policy settings varies among these methods because of their different granularities of deletion. For example, it is possible to erase only sensitive files with software file erasure, while partition overwriting securely removes all files and metadata, regardless of their confidentiality requirements. All three methods are relatively easy to use: the user needs only to enter a command for the secure erasure to take place. However, the user still needs to initiate secure erasure explicitly. The level of performance can vary with software file erasure, since the user has to wait only for the chosen files (hopefully small) to be securely overwritten. The other two methods may incur a considerable wait time, depending on the size of the free space and storage partition.

4.2.2 File System
Two examples of data overwriting file systems are FoSgen and Purgefs, which are stackable file systems built in FiST. Purgefs can overwrite file data and metadata when deleting or truncating a file. Alternatively, to increase efficiency, purge deletion can be restricted through a special file attribute, so that only files carrying the attribute are purge-deleted. Purgefs overwrites data one or more times, and supports the NIST standards and all NISPOM overwrite patterns (Table 2).

FoSgen consists of two components: a file system extension and the user-mode shred tool. FoSgen intercepts files that require overwriting and moves them to a special directory. The shred tool, invoked either manually or periodically, eventually writes over the data in the special directory. The authors of FoSgen have also created patches to add secure deletion functionality to the ext3 file system. The first patch adds one-pass secure deletion functionality, and the second patch supports multiple overwrites and securely deletes a file's metadata. Both implementations work in all three of ext3's journaling modes, and erase either a specially marked file's data or all files.

Overwriting file systems can confidentially erase files and metadata using a variety of methods and passes. Users can specify the files, the number of passes, and the writing patterns for security policies. These file systems are easy to use because a user only needs to mount the file system with specific options. Unfortunately, depending on the file size, overwriting files may incur a heavy performance penalty.
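The two-component design described for FoSgen (intercept the deletion, move the file aside, overwrite it later) can be imitated in user space. The sketch below is only an analogy under stated assumptions: it is not FoSgen code, the names deferred_delete and purge are hypothetical, both directories are assumed to be on the same file system, and a single random pass stands in for whatever overwrite policy is configured.

```python
import os
import uuid


def deferred_delete(path, purge_dir):
    """Instead of unlinking, move the file into the purge directory."""
    os.makedirs(purge_dir, exist_ok=True)
    target = os.path.join(purge_dir, uuid.uuid4().hex)
    os.replace(path, target)  # same file system assumed, so data blocks stay put
    return target


def purge(purge_dir, passes=1):
    """Overwrite and remove every file waiting in the purge directory."""
    purged = 0
    for name in os.listdir(purge_dir):
        victim = os.path.join(purge_dir, name)
        size = os.path.getsize(victim)
        with open(victim, "r+b") as f:
            for _ in range(passes):
                f.seek(0)
                f.write(os.urandom(size))  # whole file in memory: fine for a sketch
                f.flush()
                os.fsync(f.fileno())
        os.remove(victim)
        purged += 1
    return purged
```

Deferring the overwrite keeps the delete call itself fast; the cost is that the data remains recoverable until purge runs.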

4.3 Encryption with Key Erasure
The third way to delete data securely is to encrypt the data and then securely erase the key. The encryption key is often securely deleted using overwriting methods. This combination allows for much faster secure deletion, in that only a small key is overwritten instead of the entire file (which could be very large). The downside is the extra encryption/decryption overhead on regular file operations until the file is deleted.

Not many specialized solutions exist. One solution is built on top of the versioning file system ext3cow and is based on the all-or-nothing (AON) transform. AON is defined as a cryptographic transform that, given only partial output, reveals nothing about its input. AON is leveraged in the secure versioning file system to make decryption impossible if one or more of the ciphertext blocks belonging to a file (or a file version) is deleted. We are not aware of any commonly used solution based on encryption with key erasure for general-use file systems.

The policy and performance characteristics of any encryption method with the addition of key erasure are inherited from the base encryption method. The confidentiality characteristic is also inherited from the base encryption method, with one caveat: the encrypted data may not stay deleted forever if the encryption method used to encrypt the data is ever broken. For example, this may occur if a weakness is found in the encryption method, or if exhaustive search of the key space becomes possible. Also, if the encryption key is protected by a password and the password is merely forgotten rather than destroyed, the strength of the secure deletion depends directly on the strength of the password. It is best to delete the encryption key(s) securely through physical destruction or overwriting methods. The ease-of-use characteristic is degraded in that the user must destroy the key explicitly.
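The cost asymmetry behind key erasure can be shown with a small sketch. It rests on loudly stated assumptions: the cipher is a toy keystream built from SHA-256 in counter fashion, used only as a standard-library stand-in for a real cipher such as AES, and the names keystream_xor and erase_key are hypothetical. The sketch also makes no attempt to track other copies of the key that the runtime may leave in memory.

```python
import hashlib
import os


def keystream_xor(key, data):
    """XOR data with a SHA-256 counter keystream (toy stand-in, NOT AES)."""
    out = bytearray(len(data))
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(bytes(key) + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out[offset:offset + len(chunk)] = bytes(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


def erase_key(key):
    """Overwrite the key buffer in place; only len(key) bytes are touched."""
    for i in range(len(key)):
        key[i] = 0


key = bytearray(os.urandom(32))              # key held in a mutable buffer
plaintext = b"confidential record " * 1000
ciphertext = keystream_xor(key, plaintext)

recovered = keystream_xor(key, ciphertext)   # decryption works while the key exists
erase_key(key)                               # wipe 32 bytes instead of the whole file
after_erasure = keystream_xor(key, ciphertext)
```

Before erasure the ciphertext decrypts correctly. Afterwards, applying the wiped key yields bytes unrelated to the plaintext, although the ciphertext itself was never overwritten.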

5. Other Challenges
5.1 Hard Disk Issues
Two hard-disk-specific issues must be considered in relation to confidential data deletion: bad sector forwarding and storage-persistent caches. Bad sectors are disk locations that cannot be accessed consistently; they develop during the normal use of a hard disk. Bad sector forwarding is performed transparently at the hardware level: the firmware identifies a bad sector and remaps it, through the hard-disk defects table (G-List), to a reserved area hidden from the user. In other words, the defective sector is replaced with a sector on a different part of the hard disk. The defective sector cannot be accessed again by the hard disk itself.

Figure 10 demonstrates bad sector forwarding. Data sector 2 has gone bad and has been detected by the hard disk firmware. The firmware remaps sector 2 to reserved area sector 0. From then on, whenever a read or write operation is performed on sector 2, the operation is mapped to reserved area sector 0. The problem with bad sector forwarding is that the bad sector might still be partially readable, with only a small number of error bytes. This is a risk if the bad sector contains a key or IV that could still be read using other forensic methods. SDS systems and type-safe disks can address this problem. Unfortunately, the ATA specification does not have a command to turn off bad sector forwarding, so vendor-specific ATA commands must be used.

In addition to bad sector forwarding, persistent caches have been placed in disk storage systems to improve performance. These caches may not only defer writing to the actual physical media, but may also aggregate multiple writes to the same location on the disk into a single write. In this case, the write cache of the disk must be disabled.
[Figure: Logical Layer, G-List, and Physical Layer (Data area, Reserved Area, Bad Sector)]

Fig 10. Demonstration of bad sector forwarding
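The effect of bad sector forwarding on an overwrite can be reproduced with a toy model. This is a sketch under simplifying assumptions, not a description of real firmware: the class ToyDisk and its methods are hypothetical, sectors are plain Python byte strings, and the old contents are assumed to be copied to the spare sector when remapping occurs.

```python
class ToyDisk:
    """Toy model of a disk whose G-List remaps bad sectors to spare sectors."""

    def __init__(self, data_sectors=4, reserved_sectors=3):
        self.data = [b""] * data_sectors          # physical data area
        self.reserved = [b""] * reserved_sectors  # hidden reserved area
        self.g_list = {}                          # logical sector -> reserved index

    def write(self, sector, payload):
        if sector in self.g_list:
            self.reserved[self.g_list[sector]] = payload
        else:
            self.data[sector] = payload

    def read(self, sector):
        if sector in self.g_list:
            return self.reserved[self.g_list[sector]]
        return self.data[sector]

    def forward_bad_sector(self, sector):
        """Remap a failing sector to the next free reserved sector."""
        spare = len(self.g_list)
        self.reserved[spare] = self.data[sector]  # assume contents were salvaged
        self.g_list[sector] = spare


disk = ToyDisk()
disk.write(2, b"KEY-MATERIAL")
disk.forward_bad_sector(2)     # sector 2 goes bad and is remapped to reserved 0
disk.write(2, b"\x00" * 12)    # overwrite issued through the normal interface
```

After the overwrite, disk.read(2) returns zeros, yet disk.data[2] still holds the original bytes: the write was redirected to the reserved area and never reached the defective physical sector.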

5.2 Data Lifetime Problem
The data lifetime problem refers to the phenomenon of various copies of sensitive data, such as passwords or encryption keys, being scattered all over a computer system during normal system operation. These locations include numerous buffers (such as string buffers, network buffers, or operating system input queues), core dumps of memory, virtual memory, swap, hibernation values, and unintended leakage through application logs or features. The attack model in this seminar assumes that any attacks to recover sensitive data are staged after the computer has been powered off, so volatile leakage of data through buffers, queues, and memory is beyond the scope of this survey.

However, the recent cold boot suite of attacks demonstrates that encryption keys and other data can be recovered from the DRAM used in most modern computers in between cold reboots. The authors suggest four ways to partially mitigate the attack: continuously discarding or obscuring encryption keys in memory; preventing any sort of memory-dumping software from being executed on the physical machine; physically protecting the DRAM chips; and making the rate of memory decay faster.

Hibernation files and swap are generally stored on the hard disk and may not go away once the system is powered down. Some block-based encryption methods may be used to encrypt swap partitions and hibernation files. Hardware encryption enclosures and extensions, as well as encrypted hard drives, can protect both swap and hibernation data, because the data is decrypted on load transparently to the operating system.

6. Conclusion
This seminar examined the methods, advantages, and limitations of confidential storage and deletion methods for electronic media in a non-distributed, single-user environment, with a dead forensic attack model. Confidential data-handling methods were compared using characteristics associated with confidentiality, policy, ease of use, and performance. Clearly, a combined solution that can store and remove confidential information should have the following ideal characteristics:
- High confidential storage and deletion granularity
- Acceptable performance overhead in terms of storage and deletion
- Enhanced security policy support to enable key revocation, encryption algorithm/mode of operation change and mitigation, and erasure technique
- Confidential storage and erasure of file and directory metadata
- Easy to use with minimal user awareness

