Destroying Data ... is it possible

From SecuriWiki

Written by Justin Wells and Ronald Edmeades.
Files are volatile: for the average computer user it is trivially easy to make information 'vanish', as many a student can testify the day before a deadline. However, for those who need to guarantee that sensitive information is unrecoverable, data is physically persistent, extremely persistent. Digital copies of data are perfect and do not degrade over time or with repeated copying, and there is a multitude of storage platforms on which data can be stored in a variety of ways. Abstracted data is difficult to trace, and hence difficult to thoroughly destroy [1].
Some more concerns about new types of hard disks

What do we mean by 'delete' anyway?

In general terms, when we press the delete key on a file, click the delete option in a menu, or drag the file to the recycle bin and select 'empty', it is (for the average user) enough to render the information unrecoverable. However, with a populace that is gaining in computer literacy it is becoming common knowledge[2] that when an Operating System 'deletes' a file, it in fact removes the reference to it from the underlying file system index table (however it may be implemented), as opposed to taking the file out of the cabinet and shredding it. (Note: this is true of FAT and NTFS; "however ext2/3 fs will often overwrite deleted file data, unlike ntfs or fat" [3].) Thus a reasonably competent person can, with the right tools [4][5], recover physically present 'deleted' information with quite a high success rate[6], a worrying thought if one considers just how much sensitive personal and secret information can be present on any machine that you have ever used.
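The difference between removing an index entry and removing the data can be illustrated with a toy model. The following is a minimal sketch only: the ToyFileSystem class, its block size and the file name are hypothetical and do not mirror any real on-disk format.

```python
class ToyFileSystem:
    """A toy index-based file system (hypothetical, not a real on-disk format)."""

    def __init__(self, block_count, block_size=16):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * block_count  # the raw storage
        self.index = {}                                  # file name -> block numbers
        self.free = list(range(block_count))             # blocks available for writing

    def write(self, name, data):
        numbers = []
        for offset in range(0, len(data), self.block_size):
            number = self.free.pop(0)
            chunk = data[offset:offset + self.block_size]
            self.blocks[number] = chunk.ljust(self.block_size, b"\x00")
            numbers.append(number)
        self.index[name] = numbers

    def delete(self, name):
        # Only the index entry is removed; the blocks are merely marked free.
        self.free.extend(self.index.pop(name))

    def raw_dump(self):
        # What a recovery tool reading the raw device would see.
        return b"".join(self.blocks)


fs = ToyFileSystem(block_count=8)
fs.write("secret.txt", b"PIN code is 4921")
fs.delete("secret.txt")
print("secret.txt" in fs.index)              # False
print(b"PIN code is 4921" in fs.raw_dump())  # True
```

The file is gone as far as the index is concerned, yet a raw read of the storage still returns its contents until those blocks happen to be reused.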

A common misconception is that the 'format' command equates to 'erase' and thus is an all-encompassing destroyer of data. Whilst it does make the data 'vanish' according to the file system, it does not actively destroy or overwrite the data physically present on the disk.

"Even if you do a `low-level format' and write over the full drive with zeros, there's still a chance that an individual with the right (albeit expensive) equipment can recover data from the drive." (Bruce Schneier, 2005)[7]

Formatting a drive will replace the current boot sector of the device with a standard one, as well as create a new blank index for the rest of the disk, thus indicating that the data blocks are ready to be written to (the drive has been given a format specification to follow[8]). This simple oversight means that formatting may well help to preserve data, placing it out of sight of the user, and thus insulating it from modification.

"Once deleted, file content does not change until it is overwritten. On file systems with good clustering properties, deleted files can remain intact for years. Deleted file information is like a fossil. A skeleton may be missing a bone here or there, but the fossil does not change until it is destroyed" (Farmer & Venema, 2004)[9].

In this way machines that change roles during their production lifetime may well contain sensitive data from past lives for many years after the event if storage volume is large and system usage is relatively static (such as a webserver).

Data Remanence[10]

There is a plethora of storage devices on which sensitive data can be left, ranging from hard drives to USB memory keys, network drives, Internet caches and RAM chips. This is compounded by the fact that software applications and Operating Systems don't do exactly what you expect with your data: files are backed up in temporary storage, mirrored in caches and generally spread about the place like cheap butter.

Factors contributing to data remanence:

  • Applications create temporary copies of files
  • Clipboard tools store cut and copied data in a separate location from the file being worked on
  • Chat applications can log conversations locally or on the server
  • Personal Information Managers generally contain a wealth of sensitive information about a person, yet the user is often unaware of exactly where this data is stored
  • Media applications keep statistics and lists of media played, and how often
  • Deleted information can reside in memory until the PC is shut down; if hibernation mode is used, or there is a lack of RAM, this data may also be stored in virtual memory or hibernation files

(Deb Shinder, 2005)[11].

It is difficult to operate a PC and not leave behind some sort of data trail; however, 'recoverability' depends on the lengths to which an attacker is willing to go to recover your information.

How long does deleted data last?

The answer of course depends on multiple variables: the hardware being used, the file system, the level of usage on the system, the type of data deleted and much, much more. For instance, volatile memory ([12] RAM) is so called precisely because it loses its data when power is removed[13]. However, depending on your level of paranoia, it is eminently possible for the RAM chips to be physically recovered and bathed in liquid nitrogen[14], allowing their contents to be recovered (this is, however, not very plausible unless your name is Bond). It is also important to note that whilst the data that computers read is digital, the hardware storage implementation is very much analog, which opens attack vectors of which the majority of IT professionals are unaware.

"Memory Chips have undocumented diagnostic modes that allow access to values smaller than a bit. With modified electronic circuitry, signals from disk read heads can reveal older data as modulations on the analog signal" [15].

However, far more common is the presence of data in old blocks which have yet to be overwritten, or have been only partially overwritten. This is where (as mentioned above) a file system removes the index of a file from its allocation table, but does nothing with the actual data. For example, in the FAT[16] filesystem files are written to clusters. Clusters are fixed-size groups of 512-byte sectors (commonly 4 KB to 32 KB, depending on the volume), and a file much smaller than a cluster still occupies that entire block of space, with no other files being permitted to reside there. Thus if a file fills a cluster (or several contiguous clusters) and is deleted, a smaller file can be written to the newly freed cluster. The cluster is now marked as allocated, yet the original data by and large still physically exists, and the smaller file acts like an unwitting marker, signalling 'don't write data here' (see Detailed FAT Slack Space explanation). A further complication is that some file allocation strategies will not write to the first available block of free space, but to the next available block of free space (essentially always attempting to write to 'virgin' storage first). This means that deleted files could exist physically in storage for a very long time, particularly given the current size of hard drives, which means there is always `virgin' space to write to.
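The slack space arithmetic above can be sketched in a few lines. This is an illustration only: the 4096-byte cluster size and the file sizes are assumed values, and the model ignores any zero-padding a real driver may apply to the final sector.

```python
CLUSTER_SIZE = 4096  # assumed cluster size in bytes, for illustration only


def slack_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes left unused in the last cluster allocated to a file."""
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder


def overwrite_cluster(cluster, new_data):
    """Write new_data at the start of a cluster, leaving the tail untouched."""
    if len(new_data) > len(cluster):
        raise ValueError("new data does not fit in one cluster")
    result = bytearray(cluster)
    result[:len(new_data)] = new_data
    return bytes(result)


old_file = b"S" * CLUSTER_SIZE   # deleted file that filled the whole cluster
new_file = b"n" * 3000           # smaller file that reuses the cluster
cluster_now = overwrite_cluster(old_file, new_file)

print(slack_bytes(len(new_file)))  # 1096
print(cluster_now.count(b"S"))     # 1096
```

In this model the last 1096 bytes of the deleted file survive inside a cluster that the file system now reports as allocated to the new file.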

Measuring Persistence of deleted data

A 20-week study performed by Dan Farmer and Wietse Venema is available here: Forensic Discovery Section 7.3 and Forensic Analysis Presentation.
Farmer and Venema took an average set of servers and created hash signatures of each data block each day, analysing the number of signatures that changed over a period of months.
At the end of the analysis period the results showed that on an 'average' file server with moderate usage, deleted information had a survival half-life of approximately 35 days.
Farmer and Venema also explore volatile memory, particularly caches and buffers, and in a series of experiments discover that memory analysis can yield a great deal of information about system behaviour[17].
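The measurement approach described above can be sketched roughly as follows. This is not the authors' tooling: the function names, the 4096-byte block size and the choice of SHA-256 are assumptions, and surviving_fraction simply interprets the reported 35-day half-life as exponential decay.

```python
import hashlib


def block_signatures(image, block_size=4096):
    """Return one SHA-256 digest per fixed-size block of a raw image (bytes)."""
    return [hashlib.sha256(image[i:i + block_size]).hexdigest()
            for i in range(0, len(image), block_size)]


def unchanged_fraction(before, after):
    """Fraction of block positions whose signature is the same in both snapshots."""
    if not before:
        return 0.0
    same = sum(1 for old, new in zip(before, after) if old == new)
    return same / len(before)


def surviving_fraction(days, half_life_days=35.0):
    """Expected surviving fraction, assuming simple exponential decay."""
    return 0.5 ** (days / half_life_days)


# Two hypothetical daily snapshots of a four-block device; one block changes.
day0 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096 + b"D" * 4096
day1 = b"A" * 4096 + b"X" * 4096 + b"C" * 4096 + b"D" * 4096

print(unchanged_fraction(block_signatures(day0), block_signatures(day1)))  # 0.75
print(surviving_fraction(35))  # 0.5
print(surviving_fraction(70))  # 0.25
```

Under the exponential assumption, a quarter of the deleted data would still be present after ten weeks on such a server.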

Achieving Absolute Data Deletion

According to Bruce Schneier and Mark Hinge (Whitedust, Absolute Data Deletion), there are three types of people to protect data from:

  • Casual Snoopers
  • Experienced Hackers
  • Dedicated Experts

It is to these `levels' that we should work when rendering sensitive information unrecoverable. For the average computer user aiming to prevent casual snoopers from recovering information, it is enough to ensure that files are rendered inaccessible through the Operating System, for example by emptying the recycle bin and flushing the Internet cache. When you are throwing out an old hard disk, or need company or other sensitive information to be inaccessible to experienced hackers, it is generally effective to overwrite the data area repeatedly with random data. Peter Gutmann states in his 1996 paper, Secure Deletion of Data from Magnetic and Solid-State Memory:

"Even with a DC (direct current), traces of the previously recorded signal may persist until the applied DC field is several times the media coercivity"

To the layman this means that magnetic information on physical media is absurdly difficult to remove, so for those who have military secrets or Metallica MP3s on their system, the available solutions range from sensible to extreme. The sensible solution is to write a pattern over the data area, then its complement, and then many (7 - 35) passes of random data, as per US DoD Green Book recommendations and Gutmann. This is in stark contrast to actions governed by absolute paranoia, which would likely result in the storage media being taken outside and placed under thermite and/or in an acid bath until the device is reduced to its component (atomic) parts [18].
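The pattern, complement and random-pass sequence can be sketched as below. This is a minimal illustration, not a vetted shredder: the overwrite_file name, the 0x55/0xAA pattern pair and the default of 7 random passes are assumptions, and the sketch only helps where the file system and drive really do overwrite data in place.

```python
import os
import tempfile


def overwrite_file(path, random_passes=7, chunk_size=1 << 20):
    """Overwrite a file in place: a pattern, its complement, then random passes."""
    size = os.path.getsize(path)
    patterns = [b"\x55", b"\xaa"]  # 01010101 and its bitwise complement 10101010
    with open(path, "r+b") as handle:
        for pass_number in range(len(patterns) + random_passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                count = min(chunk_size, remaining)
                if pass_number < len(patterns):
                    chunk = patterns[pass_number] * count
                else:
                    chunk = os.urandom(count)
                handle.write(chunk)
                remaining -= count
            # Push each pass towards the device rather than leaving it in a cache.
            handle.flush()
            os.fsync(handle.fileno())


# Demonstration on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "secret.bin")
    with open(target, "wb") as handle:
        handle.write(b"top secret " * 100)
    overwrite_file(target)
    print(os.path.getsize(target))  # 1100
```

The file keeps its original length, so the same clusters are targeted on each pass; the file name and any copies held elsewhere are not touched by this sketch.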

Software Shredders

There are hundreds of software shredders available for download. Whilst most are effective in preventing and deterring casual snoopers, others are built on flawed assumptions about the operation of file systems and storage media, potentially leaving sensitive data still accessible to an experienced hacker and giving the user a false sense of security.
'Shredders' typically write random data repeatedly over the location of a file, with 'wipers' writing data into the slack space of the entire storage medium.

Freeware tools [19]

The numerous issues[20] with software 'shredders' and 'wipers' result primarily from a lack of understanding of the target architecture[21]. Examples include improper overwriting of slack space and, as Peter Gutmann points out, ignoring hard disks' use of RLL (Run Length Limited) encoding, which prevents adjacent ones being written so that the drive does not lose track of the position of its data. Software authors may therefore wrongly assume that the data they write to the disk is physically recorded exactly as it appears[22].

Be aware that defragmentation of the drive means that copies of the file being 'shredded' may well have existed elsewhere, with no record of their previous location remaining. Additionally, encrypting a file can leave plaintext copies of the data behind. To ensure that a file no longer exists it is necessary to search the storage media for it, and there is no guarantee that partial fragments of the file will be detected during the search.
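A search of this kind can be sketched as a scan of a raw image for pieces of the sensitive content. The find_fragments function, the 16-byte fragment size and the sample data are hypothetical; a real search would read the device itself and would need to cope with encodings and compression.

```python
def find_fragments(image, secret, fragment_size=16):
    """Return sorted (image_offset, secret_offset) pairs for each fragment found.

    The secret is cut into consecutive fragment_size-byte pieces (a short tail
    is ignored) and every occurrence of each piece in the image is reported.
    """
    hits = []
    for start in range(0, len(secret) - fragment_size + 1, fragment_size):
        fragment = secret[start:start + fragment_size]
        position = image.find(fragment)
        while position != -1:
            hits.append((position, start))
            position = image.find(fragment, position + 1)
    return sorted(hits)


secret = b"0123456789abcdef" + b"GHIJKLMNOPQRSTUV"
# A hypothetical image holding the two halves of the secret out of order.
image = (b"\x00" * 100 + secret[16:] + b"\xff" * 50 +
         secret[:16] + b"\x00" * 10)

print(find_fragments(image, secret))  # [(100, 16), (166, 0)]
```

A leftover piece that does not line up with the chosen fragment boundaries is missed by this sketch, which is one concrete way in which partial fragments can escape detection.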

ATA Protected Storage Areas and Host Protected Areas are not typically accessible from the BIOS or Operating System. Indeed, hardware utilities may not even access these areas [23].

Also be aware that an application may 'grab' space (malloc) which it intends to write to, but has not yet. This area is marked as allocated and will consequently be ignored by any file wiping utilities; it is therefore necessary to ensure that wiping or shredding is performed from a separate OS (such as Knoppix or Helix booted from a CD). Bad (damaged) clusters are marked as such and are hence also ignored, though they may still contain information that could be recovered via in-depth analysis.

Journalling file systems (especially NTFS) are particularly vulnerable to 'spreading' data.

"Alternate data streams cannot be removed, unless the parent file or directory is destroyed. Unfortunately most file wiping utilities only deal with the primary data stream and do not wipe the alternate data streams, thus leaving data intact" (Kurt Seifried)

Currently some Windows-based file deletion applications, namely PGP, fail to completely destroy alternate data streams [24].

Manual Deletion

Magnetic media is analogue and leaves a 'residue' of previous polarisation, for example writing a logical 1 over a 0 leaves a magnetic trace of the previous 0, or a dip in the field. Similarly, writing a logical 1 over a previous 1 causes a minor boost in the field. These minor boosts and dips in the field are within the tolerance of magnetic head readers, but more sensitive equipment can infer previous data from the minor changes which this brings about[25].
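The residue effect can be illustrated with a toy numerical model. The 0.05 residue value and the symmetric treatment of written zeros are assumptions made purely for illustration; real signals are far noisier and cannot be decoded this simply.

```python
RESIDUE = 0.05  # assumed size of the boost or dip left by the previous bit


def written_level(previous_bit, new_bit, residue=RESIDUE):
    """Analog level after writing new_bit over previous_bit (toy model)."""
    # A previous 1 nudges the level up slightly, a previous 0 nudges it down.
    return new_bit + residue * (2 * previous_bit - 1)


def normal_read(level):
    """What an ordinary drive head reports: the nearest logical value."""
    return 1 if level >= 0.5 else 0


def infer_previous(level):
    """What more sensitive equipment could infer from the small deviation."""
    return 1 if level > normal_read(level) else 0


old_bits = [1, 0, 1, 1, 0, 0, 1, 0]
new_bits = [0, 0, 1, 0, 1, 1, 1, 0]
levels = [written_level(old, new) for old, new in zip(old_bits, new_bits)]

print([normal_read(level) for level in levels] == new_bits)     # True
print([infer_previous(level) for level in levels] == old_bits)  # True
```

In this model the drive reads back only the new data, while the sign of each small deviation gives away the bit that was there before.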

Shop-bought magnets are not a foolproof way of degaussing a hard drive (in fact they are most ineffective). It is necessary to use a very powerful low-frequency (for deep penetration) degausser (http://www.datadev.com/v94.html); however, due to the density of current hard disk media, even this provides no guarantee of total erasure.

This leaves the final solution, which is to physically destroy the storage device[26]. Yet smashing it to pieces does not prevent a patient person from rebuilding the disk, as Joseph Snodgrass found out.

The only foolproof methods that remain are scratching, acid or fire. Scratching the magnetic surface of a disk requires removal of its entire surface with a very fine-grained sander, in order to stop data escaping 'between the gaps'. Acid baths have been found to be very effective in destroying the upper magnetic layers of a disk, but require special handling. The recommended temperature for destroying data with fire is 660 degrees Celsius (the Curie temperature, at which magnetism is lost, not to mention physical structure! [27]).