In previous articles, I discussed the Legal Hold, the Subpoena or CID, the custodian Identification process and Chain of Custody.
Now we need to discuss a very important part of the collection process. As a best practice, any data collection should be handled very carefully to avoid changing the data in any way. For that reason, the primary way to collect data is to use a team of forensic experts who will create “forensic images” or collect the data in a “forensically sound” way.
A forensic image is a bit-by-bit image of a particular hard drive, whether it belongs to a computer or a server. It basically means that the forensic team collects every tiny bit of data on the hard drive as it existed. The goal is to do a single collection and get everything we need for any future analysis.
It isn't always necessary to image an entire computer or server hard drive. Sometimes we will only need specific file directories that the custodian has pointed us to during their custodial interview. Still, the data is collected very carefully using forensic procedures.
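Whether a full image or a targeted set of directories is collected, the point is that the data must not change. One common way to check this, which the discussion above does not spell out and is offered here only as an illustrative assumption, is to compare a cryptographic hash of the source with a hash of the collected copy. The sketch below uses Python's standard hashlib module; the function names are hypothetical and this is not a description of how EnCase or FTK Imager work internally.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large evidence files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source: Path, copy: Path) -> bool:
    """True when the collected copy is byte-for-byte identical to the source."""
    return sha256_of_file(source) == sha256_of_file(copy)
```

If even a single byte differs between the two files, the digests will differ and copies_match returns False.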
The forensic team will use software like EnCase, FTK Imager or one of several other tools. The files we receive are called “evidence files” and they have file extensions such as *.ad1, *.e01, *.ex01, *.l01 or *.lx01.
Sometimes a client will try to save money or time by letting their IT department perform the data collection. If their IT contact is not familiar with forensic images for litigation matters, they may try to use software they already have on hand, like Ghost, to create a computer image. Be sure to interject and let everyone know that a Ghost image is not the same thing as a forensic image. Fortunately, more and more corporations have their own internal forensic teams to perform data collections.
When you receive the forensic data, be sure to also request a copy of the forensic team's tracking log. It will contain information about exactly what data was collected and from which custodians.
Interesting articles, Amy.
I thought I might add a few comments.
There is a key distinction between collecting data for the purposes of eDiscovery and collecting data for a digital forensic investigation of some form.
In digital forensics you are looking at metadata and other information relating to files that would not normally be visible to a user of the computer. You would also be looking at data that has been deleted, encrypted or perhaps intentionally hidden in some way. To do this, the process of taking a ‘forensic image’ was developed which, as you describe, involves taking every ‘bit’ of data from a storage device so nothing is missed. Unfortunately, 99% of the data you collect is of no use. In addition, with the massive increase in the size of hard disks it is becoming less practical to acquire a full ‘forensic image’ of disks, and the advent of relatively cheap Network Attached Storage (NAS) devices makes the situation even worse. For instance, two years ago I undertook an investigation on a device that had 27TB of data.
On a related point, the files that form the ‘forensic image’ are in a special format that requires specialist software to access. You mention *.E01 files and *.L01 files and their variants as “evidence files” but they are two different types of ‘container’. The E01 files consist of the entire bit-by-bit image of the source device, i.e. they contain deleted space, operating system files etc. The L01 files, on the other hand, are ‘logical’ containers whose contents have been selected from files that are visible to any user of the computer system but, and this is important, they do not contain a full image of the source device. Furthermore, an E01 file can be used to replicate the computer hard disk whereas an L01 is just a collection of files. The use of ‘logical containers’, i.e. a selection of the data, is becoming more common due to the increases in data volume I mentioned earlier and for some other technical reasons; for instance, you cannot easily ‘image’ a NAS device, let alone data held in the Cloud.
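To make the distinction concrete, here is a minimal sketch that sorts received evidence files into ‘physical’ and ‘logical’ containers by file extension. The mapping for E01/L01 and their Ex01/Lx01 variants follows the description above; treating *.ad1 as a logical container is my own addition based on how FTK Imager's AD1 format is generally described. The function name is hypothetical, and an extension is only a hint, so the forensic team's tracking log should remain the authority.

```python
from pathlib import Path

# Extension-to-kind mapping; a hint only, not proof of what a file contains.
CONTAINER_KINDS = {
    ".e01": "physical",   # bit-by-bit image of the source device
    ".ex01": "physical",  # newer variant of the E01 format
    ".l01": "logical",    # selected files only, not a full image
    ".lx01": "logical",   # newer variant of the L01 format
    ".ad1": "logical",    # assumption: AD1 treated as a logical container
}


def container_kind(filename: str) -> str:
    """Classify an evidence file as 'physical', 'logical' or 'unknown'."""
    return CONTAINER_KINDS.get(Path(filename).suffix.lower(), "unknown")
```

For example, container_kind("LAPTOP01.E01") returns "physical" and container_kind("shares.L01") returns "logical". Later segments of a split image (such as *.E02) are not handled here and come back as "unknown".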
In practice, the process of e-Discovery is not normally concerned with deleted data, the contents of special operating system files or the majority of non-document files, and therefore there is no need for a ‘forensic image’ as such. Instead, the process tends to involve searching for particular documents and files that contain particular information, often using ‘keywords’, then collecting just those responsive documents and placing them in a form where they can be reviewed. The expense and time involved in creating a ‘forensic image’ of all custodian machines would be prohibitive. That leads to a decision to selectively collect data from a limited number of custodians, which in turn leads to a potential challenge of incomplete discovery.
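The keyword-driven collection described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not how any commercial eDiscovery tool works: it only reads files as plain text (real tools extract text from Office documents, PDFs, email stores and so on), the function name is hypothetical, and it should be run against a working copy rather than the original media.

```python
from pathlib import Path


def find_responsive_files(root: Path, keywords: list[str]) -> list[Path]:
    """Return files under root whose text contains any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            # Assumption: treat every file as text and ignore undecodable bytes.
            text = path.read_text(encoding="utf-8", errors="ignore").lower()
        except OSError:
            continue  # unreadable file; a real tool would log this
        if any(keyword in text for keyword in lowered):
            hits.append(path)
    return hits
```

The returned list is the set of responsive files that would then be copied into a container for review; everything else stays behind, which is where the reduction in data volume comes from.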
The new approach to both eDiscovery and digital forensics is to use a tool that can search and collect data for both purposes. It can be run on an unlimited number of custodian machines at the same time and collect responsive files into a secure and encrypted container for eDiscovery purposes. It can also collect the majority of the operating system files and deleted files that would be used in a digital forensic investigation. This approach removes the limitations of data volume and also has the knock-on effect of reducing the amount of data that is fed into the review/analysis stages thereby reducing costs and the time it takes to deliver results.