Disk imaging is the process of making a bit-by-bit copy of a disk. More generally, imaging can apply to anything that can be treated as a bit-stream, e.g. physical or logical volumes, network streams, etc.
The most straightforward disk imaging method is reading a disk from start to end and writing the data to a forensic image format. This can be a time-consuming process, especially for disks with a large capacity.
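The start-to-end method can be sketched as follows. This is a minimal illustration, not how any particular tool works: it writes a plain raw copy rather than a real forensic image format, and the function name `image_stream` and the block size are assumptions made for the example. A hash is computed during the copy so the result can be verified later.

```python
import hashlib
import io


def image_stream(source, target, block_size=1024 * 1024):
    """Copy source to target block by block.

    Returns (bytes_copied, sha256_hex) so the copy can be verified.
    """
    digest = hashlib.sha256()
    copied = 0
    while True:
        block = source.read(block_size)
        if not block:  # end of the source reached
            break
        target.write(block)
        digest.update(block)
        copied += len(block)
    return copied, digest.hexdigest()


# Demo on an in-memory "disk"; a real source would be a device or file
# opened in binary mode.
demo_source = io.BytesIO(b"\x00" * 4096 + b"evidence" * 512)
demo_target = io.BytesIO()
demo_copied, demo_sha256 = image_stream(demo_source, demo_target, block_size=4096)
```

The same function works on any object with `read` and `write` methods, which matches the idea that imaging applies to anything that can be treated as a bit-stream.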
A common technique to reduce the size of an image file is to compress the data. On modern multi-core computers, the compression can be done in parallel, reducing the output size without prolonging the imaging process. Because the write speed of the target disk can be a bottleneck, parallel compression can even reduce the total imaging time, since less data has to be written. Guymager was one of the first imaging tools to implement the concept of multi-process compression for the EnCase image file format. This technique is now used by various imaging tools, including Tableau Imager (TIM).
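The idea of compressing chunks in parallel can be sketched as below. This is an illustration only and does not reflect how Guymager or any other tool implements it: the chunk size, worker count and function names are assumptions, and the output is a plain list of compressed chunks rather than an EnCase image. Threads are used because CPython's `zlib` releases the global interpreter lock while compressing.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data, chunk_size):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_parallel(data, chunk_size=32 * 1024, workers=4):
    """Compress each chunk independently on a pool of worker threads."""
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so chunk order is preserved.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(compressed_chunks):
    """Reverse compress_parallel by decompressing and joining the chunks."""
    return b"".join(zlib.decompress(chunk) for chunk in compressed_chunks)
```

Compressing chunks independently also keeps random access possible: a reader only has to decompress the chunk that contains the requested offset.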
Other techniques, such as sparse storage or empty-block compression, can reduce both the total imaging time and the resulting image size for new, non-encrypted (zero-filled) disks.
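Empty-block handling can be illustrated with a small sketch that stores only the blocks containing non-zero data. The in-memory dictionary stands in for an image format that can mark blocks as empty; the function names and block size are assumptions for the example.

```python
import io


def image_skipping_empty(source, block_size=4096):
    """Read source block by block, keeping only blocks with non-zero bytes.

    Returns (stored, total_size), where stored maps block index -> data.
    """
    stored = {}
    index = 0
    total_size = 0
    while True:
        block = source.read(block_size)
        if not block:
            break
        if block.count(0) != len(block):  # at least one non-zero byte
            stored[index] = block
        total_size += len(block)
        index += 1
    return stored, total_size


def rebuild(stored, total_size, block_size=4096):
    """Recreate the original data; blocks that were not stored are zero-filled."""
    out = bytearray(total_size)  # bytearray(n) starts out as n zero bytes
    for index, block in stored.items():
        start = index * block_size
        out[start:start + len(block)] = block
    return bytes(out)
```

On a mostly empty disk only a handful of blocks end up in `stored`, so both the amount of data written and the image size shrink accordingly.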
Error tolerance and recovery
Smart imaging is a combination of techniques to make the imaging process more intelligent.
- Selective imaging
- Decryption while imaging
Deduplication is the process of identifying data that occurs more than once on disk and storing it only once in the image. It is even possible to store the data once for a whole corpus of images using techniques like hash-based imaging.
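A hash-based approach to deduplication might look like the sketch below. It is only an illustration: the fixed block size, the choice of SHA-256 and the function names are assumptions, and a real implementation would keep the block store on disk. Passing the same `store` for several images gives the corpus-wide effect described above.

```python
import hashlib


def deduplicate(data, store, block_size=4096):
    """Add the blocks of data to store, keeping each unique block only once.

    store maps block digest -> block data and may be shared between images.
    Returns the layout: the ordered list of digests describing this image.
    """
    layout = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # only stored if not seen before
        layout.append(digest)
    return layout


def restore(store, layout):
    """Rebuild the original data from the shared store and an image layout."""
    return b"".join(store[digest] for digest in layout)
```

Each image is then represented by its layout alone, while the block data lives in the shared store.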
Selective imaging is a technique that copies only certain information from a disk, such as the $MFT on an NTFS volume, together with the necessary contextual information.
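As a rough sketch of the copying step, the function below reads only a given list of byte ranges and records each range's offset as contextual information. Locating the ranges (for example, finding where the $MFT is stored on an NTFS volume) is not shown; the `extents` argument and the record layout are hypothetical and exist only for this example.

```python
import io


def image_extents(source, extents):
    """Copy only the given (offset, length) ranges from a seekable source.

    Each record keeps the original offset so the data can later be
    placed back in context.
    """
    records = []
    for offset, length in extents:
        source.seek(offset)
        data = source.read(length)
        records.append({"offset": offset, "length": len(data), "data": data})
    return records
```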
Decryption while imaging
Encrypted data is a worst-case scenario for compression. Because the encryption process should be deterministic, one way to reduce the size of an encrypted image is to store the data decrypted and compressed, and to re-encrypt it on the fly if required. This should rarely be necessary, since the non-encrypted data is what undergoes analysis.
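The effect on compression can be demonstrated with a small experiment. Random bytes from `os.urandom` are used here as a stand-in for encrypted data, on the assumption that the output of a good cipher is practically indistinguishable from random data; no actual encryption is performed, and the helper name `compression_ratio` is made up for the example.

```python
import os
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# A zero-filled block, as found on an empty non-encrypted disk.
plain_ratio = compression_ratio(bytes(64 * 1024))

# Random bytes as a stand-in for the same block after encryption.
random_ratio = compression_ratio(os.urandom(64 * 1024))
```

The zero-filled block shrinks to a tiny fraction of its size, while the random block does not shrink at all and typically grows slightly because of the compression format's overhead.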