"bulk_extractor is a program that extracts features such as email addresses, credit card numbers, URLs, and other types of information from digital evidence media. It is a useful forensic investigation tool for many tasks such as malware and intrusion investigations, identity investigations and cyber investigations, as well as analyzing imagery and password cracking. bulk_extractor operates on disk images, files or a directory of files and extracts useful information without parsing the file system or file system structures. The input is split into pages and processed by one or more scanners. The results are stored in feature files that can be easily inspected, parsed, or processed with other automated tools. bulk_extractor also creates histograms of features that it finds. This is useful because features such as email addresses and internet search terms that are more common tend to be important."

The program provides several unusual capabilities, including:
- It finds email addresses, URLs and credit card numbers that other tools miss because it can process compressed data (like ZIP, PDF and GZIP files) and incomplete or partially corrupted data. It can carve JPEGs, office documents and other kinds of files out of fragments of compressed data. It will detect and carve encrypted RAR files.
- It builds word lists based on all of the words found within the data, even those in compressed files that are in unallocated space. Those word lists can be useful for password cracking.
- It is multi-threaded; running bulk_extractor on a computer with twice the number of cores typically makes it complete a run in half the time.
- It creates histograms showing the most common email addresses, URLs, domains, search terms and other kinds of information on the drive.
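
To make the histogram idea concrete, here is a minimal sketch of how feature counts could be tallied from a feature file. It is not bulk_extractor's own code. It assumes each feature line is tab-separated with the offset first and the feature second, and that comment lines start with `#`. The function name `feature_histogram`, the sample lines, and the choice to lowercase features before counting are all illustrative assumptions.

```python
from collections import Counter


def feature_histogram(lines):
    """Count how often each feature appears, most common first.

    Assumes tab-separated lines of the form: offset, feature, context.
    Lines starting with '#' are treated as comments and skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a feature line in the assumed layout
        # Lowercasing is an illustrative choice so that case variants
        # of the same address are counted together.
        counts[fields[1].lower()] += 1
    return counts.most_common()


# Hypothetical feature-file excerpt, made up for this example.
sample = [
    "# hypothetical feature-file excerpt",
    "1024\talice@example.com\tcontext",
    "2048\tbob@example.com\tcontext",
    "4096\tAlice@example.com\tcontext",
]

for feature, count in feature_histogram(sample):
    print(count, feature)
```

The same function could be fed an open file handle instead of a list, since it only iterates over lines.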
Via Top 20 Free Digital Forensic Investigation Tools for SysAdmins, which offers this handy tip: "You will also see a decimal value in the first column of the text file that, when converted to hex, can be used as the pointer on disk where the entry was found (i.e. if you were analyzing the disk manually using a hex editor for example, you would jump to this hexadecimal value to view the data)."
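
The tip above can be sketched in a few lines of Python. This is an illustration, not part of bulk_extractor. It assumes the first tab-separated column holds the decimal offset, and that for data found inside compressed regions the column may look like `512-GZIP-30`, in which case only the leading number is a position on the disk. The function name `offset_to_hex` and the sample lines are hypothetical.

```python
def offset_to_hex(feature_line):
    """Return the disk offset of a feature line as a hex string.

    Assumes the first tab-separated column is a decimal offset,
    optionally followed by '-SCANNER-offset' parts for nested data.
    """
    offset_field = feature_line.split("\t", 1)[0]
    # Keep only the leading decimal number; anything after the first
    # '-' is assumed to describe a position inside decoded data.
    disk_offset = int(offset_field.split("-", 1)[0])
    return hex(disk_offset)


# Hypothetical feature lines, made up for this example.
print(offset_to_hex("1048576\tuser@example.com\tcontext"))
print(offset_to_hex("512-GZIP-30\tuser@example.com\tcontext"))
```

The resulting value is what you would type into a hex editor's "go to offset" dialog when inspecting the image by hand.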
/misc | Feb 29, 2016