Hash Set Analysis using Linux Ubuntu 12.04

Jan 11, 2013




Recently, while examining an image of a computer, I needed to determine whether the image contained a set of specific files.  For me those specific files were a series of pictures.  I had two options: manually go through all of the folders looking for the pictures, or search for them in some automated way.  I chose the automated way.

Whether it is a picture, a video, or even a piece of malware, if you are able to mount your suspect image in Linux, you can hash all of the files in the image and then search for any specific hash.  The only requirement is that you must have the hash of the file that you are searching for.  This process becomes even more useful when you have a large set of hashes that you need to compare against your image.

In this article we are going to mount a volume contained in an image and hash all of the files in the volume.  We will then search for specific hashes to see if they are contained in the volume.  We will be doing it using Linux Ubuntu 12.04.  You can use a Live CD of Ubuntu to accomplish this task, but for the purposes of this article, I used an examination computer with Ubuntu 12.04 installed on it.

The plan:

The plan is to mount a volume, hash all of the files, and then use grep to search for specific hashes in the resulting file containing the hashes of our volume.

Installing the tools:

The one tool that we will need that is not included in Ubuntu by default is mmls.  Mmls is part of the set of command line tools from the sleuthkit.  The sleuthkit can be installed from the Ubuntu Software Center.  So let’s head over to the Ubuntu Software Center.

Click on the Dash Home circle, located on the top left of your screen, type in “software” and click on the Ubuntu Software Center icon that will appear.

After the Ubuntu Software Center opens, you will see a search box on the top-right corner of your screen.  Type “sleuthkit” and click on the install button.  You will be prompted to authenticate.  Enter your password and wait for the program to install.

Now that we have the program that we need, close the Ubuntu Software Center.
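As a quick check from a terminal, the sketch below tests whether the shell can find mmls after the install. The variable name mmls_status is only illustrative.

```shell
# command -v prints the path of a program if the shell can find it,
# and returns a non-zero status if it cannot.
if command -v mmls >/dev/null 2>&1; then
    mmls_status="installed"
else
    mmls_status="missing"
fi
echo "mmls is $mmls_status"
```

If it reports "missing", running "sudo apt-get install sleuthkit" should be the terminal equivalent of the Software Center install, assuming the package carries the same name there.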

The test:

To recreate the test I will be using a DD image of a 256MB flash drive.  I named the image hashset_test.dd.  Prior to being imaged, the drive was populated with 4 random files and one png picture titled “evidence.png”.  This will be the file we will attempt to find in the image.  This is what the picture looks like.

Now let’s prepare a working folder for our files.  Go to your desktop, right click on your desktop and select “create new folder”, name it “Test”.

Find a DD image or make an image of a small drive containing at least one file that you wish to find.  Copy the image into the “Test” folder on the desktop.  Also copy the file that you wish to find.  We will be hashing that file shortly.

Let’s get started.  Open a Terminal Window.  In Ubuntu you can accomplish this by pressing Ctrl-Alt-T at the same time or by going to the Dash Home and typing in “terminal”.

Once the terminal window is open, we need to navigate to the previously created Test folder on the desktop.  We will use the cd command to change directory into that folder.  Type the following into the terminal.

cd /home/carlos/Desktop/Test/

Replace “carlos” with the name of the user account you are currently logged on as.  After doing so, press enter.

The prompt now ends with “Test$”, which indicates that “Test” is your current directory, exactly what we wanted.  Let’s see if we have the DD image and the evidence.png picture in our current directory.  For that we will use the ls command, which stands for list (files).  Type “ls -l” and press enter.  The -l flag uses a long listing format.

Notice that we are in the Test directory and yes, we do have the DD image and the evidence.png file in our current directory.

The first thing that we have to do is hash our evidence.png file.  Type the below command into the terminal, and press enter.

md5sum evidence.png > targetfile.md5.txt

Md5sum is the command to hash the evidence.png file using the md5 algorithm.  The “>” is a terminal character that redirects standard output to a file.  In this instance, we will use that character to redirect md5sum’s output to a file named targetfile.md5.txt in our current directory, which is the Test folder.

If you got your cursor back without errors, then the command that you entered was carried out as ordered.  You can now open the txt file with a text viewer.  The md5 of my evidence.png file is 147a0a801acb75152296bd2940d3a170.  From now on I will refer to the evidence.png file as our target file.
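If you would rather not open the text file to copy the hash by hand, the sketch below pulls it out with cut. It runs against a throwaway stand-in file in a temporary directory, so the file contents and the variable names (workdir, target_hash) are assumptions for illustration and the hash it prints is not the one from my case.

```shell
# Stand-in for the Test folder and the target file; both are placeholders.
workdir=$(mktemp -d)
printf 'sample picture data\n' > "$workdir/evidence.png"

# Hash the file and save the result, the same way as in the step above.
(cd "$workdir" && md5sum evidence.png > targetfile.md5.txt)

# md5sum writes "<hash>  <filename>"; keep only the first field.
target_hash=$(cut -d ' ' -f 1 "$workdir/targetfile.md5.txt")
echo "$target_hash"
```

With the hash in a variable, later commands can use "$target_hash" instead of a pasted 32 character string.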

Before we move on to the next step of mounting the volume inside of the image, we need to determine the starting sector of the volume inside of the image.  To do that, we will use the sleuthkit tool mmls.  Mmls is a tool that can display the partition layout of a volume system (partition tables).  Type the following into the terminal and press enter.

mmls hashset_test.dd

Replace “hashset_test.dd” with the name of your image.  These are my results.

Notice that the volume inside of my DD starts at sector 2048.  Mount expects the offset in bytes, so you must take the starting sector, in this instance 2048, and multiply it by the sector size of 512 bytes: 2048 * 512 = 1048576.  We now have the information that we need to mount the volume inside of the image.  But before we do that, we need to designate a location where we can temporarily mount the volume.  To do that, we need to create a mount directory.  To keep things simple, let’s create a directory called dd inside of the /mnt folder.  Type the below command into the terminal and press enter.  Type your password if sudo asks for it.

sudo mkdir /mnt/dd

Again, if you got your cursor back, then everything went well.  The dd directory was created at /mnt/dd and your current directory is still /home/carlos/Desktop/Test.
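The byte offset worked out earlier can also be left to the shell. This is a minimal sketch, assuming the start sector is copied from the mmls output as a zero-padded number (the raw_start value below is my sector 2048 written that way) and that the sector size is 512 bytes. The variable names are illustrative.

```shell
# Start sector as it might be copied from the mmls output (assumed sample).
raw_start="0000002048"

# Strip leading zeros; otherwise shell arithmetic would read the number as octal.
start_sector=$(printf '%s\n' "$raw_start" | sed 's/^0*//')

sector_size=512
offset=$((start_sector * sector_size))
echo "$offset"
```

For my image this prints 1048576, the same value as the manual multiplication.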

We now get to mount the volume inside of the DD.  Mount the volume with the below command.

sudo mount -o ro,offset=1048576 hashset_test.dd /mnt/dd/

Mount is the command to mount a filesystem.  Sudo runs mount with superuser privileges.  The -o flag specifies the options for mounting.  In this instance we opted to mount the volume “ro”, as a read-only file system, and we also told mount to start at byte offset 1048576, which is the beginning of the volume.  The options following the -o flag must be separated only by commas, with no spaces.  Press enter, and type your password if sudo asks for it.
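Because a wrong offset or image name makes the mount fail, it can help to assemble the command from variables and read it back before running it. The sketch below is only a dry run: it prints the command text and mounts nothing. The variable names are illustrative, and the values are the ones from this walkthrough.

```shell
# Values taken from the walkthrough; adjust all three for your own case.
image="hashset_test.dd"
mount_point="/mnt/dd/"
offset=1048576

# Build the command as text and print it; nothing is mounted here.
mount_cmd="sudo mount -o ro,offset=$offset $image $mount_point"
echo "$mount_cmd"
```

When the examination is finished, change out of the mount directory and run "sudo umount /mnt/dd" to release the volume.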

Now navigate to the dd directory.  We will again use the cd command to change directory into it.  Type the following into the terminal and press enter.

cd /mnt/dd/

I got these results.

Type “ls -l” and press enter.  The -l flag uses a long listing format.

Notice that the volume is now mounted.  Visible are all of the files and directories on the root of the volume.  You can see that my volume contains our target file, two random files, and one folder titled “random_more” containing two more random files.

It is now time to hash all of the files inside of our volume.   Type the below command into the terminal and press enter.

find . -type f -exec md5sum "{}" ";" > /home/carlos/Desktop/Test/hashedfiles.md5.txt

Find is the command to search for files in a directory hierarchy.  The “.” tells find to search recursively from the current directory.  “-type f” tells find to match only regular files.  “-exec md5sum” runs md5sum once for each regular file found, with the braces ({}) replaced by the path of that file.  The semicolon (;) marks the end of the -exec command.  The “>” is the aforementioned terminal character that will redirect standard output to a file.  In this instance, we will use that character to redirect the command’s output to a file named hashedfiles.md5.txt in the /home/carlos/Desktop/Test/ directory.  The braces and the semicolon are enclosed in plain straight quote marks to protect them from the shell.

When you get your cursor back, then all files in the volume have been hashed.  Now open the txt file and see the hashes of all of the files.  These are my results.
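On a large volume, starting md5sum once per file can be slow. As a hedged variation on the command above, find also accepts “{} +” in place of “{} ;”, which hands many file names to each md5sum run. The sketch below tries it on a small throwaway directory tree standing in for the mounted volume; the file names and the variables volume, outfile and hashed_count are made up for the example.

```shell
# Placeholder directory tree standing in for the mounted volume.
volume=$(mktemp -d)
mkdir "$volume/random_more"
printf 'one\n' > "$volume/file1.txt"
printf 'two\n' > "$volume/file with spaces.txt"
printf 'three\n' > "$volume/random_more/file3.txt"

# "{} +" passes many file names to each md5sum run instead of one per run.
outfile=$(mktemp)
(cd "$volume" && find . -type f -exec md5sum {} + > "$outfile")

# One output line per hashed file.
hashed_count=$(wc -l < "$outfile" | tr -d ' ')
echo "hashed $hashed_count files"
```

The output file has the same "hash, two spaces, path" layout as before, so the grep steps that follow work unchanged.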

Notice that our target file was seen by find and was hashed.  It is now time to use grep to see if we can find the hash of our target file in the hashedfiles.md5.txt file.  Cd back into the Test directory and type the following into the terminal, followed by enter.

grep -i 147a0a801acb75152296bd2940d3a170 hashedfiles.md5.txt

Grep is the command to print lines matching a pattern.  The flag -i makes the search case insensitive.  This means that grep will find the hash inside of the file regardless of whether the letters in your hash are upper or lower case.  The 32 character alphanumeric string “147a0a801acb75152296bd2940d3a170” is the md5 hash of the target file.  Hashedfiles.md5.txt is the file containing all of the hashes in the volume.  Below are my results.

Grep found the target file’s hash in the hashedfiles.md5.txt, which means that our suspect volume contains the target file.
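If the check is going into a script rather than being read off the screen, grep's exit status can be used directly. This is a minimal sketch that runs against a one-line sample file written in md5sum's output format. The temporary file and the variable names are assumptions for illustration; the hash is the one from my target file, typed in upper case on purpose to show the effect of -i.

```shell
# Sample hash list in md5sum's output format; the path is illustrative.
hashes=$(mktemp)
printf '%s  %s\n' 147a0a801acb75152296bd2940d3a170 ./evidence.png > "$hashes"

# Upper case on purpose: -i makes the comparison case insensitive.
target=147A0A801ACB75152296BD2940D3A170

# -q prints nothing; grep exits with status 0 only when a line matches.
if grep -qi "$target" "$hashes"; then
    result="found"
else
    result="not found"
fi
echo "$result"
```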

If you happen to have multiple hashes that you need to check against the volume, create a txt file with all of your hashes in it, one hash per line.  I created a txt file containing the target file’s hash and two more random hashes.  I named the txt file multiplehashes.md5.txt.

To compare the hashes contained in multiplehashes.md5.txt against the hashedfiles.md5.txt file, type the below command and press enter.

grep -if multiplehashes.md5.txt hashedfiles.md5.txt

Grep is the command to print matching lines, the flag -i makes the search case insensitive, and the flag -f tells grep to read its search patterns from a file, one per line.  In this instance that file is multiplehashes.md5.txt, which contains multiple hashes.  Hashedfiles.md5.txt is the file containing all of the hashes in the volume.

These are my results.

Grep compared the three hashes contained in multiplehashes.md5.txt against hashedfiles.md5.txt.  It was unable to find the random hashes, but was able to find the hash of our target file.  It printed the line to standard output and displayed the location of the file in our volume.  And there you have it.
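One caveat with a hash list file: an empty line in it acts as an empty pattern, and an empty pattern matches every line, so a stray blank line would make every file look like a hit. The sketch below guards against that by filtering blank lines out first, and adds -F so the hashes are treated as plain strings. The files here are temporary samples. The first hash is my target file's, the second is the well-known md5 of empty input, the all-zero entry is a placeholder, and the paths are illustrative.

```shell
# Sample list of volume hashes in md5sum's output format.
hashes=$(mktemp)
printf '%s  %s\n' 147a0a801acb75152296bd2940d3a170 ./evidence.png > "$hashes"
printf '%s  %s\n' d41d8cd98f00b204e9800998ecf8427e ./random_more/empty.txt >> "$hashes"

# Hash set to look for, with a stray blank line in the middle.
wanted=$(mktemp)
printf '%s\n' 147A0A801ACB75152296BD2940D3A170 '' 00000000000000000000000000000000 > "$wanted"

# Drop blank lines so an empty pattern cannot match everything.
clean=$(mktemp)
grep -v '^[[:space:]]*$' "$wanted" > "$clean"

# -F treats each pattern as a fixed string, -i ignores case, -f reads patterns from a file.
matches=$(grep -i -F -f "$clean" "$hashes")
echo "$matches"
```

Only the evidence.png line is printed; without the filtering step, the blank line would have caused both sample lines to be printed.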


This process is a free and quick way to compare your hash sets against the files in any volume that can be mounted in Linux.

If this procedure worked for your case, and you are able to use it in the course of your investigation, we would like to hear from you.  Please post your comments or email the author of this article at carlos@epyxforensics.com.

Post by Pete McGovern
