Linux-Forensics-Workshop-Manual
DFRWS LINUX FORENSICS WORKSHOP 2023
Case Brief
You have been called in to analyze a compromised Linux Hadoop Cluster. The cluster includes one
Name Node (master) and two Data Nodes (slaves). There is a suspicion that they have all been
compromised, but no proof of that. The activity was noticed to happen between Oct. 5th,
2019 and Oct. 8th, 2019.
Deliverable(s)
1. How did the threat actor gain access to the system?
2. What privileges were obtained and how?
3. What modifications were applied to the system?
4. What persistent mechanisms on each compromised system were being used?
5. Could this system be cleaned/recovered?
6. Recommendations
Outcome(s)
At the end of this lab, you will have the skills required to deal with a compromised Linux
system, where you will be capable of doing:
Table of Contents
Case Brief
Deliverable(s)
Outcome(s)
Table of Contents
Task #0: Environment Preparation
0.0 Connecting to the Playground
0.1 Preparing Working Environment
Task #1: Verification and Mounting
1.1 Verifying the Evidence
1.2 Mounting the Evidence
1.3 Checking Status of Mounted Evidence
Task #2: Gathering General System Information
2.1 System Navigation
2.2 Timezone Information
2.3 Network Information
Table 2.1 - Cluster Network Settings
2.4 Drive Information
Task #3: Users, Groups, and Home Directories
3.1 User Information
3.2 Group Information
Table 3.1 - Usernames, Groups, Home Directories, etc.
3.3 Home Directories
Task #4: Working with The Sleuth Kit (TSK)
4.1 Listing Files
4.2 Finding Files
4.3 Extracting Files
4.4 Deleted Files
Task #5: Data Recovery / File Carving
5.1 Dumping EXT4 Journal
5.2 Targeted Data Recovery
5.3 Try to Recover All Deleted Files
5.4 Deleted Exploit
Task #6: Finding the Persistence Mechanism
6.1 Searching Based on Reference
6.2 Searching Based on Date Range
6.3 Checking Files Content
Task #7: Checking System Logs
7.1 Installed Packages
Playground Credentials
Server: https://192.168.1.10____
Username: user___
Password: workshop
Username: tsurugi
Password: tsurugi
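A sketch of the first step, creating a cases directory (this assumes we start in the tsurugi user's home directory, since the path /home/tsurugi/cases/hdfs is referenced later in the workshop):
$ cd ~
$ mkdir cases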
Then change your working directory to the newly created directory, as follows:
$ cd cases
This will be where we store all of our cases. For now, let us create another directory for our hdfs case,
as follows:
$ mkdir hdfs
Again, change your working directory to the newly created directory, as follows:
$ cd hdfs
Make sure you’re inside the hdfs directory. This could be done with the ‘pwd’ command as follows and
as seen in figure 2.
$ pwd
Now let’s create some mount points to be used later to mount our forensic images. Don’t worry, more on
that later and also a directory to hold our results that we pull from these forensic images. This can be
seen in figure 3.
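A sketch of what figure 3 shows, using the mount-point names master, slave1, and slave2 and a results directory, which match the paths used throughout the rest of the workshop:
$ mkdir master slave1 slave2 results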
Hint: If you ever want to check why a command was used a certain way, or what the options being
used mean, use one of the following:
1. man command-name
2. command --help
3. command -h
a. Example: $ man mmls
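A sketch of the verification step, using ewfverify from the libewf tools (the slave image file names here are assumptions patterned on the master's):
$ ewfverify HDFS-Master.E01
$ ewfverify HDFS-Slave1.E01
$ ewfverify HDFS-Slave2.E01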
Make sure you get a success message for all three forensic images (this may take some time depending
on the specs of your VM, so please be patient). An example can be seen in figure 1.1.
Now, let us check our drives and see what volumes each one of them includes. Let's start with the Master;
the results can be seen in figure 1.2.
$ mmls HDFS-Master.E01
We can see the results for Slave 1 and Slave 2 in figure 1.3.
Figure 1.3 - Partition and Volume Layout for both Slave1 and Slave2 Servers
The volumes we are interested in are the ones with index number 002. As you can see, these volumes
are all described as Linux (0x83), so most probably they hold a Linux file system. Another thing to note is
that they all start at a sector offset of 2048. This will be needed later when we get to the mounting part of
the workshop.
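For reference, the E01 images are exposed as raw images with ewfmount; a sketch, assuming the mount points /mnt/ewf1 through /mnt/ewf3 already exist and the slave image names follow the master's pattern (ewfmount names the exposed file ewf1):
$ sudo ewfmount HDFS-Master.E01 /mnt/ewf1
$ sudo ewfmount HDFS-Slave1.E01 /mnt/ewf2
$ sudo ewfmount HDFS-Slave2.E01 /mnt/ewf3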
If you list the contents of the ewf1 directory, you should see a file named “ewf1”. You’ll see a similar file in
both ewf2 and ewf3 as seen in figure 1.5.
$ sudo ls -lh /mnt/ewf1 /mnt/ewf2 /mnt/ewf3
Whenever you need help or more info on a command, just check the man pages 😉
Great, so we have everything prepared; now let us get down to business! Make sure you are within the hdfs
directory "/home/tsurugi/cases/hdfs"; you can double check your current location using pwd.
Before we mount the volumes, we will need the offset to the volume of interest, which we saw in both
figure 1.2 and figure 1.3 to be 2048. This value is in sectors, and at the beginning of both of those
figures we can see a line saying "Units are in 512-byte sectors". So we need to multiply the offset by 512
(2048 × 512 = 1048576 bytes).
Mounting the forensic images can be done using the mount command as seen below and the result seen
in figure 1.6.
sudo mount -o ro,noexec,noatime,offset=$((512*2048)) /mnt/ewf1/ewf1 master
The mount command above will most likely fail with an error. Let us do some checking using fsstat to
see why that happened. We will need the offset to the volume of interest, but fsstat can deal directly
with sectors, so we just need to pass the offset as seen below and in figure 1.7.
$ sudo fsstat -o 2048 /mnt/ewf1/ewf1 | head -n 19
Please read all the details, they are important, but to keep things brief here, check the line under
"Last Mounted at". It says that the volume was not unmounted properly, which can happen when the system
was not shut down properly. This could mean that there is data in the journal that was not yet written to
the volume; that write-back would normally happen once the volume comes back online.
Okay, enough talking, let’s adjust our command with the noload/norecovery option which can be seen in
figure 1.8.
sudo mount -o ro,noexec,noatime,norecovery,offset=$((512*2048)) /mnt/ewf1/ewf1 master
Super! No errors this time! Now make sure to repeat the same for the corresponding volumes on the
Slave1 and Slave2 forensic images, as sketched below.
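A sketch for the slave volumes, assuming the same 2048-sector offset seen in figure 1.3 and that ewfmount exposed each image as a file named ewf1:
$ sudo mount -o ro,noexec,noatime,norecovery,offset=$((512*2048)) /mnt/ewf2/ewf1 slave1
$ sudo mount -o ro,noexec,noatime,norecovery,offset=$((512*2048)) /mnt/ewf3/ewf1 slave2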
Let us do one more check to find out which loop devices these forensic images are connected to. This
can be done using the findmnt command as seen in figure 1.10.
$ findmnt | grep cases
From the results we can see that the master case is mounted on /dev/loop0, slave1 on /dev/loop1, and
slave2 on /dev/loop2. This will be very useful later when we use TSK in task #4.
Now let’s see what we have now inside the “master” directory. I’m going to use “tree” this time to do that,
but feel free to use other stuff, such as “ls”. Make sure to do this for the other two directories we have
(slave1 and slave2). This can be seen in figure 1.11.
$ tree -L 1 master
Let us also verify the Linux flavor we are dealing with. This can be done by checking the /etc/os-release
file as seen in figure 2.1.
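A sketch of that check against the mounted master volume:
$ cat master/etc/os-release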
As you can see from the results on our master server, we are dealing with Ubuntu Linux 16.04.3 LTS,
code name Xenial.
You can double check the results by examining the /etc/localtime file for the timezone. So when checking
it using the file command, you should find the path to the zone information being used on the system.
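A sketch of that check (sudo may be needed depending on permissions):
$ sudo file master/etc/localtime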
We are going to use this information later, especially when generating our timeline, so make sure you
document it on your technical notes document.
PLEASE READ ME: not all of the workshop has screenshots, and for a reason: we need you to pay
attention and ask questions, not just copy and paste commands. Therefore, we will purposely leave some
questions and commands unanswered, but they will be together, so enjoy the ride 😉
Let’s check the hostname of the server which can be found in the /etc/hostname file as seen in figure 2.3.
Another important location to check is the /etc/hosts file, which can be used to define static host
lookups (static DNS entries). Using cat again, we can see some cool results in figure 2.4.
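A sketch of both checks:
$ cat master/etc/hostname
$ cat master/etc/hosts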
Based on the settings found in the master server configuration files (make sure you check slave1 and
slave2 too), we can consolidate all the results into Table 2.1.
It seems that the system is configured to use UUIDs instead of referencing the devices directly, which
is a great way to configure Linux systems (ask your instructor for more information if you want). So we
need a way to map the UUIDs to the drives and the volumes on them. Since we are actually
investigating a forensic image, we cannot use the mount command or the /etc/mtab file (it will be
empty), and we cannot use the /proc/mounts pseudo file either. Therefore, we will use the cfdisk tool
to find the UUID of each partition.
Use the command as seen below and then use the arrows to find the partition with the UUID of the ext4
file system as seen in figure 2.6. This will be the root partition of this system.
$ sudo cfdisk /mnt/ewf1/ewf1
Make sure you follow the same process for the other servers.
Note: on live systems, you can check the /dev/disk directory, where you can list the info by id, label,
path, and by UUID.
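For example, on a live system:
$ ls -l /dev/disk/by-uuid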
Now, let's check it out using the command below. As you can see in figure 3.1, this will list only the users
that have bash configured as their login shell. Please note that on some systems this might be sh or
dash, etc., so make sure you know the environment you are investigating.
$ cat master/etc/passwd | grep bash
From the results we can see that there are only two users who have shell access, root and hadoop.
Q3.1) Is this the same for all of the other systems? If the answer is NO, which system
has a different output and what did you find?
Q3.2) Explain what you found that does not follow the standard Linux FHS (Filesystem Hierarchy Standard).
Let's check if they have passwords to login. This can be found in the shadow file, which is found under
the /etc directory. So use the information you found to check the /etc/shadow file for each system. You
can do that as seen below. Note: make sure to replace username with the username you found in the
previous steps.
$ cat master/etc/shadow | grep username
In the shadow file, fields are separated by colons (:). Check the second field: a * means no password is
set for that user, while a long string means a password hash exists.
Q3.4) What are the groups that the user hadoop belongs to?
Q3.5) Who has sudo access, or in other words, is in the sudo group?
Use all the information you gathered to list, for each system, the users found, their corresponding
user IDs, home directories, and finally the groups they belong to. Add all this to Table 3.1 below. An
example has already been made for you to follow.

Table 3.1 - Usernames, Groups, Home Directories, etc.
System    Username    User ID    Home Directory    Groups
master
slave1
slave2
Use the command below to list all the contents in the user hadoop’s home directory. Make sure to use the
correct command line options to list all files/directories even those that are hidden as seen in figure 3.2.
$ sudo ls -lha master/home/hadoop/
Explore the directory and its contents, especially those bash files, the .ssh directory, temp, the .viminfo
file, and definitely that file named 45010!
Spend some time checking the contents of .bash_history (a file used to store the history of
commands run on a Linux system). The dot at the beginning of the file name denotes that this is a hidden
file. I'm sure you found something interesting 🙂
Q3.7) Can you correlate that with the contents found on the slave servers and did you
find anything weird?
Use the command below to show the contents of the .bash_history file of the hadoop user on the slave1
server, but make sure to use the tee command with it too as seen below.
$ sudo cat slave1/home/hadoop/.bash_history | tee results/slave1-hadoop-bash_history.txt
The tee command allows us to copy the content we dumped on the terminal to a file at the same time. Do
the same for all other users to gather information about each one of them and what they did on those
systems.
From the bash history of the user hadoop on slave1, it seems the user edited the passwd file using vim
(totally weird!!) and removed (rm) a file of interest before logging out of the system. A snippet of what was
found can be seen in figure 3.3.
Use the history files to dig deeper in your investigation; they can be a good map of what to look for and
where to look for activity. After that, move on to the next task.
In this task we will use The Sleuth Kit (TSK) to list the contents of the hadoop user directory found on
the master server. There are two ways to do that; let's start with the hard way 😀
Using the info we previously found, we can now use the fls command to list the files and directories
within the root of the file system mounted on /dev/loop0, as you can see in figure 4.1.
$ sudo fls -l /dev/loop0
The inode number is the number you will find in the 2nd column. The inode number for the “lost+found”
directory is 11 and for the boot directory is 2621441, and so on and so forth.
Now, from the results we can see that the home directory has the inode number 2359297. We can now
use that with the fls command to list the contents of that directory as seen in figure 4.2.
$ sudo fls -l /dev/loop0 2359297
One more time using the inode number of the hadoop directory which is 2359298 to list the content of the
directory, which can be seen in figure 4.3.
Q4.1) What was the inode number for the file named “known_hosts” which is found
within the .ssh directory? Use the same method as we have done so far.
The easiest way to find the inode of a file is to use the “-i” option with the ls command as seen in the
command below.
$ sudo ls -lhi master/home/
Now I want you to go back and look at the contents of the hadoop user's home directory, but focus on the
.viminfo.tmp file. You can use the pipe "|" with the grep command to help narrow down your search, as
seen below.
$ sudo fls -l /dev/loop0 2359298 | grep viminfo
Q4.2) How many files did you find that have that name and why? Could you explain?
Use the inode# of that file and search for which file it belongs to, using the same approach as above.
Q4.4) Which file did you find that inode number 2367367 (the realloc entry) truly belongs to?
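A plausible reconstruction of the command being referred to (inode# is a placeholder for the .viminfo inode found earlier, so this is a sketch rather than the exact command):
$ sudo icat /dev/loop0 inode# | tee results/master-hadoop-viminfo.txt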
Reminder: the command above dumps the content of the file to standard output (stdout), and using the
tee command we also copy that output to a file of our choosing, which was master-hadoop-viminfo.txt in
the command above.
Let’s do another example but this time for the 45010 file. This time we will be using the > to redirect the
output instead of the tee command since this is a binary file and we won’t benefit from dumping it’s
content to the terminal. Most of the content will be not human readable, so let’s do it as seen below.
$ sudo icat /dev/loop0 2367351 > results/master-hadoop-45010.bin
Now check the file’s type and it’s strings content with both the file and strings commands:
$ file results/master-hadoop-45010.bin
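And for the strings content (piping through less keeps the output manageable; that part is a suggestion):
$ strings results/master-hadoop-45010.bin | less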
Q4.5) What type of file did you find this file to be?
Q4.6) Did you find anything referring to this file being an exploit? Show proof.
Let’s compare the hashes for the file we just extracted and the file found directly in the
master/home/hadoop/ directory. This can be done using the following command:
$ md5sum results/master-hadoop-45010.bin master/home/hadoop/45010
Please use the same methods mentioned above to do other experiments and make sure you’re
comfortable with using the TSK commands we’ve covered until now. Do not move forward without
understanding everything before this message. In the end, our goal is to learn, not just to run commands
and find the answer to the questions!
The BAD NEWS: unfortunately, on an EXT4 file system, once a file is deleted, the metadata that points
to the file's data is zeroed out, so there are no longer any pointers back to its content on the volume.
Therefore, let us go with option (a) and use the journal to help us recover the files. To extract the journal,
we will be using debugfs and asking it to dump the file with the inode #8, which is the inode number for
the file system’s journal. This can be done as:
$ sudo debugfs -R 'dump <8> ./results/slave1-journal' /dev/loop1
You should end up with a 128MB file (the size of the EXT4 journal on this system), as seen in figure 5.1.
Now we want to search for files that were deleted between October 5th 2019 and October 8th 2019 based
on the case brief that was given to us. Therefore, let us define a variable with that value:
AFTER=$(date -d"2019-10-05 00:00:00" +%s)
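The matching end of the window is not shown above; a reconstruction consistent with the $BEFORE variable used in the recovery command below:
BEFORE=$(date -d"2019-10-08 00:00:00" +%s)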
Q5.1) What was the path “/home/hadoop” used in the options above for? (hint: man
ext4magic)
Now, let us perform the recovery step itself instead of just listing the files that are recoverable, which
could be done as seen below:
$ sudo ext4magic /dev/loop1 -a $AFTER -b $BEFORE -f /home/hadoop -j results/slave1-journal -r -d results/slave-recovery1/
Q5.2) Were you able to recover the 45010 file? Show proof with validation that the
recovery was successfully done. (hint: use sha256sum)
Use:
$ sudo tree -L 1 results/slave-recovery2/
Please check the man pages for the ext4magic tool; it is truly an excellent tool with many more
features/capabilities. So what are you waiting for? Go check them out!
Q5.4) Can you search and find out what this file is? (hint: use the search “45010 exploit”
phrase). Explain your findings.
Now that you have found out what it is, we need one more piece of proof: has this exploit truly been
executed on these systems or not? 😕
We are going to leave this to you to figure out. Use all the knowledge you have learned so far to find the
answer.
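A plausible sketch of searching for files modified after the 45010 file, using find with the -newer option (an assumption consistent with the -anewer variant shown below):
sudo find master/ -type f -newer master/home/hadoop/45010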
Use either the tee command or the > to redirect the output to a file named master-find1.txt in the results
directory.
We can also search for what files have been accessed after the 45010 file using the -anewer option as
seen below.
sudo find master/ -type f -anewer master/home/hadoop/45010
Again save the output to your results directory in a file named master-find2.txt.
We know that the incident happened between October the 5th and the 8th, we can use the find command
again, but this time with the -newermt option as seen below.
sudo find slave1/ -type f \( -newermt "2019-10-05" -and ! -newermt "2019-10-08" \) | tee results/slave1-find.txt
The escaped parentheses group the two tests, so we match anything newer than October 5th and not
newer than October 8th, i.e., files modified within our window. Make sure you repeat the same steps for
slave2 and spend some time on your findings before continuing your investigation.
If we check its content, we will find that this file is dealing with sockets. Spend some time doing some
Googling and asking questions and then answer the question below.
Use the debugfs command to find when this file was created. An example of how to use it can be seen
below. Make sure you replace the word inode# with the inode number for the cluster.php file.
$ sudo debugfs -R 'stat <inode#>' /dev/loop0
Now, check the other find results for both slave1 and slave2; you should find another file with the word
"cluster" in it.
Q6.2) Which system did you find it on and what was the content of this file?
Do some research and then explain your findings and what is this doing or achieving before you move on
to the next task.
In this task we move our focus completely to logs and log analysis. We will be using very simple
techniques, so don't worry 🙂. Log files on a Linux system are found under the /var/log directory. This
directory contains many different log files depending on the system you are investigating, but in general,
on an Ubuntu flavor of Linux, they will include the following:
- The main log file is syslog (named messages on other systems)
- User activity logs can be found in auth.log, wtmp, btmp, lastlog, faillog, etc.
- Kernel logs can be found in kern.log (dmesg on live systems)
- Others, depending on what applications are installed on the system; you may find
Apache, MySQL, PHP, mail, etc. log files
We can use the tail command to show us the last couple of entries within the file as seen in the command
below.
$ sudo tail master/var/log/dpkg.log
If we add the -n20 or -n30 option, we can see the last 20 or last 30 entries, depending on the number
you choose. If you examine the results, you will see that php and its related packages were installed.
Before we move on to another file, explore the contents using the less command and see if you can find
any other activity within our time period. This can be done as seen below, using the up and down arrows
to scroll through the file.
$ sudo less master/var/log/dpkg.log
Q7.1) Did you find any weird installations happening during the investigation time
period?
Q7.2) When was the php package installed and what other packages were installed with
it (list at least three of them)?
Make sure you check the same locations on the other systems; don't skip them. These steps might only
show you one part of the picture, so don't miss the others.
Now again but with head (hope you notice the difference):
$ sudo last -f master/var/log/wtmp | head -n 10
Now the btmp file (failed login attempts). Do the same as before with sudo powers, and also try it
without head so you can see the full extent of the activity. Please spend some time carefully going
through this log file; it is very important 😊.
$ sudo last -f master/var/log/btmp | head -n 20
Q7.5) Why are there so many failed login attempts, what do you think is happening?
Explain your answer.
Q7.6) Does it match the activity that you saw in the previous log (btmp)?
Q7.7) Was the user successful in obtaining access using this method? Explain with
proof.
Search for the line that has the text “Failed password for invalid user magnos”. (README)
Find out what happened immediately after that and see if it all makes sense now.
Q7.8) Explain what happened and at what time.
Let us also check the lastlog file using the “strings” command:
$ strings master/var/log/lastlog
Q7.10) Does it match the IP Address you saw in any of the previous log files? Please
document all files that you found that IP address in, you will need them for your final
report.
We are not aware of a tool that reads the lastlog file offline, but you can use the lastlog binary record
structure to read its records yourself. The structure can be found below, and a sketch of reading a
record follows the list:
● 4 bytes → timestamp
● 32 bytes → terminal
● 256 bytes → hostname
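A minimal sketch of reading one record with standard tools, assuming GNU dd/od/date and that the hadoop user's UID is 1001 (substitute the UID you recorded in Table 3.1; lastlog holds one 4+32+256 = 292-byte record per UID, indexed by UID):
$ sudo dd if=master/var/log/lastlog bs=292 skip=1001 count=1 of=rec.bin 2>/dev/null
$ od -A n -t d4 -N 4 rec.bin                           # first 4 bytes: epoch timestamp
$ date -d @$(od -A n -t d4 -N 4 rec.bin | tr -d ' ')   # same timestamp, human-readable
$ dd if=rec.bin bs=1 skip=4 count=32 2>/dev/null | strings    # terminal
$ dd if=rec.bin bs=1 skip=36 count=256 2>/dev/null | strings  # hostname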
By now, you should have found how this threat actor gained access to the master system and hopefully
into other systems too. Let’s move on and generate a timeline to make sure this all can be correlated
together and makes sense!
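A sketch of the kind of timeline-generation command being referenced, using plaso's log2timeline.py with the linux parser preset (an assumption; the exact options used in the workshop may differ):
sudo log2timeline.py --parsers linux master.timeline /dev/loop0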
Q8.1: What does the linux parser used in the command above search for?
Master:
sudo psort.py -z UTC -o L2tcsv -w master.csv master.timeline "date > '2019-10-05 00:00:00' AND date < '2019-10-08 00:00:00'"
Slave1:
sudo psort.py -z UTC -o L2tcsv -w slave1.csv slave1.timeline "date > '2019-10-05 00:00:00' AND date < '2019-10-08 00:00:00'"
Slave2:
sudo psort.py -z UTC -o L2tcsv -w slave2.csv slave2.timeline "date > '2019-10-05 00:00:00' AND date < '2019-10-08 00:00:00'"
Note(s):
1. Spaces were added to the commands above to help you understand them; they are not otherwise needed.
2. Use whatever tool or spreadsheet application to go through your timeline. For quick checks, I
usually use Eric Zimmerman’s Timeline Explorer, but it’s up to you.
Deliverables:
1. How did the threat actor(s) gain access to the system?
ANSWER:
3. What are the modifications that have been applied to the system?
ANSWER:
5. What needs to be done to restore the cluster to a fully cleaned and working
environment?
ANSWER: