Thursday, September 25, 2014

The differences between a physical image and a logical extraction

There's a reason we want a physical image

This post is a request from a reader.  Thanks for the request!  If you, the reader, ever have a topic you would like to see me dive into, message me.

The question was: what data do you have when you obtain a physical image instead of a logical extraction?  Great question.  First, let me define a couple of working terms.  A physical image is the image you would obtain when following this guide on a previous blog post or using a similar tool, such as a Cellebrite UFED Physical.  A logical extraction is a set of data extracted using a forensic app.  For this post, I'll reference AFLogical by viaForensics, which is a free tool you can find here and you can follow instructions for using it here.

(Please note: in no way am I trying to bash viaForensics here.  viaForensics is a great company and I admire their work.  I'm referencing this tool as a free logical extraction tool you can download and use while pointing out the weaknesses of using logical extractions.  The fact that the tool is free should be an indication that this tool is not their premier tool.  They have far more powerful tools and their professional services are among the best in the industry.)

So with all of the above out of the way, here we go ...

Data obtained with a physical image

The answer is everything in storage on the device.  You get every file, every database, every picture, and all of the slack space as well.  For a writeup on slack space, check out this page by viaForensics.  Simply put, with a physical image, you get everything in storage.

There is a good reason why we always want a physical image.  Examining a physical image takes specialty tools, and I go over the basics in this blog post.  If you want to look at data records, such as text messages, you do not have a simple file to examine with all of the records.  You need to find the file storing these records, which is most likely a database, and examine the database file.  The examination process is not straightforward, but you obtain the most data.
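
As a rough illustration of "finding the file storing these records", here is a minimal Java sketch that walks a directory of files pulled from an image and lists the ones that are SQLite databases.  It relies on the fact that every SQLite 3 database file starts with the 16-byte string "SQLite format 3" followed by a zero byte.  The class name SqliteFinder and its method names are made up for this example, and it assumes you have already extracted the files from the image to a folder on your computer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: finds SQLite databases among files extracted from an image.
class SqliteFinder {
    // Every SQLite 3 database file begins with this 16-byte header string.
    private static final byte[] MAGIC =
            "SQLite format 3\0".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the first 16 bytes of the file match the SQLite header.
    static boolean isSqlite(Path file) throws IOException {
        byte[] header = new byte[MAGIC.length];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (n < header.length) {
                int r = in.read(header, n, header.length - n);
                if (r < 0) {
                    break; // file is shorter than the header
                }
                n += r;
            }
        }
        return n == header.length && Arrays.equals(header, MAGIC);
    }

    // Walks the directory tree under root and collects every SQLite database file.
    static List<Path> findDatabases(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            for (Path p : (Iterable<Path>) walk::iterator) {
                if (Files.isRegularFile(p) && isSqlite(p)) {
                    found.add(p);
                }
            }
        }
        return found;
    }
}
```

This only identifies candidate files by their header.  You still need a SQLite viewer to examine the contents.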

What you do not obtain is live running memory.  Sometimes live running memory can contain important data, including decrypted data if the data in storage is encrypted.  I do not intend to go over how to image live memory simply because it is a very complicated process which sometimes does not work.

Data obtained using a logical record extraction tool

A logical record extraction tool is an app which installs on the device.  As I discussed in my post on live imaging, the imaging process requires an exploit.  In that previous post, the exploit allows for root privileges.  Root access is required to image the device, and root access is also required to read files in the /data partition, which is where user records are stored.  A logical record extraction tool does not require root access.  A logical record extraction tool uses Android APIs to extract records from the device and save them to external storage.  These APIs allow a programmer to write an app to request certain records.  The APIs do not return the actual database files but they do return the records.  For a guide on this process, check out this programming guide on how to programmatically read SMS from the inbox.  Look specifically at this code snippet (from the website, I cleaned it up some to make it more readable):
if (cursor != null) {
    try {
        count = cursor.getCount();
        // moveToFirst() positions the cursor on the first row;
        // reading columns before that would throw an exception
        if (count > 0 && cursor.moveToFirst()) {
            // Column order follows the projection passed to the query
            long messageId = cursor.getLong(0);
            long threadId = cursor.getLong(1);
            String address = cursor.getString(2);
            long contactId = cursor.getLong(3);
            String contactId_string = String.valueOf(contactId);
            long timestamp = cursor.getLong(4);

            String body = cursor.getString(5);

            if (!unreadOnly) {
                count = 0;
            }

            SmsMmsMessage smsMessage = new SmsMmsMessage(
                context, address, contactId_string, body, timestamp,
                threadId, count, messageId, SmsMmsMessage.MESSAGE_TYPE_SMS);

            return smsMessage;
        }
    } finally {
        // Always release the cursor, even when returning early
        cursor.close();
    }
}

This source code assumes the app has permission to read the SMS database.  The snippet reads a row returned by the query and extracts the message ID, thread ID, address, contact ID, timestamp, and message body.  All of this data goes into an “SmsMmsMessage” object.  A programmer can repeat this for every row and save the fields to a file, which effectively means all SMS records are retrieved and exported.
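
To make the "save to a file" step concrete, here is a minimal sketch in plain Java that writes extracted records out as CSV.  The SmsRecord and SmsCsvExporter classes are hypothetical stand-ins I made up for this example; they are not Android classes and this is not how any particular tool does it.  The fields mirror the columns read in the snippet above.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical holder for the columns read from the cursor; not an Android class.
class SmsRecord {
    final long messageId;
    final long threadId;
    final String address;
    final String contactId;
    final long timestamp;
    final String body;

    SmsRecord(long messageId, long threadId, String address,
              String contactId, long timestamp, String body) {
        this.messageId = messageId;
        this.threadId = threadId;
        this.address = address;
        this.contactId = contactId;
        this.timestamp = timestamp;
        this.body = body;
    }
}

// Hypothetical exporter: one CSV line per record.
class SmsCsvExporter {
    // Wraps a text field in quotes and doubles any embedded quotes.
    static String quote(String field) {
        String value = (field == null) ? "" : field;
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    static String toCsvLine(SmsRecord r) {
        return String.join(",",
                Long.toString(r.messageId),
                Long.toString(r.threadId),
                quote(r.address),
                quote(r.contactId),
                Long.toString(r.timestamp),
                quote(r.body));
    }

    // Writes a header row followed by one line per record.
    static void export(List<SmsRecord> records, Path out) throws IOException {
        try (Writer w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            w.write("message_id,thread_id,address,contact_id,timestamp,body\n");
            for (SmsRecord r : records) {
                w.write(toCsvLine(r));
                w.write("\n");
            }
        }
    }
}
```

Notice that the export can only ever contain the fields the API handed back.  Anything else in the database never makes it into the file.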

Here is the problem.  The APIs will give you a certain set of data.  There may be more data associated with these records which the APIs do not return.  The above code, for example, does not return any location related data associated with the message or any metadata associated with the contact or the phone number.  These extra data records will be in the database file which you can read if you obtain a physical image of the device.

The APIs also will not return any deleted records.  When an SMS message is deleted, it is no longer a live record in the database, so queries no longer return it.  However, if you have a physical image, you may be able to find the deleted message in slack space.  The APIs only return what records they are programmed to return; they cannot return records floating in slack space.
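
One thing a physical image allows that the APIs never will is a raw keyword search across every byte of storage, including space no file currently owns.  Here is a minimal Java sketch of that idea.  The class name RawImageSearch is made up for this example, and it assumes a raw (dd-style) image file and a keyword stored as plain ASCII or UTF-8 text.  Text that is encrypted, compressed, or stored in another encoding will not match.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports every byte offset where a keyword occurs in a stream.
class RawImageSearch {
    // Streaming Knuth-Morris-Pratt search, so the image never has to fit in memory.
    static List<Long> findAll(InputStream in, byte[] needle) throws IOException {
        if (needle.length == 0) {
            throw new IllegalArgumentException("keyword must not be empty");
        }
        // fail[i] = length of the longest proper prefix of needle[0..i] that is also a suffix
        int[] fail = new int[needle.length];
        for (int i = 1, k = 0; i < needle.length; i++) {
            while (k > 0 && needle[i] != needle[k]) {
                k = fail[k - 1];
            }
            if (needle[i] == needle[k]) {
                k++;
            }
            fail[i] = k;
        }

        List<Long> hits = new ArrayList<>();
        long pos = 0; // number of bytes consumed so far
        int k = 0;    // number of needle bytes currently matched
        int b;
        while ((b = in.read()) != -1) {
            byte cur = (byte) b;
            while (k > 0 && cur != needle[k]) {
                k = fail[k - 1];
            }
            if (cur == needle[k]) {
                k++;
            }
            pos++;
            if (k == needle.length) {
                hits.add(pos - needle.length); // offset where this match starts
                k = fail[k - 1];               // keep going; matches may overlap
            }
        }
        return hits;
    }

    // Usage: java RawImageSearch <raw image file> <keyword>
    public static void main(String[] args) throws IOException {
        byte[] keyword = args[1].getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(
                Files.newInputStream(Paths.get(args[0])))) {
            for (long offset : findAll(in, keyword)) {
                System.out.println(offset);
            }
        }
    }
}
```

A hit tells you where to look in a hex viewer.  It does not tell you whether the bytes belong to a live record or a deleted one.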

The logical record extraction process is incapable of extracting files in the /data partition.  You need root access to extract the actual files.  The APIs only return the records, not the files.

Also, there may not be APIs available to return data from third party apps, ranging from Facebook to third party messaging apps to web browsing apps.  If there is not an API, the data cannot be retrieved using a logical record extraction app.  With a physical image, you can locate and examine the database files associated with these apps.


In summation, you want a physical image.  The logical extraction tool is a good tool to use if you need a quick look at text messages or call logs, and it also is a good tool to use if you are unable for whatever reason to obtain a physical image of the device.  If you are doing a detailed examination of the device, you will need a physical image.

The logical extraction tools have their purposes.  I am not here to denigrate those tools by any means.  I am here to point out their limitations.

Thank you to one of my readers for suggesting this post.  If you, the reader, have a good topic you would like to see a full post on, shoot me a message and I'd be glad to oblige.

Questions, comments, suggestions, or experiences?  Requests for posts?  Leave a comment below, or send me an email.

Reverse Engineering an Android App File

It is okay to be frustrated

The Android operating system has all kinds of great apps. I use Netflix, YouTube, Facebook, and the Chrome browser all the time. The development environment for writing Android apps is easy and free, so it attracts some great developers and all kinds of innovation.

The problem is, great developers and all kinds of innovation are not all that the development environment attracts. It depends on which report you go with (this one, this one, this one, or many other excellent reports by trusted security firms), but every security researcher who looks at mobile malware agrees on one thing: the Android operating system is the number one mobile operating system for malware. Malware may be spyware which steals personal data, ransomware which “locks” the device until the user forks over money to some hacker, or a particularly annoying variant which uses up expensive services like premium text messages or large volumes of data and forces the user to pay exorbitant fees to their service provider.

So how do you know if an app is malware? There are malware scanners out there which work with varying effectiveness. (By the way, if you use Android, I strongly suggest you download a virus scanner, just in case. I personally use Lookout because one of its nice features is find-my-phone, which is quite handy in the morning when I can hardly find anything. If only there were a find-my-keys app …)

If you desire, you can reverse engineer an app install file to its source code to determine if there is any malware present. Obviously, poring over code requires programming experience, or at least programming knowledge. If you have it, you may enjoy this exercise.

Introduction to Android app install files

Android app install files are packaged with the extension .apk. If you have downloaded an app, the .apk file is on your phone in the directory /data/app. You can retrieve it in a few ways:

  • if you've imaged the phone, retrieve it from the image using FTK Imager
  • if you are root, you can copy the file from /data/app to /sdcard
  • you can install a file manager, like Astro File Manager or ES File Manager, and use the built-in app management to back up the app to your /sdcard directory. I personally use ES File Manager for this functionality.

System apps, like Gmail, Browser, Calendar, and other default apps, are in the system partition at /system/app. The easiest way to retrieve those, in my opinion, is to use adb. You can use adb shell to navigate to /system/app and find the name of the file you wish. Exit the adb shell and return to your computer and type the following:
adb -d pull /system/app/<filename>.apk
This command pulls the file to your working directory.

APK files are just zip files. Once the APK file is on your computer, you can rename the file to include a .zip extension and navigate around.
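
Because an APK is a zip file, you can also list its contents in code without renaming anything. Here is a minimal Java sketch using the standard java.util.zip classes. The class name ApkLister is made up for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the entry names inside an APK (or any zip file).
class ApkLister {
    static List<String> listEntries(String apkPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apkPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // Usage: java ApkLister <file.apk>
    public static void main(String[] args) throws IOException {
        for (String name : listEntries(args[0])) {
            System.out.println(name);
        }
    }
}
```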

Within the APK file is a file called classes.dex. This is the actual app binary. If you navigated through the /system/app directory, you may have noticed a bunch of files with the .odex extension. Each of these is the classes.dex file from the associated APK file, optimized for the version of the Android OS. If you've ever done a factory reset or installed an update, you know upon the first boot that you have to wait for a while as you see a screen indicating that all of your apps are being optimized. The app optimization process results in creating these .odex files.
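
If you want to confirm that the classes.dex inside an APK really is a DEX file, you can check its header. A DEX file begins with the bytes d, e, x, a newline, a three-digit format version such as 035, and a zero byte. Here is a minimal Java sketch. The class name DexHeader and its method names are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: reads the DEX magic from classes.dex inside an APK.
class DexHeader {
    // Returns the DEX format version (for example "035"), or null if the
    // bytes do not start with a valid DEX magic.
    static String dexVersion(byte[] header) {
        if (header.length < 8) {
            return null;
        }
        if (header[0] != 'd' || header[1] != 'e' || header[2] != 'x'
                || header[3] != '\n' || header[7] != 0) {
            return null;
        }
        for (int i = 4; i < 7; i++) {
            if (header[i] < '0' || header[i] > '9') {
                return null; // version must be three ASCII digits
            }
        }
        return new String(header, 4, 3, StandardCharsets.US_ASCII);
    }

    // Returns the DEX version of classes.dex in the APK, or null if the entry
    // is missing or is not a DEX file.
    static String dexVersionOfApk(String apkPath) throws IOException {
        try (ZipFile zip = new ZipFile(apkPath)) {
            ZipEntry dex = zip.getEntry("classes.dex");
            if (dex == null) {
                return null;
            }
            byte[] header = new byte[8];
            int n = 0;
            try (InputStream in = zip.getInputStream(dex)) {
                while (n < header.length) {
                    int r = in.read(header, n, header.length - n);
                    if (r < 0) {
                        break; // entry is shorter than the magic
                    }
                    n += r;
                }
            }
            return n == header.length ? dexVersion(header) : null;
        }
    }
}
```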

Also within is a directory called res. This is the resources directory, including images and other files used by the app.

Another directory in APK files is META-INF. Within here is the digital signature of the app. When the app is built, it is digitally signed so that its authenticity can be checked.
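
The signature stored in META-INF uses the same scheme as a signed Java JAR, so the standard java.util.jar classes can check it. Here is a minimal sketch that reports which entries are not covered by a signature. The class name ApkSignatureCheck is made up for this example. Treat the result with care: it only looks at the JAR-style signature in META-INF, some Java runtimes may treat signatures made with older algorithms as unsigned, and a valid signature tells you the file was not altered after signing, not that the signer is trustworthy.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: lists APK entries that carry no JAR-style signature.
class ApkSignatureCheck {
    static List<String> unsignedEntries(File apk) throws IOException {
        List<String> unsigned = new ArrayList<>();
        // verify=true asks JarFile to check digests and signatures while reading
        try (JarFile jar = new JarFile(apk, true)) {
            Enumeration<JarEntry> entries = jar.entries();
            byte[] buf = new byte[8192];
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // The signature files themselves are never reported as signed
                if (entry.isDirectory() || entry.getName().startsWith("META-INF/")) {
                    continue;
                }
                // Signers are only available after the entry has been read in
                // full; a tampered entry throws SecurityException here
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buf) != -1) {
                        // discard the data; we only need the verification side effect
                    }
                }
                if (entry.getCodeSigners() == null) {
                    unsigned.add(entry.getName());
                }
            }
        }
        return unsigned;
    }
}
```

An empty list means every non-signature entry was covered by a signature the runtime accepted.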

When a developer writes an Android app, there is a file called the Android Manifest. You'll see that there is a file called AndroidManifest.xml in the APK, but if you load it in a text editor you won't be able to read much, because it is stored in a compiled binary format. The manifest includes details about the app, including the intents it handles, the broadcasts it receives, and the permissions it requests. The permissions are very important. For example, if you have a simple app, such as a simple game, but the app has the permission to record audio or send text messages, something could be fishy here. Of course, it is possible that the game can take voice commands and send your high score to your friends to brag, so you never know. If there are odd permissions in an app, that should raise some red flags.

There's a ton more you can learn about Android apps than this. What is important to know for now is how to retrieve an app, what the classes.dex file is, and what the manifest is.

Reverse engineering the manifest

So you have an app file. The first thing I like to do is retrieve the manifest. Copy the app to a working directory in your Linux machine and navigate there in a shell. Type the following:
aapt l -a filename.apk > manifest.txt

aapt is a tool included with the Android SDK, as is adb. If adb is not in your system path, aapt most likely is not either. The command above translates the unreadable AndroidManifest.xml file in the APK file to a human-readable format and outputs it to the manifest.txt file. Note: what you get out of this is NOT the original manifest. You will need the original source code to retrieve the manifest as it was prior to compiling.

Open the manifest.txt file in a text editor. Look for your permissions. You can do a text search for permissions. You'll see entries along the lines of the following:
    E: uses-permission (line=1238)
      A: android:name(0x01010003)="android.permission.BATTERY_STATS" (Raw: "android.permission.BATTERY_STATS")
    E: uses-permission (line=1239)
      A: android:name(0x01010003)="android.permission.ACCESS_NETWORK_STATE" (Raw: "android.permission.ACCESS_NETWORK_STATE")
    E: uses-permission (line=1240)
      A: android:name(0x01010003)="android.permission.ACCESS_WIFI_STATE" (Raw: "android.permission.ACCESS_WIFI_STATE")
If you see any suspicious permissions, take note.
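
If the manifest dump is long, you can pull the permissions out with a few lines of code instead of searching by hand. Here is a minimal Java sketch that extracts every uses-permission entry from text in the format shown above. The class name ManifestPermissions is made up for this example, and the short WORTH_A_LOOK list is only an illustration built from the record audio and send text message examples mentioned earlier. It is not a definitive list of dangerous permissions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts permission names from an aapt manifest dump.
class ManifestPermissions {
    // Matches a uses-permission element followed by its android:name attribute.
    // \s+ lets the attribute sit on the same line or on the next one.
    private static final Pattern USES_PERMISSION = Pattern.compile(
            "E: uses-permission \\(line=\\d+\\)\\s+"
            + "A: android:name\\(0x[0-9a-fA-F]+\\)=\"([^\"]+)\"");

    // Illustrative only: permissions that deserve a second look in a simple app.
    private static final Set<String> WORTH_A_LOOK = new HashSet<>(Arrays.asList(
            "android.permission.RECORD_AUDIO",
            "android.permission.SEND_SMS"));

    // Returns the permission names in the order they appear in the dump.
    static List<String> extract(String aaptDump) {
        List<String> permissions = new ArrayList<>();
        Matcher m = USES_PERMISSION.matcher(aaptDump);
        while (m.find()) {
            permissions.add(m.group(1));
        }
        return permissions;
    }

    // Returns only the permissions that appear in the illustrative list.
    static List<String> flag(List<String> permissions) {
        List<String> flagged = new ArrayList<>();
        for (String p : permissions) {
            if (WORTH_A_LOOK.contains(p)) {
                flagged.add(p);
            }
        }
        return flagged;
    }
}
```

To use it, read manifest.txt into a string and pass it to extract. Whether a permission is actually suspicious still depends on what the app claims to do.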

Reverse engineering the classes.dex file to source

To reverse engineer the classes.dex file and read it, you'll need a couple of programs, both of which are installed in Santoku. If you are using Santoku Linux, you're good. Otherwise, download and install dex2jar and JD GUI. Dex2jar is a tool which converts an Android classes.dex file to a Java JAR archive file, and JD GUI allows you to read the JAR file as Java source. Install links are here and here.

In the terminal, type the following:
d2j-dex2jar filename.apk
jd-gui filename-dex2jar.jar
The first line creates your jar file, and the second opens the jar file in JD GUI.

In JD GUI, you'll see how the app source is organized. If you have no Java experience, you'll probably be lost navigating around, but if you have Java experience you'll figure this out quickly. Regardless of your Java experience, reverse engineering app source is a royal pain.

Now let's say you see something like this:

[Screenshot: JD GUI view of the Netflix app]

or this:

[Screenshot: JD GUI view of the Netflix app]

The screenshots are from JD GUI and the Netflix app. Anyone who knows much about programming knows that a, b, c, d, e, and such make terrible class and variable names. Variables and classes should be descriptive. What happened here is that the Netflix developers used code obfuscation. During the build, a tool renames variables and classes to useless names like a, b, c, and such. They do this as a service to you, just in case you didn't think reverse engineering was already frustrating. When you see obfuscation like this, often your best indications of what is going on are functions the obfuscator cannot rename (like getCacheDir and getAbsolutePath, which belong to the platform rather than the app) and strings. Obfuscation must not change the functionality of the app, and changing the text of a string would alter functionality, so string contents usually survive.
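
If you want a quick feel for how heavily an app is obfuscated before opening it in JD GUI, you can count how many classes in the dex2jar output have very short names. Here is a minimal Java sketch of that heuristic. The class name ObfuscationHeuristic and the cutoff of two characters are my own assumptions for this example. A high ratio suggests renaming, but it proves nothing by itself.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: estimates how many classes in a JAR have obfuscated-looking names.
class ObfuscationHeuristic {
    // True if the outer class name is one or two characters long,
    // for example com/example/a.class or a/b/c$d.class.
    static boolean isShort(String entryName) {
        String simple = entryName.substring(
                entryName.lastIndexOf('/') + 1,
                entryName.length() - ".class".length());
        // Judge inner and anonymous classes by their outer class name,
        // so MainActivity$1 is not mistaken for an obfuscated class
        int dollar = simple.indexOf('$');
        if (dollar > 0) {
            simple = simple.substring(0, dollar);
        }
        return simple.length() <= 2;
    }

    // Returns the fraction of .class entries with short names (0.0 if there are none).
    static double shortNameRatio(String jarPath) throws IOException {
        int classes = 0;
        int shortNames = 0;
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (!name.endsWith(".class")) {
                    continue; // skip resources and directories
                }
                classes++;
                if (isShort(name)) {
                    shortNames++;
                }
            }
        }
        return classes == 0 ? 0.0 : (double) shortNames / classes;
    }
}
```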


So what strategies do I suggest in reverse engineering source? Honestly, I do not suggest a single strategy. I suggest figuring things out and finding what works well for you. It can be frustrating but extremely insightful. I've reverse engineered apps before, and whenever I find something suspicious, it becomes a thread that I keep pulling until I either rule it out or actually find something malicious. Remember to correlate what you find with the manifest. And as always, if you need help, please comment or reach out to me.

Questions, comments, suggestions, or experiences?  Frustrations related to reverse engineering efforts?  Leave a comment below, or send me an email.