Thursday, September 25, 2014

Reverse Engineering an Android App File

It is okay to be frustrated

The Android operating system has all kinds of great apps. I use Netflix, YouTube, Facebook, and the Chrome browser all the time. The development environment for writing Android apps is easy and free, so it attracts some great developers and all kinds of innovation.

The problem is, great developers and all kinds of innovation are not all that the development environment attracts. The exact numbers depend on which report you go with (there are many excellent reports by trusted security firms), but every security researcher who looks at mobile malware agrees on one thing: Android is the number one mobile operating system for malware. Malware may be spyware, which steals personal data; ransomware, which “locks” the device until the user forks over money to some hacker; or a particularly annoying variant that runs up expensive services like premium text messages or large volumes of data and forces the user to pay exorbitant fees to their service provider.

So how do you know if an app is malware? There are malware scanners out there which work with varying effectiveness. (By the way, if you use Android, I totally suggest you download a virus scanner, just in case. I personally use Lookout because one of the nice features is find-my-phone, which is sometimes quite handy in the morning when I can hardly find anything. If only there were a find-my-keys app …)

If you desire, you can reverse engineer an app install file back to readable source code to determine if there is any malware present. Obviously, poring over code requires programming experience, or at least programming knowledge. If you have it, you may enjoy this exercise.

Introduction to Android app install files

Android app install files are packaged with the extension .apk. If you have downloaded an app, the .apk file is on your phone in the directory /data/app. You can retrieve it in a few ways:

  • if you've imaged the phone, retrieve it from the image using FTK Imager
  • if you are root, you can copy the file from /data/app to /sdcard
  • you can install a file manager, like Astro File Manager or ES File Explorer, and use the built-in app management to back up the app to your /sdcard directory. I personally use ES File Explorer for this functionality.

System apps, like Gmail, Browser, Calendar, and other default apps, are in the system partition at /system/app. The easiest way to retrieve those, in my opinion, is to use adb. Use adb shell to navigate to /system/app and find the name of the file you want. Then exit the adb shell and, back on your computer, type the following:
adb -d pull /system/app/<filename>.apk
This command pulls the file to your working directory.

APK files are just zip files. Once the APK file is on your computer, you can rename the file to include a .zip extension and navigate around.
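
If you would rather not rename anything, the same idea can be sketched in Python, since the standard zipfile module reads any zip archive regardless of its extension. This is only a minimal sketch: the helper names are mine, and build_demo_apk creates a tiny in-memory stand-in with made-up contents so the example runs without a real app file.

```python
import io
import zipfile


def list_apk_entries(apk_file):
    """Return the names of all entries inside an APK (which is a zip archive).

    apk_file may be a path such as "filename.apk" or a file-like object.
    """
    with zipfile.ZipFile(apk_file) as apk:
        return apk.namelist()


def build_demo_apk():
    """Build a tiny in-memory stand-in for an APK (hypothetical contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"binary xml placeholder")
        apk.writestr("classes.dex", b"dex\n035\x00")
        apk.writestr("res/drawable/icon.png", b"")
        apk.writestr("META-INF/CERT.RSA", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Swap build_demo_apk() for the path to a real .apk file.
    for name in list_apk_entries(build_demo_apk()):
        print(name)
```

With a real app you would pass the path to the .apk you pulled from the device instead of the demo archive.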

Within the APK file is a file called classes.dex. This is the actual app binary: the compiled bytecode that Android runs. If you navigated through the /system/app directory, you may have noticed a bunch of files with the .odex extension. Each of these is the classes.dex file from the associated APK file, optimized for the version of the Android OS on the device. If you've ever done a factory reset or installed an update, you know that on the first boot you have to wait a while as a screen tells you that all of your apps are being optimized. That optimization process is what creates these .odex files.
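
One quick sanity check on a classes.dex file is its header: a DEX file starts with the bytes "dex", a newline, a three-digit format version, and a null byte. The sketch below (the function name is my own) reads that version from the first eight bytes. It deliberately does not handle .odex files, which use a different header.

```python
import re

# DEX magic: b"dex\n", three ASCII digits for the format version, then a null byte.
DEX_MAGIC = re.compile(rb"dex\n(\d{3})\x00")


def dex_version(header_bytes):
    """Return the DEX format version (e.g. "035"), or None if this is not DEX."""
    match = DEX_MAGIC.match(header_bytes[:8])
    return match.group(1).decode("ascii") if match else None


if __name__ == "__main__":
    # Made-up header bytes; with a real file, read the first 8 bytes of classes.dex.
    print(dex_version(b"dex\n035\x00" + b"\x00" * 24))
    print(dex_version(b"PK\x03\x04not a dex file"))
```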

Also within is a directory called res. This is the resources directory, including images and other files used by the app.

Another directory in APK files is META-INF. Within it is the digital signature of the app. When the app is built, it is digitally signed for authenticity.

When a developer writes an Android app, there is a file called the Android Manifest. You'll see that the APK contains a file called AndroidManifest.xml, but if you load it in a text editor you won't be able to read much, because it is stored in a compiled binary form. The manifest includes details about the app, including intents called, broadcast signals sent, and permissions requested. The permissions are very important. For example, if you have a simple app, such as a simple game, but the app has permission to record audio or send text messages, something could be fishy. Of course, it is possible that the game takes voice commands and sends your high score to your friends to brag, so you never know. Still, odd permissions in an app should raise some red flags.

There's a ton more you can learn about Android apps than this. What is important to know for now is how to retrieve an app, what the classes.dex file is, and what the manifest is.

Reverse engineering the manifest

So you have an app file. The first thing I like to do is retrieve the manifest. Copy the app to a working directory on your Linux machine and navigate there in a shell. Type the following:
aapt l -a filename.apk > manifest.txt

aapt is the Android Asset Packaging Tool, which ships with the Android SDK. If the SDK tools are not in your system path, aapt most likely won't be either. The command above translates the unreadable AndroidManifest.xml file in the APK file to a human-readable format and writes it to the manifest.txt file. Note: what you get out of this is NOT the original manifest. You will need the original source code to retrieve the manifest as it was prior to compiling.

Open the manifest.txt file in a text editor. Look for your permissions. You can do a text search for permissions. You'll see entries along the lines of the following:
    E: uses-permission (line=1238)
      A: android:name(0x01010003)="android.permission.BATTERY_STATS" (Raw: "android.permission.BATTERY_STATS")
    E: uses-permission (line=1239)
      A: android:name(0x01010003)="android.permission.ACCESS_NETWORK_STATE" (Raw: "android.permission.ACCESS_NETWORK_STATE")
    E: uses-permission (line=1240)
      A: android:name(0x01010003)="android.permission.ACCESS_WIFI_STATE" (Raw: "android.permission.ACCESS_WIFI_STATE")
If you see any suspicious permissions, take note.
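
If the dump is long, a small script can pull the permissions out for you. The sketch below assumes the output looks like the entries shown above. The regular expression, the function names, and especially the WATCHLIST are my own illustrative choices: the two permissions in it correspond to the "record audio" and "send text messages" examples from earlier, not to any official list of dangerous permissions.

```python
import re

# Matches an "E: uses-permission" element followed by its android:name attribute.
USES_PERMISSION = re.compile(
    r'E: uses-permission[^\n]*?\s+A: android:name\(0x[0-9a-fA-F]+\)="([^"]+)"'
)

# Hypothetical watchlist for illustration only; adjust it to the app you are reviewing.
WATCHLIST = {
    "android.permission.RECORD_AUDIO",
    "android.permission.SEND_SMS",
}


def extract_permissions(dump_text):
    """Return every permission name requested in an aapt manifest dump."""
    return USES_PERMISSION.findall(dump_text)


def flag_permissions(permissions, watchlist=WATCHLIST):
    """Return the requested permissions that appear on the watchlist."""
    return [name for name in permissions if name in watchlist]


if __name__ == "__main__":
    with open("manifest.txt", encoding="utf-8", errors="replace") as handle:
        requested = extract_permissions(handle.read())
    print("Requested:", requested)
    print("Worth a second look:", flag_permissions(requested))
```

A hit on the watchlist does not prove anything by itself. It just tells you where to start reading.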

Reverse engineering the classes.dex file to source

To reverse engineer the classes.dex file and read it, you'll need a couple of programs, both of which come installed in Santoku. If you are using Santoku Linux, you're good. Otherwise, download and install dex2jar and JD-GUI. Dex2jar is a tool which converts an Android classes.dex file to a Java JAR archive file, and JD-GUI allows you to read the JAR file as Java source.

In the terminal, type the following:
d2j-dex2jar filename.apk
jd-gui filename-dex2jar.jar

The first line creates your JAR file, and the second opens the JAR file in JD-GUI.

In JD-GUI, you'll see how the app source is organized. If you have no Java experience, you'll probably be lost navigating around, but if you have Java experience you'll figure this out quickly. Regardless of your Java experience, reverse engineering app source is a royal pain.

Now let's say you see something like this:

[screenshot from JD-GUI]

or this:

[screenshot from JD-GUI]

The screenshots are from JD-GUI and a major commercial app. Anyone who knows much about programming knows that a, b, c, d, e, and such make terrible class and variable names. Variables and classes should be descriptive. What happened here is that the developers used code obfuscation. As part of the build, a tool goes through the code and renames variables and classes to useless names like a, b, c, and such. They do this as a service to you, just in case you didn't think reverse engineering was already frustrating. When you see obfuscation like this, often your best indications of what is going on are strings and the method names the obfuscator cannot rename because they belong to the platform (like getCacheDir and getAbsolutePath). Obfuscation must not change the functionality of the app, and changing the text of a string would alter functionality, which is why strings usually survive intact.
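
Since strings and platform method names tend to survive obfuscation, it can help to dump them before you start reading code. The sketch below is a rough Python take on the classic strings utility: it scans raw bytes for runs of printable ASCII. It does not parse the DEX string table, so expect some noise and some missed non-ASCII strings. The function name and the sample bytes are made up for illustration.

```python
import re


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [run.decode("ascii") for run in pattern.findall(data)]


if __name__ == "__main__":
    # Made-up bytes standing in for the contents of a classes.dex file.
    sample = b"\x00\x01getCacheDir\x00ab\x00http://example.com/x\x00"
    for text in printable_strings(sample):
        print(text)
```

For a real app, read classes.dex in binary mode and pass the bytes in, then search the results for URLs, file paths, and phone numbers.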


So what strategies do I suggest in reverse engineering source? Honestly, I do not suggest a strategy. I suggest figuring things out and finding what works well for you. It can be frustrating but extremely insightful. In my experience reverse engineering apps, whenever you find something suspicious, you have found a thread, and you keep pulling it until you either rule it out or actually find something malicious. Remember to correlate what you find with the manifest. And as always, if you need help, please comment or reach out to me.

Questions, comments, suggestions, or experiences?  Frustrations related to reverse engineering efforts?  Leave a comment below, or send me an email.
