Decompilation – Page 3 – JEB in Action

JEB 2.3 and MIPS Decompilation

We recently released our latest decompiler for MIPS 32-bit binary code. It is the first interactive decompiler in a series of native code analysis modules that will be released this year with JEB 2.3.

If you haven’t done so: feel free to download the demo, or if you own a Pro or Embedded license, ask for the beta 2.3 build.

The 2.3 branch contains tons of under-the-hood updates, required to power the decompilation modules — as well as the future advanced static and dynamic analysis modules that we have on our roadmap. Changes such as:

A generic code parsing framework for interactive disassembly and analysis of code objects.
A generic decompilation framework using a custom Intermediate Representation as well as a partially-customizable decompilation pipeline.
API additions to allow third-party to develop things as simple as instrumentation tools for the decompilers, or as complex as IR refining plugins to thwart custom obfuscation.

MIPS is the first native decompiler we made publicly available, and while the beta can be a bit rough around the edges, we believe it will be of a tremendous help to any reverser pouring though lines of embedded firmware or application code.

Decompiling MIPS

MIPS programs exhibit a level of complexity that experienced reverse-engineers may feel overwhelmed or unprepared to deal with. Unlike well understood and well practiced x86, even the simplest of operations do not seem to “stand out” in a MIPS disassembly. Not to mention other intricacies inherent to a RISC instruction set, such as unaligned reads and writes; or counter-intuitive idioms closely tight to the MIPS architecture itself, such as the branch delay slots.

Have a look at this “trivial” piece of code:

A trivial, yet “unreadable” chunk of MIPS code.

If you’ve never reversed MIPS code, you may experience a temporary brain-freeze moment. This code contains typical MIPS idioms:

$gp building for globals access
convoluted arithmetic, usually 16-bit based
delay-slots (unused here)

The pseudo code is simply this:

 for(i = 0; i < 64; ++i) {
    array[i] = i;
 }

Here is the full routine disassembly and decompiled code:

Unannotated decompiled code. Note the presence of canary-checking code introduced by the compiler.

How about something more complex. Below, you will find the raw decompilation (non annotated) of the domain-generating algorithm used in Mirai:

DGA of Mirai for MIPS; decompiled output is unannotated.

JEB allows you to set types, prototypes, rename, comment, create custom data structures, etc. in order to clean up the disassembly and the pseudo-code.

Augmented Disassembly

Not everything warrants use of a decompiler. Navigating the disassembly to get the overall sense of a piece of code is a common task. Unfortunately, raw MIPS assembly can be tricky to read for a bunch of reasons. The main problem lies in the presence of memory access relative to the dynamically computed $gp register. Those non-static references prevent straightforward determination of callsites or data references (eg, string references).

What is the target callsite of the JALR (=call subroutine) instruction?

In order to resolve those references and produce readable assembly code, disassemblers have several strategies. The cheapest one is to resort to pattern matching or instruction(s) matching and make inference ¹. This strategy can provide fast results, however, it is extremely limiting, and would perform poorly on non-standard or obfuscated code.

The strategy used by our code analyzer is to emulate the intermediate representation (IR) resulting from the conversion of the MIPS code. That controlled simulation is fast and allows the resolution of the most complex cases. Currently, the results are shown in the assembly comments.

See the examples below:

Advanced analysis resolving a target branch site held in $t9.

Advanced analysis resolving pre- and post-execution register data.

Type libraries and Syscalls

JEB 2.3 ships with type libraries for several platforms, including the GNU Linux API for Linux MIPS 32-bit systems. Soon we will also release the signatures of common libraries.

Combined with the advanced analyzer, the controlled simulation step described above also resolves MIPS system calls. The resolution of $v0, holding the syscall number, is resolved during simulation – therefore handling complex obfuscation cases; under the hood, a virtual method reference is created to represent the syscall as a standard routine. See the example below:

Syscall #4013 (time) resolved during the advanced analysis phase.

Conclusion

We presented some of the most interesting features of our new MIPS decompiler specifically, and more generally, JEB 2.3. It is still in beta mode and actively developed, feel free to try it out and let us know your feedback. Other decompilers will be released in the coming weeks/months, stay tuned.

For example, prologues such as “lui $gp / addiu / addu” are common and could be looked for statically. ↩

Red October Malware for Android

Blue Coat Systems recently released a paper about the Inception APT (also dubbed Cloud Atlas, it may be connected to the Red October APT). One component of this APT is an Android trojan, masquerading as a Whatsapp update package. It is able to record audio calls, as well as gather, encrypt and exfiltrate user information.

The 4 strings partially written in Hindi that have been speculated on are those:

For researchers wanting to have a peak inside the APK, we are providing JEB decompiled Java code for one such sample.

Download is here: cloudatlas-android-malware-decompiled.zip

FinFisher FinSpy Mobile app for Android decompiled

The fully decompiled code and assets of 421and.apk can be found here: FinSpyMobileAndroid-decompiled.zip (no password).

This particular APK, although not the latest, is not obfuscated and easily reveals most capabilities of the malware:

Location tracker
Information stealer (calendar, contact list, text messages, Whatsapp databases, etc.)
Remotely controlled through encrypted communication over SMS and data

A great recap of the full story can be read on Netzpolitik. Real time updates are on Twitter.

Using the AST Tagging API

JEB version 1.5.201404100 introduces new methods to the AST IElement objects, attachTag() and retrieveTag(). These methods allow an API user to tag elements of Abstract Syntax Trees. When a tagged tree is rendered (that is, when decompiled Java code is being generated), tags are processed and provided to the user alongside the decompiled code, with associated text coordinates (line, column). Within the API documentation, a “located tag” is referred to as a mark.

One example use case: Tagging nodes of an AST can be useful if the yielded source code is of specific interest, and potentially require follow-up human analysis.

The example below shows how one can navigate a Class tree, looking for specific calls to findViewById:

def processTree(e):
  if isinstance(e, Call) and e.getMethod().getName() == 'findViewById' and ... :
    print 'Tagging Call element:', e #e.getMethod().getName()
    e.attachTag('testTag', 'Calling interesting findViewById')
  if e:
    # recursively process sub-elements
    for e1 in e.getSubElements():
      processTree(e1)

sig = ...
ast = self.jeb.getDecompiledClassTree(sig)  # assume the class was decompiled
processTree(e)

The Class tree can be rendered by calling the newly introduced overloaded decompile(sig, is_class, regenerate, marks) method:

marks = []
decompiled_class = self.jeb.decompile(sig, True, False, marks)
print marks

Remember to set regenerate to False since you want to avoid re-decompilation (doing so would generate a new, tag-less AST).

The marks array will contain the precise locations (lines and columns) of each tag within the decompiled_class text buffer.

Hopefully, this simplistic example showed you how to use the new AST tagging methods. Happy reversing and code analysis.

Decompiled Java code for Android MisoSMS

Yesterday was eventful on the Android malware front. After Mouabad reported by Lookout, FireEye reported MisoSMS. It might also have been reported by Sophos at roughly the same time.

The malicious application is used in several campaigns to steal SMS and send them to China, according to FireEye’s blog post.

Many of you would like to examine and study its code, that’s why I uploaded an archive with the source code decompiled by JEB 1.4, as well as a cleaned-up manifest. Link: MisoSMS_JEB_decomp_20131217

Decompiling Android Mouabad

Lookout has an interesting article about Android Mouabad. Yet another Korean SMS malware!

The APK fully decompiled by JEB 1.4 can be found here: mouabad_JEB_decomp_20131217.zip. I haven’t refactored or commented the code, these are raw decompiled classes.

Sample MD5 68DF97CD5FB2A54B135B5A5071AE11CF is available on Contagio.

Android Skullkey decompiled sources

Want to have a look at Android Skullkey?

I uploaded the Java code decompiled with JEB 1.4 here: skullkey_JEB_decomp_20131205.zip
The interesting bits are in com.hk515. Have fun!

Sample MD5 48c3903b3e91f4c5fbc65b640647f7d7 is available on Contagio.

Decompiled Java Code Manipulation using JEB API – Part 3: Defeating Reflection

This is the third and final part of our series of blogs showing how to use JEB’s API to manipulate decompiled Java syntax trees. (See Part 1, Part 2.)

Today, we will show how to leverage the API to defeat the main obfuscation scheme used to protect the infamous OBad trojan: generalized reflection.

Want a visual TL;DR? Have a look at the following demo video.

Download the scripts: 1, 2, 3
Demo video

Reflection is a powerful feature used by interpreted languages to query information about what is currently loaded and executed by the virtual machine. Using reflection, a program can find references to classes and items within classes, and act on those items, for example, execute methods.

Example:

java.lang.System.exit(3);

can be rewritten to:

java.lang.Class.forName("java.lang.System")
        .getMethod("exit", Integer.TYPE).invoke(null, 3);

Combined with string encryption, reflection can seriously bloat and obscure a program, slowing down the reverse-engineering process.

Based on the above code example, it appears that a solution to remove reflection code automatically or semi-automatically is both useful and realistic. Let’s see how we can to achieve this goal using JEB’s AST API.

As said in the introduction, consider the Android trojan OBad. The video shows the use of a preliminary script to decrypt the scripts. Refer to the script ObadDecrypt.py linked above; we won’t cover the details of string decryption here, as they have been covered in a previous blog.

After string decryption, OclIIOlC.onCreate looks like the following piece of Java code:

    protected void onCreate(Bundle arg9) {
        Object v1_2;
        Object v9;
        super.onCreate(arg9);
        if(Class.forName("android.os.Build").getField("MODEL").get(null).equals(new String(CIOIIolc.IoOoOIOI(CIOIIolc.cOIcOOo(IoOoOIOI.cOIcOOo("cmA4"), CIOIIolc.cOIcOOo("ÞÜëÇØÚâØÞÜÄØåØÞÜéê×èêÉÛè".getBytes())))))) {
            System.exit(0);
        }

        Context v0 = OcooIclc.cOIcOOo;
        Class v2 = MainService.class;
        try {
            v9 = Class.forName("android.content.Intent").getDeclaredConstructor(Class.forName("android.content.Context"), Class.class).newInstance(v0, v2);
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        try {
            Class.forName("android.content.Intent").getMethod("addFlags", Integer.TYPE).invoke(v9, Integer.valueOf(268435456));
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        Context v1_1 = OcooIclc.cOIcOOo;
        try {
            Class.forName("android.content.Context").getMethod("startService", Class.forName("android.content.Intent")).invoke(v1_1, v9);
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        v1_1 = this.getApplicationContext();
        try {
            v1_2 = Class.forName("android.content.Context").getMethod("getPackageManager", null).invoke(v1_1, null);
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        ComponentName v0_2 = this.getComponentName();
        try {
            Class.forName("android.content.pm.PackageManager").getMethod("setComponentEnabledSetting", Class.forName("android.content.ComponentName"), Integer.TYPE, Integer.TYPE).invoke(v1_2, v0_2, Integer.valueOf(2), Integer.valueOf(1));
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        this.finish();
    }

Let’s remove reflection. The steps are:

Recursively walk the AST and check the statements
Note: we could enumerate all elements instead, however, the script is intended as a demo, not an exhaustive solution to this problem
Look for Call statements (eg, foo()), or Assignments whose right part is a Call (eg, r = bar())
Check if the Call matches the following nested Call pattern:
Class.forName(…).getMethod(…).invoke(…)
Note: the script does not take care of class instantiation or field access through reflection, for the same reasons stated above.
Extract the reflection information:
1. Fully-qualified name of the class
2. Method name and prototype
3. Invocation arguments
Create a new method, register it to the DEX object
Create a Call element that accurately mirrors the reflection calls
Note: There are limitations, especially considering the return type
Finally, replace the reflection call by the direct method call

A deobfuscated / “unreflected” version of the source code above looks like:

    protected void onCreate(Bundle arg9) {
        // ...
        // ...

        try {
            v9.addFlags(Integer.valueOf(268435456));
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        Context v1_1 = OcooIclc.cOIcOOo;
        try {
            v1_1.startService(v9);
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        v1_1 = this.getApplicationContext();
        try {
            v1_2 = v1_1.getPackageManager();
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        ComponentName v0_2 = this.getComponentName();
        try {
            v1_2.setComponentEnabledSetting(v0_2, Integer.valueOf(2),
                    Integer.valueOf(1));
        }
        catch(Throwable v0_1) {
            throw v0_1.getCause();
        }

        this.finish();
    }

We can now remove the try-catchall. This is what the third script is doing.

At this stage, the value and potential of the AST API package should have been clearly demonstrated. We encourage users of JEB to experiment with it, tweak our sample scripts, create their own, and eventually contribute by sending us their useful deobfuscation and/or optimization scripts. We will be happy to make them available to our user base through the Resources page.

Decompiled Java Code Manipulation using JEB API – Part 2: Decrypting Strings

This is part 2 of our series of blogs showing how to use JEB’s API to manipulate decompiled Java syntax trees. (Missed Part 1?)

Let’s see how the API can be leveraged to decrypt strings, and plug the decrypted strings back into the source code.

Download the script
Demo video

As shown in the video, we are going to focus on a protected version of Cyanide. The strings are encrypted, and that the decompiled Java code does not look pretty:

    ...
    protected void onCreate(Bundle arg5) {
        super.onCreate(arg5);
        this.setContentView(2130903040);
        if(new File(MainActivity.鷭(-387, -15, 608)).exists()) {
            MainActivity.鷭(MainActivity.鷭(-389, 52, 159));
            MainActivity.鷭(MainActivity.鷭(-333, 37, 17));
            MainActivity.鷭(MainActivity.鷭(-407, 53, 629),
                    MainActivity.鷭(-395, -15, 0), this);
            MainActivity.鷭(MainActivity.鷭(-398, 53, 92),
                    MainActivity.鷭(-386, -15, 586), this);
            MainActivity.鷭(MainActivity.鷭(-402, 52, 102),
                    MainActivity.鷭(-378, -15, 665), this);
            MainActivity.鷭(MainActivity.鷭(-368, 37, 119));
            MainActivity.鷭(this);
            return;
        }

        ...
        ...

MainActivity.鷭(x, y, z) is the decryptor method. The parameters indirectly reference a static array of bytes, that contains the encrypted strings for the class.

Our script is going to do the following:

Search the encrypted byte array
1. Enumerate the fields of the class
2. Look for a byte[] field marked private static final
3. Verify that this field is referenced in <clinit>, the static {…} initializer for the class
4. The field should also be referenced in another method: the decryptor
Check the structure of <clinit>
1. It should look like: encrypted_strings = new byte[]{………}
2. Retrieve the encrypted bytes
The decryptor was analyzed in a previous blog post
1. The decryptor constants need to be extracted manually (let’s keep the script simple)
Then, for every method of the class, we will:
1. Enumerate the statements and sub-elements of the AST recursively
2. Look for Call elements
3. If the Call matches the decryptor method, we extract the argument provided to the Call
4. We use these arguments to decrypt the string
5. Finally, we replace the Call by a newly created Constant element that represent the decrypted string

(Note: The JEB python script is just a little over 100 lines, and took less than 1 hour to write. It could be greatly improved, for instance, the decryptor constants could be found programmatically, but this added complexity is out of the scope of this introductory blog post.)

Here what the deobfuscated code snippet looks like:

    ...
    protected void onCreate(Bundle arg5) {
        super.onCreate(arg5);
        this.setContentView(2130903040);
        if(new File("/data/last_alog/onboot").exists()) {
            MainActivity.鷭("rm /data/last_alog/*");
            MainActivity.鷭("cat /system/etc/install-recovery.sh > /system/etc/install-recovery.sh.backup");
            MainActivity.鷭("su", "/system/etc/su", this);
            MainActivity.鷭("supersu.apk", "/system/etc/supersu.apk", this);
            MainActivity.鷭("root.sh", "/system/etc/install-recovery.sh", this);
            MainActivity.鷭("chmod 755 /system/etc/install-recovery.sh");
            MainActivity.鷭(this);
            return;
        }
        ...
        ...

In part 3, we will show how to defeat a complex obfuscation scheme used by many bytecode protectors: reflection.

Decompiled Java Code Manipulation using JEB API – Part 1: Removing Junk Code

This is the first post of a 3-part blogs series that will demonstrate the new features of JEB’s jeb.api.ast API package.

Classes of this package allow read and write access on the Abstract Syntax Trees (AST) of the decompiled Java code produced by JEB. In a nutshell, it allows power users to implement complex deobfuscation schemes or their own optimizations strategies.

Download the script
Demo video

Let’s jump straight to the crux of the matter: this piece of code has been obfuscating by a well-known Android/Java protector software:

/*.method public constructor (Context, AttributeSet, I)V
          .registers 8
          const/4                 v3, 0x1
          const/4                 v2, 0x0
          invoke-direct           View-&amp;amp;gt;(Context, AttributeSet, I)V, p0, p1, p2, p3
          new-instance            v0, Paint
:E
          packed-switch           v3, :90
:14
          packed-switch           v2, :A0
:1A
          goto                    :14
:1C
          invoke-direct           Paint-&amp;amp;gt;()V, v0
          iput-object             v0, p0, TileView-&amp;amp;gt;b0431бб0431б0431:Paint
:26
          packed-switch           v2, :B0
:2C
          packed-switch           v2, :C0
:32
          goto                    :2C
:34
          sget-object             v0, xxxkkk$xkkxkk-&amp;amp;gt;b04310431б0431б0431:[I
          invoke-virtual          Context-&amp;amp;gt;obtainStyledAttributes(AttributeSet, [I)TypedArray, p1, p2, v0
          move-result-object      v0
          const/16                v1, 0xC
          invoke-virtual          TypedArray-&amp;amp;gt;getInt(I, I)I, v0, v2, v1
          move-result             v1
:4C
          packed-switch           v2, : D0
:52
          packed-switch           v2, :E0
:58
          goto                    :52
:5A
          packed-switch           v2, :F0
:60
          packed-switch           v3, :FC
:66
          packed-switch           v3, :10C
:6C
          goto                    :66
:6E
          packed-switch           v2, :11C
:74
          packed-switch           v3, :12C
:7A
          goto                    :74
:7C
          packed-switch           v2, :13C
:82
          sput                    v1, TileView-&amp;amp;gt;bб0431ббб0431:I
          invoke-virtual          TypedArray-&amp;amp;gt;recycle()V, v0
          return-void
          .packed-switch 0x0
              :E
              :1C
          .end packed-switch
          .packed-switch 0x0
              :1C
              :E
          .end packed-switch
          .packed-switch 0x0
              :34
              :26
          .end packed-switch
          .packed-switch 0x0
              :34
              :26
          .end packed-switch
          .packed-switch 0x0
              :5A
              :4C
          .end packed-switch
          .packed-switch 0x0
              :5A
              :4C
          .end packed-switch
          .packed-switch 0x0
              :60
          .end packed-switch
          .packed-switch 0x0
              :4C
              :6E
          .end packed-switch
          .packed-switch 0x0
              :4C
              :6E
          .end packed-switch
          .packed-switch 0x0
              :7C
              :4C
          .end packed-switch
          .packed-switch 0x0
              :4C
              :7C
          .end packed-switch
          .packed-switch 0x0
              :82
          .end packed-switch
.end method*/

JEB decompiles this Dalvik bytecode to the following Java code:
    public TileView(Context arg5, AttributeSet arg6, int arg7) {
        super(arg5, arg6, arg7);
    label_3:
        switch(1) {
            case 0: {
                goto label_3;
            }
            case 1: {
                goto label_6;
            }
        }

        while(true) {
            switch(0) {
                case 0: {
                    goto label_6;
                }
                case 1: {
                    goto label_3;
                }
            }
        }

    label_6:
        this.b0431бб0431б0431 = new Paint();
    label_8:
        switch(0) {
            case 0: {
                goto label_11;
            }
            case 1: {
                goto label_8;
            }
        }

        while(true) {
            switch(0) {
                case 0: {
                    goto label_11;
                }
                case 1: {
                    goto label_8;
                }
            }
        }

    label_11:
        TypedArray v0 = arg5.obtainStyledAttributes(arg6,
                xkkxkk.b04310431б0431б0431);
        int v1 = v0.getInt(0, 12);
    label_15:
        switch(0) {
            case 0: {
                goto label_18;
            }
            case 1: {
                goto label_15;
            }
        }

        while(true) {
            switch(0) {
                case 0: {
                    goto label_18;
                }
                case 1: {
                    goto label_15;
                }
            }
        }

    label_18:
        switch(1) {
            case 0: {
                goto label_15;
            }
            case 1: {
                goto label_21;
            }
        }

        while(true) {
            switch(1) {
                case 0: {
                    goto label_15;
                }
                case 1: {
                    goto label_21;
                }
            }
        }

    label_21:
        switch(0) {
            case 0: {
                goto label_24;
            }
            case 1: {
                goto label_15;
            }
        }

        while(true) {
            switch(1) {
                case 0: {
                    goto label_15;
                }
                case 1: {
                    goto label_24;
                }
            }
        }

    label_24:
        TileView.bб0431ббб0431 = v1;
        v0.recycle();
    }

As one can see, dummy switches as well as pseudo-infinite loops have been inserted within the original Java code, in order to produce flow obfuscation. Using the AST API, we’re going to implement a JEB plugin that cleans the obfuscated code.

A dummy switch construct looks like the following:
switch(X) {
case X:
  goto next;
case Y:
  goto label_fake1:
case Z:
  goto label_fake2:
}
...
next:

The above piece of code is equivalent to:
goto next;
...
next:

Which can be reduced to a single label:
next:

In order to find the dummy switches, the JEB script is going to do the following:

Recursively enumerate the statements of a method body, looking for SwitchStm elements
Check that the switch is a dummy switch:

The switched expression must be a Constant
The case block associated with that constant must start with a Goto statement


Replace the switch by the goto
Find the first Label that follows the (now replaced) switch
If a label is found, is at the same block level as the switch, and is the label pointed to by the goto that replaced the switch, then all the expressions between the goto and the label can be discarded
Finally, apply standard JEB optimizations that remove the remaining useless gotos and labels

This algorithm fits in a less than a 100-line Python script. Download the script, experiment with it, and get accustomed to the API.
The cleaned-up code is this very simple, more readable method:
    public TileView(Context arg5, AttributeSet arg6, int arg7) {
        super(arg5, arg6, arg7);
        this.b0431бб0431б0431 = new Paint();
        TypedArray v0 = arg5.obtainStyledAttributes(arg6,
                xkkxkk.b04310431б0431б0431);
        int v1 = v0.getInt(0, 12);
        TileView.bб0431ббб0431 = v1;
        v0.recycle();
    }

In Part Deux, we will show how the AST API can be leveraged to decrypt encrypted strings of a protected piece of code.