@@ -116,6 +116,74 @@ python emulate.py app.apk "Lcom/example/Decryptor;->decrypt" -v --debug --limit

Outputs are the collected return values per invocation; useful for bulk string/config extraction during malware triage or heavily obfuscated apps.

### Offline recovery of staged Android malware payloads

A recurring Android malware pattern is a **small Java stub + stripped JNI loader + high-entropy asset**. If the APK contains a native library with one abnormally large JNI export, encrypted strings, and an `assets/` blob that doesn't match its file extension, you can usually recover the next stage **without executing the sample**.

#### OLLVM-style native XOR string recovery

A common native pattern is a one-time init block that decrypts strings **in place** byte-by-byte:

```c
if (init_done == 0) {
    DAT_00142f50 ^= 0xd7;
    DAT_00142f51 ^= 0xb4;
    DAT_00142f52 ^= 0xa6;
    init_done = 1;
}
```

Practical workflow:
- In **Ghidra**, increase the decompiler timeout if one JNI function is tens of kilobytes long and initially fails to decompile.
- Parse assignments like `DAT_xxxx ^= 0xNN` from the decompiler output to build an **address → XOR key** map.
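- Apply that map to the corresponding bytes in the ELF `.data` / `.rodata` section using Python, LIEF, or raw `readelf` offsets. Translate each Ghidra address to a file offset first: subtract the image base Ghidra applied on import, then map the resulting virtual address through the section headers.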
- Recovered strings often expose **asset names, class names, JNI signatures, crypto primitives, and backend URLs** needed to unpack the next stage.
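
The parsing and patching steps above can be sketched in a few lines of Python. This is a minimal illustration, not a complete tool: the function names `parse_xor_map` and `apply_xor_map` are hypothetical, the regex assumes the decompiler emits single-byte statements in exactly the `DAT_xxxx ^= 0xNN` form shown earlier, and the address-to-offset translation is left to a caller-supplied function because it depends on the image base and section layout of the specific library.

```python
import re

# Matches decompiler statements such as "DAT_00142f50 ^= 0xd7;"
XOR_RE = re.compile(r"DAT_([0-9a-fA-F]+)\s*\^=\s*0x([0-9a-fA-F]{1,2})\b")


def parse_xor_map(decompiled):
    """Build {address: key_byte} from decompiler text.

    If the same address is XORed more than once, the keys are combined,
    which matches the net effect of running every statement once.
    """
    xor_map = {}
    for addr_hex, key_hex in XOR_RE.findall(decompiled):
        addr = int(addr_hex, 16)
        xor_map[addr] = xor_map.get(addr, 0) ^ int(key_hex, 16)
    return xor_map


def apply_xor_map(blob, xor_map, addr_to_offset):
    """Return a copy of blob with every mapped byte XORed with its key.

    addr_to_offset is a caller-supplied function (an assumption of this
    sketch) that converts a Ghidra address into an offset inside blob.
    Addresses that fall outside blob are skipped.
    """
    out = bytearray(blob)
    for addr, key in xor_map.items():
        off = addr_to_offset(addr)
        if 0 <= off < len(out):
            out[off] ^= key
    return bytes(out)
```

A typical call passes the saved decompiler text to `parse_xor_map`, reads the library into memory, and supplies a small lambda for the offset translation. Running `strings` over the patched copy should then show the decrypted literals.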

This is especially useful when the native loader decrypts its strings only on the **first invocation**, so the plaintext exists only in process memory and never on disk.

#### Rebuilding filename-derived AES asset decryptors

Once native strings reveal constants such as an asset name, `SHA-1`, `SHA-256`, and `AES/CBC/PKCS5Padding`, recreate the decryptor offline and validate the result with file magic:

```python
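import hashlib

from Crypto.Cipher import AES  # PyCryptodome
from Crypto.Util.Padding import unpad

with open('asset.bin', 'rb') as f:
    ct = f.read()

seed = b'asset_name2'  # seed string recovered from the native loader
key = hashlib.sha1(seed).digest()[:16]   # AES-128 key: first 16 bytes of SHA-1
iv = hashlib.sha256(seed).digest()[:16]  # IV: first 16 bytes of SHA-256
pt = AES.new(key, AES.MODE_CBC, iv).decrypt(ct)
pt = unpad(pt, AES.block_size)  # ValueError here typically means a wrong key

with open('stage2.bin', 'wb') as f:
    f.write(pt)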
```

Triage hints:
- Test seeds derived from **asset names, parent folder names, or adjacent literals**.
- If plaintext starts with `PK\x03\x04`, treat it as a **ZIP container** and inspect every entry before focusing on a single `classes.dex`.
- Reuse the same derivation pattern against **nested assets**; packers frequently keep the same KDF across stages.
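
The seed-guessing and magic-check hints above can be automated with a small loop. The sketch below is an assumption-laden illustration: the helper names (`seed_candidates`, `derive_key_iv`, `classify_magic`, `find_seed`) are hypothetical, the SHA-1/SHA-256 truncation mirrors the derivation shown earlier and may differ per sample, and the `decrypt` argument is a caller-supplied raw block decryptor (for example a thin wrapper around AES-CBC without unpadding), so the sketch itself needs only the standard library.

```python
import hashlib
from pathlib import PurePosixPath

# Leading bytes of formats commonly found in a decrypted next stage.
MAGICS = {
    b"PK\x03\x04": "zip",  # also APK and JAR containers
    b"dex\n": "dex",
    b"\x7fELF": "elf",
}


def classify_magic(data):
    """Return a short format label for known magic bytes, else None."""
    for magic, label in MAGICS.items():
        if data.startswith(magic):
            return label
    return None


def seed_candidates(asset_path, extra_literals=()):
    """Derive candidate seeds from the asset file name, its stem, the
    parent folder name, and any adjacent string literals."""
    p = PurePosixPath(asset_path)
    ordered = []
    for name in (p.name, p.stem, p.parent.name, *extra_literals):
        if name and name not in ordered:
            ordered.append(name)
    return [name.encode() for name in ordered]


def derive_key_iv(seed):
    """Key/IV derivation assumed from the earlier example."""
    return hashlib.sha1(seed).digest()[:16], hashlib.sha256(seed).digest()[:16]


def find_seed(ciphertext, seeds, decrypt):
    """Try each seed; return (seed, label) for the first known magic.

    decrypt(key, iv, data) must return raw plaintext without unpadding.
    Only the first 16-byte block is decrypted, which is enough to check
    the magic under CBC.
    """
    for seed in seeds:
        key, iv = derive_key_iv(seed)
        label = classify_magic(decrypt(key, iv, ciphertext[:16]))
        if label:
            return seed, label
    return None
```

Checking only the first block keeps each guess cheap, so a long list of adjacent literals can be tested before committing to a full decrypt with the winning seed.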

#### StringFog and similar DEX string obfuscators

If JADX shows many Base64-looking constants and a helper like `StringFogImpl.decrypt(String)`, extract the key and replay the transform over all candidate strings. One common variant is **Base64 decode + repeating-key XOR**:

```python
import base64
def sf(s):
    key = b'UTF-8'  # sample-specific; use the key extracted from the helper
    ct = base64.b64decode(s)
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(ct)).decode()
```

Recovered plaintext commonly reveals **Firebase paths, Telegram bot logic, phishing URLs, WebView resources, and operator config** that do not appear anywhere else in the APK.
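
To replay the transform over every candidate string rather than one at a time, the decompiled sources can be scanned for Base64-looking literals. The following is a rough sketch under stated assumptions: `decode_candidates` and `xor_b64` are hypothetical names, the regex only catches double-quoted literals of eight or more Base64 characters, and the Base64 plus repeating-key XOR scheme is the variant described above, not every StringFog build.

```python
import base64
import binascii
import re

# Double-quoted literals made only of Base64 characters, 8+ chars long.
B64_LITERAL = re.compile(r'"([A-Za-z0-9+/]{8,}={0,2})"')


def xor_b64(s, key):
    """Base64-decode s, then XOR it with a repeating key."""
    ct = base64.b64decode(s, validate=True)
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(ct))


def decode_candidates(source, key):
    """Return {literal: plaintext} for literals that decode cleanly.

    Literals that are not valid Base64, or that do not yield printable
    UTF-8 after the XOR, are treated as false positives and skipped.
    """
    results = {}
    for literal in B64_LITERAL.findall(source):
        try:
            plaintext = xor_b64(literal, key).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue
        if plaintext.isprintable():
            results[literal] = plaintext
    return results
```

Feeding it the concatenated text of a JADX export gives a literal-to-plaintext table that can be grepped for URLs or substituted back into the sources. The printable-text filter is a heuristic and will drop legitimately binary strings.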

#### Treat split APKs, loaders, and web assets as one payload chain

After decrypting a staged container:
- Inspect **`bootstrap.dex` / `installer.dex` / `payload_split*.apk` / HTML assets** together instead of analyzing each file in isolation.
- Expect **`DexClassLoader`** or equivalent runtime assembly logic even if each split APK looks incomplete on its own.
- Check both **native code** and **phishing WebView assets** for separate backend URLs; the payment-theft infrastructure may be distinct from the RAT C2.
- Search config files for **subscription timestamps, HMAC/signature fields, wallet addresses, or miner toggles** because MaaS builders often hide monetization logic in JSON rather than code.
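
For the config search in the last item, a recursive walk over parsed JSON is usually enough to surface monetization fields. This sketch is illustrative only: `flag_config_fields` is a hypothetical helper, and the keyword list is an assumption to be tuned per family, not a list taken from any specific builder.

```python
import re

# Assumed keyword fragments for licensing and monetization fields.
INTERESTING = re.compile(r"(expir|subscr|hmac|sig|wallet|miner|licen)", re.I)


def flag_config_fields(node, path=""):
    """Walk parsed JSON and return [(path, value)] for every dict key
    whose name matches INTERESTING, including nested dicts and lists."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if INTERESTING.search(str(key)):
                hits.append((child, value))
            hits.extend(flag_config_fields(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(flag_config_fields(value, f"{path}[{index}]"))
    return hits
```

Load each recovered config with `json.load` and pass the result in. The returned paths show where in the structure each flagged field lives, which helps when comparing configs across builds of the same kit.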

## References and Further Reading

- [DaliVM: Python Dalvik emulator for static string decryption](https://github.com/fatalSec/DaliVM)
@@ -126,6 +194,7 @@ Outputs are the collected return values per invocation; useful for bulk string/c
- This talk discusses a series of obfuscation techniques, solely in Java code, that an Android botnet was using to hide its behavior.
- Deobfuscating Android Apps with Androidmeda (blog post) – [mobile-hacker.com](https://www.mobile-hacker.com/2025/07/22/deobfuscating-android-apps-with-androidmeda-a-smarter-way-to-read-obfuscated-code/)
- Androidmeda source code – [https://github.com/In3tinct/Androidmeda](https://github.com/In3tinct/Androidmeda)
- [Fake RTO Challan Checker Part 2: Cracking the Payload, Mapping the Operator, and Why This Is Worse Than I Thought](https://medium.com/@singhbkn07/fake-rto-challan-checker-part-2-cracking-the-payload-mapping-the-operator-and-why-this-is-3eb78e512d7f)

{{#include ../../banners/hacktricks-training.md}}
