diff --git a/src/mobile-pentesting/android-app-pentesting/manual-deobfuscation.md b/src/mobile-pentesting/android-app-pentesting/manual-deobfuscation.md
index 6467e61bfaa..4bca1d98367 100644
--- a/src/mobile-pentesting/android-app-pentesting/manual-deobfuscation.md
+++ b/src/mobile-pentesting/android-app-pentesting/manual-deobfuscation.md
@@ -116,6 +116,74 @@ python emulate.py app.apk "Lcom/example/Decryptor;->decrypt" -v --debug --limit
 
 Outputs are the collected return values per invocation; useful for bulk string/config extraction during malware triage or heavily obfuscated apps.
 
+### Offline recovery of staged Android malware payloads
+
+A recurring Android malware pattern is a **small Java stub + stripped JNI loader + high-entropy asset**. If the APK contains a native library with one abnormally large JNI export, encrypted strings, and an `assets/` blob that doesn't match its file extension, you can usually recover the next stage **without executing the sample**.
+
+#### OLLVM-style native XOR string recovery
+
+A common native pattern is a one-time init block that decrypts strings **in place** byte-by-byte:
+
+```c
+if (init_done == 0) {
+    DAT_00142f50 ^= 0xd7;
+    DAT_00142f51 ^= 0xb4;
+    DAT_00142f52 ^= 0xa6;
+    init_done = 1;
+}
+```
+
+Practical workflow:
+- In **Ghidra**, increase the decompiler timeout if one JNI function is tens of kilobytes long and initially fails to decompile.
+- Parse assignments like `DAT_xxxx ^= 0xNN` from the decompiler output to build an **address → XOR key** map.
+- Apply that map to the corresponding bytes from the ELF `.data` / `.rodata` section using Python, LIEF, or raw `readelf` offsets.
+- Recovered strings often expose **asset names, class names, JNI signatures, crypto primitives, and backend URLs** needed to unpack the next stage.
+
+This is especially useful when the native loader decrypts its strings only on the **first invocation**, so the plaintext exists only in process memory and never on disk.
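The "parse, then apply" steps of the workflow above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original change: the helper names `parse_xor_map` and `apply_xor_map` are hypothetical, the regex only covers the exact `DAT_xxxx ^= 0xNN;` form shown in the C snippet, and `base_addr` is assumed to be the address of the first byte of the extracted section in the same address space as the `DAT_` labels (Ghidra may rebase the image, so this can differ from the ELF virtual address).

```python
import re

# Matches decompiler lines of the form "DAT_00142f50 ^= 0xd7;"
XOR_RE = re.compile(r"DAT_([0-9a-fA-F]+)\s*\^=\s*0x([0-9a-fA-F]{1,2})\s*;")


def parse_xor_map(decompiled: str) -> dict:
    """Build an {address: xor_key} map from decompiler text."""
    return {int(addr, 16): int(key, 16) for addr, key in XOR_RE.findall(decompiled)}


def apply_xor_map(blob: bytes, base_addr: int, xor_map: dict) -> bytes:
    """Return a copy of blob with every mapped byte XORed by its key.

    base_addr is the address of blob[0] (assumption: same address space
    as the DAT_ labels). Addresses that fall outside blob are skipped.
    """
    out = bytearray(blob)
    for addr, key in xor_map.items():
        offset = addr - base_addr
        if 0 <= offset < len(out):
            out[offset] ^= key
    return bytes(out)


if __name__ == "__main__":
    # Decompiler text copied from the snippet above.
    text = """
    DAT_00142f50 ^= 0xd7;
    DAT_00142f51 ^= 0xb4;
    DAT_00142f52 ^= 0xa6;
    """
    # Synthetic "encrypted" bytes standing in for the real section dump.
    encrypted = bytes([ord("a") ^ 0xD7, ord("e") ^ 0xB4, ord("s") ^ 0xA6])
    print(apply_xor_map(encrypted, 0x142F50, parse_xor_map(text)))
```

For a real sample, `blob` would be the raw bytes of the `.data` or `.rodata` section and `base_addr` the address at which that section appears in the disassembler. Decompilers can also emit the same operation as `DAT_x = DAT_x ^ 0xNN`, which this regex does not handle.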
+
+#### Rebuilding filename-derived AES asset decryptors
+
+Once native strings reveal constants such as an asset name, `SHA-1`, `SHA-256`, and `AES/CBC/PKCS5Padding`, recreate the decryptor offline and validate the result with file magic:
+
+```python
+# Requires PyCryptodome
+from Crypto.Cipher import AES
+import hashlib
+ct = open('asset.bin','rb').read()
+# Seed string recovered from the native strings (here: an asset name)
+seed = b'asset_name2'
+# Key = first 16 bytes of SHA-1(seed), IV = first 16 bytes of SHA-256(seed)
+key = hashlib.sha1(seed).digest()[:16]
+iv = hashlib.sha256(seed).digest()[:16]
+pt = AES.new(key, AES.MODE_CBC, iv).decrypt(ct)
+# Strip the padding (last byte = pad length); not validated, so check file magic
+pt = pt[:-pt[-1]]
+open('stage2.bin','wb').write(pt)
+```
+
+Triage hints:
+- Test seeds derived from **asset names, parent folder names, or adjacent literals**.
+- If plaintext starts with `PK\x03\x04`, treat it as a **ZIP container** and inspect every entry before focusing on a single `classes.dex`.
+- Reuse the same derivation pattern against **nested assets**; packers frequently keep the same KDF across stages.
+
+#### StringFog and similar DEX string obfuscators
+
+If JADX shows many Base64-looking constants and a helper like `StringFogImpl.decrypt(String)`, extract the key and replay the transform over all candidate strings. One common variant is **Base64 decode + repeating-key XOR**:
+
+```python
+import base64
+def sf(s):
+    # The key is sample-specific; extract it from the decrypt helper
+    key = b'UTF-8'
+    ct = base64.b64decode(s)
+    # Repeating-key XOR over the Base64-decoded bytes
+    return bytes(c ^ key[i % len(key)] for i, c in enumerate(ct)).decode()
+```
+
+Recovered plaintext commonly reveals **Firebase paths, Telegram bot logic, phishing URLs, WebView resources, and operator config** that do not appear anywhere else in the APK.
+
+#### Treat split APKs, loaders, and web assets as one payload chain
+
+After decrypting a staged container:
+- Inspect **`bootstrap.dex` / `installer.dex` / `payload_split*.apk` / HTML assets** together instead of analyzing each file in isolation.
+- Expect **`DexClassLoader`** or equivalent runtime assembly logic even if each split APK looks incomplete on its own.
+- Check both **native code** and **phishing WebView assets** for separate backend URLs; the payment-theft infrastructure may be distinct from the RAT C2.
+- Search config files for **subscription timestamps, HMAC/signature fields, wallet addresses, or miner toggles** because MaaS builders often hide monetization logic in JSON rather than code.
+
 ## References and Further Reading
 
 - [DaliVM: Python Dalvik emulator for static string decryption](https://github.com/fatalSec/DaliVM)
@@ -126,6 +194,7 @@ Outputs are the collected return values per invocation; useful for bulk string/c
   - This talk discusses a series of obfuscation techniques, solely in Java code, that an Android botnet was using to hide its behavior.
 - Deobfuscating Android Apps with Androidmeda (blog post) – [mobile-hacker.com](https://www.mobile-hacker.com/2025/07/22/deobfuscating-android-apps-with-androidmeda-a-smarter-way-to-read-obfuscated-code/)
 - Androidmeda source code – [https://github.com/In3tinct/Androidmeda](https://github.com/In3tinct/Androidmeda)
+- [Fake RTO Challan Checker Part 2: Cracking the Payload, Mapping the Operator, and Why This Is Worse Than I Thought](https://medium.com/@singhbkn07/fake-rto-challan-checker-part-2-cracking-the-payload-mapping-the-operator-and-why-this-is-3eb78e512d7f)
 
 {{#include ../../banners/hacktricks-training.md}}