Android Malware
Source Code Analysis
Ramón Costales
Juan Tapiador
Motivation & Objectives
Is Android malware production more complex and becoming an industry?
Manually collect and analyze samples to obtain:
▪ Dataset insights
▪ Code size
▪ Code quality
▪ Development costs
▪ Comparison
Dataset
Malware Types, Tags, No. Samples, Permissions,
Capabilities, VirusTotal Detections
Acquisition & Analysis
Acquisition
01. GitHub Searches
02. Underground Forums
03. Malware Databases: vx-underground, theZoo, sppen…
04. Following web search links
Manual Analysis
97 Samples
3,538,683 SLOCs
Malware Types
Malware Type              No. Malware Samples
RAT                       31
Spyware                   13
Trojan-Spy                9
Keylogger                 8
Trojan-Banker             7
Rootkit                   5
Locker                    4
Ransomware                4
Phishing                  3
Trojan-SMS                3
Dropper                   2
Trojan-Backdoor           2
Backdoor                  1
Downloader                1
Password-Stealing-Ware    1
Scareware                 1
Trojan                    1
Trojan-Wiper              1
Malware Tags
Malware Tag                  No. Malware Samples
Spyware                      72
Botnet                       60
Backdoor                     44
C2                           44
Billing-Fraud                40
Trojan                       35
RAT                          34
Downloader                   31
Elevated-Privilege-Abuse     27
Locker                       19
Keylogger                    17
Mailfinder                   12
Wiper                        12
Password-Stealing-Ware       11
Phishing                     9
Encryption-Ransomware        8
Screen-Locking-Ransomware    8
Overlay                      7
Samples By Year
Malware Permissions
Permission                No. Malware Samples
INTERNET                  66
RECEIVE_BOOT_COMPLETED    55
WRITE_EXTERNAL_STORAGE    51
READ_SMS                  47
ACCESS_NETWORK_STATE      45
READ_PHONE_STATE          44
READ_CONTACTS             44
SEND_SMS                  39
WAKE_LOCK                 35
ACCESS_FINE_LOCATION      35
RECEIVE_SMS               33
RECORD_AUDIO              31
READ_EXTERNAL_STORAGE     31
CAMERA                    31
CALL_PHONE                26
READ_CALL_LOG             24
ACCESS_COARSE_LOCATION    21
SYSTEM_ALERT_WINDOW       20
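The slides do not say how the per-permission counts were tallied. The following is a minimal sketch of one way to do it, assuming each sample's source tree contains a plain-text AndroidManifest.xml. The function names `permissions_in_manifest` and `count_permissions` are hypothetical and only illustrate the idea; they are not the authors' tooling.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Namespace used by the android:name attribute in manifest files.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def permissions_in_manifest(manifest_xml):
    """Return the set of short permission names requested by one manifest."""
    root = ET.fromstring(manifest_xml)
    names = set()
    for node in root.iter("uses-permission"):
        full_name = node.get("{" + ANDROID_NS + "}name", "")
        if full_name:
            # "android.permission.READ_SMS" -> "READ_SMS"
            names.add(full_name.rsplit(".", 1)[-1])
    return names


def count_permissions(manifests):
    """Count in how many manifests (samples) each permission appears."""
    counts = Counter()
    for manifest_xml in manifests:
        # A set is used so a permission declared twice counts once per sample.
        counts.update(permissions_in_manifest(manifest_xml))
    return counts
```

Calling `count_permissions` on a list of manifest strings and then `most_common()` on the result would give a ranking in the same shape as the table above.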
Malware Capabilities
01 Steal Information
▪ Upload and List Files
▪ List Installed Apps
▪ Get Tasks
▪ Input Capture
▪ Screenshot
▪ Read SMS
▪ Read Contacts
▪ Camera
02 Control the Device
▪ Download and Delete Files
▪ Install, Uninstall and Open Apps
▪ Encrypt and Decrypt Files
▪ Lock the Device
▪ Hide the App Icon
▪ Remote Shell
▪ Draw Over Other Apps
▪ Make Phone Calls
VirusTotal Detections By Year
Rank  Malware Type     Sample               Detections
1     Locker           andr0id_l0cker.MX    54.1%
2     Trojan-Banker    Cerberus.d           54.1%
3     RAT              AndroRAT             53.2%
Average: 8.026%
VirusTotal Detections By Type
Code Size
No. Files, SLOCs, No. Functions, Languages
Files By Year
Rank  Malware Type    Sample                 No. Files
1     RAT             AhMyth                 6,024
2     Backdoor        Bootloader-Backdoor    3,700
3     RAT             Arbitrium              2,467
Average: 256.74
Files By Type
SLOCs By Year
Rank  Malware Type     Sample                 SLOCs
1     Backdoor         Bootloader-Backdoor    1,037,561
2     RAT              AhMyth                 761,610
3     Trojan-Banker    Cerberus               220,173
Average: 36,481.27
SLOCs By Type
Functions By Year
Rank  Malware Type    Sample           No. Functions
1     Ransomware      Covid-Locker     5,846
2     Keylogger       Lokiboard-mod    1,350
3     Keylogger       Lokiboard        1,283
Average: 233.85
Avg. Functions Per File
Rank  Malware Type    Sample            Avg. Functions Per File
1     RAT             BetterAndroRAT    26.20
2     RAT             Dendroid          24.09
3     RAT             rdroid            23.14
Average: 7.33
Programming Languages
Rank  Malware Type    Sample                 Languages
1     Backdoor        Bootloader-Backdoor    51
2     RAT             AhMyth                 29
3     RAT             Arbitrium              20
Average: 8.75
Development Costs
Effort, Development Time, Team Size
Effort
Rank  Malware Type     Sample                 Effort
1     Backdoor         Bootloader-Backdoor    3,523.92
2     RAT              AhMyth                 2,547.01
3     Trojan-Banker    Cerberus               692.01
Average: 115.98
Development Time
Rank  Malware Type     Sample                 Development Time
1     Backdoor         Bootloader-Backdoor    55.69
2     RAT              AhMyth                 49.23
3     Trojan-Banker    Cerberus               30
Average: 8.32
Team Size
Rank  Malware Type     Sample                 Team Size
1     Backdoor         Bootloader-Backdoor    63.27
2     RAT              AhMyth                 51.74
3     Trojan-Banker    Cerberus               23.06
Average: 4.31
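The slides do not name the cost model behind these estimates. The reported figures are numerically consistent with the Basic COCOMO model in organic mode (effort = 2.4 · KLOC^1.05, time = 2.5 · effort^0.38, team = effort / time), so the sketch below uses that model as an assumption. Under that assumption, effort would be in person-months, development time in months, and team size in persons. The function name `basic_cocomo_organic` is hypothetical.

```python
def basic_cocomo_organic(sloc):
    """Estimate (effort, months, team) from SLOC.

    Assumes Basic COCOMO, organic mode; the coefficients are the
    textbook values for that mode, not taken from the slides.
    """
    kloc = sloc / 1000.0
    effort = 2.4 * kloc ** 1.05    # person-months
    months = 2.5 * effort ** 0.38  # calendar months
    team = effort / months         # average number of developers
    return effort, months, team


if __name__ == "__main__":
    # SLOC counts taken from the "SLOCs By Year" slide.
    for name, sloc in [("Bootloader-Backdoor", 1_037_561),
                       ("AhMyth", 761_610),
                       ("Cerberus", 220_173)]:
        effort, months, team = basic_cocomo_organic(sloc)
        print(f"{name}: effort={effort:.2f} time={months:.2f} team={team:.2f}")
```

Running this on the three largest samples gives values that agree with the tables above to within rounding, which supports (but does not prove) the assumption about the model used.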
Code Quality
Complexity, Maintainability, Density of Comments
Complexity By Values
Complexity By Year
Rank  Malware Type       Sample        Complexity
1     Rootkit            Adore         8
2     Ransomware         SARA          6
3     Trojan-Backdoor    DarkSilent    4
Average: 2.29
Maintainability By Values
Maintainability By Year
Rank  Malware Type    Sample          Maintainability
1     Trojan          FakeFacebook    92.036
2     Trojan-SMS      MalRecipe       72.904
3     Rootkit         Adore           70.559
Average: 48.60
Density of Comments
Rank  Malware Type    Sample                 Density of Comments (%)
1     Trojan-Spy      Flashlight             81.597
2     RAT             AndroidSurveillance    72.194
3     RAT             Gypte                  65.265
Average: 14.81%
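The slides do not define how density of comments was computed. A minimal sketch follows, assuming it means comment lines as a percentage of comment plus code lines, for Java-style sources. The classifier is deliberately simplistic (it ignores comment markers inside string literals and trailing comments on code lines), and the function name `comment_density` is hypothetical.

```python
def comment_density(source):
    """Return comment lines as a percentage of comment + code lines.

    Assumed definition, not taken from the slides. Blank lines are
    ignored; a line counts as a comment only if it starts with a
    comment marker or lies inside a /* ... */ block.
    """
    code = comments = 0
    in_block = False
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither code nor comment
        if in_block:
            comments += 1
            if "*/" in line:
                in_block = False
        elif line.startswith("//"):
            comments += 1
        elif line.startswith("/*"):
            comments += 1
            in_block = "*/" not in line  # block continues on later lines
        else:
            code += 1
    total = code + comments
    return 100.0 * comments / total if total else 0.0
```

A production counter would need a real tokenizer per language; this version is only meant to make the metric concrete.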
Android vs non-specific malware
Code Size, Development Costs, Code Quality
Limitations
▪ Non-representative dataset
○ Few samples
○ Collection bias
▪ Estimates, not reality
▪ Ever-changing malware landscape
▪ UCC-J tool
Conclusions
▪ Increase in code size
▪ Increase in development costs
▪ Decrease in code quality
▪ Larger sizes and costs than non-specific malware, lower quality
▪ Inconclusive results
▪ Too early
Future Work
▪ Code reuse, clones and plagiarism
▪ Malware vs Regular software
▪ Compilation & Execution
▪ Repeat in due time
▪ More insights
Thanks!