Fingerprinting Malicious Code Through Statistical Opcode Analysis
by Daniel Bilar
This paper discusses a mechanism for detecting malicious code through statistical analysis of opcode distributions. Sixty-seven malware executables were sampled, and their opcode frequency distributions were compared statistically with the aggregate statistics of twenty non-malicious samples. Malware opcode distributions were found to differ from those of non-malicious software to a statistically significant degree. Furthermore, rare opcodes seem to be a stronger predictor, explaining 12-63% of frequency variation.
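To make the kind of comparison described here concrete, the following is a minimal sketch of how one sample's opcode counts might be compared against an aggregate reference. It assumes opcodes have already been extracted as lists of mnemonics by a disassembler, and it uses a Pearson chi-square statistic with Cramér's V as an effect size. The function names, the choice of test, and the toy data are illustrative assumptions, not necessarily the procedure used in this paper.

```python
from collections import Counter
import math


def opcode_frequencies(opcodes):
    """Relative frequency of each opcode mnemonic in a sequence."""
    counts = Counter(opcodes)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}


def chi_square_statistic(sample_counts, reference_counts):
    """Pearson chi-square statistic for a 2 x k contingency table.

    Rows are (sample, reference); columns are opcodes.
    Assumption: both arguments map opcode -> non-negative count,
    and each has at least one positive count.
    """
    n_sample = sum(sample_counts.values())
    n_ref = sum(reference_counts.values())
    n_total = n_sample + n_ref
    chi2 = 0.0
    for op in set(sample_counts) | set(reference_counts):
        obs_sample = sample_counts.get(op, 0)
        obs_ref = reference_counts.get(op, 0)
        col_total = obs_sample + obs_ref
        if col_total == 0:
            continue  # opcode never observed; contributes nothing
        for row_total, observed in ((n_sample, obs_sample), (n_ref, obs_ref)):
            expected = row_total * col_total / n_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(sample_counts, reference_counts):
    """Cramér's V for a 2 x k table: sqrt(chi2 / n), ranging from 0 to 1."""
    n_total = sum(sample_counts.values()) + sum(reference_counts.values())
    return math.sqrt(chi_square_statistic(sample_counts, reference_counts) / n_total)


# Toy data (hypothetical, not taken from the paper's samples).
sample = Counter(["mov", "mov", "push", "xor", "int", "mov", "push"])
reference = Counter({"mov": 500, "push": 300, "xor": 150, "int": 50})

chi2 = chi_square_statistic(sample, reference)
v = cramers_v(sample, reference)
print(f"chi-square = {chi2:.3f}, Cramér's V = {v:.3f}")
```

The statistic would then be compared against a chi-square critical value with k - 1 degrees of freedom, where k is the number of distinct opcodes. With real data, opcodes with very small expected counts may need to be pooled before the test is meaningful.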