Moments ago, Neel Mehta, a researcher at Google posted a mysterious message on Twitter with the #WannaCryptAttribution hashtag:

The cryptic message in fact refers to similarity between samples that have shared code between themselves. The two samples Neel refers to post are:

  • A WannaCry cryptor sample from February 2017 which looks like a very early variant
  • A Lazarus APT group sample from February 2015

The similarity can be observed in the screenshot below, taken between the two samples, with similar code highlighted:

So, what does it all mean?

I know about Wannacry, but what is Lazarus?

We wrote about the Lazarus group extensively and presented together with our colleagues from BAE and SWIFT at the Kaspersky Security Analyst Summit (SAS 2017). See:

Among other things, the Lazarus group was responsible for the Sony Wiper attack, the Bangladesh bank heist and the DarkSeoul operation.

[youtube https://www.youtube.com/watch?v=9Vh2n6nC0t4?ecver=2]

We believe Lazarus is not just “yet another APT actor”. The scale of the Lazarus operations is shocking. The group has been very active since 2011 and was originally disclosed when Novetta published the results of its Operation Blockbuster research. During that research, which we also participated in, hundreds of samples were collected and show that Lazarus is operating a malware factory that produces new samples via multiple independent conveyors.

Is it possible this is a false flag?

In theory anything is possible, considering the 2015 backdoor code might have been copied by the Wannacry sample from February 2017. However, this code appears to have been removed from later versions. The February 2017 sample appears to be a very early variant of the Wannacry encryptor. We believe a theory a false flag although possible, is improbable.

What conclusions can we make?

For now, more research is required into older versions of Wannacry. We believe this might hold the key to solve some of the mysteries around this attack. One thing is for sure — Neel Mehta’s discovery is the most significant clue to date regarding the origins of Wannacry.

Are we sure the early February variant is the precursor to the later attacks?

Yes, it shares the same the list file extension targets for encryption but, in the May 2017 versions, more extensions were added:

> .accdb
> .asm
> .backup
> .bat
> .bz2
> .cmd
> .der
> .djvu
> .dwg
> .iso
> .onetoc2
> .pfx
> .ps1
> .sldm
> .sldx
> .snt
> .sti
> .svg
> .sxi
> .vbs
> .vcd

They also removed an older extension: “.tar.bz2” and replaced it with just “.bz2”
We strongly believe the February 2017 sample was compiled by the same people, or by people with access to the same sourcecode as the May 2017 Wannacry encryptor used in the May 11th wave of attacks.

So. Now what?

We believe it’s important that researchers around the world investigate these similarities and attempt to discover more facts about the origin of Wannacry. Looking back to the Bangladesh attack, in the early days, there were very few facts linking them to the Lazarus group. In time, more evidence appeared and allowed us, and others, to links them together with high confidence. Further research can be crucial to connecting the dots.

Has anyone else confirmed this?

Yes, Matt Suiche from Comae Technologies confirmed the same similarity based on Neel’s samples:

Can you share the YARA rule used to find this?

Yes, of course. Here you go:

rule lazaruswannacry {

meta:

description = "Rule based on shared code between Feb 2017 Wannacry sample and Lazarus backdoor from Feb 2015 discovered by Neel Mehta"
date = "2017-05-15"
reference = "https://twitter.com/neelmehta/status/864164081116225536"
author = "Costin G. Raiu, Kaspersky Lab"
version = "1.0"
hash = "9c7c7149387a1c79679a87dd1ba755bc"
hash = "ac21c8ad899727137c4b94458d7aa8d8"

strings:

$a1={
51 53 55 8B 6C 24 10 56 57 6A 20 8B 45 00 8D 75
04 24 01 0C 01 46 89 45 00 C6 46 FF 03 C6 06 01
46 56 E8
}

$a2={
03 00 04 00 05 00 06 00 08 00 09 00 0A 00 0D 00
10 00 11 00 12 00 13 00 14 00 15 00 16 00 2F 00
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 3C 00 3D 00 3E 00 3F 00 40 00 41 00
44 00 45 00 46 00 62 00 63 00 64 00 66 00 67 00
68 00 69 00 6A 00 6B 00 84 00 87 00 88 00 96 00
FF 00 01 C0 02 C0 03 C0 04 C0 05 C0 06 C0 07 C0
08 C0 09 C0 0A C0 0B C0 0C C0 0D C0 0E C0 0F C0
10 C0 11 C0 12 C0 13 C0 14 C0 23 C0 24 C0 27 C0
2B C0 2C C0 FF FE
}

condition:

((uint16(0) == 0x5A4D)) and (filesize < 15000000) and
all of them
}

Source: Secure List