Detection Engineering with FLAWS & Falco

[Image: blog_falco_and_cloudtrail.png]

About The Project

I’m currently studying for my Certified Kubernetes Security Specialist (CKS) certification. As part of this certification, training courses recommend looking into the runtime security provided by Falco. Falco is a Cloud Native Computing Foundation project created by Sysdig that provides alerting on cloud, container, and Kubernetes logs. While training courses such as “A Cloud Guru” do a good job of covering container and host-based log ingestion with Falco, I wanted to experiment with CloudTrail data as well. This blog covers experimenting with Falco ingesting CloudTrail data from flaws.cloud.

About The Data Set

Leveraging AWS data for public events (such as CTFs/presentations/blog posts/etc…) usually comes with a fair amount of data sanitization. Ensuring that no sensitive AWS identifiers, IPs, etc… are within said CloudTrail data can be an annoying task. Fortunately, there’s an excellent resource called flaws.cloud. flaws.cloud (and flaws2.cloud) is a cloud-based CTF made by Scott Piper in which end-users identify and exploit misconfigurations in an intentionally vulnerable AWS environment. Its CloudTrail data set is over 2 gigabytes of raw plain-text logs full of attacks (and attempted attacks).

This is an excellent resource for those interested in cloud security, and Scott Piper has done the field a huge favor by making the flaws.cloud CloudTrail logs publicly available. As an educator in the field, thank you, Scott, for this unique data set!
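If you’d like to follow along, the data set is distributed as a tar of gzip’d JSON files. Below is a minimal sketch of pulling it down; the download URL is assumed from Scott Piper’s announcement post, so verify it before use.

# Fetch and unpack the flaws.cloud CloudTrail data set
# (URL assumed from the announcement post -- verify before use)
$> wget https://summitroute.com/downloads/flaws_cloudtrail_logs.tar
$> tar -xvf flaws_cloudtrail_logs.tar
# Optional: the Falco CloudTrail plugin can also read the gzip'd files directly
$> gunzip flaws_cloudtrail_logs/*.json.gz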

Setting up Falco

As previously mentioned, Falco is created by Sysdig and is an “Open Source security tool for containers, Kubernetes and cloud”. Installation is straightforward and well documented here. Falco leverages either an eBPF probe or a kernel module to obtain system events and trigger alerts on them. The diagram below depicts a high-level architecture of Falco’s data flow. For a deeper understanding of the kernel module and associated libraries, take a look at the official documentation here.

[Image: falco-flow.png]

Out of the box, Falco comes with a handful of rules covering host-based detections as well as AWS and Kubernetes alerts. These can be seen in /etc/falco. The default rules loaded when Falco starts live in falco_rules.yaml.

-rw-r--r--. 1 root root  12314 Jun 24 04:11 aws_cloudtrail_rules.yaml
-rw-r--r--. 1 root root   1136 Aug  9 08:51 falco_rules.local.yaml
-rw-r--r--. 1 root root 136382 Aug  9 08:51 falco_rules.yaml
-rw-r--r--. 1 root root  12376 Sep 19 21:31 falco.yaml
-rw-r--r--. 1 root root  29834 Jun 24 04:13 k8s_audit_rules.yaml
drwxr-xr-x. 1 root root     44 Sep 18 21:02 rules.available
drwxr-xr-x. 1 root root     50 Sep 19 21:07 rules.d

Any custom rules made by an end-user should be added to falco_rules.local.yaml per official documentation guidance. To enable additional rule sets, such as aws_cloudtrail_rules.yaml, the rule files must be placed within the rules.d directory.
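A minimal sketch of enabling the CloudTrail rules, assuming the default install layout shown above:

# Copy (or symlink) the rule file into rules.d so Falco loads it at startup
$> sudo cp /etc/falco/aws_cloudtrail_rules.yaml /etc/falco/rules.d/

Once in place, the rules files loaded by Falco can be verified at runtime, as shown in the command output below.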

[dllcoolj@thonkpad flaws_cloudtrail_logs]$ cat flaws_cloudtrail02.json > test.json && falco  
Mon Sep 19 21:51:02 2022: Falco version 0.32.2
Mon Sep 19 21:51:02 2022: Falco initialized with configuration file /etc/falco/falco.yaml
Mon Sep 19 21:51:02 2022: Loading plugin (cloudtrail) from file /usr/share/falco/plugins/libcloudtrail.so
Mon Sep 19 21:51:02 2022: Loading plugin (json) from file /usr/share/falco/plugins/libjson.so
Mon Sep 19 21:51:02 2022: Loading rules from file /etc/falco/falco_rules.yaml:
Mon Sep 19 21:51:02 2022: Loading rules from file /etc/falco/falco_rules.local.yaml:
Mon Sep 19 21:51:02 2022: Loading rules from file /etc/falco/rules.d/aws_cloudtrail_rules.yaml:

CloudTrail-Specific Configuration

Beyond system-level events, additional log-ingestion functionality is enabled through plugins loaded as shared objects. This is where CloudTrail alerting functionality comes into play. Falco’s configuration file (/etc/falco/falco.yaml) requires additional modifications to enable the CloudTrail plugin. The CloudTrail plugin can read data from SQS, an S3 bucket, or a local file. For production purposes, SQS is worth looking into, and Sysdig provides Terraform modules and example configurations.

For testing purposes, we’ll be reading from a local file. The following modifications to the falco.yaml configuration file specify where to look for the flaws JSON data and load the CloudTrail plugin. It’s important to note that the CloudTrail plugin can read gzip’d data directly, so you don’t have to decompress it first. This is especially helpful when reading CloudTrail data from an S3 bucket, where it is gzip’d by default. I also enabled JSON output in the falco.yaml file. You don’t need to do this, but it helps when parsing data with the jq utility, which will be demonstrated later on.

plugins:
  - name: cloudtrail
    library_path: libcloudtrail.so
    init_config: ""
    open_params: "/home/dllcoolj/Downloads/flaws_cloudtrail_logs/test.json" #<-- specify your path here

  - name: json
    library_path: libjson.so

load_plugins: [cloudtrail, json] # <-- specify the plugins here.
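For reference, enabling JSON output is a one-line toggle in falco.yaml:

# falco.yaml: emit each alert as a JSON object instead of plain text
json_output: true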

Falco Rules Engine

The Falco rules engine leverages a straightforward syntax and ingests YAML files containing rules to alert on. The example below shows a YAML rule for listing S3 buckets. The condition is what must match in order to trigger the alert, which then prints the content stored in the output directive. The priority allows detection engineers to assign criticality to a given alert and route it appropriately. enabled, as the name implies, controls whether the rule is loaded when Falco runs. Finally, tags can be used to bulk enable/disable rule sets at runtime. This can be useful if you’re shipping a default “Falco build” to numerous environments that may not all be in AWS and you don’t want to separate out numerous custom rule files.

 
- rule: List Buckets
  desc: Detect listing of all S3 buckets.
  condition:
    ct.name="ListBuckets" and not ct.error exists
  output:
    A list of all S3 buckets has been requested.
    (requesting user=%ct.user,
     requesting IP=%ct.srcip,
     AWS region=%ct.region,
     host=%ct.request.host)
  priority: WARNING
  enabled: false
  tags:
    - cloud
    - aws
    - aws_s3
  source: aws_cloudtrail

The Falco rule engine supports useful rule-building primitives called macros and lists. Macros are rule conditions that can be reused inside rules, making syntax easier to read. Lists are named collections of items to be used within rules. Building on the previously examined List Buckets rule, a list of US regions (us-east-1 and us-east-2) and a macro covering common S3 list API calls would look as follows:

 
- list: us_region
  items: [us-east-1, us-east-2]

- macro: s3_list_commands
  condition: ct.name="ListBuckets" or ct.name="ListObjects" or ct.name="ListObjectsV2" or ct.name="HeadBucket" or ct.name="HeadObject"

- rule: List Buckets in US Regions
  desc: Detect S3 list API calls in US regions.
  condition:
    s3_list_commands and not ct.error exists and ct.region in (us_region)
  output:
    An S3 list API call has been made.
    (requesting user=%ct.user,
     requesting IP=%ct.srcip,
     AWS region=%ct.region,
     api_call=%ct.name)
  priority: WARNING
  enabled: true
  tags:
    - cloud
    - aws
    - aws_s3
  source: aws_cloudtrail

The official documentation breaks down Falco rule writing very well for system-level events as well as generic syntax. However, for CloudTrail-specific rule syntax, the CloudTrail plugin GitHub repo is where you’ll find documentation on the fields available for parsing CloudTrail logs. A critical field to be aware of when writing rules is ct.error. If you’ve built anything in AWS that leverages multiple services and requires appropriate permissions, you likely didn’t get the permissions correct the first time. The ct.error field allows the detection engineer to avoid alerting on failed API calls, or to create rules that alert on errors at a lower severity. This is important to avoid senseless noise that would contribute to alert fatigue.
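As an illustrative sketch (not part of the stock rule set), a companion rule that surfaces failed ListBuckets attempts at a lower severity might look like this:

- rule: List Buckets Attempt Failed
  desc: Detect failed attempts to list S3 buckets (illustrative example).
  condition:
    ct.name="ListBuckets" and ct.error exists
  output:
    A failed attempt to list S3 buckets occurred.
    (requesting user=%ct.user,
     requesting IP=%ct.srcip,
     error=%ct.error)
  priority: NOTICE
  enabled: true
  tags:
    - cloud
    - aws
    - aws_s3
  source: aws_cloudtrail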

Testing Falco w/ CloudTrail Data

Falco can run as a daemon or be invoked manually. Since we’re testing, I’ll be manually invoking Falco. The configuration set above reads “test.json” from a specific directory in my Downloads folder. This way, I can easily dump different CloudTrail flows into test.json and re-run Falco manually, as shown in the snippet below.

$> cat flaws_cloudtrail02.json > test.json && falco

The default rules will trigger on MFA being disabled for the root account and on a new trail being created, producing the JSON we see below.

{"output":"12:43:34.000000000: Warning A new trail has been created. (requesting user=AWSServiceRoleForCloudTrail, requesting IP=cloudtrail.amazonaws.com, AWS region=us-east-1, trail name=summitroute-logs)","priority":"Warning","rule":"CloudTrail Trail Created","source":"aws_cloudtrail","tags":["aws","aws_cloudtrail","cloud"],"time":"2019-02-11T16:43:34.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.name":"summitroute-logs","ct.srcip":"cloudtrail.amazonaws.com","ct.user":"AWSServiceRoleForCloudTrail","evt.time":1549903414000000000}}

{"output":"12:43:34.000000000: Warning A new trail has been created. (requesting user=AWSServiceRoleForCloudTrail, requesting IP=cloudtrail.amazonaws.com, AWS region=us-east-1, trail name=summitroute-logs)","priority":"Warning","rule":"CloudTrail Trail Created","source":"aws_cloudtrail","tags":["aws","aws_cloudtrail","cloud"],"time":"2019-02-11T16:43:34.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.name":"summitroute-logs","ct.srcip":"cloudtrail.amazonaws.com","ct.user":"AWSServiceRoleForCloudTrail","evt.time":1549903414000000000}}

{"output":"12:29:30.000000000: Critical Multi Factor Authentication configuration has been disabled for root (requesting user=flaws, requesting IP=2.7.223.252, AWS region=us-east-1, MFA serial number=arn:aws:iam::811596193553:mfa/root-account-mfa-device)","priority":"Critical","rule":"Deactivate MFA for Root User","source":"aws_cloudtrail","tags":["aws","aws_iam","cloud"],"time":"2019-03-03T16:29:30.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.serialnumber":"arn:aws:iam::811596193553:mfa/root-account-mfa-device","ct.srcip":"2.7.223.252","ct.user":"flaws","evt.time":1551630570000000000}}

{"output":"12:29:30.000000000: Critical Multi Factor Authentication configuration has been disabled for root (requesting user=flaws, requesting IP=2.7.223.252, AWS region=us-east-1, MFA serial number=arn:aws:iam::811596193553:mfa/root-account-mfa-device)","priority":"Critical","rule":"Deactivate MFA for Root User","source":"aws_cloudtrail","tags":["aws","aws_iam","cloud"],"time":"2019-03-03T16:29:30.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.serialnumber":"arn:aws:iam::811596193553:mfa/root-account-mfa-device","ct.srcip":"2.7.223.252","ct.user":"flaws","evt.time":1551630570000000000}}

Events detected: 4
Rule counts by severity:
   CRITICAL: 2
   WARNING: 2
Triggered rules by rule name:
   Deactivate MFA for Root User: 2
   CloudTrail Trail Created: 2
Syscall event drop monitoring:
   - event drop detected: 0 occurrences
   - num times actions taken: 0

Now that we have some alerts, let’s go further and parse the output via jq. By selecting just the “output” field, the alert’s description can be obtained. Further parsing can be done for ARN/tags/etc…, enabling the detection engineer to quickly slice and dice Falco alert data from the command line. Manually invoking Falco and parsing out alert data becomes even more useful when troubleshooting rules in production environments where you’re reading from SQS or S3.

$> cat flaws_cloudtrail02.json > test.json && falco | jq '.output'

"12:43:34.000000000: Warning A new trail has been created. (requesting user=AWSServiceRoleForCloudTrail, requesting IP=cloudtrail.amazonaws.com, AWS region=us-east-1, trail name=summitroute-logs)"
"12:43:34.000000000: Warning A new trail has been created. (requesting user=AWSServiceRoleForCloudTrail, requesting IP=cloudtrail.amazonaws.com, AWS region=us-east-1, trail name=summitroute-logs)"
"12:29:30.000000000: Critical Multi Factor Authentication configuration has been disabled for root (requesting user=flaws, requesting IP=2.7.223.252, AWS region=us-east-1, MFA serial number=arn:aws:iam::811596193553:mfa/root-account-mfa-device)"
"12:29:30.000000000: Critical Multi Factor Authentication configuration has been disabled for root (requesting user=flaws, requesting IP=2.7.223.252, AWS region=us-east-1, MFA serial number=arn:aws:iam::811596193553:mfa/root-account-mfa-device)"

Alerting on S3 List Buckets

At this point, Falco is successfully running and processing data from a targeted file. Now let’s enable the S3 “List Buckets” rule previously mentioned. As the API call name implies, ListBuckets lists all buckets owned by the authenticated user. This is inherently noisy, as tons of bots, bug hunters, etc… poke at S3 APIs attempting to identify misconfigured and overly permissive buckets. Consider how many times you’ve seen a headline of “Company-X has data leaked due to cloud misconfiguration”. However, the out-of-the-box rule condition requires that no CloudTrail error exists via and not ct.error exists. By enabling the rule via enabled: true, Falco will now alert on all successful ListBuckets API calls.

- rule: List Buckets
  desc: Detect listing of all S3 buckets.
  condition:
    ct.name="ListBuckets" and not ct.error exists
  output:
    A list of all S3 buckets has been requested.
    (requesting user=%ct.user,
     requesting IP=%ct.srcip,
     AWS region=%ct.region,
     host=%ct.request.host)
  priority: WARNING
  enabled: true
  tags:
    - cloud
    - aws
    - aws_s3
  source: aws_cloudtrail

Remember, this data set is from a cloud-based CTF whose objectives include reconnaissance of cloud resources. Therefore, there are a ton of ListBuckets API calls, which will generate a ton of alerts.

Why do this? To demonstrate Falco’s throttling mechanism, which can be seen by enabling the rule above and re-running our Falco command. The throttling is a token bucket: up to 1000 notifications can burst through before the limit is hit, after which the bucket refills at one notification per second (so it takes 1000 quiet seconds to regain full burst capacity). This can be further explored within the configuration file falco.yaml or within the documentation here.
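For reference, the relevant defaults in falco.yaml look like this (values as shipped with Falco 0.32):

# Notification throttling: a token bucket refilled at `rate` tokens
# per second with a capacity of `max_burst` tokens
outputs:
  rate: 1
  max_burst: 1000

Now let’s execute the command below and generate said alerts.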

$> cat flaws_cloudtrail02.json > test.json && falco 

As expected, after the threshold is hit, rate-limit notifications occur, as shown in the content below.

                                            ...
{"output":"20:02:37.000000000: Warning A list of all S3 buckets has been requested. (requesting user=backup, requesting IP=140.113.131.4, AWS region=us-west-2, host=<NA>)","priority":"Warning","rule":"List Buckets","source":"aws_cloudtrail","tags":["aws","aws_s3","cloud"],"time":"2019-03-30T00:02:37.000000000Z", "output_fields": {"ct.region":"us-west-2","ct.request.host":null,"ct.srcip":"140.113.131.4","ct.user":"backup","evt.time":1553904157000000000}}
{"output":"16:34:48.000000000: Warning A list of all S3 buckets has been requested. (requesting user=Level6, requesting IP=251.212.9.131, AWS region=us-east-1, host=<NA>)","priority":"Warning","rule":"List Buckets","source":"aws_cloudtrail","tags":["aws","aws_s3","cloud"],"time":"2019-04-12T20:34:48.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.host":null,"ct.srcip":"251.212.9.131","ct.user":"Level6","evt.time":1555101288000000000}}
{"output":"20:46:57.000000000: Warning A list of all S3 buckets has been requested. (requesting user=Level6, requesting IP=0.52.31.206, AWS region=us-east-1, host=<NA>)","priority":"Warning","rule":"List Buckets","source":"aws_cloudtrail","tags":["aws","aws_s3","cloud"],"time":"2019-06-21T00:46:57.000000000Z", "output_fields": {"ct.region":"us-east-1","ct.request.host":null,"ct.srcip":"0.52.31.206","ct.user":"Level6","evt.time":1561078017000000000}}
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
Mon Sep 19 21:15:41 2022: Skipping rate-limited notification for rule List Buckets
                                            ...

Watching for Rate Limit Alerts

Falco will log application-specific events to syslog or stdout depending on how falco.yaml is configured. Additional logging can be enabled for the underlying third-party libraries via the libs_logger directive for troubleshooting and debugging scenarios.

# Falco is capable of managing the logs coming from libs. If enabled,
# the libs logger send its log records the same outputs supported by
# Falco (stderr and syslog). Disabled by default.
libs_logger:
  enabled: true
  # Minimum log severity to include in the libs logs. Note: this value is
  # separate from the log level of the Falco logger and does not affect it.
  # Can be one of "fatal", "critical", "error", "warning", "notice",
  # "info", "debug", "trace".
  severity: debug

On my Fedora test machine, Falco’s application logs can be viewed via journalctl /usr/bin/falco. Sure enough, our rate-limit notifications are there.

                                            ...
Sep 19 21:15:41 thonkpad falco[12184]: Skipping rate-limited notification for rule List Buckets
Sep 19 21:15:41 thonkpad falco[12184]: Skipping rate-limited notification for rule List Buckets
                                            ...

Forwarding these Falco application logs to a separate queue to investigate can be part of a healthy detection engineering environment. After all, if you’re hitting said rate limits, perhaps the max_burst setting, or the rule being rate limited, deserves tuning. In an era of hybrid-cloud, containers, Kubernetes, oh-my! the last thing your security team needs is more alerts.
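For quick triage, the journal from the Fedora setup above can be grep’d to see which rules are being throttled most often; a minimal sketch:

# Count rate-limited notifications per rule name
$> journalctl /usr/bin/falco | grep 'Skipping rate-limited notification' | sed 's/.*for rule //' | sort | uniq -c | sort -rn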

Beyond The Blog

The next progression here would be to deploy Falco to hosts in my homelab in an automated fashion and then ingest the data into an Elastic stack to build dashboards and associated alerts.

From a detection-engineering-meets-home-lab standpoint, a great thing about the FLAWS CloudTrail data set is that there are walkthroughs on how to complete the challenge. Therefore, hunting for events can be done with an answer key in hand, allowing end-users to identify who compromised what and how far they proceeded in the CTF.

This is a neat spin in a world where security content tends to lean Offensive Security focused. The hard aspects of detection engineering and hunting for actionable items are now at your fingertips!

Thank you for reading! If you found this helpful or neat, please slap that like button and share with your friends. For content like this and more, follow Arch Cloud Labs on YouTube and DLL_Cool_J on Twitter.