Where Did All The Terraform Testing Go?
- Brett Petrusek
- Terraform, Devops, Devsecops
- September 14, 2023
You are a Terraform master.
Compliant code flows from your fingers as if you were ChatGPT.
Lesser engineers don’t even bother peer reviewing your code. They just smack the “approve” button and move on with their day.
You don’t need a linter. You ARE a linter.
You don’t need a security scanner. You WROTE the security scanner.
You test your code with `terraform plan`. That’s all you need. Right?
Wrong.
A Shocking Discovery
Recently, I made a discovery. It was quite shocking. All this time, I had just assumed everyone was doing it correctly.
I could not have been more wrong.
While building the first Experience Builder module for Terraform, I developed some automation to find public Terraform repositories that lacked linting or security scanning. I found over 140 repositories, some of which contain very commonly used modules in highly reputable GitHub orgs.
Needless to say, I was more than a little concerned.
But this was the exact reason why I created the Experience Builder – to find opportunities for aspiring DevOps engineers to contribute DevOps functionality to open source projects.
I felt like I had hit the mother lode with this one.
To see the full list of repositories and their findings, check out the Experience Builder Terraform repository list.
Who Needs Testing?
To put it bluntly: everyone needs testing.
Linting is straightforward and is a BARE MINIMUM for Terraform testing. The most common tool for this is the `terraform` command itself. It has two subcommands, `terraform fmt` and `terraform validate`, that can be used to perform basic linting. A more robust option is to use `tflint`, which goes beyond basic formatting and syntax checks with a long list of rules.
Security testing is a little more complicated because there are several tools available, each with pros and cons. The Experience Builder module for Terraform looks for several of them, including the ones discussed below.
My personal preference is `checkov`, but it is being slowly pulled into the BridgeCrew ecosystem. This will make it more difficult to work with if BridgeCrew paywalls it or imposes other requirements to use it.
`tfsec` and `cloudsploit` are being migrated to the Trivy ecosystem and will potentially share the same fate as `checkov`.
Tenable already owns `terrascan`, but it has remained free to use under an open source license. Let’s hope it stays that way.
Remediating the Missing Linters
Many of the identified repositories make use of GitHub Actions. I recommend building on top of the existing GitHub Actions capabilities to implement linting within these repositories.
So, what’s the big deal? You can just pick the most popular Terraform linting action from the GitHub Action Marketplace and copy/paste it into `.github/workflows`, right?
Unfortunately, there are 9 GitHub Actions that implement `tflint` and 192 GitHub Actions that implement `terraform` in some way.
I examined the repositories I located on GitHub and found these GitHub actions being used:
Action | Num of Occurrences |
---|---|
hashicorp/setup-terraform | 33 |
clowdhaus/terraform-composite-actions/pre-commit | 26 |
clowdhaus/terraform-min-max | 18 |
clowdhaus/terraform-composite-actions/directories | 10 |
github/super-linter | 9 |
terraform-linters/setup-tflint | 7 |
oxsecurity/megalinter | 4 |
reviewdog/action-tflint | 1 |
While the `clowdhaus` actions rank high, they are overrepresented because they appear to be the actions of choice for the `terraform-aws-modules` GitHub org. So, to be unbiased, I am going to demonstrate how to implement both the `hashicorp/setup-terraform` action and the `clowdhaus/terraform-composite-actions/pre-commit` action.
Hashicorp GitHub Action
This action can be used to execute `terraform fmt` and `terraform validate`. They are the most basic syntax and linting checks you can run.
Not all 33 of the identified instances of `hashicorp/setup-terraform` implement these commands. Many just use this action to run `terraform plan` and other terraform commands.
name: "Linting"
on:
  pull_request:
    branches:
      - master
      - main
jobs:
  linting:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.1
      - name: Terraform Fmt Check
        id: fmt
        run: |
          terraform fmt -recursive -check -diff $GITHUB_WORKSPACE
      - name: Terraform Init
        id: init
        run: |
          terraform init -backend=false
      - name: Terraform Validate Check
        id: validate
        run: |
          terraform validate
This is as basic as it gets. Let’s walk through each section.
- The `on:` keyword defines the criteria for when this workflow should be executed. In this example, it is configured to run on pull requests to the master and main branches. If the repository is properly configured (for example, with a branch protection rule that requires this status check), any command in this job that returns a non-zero status will block the merge to those branches.
- The `jobs` section `runs-on` the `ubuntu-latest` image.
- The `actions/checkout@v3` step simply checks out the repository into the workspace of the job.
- The `Set up Terraform` step installs `terraform_version` 1.5.1 onto the job’s `ubuntu-latest` runner.
- It can now execute the terraform commands for linting, which are `terraform fmt` and `terraform validate`. Running `terraform init -backend=false` is a prerequisite to running the `validate` command.
The only customization is setting the `terraform_version` input to the version you require.
If you need to start small, this is where I would recommend you begin. You can pretty much copy and paste this directly into a yaml file in `.github/workflows` and be up and running in no time.
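If you want the more robust `tflint` rules without leaving this workflow, the `terraform-linters/setup-tflint` action from the table above can supply them. The steps below are a minimal sketch rather than a tested configuration: the `@v3` tag and the pinned `tflint_version` are assumptions, so check the action's releases before copying them.

```yaml
      # Hypothetical extra steps, appended under `steps:` in the Linting job.
      # The action tag (@v3) and the tflint version are assumptions; pin the
      # releases you have verified.
      - name: Set up TFLint
        uses: terraform-linters/setup-tflint@v3
        with:
          tflint_version: v0.44.1

      # Installs any plugins declared in a .tflint.hcl file, if one exists.
      - name: TFLint Init
        run: tflint --init

      # Lints the Terraform files in the current directory.
      - name: TFLint
        run: tflint -f compact
```

Like the `terraform` steps, a non-zero exit status from `tflint` fails the job.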
Clowdhaus GitHub Action
I will preface this section with the following statement: the Clowdhaus documentation is not the greatest. I had to perform a lot of trial and error to get a functional GitHub Workflow configuration.
This is a much more sophisticated action than the basic `hashicorp/setup-terraform`.
First of all, it includes a more robust linting tool, `tflint`.
Second, it can execute `tfsec`, a Terraform security scanner, through the use of the `terraform_tfsec` pre-commit hook. Two birds, one stone.
name: Pre-Commit
on:
  pull_request:
  push:
    branches:
      - main
      - master
env:
  TERRAFORM_DOCS_VERSION: v0.16.0
  TFLINT_VERSION: v0.44.1
  TERRAFORM_VERSION: v1.5.1
jobs:
  preCommit:
    name: TF pre-commit
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Pre-commit Terraform Validate ${{ env.TERRAFORM_VERSION }}
        uses: clowdhaus/terraform-composite-actions/pre-commit@<version>
        with:
          terraform-version: ${{ env.TERRAFORM_VERSION }}
          args: 'terraform_validate'
      - name: Pre-commit Terraform Format ${{ env.TERRAFORM_VERSION }}
        uses: clowdhaus/terraform-composite-actions/pre-commit@<version>
        with:
          terraform-version: ${{ env.TERRAFORM_VERSION }}
          args: 'terraform_fmt'
      - name: Pre-commit Terraform TFLint ${{ env.TERRAFORM_VERSION }}
        uses: clowdhaus/terraform-composite-actions/pre-commit@<version>
        with:
          terraform-version: ${{ env.TERRAFORM_VERSION }}
          tflint-version: ${{ env.TFLINT_VERSION }}
          args: 'terraform_tflint'
This approach uses the `pre-commit` framework with the terraform plugin.
- The `on:` keyword defines the criteria for when this workflow should be executed. In this example, it is configured to run on pull requests and on pushes to the main and master branches. If the repository is properly configured, any command in this job that returns a non-zero status will block the merge to those branches. In addition, this job leverages the `pre-commit` framework, so the same hooks can be installed locally to catch lint failures before changes ever reach a mainline branch.
- The `jobs` section `runs-on` the `ubuntu-latest` image.
- The `actions/checkout@v3` step simply checks out the repository into the workspace of the job.
- The remaining steps use the `clowdhaus/terraform-composite-actions/pre-commit` action, which manages the installation of `terraform` and `tflint`. It also provides the pre-commit framework, which allows you to call the various terraform pre-commit hooks.
- The `args:` for each step are pretty self-explanatory. There is one each to execute `terraform validate`, `terraform fmt`, and `tflint`, respectively.
The main advantage of using this approach is that it leverages the `pre-commit` framework. There are other terraform pre-commit hooks that allow you to perform functions such as:
- automated document generation using `terraform-docs`
- infrastructure cost estimation with `infracost`
- automated update of version constraints, providers, and modules with `tfupdate`
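The hooks themselves are declared in a `.pre-commit-config.yaml` file at the root of the repository, and the `args:` values in the workflow appear to correspond to the hook IDs listed there. A minimal sketch, assuming the hooks come from the `antonbabenko/pre-commit-terraform` repository, might look like this. The `rev` tag is an assumption (pin whichever release you have tested), and the hook list is only an illustration.

```yaml
# .pre-commit-config.yaml (illustrative sketch, not taken from any of the
# repositories discussed in this article)
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.83.0  # assumed tag; pin a release you have verified
    hooks:
      - id: terraform_fmt       # formatting check
      - id: terraform_validate  # syntax and internal consistency check
      - id: terraform_tflint    # tflint rules
      - id: terraform_docs      # regenerates module documentation
```

With this file in place, `pre-commit install` wires the hooks into local commits, and `pre-commit run --all-files` runs them on demand.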
Remediating the Missing Security Scanners
We listed out the most popular security scanning tools in the Who Needs Testing? section above.
A closer look at the repositories located on GitHub shows these GitHub Actions being used for security scanning:
Action | Num of Occurrences |
---|---|
bridgecrewio/checkov-action | 4 |
step-security/harden-runner | 1 |
aquasecurity/trivy-action | 1 |
I am only going to demonstrate the use of the `bridgecrewio/checkov-action` GitHub Action.
Checkov GitHub Action
Unlike the Clowdhaus GitHub Actions, the documentation for the `bridgecrewio/checkov-action` GitHub Action is quite good. It contains plenty of examples and explanations for the command.
Ironically, the only two repositories that had the `bridgecrewio/checkov-action` GitHub Action configured are a BridgeCrew repository and a Terraform example repository used in a learning course.
I am going to take the example from the documentation, but simplify it.
name: checkov
on:
  pull_request:
    branches:
      - main
      - master
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Checkov GitHub Action
        uses: bridgecrewio/checkov-action@v12
        with:
          output_format: cli
- The `on:` keyword defines the criteria for when this workflow should be executed. In this example, it is configured to run on pull requests to the master and main branches. If the repository is properly configured, any command in this job that returns a non-zero status will block the merge to those branches.
- The `jobs` section `runs-on` the `ubuntu-latest` image.
- The `actions/checkout@v3` step simply checks out the repository into the workspace of the job.
- The `Checkov GitHub Action` step executes `checkov` and prints the results to stdout (the `cli` output format).
If the repository is configured properly, any failures from `checkov` will block merges to the main and master branches.
It’s that simple.
Conclusion
Why doesn’t everyone do this? Take a look at the `checkov` output I collected for all the repositories that failed my validations and you’ll understand.
Almost every repository has at least one violation. Many have dozens.
The problem is not necessarily the number of violations. It is the complexity of fixing them.
For example, wildcard (`*`) permissions are a common shortcut in AWS IAM policies because it is very difficult to get a full list of all the required IAM actions. But most companies enforce a “least privilege” policy: only specific actions on specific AWS resources are allowed.
But some of these are not actually violations. What can be done with those?
`checkov` (and other scanning tools) have a mechanism to ignore “desired” violations by inserting a comment in the Terraform code. For `checkov`, that is a `checkov:skip=<check_id>:<reason>` comment placed inside the resource block. These comments should also contain an explanation as to why the violation can be ignored, including secondary controls or mitigating processes.
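Inline comments handle individual resources. For a repository that is adopting `checkov` with a large backlog, the inputs of the `bridgecrewio/checkov-action` step offer a coarser, complementary control. The step below is a sketch, not a recommendation: the skipped check ID is an arbitrary example taken from the appendix, and `soft_fail` is shown only as a temporary way to report findings without failing the job.

```yaml
      # Illustrative variant of the Checkov step from the workflow above.
      - name: Checkov GitHub Action
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: .
          framework: terraform
          output_format: cli
          # Example only: skip a check the team has formally accepted.
          skip_check: CKV_AWS_23
          # Report findings without failing the job while the backlog is
          # worked down; remove this to make the scan blocking again.
          soft_fail: true
```

Unlike an inline comment, `skip_check` carries no justification, so the reasoning needs to be recorded somewhere reviewers will see it.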
Implementing security scanning and linting is a necessary step to securing infrastructure and enforcing high quality Terraform code. It is a straightforward process to achieve and should be the goal for any DevOps engineer.
Always insist on the highest standards.
Happy Terraforming.
Appendix
Discovered Checkov Violations
This is a sampling of the most frequent `checkov` violations. For brevity, I excluded violations that had fewer than 100 occurrences.
Violation | Num of Occurrences |
---|---|
CKV_AWS_23: “Ensure every security groups rule has a description” | 548 |
CKV_AWS_79: “Ensure Instance Metadata Service Version 1 is not enabled” | 417 |
CKV_AWS_273: “Ensure access is controlled through SSO and not AWS IAM defined users” | 394 |
CKV_AWS_8: “Ensure all data stored in the Launch configuration or instance Elastic Blocks Store is securely encrypted” | 372 |
CKV_AWS_126: “Ensure that detailed monitoring is enabled for EC2 instances” | 348 |
CKV_AWS_135: “Ensure that EC2 is EBS optimized” | 346 |
CKV2_AWS_41: “Ensure an IAM role is attached to EC2 instance” | 315 |
CKV_AWS_130: “Ensure VPC subnets do not assign public IP by default” | 179 |
CKV2_AWS_12: “Ensure the default security group of every VPC restricts all traffic” | 174 |
CKV2_AWS_11: “Ensure VPC flow logging is enabled in all VPCs” | 171 |
CKV_AWS_260: “Ensure no security groups allow ingress from 0.0.0.0:0 to port 80” | 168 |
CKV_AWS_355: “Ensure no IAM policies documents allow “*” as a statement’s resource for restrictable actions” | 157 |
CKV_AWS_40: “Ensure IAM policies are attached only to groups or roles” | 142 |
CKV_GCP_76: “Ensure that Private google access is enabled for IPV6” | 142 |
CKV_AWS_144: “Ensure that S3 bucket has cross-region replication enabled” | 139 |
CKV2_AWS_62: “Ensure S3 buckets should have event notifications enabled” | 137 |
CKV_AWS_145: “Ensure that S3 buckets are encrypted with KMS by default” | 133 |
CKV_AWS_24: “Ensure no security groups allow ingress from 0.0.0.0:0 to port 22” | 133 |
CKV_AWS_356: “Ensure no IAM policies documents allow “*” as a statement’s resource for restrictable actions” | 132 |
CKV_AWS_39: “Ensure Amazon EKS public endpoint disabled” | 132 |
CKV_AWS_38: “Ensure Amazon EKS public endpoint not accessible to 0.0.0.0/0” | 131 |
CKV_AZURE_114: “Ensure that key vault secrets have “content_type” set” | 131 |
CKV_AWS_58: “Ensure EKS Cluster has Secrets Encryption Enabled” | 130 |
CKV_GCP_74: “Ensure that private_ip_google_access is enabled for Subnet” | 130 |
CKV_AWS_18: “Ensure the S3 bucket has access logging enabled” | 129 |
CKV2_AWS_61: “Ensure that an S3 bucket has a lifecycle configuration” | 128 |
CKV_AZURE_41: “Ensure that the expiration date is set on all secrets” | 127 |
CKV2_AWS_34: “AWS SSM Parameter should be Encrypted” | 125 |
CKV_AWS_290: “Ensure IAM policies does not allow write access without constraints” | 113 |
CKV_AWS_21: “Ensure the S3 bucket has versioning enabled” | 109 |
CKV_GCP_26: “Ensure that VPC Flow Logs is enabled for every subnet in a VPC Network” | 109 |
CKV_AWS_111: “Ensure IAM policies does not allow write access without constraints” | 101 |