aws-eks-accelerator-for-terraform

Main Purpose

This project provides a framework for deploying best-practice multi-tenant EKS Clusters with Kubernetes Addons, provisioned via HashiCorp Terraform and Helm charts on AWS.

Overview

The AWS EKS Accelerator for Terraform module helps you provision EKS Clusters, Managed node groups with On-Demand and Spot Instances, AWS Fargate profiles, and all the necessary Kubernetes add-ons for a production-ready EKS cluster. The Terraform Helm provider is used to deploy common Kubernetes Addons with publicly available Helm Charts. This project leverages the official terraform-aws-vpc and terraform-aws-eks community modules to create the VPC and the EKS Cluster.

The intention of this framework is to help you design a config-driven solution. This lets you create EKS clusters for various environments and AWS accounts across multiple regions, with a unique Terraform configuration and state file per EKS cluster.

The top-level deploy folder provides an example of how you can structure your folders and files to define multiple EKS Cluster environments and consume this accelerator module. This approach is suitable for large projects with a clearly defined sub-directory and file structure, and it can be modified to suit your requirements. You can define a unique configuration for each EKS Cluster and treat this module as the central source of truth. Please note that the deploy folder can be moved to a dedicated repo that consumes this module through its main.tf file (see example file here).

e.g. folder/file structure for defining multiple clusters

    ├── deploy
    │   └── live
    │       └── preprod
    │           └── eu-west-1
    │               └── application
    │                   └── dev
    │                       └── backend.conf
    │                       └── dev.tfvars
    │                       └── main.tf
    │                       └── variables.tf
    │                       └── outputs.tf
    │                   └── test
    │                       └── backend.conf
    │                       └── test.tfvars
    │       └── prod
    │           └── eu-west-1
    │               └── application
    │                   └── prod
    │                       └── backend.conf
    │                       └── prod.tfvars
    │                       └── main.tf
    │                       └── variables.tf
    │                       └── outputs.tf

Each folder under live/<region>/application represents an EKS cluster environment (e.g., dev, test, load, etc.). This folder contains backend.conf and <env>.tfvars, which are used to create a unique Terraform state for each cluster environment. The Terraform backend configuration can be updated in backend.conf, and the cluster's common configuration variables in <env>.tfvars.

  • eks.tf - EKS Cluster resources and Amazon EKS Addon resources

  • fargate-profiles.tf - AWS EKS Fargate profiles

  • managed-nodegroups.tf - Amazon Managed node groups resources

  • self-managed-nodegroups.tf - Self-managed nodes resources

  • kubernetes-addons.tf - contains resources to deploy multiple Kubernetes Addons

  • vpc.tf - VPC and endpoints resources

  • modules - folder containing all the AWS resource sub-modules used by this module

  • kubernetes-addons - folder containing all the Helm charts and Kubernetes resources for deploying Kubernetes Addons

  • examples - folder containing sample template files with <env>.tfvars, which can be used to deploy an EKS cluster with multiple node groups and Kubernetes add-ons

EKS Cluster Deployment Options

This module provisions the following EKS resources

EKS Cluster Networking Resources

  1. VPC and Subnets
  2. VPC endpoints for fully private EKS Clusters
  3. NAT Gateway
  4. Internet Gateway

NOTE: VPC/subnet creation can be disabled by setting create_vpc = false in the TFVARS file and importing the existing VPC resources. The test-vpc.tfvars and test-eks.tfvars examples show how to create the VPC with its own state file and then import that state's resources into the EKS Cluster creation. A minimal sketch of the existing-VPC case is shown below.
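
For illustration, assuming create_vpc from this module and placeholder names for the existing VPC and subnet ID variables (check variables.tf for the exact names):

# Hypothetical <env>.tfvars excerpt: skip VPC creation and reuse existing networking.
create_vpc = false

# Illustrative variable names and IDs -- the module's variables.tf defines the real ones.
vpc_id             = "vpc-0123456789abcdef0"
private_subnet_ids = ["subnet-0aaa1111", "subnet-0bbb2222", "subnet-0ccc3333"]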

EKS Cluster resources

  1. EKS Cluster with multiple networking options
    1. Fully Private EKS Cluster
    2. Public + Private EKS Cluster
    3. Public EKS Cluster
  2. Amazon EKS Addons
  3. Managed Node Groups with On-Demand - AWS Managed Node Groups with On-Demand Instances
  4. Managed Node Groups with Spot - AWS Managed Node Groups with Spot Instances
  5. AWS Fargate Profiles - AWS Fargate Profiles
  6. Launch Templates - Deployed through launch templates to Managed Node Groups
  7. Bottlerocket OS - Managed Node Groups with Bottlerocket OS and Launch Templates
  8. Amazon Managed Service for Prometheus (AMP) - AMP makes it easy to monitor containerized applications at scale
  9. Self-managed Node Group with Windows support - Ability to create a self-managed node group for Linux or Windows workloads.

Kubernetes Addons using Helm Charts

  1. Metrics Server
  2. Cluster Autoscaler
  3. AWS LB Ingress Controller
  4. Traefik Ingress Controller
  5. Nginx Ingress Controller
  6. FluentBit to CloudWatch for Nodes
  7. FluentBit to CloudWatch for Fargate Containers
  8. Agones - Host, Run and Scale dedicated game servers on Kubernetes
  9. Prometheus
  10. Kube-state-metrics
  11. Alert-manager
  12. Prometheus-node-exporter
  13. Prometheus-pushgateway
  14. OpenTelemetry Collector
  15. AWS Distro for OpenTelemetry Collector(AWS OTel Collector)
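
These add-ons are typically switched on or off through flags in <env>.tfvars. A minimal sketch, assuming illustrative flag names (the exact variable names are defined in the module's variables.tf):

# Hypothetical <env>.tfvars excerpt -- flag names shown here are illustrative.
metrics_server_enable     = true
cluster_autoscaler_enable = true
prometheus_enable         = false
agones_enable             = false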

Node Group Modules

This module uses dedicated sub-modules for creating AWS Managed Node Groups, Self-managed Node Groups and Fargate profiles. Mixed node groups and Fargate profiles can be defined simply as map variables in <env>.tfvars. This approach provides the flexibility to add or remove managed/self-managed node groups or Fargate profiles by simply adding or removing map entries in the existing <env>.tfvars, and it allows you to define a unique node configuration for each EKS Cluster in the same account. The aws-auth ConfigMap handled by this module ensures that new node groups successfully join the EKS Cluster. Each node group can have a dedicated IAM role, security group and launch template to improve security.

Please refer to dev.tfvars for a full example.

Managed Node Groups Example

enable_managed_nodegroups = true
managed_node_groups = {
  mg_m4 = {
    # 1> Node Group configuration
    node_group_name        = "managed-ondemand"
    create_launch_template = true                 # false will use the default launch template
    launch_template_os     = "amazonlinux2eks"    # amazonlinux2eks or windows or bottlerocket
    public_ip              = false                # set to true to assign a public IP to EC2 instances; only for public subnets used in launch templates
    pre_userdata           = <<-EOT
            yum install -y amazon-ssm-agent
            systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent
        EOT
    # 2> Node Group scaling configuration
    desired_size    = 3
    max_size        = 3
    min_size        = 3
    max_unavailable = 1 # or percentage = 20

    # 3> Node Group compute configuration
    ami_type       = "AL2_x86_64" # AL2_x86_64, AL2_x86_64_GPU, AL2_ARM_64, CUSTOM
    capacity_type  = "ON_DEMAND"  # ON_DEMAND or SPOT
    instance_types = ["m4.large"] # List of instances used only for SPOT type
    disk_size      = 50

    # 4> Node Group network configuration
    subnet_type = "private" # private or public
    subnet_ids  = []        # Define your private/public subnets list with comma separated subnet_ids = ["subnet1", "subnet2", "subnet3"]

    k8s_taints = []

    k8s_labels = {
      Environment = "preprod"
      Zone        = "dev"
      WorkerType  = "ON_DEMAND"
    }
    additional_tags = {
      ExtraTag    = "m4-on-demand"
      Name        = "m4-on-demand"
      subnet_type = "private"
    }
    create_worker_security_group = true
  },
  mg_m5 = {...}
}

Fargate Profiles Example

enable_fargate = true

fargate_profiles = {
  default = {
    fargate_profile_name = "default"
    fargate_profile_namespaces = [{
      namespace = "default"
      k8s_labels = {
        Environment = "preprod"
        Zone        = "dev"
        env         = "fargate"
      }
    }]

    subnet_ids = [] # Provide list of private subnets

    additional_tags = {
      ExtraTag = "Fargate"
    }
  },
  finance = {...}
}

Bottlerocket OS

Bottlerocket is an open source operating system specifically designed for running containers. The Bottlerocket build system is based on Rust. It is a container host OS and does not include any additional software or package managers beyond what is needed for running containers, which makes it very lightweight and secure. Container-optimized operating systems are ideal when you need to run applications in Kubernetes with minimal setup, do not want to worry about security or updates, or want OS support from the cloud provider. Container operating systems apply updates transactionally.

Bottlerocket runs two separate container runtimes: one for orchestrated workloads and one for its host containers. The control host container (on by default) is used for AWS Systems Manager and remote API access; the admin host container (off by default) is used for deep debugging and exploration.

Bottlerocket launch template user data uses the TOML format with key-value pairs. Remote API access is available via the SSM agent. You can enable the troubleshooting (admin) container via user data by setting [settings.host-containers.admin] enabled = true, as sketched below.
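
A minimal sketch of that TOML passed from Terraform as a heredoc; the local name is illustrative, and where it is wired into the node group configuration depends on the module's launch template variables.

# Hypothetical sketch: Bottlerocket user data is TOML key-value pairs.
# Enabling the admin host container allows launching a troubleshooting shell later.
locals {
  bottlerocket_admin_userdata = <<-TOML
    [settings.host-containers.admin]
    enabled = true
  TOML
}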

Features

  • Secure - Opinionated, specialized and highly secured
  • Flexible - Multi-cloud and multi-orchestrator
  • Transactional - Image-based upgrades and rollbacks
  • Isolated - Separate container runtimes

Updates

Bottlerocket can be updated automatically via a Kubernetes operator:

    kubectl apply -f Bottlerocket_k8s.csv.yaml
    kubectl get ClusterServiceVersion Bottlerocket_k8s | jq '.status'

How to Deploy

Prerequisites:

Ensure that you have installed the following tools on your Mac or Windows laptop before working with this module and running Terraform plan and apply:

  1. aws cli
  2. aws-iam-authenticator
  3. kubectl
  4. terraform

Deployment Steps

The following steps walk you through the deployment of the example DEV cluster configuration. This config deploys a private EKS cluster with public and private subnets.

Two managed node groups, one with On-Demand instances and one with Spot instances, along with one Fargate profile for the default namespace, are placed in private subnets. An ALB is placed in the public subnets, created by the AWS LB Ingress Controller.

It also deploys a few Kubernetes add-ons, i.e., AWS LB Ingress Controller, Metrics Server, Cluster Autoscaler, aws-for-fluent-bit CloudWatch logging for managed node groups, FluentBit CloudWatch logging for Fargate, etc.

Provision VPC (optional) and EKS cluster with enabled Kubernetes Addons

Step1: Clone the repo using the command below

git clone https://github.com/aws-samples/aws-eks-accelerator-for-terraform.git

Step2: Update .tfvars file

Update the ~/aws-eks-accelerator-for-terraform/live/preprod/eu-west-1/application/dev/dev.tfvars file following the instructions specified in the file (or use the default values). You can choose to use an existing VPC ID and subnet IDs, or create a new VPC and subnets by providing CIDR ranges in the dev.tfvars file. An illustrative excerpt is sketched below.
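
Purely as an illustration (variable names are hypothetical; the real names and required values are documented inline in the file itself):

# Hypothetical dev.tfvars excerpt -- variable names are illustrative,
# follow the instructions inside the actual file.
create_vpc     = true            # set to false to reuse an existing VPC instead
vpc_cidr_block = "10.1.0.0/18"   # only needed when creating a new VPC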

Step3: Update Terraform backend config file

Update ~/aws-eks-accelerator-for-terraform/live/preprod/eu-west-1/application/dev/backend.conf with your local directory path or S3 path. The state.tf file contains the backend configuration.

Local terraform state backend config variables

    path = "local_tf_state/ekscluster/preprod/application/dev/terraform-main.tfstate"

It's highly recommended to use remote state in S3 instead of the local backend. The following variables need to be filled in for the S3 backend:

    bucket = "<s3 bucket name>"
    region = "<aws region>"
    key    = "ekscluster/preprod/application/dev/terraform-main.tfstate"

Step4: Assume an IAM role before creating the EKS cluster.

This role will become the Kubernetes admin by default. Please see this document for assuming a role. One way to do this from Terraform itself is sketched below.
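
A minimal sketch using the AWS provider's assume_role block, with a placeholder role ARN:

# Sketch: the identity that creates the cluster becomes the Kubernetes admin,
# so assume the intended admin role before running terraform apply.
provider "aws" {
  region = "eu-west-1"

  assume_role {
    role_arn     = "arn:aws:iam::111122223333:role/eks-cluster-admin" # placeholder ARN
    session_name = "eks-accelerator-deploy"
  }
}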

Step5: Run Terraform INIT

to initialize a working directory with configuration files

terraform init -backend-config deploy/live/preprod/eu-west-1/application/dev/backend.conf

Step6: Run Terraform PLAN

to verify the resources created by this execution

terraform plan -var-file deploy/live/preprod/eu-west-1/application/dev/dev.tfvars

Step7: Finally, Terraform APPLY

to create resources

terraform apply -var-file deploy/live/preprod/eu-west-1/application/dev/<env>.tfvars

Alternatively, you can use the Makefile to deploy, skipping Step5, Step6 and Step7.

Deploy EKS Cluster using Makefile

Executing Terraform PLAN

$ make tf-plan-eks env=<env> region=<region> account=<account> subenv=<subenv>
e.g.,
$ make tf-plan-eks env=preprod region=eu-west-1 account=application subenv=dev

Executing Terraform APPLY

$ make tf-apply-eks env=<env> region=<region> account=<account> subenv=<subenv>
e.g.,
$ make tf-apply-eks env=preprod region=eu-west-1 account=application subenv=dev

Executing Terraform DESTROY

$ make tf-destroy-eks env=<env> region=<region> account=<account> subenv=<subenv>
e.g.,
make tf-destroy-eks env=preprod region=eu-west-1 account=application subenv=dev

Configure kubectl and test cluster

The EKS Cluster details can be extracted from the Terraform output or from the AWS Console to get the name of the cluster. The following command is used to update the kubeconfig on your local machine, from where you run kubectl commands to interact with your EKS Cluster.

Step8: Run update-kubeconfig command.

The ~/.kube/config file gets updated with the cluster details and certificate by the command below:

$ aws eks --region eu-west-1 update-kubeconfig --name <cluster-name>

Step9: List all the worker nodes by running the command below

$ kubectl get nodes

Step10: List all the pods running in kube-system namespace

$ kubectl get pods -n kube-system

Deploying example templates

The examples folder contains multiple cluster templates with pre-populated .tfvars files which can be used as a quick start. Reuse the templates from examples and follow the deployment steps mentioned above.

Notes:

If you are using an existing VPC then you may need to ensure that the following tags are added to the VPC and subnet resources.

Add Tags to VPC

    Key = "Kubernetes.io/cluster/${local.cluster_name}"
    Value = "Shared"

Add Tags to Public Subnets

      public_subnet_tags = {
        "kubernetes.io/cluster/${local.cluster_name}" = "shared"
        "kubernetes.io/role/elb"                      = "1"
      }

Add Tags to Private Subnets

      private_subnet_tags = {
        "kubernetes.io/cluster/${local.cluster_name}" = "shared"
        "kubernetes.io/role/internal-elb"             = "1"
      }

Fully private EKS clusters require the following VPC endpoints to communicate with AWS services. This module will create these endpoints if you choose to create the VPC. If you are using an existing VPC then you may need to ensure these endpoints are created (a sketch of adding one follows the list).

com.amazonaws.region.aps-workspaces            - For AWS Managed Prometheus workspaces
com.amazonaws.region.ssm                       - For AWS Systems Manager
com.amazonaws.region.ec2
com.amazonaws.region.ecr.api
com.amazonaws.region.ecr.dkr
com.amazonaws.region.logs                      - For CloudWatch Logs
com.amazonaws.region.sts                       - If using AWS Fargate or IAM roles for service accounts
com.amazonaws.region.elasticloadbalancing      - If using Application Load Balancers
com.amazonaws.region.autoscaling               - If using Cluster Autoscaler
com.amazonaws.region.s3                        - Creates an S3 gateway endpoint
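
If the VPC is managed outside this module, each interface endpoint can be added with the aws_vpc_endpoint resource; a minimal sketch for the ECR API endpoint, where the variables for the existing VPC, subnets and endpoint security group are placeholders:

# Sketch: one interface endpoint per required service (here, the ECR API).
resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = var.vpc_id                       # existing VPC
  service_name        = "com.amazonaws.${var.aws_region}.ecr.api"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids           # one subnet per AZ
  security_group_ids  = [var.endpoint_security_group_id] # must allow 443 from the cluster
  private_dns_enabled = true
}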

Author

Created by Vara Bonthu. Maintained by Ulaganathan N, Jomcy Pappachen

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.