There are many ways to create a Kubernetes cluster on AWS. It's important to understand your options so you can recognise the benefits and drawbacks of each. In this post I make the case for the method I recommend: EKSCTL.
Methods to create Kubernetes clusters
First, let's go over the options available to help create and manage Kubernetes clusters on AWS:
- Set up your own self-managed Kubernetes cluster from scratch
- Use a tool called KOPS to help you create & manage a self-managed cluster on AWS
- Use the AWS Console to create & manage an AWS-managed cluster
- Use a Terraform module to create & manage an AWS-managed cluster
- Use an Ansible module to create & manage an AWS-managed cluster
- Use a Pulumi module to create & manage an AWS-managed cluster
- Use EKSCTL, a tool co-authored by AWS and Weaveworks, to create & manage an AWS-managed cluster
- Likely many others as well...
Now let's discuss these options and their benefits and drawbacks.
If you go with either of the self-managed options (the first two), you and your team are in for a lot of work and a lot of unnecessary tech debt. With a self-managed Kubernetes cluster you run and support the "master" nodes yourself, so you are responsible for more components and for their failover and redundancy. (For more information about master Kubernetes nodes and managing your own cluster, read a step-by-step tutorial for Kubernetes implementation.)
One of the core tenets of successful DevOps is not taking on unnecessary responsibility. With that in mind, we won't consider the first two options any further. However, I think it's important for any fellow professional to know what KOPS is, as it was the way to deploy Kubernetes before Amazon created EKS.
Next are all the methods to spin up an AWS-managed EKS cluster.
This includes doing so manually via the AWS Console, or automatically with various tools. The model I typically follow for my clients is to create everything via Terraform (e.g. VPC, subnets, NAT Gateway) except for EKS, for which I use EKSCTL, the tool Amazon helped co-author. For good DevOps practice you generally never want anything set up manually, and because EKSCTL only configures EKS, you need some other framework to manage the infrastructure surrounding EKS (e.g. the VPC).
However, if you're just starting out or only want to test EKS, you might prefer to let EKSCTL create your VPC for you as well, which is what we'll do in the guide below!
It's important to understand why I have chosen to use and recommend EKSCTL over any other option.
EKS is a fairly complex beast to set up, and that setup changes over time. It is the one thing I do not recommend setting up and managing via Terraform; instead, use EKSCTL, the tool co-authored and recommended by AWS.
For longer-term usage and support of EKS, I have encountered just about every issue you could imagine with every one of the methods to create Kubernetes clusters mentioned above. These issues usually only surface after you've been using a cluster for a while and need to perform version updates and/or node upgrades with zero downtime.
The biggest problem with the other methods is that they don't manage the lifecycle of EKS. They don't "speak" EKS, and they don't understand how to manage and maintain an EKS cluster with zero downtime over time.
I've successfully set up, maintained and upgraded more than 30 EKS clusters via EKSCTL over the last 3 years, and during/before that time I have struggled with supporting yet another 20+ EKS clusters via either the AWS Console directly, or via Terraform.
Largely, what happens a year down the road is that you need to upgrade EKS. With EKSCTL, that is just a series of commands that perform zero-downtime rollouts of new node groups. EKSCTL is better in this regard because it "speaks" Kubernetes and manages the lifecycle of events. When you destroy a node group in Terraform, it simply destroys the autoscaling group and kills the underlying nodes. Pods are suddenly forced off their nodes, which can cause downtime because the termination is unplanned.
With EKSCTL, when you want to remove a node group, it goes into Kubernetes and cordons and drains the nodes one at a time (which prepares each node for maintenance or removal) until all nodes in that autoscaling group are drained. Only then does it remove the nodes and the autoscaling group. This graceful, incremental removal of nodes helps reduce or even eliminate downtime (as long as you've configured your pods for scalability and redundancy).
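As a rough sketch of that rollout, assuming the basic-cluster.yaml config file used later in this post and two hypothetical node group names (ng-1 as the old group, ng-1-v2 as a replacement already added to the config file), the sequence might look like the following. The function name is illustrative and not part of EKSCTL; the explicit drain step is there to make progress visible, since deleting a node group also drains it by default.

```shell
# Sketch: replace an old node group with a new one, gracefully.
# Assumes the new node group is already defined in the config file.
replace_nodegroup() {
  local cluster="$1" config="$2" old_ng="$3" new_ng="$4"

  # 1. Create only the new node group from the config file.
  eksctl create nodegroup --config-file="$config" --include="$new_ng" || return 1

  # 2. Cordon and drain the old node group so pods move to the new nodes.
  eksctl drain nodegroup --cluster="$cluster" --name="$old_ng" || return 1

  # 3. Delete the old node group (and its autoscaling group) once drained.
  eksctl delete nodegroup --cluster="$cluster" --name="$old_ng"
}
```

It would be called as `replace_nodegroup basic-cluster basic-cluster.yaml ng-1 ng-1-v2`. Stopping on the first failure means the old nodes stay in place if the new group cannot be created.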
Amazon has a feature called Managed Node Groups that handles this incremental node replacement strategy for you. You are welcome to use it (and it is an adequate tool for some engineers), but I personally don't recommend it.
When you push this concern off to Amazon, you can find the removal of node groups getting "stuck", typically because of a Pod Disruption Budget (PDB). AWS Managed Node Groups are a "black box": you don't really get feedback from them. EKSCTL effectively has the same feature set built in, without being a black box.
When you use EKSCTL to manage this, you get full information and configurability regarding what is happening. If there are any hiccups, or if your removal of node groups gets "stuck", you'll know right away and can manage the situation yourself to help ensure minimal or zero downtime during the process. For example, you can scale up a service so that new replicas land on other nodes before removing the pod(s) on the node you are trying to delete. This is not possible when you let AWS manage the node group replacement.
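A minimal sketch of that kind of manual intervention, assuming a hypothetical Deployment name and namespace passed in as arguments, might look like this. The function name is illustrative only; the kubectl subcommands are standard.

```shell
# Sketch: find what may be blocking a drain, then add replicas elsewhere.
unblock_drain() {
  local namespace="$1" deployment="$2" replicas="$3"

  # List Pod Disruption Budgets; a budget that allows 0 disruptions
  # will keep a drain waiting.
  kubectl get pdb --all-namespaces

  # Scale the service up. Cordoned nodes accept no new pods, so the
  # extra replicas land on nodes that are staying.
  kubectl scale deployment "$deployment" --namespace "$namespace" --replicas="$replicas"

  # Wait until the extra replicas are ready before the drain continues.
  kubectl rollout status deployment "$deployment" --namespace "$namespace"
}
```

For example, `unblock_drain default my-api 6` would scale a hypothetical my-api Deployment to 6 replicas and wait for them to become ready.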
Another reason I prefer EKSCTL over Amazon's Managed Node Groups is that for non-production clusters, where you can tolerate downtime, you can tell EKSCTL to replace node groups aggressively. That often takes mere seconds to replace potentially all the nodes and node groups in your cluster, versus often 30+ minutes with AWS Managed Node Groups, which replace nodes gracefully and (last I checked) offer no configurability over how aggressive they are.
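One way to express the aggressive variant, as a sketch only: it assumes the `--drain` flag of `eksctl delete nodegroup` behaves as it does in the EKSCTL versions I know of (confirm with `eksctl delete nodegroup --help`), and the function name is hypothetical.

```shell
# Sketch: delete a node group without cordoning and draining it first.
# Only for clusters where downtime is acceptable.
delete_nodegroup_fast() {
  local cluster="$1" nodegroup="$2"

  # --drain=false skips the graceful eviction that eksctl does by default.
  eksctl delete nodegroup --cluster="$cluster" --name="$nodegroup" --drain=false
}
```

Called as `delete_nodegroup_fast basic-cluster ng-1`, this removes the nodes without waiting for pods to be evicted, so running workloads are interrupted.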
- It's important to note that you can still use EKSCTL with AWS Managed Node Groups; it is just not my preference.
- If you'd like to consider using AWS's self-managed nodes, read the documentation from AWS on Launching self-managed Linux Worker Nodes and Launching self-managed Windows Worker Nodes.
- If you're getting into Windows on EKS, it's important to understand its nuances and limitations; this page gets updated regularly as new features are released/improved.
- Amazon also announced in December 2022 that you can now use Windows in Amazon's Managed Node Groups.
Get started using EKSCTL to create your cluster
Now that you're hopefully convinced to use EKSCTL, let's dive in and try it out.
Requirements / Setup
First, we'll need to get your computer ready to use this tool. To do this, we'll need to install EKSCTL, the AWS CLI, the Kubectl CLI, and configure your CLI to have access to your AWS account (assuming that you have an AWS account already).
- Install EKSCTL
- Install AWS CLI
- Install Kubectl CLI
- Configure your AWS CLI to have access to your AWS Account
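Once everything is installed, a quick check along these lines can confirm that the three CLIs are on your PATH. This is a small sketch; the check_tools helper is a hypothetical name, not part of any of these tools.

```shell
# Sketch: confirm the required CLIs are on the PATH before continuing.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status if at least one tool was not found.
  return "$missing"
}

check_tools eksctl aws kubectl || echo "Install the missing tools before continuing."
```

To confirm that the AWS CLI can actually reach your account, `aws sts get-caller-identity` should print the account and identity you are authenticated as.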
Creating an EKS cluster and a VPC
When using EKSCTL, you really have two options: you can either pre-create a VPC (or use an existing one), or you can let EKSCTL create the VPC for you.
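For reference, the existing-VPC option could look roughly like the sketch below. The cluster name, file name, VPC ID, and subnet IDs are all placeholders rather than real resources, and the example assumes private subnets in two availability zones.

```shell
# Sketch: write an EKSCTL config that reuses an existing VPC.
# Every ID below is a placeholder; substitute your own VPC and subnets.
cat > existing-vpc-cluster.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: existing-vpc-cluster
  region: us-east-1
vpc:
  id: "vpc-REPLACE_ME"
  subnets:
    private:
      us-east-1a:
        id: "subnet-REPLACE_ME_A"
      us-east-1b:
        id: "subnet-REPLACE_ME_B"
nodeGroups:
  - name: ng-1
    availabilityZones: ["us-east-1a"]
    instanceType: m5.large
    desiredCapacity: 2
    privateNetworking: true  # place the nodes in the private subnets
EOF
```

The VPC itself would be created beforehand by whatever manages the rest of your infrastructure, such as Terraform.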
For simplicity, in this post we will let EKSCTL create a VPC for us. Using your favorite editor, create a file named basic-cluster.yaml with the following contents:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: basic-cluster
  region: us-east-1
  version: "1.23"
iam:
  withOIDC: true
nodeGroups:
  - name: ng-1
    availabilityZones: ["us-east-1a"]
    instanceType: m5.large
    desiredCapacity: 10
    volumeSize: 50
  - name: ng-2
    availabilityZones: ["us-east-1b"]
    instanceType: m5.xlarge
    desiredCapacity: 2
    volumeSize: 50
Of course, change the region, availability zones, cluster name, and version if desired. Save the file, then run the command:
eksctl create cluster --config-file basic-cluster.yaml
After you run this command, EKSCTL begins provisioning the infrastructure described in your "infrastructure as code" config file. It does this by creating CloudFormation stacks based on the configuration file. Once the cluster has been provisioned, it provisions the node groups. When the node groups are finished, it sets your kubectl current context to point at the new cluster and exits.
At this point you should be able to query your Kubernetes cluster with the command:
kubectl get nodes
If this works, congratulations! You've bootstrapped your own Kubernetes cluster on AWS via EKSCTL.
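For a slightly fuller check, a sketch like the following lists what EKSCTL created and confirms which cluster kubectl is pointed at. The verify_cluster function is a hypothetical helper; the cluster name and region are the ones from basic-cluster.yaml.

```shell
# Sketch: confirm the cluster, its node groups, and the kubectl context.
verify_cluster() {
  local cluster="$1" region="$2"

  # Clusters and node groups as EKSCTL sees them.
  eksctl get cluster --region "$region"
  eksctl get nodegroup --cluster "$cluster" --region "$region"

  # The context kubectl is currently using, then the nodes themselves.
  kubectl config current-context
  kubectl get nodes -o wide
}
```

Run it as `verify_cluster basic-cluster us-east-1`. If kubectl is pointed at a different cluster, `aws eks update-kubeconfig --name basic-cluster --region us-east-1` should write a fresh context for this one.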
Notes and nuances
It's important for me to convey certain best practices to you:
- Each node group should always span only a single AZ. There is no strict requirement for this, but if you don't follow it, the cluster-autoscaler may "miss" when it scales up nodes for workloads that use AZ-specific resources (such as volumes, a.k.a. disks). Say, for example, you start a stateful set such as PostgreSQL via a Helm chart and you want it to be persistent. It needs to create a volume, and volumes in AWS are bound to a single AZ. On first provision this works properly: the volume is created in whichever AZ the pod happens to be assigned to. However, if that node dies or you need to restart the service, and there is no capacity left on the nodes in that AZ, the autoscaler will try to scale up. Because of how the cluster autoscaler and AWS's autoscaling groups work, you can't specify which AZ the new instance launches into. If your node group spans 2 AZs, you have a 50% chance that the instance launches in the wrong AZ. With 3 AZs it's a 67% chance, with 4 AZs a 75% chance, and so on. If you want a more highly available and faster-to-scale system, always stick to one AZ per node group.
- It's also critical that if you specify a "group" of instance types in one node group, you ensure they all have the same or similar resources. This strategy is REALLY useful with spot instance node groups. For example, the instance types ["c5a.2xlarge","c5ad.2xlarge","c5d.2xlarge","c5n.2xlarge","c5.2xlarge"] would work great because all of them have the same CPU count and similar RAM. However, it would be really bad if you instead used ["c5a.large","c5a.xlarge","c5d.2xlarge"] because, similar to the gotcha above, the cluster autoscaler cannot "choose" which of these to launch, and it won't know exactly what it is provisioning. This means, same as above, your launch may "miss". If, for example, you deploy a service that needs 6 CPUs and no current node has that many available, it will trigger your autoscaling group to add a node. However, with the configuration ["c5a.large","c5a.xlarge","c5d.2xlarge"] you have (at best) a 33% chance of launching an instance with enough CPUs, since c5a.xlarge has only 4 CPUs and c5a.large has only 2. If you want to support many different instance types, they either need to have the same spec within one node group, or each needs its own node group. If you give them their own node groups, the cluster-autoscaler's scaling strategy, also called an Expander, can intelligently choose whichever node group it needs to handle your request.
- Here's an excerpt from the AWS EKS Best Practices Guide:
It’s critical that all Instance Types have similar resource capacity when configuring Mixed Instance Policies. The autoscaler’s scheduling simulator uses the first InstanceType in the MixedInstancePolicy. If subsequent Instance Types are larger, resources may be wasted after a scale up. If smaller, your pods may fail to schedule on the new instances due to insufficient capacity. For example, M4, M5, M5a, and M5n instances all have similar amounts of CPU and Memory and are great candidates for a MixedInstancePolicy. The EC2 Instance Selector tool can help you identify similar instance types.
- The lesson I would have you take from the above is that AWS won't stop you from doing really bad things (such as using a non-private CIDR range in your VPCs). I highly recommend that every engineer read and understand every part of the EKS Best Practices Guide before setting up and managing Kubernetes on AWS.
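The two node group recommendations above (one AZ per node group, and same-sized instance types within a group) could be combined as in the sketch below. The node group names, file name, and sizes are assumptions for illustration; the instance types are the same-sized list from the example above, and the instancesDistribution fields follow the EKSCTL config schema as I understand it, so check them against your EKSCTL version.

```shell
# Sketch: one spot node group per AZ, each using same-sized instance types.
# Names and capacities are illustrative assumptions.
cat > per-az-nodegroups.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: basic-cluster
  region: us-east-1
nodeGroups:
  - name: spot-us-east-1a
    availabilityZones: ["us-east-1a"]   # exactly one AZ per node group
    minSize: 1
    maxSize: 10
    instancesDistribution:
      instanceTypes: ["c5a.2xlarge", "c5ad.2xlarge", "c5d.2xlarge", "c5n.2xlarge", "c5.2xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0   # 100% spot
      spotInstancePools: 5
  - name: spot-us-east-1b
    availabilityZones: ["us-east-1b"]
    minSize: 1
    maxSize: 10
    instancesDistribution:
      instanceTypes: ["c5a.2xlarge", "c5ad.2xlarge", "c5d.2xlarge", "c5n.2xlarge", "c5.2xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 5
EOF
```

Because the metadata matches the cluster created earlier, these groups could be added to it with `eksctl create nodegroup --config-file=per-az-nodegroups.yaml`. With one group per AZ, the autoscaler can scale the specific group whose AZ holds the volume a pod needs.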
Now that you've got EKS deployed, you might want to know how easy it is to upgrade Kubernetes in AWS EKS with zero downtime. I've got you covered in my next article.
Additionally, for real-world, useful and detailed EKSCTL configuration files to make your adoption easier, I recommend viewing our companion GitHub repository with various EKSCTL example configurations. These cover much more advanced scenarios: spot instances, Windows nodes, best practices, etc.