Problem Description
When efaEnabled: true is set on a nodegroup, eksctl creates an additional security group for EFA communication and tags it with kubernetes.io/cluster/<cluster-name>: owned. This conflicts with the AWS Load Balancer Controller's target group binding logic, which expects exactly one security group carrying this tag per ENI.
Error Message
Warning FailedNetworkReconcile 33s (xxxxx over 2d1h) targetGroupBinding
expected exactly one securityGroup tagged with kubernetes.io/cluster/kreks for eni eni-xxxxxxx,
got: [sg-xxxxxxxxx sg-xxxxxxxxx] (clusterName: xxxxx)
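To see which security groups the controller is finding, the node's ENI can be inspected directly. This is only a sketch using the AWS CLI; the ENI ID is the placeholder from the error message above.

# List the security groups attached to the ENI named in the error
aws ec2 describe-network-interfaces \
  --network-interface-ids eni-xxxxxxx \
  --query 'NetworkInterfaces[0].Groups[].[GroupId,GroupName]' \
  --output table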
What were you trying to accomplish?
Create an EFA-enabled EKS nodegroup that can also use Network Load Balancer (NLB) services without conflicts.
What happened?
- Created a nodegroup with efaEnabled: true
- eksctl created two security groups, both tagged with kubernetes.io/cluster/<cluster-name>: owned:
  - The original nodegroup security group
  - An additional EFA-specific security group
- When creating NLB services, the AWS Load Balancer Controller fails because it finds multiple security groups with the cluster ownership tag (https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/pkg/networking/networking_manager.go#L571)
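The duplicate ownership tag can also be verified independently of the controller. A minimal sketch with the AWS CLI, assuming the cluster name test-cluster from the config below:

# Both the nodegroup SG and the EFA SG show up with the ownership tag
aws ec2 describe-security-groups \
  --filters "Name=tag:kubernetes.io/cluster/test-cluster,Values=owned" \
  --query 'SecurityGroups[].[GroupId,GroupName]' \
  --output table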
How to reproduce it?
Config File
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: test-cluster
  region: us-west-2
nodeGroups:
  - name: efa-workers
    instanceType: c5n.18xlarge
    minSize: 1
    maxSize: 3
    availabilityZones: ["us-west-2a"]
    efaEnabled: true
Steps
- Create cluster with EFA-enabled nodegroup:
eksctl create cluster -f config.yaml
- Deploy a service with NLB:
apiVersion: v1
kind: Service
metadata:
  name: test-nlb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: test-app
- Observe the FailedNetworkReconcile error in the AWS Load Balancer Controller events/logs (see the sketch after this list)
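One way to surface the error from step 3; a hedged sketch that assumes the controller runs as the default deployment name aws-load-balancer-controller in kube-system:

# The warning is emitted as an event on the TargetGroupBinding
kubectl get events -A --field-selector reason=FailedNetworkReconcile

# Controller logs (deployment name assumed from a default Helm install)
kubectl -n kube-system logs deploy/aws-load-balancer-controller | grep "expected exactly one securityGroup"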
Logs
Anything else we need to know?
A possible solution is to create a new configuration option to control EFA security group tagging:
nodeGroups:
  - name: efa-workers
    efaEnabled: true
    efaSecurityGroupTagging:
      clusterOwnership: "shared" # or "owned", "none"
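Until an option like that exists, a possible interim workaround is to strip the ownership tag from the EFA security group by hand. This is only a sketch and an assumption on my part (not verified against eksctl's stack management); the sg-xxxxxxxxx placeholder is the EFA security group ID, and the tag may be re-added if the nodegroup stack is updated:

# Remove the cluster ownership tag from the EFA security group (untested assumption)
aws ec2 delete-tags \
  --resources sg-xxxxxxxxx \
  --tags Key=kubernetes.io/cluster/test-cluster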
Versions
$ eksctl info
eksctl version: 0.212.0
kubectl version: v1.33.3
OS: darwin