Running each pod on a separate EKS node (with node groups)

Hey everyone,

I’m using the k6 operator to run a distributed load test on EKS with AWS auto-scaling node groups.

I’ve set up the cluster autoscaler, and it does scale my nodes up if I set something like this in my k6 CRD:

resources:
  limits:
    cpu: 600m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 1Gi

However, my goal is for every pod in my load test to run on a separate node (i.e. I’d like my node group to autoscale to match the parallelism set in the CRD).

I tried doing this with pod anti-affinity rules (my attempt is commented out in the CRD below) and couldn’t get it to work.

I then tried using separate: true as outlined in the operator documentation, and I’m seeing some strange behaviour.

For example, with parallelism set to 10 and an EKS node group with a minimum of 2 and a maximum of 10 nodes, setting separate: true creates only one additional node, giving me 3 nodes in total, and all the other pods remain in a Pending state.

If I cancel and run it again, the same thing happens: I get one more node, for a total of 4, and the rest of the pods stay Pending.

Any idea why this is happening? I’d appreciate any help.

Here’s my CRD file:


apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
  labels:
    app: load
spec:
  parallelism: 10
  script:
    configMap:
      name: "crocodile-stress-test"
      file: "test.js"
  separate: true
  arguments: --out statsd
  runner:
    metadata:
      labels:
        app: load
    # resources:
    #   limits:
    #     cpu: 600m
    #     memory: 1Gi
    #   requests:
    #     cpu: 100m
    #     memory: 1Gi
    env:
      - name: K6_STATSD_ADDR
        value: "statsd-service:8125"
    # affinity:
    #   podAntiAffinity:
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #       - labelSelector:
    #           matchExpressions:
    #             - key: app
    #               operator: In
    #               values:
    #                 - load
    #         topologyKey: kubernetes.io/hostname

Hi @elguaposalsero,
Welcome to the forum :wave:

It sounds like there’s an issue with your EKS or cluster-autoscaler setup. separate: true should have been enough to allocate additional nodes in the scenario you described. I’d recommend trying to find out whether cluster-autoscaler is healthy and exactly what reason is given for the FailedScheduling events on the pending pods.
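
For example, something along these lines should show both the scheduling failure reason and the autoscaler’s side of the story (assuming your runner pods carry the app: load label from your CRD and that cluster-autoscaler runs as a deployment named cluster-autoscaler in kube-system; adjust names and namespaces to your setup):

# Why are the runner pods stuck in Pending? Look at the Events section for FailedScheduling.
kubectl describe pod -l app=load

# Or list only the scheduling failures in the namespace.
kubectl get events --field-selector reason=FailedScheduling

# Is cluster-autoscaler healthy, and why isn't it scaling up?
kubectl -n kube-system logs deployment/cluster-autoscaler --tail=100

# The autoscaler also writes a status ConfigMap (enabled by default) with per-node-group details.
kubectl -n kube-system describe configmap cluster-autoscaler-status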

Checking whether there are any known issues with your specific versions of EKS and cluster-autoscaler might also help.
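
If it helps, you can see which versions you’re actually running with something like this (again, the deployment name is an assumption; adjust it to match your install):

# Server version tells you the EKS Kubernetes version.
kubectl version

# Image tag of the cluster-autoscaler deployment.
kubectl -n kube-system get deployment cluster-autoscaler -o jsonpath='{.spec.template.spec.containers[0].image}'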

Hope that helps!
