Test Stuck After Time Limit Finished

Hello all,
I have a test with three different scenarios: one is ‘ramping-vus’ and the other two are ‘ramping-arrival-rate’.
All of them were supposed to finish after about 2 hours of running, but for some reason one scenario got stuck at 96% forever.
These are the logs I am getting from k6:

running (2h57m53.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

running (2h57m54.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

running (2h57m55.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

running (2h57m56.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

running (2h57m57.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

running (2h57m58.3s), 0070/5261 VUs, 231025 complete and 4930 interrupted iterations
users_spawn   ✓ [ 100% ] 4992/5000 VUs  2h10m0s
messages_send ✓ [ 100% ] 000/084 VUs    2h0m25s              00.91 iters/s
spike_users   ✗ [  96% ] 000/177 VUs    1h53m52.4s/1h58m50s  004.47 iters/s

As you can see, ‘spike_users’ was supposed to finish after 1h58m50s, but the test has already been running for 2h57m58.3s. The elapsed time keeps going up while the scenario's progress does not change.

Any idea why this is happening?

Hmm, this might be a bug in k6. Can you please share some more details to help us diagnose it? For example:

  1. Which k6 version are you using and what is your OS?
  2. Do you use any xk6 extensions?
  3. Can you consistently reproduce this problem with a smaller script, e.g. fewer VUs and 2 minutes duration instead of 2 hours? Or was this a one-off bug?
  4. What were your script options, or at least the scenarios config specifically?
  5. Can you share a sanitized version of your script, or at least a general description of what the buggy scenario was doing? Even a general description of the protocols used (e.g. HTTP/gRPC/WebSockets/etc.) might be useful.

Thank you for your response @ned ,

  1. We use k6 from the master Docker image, which currently points at k6 version 0.40.0, and the image runs on the latest Ubuntu Linux.

  2. No xk6 extensions usage.

  3. It seems to happen more frequently in long-running tests. I was able to reproduce the problem in a 5-minute test with fewer VUs, but it didn’t happen on every run.

  4. We didn’t make many special adjustments to the options. We used this config:
    export const options = {tags, scenarios, thresholds, discardResponseBodies: false, systemTags: ['status', 'name', 'method']}

  5. The behavior of the problematic scenario itself might be causing issues that I was not aware of.
    The problematic scenario generates spikes of HTTP requests as well as spikes of VUs. Here is the configuration:
    { executor: 'ramping-arrival-rate', startRate: 0, timeUnit: '1s', stages: [{ target: 0, duration: '1m' }, { target: 350, duration: '5s' }, { target: 0, duration: '5s' }, { target: 0, duration: '1m' }], preAllocatedVUs: 5, maxVUs: 500, exec: 'Messages' }

These spikes repeat throughout the entire test (we used a for loop to create the stages).
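
For anyone trying to reproduce this, here is a rough sketch of how repeated spike stages like the ones above could be generated with a for loop. The post does not show the actual loop, so the helper names `buildSpikeStages` and `totalSeconds` and the cycle count of 3 are assumptions made for illustration. Only the stage values and the scenario fields come from the configuration quoted above.

```javascript
// Hypothetical helper: the real loop was not shared in the thread.
// Each cycle repeats the four stages from the posted configuration.
function buildSpikeStages(cycles) {
  const stages = [];
  for (let i = 0; i < cycles; i++) {
    stages.push(
      { target: 0, duration: '1m' },   // idle before the spike
      { target: 350, duration: '5s' }, // ramp up to 350 iterations per second
      { target: 0, duration: '5s' },   // ramp back down to zero
      { target: 0, duration: '1m' },   // idle after the spike
    );
  }
  return stages;
}

// Hypothetical helper: sums stage durations in seconds.
// It only handles the 's' and 'm' suffixes used in this sketch.
function totalSeconds(stages) {
  return stages.reduce((sum, stage) => {
    const value = parseInt(stage.duration, 10);
    return sum + (stage.duration.endsWith('m') ? value * 60 : value);
  }, 0);
}

// In a k6 script this object would be one entry of options.scenarios.
const spikeScenario = {
  executor: 'ramping-arrival-rate',
  startRate: 0,
  timeUnit: '1s',
  stages: buildSpikeStages(3), // 3 cycles is an arbitrary choice for a short repro
  preAllocatedVUs: 5,
  maxVUs: 500,
  exec: 'Messages',
};

console.log(spikeScenario.stages.length, totalSeconds(spikeScenario.stages));
```

Computing the total stage duration this way gives a number to compare against the planned duration that k6 prints next to the scenario's progress bar, which may help when shrinking the test to a few minutes.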