Incorrect counter rates in per-scenario reports with multiple scenarios

Hello @all,

So I’m using the hints from [1] on how to obtain per-scenario reports (via thresholds).

To my untrained eyes everything looks fine, except for the counter metrics (like iterations or data_sent / data_received), for which I’ve set a count>=0 threshold.
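
For context, the trick from [1] boils down to declaring dummy thresholds on the tagged sub-metrics so that k6 tracks and prints them per scenario. A shortened sketch of how mine look (the real list also covers the custom trend metrics visible in the output below):

// Shortened sketch: dummy thresholds that never fail, but make k6 emit the
// per-scenario sub-metric lines in the end-of-test summary.
export const options = {
    scenarios: { /* rate1 and rate2, see [2] below */ },
    thresholds: {
        'iterations{scenario:rate1}': ['count>=0'],
        'iterations{scenario:rate2}': ['count>=0'],
        'data_sent{scenario:rate1}': ['count>=0'],
        'data_sent{scenario:rate2}': ['count>=0'],
        'data_received{scenario:rate1}': ['count>=0'],
        'data_received{scenario:rate2}': ['count>=0'],
        'checks{scenario:rate1}': ['rate>=0'],
        'checks{scenario:rate2}': ['rate>=0'],
    },
};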

For example:

$ kubectl logs --follow k6-xxx-test-1--1-gjnmz

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /test/main.js
     output: -

  scenarios: (100.00%) 2 scenarios, 100 max VUs, 1m0s max duration (incl. graceful stop):
           * rate1: 0.25 iterations/s for 30s (maxVUs: 10-100)
           * rate2: 1.00 iterations/s for 30s (maxVUs: 10-100, startTime: 30s)


     ✓ POST /api/v1/login returned 200
     ✓ POST /api/v1/login returned sessid
...
     ✓ DELETE /api/v1/mails/{mailId} returned 200
     ✓ DELETE /api/v1/folders/{folderId} returned 200

     DELETE /api/v1/folders/<folderId> (full)..................: avg=12.08ms  min=3.87ms   med=13.21ms  max=15.19ms  p(90)=14.08ms  p(95)=14.35ms  p(99)=14.89ms  count=68
     ✓ { scenario:rate1 }......................................: avg=12.97ms  min=3.87ms   med=13.37ms  max=15.19ms  p(90)=14.49ms  p(95)=14.9ms   p(99)=15.14ms  count=14
     ✓ { scenario:rate2 }......................................: avg=11.85ms  min=3.99ms   med=13.1ms   max=14.63ms  p(90)=14.04ms  p(95)=14.15ms  p(99)=14.53ms  count=54
...
     PUT /api/v1/drafts/<mailId> (waiting).....................: avg=18.47ms  min=7.88ms   med=17.86ms  max=79.76ms  p(90)=24.18ms  p(95)=26.91ms  p(99)=62.78ms  count=35
     ✓ { scenario:rate1 }......................................: avg=17.57ms  min=8.17ms   med=18.21ms  max=23.47ms  p(90)=20.52ms  p(95)=22ms     p(99)=23.18ms  count=7
     ✓ { scenario:rate2 }......................................: avg=18.7ms   min=7.88ms   med=17.78ms  max=79.76ms  p(90)=24.96ms  p(95)=28.38ms  p(99)=66.28ms  count=28
     checks....................................................: 97.12% ✓ 1284      ✗ 38
     ✓ { scenario:rate1 }......................................: 97.06% ✓ 265       ✗ 8
     ✓ { scenario:rate2 }......................................: 97.14% ✓ 1019      ✗ 30
     data_received.............................................: 4.3 MB 72 kB/s
     ✓ { scenario:rate1 }......................................: 898 kB 15 kB/s
     ✓ { scenario:rate2 }......................................: 3.4 MB 57 kB/s
     data_sent.................................................: 812 kB 14 kB/s
     ✓ { scenario:rate1 }......................................: 168 kB 2.8 kB/s
     ✓ { scenario:rate2 }......................................: 644 kB 11 kB/s
...
     iterations................................................: 34     0.566518/s
     ✓ { scenario:rate1 }......................................: 7      0.116636/s
     ✓ { scenario:rate2 }......................................: 27     0.449882/s
     vus.......................................................: 10     min=0       max=10
     vus_max...................................................: 10     min=10      max=10

Is this a known issue / a side effect of the unusual method of obtaining separate reports per scenario?

Just to summarize: the wrong values are the rates reported for the counter metrics in the “per scenario” rows, where the count is divided by the entire test duration instead of the scenario duration.

For example: iterations for scenario rate2 is reported as 0.449882/s (~ 27/60) instead of 0.9/s (= 27/30).

In [2] you can see how the scenarios were set up for the above report.

From the Image ID sha256 value it seems I’m running with runner-v0.0.7rc2.

Could anybody provide any hints here?

Thx,
Ioan

[1] Multiple scenarios & metrics per each - #3 by ned

[2] test scenarios

          rate1: {
            executor: 'constant-arrival-rate',
            rate: 1,
            timeUnit: '4s',
            duration: '30s',
            preAllocatedVUs: 10,
            maxVUs: 100,
            gracefulStop: '0s',
            startTime: '0s',
          },
          rate2: {
            executor: 'constant-arrival-rate',
            rate: 1,
            timeUnit: '1s',
            duration: '30s',
            preAllocatedVUs: 10,
            maxVUs: 100,
            gracefulStop: '0s',
            startTime: '30s',
          },

Hi @indreias

Yes, you’re right: those rates are calculated over the whole test, not per scenario, and that is the expected behavior right now. I personally agree that this bit of the summary is a bit counter-intuitive :sweat_smile: but it’s not something that can easily be changed in k6’s processing in the near future.

But if needed, you can calculate the correct rate yourself and then use the handleSummary callback to output the new rate; the handleSummary documentation describes the callback.

The documentation example imports one of the k6 jslibs (the summary helper), and you can look at its full code as well.

The data parameter passed to handleSummary contains the metrics, and in your case among those metrics there will be something like this:

   "iterations{scenario:rate1}": {
      "contains": "default",
      "values": {
        "count": 8,
        "rate": 0.13332495024266028
      },
      "thresholds": {
        "count>=0": {
          "ok": true
        }
      },
      "type": "counter"
    },

Given this, we can re-calculate the rate to a new value adjusted per scenario:

// textSummary is the helper used by the default end-of-test summary (k6-summary jslib)
import { textSummary } from 'https://jslib.k6.io/k6-summary/0.0.1/index.js';

export function handleSummary(data) {
    // `options` is the exported options object of this script
    for (let key in options.scenarios) {
        // parseInt is smart enough to drop the trailing 's'; if the duration is not
        // expressed in seconds, it must additionally be converted to seconds
        let duration = parseInt(options.scenarios[key].duration);
        // calculate the new rate using the scenario duration
        let ratePerScenario = data.metrics[`iterations{scenario:${key}}`].values.count / duration;
        // set the new rate in the main data object so the standard summary output below picks it up
        data.metrics[`iterations{scenario:${key}}`].values.rate = ratePerScenario;
    }
    return {
        'stdout': textSummary(data, { indent: ' ', enableColors: true }),
    };
}

This workaround may not be as straightforward with another executor though :slight_smile:
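
For instance, with a stage-based executor such as ramping-vus or ramping-arrival-rate there is no single duration field, so you would have to derive the duration from the stages yourself. A rough sketch of that (assuming every stage duration is expressed in seconds):

// Rough sketch: derive a scenario's duration whether it uses a single
// `duration` or a list of `stages` (ramping executors). Assumes every
// duration string is given in seconds, e.g. '30s'.
function scenarioDurationSeconds(scenario) {
    if (scenario.duration) {
        return parseInt(scenario.duration);
    }
    if (scenario.stages) {
        return scenario.stages.reduce((total, stage) => total + parseInt(stage.duration), 0);
    }
    return null; // unknown shape: fall back to the default whole-test rate
}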

Hi @olha

Many thx for taking the time to check my report and for proposing the above workaround. I have just tried it and it works fine.

For example, when testing with 2 scenarios of 300 sec each, the reported rates are the expected ones, like:

     iterations.........................................................: 223    0.371658/s
     ✓ { scenario:rate1 }...............................................: 74     0.246667/s
     ✓ { scenario:rate2 }...............................................: 149    0.496667/s

Now I’m somewhat worried that the scenario duration could be affected by the gracefulStop value, so I’d like to ask whether there are any other means to automatically obtain (or even compute and store in another metric) the real duration of a specific scenario.

For example, I may have a scenario with a duration of 300 sec and a gracefulStop of 60 sec but, when the scenario gets executed, the real duration could be 320 sec (and not 360).

I have to admit that I’m a beginner with k6 and maybe my question above is a silly one :innocent:

Thx,
Ioan

Hi @indreias,

I think that’s a very interesting question! You could get the real end time, including gracefulStop, with teardown (see the teardown docs for reference). The teardown function runs after the test finishes, so taking, say, the current time in teardown would capture the real gracefulStop time with only a very minor additional overhead. But again, that’s a global time for the whole test.
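
A minimal sketch of that idea (using setup() only to capture the start timestamp; the console output is just for illustration):

// Sketch: measure the wall-clock duration of the whole test run.
// setup() runs once before any scenario starts; teardown() runs once
// after all scenarios, i.e. after any gracefulStop has elapsed.
export function setup() {
    return { startedAt: Date.now() };
}

export function teardown(data) {
    const totalSeconds = (Date.now() - data.startedAt) / 1000;
    console.log(`whole test ran for ~${totalSeconds}s (incl. gracefulStop)`);
}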

For scenarios, it might also be useful to use k6’s new and shiny execution API, specifically exec.scenario.startTime, but sadly there is no way to get a corresponding stopTime.
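
Just to illustrate the startTime part, a minimal sketch (the logging is only for demonstration; in a real script you would do your requests in the default function):

import exec from 'k6/execution';

export default function () {
    // exec.scenario.startTime is the Unix timestamp (in ms) at which the
    // currently running scenario actually started
    const runningFor = (Date.now() - exec.scenario.startTime) / 1000;
    console.log(`${exec.scenario.name} has been running for ~${runningFor}s`);
}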

So to sum up: if you have one scenario and need to know its real duration, then monitoring time via exec.scenario.startTime and teardown should be enough. If there is more than one scenario, then some cases might still work with a combination of teardown and exec.scenario.startTime, e.g. if gracefulStop is only set on the last scenario and the scenarios run strictly sequentially, as in your previous example.

I don’t think there is anything else that can help with measuring gracefulStop at the moment… We actually have some work in progress and future plans to improve similar cases, but that’s still some time ahead.

Hope that helps :slight_smile: