Why is RPS low even when VUs are above 100+?

Hi everyone, I have a scenario where I have 500 VUs (at most) requesting to download a specific object from MinIO. The API I created essentially creates a new folder for every request and downloads a ~70 MB zip file from MinIO. This is to mimic the real world, where multiple users try to download the same content. My test script essentially sends the same object path to the API, which then downloads the same zip file from MinIO. Here is my test script:

import http from 'k6/http';
import { tagWithCurrentStageIndex } from 'https://jslib.k6.io/k6-utils/1.3.0/index.js';
import { sleep, check } from 'k6';

export const options = {
  discardResponseBodies: true,
  scenarios: {
    load: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '5m', target: 50 },
        { duration: '5m', target: 200 },
        { duration: '5m', target: 350 },
        { duration: '5m', target: 500 },
        { duration: '5m', target: 0 },
      ],
      tags: { test_type: 'load' },
      gracefulRampDown: '5m',
    },
  },
  thresholds: {
    'http_req_duration{test_type:load}': ['p(99)<180000'],
  },
};

export default async function () {
  tagWithCurrentStageIndex();
  const url = 'http://localhost:4000/api/s3/object/download';

  const data = {
    s3Path: 'AE-TEST/0234/test.zip',
  };
  const headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
  };
  await http.post(url, data, {
    headers,
    timeout: '240s',
  });
  sleep(1);
}

For some reason, the RPS is only around 1-5 at most, even when the number of VUs is 100+. I’m also using InfluxDB and Grafana to plot my results, and recently I started getting these warnings:

The flush operation took higher than the expected set push interval. If you see this message multiple times then the setup or configuration need to be adjusted to achieve a sustainable rate.  output=InfluxDBv1 t=1.015443s

I’m not sure if this affects my results in Grafana, but it’s been a bother. Here are the results I get:

I could also share the code regarding the API, but I wanted to see if it had to do with something regarding my script. Thanks in advance!

Hi @ncbernar, sorry for the slow reply ;(

Looking at the problem statement, I would expect that this is just the “normal” case where you hit a performance limit (of some kind).

For an analogy, let’s think of your server as a post office and people coming to get packages.

Everything will work okay, and as more people come, more packages will be handed out. But at some point you hit the maximum speed at which the post-office worker can check that a package is there, fetch it, check documents, and send someone on their way.

At that point, adding more people who want to get packages just means they will wait longer, as the worker can’t actually work any faster.

Looking at http_req_duration, it seemed like I was right at the start and then not so much, but then I noticed that you are using logarithmic scaling, which is likely what is tripping up both of us.

Given the rest of the info: transferring 70 MB will use up your network connection. To download it in 1 s you need 70 MB × 8 = 560 Mbps, or more than half of a 1 Gbps connection.

Looking at the graph, you seem to be at ~16 s per request at the end, where you have 30 VUs, which works out to almost exactly a 1 Gbps connection (30 × 560 Mbps / 16 ≈ 1,050 Mbps).

So with that in mind, it seems to me like this test will be 100% limited by the network speed you can have between the load generator and the system under test. And the final performance of the system under test will also be limited by that.
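To make the arithmetic above concrete, here is a small back-of-the-envelope sketch (plain JavaScript, not a k6 script; the numbers are the ones taken from the graph):

```javascript
// Bandwidth math for repeatedly downloading a ~70 MB object.
const objectSizeMB = 70;
const objectSizeMbit = objectSizeMB * 8; // 560 Mbit per download

// Bandwidth needed for ONE VU to finish a download in 1 second:
const perVuAt1s = objectSizeMbit; // 560 Mbps

// With 30 concurrent VUs each taking ~16 s per download,
// the aggregate throughput on the wire is:
const vus = 30;
const downloadDurationS = 16;
const aggregateMbps = (vus * objectSizeMbit) / downloadDurationS; // 1050 Mbps

console.log(perVuAt1s);     // 560
console.log(aggregateMbps); // 1050
```

In other words, 30 VUs at 16 s per download is already pushing roughly a full 1 Gbps link, so adding VUs can only stretch the download times further.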

You seem to be hitting the same problem as in Influxdb reported a large number of errors when run k6 - OSS Support - Grafana Labs Community Forums, but at a small enough scale that you likely don’t need to worry.

Hope this helps you!


@mstoykov This is very insightful! I think this has helped me understand a bit more about what the bottleneck is. The MinIO server is actually on a 1 Gbps network connection. Thank you for the wonderful explanation and analogy!

Actually, looking back at the results: you mentioned that at the end we have exactly 30 VUs, resulting in a 16 s request duration, but the requests per second at that point are about 1 request/second. Is this the rate at which each individual VU sends requests, or the total number of requests per second that k6 was able to send with 30 VUs?

requests per second at that point is 1 request/second

Can you share how this is calculated? Also, since this seems to be the end of the run, there is a good chance k6 had just stopped, so even though requests were taking 16 s at that point, only 1-2 of them finished in the last 1-2 seconds.
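For a rough sanity check on the number itself, here is a sketch of the closed-model arithmetic (assuming each iteration is one request followed by the script’s sleep(1)): with ramping-vus, a VU only starts its next request after the previous one finishes, so total throughput is roughly VUs divided by iteration time.

```javascript
// Closed-model throughput estimate: each VU loops
// "request (~16 s) + sleep(1 s)" and only then starts the next request.
const vus = 30;
const reqDurationS = 16; // observed http_req_duration
const sleepS = 1;        // sleep(1) at the end of each iteration

const iterationS = reqDurationS + sleepS;  // 17 s per iteration per VU
const expectedRps = vus / iterationS;      // ~1.76 requests/second in total

console.log(expectedRps.toFixed(2)); // "1.76"
```

That lines up with the ~1-2 RPS you are seeing: it is the total rate across all VUs, not a per-VU rate.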

I would highly recommend running for longer periods in all cases.

Here are the results for a longer test run:

request per second at that point is 1 request/second

This result queries the http_reqs metric from k6. Here is the query:

This is what my script looks like:

import http from "k6/http";
import { tagWithCurrentStageIndex } from "https://jslib.k6.io/k6-utils/1.3.0/index.js";
import { sleep, check } from "k6";
import exec from "k6/execution";

export const options = {
  discardResponseBodies: true,
  scenarios: {
    load: {
      executor: "ramping-vus",
      startVUs: 0,
      stages: [
        { duration: "5m", target: 50 },
        { duration: "5m", target: 100 },
        { duration: "5m", target: 150 },
        { duration: "5m", target: 50 },
        { duration: "5m", target: 0 },
      ],
      gracefulRampDown: "5m",
    },
  },
  thresholds: {
    http_req_duration: ["p(99)<120000"],
  },
};

export default async function () {
  tagWithCurrentStageIndex();

  const url = "http://localhost:4000/api/s3/object/download";

  const data = {
    s3Path: "TEST_HP/2463/TEST_4TB.zip",
  };
  const headers = {
    "Content-Type": "application/json",
  };
  await http.asyncRequest("POST", url, JSON.stringify(data), {
    headers,
  });
  sleep(1);
}