Unique test data per VU without reserving data upfront

niklasbae · November 27, 2020, 1:56pm

Hi,

I have a scenario with a set of test data containing users (main users), each of which has a variable-length array of subusers. In each iteration, a main user loops through its subusers. Because the length of the subuser array varies a lot, the iteration time of each VU will differ. Some VUs will need many lines from the test data to complete the test, while others need only one.

Reserving a fixed number of lines with the __VU*x+__ITER formula would not be optimal, as I already know upfront that the test data needed per VU will vary from one line to many.
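To make the drawback concrete, here is a minimal sketch of the fixed-reservation scheme. `reservedIndex` and `linesPerVU` are names made up for this illustration; in a k6 script `vu` and `iter` would come from `__VU` (which starts at 1) and `__ITER` (which starts at 0). Unlike the bare formula above, the sketch subtracts 1 from the VU number so that the first block starts at index 0.

```javascript
// Hypothetical helper: index of the data line a VU uses in a given iteration
// when every VU reserves a fixed block of `linesPerVU` lines.
function reservedIndex(vu, iter, linesPerVU) {
  if (iter >= linesPerVU) {
    // The VU has used up its block; the remaining lines belong to other VUs.
    throw new Error(`VU ${vu} ran out of reserved lines at iteration ${iter}`);
  }
  return (vu - 1) * linesPerVU + iter;
}
```

A VU that needs only one line still holds on to a whole block, and a VU that needs more than `linesPerVU` lines runs out, which is the mismatch described above.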

With the per-vu-iterations executor, I could tell each VU to run only once and use __VU to assign it a line in the test data array, but then I would want the test to spin up a new VU to keep the number of VUs constant throughout the test.

Do you have any suggestions for how to solve this? I think a global __ITER variable would be helpful in this case!

//Niklas

nedyalko · November 27, 2020, 3:00pm

Hmm I’m not sure I completely understand your use case, but I have a few comments and questions…

Instead of a flat list, can you have a nested one? An array of arrays, the first level being the users (corresponding to __VU) and the second level the subusers (one for each iteration, with modulo division if you want to repeat them).
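A minimal sketch of that nested lookup, assuming the data is reshaped into an array of arrays. `subusersByUser` and `pickSubuser` are hypothetical names; in a k6 script `vu` and `iter` would be `__VU` (1-based) and `__ITER` (0-based).

```javascript
// Hypothetical data in the nested shape: one inner array of subusers per main user.
const subusersByUser = [
  ["testsub1", "testsub2"], // subusers of the first main user
  ["testsub3"],             // subusers of the second main user
];

// First level is selected by the VU number, second level by the iteration number.
function pickSubuser(data, vu, iter) {
  const subusers = data[(vu - 1) % data.length]; // wraps if there are more VUs than users
  return subusers[iter % subusers.length];       // modulo repeats subusers when they run out
}
```

With this layout each VU stays on one main user for the whole test, so it only fits if the number of VUs matches the number of main users.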

This is what is confusing me most - if you want to have a constant number of VUs, why can’t you use constant-vus?

Unfortunately, a global __ITER variable is not going to be possible, since it would be impossible to make performant in a distributed k6 run…

niklasbae · November 27, 2020, 4:21pm

Hi, thanks for the reply.

In case there is any confusion about the test data, here is an example:

[
    {
        username: "test1",
        subusers: [
            "testsub1",
            "testsub2"
        ]
    },
    {
        username: "test2",
        subusers: [
            "testsub3",
        ]
    }
]

Where there can be from 100 to 10000 subusers.

I want a constant number of VUs, but I need to make sure that each “username” entry in the test data is used only once and never concurrently by more than one VU. One iteration uses all the subusers belonging to the given test user. As the number of subusers varies, some VUs will use only one “username”, and some will use more.

I hope that clarifies the question.

mstoykov · November 30, 2020, 10:54am

Hi @niklasbae,

You basically need a shared integer that is incremented across the VUs. This is unlikely (in that form) to be added to k6, but here is a k6 extension that will help you.

As mentioned there (and in other places before), this can also be done through an external API that just returns the next id and makes certain that no id is handed out twice.

var element = array[parseInt(http.get("https://example.com/mynextid").body)]

Obviously this needs more checks, and it has the downside of being one more request, which takes time … but it is portable to the cloud and to the future k6 distributed mode.
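A rough sketch of what "more checks" might look like. `claimNextElement`, `fetchBody`, and `makeFakeCounter` are hypothetical names, and the counter service at https://example.com/mynextid is the placeholder from the line above, not a real endpoint. The lookup is written as a plain function so it can be tried outside k6.

```javascript
// `fetchBody` is a hypothetical callback that returns the response body of the
// counter service as a string. In a k6 script it could be
//   () => http.get("https://example.com/mynextid").body
// with `import http from "k6/http"` at the top of the script.
function claimNextElement(array, fetchBody) {
  const id = parseInt(fetchBody(), 10);
  if (Number.isNaN(id) || id < 0) {
    throw new Error("counter service returned an invalid id");
  }
  if (id >= array.length) {
    return null; // every element has been claimed, nothing left for this VU
  }
  return array[id];
}

// Local stand-in for the service, only so the sketch can run outside k6.
function makeFakeCounter() {
  let next = 0;
  return () => String(next++);
}
```

Returning `null` once the counter passes the end of the array lets the calling VU decide what to do when the data runs out, for example skip the rest of the iteration.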

I would expect that if we ever have mutable state in k6 that is accessible from VUs, it will be accessible based on something like a “group” of VUs. In this way you could say that you want this counter to be accessible per group of 500 VUs, and k6 distributed mode would make certain that it chunks VUs between instances in such a way.

This also means that as long as you can run your whole test on a single instance, you can have exactly what you wanted. But it also means that if that isn’t possible (for resource reasons), you will get an error.

nedyalko · November 30, 2020, 11:58am

@niklasbae, can you please confirm something, just to make sure I understand you correctly? If the number of top-level users you have in your JSON is X, you want Y number of VUs to go through them, where X > Y? And every VU sequentially goes through all of the subusers of its current user and when it is done, it goes to work on the next top-level user that is not yet claimed by another VU?

And in your case, you’re making do with a {executor: "per-vu-iterations", vus: X, iterations: 1} scenario, which makes X == Y, but this is not optimal since some VUs finish much earlier than others?

If so, then as @mstoykov mentioned, there isn’t currently a good way to handle the X > Y case in k6. You need something like a global iterator, and his xk6 extension can work if you only execute the script locally.
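The claiming behaviour discussed here (each VU takes the next top-level user that no other VU has claimed) can be simulated outside k6 to see how the work spreads out. This is only an illustration in a single JavaScript runtime: `simulateClaims` is a hypothetical name, and the cost of a user is assumed to be its number of subusers. In k6 each VU has its own JavaScript runtime, so a plain variable like `next` is not shared between VUs, which is why an extension or an external service is needed.

```javascript
// Simulates a fixed number of VUs claiming users from a shared counter.
// Whichever VU becomes free first claims the next unclaimed user.
function simulateClaims(subuserCounts, vus) {
  const busyUntil = new Array(vus).fill(0);              // time at which each VU is free
  const claimed = Array.from({ length: vus }, () => []); // user indexes taken by each VU
  let next = 0;                                          // stands in for the global iterator
  while (next < subuserCounts.length) {
    // Find the VU that finishes its current user earliest (lowest index wins ties).
    let vu = 0;
    for (let i = 1; i < vus; i++) {
      if (busyUntil[i] < busyUntil[vu]) vu = i;
    }
    claimed[vu].push(next);
    busyUntil[vu] += subuserCounts[next];
    next++;
  }
  return { claimed, busyUntil };
}
```

For example, `simulateClaims([5, 1, 1, 1], 2)` gives the first VU only user 0, while the second VU works through users 1, 2 and 3 in the meantime.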

In the future, we might support something similar natively in k6. Please follow grafana/k6 issue #1539 (Data segmentation API framework) and maybe issue #1320 (Improve execution information in scripts) for more updates. With the execution segments we have had since k6 v0.27.0, we should be able to make a bounded iterator that works both in a single-instance and in a distributed k6 test run, with some restrictions that wouldn’t be an issue for your use case. I can’t give any estimates yet; the current v0.31.0 milestone of the first issue is mostly wishful thinking, but hopefully we’ll make something like it soon. Even an MVP version will likely cover your use case…

niklasbae · November 30, 2020, 12:19pm

Hi Ned,

That is a perfect description of the case. I’ll try to make use of the solutions suggested above by @mstoykov (thanks a lot!) and keep track of the mentioned issues. I think that will be sufficient; if not, I’ll do my best to modify the test data if possible.

Thanks for your input and contributions!

nedyalko · November 30, 2020, 12:26pm

Awesome, I added a note in both of the issues. Thanks for sharing your use case, it’ll help us prioritize these issues more highly when we know exactly what problems they are going to solve!