Idempotency of tests

Sure, this isn't a 100% k6-related question, but I want to see if k6 has better ways of handling this.

I have a bunch of CRUD APIs to load test, and some of them are not idempotent: they change the state of the underlying persistence layer. For example, a delete-all might cause a subsequent GET to appear faster than normal. Reseeding the data after the k6 calls would make it work, but it defeats the purpose of concurrent read and write testing. There's no point in treating API groups in isolation, as that doesn't reflect a real-world scenario.

How can I better address this problem with k6?

Hi! It’s true that testing non-idempotent APIs can eventually affect the performance of the succeeding requests. However, if you’re planning your workload model such that it simulates production traffic sufficiently accurately, you’d probably see the same effect on performance in production as well. I would argue that if your goal is to mimic what happens in production, then it’s not necessarily a problem that performance may eventually decrease as, for example, a database fills up.

It’s likely you’ll want to run a few tests to explore the effect of requests to the non-idempotent APIs. For example, perhaps you can run GETs only, then run another test where you slowly start to introduce other requests that perform CRUD operations. It may be useful in this situation to control those dials separately and be able to turn each one on and off while you’re exploring the effects, so I’d recommend writing them as different scenarios in a k6 script. You can use different executors and other test parameters to mix the load the way you want and control the variables so that you can run clean tests that you can draw conclusions from.
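
To make that concrete, here's a minimal sketch of two independently controlled scenarios: a steady read-only scenario and a write scenario you can dial up, down, or comment out while exploring. The base URL, endpoints, and rates are placeholders, not your actual API:

```javascript
import http from 'k6/http';

// Hypothetical base URL -- substitute your own API here.
const BASE_URL = 'https://test.example.com/api';

export const options = {
  scenarios: {
    // Steady read-only traffic you can keep constant across test runs.
    reads: {
      executor: 'constant-arrival-rate',
      rate: 50,              // 50 iterations per second
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 20,
      maxVUs: 100,
      exec: 'readOnly',
    },
    // Non-idempotent traffic, introduced slowly and controlled separately.
    writes: {
      executor: 'ramping-arrival-rate',
      startRate: 0,
      timeUnit: '1s',
      preAllocatedVUs: 20,
      maxVUs: 100,
      stages: [
        { target: 5, duration: '2m' },  // slowly introduce CRUD operations
        { target: 5, duration: '8m' },  // then hold them steady
      ],
      exec: 'crud',
    },
  },
};

export function readOnly() {
  http.get(`${BASE_URL}/items`);
}

export function crud() {
  http.post(`${BASE_URL}/items`, JSON.stringify({ name: 'load-test-item' }), {
    headers: { 'Content-Type': 'application/json' },
  });
}
```

Because each scenario has its own executor and `exec` function, you can tune or disable one without touching the other between runs.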

Going further, it might also be a good idea to find out when performance starts to degrade noticeably, however you’d like to define that. Does performance degrade faster with a higher test throughput (rps), or with a sustained duration of traffic? You could try spike, stress, and soak tests to explore the limits of the application. If you can pinpoint the situations when the application begins to fail or become slow, then you can compare that against known conditions in production and build confidence around how much the application can withstand that way.
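
As a rough sketch, a spike profile with pass/fail thresholds could be configured like this; the stage targets and threshold values are placeholder numbers you would tune to your own system:

```javascript
import http from 'k6/http';

export const options = {
  // A simple spike profile: ramp up sharply, hold, then back off.
  stages: [
    { duration: '1m', target: 50 },    // warm up
    { duration: '30s', target: 500 },  // spike
    { duration: '3m', target: 500 },   // hold the spike
    { duration: '1m', target: 0 },     // recover
  ],
  // Fail the test if latency or errors degrade past what you consider acceptable.
  thresholds: {
    http_req_duration: ['p(95)<800'],  // 95% of requests under 800ms
    http_req_failed: ['rate<0.01'],    // less than 1% errors
  },
};

export default function () {
  // Hypothetical endpoint -- replace with the GET you want to observe.
  http.get('https://test.example.com/api/items');
}
```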

Hola @kembhootha
As you rightly mention, sometimes the goal of making a load test hyper-realistic conflicts with the goal of being able to load test the system at all when we have restrictions like these.
As Nicole very well mentions, the common approach is to segment the data that your scenario will work with.

First, have a static list of data that you rotate through for the idempotent API calls. Set it and forget it. That part is quite straightforward.
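
For instance, a minimal "set and forget" rotation could use a SharedArray and the global iteration counter. The users.json file and the endpoint below are hypothetical:

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';
import exec from 'k6/execution';

// Load the static pool once and share it read-only across all VUs.
// 'users.json' is a hypothetical file: an array like [{ "id": 1 }, { "id": 2 }, ...]
const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'));
});

export default function () {
  // Rotate deterministically through the pool: each iteration in the whole
  // test picks the next entry, wrapping around when the list is exhausted.
  const user = users[exec.scenario.iterationInTest % users.length];
  http.get(`https://test.example.com/api/users/${user.id}`); // idempotent read
}
```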

But for the calls that are not idempotent, you must create special data sets, and possibly even processes that run during or in between tests.

  • On one hand, for the edits it may be good to have a pool of IDs that no other process requests.

  • Then, for the creations, you have some challenges. You may not want the data in the system to grow unrealistically, in which case you may have to include a delete process that runs during or after your scenario; for that you will want to keep the created IDs at hand, and depending on the test I would recommend a centralized DB for them. Or that growth of data may even be normal, and you can just let it grow.

  • The deletion calls have a similar problem, but in reverse: they need a pool of pre-created or duplicated data to delete. An extra challenge here is that we have to establish a mechanism so that all the VUs “know” what has been deleted by other VUs and do not try to pick it. This one is tricky but doable with centralized storage and flags. You can even make the IDs to be deleted unique to each VU, to guarantee there will be no fight over data to delete (see the sketch after this list). Just be careful that this doesn’t become a bottleneck if your load test is very big. You will then need a process to repopulate those records during or after your scenario.

  • If for any reason you want to test the impact of idempotent processes using the same data as the non-idempotent ones, that is a special scenario. Depending on how realistic that is, you may have to create a subset of tests where idempotent and non-idempotent processes share the same data. There I would recommend building a sub-scenario for them and running it in parallel to the big load test, just to observe what happens in those specific cases, as Nicole mentions. In those cases, I would even suggest manual tries during the load test.
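
To illustrate the per-VU partitioning idea from the deletion bullet above, here is a minimal sketch. The deletable-ids.json file, the endpoint, and the VU count are assumptions, and it assumes the delete scenario is the only one in the script so VU IDs start at 1:

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';
import exec from 'k6/execution';

// Hypothetical file with IDs pre-created specifically to be deleted,
// e.g. ["d-0001", "d-0002", ...]. Seed these records before the test run.
const deletableIds = new SharedArray('deletable', function () {
  return JSON.parse(open('./deletable-ids.json'));
});

export const options = {
  scenarios: {
    deletes: {
      executor: 'per-vu-iterations',
      vus: 10,
      iterations: 50, // keep vus * iterations <= deletableIds.length
      exec: 'deleteOwnSlice',
    },
  },
};

export function deleteOwnSlice() {
  // Partition the pool so each VU only ever touches its own slice:
  // VU n deletes ids[n-1], ids[n-1 + totalVUs], ids[n-1 + 2*totalVUs], ...
  // No coordination is needed, because no two VUs can pick the same ID.
  const totalVUs = 10; // keep in sync with options.scenarios.deletes.vus
  const index = (exec.vu.idInTest - 1) + exec.vu.iterationInScenario * totalVUs;
  if (index >= deletableIds.length) {
    return; // pool exhausted for this VU; a repopulation job can refill it later
  }
  http.del(`https://test.example.com/api/items/${deletableIds[index]}`);
}
```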

For all of the above, as Nicole mentioned, you can mix scenarios and parameterize data from files, as this link shows: Data Parameterization

As you well mention, if these APIs are kept separate, you may feel that load testing different data on each makes it a bit unrealistic. But in the end these APIs may share the same data sources, and most probably that is where the bottleneck will appear, even if they are not working on the same IDs. And, as mentioned, you can side-test this specifically if it is a real case.

Hope this helped and added to Nicole’s input :slight_smile:
If you have more q’s just let us know
Gracias,
Leandro