Hi! I totally understand the need to trust your results before committing to switch tools, and a 3x difference in response times is definitely worth looking into. Here are a few reasons why this might have occurred, and what you can do to rule each one out:
Resource utilization
k6 is written in Go, and JMeter is written in Java. Beyond the language difference, the two tools handle virtual users in fundamentally different ways: JMeter runs each user on its own thread, while k6 runs each VU as a lightweight goroutine. In our testing, k6 has been significantly more performant than JMeter, in that it needs fewer resources to generate the same level of load. Here is a blog post on this topic, along with links to the script so that you can confirm the results for yourself.
Higher resource utilization can lead to inaccurate load testing results as it can make the load generator the performance bottleneck, instead of the application under test.
Here are some ways to rule this out as the cause of the discrepancy:
- Monitor your load generator’s resource utilization while you run both tests.
- Verify the JVM settings. JMeter needs its JVM (heap size, for example) tuned to suit the load generator it’s running on.
- Run the JMeter test in CLI mode (the -n flag). JMeter’s GUI adds overhead that k6 doesn’t have, so it’s best to run JMeter headlessly during the test.
Throughput
20 threads in JMeter != 20 VUs in k6. You can see this in your screenshot. Within a similar duration, your JMeter test sent 8,317 requests and your k6 test sent 25,794 requests. Even though the number of “users” is the same, the load each test generated and the resources each test required were clearly not the same.
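To put numbers on that gap, here is a small sketch in plain JavaScript that turns the request counts from your screenshot into request rates. The 120-second duration is an assumption based on both tests running for roughly two minutes; substitute the exact durations from your own runs.

```javascript
// Request counts taken from the screenshot in the question.
const runs = { jmeter: 8317, k6: 25794 };

// Assumption: both tests ran for about two minutes.
const durationSeconds = 120;
const users = 20; // 20 JMeter threads, 20 k6 VUs

// Convert a total request count into overall and per-user request rates.
function requestRate(totalRequests, seconds, userCount) {
  const perSecond = totalRequests / seconds;
  return { perSecond, perUser: perSecond / userCount };
}

for (const [tool, total] of Object.entries(runs)) {
  const { perSecond, perUser } = requestRate(total, durationSeconds, users);
  console.log(
    `${tool}: ${perSecond.toFixed(1)} req/s overall, ${perUser.toFixed(2)} req/s per user`
  );
}

console.log(`k6 sent ${(runs.k6 / runs.jmeter).toFixed(1)}x as many requests`);
```

Under that assumption, JMeter comes out at roughly 69 requests per second and k6 at roughly 215, so the application was under about three times as much load in the k6 run. Since a JMeter thread and a k6 VU each wait for a response before sending the next request, slower responses and a lower request count tend to go together, which is why it helps to compare request rates and not just user counts.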
Test duration
The short duration of both tests (~2 mins) may be a bigger contributor than is apparent. Very short tests increase the likelihood of outliers skewing the results heavily. For example, in the JMeter test, the 99th percentile response time (over 1s) is significantly higher than the 95th percentile response time (478ms). Did the k6 results show a similar distribution?
To rule this out, run both tests over a longer period of time.
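On the k6 side, a longer run that also reports the 99th percentile could look something like the sketch below. The 15-minute duration and the URL are placeholders I made up for illustration, not values from your test; the matching change in JMeter would be a longer duration on the thread group.

```javascript
import http from 'k6/http';

export const options = {
  vus: 20,          // same number of VUs as in your original test
  duration: '15m',  // assumption: any duration well beyond 2 minutes
  // Include p(99) in the end-of-test summary so the tail of the distribution
  // can be compared with JMeter's 99th percentile.
  summaryTrendStats: ['avg', 'min', 'med', 'max', 'p(90)', 'p(95)', 'p(99)'],
};

export default function () {
  // Hypothetical endpoint; replace with the request(s) from your own script.
  http.get('https://test.example.com/');
}
```

This only runs under k6 itself (for example with k6 run), since the k6/http module is not available in Node.js.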
Scripting differences
What’s in the script can also affect how long it takes to execute, such as:
- Think time: Did you use any timers (JMeter) or sleep (k6)? If yes, were they the same type (Gaussian, uniform random, constant)? JMeter applies timers to every sample within the scope of the timer.
If you’re not using think time, I’d suggest adding it to both scripts. Sending requests back to back, without think time, could do more harm than good with regard to load testing results since it’s very resource-intensive.
- Embedded resources: Did you record embedded resources in one script but not the other?
- JMeter log configuration: Verify your JMeter configuration to see what’s being logged. You can click on the Configure button of the listener you’re using to verify this. By default, JMeter records more for each result than k6 does.
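On the think time point, one detail that is easy to miss is units: JMeter timers are configured in milliseconds, while k6's sleep() takes seconds. The sketch below shows helper functions that approximate JMeter-style uniform random and Gaussian random delays so both scripts can use comparable think time. The function names and the delay values are hypothetical, chosen only for illustration.

```javascript
// JMeter timers are configured in milliseconds; k6's sleep() takes seconds.
const msToSeconds = (ms) => ms / 1000;

// Constant delay plus a uniformly distributed random extra delay, in ms.
function uniformRandomDelayMs(constantMs, randomMaxMs) {
  return constantMs + Math.random() * randomMaxMs;
}

// Constant delay plus a normally distributed offset (Box-Muller transform),
// in ms. Clamped at zero so the delay is never negative.
function gaussianRandomDelayMs(constantMs, deviationMs) {
  const u1 = 1 - Math.random(); // in (0, 1], avoids Math.log(0)
  const u2 = Math.random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return Math.max(0, constantMs + z * deviationMs);
}

// Example: 1 s constant delay plus up to 2 s of uniform random delay.
const thinkTimeSeconds = msToSeconds(uniformRandomDelayMs(1000, 2000));
console.log(`think time for this iteration: ${thinkTimeSeconds.toFixed(2)} s`);
```

In a k6 script, you would import sleep from the k6 module and call something like sleep(msToSeconds(uniformRandomDelayMs(1000, 2000))) after each request, with the same constant and random values entered in the corresponding JMeter timer.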
Give those suggestions a try and let us know how it goes! It’s very reasonable to question these differences, but it may take some testing to figure out the cause. Good job testing your test tools; I wish more people did that!