Some metrics do not reach InfluxDB

donRumatta · November 30, 2020, 12:36pm

Hello! I have a test that outputs results to InfluxDb+Grafana.

If I run 2 iterations sequentially with 1 user, all data reaches InfluxDB. But when I run 2 iterations with 2 users in parallel, some http_req_duration values are lost (which requests are affected is random; this is just an example).

It also applies to group_duration: the request вход ("login", the last one) has a count of 2, which is correct. But the group ::весь процесс::вход ("whole process" / "login"), which contains only this request, has a count of 1 (sorry, I'm not allowed to attach more than one picture).

I’ve tried running with the -v flag: there are many writes to InfluxDB, but no errors.

Maybe somebody knows about this issue or where to dig further. Thank you.

mstoykov · December 2, 2020, 10:34am

Hi @donRumatta, welcome to the community forum, and sorry for the late reply.

Can you try running with --system-tags=proto,subproto,status,method,url,name,group,check,error,tls_version,scenario,service,rpc_type,vu

This is basically the default value with vu added. I am pretty sure that both VUs are sending data with the same timestamp, so InfluxDB drops one of the metric lines. This change gives each VU its own tag value, which should fix the issue you are seeing.

Can you also provide your InfluxDB and k6 versions, and say whether you run them both locally, in Docker, or in some other way? That will help in case this is not the problem and we need to investigate more.

donRumatta · December 7, 2020, 1:03pm

I’m also sorry for the late reply: I put this issue aside until I had coded the full scenario.

I’ve tried running k6 with the system tags from your reply, and unfortunately it didn’t help. I still don’t see some http_req_duration or group_duration values from some VUs, for example:

Versions:

k6 v0.29.0 (2020-11-11T13:27:19+0000/d9bced3, go1.15.3, windows/amd64)
InfluxDb 1.7.1

mstoykov · December 14, 2020, 2:29pm

Hi @donRumatta, sorry for the long delay.

I don’t know what is going on and it will probably be a long time before I have time to look into it.

If you can dump the traffic going to InfluxDB when this happens and check whether the sent data is what is expected, that will at least tell us whether the problem is in k6 or whether InfluxDB just decides to drop something.

dan_nm · December 16, 2020, 10:56pm

One additional issue you may be facing is that, by default, k6 passes vu to InfluxDB as a field instead of a tag. Fields are not treated the same way as tags when determining a unique time series in InfluxDB, so if metrics emitted by multiple users have the same timestamp, it is not enough to include vu in the --system-tags configuration. You must also remove it from the “tags as fields” configuration.
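
The tag-versus-field distinction can be illustrated with a small simulation. This is only a sketch of InfluxDB's documented rule that a point is identified by its measurement, tag set, and timestamp (a second write with the same identity merges into the first, with newer field values winning). It is not InfluxDB code: the write_point and simulate helpers, the timestamp, and the values are made up for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Identity of a point: (measurement, tag set, timestamp). Fields are NOT part of it.
PointKey = Tuple[str, FrozenSet[Tuple[str, str]], int]


def write_point(store: Dict[PointKey, dict], measurement: str,
                tags: dict, fields: dict, timestamp_ns: int) -> None:
    """Store a point; a write with an existing key merges fields, newer values win."""
    key = (measurement, frozenset(tags.items()), timestamp_ns)
    store.setdefault(key, {}).update(fields)


def simulate(vu_as_tag: bool) -> int:
    """Write one group_duration sample from each of two VUs at the same timestamp."""
    store: Dict[PointKey, dict] = {}
    timestamp_ns = 1_000_000_000  # arbitrary, but identical for both VUs
    for vu, value in (("1", 1500.0), ("2", 1520.5)):
        tags = {"group": "::login"}
        fields = {"value": value}
        if vu_as_tag:
            tags["vu"] = vu      # vu becomes part of the point's identity
        else:
            fields["vu"] = vu    # vu is just data; the identity is unchanged
        write_point(store, "group_duration", tags, fields, timestamp_ns)
    return len(store)


print(simulate(vu_as_tag=False))  # 1: the second VU's sample overwrote the first
print(simulate(vu_as_tag=True))   # 2: both samples survive
```

With vu sent as a field, the second write lands on the same point and replaces the first value, which matches the "count of 1 instead of 2" symptom described earlier in the thread.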

Currently, the only way I am aware of to do this is using the K6_INFLUXDB_TAGS_AS_FIELDS environment variable. The JSON configuration for this is broken, as I reported several months ago in the thread InfluxDB tagsAsFields configuration issues.

Of note, the default value for this env variable includes the iter, vu, and url tags, so to exclude vu, you would set the env variable as follows (using syntax appropriate for the OS/terminal used to execute your test scripts): K6_INFLUXDB_TAGS_AS_FIELDS=iter,url
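
For anyone launching k6 from a script, the two settings can be combined as in the sketch below. The --system-tags list and the iter,url value are the ones quoted in this thread; the build_k6_invocation helper, the script name, and the InfluxDB URL are placeholders assumed for illustration.

```python
import os
from typing import Dict, List, Tuple

# Value suggested earlier in the thread: the defaults plus vu.
SYSTEM_TAGS = ("proto,subproto,status,method,url,name,group,check,error,"
               "tls_version,scenario,service,rpc_type,vu")


def build_k6_invocation(script: str, influx_url: str) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for a k6 run that sends vu to InfluxDB as a tag."""
    env = dict(os.environ)
    # The default is iter,vu,url; leaving vu out keeps it a tag instead of a field.
    env["K6_INFLUXDB_TAGS_AS_FIELDS"] = "iter,url"
    argv = ["k6", "run",
            "--system-tags=" + SYSTEM_TAGS,
            "--out", "influxdb=" + influx_url,
            script]
    return argv, env


# Placeholder script name and database URL; adjust to your setup.
argv, env = build_k6_invocation("script.js", "http://localhost:8086/k6")
print(" ".join(argv))
# To actually run the test: subprocess.run(argv, env=env, check=True)
```

Passing the variable through env avoids the OS-specific shell syntax for setting environment variables.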

I faced this issue with metrics for multiple users not being written to InfluxDB (using k6’s JSON output helped me confirm the timestamps were identical). Removing the vu tag from that environment variable did the trick for me. Hopefully it will work for you as well.
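
A check along these lines can be run against k6's JSON output to look for such collisions. This is a rough sketch, not the exact procedure I used: the find_collisions helper is hypothetical, it compares the time strings exactly as written in the file, and it assumes the default tags-as-fields set (iter, vu, url) is what gets stripped from the tag set. The second sample line is invented to demonstrate a collision.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_collisions(lines: Iterable[str],
                    field_tags: Tuple[str, ...] = ("iter", "vu", "url")) -> Dict[tuple, List[str]]:
    """Group Point records by (metric, time, remaining tags); return groups with >1 sample."""
    groups: Dict[tuple, List[str]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") != "Point":
            continue  # skip metric declarations and other record types
        data = record["data"]
        tags = data.get("tags") or {}
        # Tags sent as fields do not help distinguish points in InfluxDB.
        remaining = tuple(sorted((k, v) for k, v in tags.items() if k not in field_tags))
        groups[(record["metric"], data["time"], remaining)].append(tags.get("vu"))
    return {key: vus for key, vus in groups.items() if len(vus) > 1}


sample = [
    '{"type":"Point","data":{"time":"2021-01-14T15:33:55.102179-06:00","value":1500.046126,'
    '"tags":{"group":"::login","iter":"0","vu":"2"}},"metric":"group_duration"}',
    # Invented second sample: same metric, group, and timestamp, different VU.
    '{"type":"Point","data":{"time":"2021-01-14T15:33:55.102179-06:00","value":1498.2,'
    '"tags":{"group":"::login","iter":"0","vu":"1"}},"metric":"group_duration"}',
]

for (metric, time, _tags), vus in find_collisions(sample).items():
    print(metric, time, "VUs:", sorted(vus))
```

Calling find_collisions(lines, field_tags=("iter", "url")) shows what remains once vu is kept as a tag; for the sample above, no collisions are reported.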

donRumatta · December 17, 2020, 7:31am

OK, I’ll make a dump when I finish the main logic.

donRumatta · December 17, 2020, 7:32am

Thank you, I’ll try your approach.

donRumatta · January 15, 2021, 2:52pm

--system-tags and K6_INFLUXDB_TAGS_AS_FIELDS seem to fix the problem for http_req_duration, at least for the 2-3 VUs that I’m checking by eye. But some group_duration values are still lost: maybe --system-tags and K6_INFLUXDB_TAGS_AS_FIELDS are not applied to this metric?

nedyalko · January 15, 2021, 3:10pm

They should be. Exactly what values for --system-tags and K6_INFLUXDB_TAGS_AS_FIELDS are you using?

donRumatta · January 15, 2021, 3:48pm

From previous answers:

proto,subproto,status,method,url,name,group,check,error,tls_version,scenario,service,rpc_type,vu

and

iter,url

dan_nm · January 15, 2021, 3:57pm

The vu and iter tags are definitely applied to the group_duration metric, as shown in the example below taken from the JSON output of a test I ran yesterday.

{"type":"Point","data":{"time":"2021-01-14T15:33:55.102179-06:00","value":1500.046126,"tags":{"group":"::login","iter":"0","vu":"2"}},"metric":"group_duration"}

Configuring k6 to include the vu tag in --system-tags, and then removing it from the value of K6_INFLUXDB_TAGS_AS_FIELDS so that it is passed to InfluxDB as a tag, should be enough to prevent metrics from separate users with the same timestamp from being dropped. At least in my experience that has been the case.