Stream error: stream ID x; CANCEL

I run my scenarios locally from CLI.

In some of my scenarios, I get the following error for higher numbers of VU:

WARN[0036] Request Failed error="get \"https://<url>": stream error: stream ID 13; CANCEL"

It only seems to happen on scenarios with large number of parallel (batch()) requests. I start getting the error somewhere around 25 VUs (each at 6 parallel requests for majority of the scenario’s duration) and error rate steeply increases with additional VUs.

It seems to happen randomly for all kinds of requests, including static content.

I failed to replicate the error on scenarios with only serial requests (no batch()) - even at 2000 VUs I didn’t get the error once.

The stream ID always seems to be an odd number for some reason.

Here is a response object captured for one such failed request:

{
  "remote_ip": "<ip>",
  "remote_port": 443,
  "url": "https://<url>",
  "status": 0,
  "proto": "",
  "headers": {},
  "cookies": {},
  "body": null,
  "timings": {
    "duration": 16953.3841,
    "blocked": 0,
    "looking_up": 0,
    "connecting": 0,
    "tls_handshaking": 0,
    "sending": 0,
    "waiting": 16953.3841,
    "receiving": 0
  },
  "tls_version": "",
  "tls_cipher_suite": "",
  "ocsp": {
    "produced_at": 0,
    "this_update": 0,
    "next_update": 0,
    "revoked_at": 0,
    "revocation_reason": "",
    "status": ""
  },
  "error": "stream error: stream ID 29; CANCEL",
  "error_code": 1000,
  "request": {
    "method": "get",
    "url": "https://<url>",
    "headers": {
      "Accept-Encoding": [
        "gzip, deflate, br"
      ],
      "Accept-Language": [
        "en-US,en;q=0.9"
      ],
      "Sec-Fetch-Dest": [
        "empty"
      ],
      "Sec-Fetch-Mode": [
        "cors"
      ],
      "Sec-Fetch-Site": [
        "same-origin"
      ],
      "Accept": [
        "application/json, text/plain, */*"
      ],
      "Cookie": [
        "loginCookie-<hash>=<cookie value>"
      ],
      "User-Agent": [
        "k6/0.28.0 (https://k6.io/)"
      ]
    },
    "body": "",
    "cookies": {
      "loginCookie-<hash>": [
        {
          "name": "loginCooki-<hash>",
          "value": "<cookie value>",
          "replace": false
        }
      ]
    }
  }
}

I was unable to find anything about this error. Is it an error with K6 only, or does it indicate a problem with my server? How can I find out more about it?

This is an HTTP/2 error, see https://tools.ietf.org/html/rfc7540#section-7:

CANCEL (0x8): Used by the endpoint to indicate that the stream is no longer needed.

I am not completely sure, but I think might mean your server is getting a bit overloaded? Or maybe it has a bug somewhere? :man_shrugging:

Generally, each VU in k6 is mostly independent, so normal non-batch requests from multiple VUs will use separate connections to the server. But with http.batch(), the VU will perform multiple requests simultaneously. And when you’re using HTTP/2, as your error suggests you are, these requests will likely be multiplexed on the same network connection. So this might be the reason you’re only seeing the error when you use http.batch().

It’s not impossible this is some bug in the Go standard HTTP library, on which k6 depends to make HTTP requests. Just to rule that out, can you share which k6 version are you using? If you’re using an older one, can you upgrade to the latest k6 v0.29.0 and confirm that the error persists? Also, which OS are you using and are you running k6 from docker or from the plain binary?

1 Like

Actually, this might be a k6 bug after all. I investigated things a bit, and it turns out we were depending on an old version of the Go HTTP/2 library. So it might be a bug that stems from that… :disappointed:

I made a note of that in https://github.com/loadimpact/k6/issues/548#issuecomment-731245076 and prioritized the issue to get rid of the extra dependency sooner, if possible. And I also created https://github.com/loadimpact/k6/pull/1734 to update the dependency, which we’ll merge in master soon. You should then be able to use the Docker master image, or build a k6 binary yourself, to see if you experience the same issue.

1 Like

The change was merged in master, can you try to use the master docker image and see if the problem disappears?

1 Like

Thanks a lot, @ned !

I am new to Docker, but this is what I ran:

docker run  -v C:/<path>/dist:/dist -i loadimpact/k6:master run dist/<test>.bundle.js

and still got the error. I suppose that means it’s not an error with K6.

I am not really sure what to make out of the error description: Used by the endpoint to indicate that the stream is no needed., neither where the Stream error: stream ID x part originates from. Do you have any potential advice in those regards?

I also noticed one more weird thing about the error. In my tests, I use the following logic to log output to console:

import { RefinedResponse } from "k6/http";
import { check } from "k6";

const isCodeSuccess = (code: number) => 199 < code && code < 400;

export const checkIfResSuccess = (res: RefinedResponse<any>) => {
  check(res, {
    "request successful": res => {
      console.log(`${res.request.method.padEnd(4)} > ${res.status} : ${res.url}`);
      
      const isSuccess = isCodeSuccess(res.status);
      if (!isSuccess) {
        console.log(JSON.stringify(res, null, 2));
      }
      return isSuccess;
    }
  });
};

export const batchCheckIfResSuccess = (res: RefinedResponse<any>[]) => {
  res.forEach(r => checkIfResSuccess(r));
};

In my last test run (the docker run with current master), I got the Stream error: stream ID x; CANCEL error exactly 400 times, but only 149 console.log entries with status code 0. 251 of the stream errors seem to not have been logged by checkIfResSuccessful, even though every batch() call is followed by batchCheckIfResSuccess(batchResponse);. When I tried copying urls printed by the stream error and searching for them in my script output, for some of the urls I found corresponding logged responses, but some of the urls were not present anywhere else in the whole script output (All of my request urls are unique, because I suffix them with `#vuId-{__VU}--iteration-{__ITER}`).

Is there any way to get additional useful output from K6, other than --http-debug and --verbose? (These two didn’t really help me.)

I use Windows to run the tests.

Any updates on this, @ned?

I am super sorry @mkurz, I seem to have missed your last message :disappointed: :man_facepalming: I can’t give any updates besides the one I already gave almost 2 months ago - we updated the HTTP/2 dependency we use, which seems to have fixed some of the errors like yours we’ve seen before for other people. This code is still not released in an official public version - it is slated to be released in k6 v0.30.0 in the next 2 weeks, so it’s still only available from the master docker image or by compiling k6 yourself.

Are you sure that you pulled the newest master image from Docker when you did your tests, and not something from before this fix? What does docker run --rm -t loadimpact/k6:master version return? And do you still see the same issue after you pull the latest image (docker pull loadimpact/k6:master) and re-run your test?

Because if that’s the case, then it’s either caused by some other undiscovered bug in k6 (or its dependencies), or a problem with your server…

I am not sure what RefinedResponse is, I assume some TypeScript wrapper around the k6 Response? Could the problem have come from that?

Alternatively, do you have any redirects in your requests? Because if you have, the Response you get would be for the last request in the redirect chain. You can prevent k6 from following redirects automatically by setting maxRedirects: 0 in the options.

Finally, instead of comparing the status, can you check the response.error and response.error_code?

Is there any way to get additional useful output from K6, other than --http-debug and --verbose ? (These two didn’t really help me.)

Besides logging stuff with console.log(), which you already seem to be doing, maybe saving all of your metrics to a JSON file with the JSON output would help? Enabling the vu, iter and ip system tags with the systemTags option will probably also be useful.

Hey @ned thanks again for your time and energy! No worries about the delay, it’s not really urgent for me… at least until problems with the server show up in practice. To your points, one by one:

Are you sure that you pulled the newest master image from Docker when you did your tests, and not something from before this fix? What does docker run --rm -t loadimpact/k6:master version return?

I since deleted the container, but I pulled and ran again (docker run -v C:\<path>\dist:/dist -i loadimpact/k6:master run /dist/scenario4-full.bundle.js), and, unfortunately, I still get the error.

docker run --rm -t loadimpact/k6:master version now returns k6 v0.29.0 (2021-01-13T16:09:02+0000/c00a85b, go1.15.6, linux/amd64).


I am not sure what RefinedResponse is, I assume some TypeScript wrapper around the k6 Response ? Could the problem have come from that?

RefinedResponse is just a TypeScript type, not a wrapper, it describes the return value of a k6/http request. I use it like this, after each request/batch request:

import http from "k6/http";

// for singular requests
response = http.get(/* request config */)
checkIfResSuccess(response);

// or for batch requests: 
batchResponse = http.batch<ArrayBatchRequest[]>([ /* request configs */]);
batchCheckIfResSuccess(batchResponse);

This way, I am hoping to log every request - at least its url and status code. But, as I mentioned, most of the URLs printed in stream error: stream ID 19; CANCEL errors never get logged by checkIfResSuccess, no idea why.


Alternatively, do you have any redirects in your requests?

I don’t think I did, but either way, setting maxRedirects: 0 seems to have not made any difference


Finally, instead of comparing the status, can you check the response.error and response.error_code ?

I tried to test this briefly, but ran into some troubles, I’ll get back to it in the morning and post the results, I need to go sleep now. I might also look into the JSON output.


One last thing: Would you be interested in the source code of the failing test? I can’t post it publicly, but I should be able to get a permission to share it with you personally, if you don’t share it with anyone else.

Either way, thank you once again, have a great time!

This is very strange… :confused: I guess it could be caused by Go’s net/http module logging things directly, without actually returning a error proper error to the k6 code that called it… :man_shrugging: We’ve seen such stupid things before with cookies, for example, so maybe we should review the stdlib code a bit here. stream error: stream ID 19; CANCEL is not really an error, I think, so it might be something like that. Or it might be something Windows-specific… :man_shrugging: More investigation is definitely needed.

I am not sure if the script would be very helpful, since this seems like a very low-level issue :disappointed:. Details about your server, network, and over-all environment would probably help a lot more - which server and version and OS are you using, are there any proxies and firewalls between k6 and the server, etc. But just in case, you can also send me the script as a private message here or in the k6 community slack, I am na-- there. I’d request permission to share it with the other 2 k6 developers, if necessary, but I won’t share it with anyone else.

Hello again, @ned , I got the respose.error and response.error_code logging to work. It seems that every failed requests has the following:

response.code === 0
response.error === "stream error: stream ID x; CANCEL" (where x is an odd number)
response.error_code === 1000

Details about your server, network, and over-all environment would probably help a lot more - which server and version and OS are you using, are there any proxies and firewalls between k6 and the server, etc.

I don’t really know any of this information, but I’ll try to get you as much of it as I can, along with the permission to share the script with you and your co-devs. I’ll get back to you here. Thanks!

Hi @mkurz,

You can try to run it with GODEBUG=http2debug=1 This will add a lot of logs about what k6 does under the hood including which stream is being used for what and so on. This will hopefully show you something that you can act on or at least show you after what is the “cancel” happening. You can also obviously remove the sensitive data and upload it here(or somewhere else and link it here), so we can try to help you.

This will be really noisy so I recommend redirecting to a file to be looked at later. I don’t what the syntax is on windows(as I haven’t used it in a while) but you should redirect both stdout and stderr to a file.

Hope this helps you.