Tracking errors in InfluxDB/Grafana

I’m using InfluxDB to capture my metrics and want to be able to capture error statistics.

I’ve set up a small test which does three transactions.

  1. accountsAuthenticate - passes 2 of 2 checks
  2. betslipEnquiry - passes 2 of 2 checks
  3. betslipSubmit - fails 1 of 2 checks

When I visualize the results using this Grafana dashboard (k6 Load Testing Results dashboard for Grafana | Grafana Labs), it indicates there were three transactions but no errors.


I’m OK with Grafana when using Elasticsearch as a data source, but I have little experience with InfluxDB and I can’t even find a good tool to browse the data with.

Any tips?


Hi @BobRuub

Welcome to the support forum :tada:

I’m not entirely sure I understand exactly what you’re asking. I want to avoid making too many assumptions, so could you please elaborate so I can support you more effectively? My understanding is that you would like the checks per second panel to also display failed checks per second. Could you clarify what you intend to achieve? :slight_smile:

A couple of generic pointers I can give with my partial understanding of what you’re trying to achieve:

  • InfluxDB exposes its data through an HTTP REST API. There are probably fancier tools out there, but in general you can install the influx or flux command line tools (depending on the InfluxDB version you’ve installed) to interact with it. The InfluxDB query API is also well documented, so you might find pointers there.
  • Grafana has a query builder too. If you click on a panel’s title -> Edit, Grafana shows the query it uses and offers (in my opinion) a pretty intuitive way to build and modify queries. I find the UI is good at showing what is possible and what is not. Although I’m not too comfortable with InfluxDB myself, I was able to modify the dashboard you linked pretty easily using the query builder.
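
To see how the data is structured, a few schema-exploration statements sent to the HTTP API go a long way. The following is a minimal sketch, assuming InfluxDB 1.x listening on http://localhost:8086 and a database named k6 (both are assumptions, adjust them to your setup). The helper name buildQueryUrl is made up for illustration.

```javascript
// Assumed values: adjust to match your own InfluxDB 1.x instance.
const INFLUX_URL = "http://localhost:8086";
const DATABASE = "k6";

// Hypothetical helper: build a URL for the InfluxDB 1.x /query endpoint.
function buildQueryUrl(baseUrl, db, influxQl) {
  const url = new URL("/query", baseUrl);
  url.searchParams.set("db", db);
  url.searchParams.set("q", influxQl);
  return url.toString();
}

// InfluxQL statements that reveal which measurements, tags and fields exist.
const schemaQueries = [
  "SHOW MEASUREMENTS",
  'SHOW TAG KEYS FROM "checks"',
  'SHOW FIELD KEYS FROM "checks"',
];

const schemaUrls = schemaQueries.map((q) =>
  buildQueryUrl(INFLUX_URL, DATABASE, q)
);
```

Each resulting URL can be fetched with curl or opened in a browser, and the same statements can be typed into the influx command line shell.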

Let me know if that’s helpful. I’ll keep an eye out for more details on what you’re trying to achieve :+1:

Thanks for the response.

I’ll look into the InfluxDB API and the Grafana side. I was sort of hoping for a browser-like DB visualiser, but they don’t really seem to exist. The tool I do have, Chronograf, assumes you know how the data is structured, which is pretty much my problem; I’m much more used to Elasticsearch or SQL than to time series. I’m a visual learner: I could read docs all day, but seeing it makes it real.

As for my actual problem, I really want to be able to see the following.

  1. a graph of successful, as in passed all checks, transaction by name.
  2. a graph of un-successful, as in failed one or more checks, transaction by name.
  3. a table of transaction failures with transaction name, http code and failure description.

Thinking about my example, I am currently adding a trend metric for each attempted transaction, whereas I should be able to solve 1 and 2 by adding one trend metric for each success and another for each failure.

const urlCheckResponse = check(urlResponse, {
  'status is 200': (r) => r.status === 200,                       // contains http 200 OK
  'response body': (r) => r.body.indexOf('accountNumber') !== -1, // contains string accountNumber
});

if (urlCheckResponse) {
  accountsAuthenticate.add(urlResponse.timings.duration);
} else {
  accountsAuthenticateFail.add(urlResponse.timings.duration);
}

Not sure how I’d solve 3 yet.

Thanks a lot for the clarifications. I’ve done some research and experimentation which I believe might be helpful to you. Let me know if that’s the sort of thing you were looking for.

Using a test script of my own, which tries to reproduce a workflow roughly similar to what you showed:

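import http from "k6/http";
import { check, sleep } from "k6";

const products = ["hoodie", "beanie", "belt", "cap", "polo"];

export default function () {
  // Pick a random product; multiplying (not adding) keeps the index within the array bounds.
  const randomProduct = products[Math.floor(Math.random() * products.length)];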
  const response = http.get(
    "http://ecommerce.test.k6.io/product/" + randomProduct + "/"
  );

  check(
    response,
    {
      "is status 200": (r) => r.status == 200,
      "text verification": (r) => r.body.includes(randomProduct),
    },
    {
      status: response.status,
      name: "this is request A",
      failure_reason: response.error,
    }
  );

  sleep(Math.random() * 5);
}

N.B. I add tags to my checks here: one for the response status, one for what I interpreted as your “transaction name”, and one for the reason for a failure (if it is empty, there was no failure, and we can filter it out later).
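
If several transactions need the same three tags, the tags object can be built in one place. This is a sketch only: buildCheckTags is a hypothetical helper name, and the response object is assumed to carry the status and error properties used in the script above.

```javascript
// Hypothetical helper: build the tags object passed as the third argument to check().
function buildCheckTags(transactionName, response) {
  return {
    name: transactionName,
    // InfluxDB stores tag values as strings, so convert the numeric status explicitly.
    status: String(response.status),
    // Empty string means "no failure"; it can be filtered out later in the query.
    failure_reason: response.error || "",
  };
}
```

In a k6 script it would be used as check(response, { ... }, buildCheckTags("this is request A", response)).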

My test setup runs the script, sends the output to InfluxDB, and graphs the results in Grafana using the dashboard you pointed out. I’ve added three panels to Grafana to try to address your need based on my script test modifications.

To address your point 1:

To address your point 2:

To address your point 3:
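
One way the three panels could be expressed in InfluxQL, as a sketch and not necessarily the exact queries used. It assumes k6's InfluxDB output wrote a "checks" measurement whose "value" field is 1 for a passed check and 0 for a failed one, and that the name, status and failure_reason tags from the script above are present. $timeFilter and $__interval are Grafana macros, so these strings are meant to be pasted into a panel's query editor. Note that they count individual check results, not whole transactions.

```javascript
// Assumed schema: measurement "checks", field "value" (1 = pass, 0 = fail),
// tags "name", "status" and "failure_reason" as set in the test script.
const panelQueries = {
  // Point 1: passed checks over time, one series per transaction name.
  passedByName:
    'SELECT count("value") FROM "checks" WHERE "value" = 1 AND $timeFilter ' +
    'GROUP BY time($__interval), "name" fill(0)',

  // Point 2: failed checks over time, one series per transaction name.
  failedByName:
    'SELECT count("value") FROM "checks" WHERE "value" = 0 AND $timeFilter ' +
    'GROUP BY time($__interval), "name" fill(0)',

  // Point 3: raw rows for a table panel listing each failed check.
  failureTable:
    'SELECT "name", "status", "failure_reason", "value" FROM "checks" ' +
    'WHERE "value" = 0 AND $timeFilter',
};
```

Identifiers are double-quoted throughout because NAME is a reserved word in InfluxQL.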

Although I’m more used to Prometheus, my experience of time series databases is that, although we call them “databases”, they’re really “tagged series”. I find it helpful to think of them in those terms because it helps adjust our mental models. InfluxDB also reuses some terminology from the SQL world, which I believe leads to unjustified expectations about how to interact with it. That’s unfortunate. During this investigation I found that InfluxDB has a great documentation page explaining how things are structured under the hood.

Let me know if that’s helpful :bowing_man:

Very helpful.

I’ve updated most of my scripts to cater for this and in general it works really well.

However, I’m seeing unexpected behavior in some circumstances. For example, I perform an HTTP delete for which an HTTP 204 response is acceptable, yet it is still reported as an error.

Grafana Dashboard

K6 Code Snippet

  const urlResponse = http.put(Url.toString(), Payload, headerParams);
  const urlCheckResponse = check(urlResponse, {
    'status is 204': (r) => r.status === 204,                     // expects HTTP 204 No Content
  },{
    status: urlResponse.status,
    name: "blackbooksDelete",
    failure_reason: pAccountNumber + " : non http 204 response",
  });

Any clues as to what I’m doing wrong?

SORTED.

I was assuming anything not equal to 200 was a failure. I changed it to anything less than 200 or greater than or equal to 300, and it works fine.
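
The corrected rule can be written down as a small predicate. This is a sketch of the logic only; the function name isFailureStatus is hypothetical, and where the rule actually lives (a Grafana query filter or the k6 script) depends on your setup.

```javascript
// Hypothetical helper: treat any status outside the 2xx range as a failure.
function isFailureStatus(status) {
  return status < 200 || status >= 300;
}
```

With this rule a 204 from the delete call counts as a success, while a 404 or a 500 still counts as a failure.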

I’m still trying to get my head around InfluxDB as a concept, but it’s starting to make sense.
