[RPKI] Grafana dashboard

Tim Bruijnzeels tim at nlnetlabs.nl
Wed Jun 15 12:07:53 UTC 2022


Hi Cristian,

> On 15 Jun 2022, at 13:29, Cristian Cardoso <cristian.cardoso11 at gmail.com> wrote:
> 
> Hi
> 
> As I mentioned some time ago, my certificates are expiring every 3 months, so I'm trying to somehow monitor as many items as possible to try to find out where the problem is, so I'm trying to create a dashboard with the information in grafana.

I am not sure whether Grafana is a good monitoring tool for this; perhaps it is. As I said, I would really like to know what people would like to see here and/or in a more dedicated end-point.

For the moment what may help you is to look at the following gauges:

- krill_ca_parent_success

Example on our testbed system:

# HELP krill_ca_parent_success status of last CA to parent connection (0=issue, 1=success)
# TYPE krill_ca_parent_success gauge
krill_ca_parent_success{ca="testbed", parent="ta"} 1

A value of 0 would indicate a problem. Note that the gauge is multi-dimensional: there can be more than one CA in Krill, and each CA can have more than one parent. Even if *your* setup has only one CA with one parent, the gauge still names that CA and parent.

- krill_ca_ps_success

Example on our testbed:

# HELP krill_ca_ps_success status of last CA to Publication Server connection (0=issue, 1=success)
# TYPE krill_ca_ps_success gauge
krill_ca_ps_success{ca="ta"} 1
krill_ca_ps_success{ca="testbed"} 1

Again, a value of 0 would indicate a problem. The CAs are named. At this point CAs in Krill can only use one publication server, so there is no ps=".." label in this case.
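
As a rough sketch of how a monitoring script could act on these two gauges: the Python below parses metrics text in the format shown above and returns every CA (and parent, where present) whose gauge is 0. The function name failing_connections and the simplified regular expressions are assumptions made for illustration and are not part of Krill. Fetching the metrics text from your Krill instance is left out.

```python
import re

# Simplified matcher for lines such as:
#   krill_ca_parent_success{ca="testbed", parent="ta"} 1
# Assumption: label values contain no escaped quotes or braces.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

# The two gauges discussed in this thread.
WATCHED = ("krill_ca_parent_success", "krill_ca_ps_success")


def failing_connections(metrics_text, watched=WATCHED):
    """Return (metric_name, labels) pairs for watched gauges whose value is 0."""
    failures = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") not in watched:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        if float(match.group("value")) == 0:
            failures.append((match.group("name"), labels))
    return failures
```

Each entry in the result carries the ca label (and the parent label for krill_ca_parent_success), so an alert can say which CA has the problem instead of only that something is wrong.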


There is a CLI command and an API end-point to see per-CA issues, but the call is authenticated:
https://krill.docs.nlnetlabs.nl/en/stable/cli.html#krillc-issues

Example API call:

$ krillc issues --ca newca --api
GET:
  https://localhost:3000/api/v1/cas/newca/issues
Headers:
  Authorization: Bearer secret
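
For a monitoring script, the same call could be prepared in Python along these lines. This is a minimal sketch using only the standard library. The helper name build_issues_request is hypothetical, and the URL, CA name and token are simply the values from the example above.

```python
import urllib.request


def build_issues_request(base_url, ca, token):
    """Build (but do not send) the authenticated GET request for a CA's issues.

    Hypothetical helper: the path and the Bearer header mirror the
    `krillc issues --ca newca --api` output shown above.
    """
    url = f"{base_url.rstrip('/')}/api/v1/cas/{ca}/issues"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )


# Values taken from the example above.
request = build_issues_request("https://localhost:3000", "newca", "secret")
```

Passing the request to urllib.request.urlopen() would send it. If your Krill instance uses a certificate that your system does not trust, you would also need to supply a suitable SSL context.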


I think it may be useful if we also exposed the CA-specific issues (i.e. trouble connecting to the parent or publication server) on an unauthenticated path for monitoring purposes in future. Providing some additional mapping is easy to do and would allow unauthenticated monitoring tools to scrape for issues more easily. E.g.:
https://localhost:3000/issues/newca

In case people feel that this would create an unwanted exposure, please let me know. I think this should be okay, because the information is not that sensitive, and you can shield access to this path on a proxy.

Tim

> 
> On Fri, 10 Jun 2022 at 12:11, Tim Bruijnzeels <tim at nlnetlabs.nl> wrote:
> Hi Cristian,
> 
> On 9 Jun 2022, at 15:50, Cristian Cardoso via RPKI <rpki at lists.nlnetlabs.nl> wrote:
> > 
> > Hi
> > 
> > Does anyone use Krill metrics in prometheus and generate dashboards in Grafana?
> 
> As you probably found out the krill prometheus metrics are documented here:
> https://krill.docs.nlnetlabs.nl/en/stable/monitoring.html
> 
> We are not using Grafana for krill ourselves since most of the metrics aren't really things that are that interesting to watch over a time series.
> 
> But let me turn this around into a question:
> What are the krill metrics that you would want to monitor and graph?
> 
> > Does anyone happen to know if it is possible to convert the unix timestamp to some date value supported in Grafana?
> 
> I think I had some limited success in the past, but I am a grafana newbie and forgot.
> 
> Tim
> 
> 
> 


