Solved

What Is The Simplest, Most Direct Way To Observe When A Node Becomes Disconnected From The Leader, Preferably When It…

  • May 6, 2026
  • 8 replies
  • 1 view

This message originated from Cribl Community Slack.

I've poked around a bit in the #channel history and didn't find a direct answer to this, so forgive me if I missed it. What is the simplest, most direct way to observe when a node becomes disconnected from the leader, preferably when it has been disconnected for x consecutive minutes? We're ingesting Cribl Stream logs into Splunk. Is this information in those? If not, which server-side logs might contain this information?

Best answer by fetaboy918

Have you tried creating an alert in Splunk that tracks the last event received per server from Cribl Edge and fires when that event is older than some threshold, for example latest_time > 2h? That's what we do, anyway. It catches both log ingestion delays and possible disconnections.

8 replies

I believe there is some debug logging. I'm just going to confirm.

It won't really be feasible to rely on that given that it logs to cribl.log.

I could create a "No Data Received" Notification on a common source, by Fleet, but that seems like an indirect way of creating observable data when it seems like there should already be some in the server-side logs. Surely, events get logged on the leader to correspond to 'Connection Status = Disconnected' and 'Disconnected at'?


mmarker
  • Known Participant
  • May 6, 2026
I am also interested in this concept because I just had 90 edge nodes drop off out of nowhere...

mmarker
  • Known Participant
  • May 6, 2026
A bunch of my Cribl Edge and Stream nodes went down at exactly 2026-04-22 19:46:46.

We're migrating off the Splunk UF to Edge and would love to alert and/or report on 'Connection Status = Disconnected' nodes. We can massage the equivalent information from our Splunk Deployment Servers (index=_internal) today. I'm confident you guys wouldn't have a blind spot that Splunk doesn't have! :wink:

fetaboy918
  • New Participant
  • Answer
  • May 6, 2026
Have you tried creating an alert in Splunk that tracks the last event received per server from Cribl Edge and fires when that event is older than some threshold, for example latest_time > 2h? That's what we do, anyway. It catches both log ingestion delays and possible disconnections.

That might have to work, in cases where we're talking about servers and not workstations (since the latter are prone to having natural delays when powered down). Thanks!
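
For anyone who wants to prototype the accepted answer's "last event per host" check outside of Splunk, here is a minimal Python sketch of the logic. It is not a Cribl or Splunk API: the function name `find_silent_hosts`, the host names, and the `exempt` set are all hypothetical, and it assumes you already have a mapping of each host to the timestamp of its most recent event (for example, exported from a search over the ingested Cribl logs). The `exempt` parameter is one possible way to handle the workstation caveat raised above.

```python
from datetime import datetime, timedelta, timezone


def find_silent_hosts(last_seen, now, max_silence, exempt=frozenset()):
    """Return (host, silence) pairs for hosts silent longer than max_silence.

    last_seen   -- dict mapping host name to the datetime of its latest event
    now         -- reference time for the check
    max_silence -- timedelta; the "x consecutive minutes" threshold
    exempt      -- hosts to skip, e.g. workstations that power down routinely
    """
    silent = []
    for host, seen in last_seen.items():
        if host in exempt:
            continue
        silence = now - seen
        if silence > max_silence:
            silent.append((host, silence))
    # Longest-silent hosts first, so the worst offenders lead the report.
    silent.sort(key=lambda pair: pair[1], reverse=True)
    return silent


if __name__ == "__main__":
    # Hypothetical sample data; real timestamps would come from your logs.
    now = datetime(2026, 5, 6, 12, 0, tzinfo=timezone.utc)
    last_seen = {
        "edge-srv-01": now - timedelta(minutes=5),
        "edge-srv-02": now - timedelta(hours=3),
        "laptop-17": now - timedelta(hours=9),
    }
    for host, silence in find_silent_hosts(
        last_seen, now, timedelta(hours=2), exempt={"laptop-17"}
    ):
        print(f"{host} silent for {silence}")
```

As in the Splunk alert, a silent host here may be disconnected or may simply have nothing to send, so this stays an indirect signal rather than a direct read of the leader's connection status.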