Question

How are you handling token persistence?

  • March 11, 2025
  • 30 replies
  • 28 views

I imagine a lot of people are using the Splunk Load Balanced destination with indexer discovery. How are you handling token persistence? We patch/recycle our Cluster Master monthly, so I'm looking for a method to "restore" the tokens without impacting data delivery. I believe tokens are stored in the KV store, $SPLUNK_HOME/etc/passwd, and $SPLUNK_HOME/etc/auth/splunk.secret files.

30 replies

Raanan Dagan
  • Employee
  • March 11, 2025

<@U02QJ374Z3R> there are many using the Splunk LB destination with Indexer Discovery, but as far as I know most customers don't update the Indexer Discovery token very often .. I'm not sure how frequently most rotate it.


Raanan Dagan
  • Employee
  • March 11, 2025

Once you update the token, Cribl will not impact data delivery, per the docs on Worker Process rolling restarts: "During a restart, to minimize ingestion disruption and increase availability of network ports, Worker Processes on a Worker Node are restarted in a rolling fashion. 20% of running processes – with a minimum of one process – are restarted at a time. A Worker Process must come up and report as started before the next one is restarted. This rolling restart continues until all processes have restarted. If a Worker Process fails to restart, configurations will be rolled back."


Raanan Dagan
  • Employee
  • March 11, 2025

With Cribl there are two ways (that I can think of) to update the Indexer Discovery token: manually in the UI, or via the Cribl API.
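If you go the API route, it would look roughly like this (just a sketch .. the endpoint paths, group name, output ID, and the indexerDiscoveryConfigs field name are assumptions, so verify against the API reference for your Cribl version):

```python
# Hypothetical sketch: rotating the Indexer Discovery auth token on a
# Splunk LB destination through the Cribl REST API. Endpoint paths and
# field names are assumptions -- check the docs for your Cribl version.
import requests

LEADER = "https://leader.example.com:9000"  # hypothetical Leader URL
GROUP = "default"                           # Worker Group holding the destination
OUTPUT_ID = "splunk_lb_prod"                # hypothetical Splunk LB destination ID
NEW_TOKEN = "<new pass4SymmKey token>"

# Log in to the Leader and grab a bearer token
login = requests.post(f"{LEADER}/api/v1/auth/login",
                      json={"username": "admin", "password": "<password>"})
login.raise_for_status()
headers = {"Authorization": f"Bearer {login.json()['token']}"}

# Fetch the destination, swap in the new token, and PATCH the full config back
url = f"{LEADER}/api/v1/m/{GROUP}/system/outputs/{OUTPUT_ID}"
cfg = requests.get(url, headers=headers).json()["items"][0]
cfg["indexerDiscoveryConfigs"]["authToken"] = NEW_TOKEN  # assumed field name
requests.patch(url, json=cfg, headers=headers).raise_for_status()

# A commit & deploy (UI or API) is still needed for Workers to pick this up.
```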


  • Author
  • Known Participant
  • March 11, 2025

I don't use Indexer Discovery, nor do I recommend it to clients. Just say no.


  • Author
  • Known Participant
  • March 11, 2025

<@U01J549PR6Y> - it's not that I want to update the token regularly in Cribl (if at all, unless there's a breach or something). But when I recycle my Cluster Manager, the auth tokens get wiped on the Splunk side and would therefore be invalidated in Cribl's indexer discovery. Granted, this is more of a Splunk question, but I figured enough Cribl customers leverage the Splunk LB destination that someone has a solution for it.


  • Author
  • Known Participant
  • March 11, 2025

<@UEGNG8MJB> any particular reason? Curious what better method exists to maintain a working list of online indexers.


Jon Rust
  • Employee
  • March 11, 2025

I used DNS. One name, all the indexer IPs behind it. I'm curious why a CM restart trashes the token, though. Never seen that before.
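One quick way to sanity-check that a single record covers every indexer (the hostname here is made up):

```python
# Confirm one DNS name resolves to all the indexer IPs.
# The hostname is made up; substitute your own record.
import socket

name = "indexers.example.internal"
infos = socket.getaddrinfo(name, 9997, type=socket.SOCK_STREAM)
ips = sorted({sockaddr[0] for *_, sockaddr in infos})
print(f"{name} -> {ips}")  # should list every online indexer
```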


  • Author
  • Known Participant
  • March 11, 2025

Not a restart, but swapping out the EC2 instance for updated AMI etc.


  • Author
  • Known Participant
  • March 11, 2025

We also recycle the indexers for the same reasons. (security compliance)


Raanan Dagan
  • Employee
  • March 11, 2025

As Jon said, the most common alternative I have seen is DNS.


  • Author
  • Known Participant
  • March 11, 2025

DNS. Cribl handles DNS round robin waaaay better than Splunk. And coming from an old-school network guy who prefers an NLB over any round-robin DNS, that says something. The programmers did an excellent job of leveraging DNS entries that are loaded with IPs. And the Cribl load-balance approach is better than Splunk's.


Raanan Dagan
  • Employee
  • March 11, 2025

I've found DNS is just reliable with Stream.


  • Author
  • Known Participant
  • March 11, 2025

Thanks. So then the 'discovery' config would look something like this?
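Something like this, I'm guessing (sketching the destination body in the same shape the API uses .. exact field names may differ from the real schema):

```python
# Rough sketch of a DNS-based Splunk Load Balanced destination.
# Field names approximate the outputs config and are not the exact schema.
splunk_lb_destination = {
    "id": "splunk_lb_prod",      # hypothetical destination ID
    "type": "splunk_lb",
    "indexerDiscovery": False,   # no Indexer Discovery, so no CM token to persist
    "hosts": [
        # a single DNS name that resolves to every online indexer
        {"host": "indexers.example.internal", "port": 9997, "tls": "inherit"},
    ],
    "dnsResolvePeriodSec": 600,  # re-resolve the record periodically
}
```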


Jon Rust
  • Employee
  • March 11, 2025

correct!


  • Author
  • Known Participant
  • March 11, 2025

great....stay tuned. :smile:


  • Author
  • Known Participant
  • March 11, 2025

Worked like a charm. Thanks for the guidance on this, much appreciated.


  • Author
  • Known Participant
  • March 11, 2025

One follow up: which backpressure method best suits the Splunk LB destination? The way I understand this setting, if the destination (i.e. the Splunk indexers) cannot receive data:
» block - buffer in memory on the Cribl workers
» drop - /dev/null it
» PQ - queue it on disk on the Cribl workers until the destination is accepting data


Jon Rust
  • Employee
  • March 11, 2025

enable PQ


Jon Rust
  • Employee
  • March 11, 2025

and set up some constraints around the storage


Jon Rust
  • Employee
  • March 11, 2025

I would enable compression too (not on by default, iirc)


  • Author
  • Known Participant
  • March 11, 2025

Reading this (https://docs.cribl.io/stream/persistent-queues/#persistent-queue-details-and-constraints), it looks like the PQs use the workers' storage. Hard to put a number on the amount of storage a worker can allocate for PQ, since it's relative to the amount of data you'd be sending to the destination, right?


Jon Rust
  • Employee
  • March 11, 2025

If you were doing 240 GB per day, that's 10 GB per hour. Divided by 2 workers == 5 GB per hour per worker; then factor in compression.


Jon Rust
  • Employee
  • March 11, 2025

Yes: rate of sending * expected downtime / number of workers * expected compression


Jon Rust
  • Employee
  • March 11, 2025

and add 50% :slightly_smiling_face:
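In code form, the napkin math is just (numbers from the example above; the compression ratio is a guess):

```python
# Napkin math for per-worker persistent queue sizing.
# Example numbers from the thread; compression ratio is an assumption.
daily_volume_gb = 240        # data sent to the Splunk LB destination per day
downtime_hours = 1           # how long the destination might stay unreachable
num_workers = 2              # workers sharing the load
compression_ratio = 0.5      # on-disk size after PQ compression (assumed)
headroom = 1.5               # "and add 50%"

hourly_rate_gb = daily_volume_gb / 24                       # 10 GB/hour
pq_per_worker_gb = (hourly_rate_gb * downtime_hours / num_workers
                    * compression_ratio * headroom)
print(f"PQ storage per worker: {pq_per_worker_gb:.1f} GB")  # ~3.8 GB
```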


  • Author
  • Known Participant
  • March 11, 2025

Napkin math FTW. Thanks a lot Jon. I did set it to 5 GB, since our workers are on Fargate and have 20 GB of ephemeral storage by default. And since PQ is a worst-case sort of scenario, i.e. the entire indexer cluster is :dumpsterfire:, that seems to be a safe setting.