
This article covers a few of the ways you can clone events in Cribl Stream, and the advantages of each choice.

 

Events can be cloned in the following places:

  • In the Routing table (most common)
  • In a pre-processing Pipeline, before the events get to the Routing table
  • In a Pipeline and/or an Output Router destination

As a refresher, the path an event takes through Cribl looks like this:

[Figure: event lifecycle diagram]

Where in the stack you choose to clone will most often be dictated by your particular use case requirements. For this exercise, imagine we have two analysis destinations. All events will land in General SIEM. But events flagged as being from “bad guys” need to also be delivered to the Special SIEM. In order to identify the baddies, we'll assume we have a list of IPs in a file that we can use as a lookup table.

 

Cloning In the Routing Table

 

This is the most common place to clone your events. First, a refresher on how Routes work in Cribl Stream. Each event percolates through the rules in the Data Routes table in order. If the JavaScript filter for a rule matches the event (that is, returns a truthy value), the event is processed by that rule: the specified Pipeline processes the event, and the selected Destination receives the result. If the Final flag is checked, that's the end of that event's story. The event has been "consumed" by the routing rule.

 

When the Final Flag is unchecked, the processing actually happens on a clone of the event. The original, untouched event continues down to the following rules looking for another match. When you disable the final flag, you get a new widget on the row in the table indicating the event will be copied into this rule.

 

💡 You also get a new section in the rule where you can add fields to the cloned event before it heads into the Pipeline and on to the Destination.

 

Implementation details for our use case:

  • Add a route with a filter of something like C.Lookup('my_baddies.csv','ip').match(src_ip). The Final flag is not checked. What this means: if an event's src_ip field matches an entry in the lookup's ip column, we take a clone of that event and send it through the Pipeline to the Special SIEM.
  • A following route rule would capture all the events, both good and bad, and send to General SIEM. No lookup required, we'll just use true. Final flag is checked for this one, meaning any matching events would be consumed by this rule.
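
The two rules above can be modeled with a small Python sketch. This is a toy simulation of the Final flag semantics described in this section, not Cribl's implementation: the BAD_IPS set stands in for the ip column of my_baddies.csv, and the route names, sample IPs, and the route() helper are all made up for illustration.

```python
# Toy model of a Routing table with a Final flag (illustrative only).
BAD_IPS = {"203.0.113.7", "198.51.100.23"}  # hypothetical stand-in for my_baddies.csv

ROUTES = [
    # (name, filter, final, destination)
    ("baddies", lambda e: e.get("src_ip") in BAD_IPS, False, "Special SIEM"),
    ("everything", lambda e: True, True, "General SIEM"),
]

def route(event, routes=ROUTES):
    """Return the (destination, event) deliveries produced for one event."""
    deliveries = []
    for _name, matches, final, destination in routes:
        if not matches(event):
            continue
        if final:
            # Final checked: the rule consumes the original event.
            deliveries.append((destination, event))
            break
        # Final unchecked: the rule works on a copy; the original moves on.
        deliveries.append((destination, dict(event)))
    return deliveries

bad = route({"src_ip": "203.0.113.7", "msg": "login failed"})
good = route({"src_ip": "192.0.2.10", "msg": "login ok"})
print([dest for dest, _ in bad])   # ['Special SIEM', 'General SIEM']
print([dest for dest, _ in good])  # ['General SIEM']
```

Swapping the order of the two rules in this model would send everything to General SIEM only, because the Final rule would consume each event before the baddie rule could see it.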

Key Point: When using the Routing Table and Final Flag to clone events, the cloning is happening before your Pipelines and Destinations:

 

[Figure: cloning in the Routing table, ahead of Pipelines and Destinations]

 

In most cases cloning makes sense in the routing table. But not always.

 

Pre-Route Cloning

 

Let's take another refresher on how Cribl works. Normally a Stream Pipeline is attached to a Route. (See the event lifecycle diagram above.) But you can also attach a Pipeline directly to a Source, so that events pass through the Pipeline before they are handed off to any routing decisions. If you need to clean up, normalize, or enrich data before it lands in the Routing table, this is where you'd do it. We call this a pre-processing Pipeline.

 

Organizationally, and for performance, this may make more sense than doing the work in the Routing table. Compare simple rules that reference a field such as hinky, set once by the pre-processing Pipeline, with a long JS expression containing C.Lookup() calls. Or worse, multiple rules each making C.Lookup() calls. Qualify once, qualify early.
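
To make "qualify once, qualify early" concrete, here is a rough Python sketch that counts lookups. It is a simplified model under stated assumptions: lookup() is a hypothetical stand-in for a C.Lookup() match, three rules are assumed to inspect every event, and the IPs are made up.

```python
BAD_IPS = {"203.0.113.7"}   # hypothetical stand-in for my_baddies.csv
calls = {"lookup": 0}       # counter for how often the lookup runs

def lookup(ip):
    """Hypothetical stand-in for a C.Lookup() match; counts each call."""
    calls["lookup"] += 1
    return ip in BAD_IPS

def sample_events():
    return [{"src_ip": "203.0.113.7"}, {"src_ip": "192.0.2.10"}, {"src_ip": "192.0.2.11"}]

RULE_COUNT = 3  # assumed number of rules that need to know about baddies

# Variant 1: every rule's filter repeats the lookup.
calls["lookup"] = 0
for event in sample_events():
    for _rule in range(RULE_COUNT):
        lookup(event["src_ip"])
calls_in_rules = calls["lookup"]

# Variant 2: one lookup up front sets hinky; the rules only read the field.
calls["lookup"] = 0
for event in sample_events():
    if lookup(event["src_ip"]):
        event["hinky"] = True
    for _rule in range(RULE_COUNT):
        bool(event.get("hinky"))  # cheap field check, no lookup
calls_up_front = calls["lookup"]

print(calls_in_rules, calls_up_front)  # 9 3
```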

 

The implementation details: 

  • Add a lookup function that checks the IP and creates a hinky field on match.
  • Add a Clone function for when hinky exists.
  • Add any other functions that make sense to add to the pipeline.
    • Note: The Clone function should happen after other functions that would be common to both events.

[Figure: pre-processing Pipeline with the lookup and Clone functions]

  • In the Routing table:
    • Send events tagged as hinky to Special SIEM (Final Flag checked).
    • Send all events to General SIEM (Final Flag checked).

[Figure: Routing table rules for Special SIEM and General SIEM]

 

Post-route Cloning and Output Routers

 

For whatever reason, we've decided pre-processing isn't for us, and cloning with Final Flag controls wasn't right either. Instead, we want to control cloning in our Pipeline.

 

In this case, our Pipeline receives a single copy of the event from the Routing table. Within the Pipeline we can decide which events are cloned, and when. As before, try to complete all your common work before cloning so you don't cause extra work. We'll perform the baddie lookup in the filter for the Clone function, so we only clone the event if it's a baddie. In the Clones section of the Clone function, set a field to identify the new event. We can use hinky => true again.

 

From the Clone function onward in the Pipeline, each baddie is represented by two events making their way through the rest of the functions, and out to the destination.

 

Normally you don't want two copies of an event going to the same SIEM. This is where the Output Router comes into play. Think of the Output Router as another mini-version of the Routing table. In an Output Router, you can point events to other destinations based on filters, and check (or uncheck) the Final flag.

 

You can use an Output Router (OR) in a few ways. Let's pick up where we left off above with a cloned event in the Pipeline. We now have two events exiting the Pipeline, headed to the OR. One has a hinky field, one does not. In the OR we add a destination entry with a filter of hinky, send it to Special SIEM, and flag it as Final. Then we add another destination to the OR with a filter of true, and flag it as Final:

 

[Figure: Output Router with a hinky rule for Special SIEM and a true rule for General SIEM]
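
The Clone-then-Output-Router flow can be sketched in Python as follows. This is a toy model, not Cribl code: BAD_IPS stands in for the lookup file, the parsed field represents whatever common work precedes the Clone function, and pipeline() and output_router() are hypothetical helpers.

```python
BAD_IPS = {"203.0.113.7"}  # hypothetical stand-in for my_baddies.csv

def pipeline(event):
    """One event in; one or two events out."""
    event = dict(event, parsed=True)  # common work, done once before cloning
    out = [event]
    if event.get("src_ip") in BAD_IPS:       # filter on the Clone function
        out.append(dict(event, hinky=True))  # the clone is tagged hinky => true
    return out

OUTPUT_ROUTER = [
    # (filter, destination, final)
    (lambda e: bool(e.get("hinky")), "Special SIEM", True),
    (lambda e: True, "General SIEM", True),
]

def output_router(event, rules=OUTPUT_ROUTER):
    """Return the destinations that receive this event."""
    sent = []
    for matches, destination, final in rules:
        if matches(event):
            sent.append(destination)
            if final:
                break
    return sent

bad_out = [output_router(e) for e in pipeline({"src_ip": "203.0.113.7"})]
good_out = [output_router(e) for e in pipeline({"src_ip": "192.0.2.10"})]
print(bad_out)   # [['General SIEM'], ['Special SIEM']]
print(good_out)  # [['General SIEM']]
```

Because both rules are Final in this model, each event lands in exactly one destination: the tagged clone in Special SIEM and the untagged original in General SIEM.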

 

For another take on the OR, let's rewind a bit. Remove the Clone function from the pipeline. It’s an ordinary pipeline now: one event in, one event out. When it lands in the OR, you'll have one destination entry pointed to Special SIEM with the final flag unchecked. Then there’s a second rule, importantly listed after the first, with a delivery to General SIEM with the final flag checked. Now the cloning happens in the OR itself. You can, of course, adjust the filters as needed in the OR. For example, the C.Lookup() function could be in there to only send baddie events to the Special SIEM:

 

[Figure: Output Router with a non-final rule for Special SIEM followed by a final rule for General SIEM]

 

How to Choose

 

There are 2 primary factors to consider when deciding where to clone.

 

First: performance. Try to stage your clones after you've parsed, enriched, filtered, etc. If you clone before these activities, you'll do them twice for every cloned event.
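
As a back-of-the-envelope illustration of that cost, the Python sketch below counts how often the shared functions run when cloning happens before or after them. The event mix (10% cloned) and the count of five common functions are made-up numbers, and common_function_runs() is a hypothetical helper.

```python
def common_function_runs(events, common_functions, clone_first):
    """Count executions of the functions shared by originals and clones.

    events: list of booleans, True when the event will be cloned.
    """
    runs = 0
    for will_clone in events:
        # Cloning first means the shared functions process both copies.
        copies = 2 if (clone_first and will_clone) else 1
        runs += copies * common_functions
    return runs

events = [True] * 100 + [False] * 900  # made-up mix: 10% of events get cloned
early = common_function_runs(events, common_functions=5, clone_first=True)
late = common_function_runs(events, common_functions=5, clone_first=False)
print(early, late)  # 5500 5000
```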

 

Second: organization. Ease of management should not be underestimated. In some situations it matters more than performance. If it means the team can jump in and make changes without fear of a house of cards collapsing, that may be worth a few CPU cycles. If management ease is your priority, pick the cloning point that makes the most sense to you.

 

Conclusion

 

Any time you start cloning events you're going to have extra layers of complexity. Cribl helps by providing the flexibility to choose at what stage you clone events, how you identify them, and how you deliver them. Where it works best for your situation is all that matters.
