Question

Extracting values from a netscaler log

  • March 11, 2025
  • 21 replies
  • 25 views

This is beyond my skillset, so I was hoping someone could help me. I have netscaler logs that come in as key<space>value and was hoping there was a regex to extract them all. Goes like this:
```
Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -
```
That's just one sample. The rest are sent in a similar format. I was looking for one regex to extract them and put them in fields.
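For anyone who wants to see the idea outside of a pipeline, here is a minimal Python sketch of one regex that pulls every pair out of the sample above. The pattern and the `extract_pairs` name are illustrative assumptions, not the answer originally posted in this thread, and it assumes every pair is followed by the ` -` separator shown in the sample.

```python
import re

# Hypothetical pattern: a key starting with a letter, whitespace, a
# non-space value, then the " -" separator seen in the sample event.
PAIR_RE = re.compile(r"(?P<key>[A-Za-z]\w*)\s+(?P<value>\S+)\s+-")


def extract_pairs(event: str) -> dict[str, str]:
    """Return every key/value pair found in a 'Key value - ' style event."""
    return {m.group("key"): m.group("value") for m in PAIR_RE.finditer(event)}


sample = (
    "Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - "
    "NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -"
)
fields = extract_pairs(sample)
print(fields)
```

If other events in the feed omit the trailing ` -`, the pattern would need a looser terminator, so treat it as a starting point only.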

21 replies

wow. Thanks, so much!


Hey <@ULBGHDPNY> - thanks for the help here. Was wondering if I could run another one by you. You seem to be the regex expert on this channel. I have a feed coming in that uses key-value events. The string values are enclosed in quotes ("), which makes them useless in Splunk for the TERM command. I'd like to remove the quotes, if they exist, from each key-value pair. Also, if possible, replace spaces in the value with underscores. Is that possible? Here's an example:
```
1.2.3.4 devname="DEVNAME" devid="AZEVTM21000028" eventtime=1681220885920164816 tz="-0400" logid="0000000013" type="traffic" subtype="forward" level="notice" vd="root" srcip=4.5.6.7 srcport=50239 srcintf="port2" srcintfrole="undefined" dstip=7.8.9.0 dstport=10000 dstintf="port5" dstintfrole="undefined" srccountry="Reserved" dstcountry="Reserved" sessionid=1316660221 proto=6 vrf=1 action="timeout" policyid=42 policytype="policy" poluuid="65eAAA79a-88Z1-51ec-87EGF8-61fd6d21915b" policyname="AWS-POLICHY-TO-MADEUP-Subnets" service="TCP-1000" trandisp="noop" duration=10 sentbyte=52 rcvdbyte=0 sentpkt=1 rcvdpkt=0 appcat="unscanned" testfield="SOME TEXT"
```


Jon Rust
  • Employee
  • March 11, 2025

No regex required for the first ask. Just use the Parser function with Operation mode set to Reserialize, Type K=V, and Source field _raw. Boom. Quotes gone.


Jon Rust
  • Employee
  • March 11, 2025

To combine the second ask, change the Parser to Extract mode, saving to a new field, call it `parsed`. Then use the Mask function to clean up spaces. Then the Serialize function to change back to K=V. Finally, an Eval to drop the `parsed` field.
```
{
  "conf": {
    "output": "default",
    "streamtags": [],
    "groups": {},
    "asyncFuncTimeout": 1000,
    "functions": [
      {
        "filter": "true",
        "conf": {
          "mode": "extract",
          "type": "kvp",
          "srcField": "_raw",
          "cleanFields": false,
          "allowedKeyChars": [],
          "allowedValueChars": [],
          "dstField": "parsed"
        },
        "id": "serde"
      },
      {
        "filter": "true",
        "conf": {
          "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "''" } ],
          "fields": [ "parsed" ],
          "depth": 5
        },
        "id": "mask"
      },
      {
        "filter": "true",
        "conf": {
          "type": "kvp",
          "fields": [ "*" ],
          "dstField": "_raw",
          "cleanFields": false,
          "srcField": "parsed"
        },
        "id": "serialize"
      },
      {
        "filter": "true",
        "conf": { "remove": [ "parsed" ] },
        "id": "eval"
      }
    ]
  },
  "id": "scottB"
}
```
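As a rough illustration of the same extract, mask, serialize sequence outside the pipeline, here is a small Python sketch. It is not how the Parser, Mask, or Serialize functions are implemented; `parse_kv`, `clean_value`, and `rewrite` are hypothetical names, and keeping the leading host IP in front of the pairs is my own assumption. The sketch replaces whitespace with underscores, as the question asked, whereas the Mask rule as posted replaces it with an empty string.

```python
import re

# Hypothetical pattern: key=value where the value is either double-quoted
# or a run of non-space characters.
KV_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_kv(event: str) -> dict[str, str]:
    """Extract step: collect key/value pairs, dropping surrounding quotes."""
    parsed = {}
    for key, quoted, bare in KV_RE.findall(event):
        parsed[key] = quoted or bare
    return parsed


def clean_value(value: str, replacement: str = "_") -> str:
    """Mask step: collapse each run of whitespace into the replacement."""
    return re.sub(r"\s+", replacement, value)


def rewrite(event: str) -> str:
    """Serialize step: emit key=value pairs with no quotes."""
    first = KV_RE.search(event)
    # Assumption: text before the first pair (the host IP) is kept as-is.
    prefix = event[: first.start()] if first else event
    cleaned = {k: clean_value(v) for k, v in parse_kv(event).items()}
    return prefix + " ".join(f"{k}={v}" for k, v in cleaned.items())


sample = (
    '1.2.3.4 devname="DEVNAME" srcip=4.5.6.7 srcport=50239 '
    'action="timeout" testfield="SOME TEXT"'
)
print(rewrite(sample))
```

Because values no longer contain spaces after the mask step, the output can be split on whitespace without needing quotes.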


awesome. I'll try this out. Thanks so much.


Jon Rust
  • Employee
  • March 11, 2025

result sample:


did the first part. Amazing. Love this thing!


Jon Rust
  • Employee
  • March 11, 2025

```
{
  "conf": {
    "output": "default",
    "streamtags": [],
    "groups": {},
    "asyncFuncTimeout": 1000,
    "functions": [
      {
        "filter": "true",
        "conf": {
          "mode": "extract",
          "type": "kvp",
          "srcField": "_raw",
          "cleanFields": false,
          "allowedKeyChars": [],
          "allowedValueChars": [],
          "fieldFilterExpr": "value != null && value != 'undefined'",
          "dstField": "parsed"
        },
        "id": "serde"
      },
      {
        "filter": "true",
        "conf": {
          "srcField": "parsed.eventtime",
          "dstField": "_time",
          "defaultTimezone": "local",
          "timeExpression": "time.getTime() / 1000",
          "offset": 0,
          "maxLen": 150,
          "defaultTime": "now",
          "latestDateAllowed": "+1week",
          "earliestDateAllowed": "-420weeks"
        },
        "id": "auto_timestamp"
      },
      {
        "filter": "true",
        "conf": {
          "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "''" } ],
          "fields": [ "parsed" ],
          "depth": 5
        },
        "id": "mask"
      },
      {
        "filter": "true",
        "conf": {
          "type": "kvp",
          "fields": [ "!eventtime", "!tz", "*" ],
          "dstField": "_raw",
          "cleanFields": false,
          "srcField": "parsed"
        },
        "id": "serialize"
      },
      {
        "filter": "true",
        "conf": { "remove": [ "parsed" ] },
        "id": "eval"
      }
    ]
  },
  "id": "scottB"
}
```


Jon Rust
  • Employee
  • March 11, 2025

(tz doesn't make sense in the context of epoch-style timestamps)


Jon Rust
  • Employee
  • March 11, 2025

^^^ my simple cleanup recs:
» Ditch fields that are empty or 'undefined'
» Use eventtime for _time, but then drop it and the pointless tz field
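To make those two recommendations concrete, here is a hedged Python sketch of the same cleanup applied to an already-parsed event. The `tidy` function is a hypothetical name, and it assumes `eventtime` is nanoseconds since the Unix epoch, which is a guess based on the 19-digit value in the sample rather than something stated in the thread.

```python
def tidy(parsed: dict[str, str]) -> dict[str, object]:
    """Drop empty/'undefined' fields and move eventtime into _time."""
    # Ditch fields that are missing, empty, or the literal string "undefined".
    event: dict[str, object] = {
        k: v for k, v in parsed.items() if v not in (None, "", "undefined")
    }
    # Assumption: eventtime is nanoseconds since the epoch (19 digits).
    nanos = event.pop("eventtime", None)
    # tz is redundant once the timestamp is an epoch value.
    event.pop("tz", None)
    if nanos is not None:
        event["_time"] = int(nanos) / 1_000_000_000  # epoch seconds
    return event


parsed = {
    "eventtime": "1681220885920164816",
    "tz": "-0400",
    "srcintfrole": "undefined",
    "srcip": "4.5.6.7",
    "action": "timeout",
}
result = tidy(parsed)
print(result)
```

The timestamp is still kept, just in `_time` instead of as a separate field in the serialized event.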


yeah, I'm debating dropping the time. Need to run it by the analysts. They'll probably want to keep it until they feel more comfortable.


Jon Rust
  • Employee
  • March 11, 2025

you're keeping it 🙂 in _time


Didn't realize just how many undefined values were in this feed. Dropped. Gotta love it.


Jon Rust
  • Employee
  • March 11, 2025

Between the quotes, the undefined fields, and the time field, I'd guess you whack 20%+ off the overall volume.


Hey <@ULBGHDPNY> - sorry to keep bugging you. I have one last one, if that's ok. I have a key<space>value feed and I want to turn it into a key=value format. I tried what you mentioned above and a few other things, but that didn't work. Any suggestions?


Jon Rust
  • Employee
  • March 11, 2025

sure, give me a few minutes to wrap up a call


Jon Rust
  • Employee
  • March 11, 2025

can you provide a sample event with this format?


sure


```
Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171 Band 82695 Latency 68 CacheHit 7 CliConn 19821 SrvConn 20379 MemBuf 98 SwpPgOut 54243 ProxLd 31 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 1.1 SophosLd 15.7 McafeeLd 0.0 WTTLd 0.0
Apr 11 17:29:03 host2 shd_logs_bdc3nx: Status: CPULd 4.9 DskUtil 8.0 RAMUtil 18.4 Reqs 184 Band 401701 Latency 147 CacheHit 4 CliConn 19516 SrvConn 20169 MemBuf 98 SwpPgOut 56092 ProxLd 45 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 19.2 McafeeLd 0.0 WTTLd 0.0
Apr 11 17:29:00 host3 shd_logs_ndc2nx: Status: CPULd 1.3 DskUtil 5.0 RAMUtil 14.6 Reqs 0 Band 0 Latency 4 CacheHit 0 CliConn 7 SrvConn 10 MemBuf 63 SwpPgOut 0 ProxLd 0 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 0.0 McafeeLd 0.0 WTTLd 0.0
```


Jon Rust
  • Employee
  • March 11, 2025

My take on it:
» Extract the part that has the KV pairs into `payload`
» Use Regex Extract on `payload` with the _KEY_0 and _VALUE_0 shenanigans
» Use Serialize to push those extracted fields back into _raw as K=V
» Use Eval to clean up the mess
```
{
  "conf": {
    "output": "default",
    "streamtags": [],
    "groups": {},
    "asyncFuncTimeout": 1000,
    "functions": [
      {
        "filter": "true",
        "conf": {
          "source": "_raw",
          "iterations": 100,
          "overwrite": false,
          "regex": "/Status: (?<payload>.*)/"
        },
        "id": "regex_extract"
      },
      {
        "filter": "true",
        "conf": {
          "source": "payload",
          "iterations": 100,
          "overwrite": false,
          "regex": "/(?<_KEY_0>\\S+)\\s+(?<_VALUE_0>\\S+)/"
        },
        "id": "regex_extract"
      },
      {
        "filter": "true",
        "conf": {
          "type": "kvp",
          "fields": [ "!_*", "!cribl_breaker", "!host", "!source", "!payload", "!index", "*" ],
          "dstField": "_raw",
          "cleanFields": false
        },
        "id": "serialize"
      },
      {
        "filter": "true",
        "conf": {
          "keep": [ "_raw", "_time", "source", "index" ],
          "remove": [ "*" ]
        },
        "id": "eval"
      }
    ]
  },
  "id": "scottb2"
}
```
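The same steps can be sketched in a few lines of Python, which may help when testing the regexes before putting them in a pipeline. This is only an approximation of the approach above: `to_kv` is a hypothetical name, and like the pipeline it keeps just the pairs after `Status:`, so it assumes the timestamp and host are carried separately.

```python
import re

# Step 1: isolate everything after "Status: " as the payload.
PAYLOAD_RE = re.compile(r"Status: (?P<payload>.*)")
# Step 2: repeatedly match "key value" pairs inside the payload.
PAIR_RE = re.compile(r"(\S+)\s+(\S+)")


def to_kv(event: str) -> str:
    """Turn 'Key value Key value ...' after 'Status:' into 'Key=value ...'."""
    match = PAYLOAD_RE.search(event)
    if match is None:
        return event  # leave events without a Status payload untouched
    pairs = PAIR_RE.findall(match.group("payload"))
    # Step 3: serialize the pairs back out as key=value.
    return " ".join(f"{key}={value}" for key, value in pairs)


sample = (
    "Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: "
    "CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171 Wbrs_WucLd 0.0"
)
print(to_kv(sample))
```

The pair regex relies on keys and values never containing spaces, which holds for the samples posted here but may not for every feed.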


thanks. Perfect. I can learn a lot from this.