Question

Having hard luck with parsing logs - tips?

  • March 11, 2025
  • 4 replies
  • 42 views

Hi, I have gone through the Cribl University user and admin courses, and setting up the product is quite easy. However, I am constantly stuck trying to parse or extract strings of data into fields so that we can search or use the information.

For systems that send JSON or CSV it is very simple; however, most of ours are embedded Linux or network devices that produce something like the following:

<183> 02/06/2025:00:24:20 GMT NYDC1VPX01-DMZ 0-PPE-0 : default SSLLOG SSL_HANDSHAKE_SUCCESS 9838002 0 : SPCBId 33168208 - ClientIP 10.1.1.130 - ClientPort 57486 - VserverServiceIP 10.200.80.80 - VserverServicePort 443 - ClientVersion TLSv1.2 - CipherSuite "TLS1-AES-256-CBC-SHA" - Session New - HandshakeTime 47 ms

What combination of functions can split this out into fields such as hostname, clientip, clientport, etc.?

I have tried using ChatGPT to help and can get regex and grok expressions from the AI; however, these don't work when applied in Cribl. I don't see many questions on this topic, which makes me think it is very easy for everyone else and I am doing something wrong!

4 replies

Jon Rust
  • Employee
  • March 11, 2025

I'd use Regex Extract maybe for the first bit, up to SSL_HANDSHAKE_SUCCESS (as eventtype):

^\<(?<pri>\d+)\>\s*(?<time>\S+\s[A-Z]+)\s(?<host>\S+)\s\S+[\s:]+default\s(?<logtype>\S+)\s(?<eventtype>\S+).*?-\s(?<therest>ClientIP.*)

So you'll get pri, time, host, logtype and eventtype, plus the rest of the log (from ClientIP onward) in therest.

Now add a second Regex Extract with the source field set to therest:
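To check a pattern like this outside of Stream, it can be run against the sample event in plain JavaScript. This is only a sketch: it assumes Cribl's regex flavor behaves like JavaScript's (the `(?<name>...)` group syntax suggests so), and `extractHeader` is a hypothetical helper name, not a Cribl function.

```javascript
// Sample event from the question.
const raw =
  '<183> 02/06/2025:00:24:20 GMT NYDC1VPX01-DMZ 0-PPE-0 : default SSLLOG ' +
  'SSL_HANDSHAKE_SUCCESS 9838002 0 : SPCBId 33168208 - ClientIP 10.1.1.130 - ' +
  'ClientPort 57486 - VserverServiceIP 10.200.80.80 - VserverServicePort 443 - ' +
  'ClientVersion TLSv1.2 - CipherSuite "TLS1-AES-256-CBC-SHA" - Session New - ' +
  'HandshakeTime 47 ms';

// Header regex from the reply, unchanged.
const headerRe =
  /^\<(?<pri>\d+)\>\s*(?<time>\S+\s[A-Z]+)\s(?<host>\S+)\s\S+[\s:]+default\s(?<logtype>\S+)\s(?<eventtype>\S+).*?-\s(?<therest>ClientIP.*)/;

// Hypothetical helper: returns the named groups as a plain object,
// or null when the event does not match the pattern.
function extractHeader(text) {
  const m = headerRe.exec(text);
  return m === null ? null : { ...m.groups };
}

console.log(extractHeader(raw));
```

Because therest is anchored on the literal ClientIP, everything between the event type and ClientIP (including SPCBId) is skipped rather than extracted.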

(?<_KEY_0>\w+)\s(?<_VALUE_0>[^\-]+)

Set the g flag on this regex. You should get all the fields extracted from the remainder of the log.
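The effect of the g flag can be previewed in plain JavaScript by looping over every match. This is a sketch under assumptions, not Cribl's implementation: `extractPairs` is a hypothetical helper, the `trim()` call is added here for readability, and the second pattern is a variation that is not from this thread.

```javascript
// The "therest" value captured from the sample event by the header regex.
const therest =
  'ClientIP 10.1.1.130 - ClientPort 57486 - VserverServiceIP 10.200.80.80 - ' +
  'VserverServicePort 443 - ClientVersion TLSv1.2 - ' +
  'CipherSuite "TLS1-AES-256-CBC-SHA" - Session New - HandshakeTime 47 ms';

// Hypothetical helper: collect one key/value pair per match of a global regex.
// Values are trimmed because [^\-]+ also captures the space before each " - ".
function extractPairs(re, text) {
  const fields = {};
  for (const m of text.matchAll(re)) {
    fields[m.groups._KEY_0] = m.groups._VALUE_0.trim();
  }
  return fields;
}

// Key/value regex from the reply, with the g flag set.
const original = /(?<_KEY_0>\w+)\s(?<_VALUE_0>[^\-]+)/g;

// Assumed variation (not from the thread): end each value at " - " or at the
// end of the string, so hyphens inside a value are kept.
const variant = /(?<_KEY_0>\w+)\s(?<_VALUE_0>.+?)(?=\s-\s|$)/g;

console.log(extractPairs(original, therest));
console.log(extractPairs(variant, therest));
```

In this sketch both patterns return the same eight keys. The difference is CipherSuite: `[^\-]+` stops at the first hyphen and yields `"TLS1`, while the variation keeps the whole quoted value. If hyphenated values matter in your data, that is worth checking in Stream's preview as well.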


Jon Rust
  • Employee
  • March 11, 2025

Try this pipeline:

{
  "id": "curious-syslog",
  "conf": {
    "output": "default",
    "streamtags": [],
    "groups": {},
    "asyncFuncTimeout": 1000,
    "functions": [
      {
        "filter": "true",
        "conf": {
          "comment": "Extract the \"header\" of the log first"
        },
        "id": "comment"
      },
      {
        "filter": "true",
        "conf": {
          "source": "_raw",
          "iterations": 100,
          "overwrite": false,
          "regex": "/^\\<(?<pri>\\d+)\\>\\s*(?<time>\\S+\\s[A-Z]+)\\s(?<host>\\S+)\\s\\S+[\\s:]+default\\s(?<logtype>\\S+)\\s(?<eventtype>\\S+).*?-\\s(?<therest>ClientIP.*)/"
        },
        "id": "regex_extract"
      },
      {
        "filter": "true",
        "conf": {
          "comment": "Now extract the key value pairs from the rest"
        },
        "id": "comment"
      },
      {
        "filter": "true",
        "conf": {
          "source": "therest",
          "iterations": 100,
          "overwrite": false,
          "regex": "/(?<_KEY_0>\\w+)\\s(?<_VALUE_0>[^\\-]+)/"
        },
        "id": "regex_extract"
      },
      {
        "filter": "true",
        "conf": {
          "comment": "Rebuild into raw json payload (optional)"
        },
        "id": "comment"
      },
      {
        "filter": "true",
        "conf": {
          "type": "json",
          "dstField": "_raw",
          "fields": [
            "!cribl",
            "!_*",
            "!source",
            "!therest",
            "*"
          ]
        },
        "id": "serialize"
      },
      {
        "filter": "true",
        "conf": {
          "keep": [
            "_*",
            "cribl*"
          ],
          "remove": [
            "*"
          ]
        },
        "id": "eval"
      }
    ]
  }
}

  • Author
  • New Participant
  • March 11, 2025

Wow, did you do that off the top of your head, or are there some tools which can help create this?


Jon Rust
  • Employee
  • March 11, 2025

The pipeline was created in Cribl Stream, and then just exported.

The regex… I've had a 30+ year relationship with regex. :-) I dream in regex.