
I have created an event breaker rule that works in the Knowledge area of Cribl Stream. But when I run a filesystem collection job to pick up the same file I used to create the event breaker, it does not work.

[Screenshot: Event breaker rules in]

Andrew,
I have a feeling your header line regex is matching the first #separator line. I can test it out, but I think you'd want to change your header line to ^#[Ff] so it ignores the lines before the #fields line.
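
For example, this is nothing Cribl-specific, just plain regex behavior against the first few header lines (a quick Python check; the sample lines are taken from the file excerpt later in this thread):

import re

# First few Zeek/Bro header lines from the sample file
header_lines = ["#separator \\x09", "#set_separator\t,", "#fields\tts\tuid"]

print([bool(re.match(r"^#", l)) for l in header_lines])
# [True, True, True]   -> ^# matches every header line, including #separator
print([bool(re.match(r"^#[Ff]", l)) for l in header_lines])
# [False, False, True] -> ^#[Ff] matches only the #fields line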


Can you validate you've committed and deployed?


Yeah, sorry, I think the header line is actually excluding everything that starts with #. What do your first few events look like on the file import?


Sorry, I mean: what do the first few events look like in Stream when you run the job? You had a screenshot above, but it started at event 7; just wondering how the first few events look.


The breaker works for me in the preview, but not when I run an actual collection.

sample.tsv: UTF-8 Unicode (with BOM) text, with CRLF line terminators


It is being pulled from an NFS mount (Synology NAS) on a Linux (Ubuntu 20.04) VM.


Which is strange, since it shows you are hitting the tsv-bro breaker.


How big is the file you are collecting?


The filetype is just a variable on the path for the filesystem collector. I have about 15 different "filetypes"… I'm hitting the right breaker, as seen in my collection. I'm on Cribl 3.4.1 for both the leader and the worker.


It's a 10,000-line sample, but when the file actually comes in it's about 10-50 GB.


Can you try bumping your max event size to the maximum, 134217728 (128 MB)? How big are your working IIS logs?


You could also try bumping up the event breaker buffer timeout on the filesystem collector.


The IIS logs would have been about the same size: a 10,000-line sample, each event about the same size. The whole files are anywhere from 100 KB to 2-3 GB. Any recommendation on the buffer timeout?

I will recreate the whole event breaker tomorrow using the same config and max size you recommend, and test again in case I flubbed something else up somewhere. I'll let you know!


I'm still having the issue. This was my order of operations to fix/recreate the issue:

1. Increased the event breaker buffer timeout on the collector source to 600000, commit, deploy. Unsuccessful.
2. Deleted the event breaker, commit, deploy.
3. Recreated the event breaker with the settings you recommended, commit, deploy.
4. Ran the filesystem collection: same results/symptoms.

I connected to the worker UI from the leader and verified the breaker exists on the worker. The file preview with the breaker on the worker works.

I checked the logs for the specific ad hoc run and I only see one error:

{
  "time": "2022-05-04T13:29:35.668Z",
  "cid": "api",
  "channel": "Job",
  "level": "error",
  "message": "failed to cancel task",
  "jobId": "1651670972.72.adhoc.NDCA-collector",
  "taskId": "collect.0",
  "reason": {
    "message": "Instance 1651670972.72.adhoc.NDCA-collector|collect.0 not registered",
    "stack": "RpcInstanceNotFoundError: Instance 1651670972.72.adhoc.NDCA-collector|collect.0 not registered\n    at /opt/cribl/bin/cribl.js:14:13169102\n    at /opt/cribl/bin/cribl.js:14:11427356\n    at runMicrotasks (<anonymous>)\n    at processTicksAndRejections (internal/process/task_queues.js:95:5)\n    at async k.handleRequest (/opt/cribl/bin/cribl.js:14:13168338)",
    "name": "RpcInstanceNotFoundError",
    "req": {
      "instanceId": "1651670972.72.adhoc.NDCA-collector|collect.0",
      "method": "cancel",
      "args": []
    }
  },
  "source": "/opt/cribl/state/jobs/default/1651670972.72.adhoc.NDCA-collector/logs/job/job.log"
}


That just looks like the worker was restarting. Have you tried collecting a small file, ~20 events?


Yeah, the error looked benign, but just providing info.

I edited the file down to about 20 lines and changed the EOL from CRLF to LF. Same symptoms.


I "solved" the issue. This may need to become an engineering ticket.

I looked back at the file encoding…
sample.tsv: UTF-8 Unicode (with BOM) text, with CRLF line terminators

There may be an issue with the filesystem collector interpreting the byte order mark?

I used notepad++ to remove the BOM and encoded it as just UTF8 and it worked.
This is a band-aid fix for me, as converting the encoding of 50-100GB of files each day prior to ingest is not particularly scalable or effective.
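
In the meantime, a streaming pre-pass can drop the BOM without loading the whole file (a rough Python sketch; the script name, paths, and chunk size are just placeholders):

import sys

BOM = b"\xef\xbb\xbf"  # the 3-byte UTF-8 byte order mark

def strip_bom(src_path, dst_path, chunk_size=1024 * 1024):
    # Stream in chunks so a 10-50 GB log never has to fit in memory.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        head = src.read(len(BOM))
        if head != BOM:  # no BOM at the front: keep those first bytes
            dst.write(head)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)

if __name__ == "__main__":
    strip_bom(sys.argv[1], sys.argv[2])  # e.g. strip_bom.py in.tsv out.tsv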

Thanks, Dan, for the assist!


Yes, encoding has hit me a few times. My next suggestion was to run head on your file and see if you had any extra stuff at the beginning. I will add this to a feature request I already had in for supporting additional encodings on the filesystem collector.
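
For reference, BOM-aware decoding is what, e.g., Python's utf-8-sig codec does: it silently drops a leading BOM while decoding, so the first line still starts with #:

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present
with open("sample.tsv", encoding="utf-8-sig", newline="") as f:
    first_line = f.readline()
print(first_line.startswith("#"))  # True even when the raw bytes start with a BOM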


[Screenshot: Filesystem collection results (events do not have correct fields/values)]

[Screenshot: Event breaker rules out]

I want it to exclude the # lines, since those are not events. The first real events are tab-separated, and the field names are in that field list. I tried changing the header line to "^#[Ff]", but then the event breaker preview completely fails.

[Screenshot]

Here's the first few lines of the file. With my original settings, the import looks fine and the field/value pairs look good. But when run with a filesystem collector, it fails. I'm going to try the header line changes that you recommended.

#separator \x09
#set_separator  ,
#empty_field  (empty)
#unset_field  -
#path  conn
#open  2021-11-12-12-45-00
#fields ts  uid id.orig_h  id.orig_p  id.resp_h  id.resp_p  id.vlan id.vlan_inner  proto  service duration  orig_bytes  resp_bytes  conn_state  local_orig  local_resp  missed_bytes  history orig_pkts  orig_ip_bytes  resp_pkts  resp_ip_bytes  tunnel_parents  orig_cc resp_cc suri_ids  community_id
#types  time  string  addr  port  addr  port  int int enum  string  interval  count  count  string  bool  bool  count  string  count  count  count  count  set[string] string  string  set[string] string
2022-05-01 00:00:00.000012  CUaMDI3N3CtEwGXbX9  128.83.27.4 46210  170.114.10.87  443 4020  \N  tcp \N  65.207075  0  6218  SHR 1  0  0  ^hdf  0  0  9  6590  \N  US  US  \N  1:1nbEONdQpmuQtjlL3SSQbc28Wyo=
2022-05-01 00:00:00.000320  CAZzJv4QRVv5Yek7Oh  128.83.130.204  54935  58.247.212.36  53  4020  \N  udp dns \N  \N  \N  SHR 1  0  0  ^d  0  0  1  156 \N  US  CN  \N  1:KJjQRZuB5bkT7+ebSf4FW7RJiL8=
2022-05-01 00:00:00.000432  CdRza81SzhESDDyhI9  128.83.72.175  58632  192.111.4.106  443 4020  \N  tcp ssl 376.280685  1458  6534  S1  1  0  0  ShDd  3  1590  7  6826  \N  US  US  \N  1:ZqDFOlfGk/8wlEO1gmawxhE6YBg=
2022-05-01 00:00:00.001140  CAcMyE40njQ2DatMNc  128.83.28.30  59755  205.251.197.3  53  4020  \N  udp dns \N  \N  \N  S0  1  0  0  D  1  140 0  0  \N  US  US  \N  1:SeSWa3fEVB/I60glsRug0PmDPys=
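
For reference, the breaking I'm after amounts to something like this (a rough Python sketch of the intent, not how Cribl implements breakers):

def parse_bro_tsv(lines):
    # Field names come from the "#fields" header; every other "#" line is
    # metadata to skip; real events are tab-separated rows keyed by those names.
    fields = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#"):
            if line.startswith(("#fields", "#Fields")):  # the ^#[Ff] header line
                fields = line.split("\t")[1:]  # drop the "#fields" token itself
            continue
        yield dict(zip(fields, line.split("\t")))

# e.g. events = list(parse_bro_tsv(open("conn.log", encoding="utf-8-sig")))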

Sorry, the first few events are the commented rows from the log, exactly as they appear in the log.

[Screenshot]

I ran what you sent above through a collector with what I think is the same breaker as you and it worked.

[Screenshot]

The next thing I would check is what type of encoding you have on your file. Is this pulling from a Linux machine? If so, run file testfile.tsv on your test file.


UTF-8 should be fine.

[Screenshot]

This is using the breaker I posted above, pulling with a filesystem collector.

The only difference is my filter. When I add a filetype field on the collector and use bro with your filter, it breaks it. Where are you adding filetype?


Dan, that does make sense. I'll reconfigure that one and reply back. I have an example where that would be inconsistent if that's the case.

This event breaker works, collects, and extracts correctly, even though the first few lines match ^# as well.

[Screenshot]

Jon, yes I have, on multiple occasions, for each attempt. I had restarted the worker too, just in case.

