Guidance on when to add additional processors due to a large number of concurrent tasks?

The sizing pages talk about scaling with regard to data volumes in and out. However, is there any guidance on when to add additional processors due to a large number of concurrent tasks (i.e., Collection Jobs)? Does each Collection Job run in its own worker process? On a 2-node worker group with 8 vCPUs each, could we run into queuing issues if we had 50 or 100 collectors attempting to run at the same time?

Jobs are broken into tasks, which are placed in a job queue on the leader node. Worker processes in the group request tasks from that queue as they become free to complete them. In this manner, all the data that was discovered is distributed as evenly as possible across the worker group.
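
To make the pull model above concrete, here is a minimal sketch of a shared queue that worker processes drain as they become free. It is a toy simulation, not Cribl's implementation: the function name, the 30-second task duration, and the count of 12 worker processes for the group are all hypothetical numbers chosen for illustration.

```python
import heapq
from collections import Counter


def simulate_pull_queue(task_durations, num_processes):
    """Simulate processes pulling tasks from one shared FIFO queue.

    Returns (makespan, tasks_per_process). Each process takes the next
    queued task as soon as it becomes free.
    """
    # Min-heap of (time_the_process_becomes_free, process_id).
    free_at = [(0.0, pid) for pid in range(num_processes)]
    heapq.heapify(free_at)
    tasks_per_process = Counter()
    makespan = 0.0
    for duration in task_durations:
        # The process that frees up earliest requests the next task.
        now, pid = heapq.heappop(free_at)
        finish = now + duration
        tasks_per_process[pid] += 1
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, pid))
    return makespan, tasks_per_process


# Hypothetical workload: 100 tasks of 30 s each, 12 processes in the group.
makespan, counts = simulate_pull_queue([30.0] * 100, num_processes=12)
print(makespan)                  # 270.0
print(sorted(counts.values()))   # eight processes run 8 tasks, four run 9
```

In this toy model nothing is lost when there are more tasks than processes; the extra tasks simply wait in the queue, and the work ends up spread almost evenly.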

Also consider the limits page, which covers the number of jobs/tasks that can be run concurrently: https://docs.cribl.io/stream/collectors-job-limits/

For larger collection use cases, I'd encourage a separate worker group dedicated to collection.

Thank you all

It's best to avoid scheduling jobs in such a way that they run simultaneously. Some overlap may be unavoidable, but the more worker processes that are available, the more tasks can be executed in parallel to finish a job.
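
As a rough back-of-envelope for that last point, the sketch below estimates how long a job takes when its tasks run in "waves" across the available processes. It assumes every task takes the same time and ignores configured job/task limits and scheduling overhead; the function name and all numbers are hypothetical.

```python
import math


def estimated_job_seconds(num_tasks, task_seconds, num_processes):
    """Estimate job duration when equal-length tasks run in waves."""
    # Each wave runs up to num_processes tasks in parallel.
    waves = math.ceil(num_tasks / num_processes)
    return waves * task_seconds


# Hypothetical job: 100 tasks of 30 s each, with varying process counts.
for procs in (6, 12, 24):
    print(procs, estimated_job_seconds(100, 30, procs))
# 6 510
# 12 270
# 24 150
```

When jobs overlap, their tasks compete for the same processes, so each job sees fewer processes and more waves than it would running alone.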
