Is it possible to use the REST API to run a collector job?
It is possible! You can find more details in the Cribl API documentation, but I've listed the general steps below.
You will need to obtain the JSON configuration of the pre-configured collector. NOTE: If the collector doesn't already exist, you will need to create it before running these steps!
GET /api/v1/m/<worker-group-name>/lib/jobs
returns the JSON of the collector configurations. Find the one with the corresponding id field.
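For example, with curl. This is just a sketch: it assumes the Leader API is at https://leader:9000, the Worker Group is named default, and a bearer token is in $CRIBL_TOKEN, all of which will differ in your environment.

# Fetch all saved collector configurations for the group
curl -s \
  -H "Authorization: Bearer $CRIBL_TOKEN" \
  "https://leader:9000/api/v1/m/default/lib/jobs"

The response looks like this: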
{ "items": [ { "type": "collection", "ttl": "4h", "removeFields": [], "resumeOnBoot": false, "schedule": {}, "collector": { "conf": { "discovery": { "discoverType": "http", "discoverMethod": "get", "itemList": [], "discoverDataField": "entry", "discoverUrl": "`https://1.2.3.4:8089/services/search/jobs`", "discoverRequestParams": [ { "name": "output_mode", "value": "`json`" }, { "name": "search", "value": "`\"search index=_internal\"`" } ] }, "collectMethod": "get", "pagination": { "type": "none" }, "authentication": "login", "loginUrl": "`https://1.2.3.4:8089/services/auth/login?output_mode=json`", "loginBody": "`username=${username}&password=${password}`", "tokenRespAttribute": "sessionKey", "authHeaderExpr": "`Splunk ${token}`", "username": "admin", "password": "redacted", "collectUrl": "`${id}/results`", "collectRequestHeaders": [], "collectRequestParams": [ { "name": "output_mode", "value": "`json`" } ] }, "destructive": false, "type": "rest" }, "input": { "type": "collection", "staleChannelFlushMs": 10000, "sendToRoutes": true, "preprocess": { "disabled": true }, "throttleRatePerSec": "0", "breakerRulesets": [ "splunk_test" ] }, "id": "splunk", "history": [] }, ... ]}
Use this data to POST back to /api/v1/m/<worker-group-name>/jobs with an added run field containing the run configuration.
For example, I want to run this collector in preview mode:
{ "type": "collection", "ttl": "4h", "removeFields": [], "resumeOnBoot": false, "schedule": {}, "collector": { "conf": { "discovery": { "discoverType": "http", "discoverMethod": "get", "itemList": [], "discoverDataField": "entry", "discoverUrl": "`https://1.2.3.4:8089/services/search/jobs`", "discoverRequestParams": [ { "name": "output_mode", "value": "`json`" }, { "name": "search", "value": "`\"search index=_internal\"`" } ] }, "collectMethod": "get", "pagination": { "type": "none" }, "authentication": "login", "loginUrl": "`https://1.2.3.4:8089/services/auth/login?output_mode=json`", "loginBody": "`username=${username}&password=${password}`", "tokenRespAttribute": "sessionKey", "authHeaderExpr": "`Splunk ${token}`", "username": "admin", "password": "redacted", "collectUrl": "`${id}/results`", "collectRequestHeaders": [], "collectRequestParams": [ { "name": "output_mode", "value": "`json`" } ] }, "destructive": false, "type": "rest" }, "input": { "type": "collection", "staleChannelFlushMs": 10000, "sendToRoutes": true, "preprocess": { "disabled": true }, "throttleRatePerSec": "0", "breakerRulesets": [ "splunk_test" ] }, "id": "splunk", "history": [], "run": { "rescheduleDroppedTasks": true, "maxTaskReschedule": 1, "logLevel": "info", "jobTimeout": "0", "mode": "preview", "timeRangeType": "relative", "expression": "true", "minTaskSize": "1MB", "maxTaskSize": "10MB", "capture": { "duration": 60, "maxEvents": 100, "level": "0" } }}
This returns a JSON response with a Job id:
{"items":["1621367040.54.adhoc.splunk"],"count":1}
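As a sketch of the full POST, assuming the body above is saved in a file named run-splunk.json (a placeholder name) and the same base URL and token as before; jq -r '.items[0]' pulls the Job id out of the response:

# Submit the collection job and capture the returned Job id
JOB_ID=$(curl -s -X POST \
  -H "Authorization: Bearer $CRIBL_TOKEN" \
  -H "Content-Type: application/json" \
  -d @run-splunk.json \
  "https://leader:9000/api/v1/m/default/jobs" \
  | jq -r '.items[0]')
echo "$JOB_ID"   # e.g. 1621367040.54.adhoc.splunk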
You can then query the jobs endpoint to get the status of the job.
GET /api/v1/m/<worker-group-name>/jobs/1621367040.54.adhoc.splunk
This provides a JSON response (check the status.state field for more information):
{ "items": [ { "id": "1621367040.54.adhoc.splunk", "args": { "type": "collection", "ttl": "60s", "removeFields": [], "resumeOnBoot": false, "schedule": {}, "collector": { "conf": { "discovery": { "discoverType": "http", "discoverMethod": "get", "itemList": [], "discoverDataField": "entry", "discoverUrl": "`https://1.2.3.4:8089/services/search/jobs`", "discoverRequestParams": [ { "name": "output_mode", "value": "`json`" }, { "name": "search", "value": "`\"search index=_internal\"`" } ] }, "collectMethod": "get", "pagination": { "type": "none" }, "authentication": "login", "loginUrl": "`https://1.2.3.4:8089/services/auth/login?output_mode=json`", "loginBody": "`username=${username}&password=${password}`", "tokenRespAttribute": "sessionKey", "authHeaderExpr": "`Splunk ${token}`", "username": "admin", "password": "redacted", "collectUrl": "`${id}/results`", "collectRequestHeaders": [], "collectRequestParams": [ { "name": "output_mode", "value": "`json`" } ], "filter": "(true)", "discoverToRoutes": false, "collectorId": "splunk", "removeFields": [] }, "destructive": false, "type": "rest" }, "input": { "type": "collection", "staleChannelFlushMs": 10000, "sendToRoutes": false, "preprocess": { "disabled": true }, "throttleRatePerSec": "0", "breakerRulesets": [ "splunk_test" ], "output": "devnull", "pipeline": "passthru", "filter": "(true)" }, "id": "splunk", "history": [], "run": { "rescheduleDroppedTasks": true, "maxTaskReschedule": 1, "logLevel": "info", "jobTimeout": "0", "mode": "preview", "timeRangeType": "relative", "expression": "true", "minTaskSize": "1MB", "maxTaskSize": "10MB", "capture": { "duration": 60, "maxEvents": 100, "level": "0" }, "type": "adhoc", "taskHeartbeatPeriod": 60 }, "initialState": 3, "groupId": "default" }, "status": { "state": "finished" }, "stats": { "tasks": { "finished": 1, "failed": 0, "cancelled": 0, "orphaned": 0, "inFlight": 0, "count": 1, "totalExecutionTime": 80, "minExecutionTime": 80, "maxExecutionTime": 80 }, "discoveryComplete": 1, "state": { "initializing": 1621367040902, "paused": 1621367040903, "pending": 1621367041736, "running": 1621367041738, "finished": 1621367041852 } }, "keep": false } ], "count": 1}
Isn't there a better way of doing this? Sending a huge body in an API request is just asking for errors. I've been trying to use this setup to run some script collectors via the API and having no luck. I also have around a dozen script collectors I want to kick off programmatically via the API rather than put on a schedule, and sending a huge, ungainly API command for each one is a bit untenable. If there isn't a better way of doing this, I suggest adding a RunJob API command with a minimal set of inputs; it's generally better to have those inputs be parameters rather than error-prone JSON bodies.
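In the meantime, the workaround I'm trying is to never hand-build the body at all: pull each saved config back from /lib/jobs, splice a run block into it with jq, and POST the result straight back. A sketch of the idea; the collector ids, base URL, token, and run fields below are placeholders from my setup, and I'm not sure yet what the minimal run block the schema will accept looks like:

BASE="https://leader:9000/api/v1/m/default"
for ID in script1 script2 script3; do   # placeholder collector ids
  curl -s -H "Authorization: Bearer $CRIBL_TOKEN" "$BASE/lib/jobs" \
    | jq --arg id "$ID" '
        .items[]
        | select(.id == $id)
        # splice in a run block; fields copied from the preview
        # example above, with mode switched to "run"
        | . + {run: {mode: "run", logLevel: "info", jobTimeout: "0",
                     timeRangeType: "relative", expression: "true"}}' \
    | curl -s -X POST \
        -H "Authorization: Bearer $CRIBL_TOKEN" \
        -H "Content-Type: application/json" \
        -d @- "$BASE/jobs"
done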
Here's the current error I'm working through, though I might drop this approach and find a better way:
{ "status": "error", "message": "invalid config jobs: n{\"keyword\":\"required\",\"dataPath\":\"/ohkgZv\",\"schemaPath\":\"#/definitions/collection/required\",\"params\":{\"missingProperty\":\"collector\"},\"message\":\"should have required property 'collector'\"},{\"keyword\":\"if\",\"dataPath\":\"/ohkgZv\",\"schemaPath\":\"#/patternProperties/.*/if\",\"params\":{\"failingKeyword\":\"then\"},\"message\":\"should match \\\"then\\\" schema\"}]"}