Symptom
An S3 replay or collection job returns no results and logs no clear errors in the collect job logs, even though the archived files exist in storage and retrieval through a direct path works.
This behavior can occur when using a path filter with multiple time tokens such as 'palo_alto/${_time:%Y}-${_time:%m}-${_time:%d}-${_time:%H}:${_time:%M}/${*}' against a large set of partitioned directories.
Environment
- Cribl Stream
- S3 Collector / Amazon S3 source workflow
- Self-managed deployment
- Archive layout partitioned by time, for example a YYYY-MM-DD-HH:MM directory structure
Resolution
- Confirm the source type is an S3 Collector when ingesting data directly from an S3-compatible bucket, unless data is being delivered through SQS.
- Validate that the exact path expression works in a controlled test. A known working example is 'palo_alto/${_time:%Y}-${_time:%m}-${_time:%d}-${_time:%H}:${_time:%M}/${*}' when the backing storage and dataset are small enough.
- Reduce the scope of the search by creating a small test area with only a few time-partitioned subdirectories and test the same path pattern there first.
- If needed, simplify the path filter to use fewer time tokens, such as a yearly partition path, to reduce the number of directories Cribl must enumerate.
- If direct paths succeed but broad time-token searches do not, treat the issue as a scale or resource-limitation scenario rather than a path-syntax problem.
Cause
The path syntax itself can be valid, but replay may still fail in practice when the collector must enumerate too many partition directories. In the investigated case, the likely factor was the size of the dataset and the cost of traversing many time-based partitions. The user ultimately confirmed that the functionality worked in Cribl, but not for their specific use case because of the size of the data being processed.
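To get a rough sense of the scale involved, the following Python sketch counts how many per-minute partition directories fall inside a given time range. It is only an illustration: it assumes one directory exists for every minute in the range, and the function name count_minute_partitions is hypothetical. It does not describe how Cribl enumerates directories internally.

```python
from datetime import datetime, timedelta


def count_minute_partitions(earliest: datetime, latest: datetime) -> int:
    """Count whole per-minute partitions in the range [earliest, latest).

    Assumption: the archive holds one directory per minute, as in a
    YYYY-MM-DD-HH:MM layout with no gaps.
    """
    if latest <= earliest:
        return 0
    # Dividing two timedeltas with // yields an integer number of minutes.
    return (latest - earliest) // timedelta(minutes=1)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    print(count_minute_partitions(start, start + timedelta(hours=1)))
    print(count_minute_partitions(start, start + timedelta(days=1)))
    print(count_minute_partitions(start, start + timedelta(days=365)))
```

Under that assumption, a one-day range covers 1,440 directories and a 365-day range covers 525,600, which shows how quickly a minute-level layout grows compared with a coarser partition path.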
Last Validated
4.16.1
Additional Information
If you need to isolate whether the problem is path expansion versus storage listing behavior, test both:
- a direct, fully qualified path to a known-good directory
- a small test subtree with only two or three time-partitioned folders
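
When preparing these two tests, it can help to generate the expected directory names from known timestamps instead of typing them by hand. The sketch below is a minimal example that assumes the directory names are zero-padded in the same way as Python's strftime output. The timestamps and the helper names partition_prefix and sample_prefixes are hypothetical, and the format string is a Python translation of the path filter up to the wildcard, not a Cribl expression.

```python
from datetime import datetime, timedelta

# Python strftime equivalent of the path filter, up to the trailing wildcard.
# Assumption: directory names are zero-padded (for example 2024-01-15-09:05).
PARTITION_FORMAT = "palo_alto/%Y-%m-%d-%H:%M/"


def partition_prefix(ts: datetime) -> str:
    """Return the fully qualified directory prefix for one timestamp."""
    return ts.strftime(PARTITION_FORMAT)


def sample_prefixes(start: datetime, count: int = 3) -> list[str]:
    """Return prefixes for `count` consecutive minutes, for a small test subtree."""
    return [partition_prefix(start + timedelta(minutes=i)) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical timestamp of a directory known to contain data.
    known_good = datetime(2024, 1, 15, 9, 5)
    print(partition_prefix(known_good))
    for prefix in sample_prefixes(known_good):
        print(prefix)
```

The single prefix can be used for the direct-path test, and the short list can be used to populate or verify the small test subtree before running the time-token path filter against it.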
