Solved

Office 365 Activity Source Ingestion Lag Setting Confusion With Event Timestamp Filtering

  • February 25, 2026
  • 9 replies
  • 16 views

This message originated from Cribl Community Slack.

Does the Ingestion lag (minutes) setting in Advanced Settings of the Office 365 Activity source function the same as the standard collector event time filtering - filtering individual events based on the configured time range for the collection job? Meaning, does this setting target the actual event timestamps within each content blob, or does this setting apply to the actual content blob contentCreated returned from each subscription? Microsoft's documentation reads as if time filtering on the content blobs uses the contentCreated which should be the time the content became available for retrieval, but the events within each content blob may be delayed by up to a few days in some cases. https://learn.microsoft.com/en-us/office/office-365-management-api/office-365-management-activity-api-reference#list-available-content

Best answer by Stefan Laschitzki

No, "Ingestion lag" is a skew for your query window. It determines how far in the past the Collector searches for new data. The time filter is also applied, but to the generated earliest and latest values. E.g., if the lag is 20 minutes, the collector searches from -25m@m to -20m@m.
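To make the example concrete, here is a minimal sketch of how a lag-shifted window could be computed. The function name `laggedWindow` and its parameters are hypothetical, and the 5-minute interval is an assumption implied by the -25m to -20m example rather than something stated as a setting here.

```javascript
// Hypothetical helper: shift the query window into the past by the lag.
// lagMin and intervalMin are illustrative names, not Cribl setting keys.
function laggedWindow(nowMs, lagMin, intervalMin) {
  const minuteMs = 60 * 1000;
  // Snap "now" down to the minute boundary, mirroring the "@m" in -25m@m.
  const snapped = Math.floor(nowMs / minuteMs) * minuteMs;
  const latest = snapped - lagMin * minuteMs;
  const earliest = latest - intervalMin * minuteMs;
  return {
    earliest: new Date(earliest).toISOString(),
    latest: new Date(latest).toISOString(),
  };
}

const now = Date.parse("2026-02-25T12:00:30Z");
console.log(laggedWindow(now, 20, 5));
// earliest: 2026-02-25T11:35:00.000Z, latest: 2026-02-25T11:40:00.000Z
```

With a lag of 0 the same call would return the window ending at the current minute.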

9 replies

  • Author
  • Inspiring
  • February 25, 2026
Thanks, so it sounds like this skew is applied to the content blob discovery performed on the /subscriptions/content?contentType={ContentType}&startTime={0}&endTime={1} path. So this will ensure if a content blob for a subscription becomes available with a contentCreated timestamp in the past, it would still be collected, but the individual events within each content blob, which have expected delays, are not affected.

  • Author
  • Inspiring
  • February 25, 2026
If this is the case, why is the Ingestion lag (minutes) setting even required if contentCreated is the time the content blob became available for retrieval? The documentation defines the value as "The datetime when the content was made available", so it doesn't sound like there would be delays that need to be handled during the discovery phase, especially not delays of up to 24-72 hours, but maybe Cribl has observed Microsoft publishing content blobs with contentCreated timestamps in the past.

Cribl only asks for that time frame once. If the ingestion delay on the MS side is larger than your time skew (the ingestion lag configured in Cribl), you will just miss the event.

  • Author
  • Inspiring
  • February 25, 2026
I think my main point of confusion is whether the skew is applied during the discovery phase, when available blobs are discovered, or during the collection phase, when events are extracted from within each discovered blob. Based on what I have read from Microsoft's documentation, the only delay that should be expected is the event delivery delay within each blob. For example, a blob can contain events from 5 minutes ago and 5 days ago. But the actual contentCreated value, which marks when each blob is made available to retrievers and is used by collectors (I am assuming this is what Cribl uses) to find the blobs within the defined interval for each subscription, should not be delayed. If setting the ingestion lag to 0 causes you to miss events that are delayed on the Microsoft side, then it sounds like the ingestion lag targets the actual event timestamps within each content blob and behaves similarly to the standard collector event time filtering.

  • Author
  • Inspiring
  • February 25, 2026
Okay, I found Cribl's implementation of this REST collector and see that the ingestion lag is applied during the discovery phase. https://github.com/criblio/collector-templates/blob/main/collectors/rest/o365_activity/O365_Activity-SharePoint.json

```json
"discovery": {
  "discoverType": "http",
  "discoverMethod": "get",
  "pagination": {
    "type": "response_header",
    "maxPages": 0,
    "attribute": [
      "nextpageuri"
    ]
  },
  "enableStrictDiscoverParsing": false,
  "enableDiscoverCode": false,
  "discoverUrl": "https://manage.office.com/api/v1.0/${C.vars.o365_tenant_id}/activity/feed/subscriptions/content",
  "discoverRequestParams": [
    {
      "name": "PublisherIdentifier",
      "value": "${C.vars.o365_publisher_id}"
    },
    {
      "name": "contentType",
      "value": "\"Audit.SharePoint\""
    },
    {
      "name": "startTime",
      "value": "${state.latestTime != null ? new Date(state.latestTime * 1000).toISOString() : earliest != null ? new Date((earliest - (C.vars['o365_ingestion_lag_min'] || 0) * 60) * 1000).toISOString() : new Date(Date.now() - (C.vars['o365_ingestion_lag_min'] || 0) * 60 * 1000 - (C.vars['o365_polling_interval_min'] || 5) * 60 * 1000).toISOString()}"
    },
    {
      "name": "endTime",
      "value": "${latest != null ? new Date((latest - (C.vars['o365_ingestion_lag_min'] || 0) * 60) * 1000).toISOString() : new Date(Date.now() - (C.vars['o365_ingestion_lag_min'] || 0) * 60 * 1000).toISOString()}"
    }
  ]
}
```
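The startTime and endTime expressions are easier to follow when unrolled into a plain function. The following is only a restatement under assumptions: `discoveryWindow` is a hypothetical name, and `state`, `earliest`, `latest`, `vars`, and `nowMs` are passed in as arguments so the logic can run outside Cribl, where those values would be supplied by the collector runtime.

```javascript
// Restatement of the template's startTime/endTime expressions.
// earliest, latest and state.latestTime are epoch seconds, as in the template.
// The argument object is an assumption made for standalone testing.
function discoveryWindow({ state = {}, earliest, latest, vars = {}, nowMs = Date.now() }) {
  const lagMs = (vars.o365_ingestion_lag_min || 0) * 60 * 1000;
  const pollMs = (vars.o365_polling_interval_min || 5) * 60 * 1000;
  const iso = (ms) => new Date(ms).toISOString();

  let startTime;
  if (state.latestTime != null) {
    // Resume from saved state; the expression applies no lag in this branch.
    startTime = iso(state.latestTime * 1000);
  } else if (earliest != null) {
    // Scheduled window start, shifted into the past by the lag.
    startTime = iso(earliest * 1000 - lagMs);
  } else {
    // No window supplied: fall back to now minus lag minus one polling interval.
    startTime = iso(nowMs - lagMs - pollMs);
  }

  const endTime = latest != null ? iso(latest * 1000 - lagMs) : iso(nowMs - lagMs);
  return { startTime, endTime };
}

const earliest = Date.parse("2026-02-25T11:55:00Z") / 1000;
const latest = Date.parse("2026-02-25T12:00:00Z") / 1000;
console.log(discoveryWindow({ earliest, latest, vars: { o365_ingestion_lag_min: 20 } }));
// startTime: 2026-02-25T11:35:00.000Z, endTime: 2026-02-25T11:40:00.000Z
```

Both values end up as the startTime and endTime query parameters of the content listing request, so the lag only moves the window used to list blobs.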

  • Author
  • Inspiring
  • February 25, 2026
Knowing that the ingestion lag is used during the discovery phase, I still am not sure why this is even required. It seems as if this implementation is designed to address delays in individual event delivery, but discovery filters based on the time the actual blob became available - independent of the timestamps of the events within the blob. From the Microsoft docs:
  • The contentCreated property is not the date that the event being notified was created. This is the date the notification was created. The events detailed in that blob may have been created well before the content blob was created. Therefore, you can never query the API directly for events that occurred within any given period.
The only way I could miss events with an ingestion delay set to 0 is if Microsoft published a blob with a contentCreated time in the past, but if this value is the exact time it becomes available for retrieval and thus would be discoverable by the Cribl collector job, that does not seem plausible.
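A small local simulation of the distinction I am drawing, with made-up data: the blob list, the `eventTimes` field, and the `discover` function are all hypothetical, and the start-inclusive, end-exclusive boundaries reflect my reading of the Microsoft docs rather than anything verified against the API.

```javascript
// Hypothetical blob listing. contentId and contentCreated follow the
// Microsoft response shape; eventTimes is invented for illustration only.
const blobs = [
  { contentId: "blob-a", contentCreated: "2026-02-25T11:36:00.000Z",
    eventTimes: ["2026-02-25T11:30:00Z", "2026-02-20T09:00:00Z"] },
  { contentId: "blob-b", contentCreated: "2026-02-25T11:50:00.000Z",
    eventTimes: ["2026-02-25T11:37:00Z"] },
];

// Simulates listing by contentCreated only; event timestamps are never read.
function discover(allBlobs, startTime, endTime) {
  const start = Date.parse(startTime);
  const end = Date.parse(endTime);
  return allBlobs
    .filter((b) => {
      const created = Date.parse(b.contentCreated);
      return created >= start && created < end;
    })
    .map((b) => b.contentId);
}

console.log(discover(blobs, "2026-02-25T11:35:00Z", "2026-02-25T11:40:00Z"));
// [ 'blob-a' ]
```

In this sketch blob-a is returned even though it carries a five-day-old event, and blob-b is skipped even though its only event falls inside the window, which is why the lag would only matter if a blob appeared with a contentCreated value already in the past.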

If the blob is really available at "contentCreated" time, you are right. However, I assume that the ingestion lag has been added for a reason.

  • Author
  • Inspiring
  • February 25, 2026
Yeah, I agree. Thanks for looking into this with me!