The overarching idea of this approach is to develop a Cribl Pack in a development environment, leverage Git for version control, and then deploy the pack in the production environment once it is properly evaluated and tested. This lets Cribl administrators review changes made by the data owners as part of the development cycle and maintain the quality of production data without having to navigate every single data source in the organization.
Keep Control
Cribl administrators get final say on what gets put into production. Leverage change control systems (JIRA, ServiceNow, etc.) to track what onboarding is underway, what items within the pack must be updated for production readiness, who is responsible for the pack, and why changes are occurring in the environment. You can automate a lot of these tasks by leveraging forms in ServiceNow, JIRA, or Azure; a templated change control approach will cut down on mistakes and missed information during the onboarding process.
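To make the templated approach concrete, here is a minimal sketch of what such a template might capture, written in Python. The class name, field names, and the choice of which fields are required are all assumptions for illustration; they are not tied to any ServiceNow or JIRA schema.

```python
from dataclasses import dataclass, field


@dataclass
class PackChangeTicket:
    """Hypothetical change-ticket template for onboarding one pack."""
    pack_id: str   # what onboarding is underway
    owner: str     # who is responsible for the pack
    reason: str    # why the change is occurring
    # Items that must be updated before the pack is production ready.
    prod_readiness_items: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields that were left blank."""
        required = ("pack_id", "owner", "reason")
        return [name for name in required if not getattr(self, name).strip()]


ticket = PackChangeTicket(
    pack_id="shine_stream_ingest",
    owner="",
    reason="Onboard Shine logs",
)
print(ticket.missing_fields())  # ['owner']
```

A form that refuses to submit while `missing_fields()` is non-empty is one simple way to catch missed information early.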
Make Use of Git
The best way to operationalize this effort is to leverage Git to move packs throughout your environment while easily tracking the changes that occur within each pack. For simplicity’s sake, have a single Git repository for each pack that is created, and have teams write to Git as packs are created and managed. So, first thing to do: get yourself a Git repository for your pack. I would recommend both a dev and a prod (or main) branch here so that any changes made during production readiness are recorded appropriately.
Take Advantage of Workspaces
There are a couple of different approaches you can take to set your development teams up for success. These teams need access to Cribl to onboard their data, build routes and pipelines, and possibly even connect it to a destination. They then roll all of that configuration into a pack.
The best way to get your Dev team into Cribl is to make a Development workspace. Workspaces allow for an isolated Cribl environment within your already provisioned Cribl Cloud tenant. Simply spin up a worker group for the dev team to work within and have them build their packs there. The main catch here is that you still need to provide them with access to your Cribl tenant. You can sequester them to your Development workspace, but you’ll need to do some kind of access setup.
If you can’t create a workspace for some reason, you could also have your developers leverage Cribl’s on-premises free edition. Your developers can then install Cribl on a VM, their workstation, or a dedicated development server and build their packs within that instance. This is not the recommended approach, as it introduces a lot more variables, but it is a possibility.
Create Pack
So, now we have an empty Git repository and a Cribl environment to develop in that is isolated from the production environment. The next step is to create the pack. To do this, simply go to Stream -> choose your Worker Group -> Processing -> Packs. Then pick Create Pack. Fill in as much of this information as you can. Make sure you adhere to the standards outlined in the data governance for your organization. For example, make sure you are following a naming scheme, tag sources and destinations, add versioning, etc.
Ingest Data
Once the barebones pack is created, we need to click into it and create our source. Where are you ingesting data from? Let’s say it’s from a custom application that has a REST API. Well, thankfully, in Cribl 4.14.0, we can now add REST collectors in packs, so let’s do that. Navigate to the Worker Group -> Data -> Sources -> REST API, and configure it in accordance with your data governance rules. Be sure you’re actually getting data in before moving on.
If, for some reason, you do not want to or cannot set up sources in this pack and you need to get data to it for testing, here are some other ways to get sample data into the disconnected environment:
- If the data is already flowing to production, copy a Capture from the production environment over to the development environment:
  - In Production: Edit Sample, Select All, Copy
  - In Dev: Import Sample, Paste events
- If the data is already flowing to production, replay the data to the development environment.
Build Pipelines
Awesome, data is connected to Cribl, and we can see it coming in. Now we need to determine what to do with that data. Making application owners responsible for this step means the people building the pipeline actually understand the incoming data and can keep up with changes to the platform (we’ve ALL been hit with an upgrade that changed the logging format). It also reduces the load on your Cribl administration team immensely.
Following the custom application example from the Ingest Data step: as the application owner, you know that you require the fields timestamp, user, and message. Nothing else from these logs really matters, so you build a pipeline that parses the data, keeps those fields, and reserializes it into KV pairs. Now, when you send it to your SIEM, it will take up far less space and consume less of your license, which is what security cares about. Be sure to test the data as you’re working on the pipeline to make sure it is parsing as expected.
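To show what the parse, keep, and reserialize steps amount to, here is a small Python sketch of the same transformation outside of Cribl. It assumes the raw events arrive as JSON objects (the article does not specify the format), and the sample event and its extra fields are made up for illustration. In Cribl itself you would build this from pipeline functions rather than code.

```python
import json

# The three fields the application owner wants to keep.
KEEP_FIELDS = ("timestamp", "user", "message")


def reduce_event(raw: str) -> str:
    """Parse a JSON event, keep only KEEP_FIELDS, and emit key="value" pairs."""
    event = json.loads(raw)
    pairs = []
    for name in KEEP_FIELDS:
        if name in event:
            # Minimal escaping: only embedded double quotes are handled.
            value = str(event[name]).replace('"', '\\"')
            pairs.append(f'{name}="{value}"')
    return " ".join(pairs)


# Hypothetical sample event with two fields that should be dropped.
sample = (
    '{"timestamp": "2024-05-01T12:00:00Z", "user": "alice", '
    '"message": "login ok", "session": "abc123", "debug": true}'
)
print(reduce_event(sample))
# timestamp="2024-05-01T12:00:00Z" user="alice" message="login ok"
```

Comparing the length of `sample` with the length of the reduced output on real captures gives a rough feel for how much volume the pipeline saves.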
Set Up Destination
Okay, so we now have our source and pipeline set up in our pack. Next, we need to set up the destination. This part is optional. If needed, the Cribl Administration team could take care of this section of the pack. If the responsibility matrix established during the data governance phase indicates that application/log owners are responsible for designing the data from source to destination, then they should continue setting it up here. Data governance may have also decided that the application/log owners are only responsible for getting data into Cribl and in the right format for the destination. Either of these approaches is acceptable; it just depends on what is best for your organization.
If the destination is meant to be set up within the pack, however, now is the time to do that and test it. Be sure there is an indicator somewhere in the data showing that it came from the development environment and is not production data. Document that indicator in the change ticket so that whoever is promoting the pack into production knows what needs to be updated for production readiness.
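One simple form such an indicator could take is an extra field on every event. The Python sketch below illustrates the idea; the field name `env` and value `dev` are hypothetical, and in Cribl you would add the field in the pipeline rather than in code.

```python
# Hypothetical marker; pick whatever your data governance rules call for.
DEV_MARKER_FIELD = "env"
DEV_MARKER_VALUE = "dev"


def mark_as_dev(event: dict) -> dict:
    """Return a copy of the event tagged as development data."""
    return {**event, DEV_MARKER_FIELD: DEV_MARKER_VALUE}


def has_dev_marker(event: dict) -> bool:
    """True if the event still carries the development marker."""
    return event.get(DEV_MARKER_FIELD) == DEV_MARKER_VALUE


event = mark_as_dev({"user": "alice", "message": "login ok"})
print(has_dev_marker(event))  # True
```

Whoever promotes the pack can run the same check against a production sample to confirm the marker was removed or updated.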
Create Route
Just like any other time that we’re ingesting data, we need to put all the pieces together. Make sure to set up the route in your pack to get data from your configured source, through your new pipeline, into your destination.
Export Pack to Git
Follow these steps to get your newly made pack into Git. You should already have a Git repository created and ready to be used for this pack. We recommend creating a dev branch as well.
- Clone down your Git repository to your local machine and switch to the dev branch.
- Double-check that all of the configurations created for this integration (source, route, pipeline, destination, etc.) are properly included within the pack that you’re working on. You can do this by navigating to Processing -> Packs -> your new pack and reviewing the configurations.
- Once you have confirmed the above, go back to the Packs page and export the pack by clicking on the 3 dots on the right-hand side of the screen next to the pack information and choosing Export.
- Extract the .crbl file that you downloaded, and copy the contents of the pack folder to your Git repository folder. The base folder structure in Git should be the contents of the pack itself: it should contain things like package.json, README.md, and the data and default folders.
- Add, Commit, and Push this configuration to Git.
- Update the change control ticket to indicate development is complete and route it to the Cribl administrator team.
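Before the add, commit, and push step, it can help to confirm that the repository root really is the pack root. The Python sketch below checks only for the entries named in the steps above; the function name is hypothetical, and a given pack may legitimately contain more than these four entries.

```python
from pathlib import Path

# Top-level entries the steps above say an exported pack should contain.
EXPECTED_FILES = ("package.json", "README.md")
EXPECTED_DIRS = ("data", "default")


def missing_pack_entries(repo_root: str) -> list[str]:
    """Return expected pack entries that are absent from the repository root."""
    root = Path(repo_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Check the current directory, e.g. when run from the cloned repository.
    problems = missing_pack_entries(".")
    if problems:
        print("Missing from repository root:", ", ".join(problems))
    else:
        print("Repository root looks like a pack.")
```

An empty result means the pack contents were copied to the right level; a non-empty one usually means the pack folder itself was copied in, one level too deep.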
Promote Code to Production
At this point, your new pack is in Git on the dev branch. The change control ticket is in the queue for Cribl Admins. What’s next? The Cribl Administrators should now review the configuration either via Git or via the Dev environment.
- If the Cribl Administrator team has any feedback about the configuration or changes that need to be made, then the pack should be returned to the dev team for further updates, and the change ticket should be routed back to them.
- If the Cribl Administrator team is happy with the current version, then they should promote the Git configuration to the prod/main branch and import it into the production environment:
  - Ensure that tags are updated to reflect the production environment
  - Commit and deploy the changes
  - Test the route and pipeline
  - Close the change ticket as complete
Example Scenario
The application development team needs to onboard their Shine logs into Cribl. They’ve developed an API in their application just to do this. The Cribl administration team has a Testing workspace set up just for such a scenario, and they have an SSO group set up to allow login to this workspace as needed. They have also created a dataset in Cribl Lake for the development team to send data to. The application development team has been through the Cribl User training and has a decent understanding of the environment.
- Change control process is kicked off; a Jira ticket is created and assigned to the application development team.
- Application development team creates a Git repository called shine_stream_ingest; they update the ticket with this information.
- Application development team navigates to Cribl, logs into the Testing workspace, and creates a pack with the pack ID of shine_stream_ingest and the name Shine Stream Ingest. They also fill out the details of the pack, like the README and tags, in accordance with the organization’s data governance conditions.
- Application development team creates a REST API collector Source, a pipeline to parse their data efficiently, a Cribl Lake Destination, and a route that pieces these things together.
- Application development team commits and deploys their changes.
- Application development team tests their data ingest from end to end within the Testing workspace.
- Application development team ensures that everything is added to the pack, saves it, and exports it. Then they push it to the dev branch in Git and notify the Cribl Administration team by moving the change control ticket into their queue.
- Cribl Administration team reviews the pack. They import it into the production environment and test it. They deem it safe and efficient. They update the tags to reflect production, save the changes, commit and deploy the changes, and export it to the main branch in Git.
- Cribl Administration team updates the change ticket with the new Git information and closes the change as complete.
