What Is Post-Process Searching?
Post-process searching is a technique for optimizing dashboards in Splunk. As a general rule of thumb, each running search occupies one CPU core on the server for as long as it runs. A dashboard with 15, 25, or even 30 panels can therefore be extremely resource-intensive, especially if it is widely used and many people open it at the same time.
Post-process searching addresses this problem by sharing a single common base search across multiple dashboard panels. This can cut the number of individual searches a dashboard fires by 30-40% (or more in some cases). It is a critical technique for your Splunk users to learn, especially as your Splunk environment grows and more and more people consume resources for alerts, ad hoc searches, dashboards, and more.
There are two major hindrances to Splunk performance: hardware allocation and bad user behavior. Teaching your users to search efficiently goes a long way toward keeping your environment running smoothly.
How do I know if I can use post-process searching?
When you or your users develop dashboards, you may notice that many panels reference the same root search but display the data in different ways, such as the same data shown as a timechart in one panel and as a single value in another. Or the panels may share a root search and differ only in the stats aggregations they apply to its fields. These are the cases where post-process searching can help you optimize your dashboards.
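As a rough illustration of the pattern to look for, the fragment below shows two panels that each run their own copy of the same root search. The index and sourcetype names (web, access_combined) are hypothetical placeholders, not taken from any particular environment.

```xml
<!-- Two panels, each launching a separate search over the same events.
     index=web and sourcetype=access_combined are hypothetical names. -->
<single>
  <search>
    <!-- Root search + a single-value aggregation -->
    <query>index=web sourcetype=access_combined status=500 | stats count</query>
  </search>
</single>
<chart>
  <search>
    <!-- Same root search again, only the final command differs -->
    <query>index=web sourcetype=access_combined status=500 | timechart count</query>
  </search>
</chart>
```

Everything before the first pipe is identical in both queries, which is the signal that the panels are candidates for a shared base search.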
Post-process searches
Post-process searches perform additional processing on results from a base search. A base search can be a global search or any other search within a dashboard. Use the base attribute in a post-process <search> to indicate the base search id.
You can use a single post-process search to generate results or you can chain multiple post-process searches together.
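A minimal sketch of chaining, assuming a base search over Splunk's internal splunkd logs: the first post-process search carries both a base and an id attribute, so a second post-process search can build on it. The id values (base_by_component, errors_only) are names chosen here for illustration only.

```xml
<!-- Base search: a transforming search that returns a statistics table -->
<search id="base_by_component">
  <query>index=_internal sourcetype=splunkd | stats count by component log_level</query>
  <earliest>-24h</earliest>
  <latest>now</latest>
</search>

<!-- First post-process search: filters the base results.
     It has its own id so that it can act as a base in turn. -->
<search id="errors_only" base="base_by_component">
  <query>search log_level=ERROR</query>
</search>

<!-- Second post-process search, chained onto the first -->
<table>
  <search base="errors_only">
    <query>sort - count | head 10</query>
  </search>
</table>
```

Only the search with id base_by_component is dispatched against the index; the other two operate on its results.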
Best practices
Use these Splunk best practices to make sure that post-process searches work as expected.
- Use a transforming base search. A base search should be a transforming search that returns results formatted as a statistics table.
- Non-transforming base search issues. Non-transforming base searches can cause the following search result and timeout issues. If you observe these issues in a dashboard, check the base search to make sure that it is a transforming search.
  - No results returned. If the base search is a non-transforming search, you must explicitly state in the base search what fields will be used in the post-process search using the | fields command. For example, if your post-process search will search for the top-selling buttercup game categories over time, you would use a search command similar to the following: | fields _time, categoryId, action
  - Event retention. If the base search is a non-transforming search, the Splunk platform retains only the first 500,000 events that it returns. A post-process search does not process events in excess of this 500,000 event limit, silently ignoring them. This can generate incomplete data for the post-process search. This search result retention limit matches the max_count setting in limits.conf. The setting defaults to 500,000.
  - Client timeout. If the post-processing operation takes too long, it can exceed the Splunk Web client timeout value of 30 seconds.
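If a non-transforming base search cannot be avoided, the | fields requirement described above might look like the following sketch. The index, sourcetype, and id names are hypothetical; only the fields list (_time, categoryId, action) comes from the example in the text. A transforming base search remains the recommended approach.

```xml
<!-- Non-transforming base search: no stats/chart/timechart, so the
     fields needed by the post-process search are listed explicitly.
     index=web, sourcetype=access_combined and the id are hypothetical. -->
<search id="purchases_base">
  <query>index=web sourcetype=access_combined | fields _time, categoryId, action</query>
  <earliest>-7d</earliest>
  <latest>now</latest>
</search>

<chart>
  <search base="purchases_base">
    <!-- Uses only fields that the base search passed through -->
    <query>search action=purchase | timechart count by categoryId</query>
  </search>
</chart>
```

Without the | fields clause, categoryId and action might not be available to the post-process search, which is one way the "no results returned" symptom shows up. The 500,000 event retention limit still applies to a base search like this one.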
How do I implement post-process searching?
Note that post-process searching must be set up in the source of the dashboard: you have to open the dashboard's XML to add your base search. Post-process example:
<!-- My parent search -->
<search id="xyz">
  <query>index=_internal | stats count by destIp destPort eventType</query>
</search>
<!-- post processing reference -->
<chart>
  <search base="xyz">
    <query>stats dc(destIp) as "Distinct Count of Hosts"</query>
  </search>
</chart>
<chart>
  <search base="xyz">
    <!-- The base stats output has no _time field, so timechart would return nothing here; sum the precomputed counts instead -->
    <query>stats sum(count) as count by destPort</query>
  </search>
</chart>
Notice in the above example that, at the top of our XML, we name our base search "xyz" via the id attribute of the <search> element. The base search contains a transforming command, stats, which produces a table of the fields of interest that the panels further down the dashboard reference. Each panel then applies its own transforming command to that table, depending on how we want to visualize the data. Practiced wherever possible throughout a Splunk environment, this can greatly reduce the load on indexers and search heads.
