
How to Create Contribution KPIs in IT Service Intelligence (ITSI)

ITSI transaction processing

ITSI transaction processing can involve transactions flowing from multiple sources, in which case you can use Splunk ITSI to drill down to issues with specific transaction endpoints.

One of the most important KPIs to track is transaction error rates.

These can indicate problems with application health, with remote dependencies, or with the transaction requests themselves. 

For our purposes, we’ll assume you are handling transactions from both customers and suppliers.  These KPIs can be split by customer and supplier entities.  We found that calculating error rates per entity and then averaging them at the aggregate level often led to either excessive alert noise or failure to detect problems. 

The reason for this was fairly simple to understand: different customers or suppliers have different transaction volumes. 

The following example makes the problem clear. Here, one entity has a low transaction volume and a high error rate:

| Customer | Total Transactions | Total Errors | Error Rate |
|---|---|---|---|
| Customer 1 | 999 | 1 | 0.10% |
| Customer 2 | 1 | 1 | 100.00% |
| | 1000 | Average Error % | 50.05% |
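To see where the 50.05% comes from, here is a minimal Python sketch of the averaging calculation using the figures from the example above. The variable names are ours and are only for illustration; this is not how ITSI computes the KPI internally.

```python
# Figures from the example above: (total transactions, total errors) per customer.
entities = {
    "Customer 1": (999, 1),
    "Customer 2": (1, 1),
}

# Per-entity error rate, as a percentage of that entity's own transactions.
per_entity_rate = {
    name: 100.0 * errors / transactions
    for name, (transactions, errors) in entities.items()
}

# Averaging the per-entity rates weights both customers equally,
# even though Customer 2 has only a single transaction.
average_of_rates = sum(per_entity_rate.values()) / len(per_entity_rate)

for name, rate in per_entity_rate.items():
    print(f"{name}: {rate:.2f}%")
print(f"Average of per-entity rates: {average_of_rates:.2f}%")
```

Only 2 of the 1000 transactions failed, yet the averaged figure is dominated by the one low-volume customer.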

How do we track the overall error rate when low-volume entities (customers or suppliers) are common? We can implement a KPI for error rate contribution per entity: count errors at the entity level, divide each entity's error count by the total transaction count across all entities, and sum the resulting contributions at the aggregate level.

In the base search configuration, we can then take the appropriate measure at the entity level and sum it at the aggregate level. 

Using the same entities and numbers from the example above, you get the following results:

| Customer | Total Transactions | Total Errors | Error Rate Contribution |
|---|---|---|---|
| Customer 1 | 999 | 1 | 0.10% |
| Customer 2 | 1 | 1 | 0.10% |
| | 1000 | Overall Error % | 0.20% |
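The same arithmetic can be sketched in Python. This is an illustration of the contribution calculation only, with variable names we made up for the example; in ITSI the equivalent logic would live in the KPI base search.

```python
# Figures from the example above: (total transactions, total errors) per customer.
entities = {
    "Customer 1": (999, 1),
    "Customer 2": (1, 1),
}

# Denominator shared by every entity: transactions across all entities.
total_transactions = sum(transactions for transactions, _ in entities.values())

# Each entity's contribution is its error count over the shared denominator.
contribution = {
    name: 100.0 * errors / total_transactions
    for name, (_, errors) in entities.items()
}

# Summing the contributions at the aggregate level gives the overall error rate.
overall_error_rate = sum(contribution.values())

for name, value in contribution.items():
    print(f"{name}: {value:.2f}%")
print(f"Overall error rate: {overall_error_rate:.2f}%")
```

Because every entity is divided by the same total, a single failed transaction counts the same whether it comes from a high-volume or a low-volume customer.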

The basics of this can be adapted to several common ITSI transaction processing use cases. 

For example, you could take a distinct count of sessions per web server, calculate the per-server percentage of the total, and use this to detect load-balancing problems.  This can be especially useful if you need to do entity-level anomaly detection over many entities such as Docker containers, when your entity count exceeds the limits of ITSI’s out-of-the-box Entity Cohesion anomaly detection.
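As a rough sketch of the web server case, the Python below computes each server's share of distinct sessions and flags servers that stray from an even split. The server names, session IDs, and the 15-point tolerance are all hypothetical values chosen for illustration, and the simple tolerance check stands in for whatever thresholding or anomaly detection you would actually configure in ITSI.

```python
# Hypothetical sample: session IDs observed on each web server in one time window.
sessions_by_server = {
    "web-01": {"s1", "s2", "s3", "s4"},
    "web-02": {"s5", "s6", "s7"},
    "web-03": {"s8"},
}

# Distinct session count per server (the entity-level measure).
distinct_counts = {server: len(ids) for server, ids in sessions_by_server.items()}

# Total across all servers (the shared denominator).
total_sessions = sum(distinct_counts.values())

# Per-server percentage of the total.
session_share = {
    server: 100.0 * count / total_sessions
    for server, count in distinct_counts.items()
}

# With even load balancing, each server would carry roughly the same share.
expected_share = 100.0 / len(session_share)
tolerance = 15.0  # assumed tolerance in percentage points, for illustration only

skewed = [
    server
    for server, share in session_share.items()
    if abs(share - expected_share) > tolerance
]

for server, share in session_share.items():
    print(f"{server}: {share:.1f}% of sessions")
print("Servers outside tolerance:", skewed)
```

This sketch assumes a session is only ever seen on one server; if sessions can move between servers, the total would need to be a distinct count across all servers rather than a sum of per-server counts.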

About SP6

SP6 is a Splunk consulting firm focused on Splunk professional services, including Splunk deployment, ongoing Splunk administration, and Splunk development. SP6 also has a separate division that offers Splunk recruitment and the placement of Splunk professionals into direct-hire (FTE) roles for companies that need help acquiring their own full-time staff, given the challenges in today's market.