Did you know that Splunk has the capability to ingest non-log-based data through multiple onboarding methods? Two of those methods are HTTP Event Collector (HEC) and polling via an API.
Here, we focus on API-based data ingestion, traditionally the most common method, although the HEC (push) method is catching up rapidly.
Data is available within all electronic systems. In the past, it was seen as best practice to “lock in” customers to a custom bus and language to drive platform adoption. That has changed: most systems now treat easy integration with multiple tools throughout the enterprise as a major advantage.
API-based data ingestion is typically used for ingesting non-log-based data into Splunk. Depending on the enterprise, there are situations where API data ingestion is the best method and should be chosen over traditional log ingestion. Before starting the ingestion process, plan carefully and pay attention to:
- Data Volume
- Data Type
- Data Security
- Data Storage Requirements
From a configuration viewpoint, you will need Splunk equipment with associated code. The equipment can be a search head, a heavy forwarder (HF), or a stand-alone Splunk system. The system is equipped with a piece of code, usually a Technology Add-on (TA) or another add-on. The add-on uses the API's language (protocol), which can be a proprietary custom syntax or an industry standard such as REST. The target device or system to poll is usually configured as an IP address or URL with an associated port. Most, if not all, APIs use some form of authentication and allow SSL/TLS traffic encryption if desired.
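The configuration items described above (host or URL, port, authentication, optional TLS) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `PollTarget`, `build_poll_request`, `poll_once`, the `/api/v1/metrics` path, and the use of HTTP Basic authentication are all hypothetical, and a real add-on would follow whatever scheme the target API documents.

```python
import base64
import ssl
import urllib.request
from dataclasses import dataclass


@dataclass
class PollTarget:
    """Hypothetical description of one polled endpoint."""
    host: str            # IP address or hostname of the target system
    port: int            # port the API listens on
    path: str            # resource path (made up for this sketch)
    username: str
    password: str
    use_tls: bool = True

    def url(self) -> str:
        scheme = "https" if self.use_tls else "http"
        return f"{scheme}://{self.host}:{self.port}{self.path}"


def build_poll_request(target: PollTarget) -> urllib.request.Request:
    """Build (but do not send) a GET request using HTTP Basic authentication."""
    credentials = f"{target.username}:{target.password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(target.url(), method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def poll_once(target: PollTarget, timeout: float = 10.0) -> bytes:
    """Send one poll; certificate verification stays enabled for TLS targets."""
    context = ssl.create_default_context() if target.use_tls else None
    request = build_poll_request(target)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read()
```

Separating request construction from sending keeps the credential and URL handling easy to inspect without touching the network.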
The caveats of API data ingestion through an endpoint:
- Data is collected by a pull/poll method at set polling intervals
- Near real-time ingestion may be achieved using short intervals, at the cost of increased load and demand on the equipment involved in pulling the data
- Redundancy and failover planning can be complex
- The result format depends on the endpoint's API implementation. The most common formats are JSON and CSV.
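To make the polling-interval and output-format points concrete, here is a minimal sketch of a poll loop that normalizes JSON or CSV responses into individual events. The function names (`parse_payload`, `poll_loop`) and the `fetch`/`emit` callbacks are assumptions made for illustration; they are not part of any Splunk add-on.

```python
import csv
import io
import json
import time
from typing import Callable, Dict, List, Tuple


def parse_payload(body: str, content_type: str) -> List[Dict]:
    """Normalize a JSON or CSV response body into a list of event dicts."""
    if "json" in content_type:
        data = json.loads(body)
        # Accept either a single object or an array of objects.
        return data if isinstance(data, list) else [data]
    if "csv" in content_type:
        # The first CSV row is treated as the header; values stay strings.
        return list(csv.DictReader(io.StringIO(body)))
    raise ValueError(f"unsupported content type: {content_type}")


def poll_loop(
    fetch: Callable[[], Tuple[str, str]],   # returns (body, content_type)
    emit: Callable[[Dict], None],           # receives one event at a time
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll max_polls times, waiting interval_seconds between polls."""
    total = 0
    for attempt in range(max_polls):
        body, content_type = fetch()
        for event in parse_payload(body, content_type):
            emit(event)
            total += 1
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return total
```

A shorter `interval_seconds` moves ingestion closer to real time, but every poll costs a request on both the collector and the target, which is the load trade-off noted in the list above.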
The tools in Splunk to help with API ingestion are listed below:
- Push Messaging
  - HEC (built-in; push-based, HTTP POST message)
- Pull/Poll Retrieval Messaging
  - REST API Modular Input: https://splunkbase.splunk.com/app/1546/#/overview
  - Custom/Specific Add-ons and Technology Add-ons: https://splunkbase.splunk.com/
  - Custom scripts utilized through modular inputs
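For comparison with the pull-based tools, the push side can be sketched as a single HTTP POST to HEC. The example below builds such a request without sending it. It relies on the standard `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header; the helper name `build_hec_request`, the host name, the token value, and the sourcetype are placeholders chosen for illustration.

```python
import json
import urllib.request
from typing import Any, Optional


def build_hec_request(
    base_url: str,                 # e.g. "https://splunk.example.com:8088" (placeholder)
    token: str,                    # HEC token created in Splunk (placeholder value below)
    event: Any,                    # the event payload, typically a dict or string
    sourcetype: Optional[str] = None,
    index: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one event to HEC."""
    payload = {"event": event}
    if sourcetype:
        payload["sourcetype"] = sourcetype
    if index:
        payload["index"] = index
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/services/collector/event",
        data=body,
        method="POST",
    )
    request.add_header("Authorization", f"Splunk {token}")
    request.add_header("Content-Type", "application/json")
    return request


# Usage: build a request for one metric reading; passing it to
# urllib.request.urlopen() would actually send it.
example = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"host": "web01", "cpu": 42},
    sourcetype="api:metrics",
)
```

Here the source system (or a small script next to it) decides when to send, so no polling interval is involved on the Splunk side.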
About SP6
SP6 is a Splunk consulting firm focused on Splunk professional services, including Splunk deployment, ongoing Splunk administration, and Splunk development. SP6 also has a separate division that offers Splunk recruitment and places Splunk professionals into direct-hire (FTE) roles for companies that need help acquiring their own full-time staff, given the challenge that exists in the market today.