Is Splunk really useful for DBAs? The short answer: absolutely!
DBA (Database Administrator) is one of the most complicated and important roles in any enterprise, as they are ultimately responsible for where data lives. They are tasked with providing the fastest performance, highest availability, and total security for data. I personally spent 12 years as a DBA, and I can say the DBA is the first person to blame when it comes to problems with data integrity, search performance, or loss of data – and sometimes we even get blamed for cold coffee on a snowy morning! That’s why DBAs usually spend about 50% of their time avoiding a potential problem and the other 50% of their time proving that their databases are not the problem.
So, what DBAs really need is a one-stop-shop that can show them the whole flow of data, starting on wires and stopping on user interfaces. A place that should not only help them manage the backend components of their databases but also help them prove that they are not guilty when an issue is raised. So, is there a solution that has can provide this visibility in our world? Well, after changing my career over to the SEIM world, I finally have the answer for my DBA friends – yes we have – it is Splunk!
How does Splunk interact with Databases and Database logs?
Splunk primarily uses the following two methods to monitor the logs or data of databases and ingest them into Splunk indexes. In this article, I will use Oracle as an example database platform but the same concepts apply to most other popular databases:
Splunk DB Connect
Splunk DB Connect is one of the most useful FREE Splunk apps that DBAs can leverage to interact with their database through Splunk. This Splunk app uses Java Runtime Environment (JRE) and JDBC drivers to connect to Oracle, MS SQL, IBM DB2, MYSQL, and many more. You can learn more about this app and see a list of all the database supported out of the box right here – https://splunkbase.splunk.com/app/2686/
Using an Oracle database as an example, this app will allow Splunk to interface directly with your database:
- Reach your database with database user credentials that you provide and apply SQL statements. These statements will fetch the results from the database and ingest those results into Splunk. With this capability, DBAs can create performance monitoring SQL statements or get the data dictionary tables as well as the performance views and ingest all of that data into Splunk! Don’t worry about data update and integrity, Splunk DB Connect is designed to use an interval schedule or Cron tab schedules to execute these SQL. Also, the app has the ability to use the rising column to track updates (like increasing DateTime, or transaction counter, any field that increases and is unique to each row) so that it can read in database records without duplicates.
- Export data from Splunk back to your database. This is also a very useful feature of Splunk DB Connect that allows you to write anything you wish from Splunk (like the results of analytics in Splunk) back to your database!
- Use SQL commands in your Splunk searching to execute live queries against your databases. In addition to the data available to you in Splunk with Splunk’s query language (SPL), using Splunk DB Connect will open up the power of SQL to build your searches, reports, and dashboards.
File Monitoring
The most common method for Splunk to ingest and monitor data is to read directly from files on disk. The Splunk Universal Forwarder (agent) can be installed directly on database servers and configured to monitor any database configuration or log files. In Oracle, this is commonly used to monitor the audit files, so you can ingest all of your AUD, FGA, and other audit files into Splunk. This will be very helpful as these files will be read by Splunk in near-real-time, giving you a very close eye on your database and data access to maintain security. In addition to audit logs, you may choose to ingest performance, configuration, and alert logs as well – leveraging Splunk to monitor those logs for fluctuations in performance and ensure performance stability while also analyzing performance historically with the data that’s already been ingested into Splunk.
How can DBAs use these logs?
After the logs and data have been ingested by Splunk, DBAs can use both SQL and SPL to process those logs. This will give them a wide range of analytical tools, historical searching, and reporting and visualization options. Building dashboards can be very useful for monitoring database performance with real-time data – and also important for monitoring access to PII (Personally Identifiable Information).
And here are some common use cases:
- Backup status – you can create a dashboard that will monitor your backups and also alert you when a backup fails.
- Sensitive data access – you can create a dashboard that shows you which users are accessing certain tables, even apply whitelists or blacklists to ensure your data stays secure.
- Any database object changes – since this data is already being ingested by Splunk you can report on changes to database objects over time.
- Database performance – you can absolutely leverage performance views in your database and ingest them into Splunk. Then, apply any SQL+SPL to create enhanced performance monitoring views.
- Storage monitoring – you can also use the storage statistics saved in your databases in Splunk. This will give you the ability to analyze your data storage usage and create your own alerts and reports accordingly.
What makes Splunk unique for DBAs?
Maybe DBAs reading this article will say “OK, sounds good, but all of the above use cases can be accomplished using a vendor solution like OEM (Oracle Enterprise Manager). So why should I use an external tool like Splunk to do that work for me?”
Great question – the idea behind using Splunk comes down to resolving the multi-layer data flow issue and resolving these with a single view.
Currently, a DBA must cooperate with other teams to create a real-time or historical dashboard that correlates data coming from databases, networks, OS, middleware, and front-end applications (among others). This is where the real magic happens.
And here are some of the use cases to provide a single pane of glass for your environment:
You can track memory usage for the database, database server, app server across the same time range on one single dashboard. That will give you an immediate view across the entire stack and eliminate the hassle of communicating and troubleshooting across a range of servers.
You can monitor top SQLs running at a certain time, and in that same view, you have another panel looking for network bottlenecks or application failures.
Create one dashboard for all storage monitoring, that will keep an eye on your tablespaces as well as OS drives/mount points. Now instead of reacting to failures when you run out of storage, you can be more proactive.
Create a dashboard that will monitor the backup status from the database and at the same time verify disk performance (i.e. SAN storage), I/O, and storage availability.
Tips and hints:
- Splunk DB Connect is a FREE Splunk app that any Splunk admin can download, install, and use. You can get Splunk DB Connect from here.
- When using Splunk DB connect to connect to your database, make sure you have the right JDBC driver(s) installed for that database. You can find all drivers listed here.
- Double-check the privileges of the user credentials you use to connect to your database with Splunk DB Connect. This all depends on which data you want Splunk to be able to access in your database. The same concept applies to file monitoring with the Splunk Universal Forwarder – make sure the account the Splunk Universal Forwarder is running as has access to the files you want it to monitor.
- Consider the performance impact of Splunk DB Connect executing SQL queries against your database. In most cases, you can limit Splunk’s impact by optimizing your queries; but depending on the required minimum complexity and the amount of data that will be retrieved, you may notice a performance impact on the database. This is especially a problem with the initial reading of a large, full table into Splunk, after Splunk catches up to the latest record in the table this impact should be significantly reduced. You can also leverage the “followTail” parameter in inputs.conf to avoid ingesting historical data sets when they aren’t needed. More about inputs.conf and this parameter can be found here.
- Consider how ingesting this data will impact your Splunk license. Remember that whenever you are ingesting new data from database logs or tables into Splunk, that data volume will be applied to your daily license quota.
In summary
Splunk is an all-in-one solution to resolve database issues across multiple technologies within your enterprise. Using Splunk you get the advantage of having other systems’ logs to troubleshoot any issue within the environment, keep track of historical statistics for database objects and performance, and finally, you will have the opportunity to prove and in a very short time – that this cold cup of coffee was not because of you (but let’s face it, maybe it was your fault 😊).
About SP6
SP6 is a Splunk consulting firm focused on Splunk professional services including Splunk deployment, ongoing Splunk administration, and Splunk development. SP6 has a separate division that also offers Splunk recruitment and the placement of Splunk professionals into direct-hire (FTE) roles for those companies that may require assistance with acquiring their own full-time staff, given the challenge that currently exists in the market today.