![Nagios log server vs splunk](/uploads/1/2/5/6/125602938/819158187.jpg)
Dave Williams - Nagios Log Server - Practical Experience. This session will detail the green field deployment of Nagios Log Server in a client environment consisting of HP LAN Switches, 3PAR disk storage, HP Blade Chassis with Flex Fabric using.
For monitoring logs with Nagios, the log checker typically returns a warning only for newly discovered error messages each time it is invoked (so it must retain some state in order to know to ignore them on subsequent runs). Therefore I usually set `max_check_attempts 1` and `is_volatile 1` on the service. This causes Nagios to send out the alert immediately, but only once, and then go back to normal.

My favorite log checker is logwarn, but I'm biased because I wrote it myself after not finding any existing ones that I liked. The logwarn package includes a Nagios plugin.

Nothing in your config jumps out at me as being misconfigured.

By design, check_log will only show either an OK message or the last log entry that triggered an alert.
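To make the two ideas above concrete (a checker that remembers where it stopped, and output limited to the last matching entry), here is a minimal Python sketch. It is not the code of logwarn or of any shipped Nagios plugin; the function name `check_new_lines`, the state-file format (a single byte offset), and the file names in `demo` are all assumptions made for illustration. The return codes follow the usual Nagios plugin convention of 0 for OK and 1 for WARNING.

```python
import os
import re
import tempfile


def check_new_lines(log_path, state_path, pattern):
    """Return (exit_code, message) for lines added since the previous run.

    Hypothetical helper: the state file just stores the byte offset
    reached on the last invocation.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as fh:
            offset = int(fh.read().strip() or 0)

    # If the log is now shorter than the saved offset, assume it was
    # rotated or truncated and start again from the beginning.
    if offset > os.path.getsize(log_path):
        offset = 0

    with open(log_path, "rb") as fh:
        fh.seek(offset)
        new_data = fh.read()
        new_offset = fh.tell()

    # Remember how far we got, so these lines are ignored next time.
    with open(state_path, "w") as fh:
        fh.write(str(new_offset))

    regex = re.compile(pattern)
    lines = new_data.decode("utf-8", "replace").splitlines()
    matches = [line for line in lines if regex.search(line)]
    if matches:
        # Only the last matching entry is reported.
        return 1, "WARNING: " + matches[-1]
    return 0, "OK"


def demo():
    """Run three checks against a throwaway log file."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        state = os.path.join(tmp, "app.state")
        with open(log, "w") as fh:
            fh.write("service started\n")
        first = check_new_lines(log, state, r"ERROR")
        with open(log, "a") as fh:
            fh.write("ERROR one\nERROR two\n")
        second = check_new_lines(log, state, r"ERROR")
        third = check_new_lines(log, state, r"ERROR")
        return [first, second, third]
```

In `demo`, the second check warns once about the newest entry and the third check is back to OK, which is the alert-once behaviour that a single check attempt on a volatile service is meant to pass straight through to notifications.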
If you need to see multiple entries, you'll need to modify the plugin.

However, I find the fact that you're not getting recoveries somewhat odd. The way check_log works (by comparing the current log to the previous version), you should get a recovery on the very next service check, except of course when additional matching entries have been added to the log since the last check.

Does forcing another service check (or several) cause it to recover?

Also, I don't intend this in a mean way, but make sure it's really malfunctioning. Is your log getting additional matching entries in between checks, causing it not to recover? Your check is matching '?', which will match anything new in the log. Is something else (a non-error) being added to the log and inadvertently causing a match?

If none of the above are the issue, I would suggest narrowing it down by taking Nagios out of the equation. Try running check_log manually (from the command line, but as the same user as nagios), and with a different oldlog.
It should go something like this:

1. Run the check with a new 'oldlog': you get an initialization message.
2. Run the check: check OK.
3. Make a change to the log.
4. Run the check: check fails.
5. Run the check: check OK.

If this doesn't work, then you know to focus on the log, the oldlog, and how check_log is doing the check.

If it works, then it points more towards a problem with your Nagios configuration.
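
The manual sequence above can also be rehearsed away from Nagios entirely. The following Python sketch imitates the compare-against-oldlog idea so you can see what a healthy run looks like. It is an assumption-laden stand-in, not the plugin's real implementation: `compare_with_oldlog`, the message strings, and the file names are hypothetical, and it assumes the log is only ever appended to.

```python
import shutil
import tempfile
from pathlib import Path


def compare_with_oldlog(logfile, oldlog, query):
    """Hypothetical stand-in for a check that diffs a log against its old copy."""
    if not oldlog.exists():
        # First run with a fresh oldlog: nothing to compare against yet.
        shutil.copyfile(logfile, oldlog)
        return "INIT: oldlog created"

    old_lines = oldlog.read_text().splitlines()
    new_lines = logfile.read_text().splitlines()
    # Lines added since the previous run (assumes an append-only log).
    added = new_lines[len(old_lines):]

    # The current log becomes the baseline for the next run.
    shutil.copyfile(logfile, oldlog)

    matches = [line for line in added if query in line]
    if matches:
        return "CRITICAL: " + matches[-1]
    return "OK"


def run_sequence():
    """Walk through the manual test steps and collect each result."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "app.log"
        old = Path(tmp) / "app.log.old"
        log.write_text("startup complete\n")

        results = [compare_with_oldlog(log, old, "ERROR")]     # new oldlog
        results.append(compare_with_oldlog(log, old, "ERROR"))  # no change
        with log.open("a") as fh:                               # change the log
            fh.write("ERROR disk full\n")
        results.append(compare_with_oldlog(log, old, "ERROR"))  # should fail
        results.append(compare_with_oldlog(log, old, "ERROR"))  # should recover
        return results


if __name__ == "__main__":
    for step, result in enumerate(run_sequence(), 1):
        print(step, result)
```

If the real plugin, run by hand as the nagios user, follows the same pattern of initialization, OK, failure, then OK, the plugin and its oldlog handling are probably fine and the Nagios side deserves the attention.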