[blockquote]To answer your questions, though, -email- is just something I put there for hiding my eddress.[/blockquote]Super, just thought I’d check. Whether changing $CONTACTEMAIL$ to a real address is causing a problem, I don’t know; I can’t say I’ve ever done it. The ‘at’ symbol might throw a spanner in the works, or it might not. I guess you’d know whether it still worked after changing that one thing, though.
[blockquote]I think this might be the original notify-by-email. I’m having server trouble, and I can’t get to it (yet).
/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $LONGHOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$ [/blockquote]That looks like the host notify command to me, which will be sent if the whole host goes down. It’s the service-notify-by-email command that deals with the service check side, so that’s the one we need to look at (when you can get on the server again).
[blockquote]$USER1$/check_nt -H $HOSTADDRESS$ -p $USER19$ -v USEDDISKSPACE -l C w -90 c -95 | /bin/mail -s "USEDDISKSPACE alert for $HOSTNAME$!" -email-[/blockquote]Cool that it generates an email, though I’m a little surprised it works, especially as it even complains about errors. Anyway, that’s by the by. What happens here is that Nagios runs the above service check for your wks146 server, populating the macros with the information you configured for wks146: $HOSTADDRESS$ gets replaced with the IP address of wks146, and $USER19$ I assume is configured as your server-side client’s port. So it runs that, gets the output back, then pumps the whole lot at /bin/mail, which emails it to you. So far, so groovy. The reason you are not getting alerts, I imagine, is that the /bin/mail part at the end of the pipe, which is by all accounts working and sending the email, finishes ‘cleanly’, so its exit code is a normal, happy exit code. I reckon Nagios sees this exit code from /bin/mail and thinks the service check is OK, regardless of what the plugin’s exit code was, and OK means no alerts.
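The exit code behaviour is easy to see outside Nagios. This is only a sketch: fake_plugin is a made-up stand-in for check_nt that always reports CRITICAL (exit code 2), and cat stands in for /bin/mail, so nothing actually gets emailed. It assumes a plain sh or bash without the pipefail option set.

```shell
#!/bin/sh
# Hypothetical stand-in for a plugin that reports CRITICAL (exit code 2).
fake_plugin() {
    printf '%s\n' "C: 97% used (made-up sample output)"
    return 2
}

# Plugin on its own: the caller sees the plugin's exit code.
fake_plugin > /dev/null
plugin_status=$?

# Plugin piped into another command (cat stands in for /bin/mail):
# the caller sees only the exit code of the LAST command in the pipe.
fake_plugin | cat > /dev/null
pipeline_status=$?

echo "plugin alone: $plugin_status"    # prints "plugin alone: 2"
echo "plugin | cat: $pipeline_status"  # prints "plugin | cat: 0"
```

If Nagios treats the pipe the same way the shell does, the CRITICAL from the plugin is swallowed and all it ever sees is the 0 from the mail command.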
So on to
[blockquote]I’ve also tried this (and it works without errors, but it doesn’t generate an e-mail):
$USER1$/check_nt -H wks146 -p $USER19$ -v USEDDISKSPACE -l C w -90 c -95 | /bin/mail -s "USEDDISKSPACE alert for $HOSTNAME$!" -email-[/blockquote]
Hmmm, not sure what’s going on here. It depends on the definition of ‘works’. What I’d first imagined is that Nagios can’t resolve wks146 to an IP address, so the check times out and passes nothing whatsoever as an email body to /bin/mail, which might say ‘yeah OK, I’ll send nothing’ and exit with what Nagios sees as an OK status. On reflection, I would have thought /bin/mail would complain a bit more than that and at least exit with some sort of error code. I really can’t fathom that one, and I’m not near any flavour of Linux at the moment to try it and see.
As far as getting something working goes, you should be just fine with
$USER1$/check_nt -H $HOSTADDRESS$ -p $USER19$ -v USEDDISKSPACE -l C w -90 c -95
That should be OK, and as far as sending alerts goes, the last exit code Nagios sees will be the one from the plugin, so Nagios will (or should) use the service-notify-by-email command to mail out the alert just like it used to. Granted, it may well come out without any of the stuff you want, but that will need to be fixed (another thought occurred, see the footnote at the end) by editing the service-notify-by-email command object, which with any luck will solve the problem, assuming groundworks nagios uses the same $SERVICEOUTPUT$ macro that normal Nagios does, and I can’t see why it wouldn’t.
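To show why the macro matters, here is a rough sketch of how a notify command builds its mail body. The shell variables are made-up stand-ins for the values Nagios would substitute for the macros of the same name, the sample output text is invented, and the body is printed rather than piped to /bin/mail. It is not the actual groundworks command definition, which I haven’t seen.

```shell
#!/bin/sh
# Hypothetical values standing in for what Nagios would substitute for the
# corresponding $MACRO$ when it runs the notification command.
NOTIFICATIONTYPE='PROBLEM'
HOST_NAME='wks146'
SERVICESTATE='CRITICAL'
SERVICEOUTPUT='C: 95% used (made-up sample plugin output)'

# Build the body the same way the quoted notify command does.
# printf "%b" expands the \n escapes in its argument.
body=$(printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE\nHost: $HOST_NAME\nState: $SERVICESTATE\nInfo: $SERVICEOUTPUT\n")

# A real command object would pipe this to /bin/mail; here it is only
# printed so the sketch can be run anywhere.
printf '%s\n' "$body"
```

The plugin output only shows up on the Info line because the body text refers to it. If the service-notify-by-email command has no $SERVICEOUTPUT$ in it, the email will carry the state but none of the plugin’s data.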
[blockquote]Thanks, also, for the clarification of checks/commands/macros, too! It’s all starting to become (somewhat) clear.
Instead of being a rookie, maybe one day I’ll make it to “The Show.” [/blockquote]
No worries. Now you’re looking ‘under the hood’ it should all start dropping into place, and I’m sure it won’t be long before you’re back at the groundworks forum solving everyone else’s problems for them.
But for now I think I’ve gone on long enough. I’m off to bed, it’s late o’clock.
Toodles
/S
*****The check_nt plugin definitely comes back with the right data; we’ve already seen this by running it from the command line. So the question is why it wasn’t getting through to you in your original alert emails.
We’ve been looking at the possibility that the service-notify-by-email command is at fault, but it is possible that the original check_nt_disk_C command was configured to use some hooky plugin wrapper that didn’t use check_nt like we did from the command line and sent back only an exit code with no data. I think this is most unlikely, but it might be a possibility, not knowing exactly what it did. If, using the check_nt command above, you suddenly start getting the data back, then that was the reason all along and there is nothing wrong with service-notify-by-email; we just needed to change the check to pure check_nt. As I say, unlikely. But I just realised one thing I never asked: are you running any other checks that do come back with the plugin output in an alert email (like packet loss details and RTA for check_ping)? I probably should have asked, it might have saved a lot of time. So, do you, and do they?
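If you want to tell the two cases apart from the command line, something along these lines would do it. It is a sketch only: real_plugin and silent_wrapper are made-up stand-ins (I don’t know what the original check_nt_disk_C actually ran), and inspect_check is a hypothetical helper that captures both the output and the exit code of whatever command you hand it.

```shell
#!/bin/sh
# Run a check command and report both its exit code and its output.
inspect_check() {
    check_output=$("$@" 2>&1)
    check_status=$?
    if [ -z "$check_output" ]; then
        echo "exit code $check_status, no plugin output"
    else
        echo "exit code $check_status, output: $check_output"
    fi
}

# Hypothetical stand-in for a plugin that returns data and an exit code.
real_plugin() {
    echo "c: 95% used"
    return 2
}

# Hypothetical stand-in for a wrapper that throws the data away and
# passes on only the exit code.
silent_wrapper() {
    real_plugin > /dev/null
    return $?
}

inspect_check real_plugin      # prints "exit code 2, output: c: 95% used"
inspect_check silent_wrapper   # prints "exit code 2, no plugin output"
```

Both give Nagios the same CRITICAL state, but only the first gives it anything to put in the email.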