Error: Warning: Unable to move file * to check results queue


#1

I have a new setup on Ubuntu 9.04 and I get this error:

Jul 10 15:12:02 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkcyG3Jk' to check results queue.
Jul 10 15:12:04 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkRs0dBu' to check results queue.
Jul 10 15:12:38 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkluG7NX' to check results queue.
Jul 10 15:12:39 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkhLRvWf' to check results queue.
Jul 10 15:12:42 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/check9ZSQcj' to check results queue.
Jul 10 15:13:17 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkcvkYrM' to check results queue.
Jul 10 15:13:54 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkTvsuZQ' to check results queue.
Jul 10 15:14:32 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkCOwNrG' to check results queue.
Jul 10 15:15:10 nagios nagios: Warning: Unable to move file '/usr/local/nagios/var/spool/checkresults/checkcIv2SL' to check results queue.

And in the Nagios web interface I have:

[code]localhost

Current Load

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:12:39 MSD 2009
Current Users

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:13:17 MSD 2009
HTTP

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:12:04 MSD 2009
PING

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:12:42 MSD 2009
Root Partition

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:13:54 MSD 2009
SSH

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:14:32 MSD 2009
Swap Usage

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:15:09 MSD 2009
Total Processes

PENDING N/A 0d 0h 8m 16s+ 1/4 Service check scheduled for Fri Jul 10 15:12:38 MSD 2009
[/code]

Everything runs on localhost, so there should be no problems, but I don't understand the error. Permissions seem to be OK.
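One way to double-check the permissions claim is to try creating and renaming a scratch file in the spool directory while running as the nagios user. The following is only a minimal sketch: the helper name `can_create_and_rename` and the `permtest` prefix are made up for illustration, and it does not reproduce what Nagios itself does internally; it only tests whether the current user can create and rename a file in that directory.

```python
import os
import tempfile


def can_create_and_rename(directory):
    """Return True if the current user can create a file in `directory`
    and rename it within that same directory (hypothetical helper)."""
    path = None
    try:
        # Create a scratch file inside the directory under test.
        fd, path = tempfile.mkstemp(prefix="permtest", dir=directory)
        os.close(fd)
        # Rename it in place; this needs write permission on the directory.
        moved = path + ".moved"
        os.rename(path, moved)
        path = moved
        return True
    except OSError:
        return False
    finally:
        # Best-effort cleanup of whatever scratch file is left behind.
        if path is not None:
            try:
                os.remove(path)
            except OSError:
                pass


if __name__ == "__main__":
    # Path taken from the warnings in the log; run this as the nagios user.
    spool = "/usr/local/nagios/var/spool/checkresults"
    print(spool, "create+rename OK:", can_create_and_rename(spool))
```

If this prints `False` when run as the nagios user, the problem is likely on the filesystem side rather than in the Nagios configuration.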

The debug log doesn't say much:

[1247224957.632998] [001.0] [pid=13646] check_for_external_commands()
[1247224957.633007] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224957.883876] [008.1] [pid=13646] ** Event Check Loop
[1247224957.883942] [008.1] [pid=13646] Next High Priority Event Time: Fri Jul 10 15:22:38 2009
[1247224957.883953] [008.1] [pid=13646] Next Low Priority Event Time: Fri Jul 10 15:22:54 2009
[1247224957.883958] [008.1] [pid=13646] Current/Max Service Checks: 0/0
[1247224957.883963] [001.0] [pid=13646] check_for_external_commands()
[1247224957.883971] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224958.134921] [008.1] [pid=13646] ** Event Check Loop
[1247224958.134975] [008.1] [pid=13646] Next High Priority Event Time: Fri Jul 10 15:22:38 2009
[1247224958.134986] [008.1] [pid=13646] Next Low Priority Event Time: Fri Jul 10 15:22:54 2009
[1247224958.134991] [008.1] [pid=13646] Current/Max Service Checks: 0/0
[1247224958.134996] [001.0] [pid=13646] handle_timed_event() start
[1247224958.135001] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224958.135010] [008.0] [pid=13646] ** Timed Event ** Type: 5, Run Time: Fri Jul 10 15:22:38 2009
[1247224958.135016] [008.0] [pid=13646] ** Check Result Reaper
[1247224958.135020] [001.0] [pid=13646] reap_check_results() start
[1247224958.135024] [016.0] [pid=13646] Starting to reap check results.
[1247224958.135056] [016.1] [pid=13646] Starting to read check result queue '/usr/local/nagios/var/spool/checkresults'...
[1247224958.136438] [016.0] [pid=13646] Finished reaping 0 check results
[1247224958.136444] [001.0] [pid=13646] reap_check_results() end
[1247224958.136448] [001.0] [pid=13646] handle_timed_event() end
[1247224958.136453] [001.0] [pid=13646] reschedule_event()
[1247224958.136457] [001.0] [pid=13646] add_event()
[1247224958.136462] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224958.136467] [008.1] [pid=13646] ** Event Check Loop
[1247224958.136476] [008.1] [pid=13646] Next High Priority Event Time: Fri Jul 10 15:22:39 2009
[1247224958.136484] [008.1] [pid=13646] Next Low Priority Event Time: Fri Jul 10 15:22:54 2009
[1247224958.136489] [008.1] [pid=13646] Current/Max Service Checks: 0/0
[1247224958.136494] [001.0] [pid=13646] check_for_external_commands()
[1247224958.136502] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224958.387894] [008.1] [pid=13646] ** Event Check Loop
[1247224958.387952] [008.1] [pid=13646] Next High Priority Event Time: Fri Jul 10 15:22:39 2009
[1247224958.387991] [008.1] [pid=13646] Next Low Priority Event Time: Fri Jul 10 15:22:54 2009
[1247224958.387996] [008.1] [pid=13646] Current/Max Service Checks: 0/0
[1247224958.388001] [001.0] [pid=13646] check_for_external_commands()
[1247224958.388010] [064.1] [pid=13646] Making callbacks (type 8)...
[1247224958.638889] [008.1] [pid=13646] ** Event Check Loop
[1247224958.638961] [008.1] [pid=13646] Next High Priority Event Time: Fri Jul 10 15:22:39 2009
[1247224958.638972] [008.1] [pid=13646] Next Low Priority Event Time: Fri Jul 10 15:22:54 2009
[1247224958.638977] [008.1] [pid=13646] Current/Max Service Checks: 0/0
[1247224958.638982] [001.0] [pid=13646] check_for_external_commands()
[1247224958.638990] [064.1] [pid=13646] Making callbacks (type 8)...

Does anyone have a clue, or know how I can pinpoint the problem?


#2

I’ve investigated further with strace:

[pid 23675] write(3, "[1247229897.063604] [016.0] [pid="..., 72) = 72
[pid 23675] _llseek(3, 0, [768791], SEEK_CUR) = 0
[pid 23675] time(NULL) = 1247229897
[pid 23675] open("/usr/local/nagios/var/spool/checkresults", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 1
[pid 23675] fstat64(1, {st_mode=S_IFDIR|0775, st_size=12288, ...}) = 0
[pid 23675] fcntl64(1, F_SETFD, FD_CLOEXEC) = 0
[pid 23675] gettimeofday({1247229897, 64014}, NULL) = 0
[pid 23675] write(3, "[1247229897.064014] [016.1] [pid="..., 122) = 122
[pid 23675] _llseek(3, 0, [768913], SEEK_CUR) = 0
[pid 23675] getdents(1, /* 4 entries */, 4096) = 76
[pid 23675] stat64("/usr/local/nagios/var/spool/checkresults/czrANml", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
[pid 23675] stat64("/usr/local/nagios/var/spool/checkresults/czrANml.ok", 0xbfb3669c) = -1 ENOENT (No such file or directory)

It seems that czrANml.ok, like all the other .ok files, isn't being created.

total 12
drwxrwxr-x 2 nagios nagios 12288 Jul 10 16:49 checkresults

root@nagios:/usr/local/nagios/var/spool/checkresults# ls -l
total 36
-rw------- 1 nagios nagios 0 Jul 10 16:49 c1uxGN4
-rw------- 1 nagios nagios 0 Jul 10 16:48 c6rq9ek

So it shouldn't be a permission error. Why isn't the .ok file created?
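The strace output shows the reaper looking at each result file and then at a companion file with `.ok` appended. To see how many files are stuck in that state, a small script can list every regular file in the spool directory that has no `.ok` companion. This is a rough sketch; the function name `results_missing_ok` is made up for illustration, and it assumes the `<name>.ok` naming seen in the strace output.

```python
import os


def results_missing_ok(directory):
    """Return the sorted names of regular files in `directory` that have
    no companion '<name>.ok' file next to them (hypothetical helper)."""
    names = set(os.listdir(directory))
    missing = []
    for name in names:
        if name.endswith(".ok"):
            # This is a marker file itself, not a result file.
            continue
        full = os.path.join(directory, name)
        if os.path.isfile(full) and name + ".ok" not in names:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    # Path taken from the strace output; adjust for other installs.
    spool = "/usr/local/nagios/var/spool/checkresults"
    for name in results_missing_ok(spool):
        print("no .ok file for", name)
```

Running it a few times in a row would show whether files without a `.ok` companion keep piling up or are only there briefly while a check is in flight.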

Here is the content of one file:

[code]root@nagios:/usr/local/nagios/var/spool/checkresults# cat checkbhUvtC

Active Check Result File

file_time=1247229957

Nagios Service Check Result

Time: Fri Jul 10 16:45:57 2009

host_name=localhost
service_description=Current Users
check_type=0
check_options=0
scheduled_check=1
reschedule_check=1
latency=0.129000
start_time=1247229957.129777
finish_time=1247229957.133295
early_timeout=0
exited_ok=0
return_code=2
output=(null)
[/code]
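The interesting fields in that dump are `exited_ok=0`, `return_code=2` and `output=(null)`, which suggest the check itself did not finish cleanly. To scan many result files for the same pattern, the `key=value` lines can be parsed with a few lines of Python. This is only a sketch: the function names are made up for illustration, and treating `exited_ok=0` or `output=(null)` as "suspicious" is an assumption based on the field names, not on the Nagios documentation.

```python
def parse_check_result(text):
    """Collect the key=value lines of a check result file into a dict.
    Lines without '=' (titles, the 'Time:' line, blanks) are skipped."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value
    return fields


def looks_suspicious(fields):
    """Assumption: exited_ok=0 or output=(null) means the check did not
    complete normally. Values are compared as strings."""
    return fields.get("exited_ok") == "0" or fields.get("output") == "(null)"


if __name__ == "__main__":
    # A few lines copied from the file shown above.
    sample = """host_name=localhost
service_description=Current Users
exited_ok=0
return_code=2
output=(null)
"""
    fields = parse_check_result(sample)
    print(fields["service_description"], "suspicious:", looks_suspicious(fields))
```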


#3

Did you run “make install-commandmode” when compiling? It’s possibly the easiest way to solve permission problems.


#4

Yes, I did, and I did it again. Still the same :frowning:


#5

sorry, no idea what’s up… :frowning: