Ticket #225 (closed defect: invalid)

Opened 6 months ago

Last modified 6 months ago

avahi-daemon fails to open /etc/avahi/services

Reported by: mpfj Assigned to: lennart
Priority: major Milestone:
Component: avahi-daemon Version:
Keywords: Cc:

Description

This looks similar to ticket #190, but it is not the same.

Running on Atmel's NGW100 dev kit (AVR32 platform), avahi-daemon fails to open the contents of /etc/avahi/services.

Attached is an strace output ...

Attachments

avahi-segfault.txt (15.8 kB) - added by mpfj on 07/23/08 13:40:47.

Change History

07/23/08 13:40:47 changed by mpfj

  • attachment avahi-segfault.txt added.

07/23/08 13:42:29 changed by mpfj

  • component changed from avahi-core to avahi-daemon.

07/23/08 15:11:55 changed by mpfj

If I have an empty services directory, I get this:-

open("/etc/avahi/services", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 12
fstat(12, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(12, F_SETFD, FD_CLOEXEC)          = 0
brk(0x15000)                            = 0x15000
getdents64(12, /* d_reclen == 0, problem here *//* 0 entries */, 4096) = 48
getdents64(12, /* 0 entries */, 4096)   = 0
close(12)                               = 0
close(11)                               = 0
close(10)                               = 0
close(9)                                = 0
write(2, "No service file found in /etc/av"..., 45No service file found in /etc/avahi/services.) = 45
write(2, "\n", 1
)                       = 1

Whereas if I create (say) the file ssh.service, then I get this:-

open("/etc/avahi/services", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 12
fstat(12, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(12, F_SETFD, FD_CLOEXEC)          = 0
brk(0x15000)                            = 0x15000
getdents64(12, /* d_reclen == 0, problem here *//* 0 entries */, 4096) = 80
getdents64(12, /* 0 entries */, 4096)   = 0
close(12)                               = 0
close(11)                               = 0
close(10)                               = 0
close(9)                                = 0
write(2, "Failed to read /etc/avahi/servic"..., 35Failed to read /etc/avahi/services.) = 35
write(2, "\n", 1
)                       = 1

So it has worked out that the services directory is not empty, but it is not happy with the contents?

07/23/08 17:35:36 changed by mpfj

From static-services.c, this is the code that checks the services ...

    if ((globret = glob(in_chroot ? "/services/*.service" : AVAHI_SERVICE_DIR "/*.service", GLOB_ERR, NULL, &globbuf)) != 0)

        switch (globret) {
#ifdef GLOB_NOSPACE
            case GLOB_NOSPACE:
                avahi_log_error("Not enough memory to read service directory "AVAHI_SERVICE_DIR".");
                break;
#endif
#ifdef GLOB_NOMATCH
            case GLOB_NOMATCH:
                avahi_log_info("No service file found in "AVAHI_SERVICE_DIR".");
                break;
#endif
            default:
                avahi_log_error("Failed to read "AVAHI_SERVICE_DIR".");
                break;
        }

In my case, globret = 2, which I think is GLOB_ABORTED. Is there any way of finding out why?
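One way to find out why glob() aborts is to pass an error callback as its third argument instead of NULL. glob() calls it with the path it could not open or read and the errno value at that point. The sketch below is only an illustration: log_glob_error and glob_with_diagnostics are hypothetical names, not part of avahi, and it logs with fprintf() rather than avahi_log_error().

```c
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Called by glob() for each directory it fails to open or read.
   Returning 0 asks glob() to carry on; with GLOB_ERR set it still aborts. */
static int log_glob_error(const char *epath, int eerrno) {
    fprintf(stderr, "glob: cannot read %s: %s\n", epath, strerror(eerrno));
    return 0;
}

/* Hypothetical helper: same flags as the daemon uses, plus the callback. */
static int glob_with_diagnostics(const char *pattern, glob_t *out) {
    memset(out, 0, sizeof(*out));
    return glob(pattern, GLOB_ERR, log_glob_error, out);
}
```

Calling glob_with_diagnostics("/etc/avahi/services/*.service", &globbuf) in place of the plain glob() call should print the offending path and error text whenever the callback is invoked.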

07/23/08 17:36:09 changed by mpfj

Oh, and in_chroot = 0

07/23/08 20:57:34 changed by mpfj

I'm using buildroot on this dev kit, which uses uClibc. I have just noticed that its glob() support comes in two variants:

(a) GNU glob()

(b) SUSv3 glob()

The docs state that the SUSv3 version is the one to use (it has a smaller footprint but doesn't support all the GNU-specific options). They also state that the GNU version is "out of date".

I will test the GNU glob version and see if it makes any difference.

07/24/08 11:54:53 changed by mpfj

I've worked out where the glob() starts to fail ... in simple_protocol_setup().

I added the following debug info around the "unlink(AVAHI_SOCKET);" line ...

    glob_t globbuf;
    int globret;
    memset(&globbuf, 0, sizeof(globbuf));
    avahi_log_info("current_dir = %s", get_current_dir_name());
    avahi_log_info("glob test 3c ...");
    globret = glob("/etc/avahi/services/*.service", GLOB_ERR, NULL, &globbuf);
    avahi_log_info("test globret = %d", globret);
    globfree(&globbuf);

    /* We simply remove existing UNIX sockets under this name. The
       Avahi daemon makes sure that it runs only once on a host,
       therefore sockets that already exist are stale and may be
       removed without any ill effects */

    avahi_log_info("unlink %s", AVAHI_SOCKET);
    int ret = unlink(AVAHI_SOCKET);
    if (ret == -1)
        avahi_log_info("unlink failed, errno = %d", errno);

    memset(&globbuf, 0, sizeof(globbuf));
    avahi_log_info("current_dir = %s", get_current_dir_name());
    avahi_log_info("glob test 4 ...");
    globret = glob("/etc/avahi/services/*.service", GLOB_ERR, NULL, &globbuf);
    avahi_log_info("test globret = %d", globret);
    globfree(&globbuf);

... and this is the console output:-

current_dir = /
glob test 3c ...
test globret = 0
unlink /var/run/avahi-daemon/socket
unlink failed, errno = 2
current_dir = /
glob test 4 ...
test globret = 2

So the glob() works fine (globret = 0) before the unlink(), but fails after it (globret = 2).

07/24/08 12:19:28 changed by mpfj

  • status changed from new to closed.
  • resolution set to invalid.

Ah ... this looks like a glob() issue.

glob() doesn't appear to reset errno on entry, so an error left over from an earlier call (e.g. the failed unlink()) is reported as the errno for glob().

Yuk.

I will report this to the uClibc powers-that-be ...
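A minimal sketch of the workaround this suggests: clear errno immediately before calling glob(), so that a stale value from an earlier failed call cannot be mistaken for a glob() error. Whether the uClibc glob() really consults a stale errno is only the hypothesis stated in this comment; glob_clean_errno is a hypothetical wrapper name, not an avahi or uClibc function.

```c
#include <errno.h>
#include <glob.h>
#include <string.h>

/* Hypothetical wrapper: discard any errno left behind by an earlier call
   (for example a failed unlink()) before handing control to glob(). */
static int glob_clean_errno(const char *pattern, int flags, glob_t *out) {
    memset(out, 0, sizeof(*out));
    errno = 0;
    return glob(pattern, flags, NULL, out);
}
```

In the debug code above, this would replace the second glob() call, the one that runs after unlink(AVAHI_SOCKET) has failed with errno = 2 (ENOENT).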

(follow-ups: ↓ 9 ↓ 10 ) 07/24/08 15:31:56 changed by tedp

System calls don't reset errno when they succeed. The calling code is meant to do that if it is relying on errno to determine if an error occurred, but usually it is the return value that indicates *whether* an error occurred and errno indicates *what* the error cause was.

You can set errno to zero before calling glob if you like, but it probably won't be particularly helpful.

Probably more helpful is to look at ls -alR /etc/avahi/services and see if there are any unusual permissions. Notably avahi-daemon uses the GLOB_ERR flag so glob will return the error you see if it finds any directories that it doesn't have permission to traverse.

The globret = 2 you see matches GLOB_ABORTED ("read error"): http://www.uclibc.org/cgi-bin/viewcvs.cgi/trunk/uClibc/include/glob.h?rev=17614&view=markup
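To avoid decoding the return value by hand, the debug output could print the symbolic name instead of the raw number. This is a small sketch using only the three error codes that POSIX defines for glob(); glob_result_name is a hypothetical helper, and the numeric values behind the constants depend on the C library's glob.h.

```c
#include <glob.h>

/* Hypothetical helper: translate a glob() return value into a readable name. */
static const char *glob_result_name(int ret) {
    switch (ret) {
        case 0:
            return "success";
        case GLOB_NOSPACE:
            return "GLOB_NOSPACE (out of memory)";
        case GLOB_ABORTED:
            return "GLOB_ABORTED (read error)";
        case GLOB_NOMATCH:
            return "GLOB_NOMATCH (no matches)";
        default:
            return "unknown";
    }
}
```

For example, avahi_log_info("test globret = %s", glob_result_name(globret)); would make the log lines in the earlier debug output self-explanatory.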

(in reply to: ↑ 8 ) 07/24/08 15:39:15 changed by tedp

In addition to tedp:

Notably avahi-daemon uses the GLOB_ERR flag so glob will return the error you see if it finds any directories that it doesn't have permission to traverse.

Is there a reason to use the GLOB_ERR flag?

(in reply to: ↑ 8 ) 07/24/08 16:09:34 changed by mpfj

Replying to tedp:

You can set errno to zero before calling glob if you like, but it probably won't be particularly helpful.

But it did fix the problem!

Probably more helpful is to look at ls -alR /etc/avahi/services and see if there are any unusual permissions. Notably avahi-daemon uses the GLOB_ERR flag so glob will return the error you see if it finds any directories that it doesn't have permission to traverse.

Here's the output from "ls -alR /etc/avahi/services" ...

$ ls -alR /etc/avahi/services
/etc/avahi/services:
drwxr-xr-x    2 root     root         4096 Jul 24 11:21 .
drwxr-xr-x    3 root     root         4096 Jul 24 11:04 ..
-rw-r--r--    1 root     root          150 Jul 24 11:21 ssh.service
$

I don't see any reason why glob() should get a read error from that.