[avahi] Is it time to drop .local SOA heuristic?

Petr Menšík pemensik at redhat.com
Wed Dec 7 16:37:33 UTC 2022


Hello everyone.

We spent almost a whole day discussing how to fix issue [1], in which 
.local resolution does not work. It turned out the culprit was a 
unicast name server that serves a .local zone with an SOA record.

I think the current implementation is not the best one. I believe Apple 
products perform a similar check, but act on it differently [2]. If they 
find a .local SOA record, they first send the query to DNS and, if the 
name is not found there, fall back to mDNS. That seems to be the ideal 
way to solve it: DNS can respond quickly that no such name exists, 
unlike its multicast counterpart.
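
A minimal Python sketch of that ordering, based only on the description 
above and not on Apple's actual code. The function name and the 
query_dns / query_mdns callables are hypothetical stand-ins for real 
resolvers; each returns an address string, or None when the name does 
not exist.

```python
from typing import Callable, Optional

# Hypothetical resolver type: takes a host name, returns an address or None.
Resolver = Callable[[str], Optional[str]]


def resolve_local_name(name: str,
                       local_soa_in_dns: bool,
                       query_dns: Resolver,
                       query_mdns: Resolver) -> Optional[str]:
    """Try unicast DNS first when a .local SOA exists, then fall back to mDNS."""
    if local_soa_in_dns:
        # Unicast DNS can give a fast negative answer, so ask it first.
        address = query_dns(name)
        if address is not None:
            return address
    # Either there is no unicast .local zone, or DNS did not know the name.
    return query_mdns(name)


# Illustrative data only: dictionaries stand in for the two name sources.
dns_zone = {"intranet.local": "10.0.0.5"}
mdns_hosts = {"printer.local": "192.168.1.20"}

print(resolve_local_name("intranet.local", True, dns_zone.get, mdns_hosts.get))
print(resolve_local_name("printer.local", True, dns_zone.get, mdns_hosts.get))
```

With this order both the unicast name and the multicast name resolve, 
at the cost of one extra DNS round trip for names that exist only on 
mDNS.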

Unfortunately, the nss-mdns plugin just skips mDNS and never retries it 
after DNS. I see two problems there:

- the mdns plugin should not implement DNS queries itself, especially 
when parallel IPv4+IPv6 queries occur.
- the resolve plugin does not let anything pass on to later modules in 
current Fedora's configuration. So appending a fallback mdns plugin 
AFTER dns would not work, because nothing reaches it when 
systemd-resolved is active.

The question is whether in 2022, almost 10 years after the mDNS RFC was 
published, there are still deployed .local zones in unicast DNS holding 
real data. Does anyone know? Do we still want to avoid breaking .local 
zones in unicast DNS?

The best available solution we came up with is this:

- if a .local SOA is detected, do the mDNS query anyway
- if the host is not found via mDNS, return UNAVAIL to NSS as the 
result, which allows continuing to the next module, resolve or dns. We 
expect /etc/nsswitch.conf to contain something like hosts: ... 
mdns4_minimal [NOTFOUND=return] dns
- set a shorter timeout in this case, so that the DNS response is 
returned within 1 or 2 seconds. The current 5 seconds does not seem 
acceptable.
- (optional) move the .local SOA query to avahi-daemon, so it can be 
cached for at least a few seconds.

If no .local SOA is found, return NOTFOUND and stop further 
resolution.
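
The decision logic of this plan can be sketched in Python as follows. 
This is an illustration only, not the code from the draft: the names 
NssStatus, mdns_minimal_lookup and run_hosts_chain are hypothetical, the 
resolver callables are stand-ins, the shorter timeout is not modelled, 
and only .local names are considered. The chain walker assumes the 
nsswitch.conf line quoted above, where NOTFOUND stops the lookup and 
UNAVAIL falls through to the next module by default.

```python
from enum import Enum
from typing import Callable, Optional, Tuple

# Hypothetical resolver type: takes a host name, returns an address or None.
Resolver = Callable[[str], Optional[str]]


class NssStatus(Enum):
    SUCCESS = "SUCCESS"
    NOTFOUND = "NOTFOUND"
    UNAVAIL = "UNAVAIL"


def mdns_minimal_lookup(name: str,
                        local_soa_detected: bool,
                        query_mdns: Resolver) -> Tuple[NssStatus, Optional[str]]:
    """Proposed behaviour: always query mDNS, pick the status afterwards."""
    address = query_mdns(name)
    if address is not None:
        return NssStatus.SUCCESS, address
    if local_soa_detected:
        # A unicast .local zone exists, so let the next module try.
        return NssStatus.UNAVAIL, None
    # No unicast .local zone: the name does not exist anywhere.
    return NssStatus.NOTFOUND, None


def run_hosts_chain(name: str,
                    local_soa_detected: bool,
                    query_mdns: Resolver,
                    query_dns: Resolver) -> Optional[str]:
    """Model of `hosts: mdns4_minimal [NOTFOUND=return] dns` for .local names."""
    status, address = mdns_minimal_lookup(name, local_soa_detected, query_mdns)
    if status is NssStatus.SUCCESS:
        return address
    if status is NssStatus.NOTFOUND:
        return None  # [NOTFOUND=return] ends the lookup here
    return query_dns(name)  # UNAVAIL continues to the dns module


# Illustrative data only.
mdns_hosts = {"printer.local": "192.168.1.20"}
dns_zone = {"intranet.local": "10.0.0.5"}

print(run_hosts_chain("intranet.local", True, mdns_hosts.get, dns_zone.get))
print(run_hosts_chain("intranet.local", False, mdns_hosts.get, dns_zone.get))
```

In this model a name that exists only in unicast DNS resolves when the 
.local SOA was detected, and the lookup stops without asking DNS when 
it was not.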

An alternative would be to remove this test altogether and always do 
mDNS queries. If we had per-network configuration of mDNS, as 
systemd-resolved does, the test would not be necessary: you could just 
disable mDNS on networks where a unicast .local domain provides useful 
data.

What do you think about that plan? Does it sound reasonable? I think it 
would keep things usable even on legacy networks, without a dramatic 
regression. Yes, it would add delay to unicast *.local names, but 
otherwise they would keep working.

Any comments are welcome. I have drafted a change [3] to the nss-mdns 
plugin which keeps the timeout unchanged, but at least allows a DNS 
query after an unsuccessful mDNS query. If someone would like to test 
it, I would be grateful.

Regards,
Petr

1. https://bugzilla.redhat.com/show_bug.cgi?id=2148500
2. https://github.com/lathiat/nss-mdns/issues/75
3. https://github.com/lathiat/nss-mdns/pull/84

-- 
Petr Menšík
Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB


