DNS records set via 'system static-host-mapping' return NXDOMAIN from 'service dns forwarding' after a request to a forwarded zone
In progress, High, Public

Details

Difficulty level
Unknown (require assessment)
Version
-
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

jjakob triaged this task as High priority. Wed, May 20, 10:17 PM
jjakob created this task.
jjakob created this object in space S1 VyOS Public.
pasik added a subscriber: pasik. Thu, May 21, 9:29 AM
jack9603301 added a subscriber: jack9603301. Thu, May 21, 10:17 AM (edited)

via 'system static-host-mapping' return NXDOMAIN from 'service dns forwarding' after a request to a forwarded zone

Sorry, but I don't understand what this title means. Even as someone who considers himself experienced with authoritative DNS services, I can't work out the meaning of this sentence.

Does it mean that the request is wrongly forwarded to the recursive server or that NXDOMAIN is returned directly from the recursive forwarder of the router?

jjakob added a comment. Thu, May 21, 10:52 AM (edited)

The full description and steps to reproduce are at https://github.com/PowerDNS/pdns/issues/9136, since this is a pdns-recursor bug. In essence: after pdns-recursor starts or restarts, requests to pdns-recursor (service dns forwarding in VyOS) for a name from /etc/hosts work normally. Then a request comes in for any other domain that gets forwarded via forward-zones-recurse (service dns forwarding name-server), for example google.com. That request resolves without errors, but it triggers the bug. From then on, a request for any hostname from /etc/hosts returns NXDOMAIN.

The hostnames in system static-host-mapping are added by VyOS to /etc/hosts; pdns-recursor then picks up these entries via the config option export-etc-hosts=yes.
So the manifestation in VyOS is that entries in system static-host-mapping don't work, even though the configuration of pdns-recursor and /etc/hosts is correct.
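
The sequence described above can be sketched as a small script. This is only an illustration, not the reproducer from the upstream issue: it assumes `dig` is installed, that pdns-recursor listens on 127.0.0.1, and that `myhost.example` is a name present in /etc/hosts. All three are placeholders to adjust.

```python
import re
import subprocess

# Assumptions (not from the task): resolver address and the static hostname.
RESOLVER = "127.0.0.1"          # hypothetical listen address of pdns-recursor
STATIC_NAME = "myhost.example"  # hypothetical name present in /etc/hosts
FORWARDED_NAME = "google.com"   # any name resolved via forward-zones-recurse


def parse_status(dig_output: str) -> str:
    """Extract the response code (NOERROR, NXDOMAIN, ...) from dig's header."""
    match = re.search(r"status:\s*([A-Z]+)", dig_output)
    return match.group(1) if match else "UNKNOWN"


def query_status(name: str, resolver: str = RESOLVER) -> str:
    """Ask the resolver for an A record and return the response code."""
    result = subprocess.run(
        ["dig", f"@{resolver}", name, "A"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_status(result.stdout)


def reproduce() -> list:
    """Query the static name, then a forwarded name, then the static name again."""
    return [
        ("static, before", query_status(STATIC_NAME)),
        ("forwarded", query_status(FORWARDED_NAME)),
        ("static, after", query_status(STATIC_NAME)),
    ]
```

Calling `reproduce()` right after restarting pdns-recursor and printing the three pairs should show whether the last status differs from the first; on an affected system the description above suggests the last one would be NXDOMAIN.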

There is a workaround we could use: adding addNTA("zone", "A comment") to the pdns-recursor Lua config. It would require us to parse out all the zones from /etc/hosts, create a Lua config file, and write an addNTA line for each zone into it. Not that simple, but possible if PowerDNS doesn't fix it quickly.
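
A minimal sketch of that workaround, to show the shape of it. This is not the VyOS implementation: the function name `hosts_to_nta_lua` is made up for illustration, and it assumes that every name listed in /etc/hosts can simply be used as the "zone" argument of addNTA, which may be broader or narrower than what is really needed.

```python
def hosts_to_nta_lua(hosts_text: str) -> str:
    """Build Lua config lines with one addNTA() call per name in hosts_text.

    Assumption: each hostname or alias is used directly as the NTA zone.
    """
    names = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the IP address, the rest are hostnames and aliases.
        for name in line.split()[1:]:
            name = name.rstrip(".").lower()
            if name and name not in names:
                names.append(name)
    return "".join(
        f'addNTA("{name}", "name from /etc/hosts")\n' for name in names
    )
```

The returned text could be written to a file that pdns-recursor loads through its lua-config-file setting; where that file lives and how the recursor gets reloaded is left out here.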

Also, this is reproducible with pdns-recursor from upstream master (4.4.0) so upgrading won't help.

jack9603301 added a comment. Thu, May 21, 11:00 AM (edited)

So you mean that once pdns-recursor has forwarded a query to the upstream recursive resolver for the first time, queries for the static entries in /etc/hosts will always return NXDOMAIN?

If I have misunderstood, please correct me.

You mean that when pdns-recursor forwards a request to the upstream recursive resolver, queries for the static entries in /etc/hosts will always return NXDOMAIN?

Yes. Any request that comes in that gets forwarded will trigger the bug. It's unavoidable. For me it works fine for a few seconds (10s at most) as there are queries getting forwarded constantly.

Alas, that's a really troublesome problem. I haven't used pdns-recursor myself; I usually use ISC BIND. If it is a bug, though, I have a different solution from the one you proposed: maintain our own branch of the open source code, find the faulty code, and patch it. @jjakob

I can summarize the following solutions, and maybe there are others:
a) Fix the bug ourselves
b) Bypass it by storing the records in some other mechanism
c) Parse the hosts file ourselves

If you mean we should maintain our own fork of PowerDNS, I'm against that. PowerDNS is open source and anyone can submit patches to it, the same as with VyOS. If you want to try fixing the bug in pdns-recursor, you can clone pdns, debug it, build it, test it and submit the patch at https://github.com/PowerDNS/pdns . Of course, you have to abide by their contribution guidelines, which are listed there. They also have an IRC channel at OFTC #powerdns .

jjakob added a comment. Thu, May 21, 11:40 AM (edited)

I can summarize the following solutions, and maybe there are others:
a) Fix the bug ourselves
b) Bypass it by storing the records in some other mechanism
c) Parse the hosts file ourselves

a) would be very hard for someone unfamiliar with the PowerDNS codebase. You can try.
b) Not possible (edit: I misunderstood - it would be possible to use zone files instead of /etc/hosts, but there would be a considerable amount of work to rewrite vyos-hostsd to generate them)
c) The workaround I mentioned above is possible: add a lua-config-file with an addNTA entry for each zone in /etc/hosts. This covers not just the domains from system static-host-mapping, but those from DHCP hostfile-update too. It would probably be done in vyos-hostsd, as well as conf-mode/host-name.py. I don't have time to implement it now, but you're welcome to take it on if you wish.

@jjakob I'm sorry, but I think you may have misunderstood me. I was only summarizing the ways the problem could be solved at present. Of course, any patch can eventually be submitted to PowerDNS. For now, though, dealing with the immediate problem is probably the first priority, and there are only two main ways to do that: fix it or bypass it.

vyos@vyos:~$ dpkg -l | grep pdns
ii  pdns-recursor                    4.2.1-1pdns.buster                  amd64        PowerDNS Recursor

The latest release of PowerDNS Recursor is 4.3.0.

Although I would like to try fixing it, the best first step seems to be upgrading to the latest stable version. From a version-management perspective, a newer version often fixes existing bugs, while a stable release has had enough testing to avoid 0-days.

c-po added a subscriber: c-po. Thu, May 21, 12:28 PM

The latest rolling release runs PowerDNS Recursor 4.3 (T2470).

In T2486#64335, @jjakob wrote:

Also, this is reproducible with pdns-recursor from upstream master (4.4.0) so upgrading won't help.

jjakob changed the task status from Open to In progress. Fri, May 22, 2:41 PM
jjakob claimed this task.
jjakob moved this task from Need Triage to In Progress on the VyOS 1.3 Equuleus board.

Upstream says this is normal behavior when DNSSEC is enabled, so the workaround that I'm working on (addNTA) is actually the proper fix.