I am very happy to report that the issue is resolved. The router now boots up fully without intervention once again.
Aug 4 2020
Awesome! That's really quick turnaround! I'll give it a try when the newer build appears.
Entering configure mode and then typing load and then commit brings everything up to what the config in config.boot specifies, and the running configuration shows the correct contents for eth1. It brings the router up to where it should have been at boot.
Aug 3 2020
Unfortunately, the problem does not appear to be fixed in the latest rolling build, vyos-1.3-rolling-202008031923-amd64.iso.
That sounds hopeful! I will try it in a few hours and report back.
This configuration replicates the error:
OK, I take that back. Even when the interface with prefix-delegation defined is dead last, it still has this error. The last IP address is not configured, although /run/dhcp6c/dhcp6c.ethN.conf is correctly created, and DHCPv6-PD works. But config parsing is broken, and the entire config node is missing when I query for it.
It's actually a little worse than I'd initially realized; the interface that DHCPv6-PD is being requested on (the interface with the prefix-delegation stanza) has to be the very last interface, even if it doesn't refer to the ones after it. So, even if I don't delegate any addresses from eth2 to eth3, it still fails. I have to do the delegation from interface eth3 on a system with four ethernet interfaces. And preliminary testing suggests that it has to be eth5 on a system with 6, even if eth4 and eth5 aren't configured at all.
Jul 29 2020
I should add that this problem has existed for at least a couple months, right up until 1.3-rolling-202007241919. Rolling builds after that one appear to ignore the prefix-delegation configuration entirely (T2740), so they don't exhibit this problem.
Jun 4 2020
@dsummers I have been able to get the current nightly builds to work on Comcast Business, which is delivered via ethernet. In this particular case, there are some unfortunate gotchas to keep in mind, but no modification of VyOS is currently needed, at least in my case. Very cool!
May 28 2020
@c-po if the interface dependency system that @jjakob describes works as I might imagine, then perhaps it's just a matter of adding the interfaces that appear in prefix-delegation configs to the dependency lists. (There would be some subtleties dealing with things like vifs within an interface, but that can be sorted out.)
Recovery from failures does seem generally desirable, but it would also be preferable to discover errors in configuration while in conf code. For this reason, it seems like the best way to handle this would be to defer starting dhcp6c until the very end of configuring all the interfaces, if that's possible. Is there a mechanism already to do this, or should I look into restructuring things slightly?
Something else I realized last night: In general, it's not safe to start dhcp6c before all interfaces are configured, as long as PD is specified (whether or not 'address dhcpv6' is specified). That's because the prefix-delegation stanza can refer to any other interface on the system, even ones that haven't been set up yet. That might include vif interfaces (such as I noticed last night) or any other virtual interface, like br or tun.
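To make the ordering concern concrete, here is a minimal sketch of the check I have in mind. The data model is an assumption for illustration only: `pd_targets` is a hypothetical key listing the interfaces a prefix-delegation stanza refers to, and neither function name comes from the VyOS code.

```python
def pd_dependencies(interfaces):
    """Map each PD-requesting interface to the interfaces its delegation refers to.

    `interfaces` is a hypothetical dict: name -> config dict, where the
    optional "pd_targets" list names the interfaces that receive subnets.
    """
    return {
        name: set(cfg["pd_targets"])
        for name, cfg in interfaces.items()
        if cfg.get("pd_targets")
    }


def can_start_dhcp6c(ifname, interfaces, configured):
    """Return True only if every interface referenced by ifname's
    prefix-delegation config has already been configured."""
    deps = pd_dependencies(interfaces).get(ifname, set())
    return deps <= set(configured)
```

With a check like this, the client on the requesting interface would only be started once vifs, bridges, and tunnels named in its delegation exist, which amounts to deferring it to the end of interface configuration.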
@jjakob Yes, exactly my thoughts, and what my last pull request starts. I'll try to catch the remaining cases later this evening my time (in 12 hours or so). I can imagine one case that might be a little tricky.
Ah, I think I see what you, @tbr and @jjakob are getting at. If I want to do DHCPv6-PD, then I need to start the daemon. But if I don't want an address using the protocol, then I can explicitly turn it off.
@jjakob Yes, I tried dhcpv6-options parameters-only; it had no effect. I did not try 'address dhcpv6' simply because that doesn't seem like a great configuration. It would have been worth testing anyway, but it would be a pretty awkward and unexpected workaround for regular users to think of.
I have sent a pull request: https://github.com/vyos/vyos-1x/pull/437
Aha! Thanks, @c-po. As I suspected, there is an assumption in ifconfig/interface.py that the DHCPv6 client need not be started if we're not getting our address via DHCPv6, which of course is not the case, here.
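As a rough illustration of the condition I think is needed, here is a minimal sketch. It is not the actual code in ifconfig/interface.py; the function name and the `address` and `prefix_delegation` keys are assumptions made up for this example.

```python
def dhcp6c_plan(ifconfig):
    """Decide whether the DHCPv6 client should run for one interface.

    `ifconfig` is a hypothetical dict with an optional "address" list and
    an optional "prefix_delegation" entry.
    """
    wants_address = "dhcpv6" in ifconfig.get("address", [])
    wants_prefix = bool(ifconfig.get("prefix_delegation"))
    return {
        # The client is needed for either purpose, not only for addresses.
        "start_client": wants_address or wants_prefix,
        "request_address": wants_address,
        "request_prefix": wants_prefix,
    }
```

The point is only that requesting a delegated prefix should be enough to start the client, independently of whether the interface takes its own address from DHCPv6.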
I'm sorry to say that with the current rolling release, still no dhcp6c is started when I add the PD configuration. Here is my test config:
May 27 2020
I included all the bits I thought were relevant in T421#65109 (you'll need to click "Show older changes" at this point to see it).
May 26 2020
Thanks for the response.
Hey, @c-po, thanks for getting this moving again. As you may recall, I did a lot of development work on this some time ago that I never pushed to get merged. (I dropped the ball.) Unfortunately, you've had to re-do a lot of what I did, I'm sure. I'm happy to incorporate any of that past work or do some fresh development to polish this feature up.
I'm sorry, to be honest, I don't understand this sentence. Maybe I don't understand the network in the United States. In China, ISPs use PPPoE for the last kilometer of user authentication.
That's great, and necessary. But once that's done (I've done it in my copy), I seem to be missing the next step: it's not starting a DHCPv6 client. What is supposed to do this?
Essentially, that means that nothing happened. What you want prefix delegation to do on the client side (which is what we're talking about) is request a delegated prefix from the upstream router via DHCPv6, and then divide the delegated prefix into site-level aggregations and assign the resulting network numbers (delegated prefix + sla id) to other interfaces. None of that happened.
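For anyone following along, the "delegated prefix + sla id" arithmetic can be sketched with Python's standard ipaddress module. This is only an illustration under my own assumptions: the function name, the parameter names, and the documentation prefix 2001:db8:1200::/56 are made up for the example.

```python
import ipaddress


def delegated_subnet(delegated_prefix, sla_id, sla_len):
    """Combine a delegated prefix with an SLA ID to get one interface's network.

    sla_len is the number of bits appended to the delegated prefix,
    e.g. 8 bits turn a /56 delegation into /64 networks.
    """
    prefix = ipaddress.IPv6Network(delegated_prefix)
    new_len = prefix.prefixlen + sla_len
    if new_len > 128:
        raise ValueError("sla_len does not fit in an IPv6 address")
    if not 0 <= sla_id < (1 << sla_len):
        raise ValueError("sla_id does not fit in sla_len bits")
    # Place the SLA ID in the bits directly after the delegated prefix.
    base = int(prefix.network_address) | (sla_id << (128 - new_len))
    return ipaddress.IPv6Network((base, new_len))
```

For example, with a /56 delegation and 8 SLA bits, SLA ID 1 selects the second /64 inside the delegated prefix, and that network is what should end up on the downstream interface.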
Hmm. So, correcting that error did allow the configuration to apply and save successfully. However, no prefix delegation request appears to be sent and no configuration of delegated interfaces appears to be done. Indeed, no dhcp client appears to be running.
May 25 2020
Ah! Yes, I see what you're referring to. I'll take a look at that in a few hours.
I'm trying to catch up on the comments. There are a lot.
Oh, wait, is this done, and completely working, now? I'm still trying to catch up on this long thread. If it's all working, now, I'll just have to try the latest build and see if it works.
Hello! I apologize for being absent for so long, now. But I'm back, and ready to contribute again. It looks like I have a lot to catch up on.
Feb 28 2019
Hmmm. I'll try to help you debug this issue offline (look for my message).
I've reproduced the hang you described on a test router. It looks like this:
Hmmm. My next guess would be that you could be inadvertently blocking neighbor discovery (which happens on the link-local addresses). Can you try turning off the IPv6 firewall long enough to test whether it's the firewall at all? Another thing to try would be tcpdump on the LAN and WAN interfaces, as well as putting another machine (like a laptop) on the link between the two routers, to see where the packets are appearing and not appearing. You should see lots of link-local traffic between the LAN hosts and the VyOS router, as well as between the two routers.
Feb 27 2019
Ah, interesting. I'll see if I can reproduce the address dhcpv6 problem.
Feb 26 2019
Yeah, I still have a bit of debug output in there. Easy enough to remove.
Great! The one I built last night is still available here: http://www.avernus.com/~gadams/vyos-crux.201902250834.dhcpv6pd-amd64.iso
Indeed, this is on Comcast Business. At least they're consistent in their oddity, eh?
Great! I'll look forward to hearing how it works for you.
Feb 25 2019
I've learned a lot about building ISO images over the past couple weeks. I have a first version of my change ready; it's in two commits currently:
I'm a little confused about the status of this task.
Feb 17 2019
Oy. Turns out that other routers (like those found in ISP-provided cable modems) can have subtle quirks in their DHCPv6-PD implementations. I've worked around another one.
Feb 13 2019
A quick progress update: I have fixed a bug (that may or may not have been present before) that prevented renew dhcpv6 interface eth3 (or whatever interface) from working outside of an active configuration session. I imagine most uses of dhcp lease renewal would occur in a normal router login session.
Feb 10 2019
Well, I've been developing it against the 1.2.0 branch. It might work back-ported to 1.1.8, but that's not my focus.
Hey! Sorry I was out for a bit. But I'm back now. Time to catch up.
Dec 3 2018
@aaliddell No worries! It was a really easy fix. :)
Nov 29 2018
Did there turn out to be a problem with this fix? I definitely need it in my environment, where I run dhclient on eth3 or br0, and eth0 never gets an IPv6 address (link local or otherwise).
Nov 28 2018
I have now implemented the syntax I described above. There are still some edge cases, mostly because of the fact that dhclient is started in a whole bunch of places, and making it all consistent is tricky. Perhaps refactoring /opt/vyatta/sbin/vyatta-dhcpv6-client.pl (probably rewriting it in Python) is in order. I may not do that right now, though.
OK! I'm happy to say that I have prefix delegation working with ISC dhclient, now, using a dhclient exit hook to collect the delegated prefix and farm out chunks of it to local interfaces. Now I'm thinking about the configuration syntax.
Nov 27 2018
Hey, laziness is a programmer virtue, remember!
Aha! Looking into it a bit more, the hook scripts are given the environment variables new_ip6_prefix and old_ip6_prefix, so that's where we should get the delegated prefix (and remove an old one, as appropriate). So, all we need to do is add some configuration settings to request PD and to indicate a subnet number within the delegated prefix to assign out to any desired interfaces. Then, it's a simple matter of exit-hook scripting to set this all up.
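Here is a rough sketch of the decision logic such a hook would need, written in Python rather than shell so it is easy to test. Only the variable names new_ip6_prefix and old_ip6_prefix come from dhclient; the function names, the `sla_ids` mapping of interface to subnet number, and the fixed /64 subnet length are assumptions for illustration.

```python
import ipaddress


def carve(prefix_str, sla_ids, subnet_len=64):
    """Return the set of (interface, network) pairs carved from one prefix."""
    if not prefix_str:
        return set()
    prefix = ipaddress.IPv6Network(prefix_str)
    sla_bits = subnet_len - prefix.prefixlen
    if sla_bits < 0:
        raise ValueError("delegated prefix is longer than the subnet length")
    subnets = set()
    for ifname, sla_id in sla_ids.items():
        if not 0 <= sla_id < (1 << sla_bits):
            raise ValueError(f"subnet number {sla_id} does not fit in {prefix}")
        base = int(prefix.network_address) | (sla_id << (128 - subnet_len))
        subnets.add((ifname, ipaddress.IPv6Network((base, subnet_len))))
    return subnets


def plan_prefix_changes(env, sla_ids):
    """Compare old and new delegated prefixes; return (to_remove, to_add)."""
    old = carve(env.get("old_ip6_prefix"), sla_ids)
    new = carve(env.get("new_ip6_prefix"), sla_ids)
    return sorted(old - new), sorted(new - old)
```

In a real hook, `env` would be the process environment (`os.environ`), and the two returned lists would drive removing the stale networks from the interfaces and adding the new ones. A renewal that returns the same prefix yields two empty lists, so nothing is touched.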
Nov 26 2018
That's very interesting. Thanks for sharing.
Nov 8 2018
I have sent a pull request that adds support for outbound IPv6 queries, implementing what I described above: https://github.com/vyos/vyos-1x/pull/58.
Yes, that change works. I'll look forward to it appearing in an RC. :)
Aha! I have figured out what causes pdns-recursor not to answer requests on its IPv6 sockets, even though it binds to them. It's the allow-from setting. If I change it from:
then everything works.
Nov 7 2018
The plot thickens, however. According to netstat -an | grep :53, it is listening on the IPv6 addresses specified.
Yes, I concur that keeping just listen-address for both address types would definitely be preferable, and we should just distinguish between them when building the config, if needed.
Oct 17 2018
A long time ago (before Oct 2016) I built Roy Marples' dhcpcd and hacked /opt/vyatta/etc/config/scripts/vyatta-postconfig-bootup.script to install, configure, and start it up. I've been running with this config for over two years, and it's pretty stable. I'd love for this to be built into VyOS, rather than a local config hack.
Oct 29 2016
Hmm. Things are afoot.
Oct 7 2016
Recent dev builds on the current (lithium) branch don't need to be told which port is the console; systemd is able to figure it out, and spawns the correct getty processes.
Oct 6 2016
I've written a handy script to start ntpd manually:
I tried adding this to /config/scripts/vyatta-postconfig-bootup.script:
This hack does work, but it only lasts until you reboot VyOS. When the OS comes back up, you'll need to do this again.
Sep 21 2016
I have sent a pull request.
Sep 16 2016
Sep 8 2016
Most likely postinst, but I can't find that file in the git repos.
Aha! I've tried 999.201609070235 (current). Things look quite a bit better; /opt/vyatta/etc/config/scripts/vyatta-postconfig-bootup.script is now persisted, and things seem to start up and run quite nicely.
It should be safe to start a getty on ttyS1 (in addition to the one on ttyS0) for all devices, shouldn't it? Even on devices that don't have a ttyS1 (or even a ttyS0), that shouldn't cause any failures.
Aha! I think I have found the cause. In vyatta-boot-image.pl, there is this code:
Ooh. I see that the script that copies over the ssh keys is vyatta-cfg-system/scripts/install/install-image-existing, but it's run on the old system--the one you're upgrading from. So putting the fix in there would require upgrading the old OS first.
Aug 10 2016
Unfortunately, I'm traveling right now, so I'll have to try out a newer image and give you the output from 'sudo blkid' in three weeks. I'll look forward to some good progress when I return!