I still think a failure recovery mechanism needs to exist, but I agree with you: we should postpone starting dhcp6c until all interfaces have been initialized, and then run the dhcp6c processing in one pass.
May 28 2020
Recovery from failures does seem generally desirable, but it would also be preferable to discover configuration errors while in the conf code. For this reason, it seems like the best way to handle this would be to defer starting dhcp6c until the very end of configuring all the interfaces, if that's possible. Is there a mechanism already to do this, or should I look into restructuring things slightly?
Please merge this fix.
The fix takes effect in tests in my local environment.
@gadams Yes, I thought that since the systemctl automatic restart failed, I might need to write a script to perform the automatic recovery. Now that doesn't seem necessary. I will modify the service file instead.
OK, I have found the best recovery approach and will submit a PR immediately. I will modify the systemd service settings and use its restart-on-failure behavior to fix the problem. When the dhcp6c service fails to start, it will be restarted according to the preset settings.
Something else I realized last night: In general, it's not safe to start dhcp6c before all interfaces are configured, as long as PD is specified (whether or not 'address dhcpv6' is specified). That's because the prefix-delegation stanza can refer to any other interface on the system, even ones that haven't been set up yet. That might include vif interfaces (such as I noticed last night) or any other virtual interface, like br or tun.
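To illustrate the ordering problem described above, here is a minimal sketch in Python. It is not taken from vyos-1x; the function names and the shape of the `pd_targets` mapping are hypothetical. It only shows the idea of checking that every interface referenced by a prefix-delegation stanza is already configured before dhcp6c is started for the upstream interface.

```python
from typing import Dict, List, Set


def missing_pd_targets(configured: Set[str],
                       pd_targets: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """For each upstream interface, list the PD target interfaces not yet configured.

    `pd_targets` is a hypothetical mapping: upstream interface -> interfaces
    named in its prefix-delegation stanza. An empty list means nothing is missing.
    """
    return {
        upstream: [t for t in targets if t not in configured]
        for upstream, targets in pd_targets.items()
    }


def ready_to_start(configured: Set[str],
                   pd_targets: Dict[str, List[str]]) -> List[str]:
    """Return the upstream interfaces for which dhcp6c could be started now."""
    missing = missing_pd_targets(configured, pd_targets)
    return sorted(up for up, absent in missing.items() if not absent)


# Example: pppoe0 delegates to eth1 and to the vif eth1.10.
pd = {"pppoe0": ["eth1", "eth1.10"]}
early = ready_to_start({"pppoe0", "eth1"}, pd)            # vif not set up yet
late = ready_to_start({"pppoe0", "eth1", "eth1.10"}, pd)  # everything configured
```

Running the check once, after the last interface has been configured, is the simplest way to get the "defer until the very end" behaviour discussed in this thread.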
@gadams it makes no sense to use this as a catch-all thread. New requests/bugs should go into dedicated tasks.
@jjakob Yes, exactly my thoughts, and what my last pull request starts. I'll try to catch the remaining cases later this evening my time (in 12 hours or so). I can imagine one case that might be a little tricky.
Yes there have been issues with interface naming in the past. Hopefully they are finally resolved in 1.3 now.
In general, there are several solutions:
a) Add a CLI option for automatically repairing daemons, and rely on cron to run the repair program, so that a failed service is restarted automatically
b) Find a way to solve the underlying problem thoroughly
Generally, I prefer a + b, so that when the service fails to start on a single attempt, the daemons can still recover.
But it's just an idea. If you have any other suggestions, please let me know.
@zsdc can you try to reproduce this issue on 1.3 rollings or on 1.2.5? I can't reach this behavior.
PR added https://github.com/vyos/vyatta-cfg-vpn/pull/33.
vyos@vyos# commit
[ vpn ]
Warning: local prefix 192.168.34.0/24 specified for peer "192.168.50.2" is not configured on any interfaces
PR https://github.com/vyos/vyatta-cfg-quagga/pull/48
Also added the second commit which fixes the path to zebra daemon
I tried manually modifying the contents of /etc/ppp/ipv6-up.d/1000-vyos-pppoe-pppoe0, changing the following commands:
@gadams I agree it's confusing. Changing the syntax isn't hard; we just have to choose the most user-friendly syntax and behavior. It can even be accomplished without changing the syntax, by:
a) if 'dhcpv6-options delegate' is set, do the same as for 'parameters-only', plus start dhclient by add_addr('dhcpv6')
or
b) start dhclient if either 'dhcpv6-parameters' or 'address dhcpv6' is set, but only assign an address in the 'address dhcpv6' case. This may be the simplest option.
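Option b) above can be sketched as a small decision function. This is only an illustration of the proposed behaviour, not code from vyos-1x; the names `Dhcpv6Decision` and `decide_option_b` are hypothetical, and the two booleans stand for whether the corresponding config nodes are set.

```python
from dataclasses import dataclass


@dataclass
class Dhcpv6Decision:
    start_client: bool    # should the DHCPv6 client be started at all?
    assign_address: bool  # should it request an address for the interface?


def decide_option_b(address_dhcpv6: bool, dhcpv6_parameters: bool) -> Dhcpv6Decision:
    """Start the client if either node is set; assign an address only for 'address dhcpv6'."""
    return Dhcpv6Decision(
        start_client=address_dhcpv6 or dhcpv6_parameters,
        assign_address=address_dhcpv6,
    )


# Only the options node is set: client runs, no address is assigned.
pd_only = decide_option_b(address_dhcpv6=False, dhcpv6_parameters=True)
```

The appeal of this variant is that the existing syntax stays untouched and the two concerns (running the client, assigning an address) are decided independently.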
Reworking my config so that it uses eth0->eth3 instead of eth1->eth4 makes everything work as expected. So something has clearly changed regarding the interface naming/creation logic.
@jjakob Yes, I tried dhcpv6-options parameters-only; it had no effect. I did not try 'address dhcpv6' simply because that doesn't seem like a great configuration. It would have been worth testing anyway, but it would be a pretty awkward and unexpected workaround for regular users to think of.
@gadams have you tried the above 2 settings: 'address dhcpv6' and 'dhcpv6-options parameters-only' without your patch to see if the client doesn't assign an address in that case?
I have sent a pull request: https://github.com/vyos/vyos-1x/pull/437
@tbr thanks for clarifying that, I agree. So the way to do that would be to set 'address dhcpv6' and 'dhcpv6-options parameters-only'. That is slightly confusing at first, as the combination of those 2 options shouldn't actually assign an address. I haven't tried it, as I don't use PD currently, but that's how I expect it should work. If it does work, my comments regarding new methods in scripts are entirely unneeded.
This is difficult to solve with the current config syntax where the bond and bridge members are under the bonding and bridge nodes. When modifying bond or bridge members, only interfaces-bonding.py or interfaces-bridge.py is called, which can't modify the interfaces themselves, as all the interface logic (adding and deleting addresses) is in the interface script itself (e.g. interfaces-ethernet.py). What determines which script to run is the owner attribute of the interface node, which is run by the vyatta-cfg scripts on commit.
Just my input: It seems there is some confusion about what DHCPv6-PD actually is. And this reflects in the endless discussions and questions here in the thread...
Maybe we should add new methods to the Interface or DHCP class to allow starting just DHCPv6-PD without assigning an address to it? The way it's done now is by assigning an address with the value "dhcpv6" to the interface through the add/del_addr methods of Interface class. There needs to be a separate method for DHCPv6-PD without addressing (and generate a dhcpc config that doesn't assign the address, of course).
Aha! Thanks, @c-po. As I suspected, there is an assumption in ifconfig/interface.py that the DHCPv6 client need not be started if we're not getting our address via DHCPv6, which of course is not the case, here.
@gadams In my environment, oddly enough, after a reboot I queried dhcp6c@pppoe0 and found that the service only runs once it has been restarted manually, either by restarting pppoe0 with the VyOS command or with systemctl. Before that, the service was in a failed state and the Restart field of the service had no effect. I cannot get it to start normally while the configuration is loaded after a system restart.
I'm sorry to say that with the current rolling release, still no dhcp6c is started when I add the PD configuration. Here is my test config:
May 27 2020
Test package for 'vyos-api-tools' here:
https://github.com/jestabro/vyos-api-tools
During testing I've found that a well-known problem (one we had with ethernet interfaces) also affects serial ports. They can be enumerated and mapped to /dev/ttyUSBxxx differently from boot to boot. This is especially painful on my development APU4 board, which also has a Sierra Wireless MC7710 LTE module installed that operates via ttyUSB2 (when no serial console cable is attached). On subsequent boots this can become ttyUSB3 or another index, depending on the number of FT232 dongles I attach.
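One common way around unstable ttyUSB numbering on Linux is to address ports through the persistent symlinks that udev creates under /dev/serial/by-id. The sketch below is only an illustration and not necessarily the approach VyOS took; the helper name `stable_serial_map` is hypothetical.

```python
import os
from typing import Dict


def stable_serial_map(by_id_dir: str = "/dev/serial/by-id") -> Dict[str, str]:
    """Map each persistent by-id name to the device node it currently points at.

    The by-id names are derived from device identity (vendor, model, serial),
    so they stay the same even when the ttyUSBx index changes between boots.
    """
    mapping: Dict[str, str] = {}
    if not os.path.isdir(by_id_dir):
        return mapping  # no USB serial devices present, or not a udev system
    for name in sorted(os.listdir(by_id_dir)):
        mapping[name] = os.path.realpath(os.path.join(by_id_dir, name))
    return mapping


if __name__ == "__main__":
    for name, node in stable_serial_map().items():
        print(f"{name} -> {node}")
```

Configuration that refers to the by-id name rather than to ttyUSB2 keeps working regardless of how many other adapters are plugged in.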
Dependency dropped.
The dependency on flask-restx was dropped in favor of FastAPI. The move to Flask itself for stability was completed.
The dependencies, such as FastAPI, are to be collected in the debian package 'vyos-api-tools'. Screenshot of the redoc page below:
@c-po It has to be said that this is really a troublesome problem. I'm still stuck in fault exploration, rather than making patches. Therefore, there may be some assumptions and verification processes, but it's really troublesome. When the system is restarted, the automatic operation of the service will fail, and I have to restart the service manually.
@c-po Strangely, after rebooting the system, I queried dhcp6c@pppoe0 and found that the service only runs once it has been restarted manually, either by restarting pppoe0 with the VyOS command or with systemctl. Before that, the service was in a failed state, and the Restart field you added to the service didn't work. I can't get it to start normally while the configuration is loaded after a system restart.
@gadams your mentioned problem is already fixed in the latest rolling image
@jack9603301 your assumptions are invalid. I have a fully reboot-safe PPPoE setup. Please stop making wrong assumptions and search the code properly!
Bug with FRRouting 7.3.1
@c-po Trace the problem that the system cannot automatically perform prefix delegation after restarting:
ps afx shows frr processes still running. systemctl stop vyos-router; pkill -f '*.frr*.'; systemctl start vyos-router makes vyos-router start successfully, but with these errors:
@c-po I couldn't find where wide-dhcp is invoked. I intend to analyze why prefix allocation does not start automatically after a restart in the 20200523 image.
Additional information, it also appears to be broken in 1.2.5 (self built image) - seems to be the same problem.
'show ip route' without a prefix doesn't work.
VyOS added those during the 2nd boot after upgrade. I would assume bug related to the fact that my config doesn't include eth0.
set policy community-list COM01 rule 10 action permit
set policy community-list COM01 rule 10 regex 65001:0
commit
[edit]
vyos@r-roll#
commit without errors
vtysh -c "show run"
!
bgp community-list expanded COM01 permit 65001:0
!
Why does hw-id appear twice for eth1, eth2 and eth3?
In T2523#65310, @c-po wrote: Why is there no eth0 on VyOS 1.2.5?
I included all the bits I thought were relevant in T421#65109 (you'll need to click "Show older changes" at this point to see it).
@gadams, please describe a use case where wide does not start, and include the config with the expected result and the VyOS version. Sure, the config can be adjusted; luckily it's an open Git repo, so just send a PR.
@c-po @gadams In 20200523 I found several bugs, although they do not affect my use too much (it can be restarted, anyway). For example, when pppoe is restarted many times, the prefix is reassigned, but the old prefix that was previously assigned is not deleted. Although I can keep it working, I still don't know how to solve it. I wonder whether restarting the downstream interface whenever the prefix allocation request is restarted would solve the problem.
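To make the stale-prefix symptom above concrete, here is a small sketch using Python's standard ipaddress module. It is an assumption about how a cleanup step could identify leftovers, not existing VyOS code; the function name and the example addresses (taken from the 2001:db8::/32 documentation range) are hypothetical.

```python
import ipaddress
from typing import List


def stale_addresses(assigned: List[str], delegated: List[str]) -> List[str]:
    """Return assigned interface addresses not covered by any current delegation.

    `assigned` holds only the addresses that were derived from a delegated
    prefix (link-local and static addresses should not be passed in).
    """
    nets = [ipaddress.ip_network(p) for p in delegated]
    stale = []
    for addr in assigned:
        iface = ipaddress.ip_interface(addr)
        covered = any(
            net.version == iface.version and iface.network.subnet_of(net)
            for net in nets
        )
        if not covered:
            stale.append(addr)
    return stale


# After a PPPoE reconnect the delegation changed from 2001:db8:1::/56 to 2001:db8:2::/56.
leftover = stale_addresses(
    ["2001:db8:1:1::1/64", "2001:db8:2:1::1/64"],
    ["2001:db8:2::/56"],
)
```

Whatever is returned here would be a candidate for deletion from the interface when a new delegation arrives.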
I will track and test whether the dhcpv6-pd is functioning properly to see if the previous problems still exist in my environment.
May 26 2020
Thanks for the response.
dhclient is not used, wide-dhcp client is started on demand. Also prefixes are properly assigned to interfaces, using this at home for pppoe. Specific prefix size request is implemented as of T2506.
Hey, @c-po, thanks for getting this moving again. As you may recall, I did a lot of development work on this some time ago that I never pushed to get merged. (I dropped the ball.) Unfortunately, you've had to re-do a lot of what I did, I'm sure. I'm happy to incorporate any of that past work or do some fresh development to polish this feature up.
Windows 10 works with SLAAC like a charm.
I tried mocking with your configuration and thus needed to delete the policy statement as I have no policy installed. Maybe you can boot your system with the vyos-config-debug option and share the output? Or send the full config.boot.
Why is there no eth0 on VyOS 1.2.5?
Successfully tested on 1.3-rolling-202005261512, propose to backport it to CRUX.
vyos@r-roll:~$ show ip
Invalid command: show [ip]
Need to remove 2 strings from the Makefile
Does anyone have some idea on how to test with different kernels? For now this is a deal breaker while using the 1.3.x branch. Tho I would really love to keep using bleeding edge in order to help testing things :-)
One other question -- looks like SLAAC is working with the router-advert, but I believe Windows clients need DHCPv6 to receive addresses. I believe that the auto-config of DHCPv6 based on assigned prefix is not included yet, correct? Is that the 'Assisted' or 'Managed' modes that pfSense/Opnsense has?
Thank you @c-po. It's very good. I fully understand you. Like you, most of the open-source contributors are amateurs. I don't mean anything else. I think you may have misunderstood my remarks. Please don't be too sensitive. Anyway, thank you. In addition, you can take a look at the comments above and the two questions I raised. I don't know how to solve it for the time being. If you or someone else has a good solution, thank you.
Thanks again
@jack9603301 please note that this is currently a beta implementation which of course contains bugs. Also the CLI will change in the near future to support requesting specific prefix sizes (T2506)
The ethernet TypeError is fixed in the upcoming rolling release