Thanks much @hagbard . I will test it out and update.
Apr 29 2019
Apr 28 2019
The next rolling will have the fixed version.
Apr 27 2019
Hi Daniil,
Apr 26 2019
Fixed in commits:
@jestabro I fixed the indent on both current and crux, thanks for reporting.
Apr 25 2019
Yes, it's a pretty vague bug, and whether it works or not seems to depend on how the VM was initially created.
I had mixed results with the dhclient part, but that's not the major issue, and the April 25 ISO should have the refactored script on board. I see exactly the same issues with netplug that you see; I went a few rolling ISOs back to test with, but couldn't determine yet when it started. Even an ip link set up dev <device> doesn't bring the interface back up. netplug gets the status information via the netlink interface from the kernel, so I'm going to start looking there to see if anything has changed. Going forward, I think systemd-networkd will be the successor sooner or later anyway; I've got to play around with it at some point. It usually monitors the interfaces via netlink as well, but has more filters and rules you can apply.
I'm still not too sure why it sometimes breaks and sometimes works; I didn't find anything useful in the log either, only the information we already have.
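The netlink mechanism mentioned above (the kernel interface netplug uses to learn about link-state changes) can be sketched in a few lines of Python. This is an illustrative sketch only, not netplug's or systemd-networkd's actual code; the RTMGRP_LINK constant value comes from <linux/rtnetlink.h>, and the snippet is Linux-specific:

```python
import socket

RTMGRP_LINK = 1  # rtnetlink multicast group for link (interface) state changes


def open_link_monitor():
    # Open a raw netlink socket and subscribe to link-state notifications.
    # This is the same kernel interface netplug listens on to notice
    # interface up/down events (sketch only, not netplug's actual code).
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    s.bind((0, RTMGRP_LINK))  # pid 0: kernel assigns an id; join RTMGRP_LINK
    return s
```

A caller would then block on `recv()` and parse the rtnetlink messages to see which interface changed state.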
I was using esxi 6.7.
Development on this resides in a fork of vyatta-config-migrate (https://github.com/jestabro/vyatta-config-migrate), though properly should reside in a fork of vyos-1x.
Hi @hagbard, I did some extensive testing. Actually, I was already testing with "1.2.0-rolling+201904240337". So here are my findings.
Looks like a race condition, since it is now being started by systemd as well, which was previously not the case.
@yun I think I found something: vmware-tools doesn't reliably call ether-resume.py; it only does so sometimes. I tested with 1.2.0-rolling+201904240337 and did a suspend and resume multiple times with the old ether-resume.py, and everything works just fine.
But there was a fully working one before; anyway, curl will work as well. Let's close this ticket then, it was just bad communication I guess. I have found a few other issues I'm currently looking into. Looks like netlink in the kernel changed, which breaks netplug and pppoe-server. Thanks for pointing me in the right direction.
Ah, now I see. Yes, this is correct: there is no "real" wget installed; instead the BusyBox version is used. BusyBox is a multi-call binary (a lot of tiny helper programs shipped in one binary; the applet to run is determined using argv[0], which is why every BusyBox tool is a symlink to /bin/busybox).
wget https://downloads.vyos.net results in 'not an ftp or http url'
wget https://...
Or, if you check with ldd, you'll see that it is only compiled against libc and that's it.
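The multi-call trick described above can be sketched like this (a hypothetical applet table and dispatcher in Python, only to illustrate the argv[0] mechanism; this is not BusyBox's actual code):

```python
import os

# Hypothetical applet table: in a multi-call binary each name maps to a
# tiny built-in program, and every applet name on disk is just a symlink
# to the one real binary.
APPLETS = {
    "wget": lambda args: "wget: fetching " + (args[0] if args else "<no url>"),
    "ls":   lambda args: "ls: listing " + (args[0] if args else "."),
}


def dispatch(argv):
    # The applet to run is chosen from argv[0], i.e. the name the binary
    # was invoked under (the symlink name), not from a command-line flag.
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    return applet(argv[1:]) if applet else name + ": applet not found"
```

Invoked as /bin/wget the same code behaves as wget; invoked as /bin/ls it behaves as ls, which is why `ldd` shows only libc: everything is compiled into the single binary.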
I can not reproduce the issue.
I'm experiencing this issue as well on a homespun 1.2.0 image, and have been for a little over a month now. It occurs almost every day with no indication as to what the cause could be.
Added note on old-style cfg-mode templates
Added a sidenote about switching on router and routing on switch
First commit on implementing this: https://github.com/runborg/vyatta-cfg-system/commit/15f6f2e06cc3e7d4e25f9cd381e70b8d978717f6
Now it seems to report guest info successfully:
Tested in 1.2.0-rolling+201904250337, everything looks great now, no warnings arise and required config is in place, so I'll wait for 1.2.2 to get the fixed version of ethtool.
Thanks for your efforts, great work!
I'm not a fan of cluttering the OS too much; there are two much better ways.
I've run into that issue. As I understand it, this problem will be solved in VyOS version 1.3. Have you fixed it?
Did you ever get this figured out?
Apr 24 2019
Check the /var/log/vmware-network.log files; the tool creates a log for each type and rotates it once the command finishes.
Because you mentioned networkd earlier, I looked into this immediately and found the following differences:
Thanks, I tested it, my findings below:
Ok, just reopen the PR. I'll review and merge it in then.
When I read it back, I can understand the confusion. Sorry, will try to be more clear next time.
Ah, I see. You didn't mention that, so I was quite confused about your statement.
@hagbard But we were talking about my patch, and that it didn't work for you in the latest rolling... So I tested my patch in the latest rolling (and noted the date), and it worked. Should I have made it clearer that I was testing my patch?
I wonder what changed then, will also test with latest rolling
Hi hagbard, I don't understand why you closed the PR so early, without me testing the latest ISO. When you refer to the "latest" ISO, please also note the rolling date; I think this makes it easier for everyone who tries to contribute.
This router is receiving BGP from several internal BGP routers, each with full-table peers or a couple of peerings.
Hmmm, yeah, this one isn't doing anything yet either - just a test.
I've closed your PR without merging, since it can't be the script. Shall I close this bug here for now and you open a new one when you hit the road bumps again?
This one is running on Hyper-V 2016 and is not pushing any traffic. It is my test router, used for experimenting with RPKI.
The routers doing traffic are on hardware and not running 1.2.x yet.
netplugd is just fine. vmware-toolsd tries to bring the interfaces up (resume-vm) via systemd-networkd (it looks for the interface files), then ether-resume kicks in and starts dhclient; so far so good. netplugd isn't the issue here. I have also seen it 2-3 times; right now it works in the same environment I used yesterday. I think there might be some interference from systemd-networkd being called by the VMware scripts.
If you observe it again, please let me know the image you used, so I can reproduce it better. Also, you should find /var/log/vmware-net... logs on the system; they basically trace all calls from the VMware-supplied scripts when you trigger an action via vmware-toolsd. If it happens again, let's have a look at these files.
Thanks for the detailed history, that makes things more clear.
So for me the latest rolling worked. Do you know which part of networkd is interfering with DHCP for you? Did you see whether netplug triggered DHCP correctly after resume?
Ah, Thanks Merijn!
@zsdc Can you please test?
I am running 1.2.1 compiled on 17-04-2019, uptime is 6 days without issue.
RIB entries 1366663, using 209 MiB of memory
Peers 16, using 330 KiB of memory
Peer groups 4, using 256 bytes of memory
My next step is to replace the VMWare image with the standard ISO I build at the same time, to eliminate vmtools. I tried reducing the open-vm-tools config statement:
I am also seeing a memory leak on a BGP full tables router. It is NOT using flow accounting, but IS a Crux 1.2.1 compiled VMWare image.
The GARPs should be sent, as they are set by default to the values you pasted from the documentation. Do you want to make that a config option to modify the defaults?
@zsdc Can you please share some config data or clarify what you mean? thx
I didn't change anything. I did the netplug changes via T894, I think; I would have to look it up. The only change was for vmware-tools itself: we switched to the Debian jessie package and now to the jessie backports (bpo) one.
That package contains the suspend/resume and poweroff/poweron scripts/structure. Netplug is entirely separate and its package comes from our pool. It contains linkdown and linkup scripts, which trigger the original Vyatta link up/down scripts that previously came via vyatta-cfg-system or so.
So, there was basically a huge cleanup, plus making netplugd available again (it was removed for an unknown reason before), a repackage, and the latest open-vm-tools plus the script we deploy for its resume/suspend mechanism.
So, right now I'm not sure how stable it is; please let me know if you uncover further issues. It should be logged via syslog, so we have a chance to investigate what it does when it's blocked.
Apr 23 2019
The interference seems to come from networkd, which is run via ./scripts/vmware/network resume-vm, executed by vmware-toolsd. So that looks like a longer mission.
I left you a few comments on the PR. I tested it now as well; your code doesn't work from what I see. But I see that DHCP stopped working, so I'll have a look and see what I can find out. Looks like netplugd in the latest rolling has an issue too.
Fix, should be in the next rolling release:
https://github.com/vyos/vyatta-cfg-quagga/commit/41df1579f6ca3e5a1618ee85bbb337011148f1ef
Note that the mentioned annoyance of migrate/system/3-to-4 setting the serial console speed should be mooted by 'T805 Drop config compatibility with Vyatta Core older than 6.5'.
@yun Yes, please create a PR; I'll have a look asap.
@zsdc is local-as required anyway? Isn't it always the same as the router-as?
Ok final attempt and trivial fix.
It seems that changing run-parts to /bin/run-parts was not needed, so netplug works fine as it is.
Hi all, I can confirm that with vyos-1.2.0-rolling+201904160337-amd64, this issue is fixed.
If I boot the older 2019-02-16 version, the bug can be reproduced easily. So it must be an issue in FRR that is introduced in 7.1 as the newer livecd uses FRR 7.0:
Is this an FRR bug or something else? Because I don't use any BGP stuff, I just added the ip -4 route add command to my VM so it's always executed. However, as @runar mentioned, that bypasses FRR. But executing the command via FRR didn't work, so the issue must be in FRR?
Gonna keep the task open for a week or two to see if there are still any other issues.
It would be good to have the ability to configure the following GARP settings for individual VRRP groups, or at least for the keepalived daemon overall, because in some situations switches can filter the multiple ARP packets that are generated on transition.
In the process of migrating to 1.2.1 we discovered that some GARP packets (we have 6 VRRP groups on the Internet interface) were filtered by our ISP's ARP-spoofing filter.
The problem was solved with VRRP transition scripts that execute some additional arping in ARP-reply mode.
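For reference, keepalived itself exposes GARP tuning knobs at the vrrp_instance level, which is roughly what such a config option would map to. A sketch of what the defaults could look like in keepalived.conf (the interface name, router id, priority, and addresses here are illustrative, not taken from this thread):

```
vrrp_instance VI_1 {
    interface eth0            # illustrative interface name
    virtual_router_id 10
    priority 150
    # GARP tuning: how long to wait before refreshing gratuitous ARPs
    # after a transition to MASTER, and how many GARPs to send per batch.
    garp_master_delay 5
    garp_master_repeat 5
    virtual_ipaddress {
        192.0.2.1/24
    }
}
```

Exposing these two values through the VyOS VRRP config would cover the case described above, where a filtering switch or ISP drops part of the GARP burst.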
Any chance of getting this merged into 1.2.2?
@c-po thanks for that. I changed my configs from the postconfig script to the new config syntax.
Relevant config: