Removing serial console port from ESXi VM causes flooded syslog
Closed, Invalid | Public | BUG

Description

I deployed a fresh ISO installation on ESXi, in a VM that has no serial interface connected.

Removing the system console node from the CLI did not fix the issue, even after a reboot. My syslog is being spammed:

Dec 12 19:27:32 VMU-02-AZURE systemd[1]: serial-getty@ttyS0.service holdoff time over, scheduling restart.
Dec 12 19:27:32 VMU-02-AZURE systemd[1]: Stopping Serial Getty on ttyS0...
Dec 12 19:27:32 VMU-02-AZURE systemd[1]: Starting Serial Getty on ttyS0...
Dec 12 19:27:32 VMU-02-AZURE systemd[1]: Started Serial Getty on ttyS0.
Dec 12 19:27:32 VMU-02-AZURE agetty[3580]: /dev/ttyS0: not a tty
Dec 12 19:27:43 VMU-02-AZURE systemd[1]: serial-getty@ttyS0.service holdoff time over, scheduling restart.
Dec 12 19:27:43 VMU-02-AZURE systemd[1]: Stopping Serial Getty on ttyS0...
Dec 12 19:27:43 VMU-02-AZURE systemd[1]: Starting Serial Getty on ttyS0...
Dec 12 19:27:43 VMU-02-AZURE systemd[1]: Started Serial Getty on ttyS0.
Dec 12 19:27:43 VMU-02-AZURE agetty[3617]: /dev/ttyS0: not a tty
Dec 12 19:27:53 VMU-02-AZURE systemd[1]: serial-getty@ttyS0.service holdoff time over, scheduling restart.
Dec 12 19:27:53 VMU-02-AZURE systemd[1]: Stopping Serial Getty on ttyS0...
Dec 12 19:27:53 VMU-02-AZURE systemd[1]: Starting Serial Getty on ttyS0...
Dec 12 19:27:53 VMU-02-AZURE systemd[1]: Started Serial Getty on ttyS0.
Dec 12 19:27:53 VMU-02-AZURE agetty[3624]: /dev/ttyS0: not a tty

Rebooting the system has no effect. The issue is only fixed after disabling the service by hand and rebooting the system again.

$ sudo systemctl disable serial-getty@ttyS0
Removed symlink /etc/systemd/system/getty.target.wants/serial-getty@ttyS0.service.
$ reboot

Details

Difficulty level
Unknown (require assessment)
Version
1.2.3
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Perfectly compatible

Event Timeline

c-po created this task. Dec 12 2019, 6:31 PM
c-po updated the task description. Dec 12 2019, 6:34 PM
jjakob added a subscriber: jjakob. (Edited) Dec 12 2019, 7:10 PM

I'm experiencing the same issue of the service failing to start on 1.3.
The installation was first started with the default config in a VM that had a serial port. Then the installation was transferred to a physical machine without a serial port, and the whole /config directory was manually copied from the old installation on that machine. The machine was then rebooted. The result was the same errors in syslog/journal.
I believe the issue is that if config.boot is manually replaced or edited on disk, the script that would normally be triggered on commit when deleting system console never runs. The service therefore remains enabled, but there is no longer a system console node in the config to delete.

My proposal for a fix would be to run service setup scripts after each successful config load in the bootup process. These scripts would check the desired service state in the config, check the current enabled/running status via systemctl, then enable/disable and start/stop the service as necessary. The services would remain enabled/disabled across reboots, so even if the startup config load failed, the ssh/console services would still be started via systemd so that the failed load can be fixed.
This would be required for all services that are started via systemd and are necessary for system access, e.g. ssh, system console (getty). Services that aren't strictly necessary for console access or are started in the config load process wouldn't need to have these scripts. They should be started later in the boot process when their configuration is created (at config load time).
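
To make the proposal a bit more concrete, here is a minimal sketch of the "compare desired state with systemd state" step, assuming a POSIX shell. The function name plan_actions and the "present"/"absent" labels are hypothetical and not part of VyOS; the function only prints the systemctl commands it would run, so it has no side effects. Treating "activating" as running is my assumption, based on a unit stuck in a restart loop like the one in the log above reporting that state.

```shell
#!/bin/sh
# Hypothetical sketch: print the systemctl commands needed to bring a unit
# in line with the state requested by the loaded config. Nothing is executed.
#
# $1 = unit name, e.g. serial-getty@ttyS0.service
# $2 = desired state from the config: "present" or "absent" (made-up labels)
# $3 = output of `systemctl is-enabled UNIT` (e.g. enabled, disabled)
# $4 = output of `systemctl is-active UNIT`  (e.g. active, inactive, activating)
plan_actions() {
    unit=$1 desired=$2 enabled=$3 active=$4
    if [ "$desired" = "present" ]; then
        # Service is configured: make sure it is enabled and running.
        [ "$enabled" = "enabled" ] || echo "systemctl enable $unit"
        [ "$active" = "active" ] || echo "systemctl start $unit"
    else
        # Service is not in the config: make sure it is disabled and stopped.
        [ "$enabled" != "enabled" ] || echo "systemctl disable $unit"
        case "$active" in
            active|activating) echo "systemctl stop $unit" ;;
        esac
    fi
    return 0
}

# Example call after a successful config load (state is queried live):
#   plan_actions serial-getty@ttyS0.service absent \
#       "$(systemctl is-enabled serial-getty@ttyS0.service)" \
#       "$(systemctl is-active serial-getty@ttyS0.service)"
```

Keeping the decision separate from the execution means the same logic could be reused for ssh or any other access-critical unit, and the printed commands can be reviewed before anything is changed.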

hagbard renamed this task from Removing serial console port from ESXi VM causes flodded syslog to Removing serial console port from ESXi VM causes flooded syslog. Dec 12 2019, 7:11 PM
pasik added a subscriber: pasik. Dec 12 2019, 8:39 PM
syncer assigned this task to Dmitry. Jan 1 2020, 1:54 PM
syncer triaged this task as Normal priority.
syncer edited projects, added VyOS 1.3 Equuleus; removed VyOS 1.2 Crux.
Dmitry added a comment. Jan 2 2020, 6:40 PM

@c-po, it is not possible to reproduce this in 1.2.4 or in the latest 1.3 rolling release.
I also tried delete system console on a deployed 1.2.3, and it works without issues there as well; syslog is clear.

c-po closed this task as Invalid. Jan 2 2020, 10:03 PM
c-po moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus board. Sun, Feb 9, 2:15 PM