snmpd crash
Closed, Wontfix (Public)

Description

On two instances of VyOS 1.1.7 (64-bit), the snmpd process stops unexpectedly, with no log messages pointing at the reason.
Running the process with debugging enabled has not yielded any information so far.
Three other instances of VyOS 1.1.7 on similar hardware do not have this issue, and their configs are very similar. All five are used as BGP routers and are monitored with LibreNMS.

Details

Difficulty level
Normal (likely a few hours)
Version
-

Event Timeline

Hello @Merijn,
we will need the hardware specs for all of them,
and the configs too; you can strip sensitive info from them.
Thanks!

Insufficient details provided.

The configs on all the routers are roughly the same; the SNMP config is identical.
I will collect the hardware details.

Today I was able to reproduce the issue by triggering a LibreNMS discovery and poller run against the routers.
The router with this issue reported the following:
[12996289.397101] snmpd[21450]: segfault at 0 ip 00007f7aa2f9b783 sp 00007fff8b027150 error 4 in libnetsnmpmibs.so.30.0.2[7f7aa2f37000+15e000]
[13118934.979942] snmpd[14392]: segfault at 0 ip 00007f19328cd783 sp 00007fffba4b0c50 error 4 in libnetsnmpmibs.so.30.0.2[7f1932869000+15e000]
[13119530.842811] snmpd[14912]: segfault at 0 ip 00007f632287b783 sp 00007fff97b5a640 error 4 in libnetsnmpmibs.so.30.0.2[7f6322817000+15e000]
[13119977.416127] snmpd[15236]: segfault at 0 ip 00007f086c308783 sp 00007fffd4d55600 error 4 in libnetsnmpmibs.so.30.0.2[7f086c2a4000+15e000]
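
The four segfault lines above can be compared by subtracting the library's load base (the value before `+` in the square brackets) from the instruction pointer. The following is a minimal sketch in Python, not part of the original report; the helper names `parse` and `decode_error` are hypothetical, and the error-code decoding assumes the usual x86 page-fault bits (bit 0: page present, bit 1: write, bit 2: user mode).

```python
import re

# Kernel log lines copied verbatim from the report.
LOG = """\
[12996289.397101] snmpd[21450]: segfault at 0 ip 00007f7aa2f9b783 sp 00007fff8b027150 error 4 in libnetsnmpmibs.so.30.0.2[7f7aa2f37000+15e000]
[13118934.979942] snmpd[14392]: segfault at 0 ip 00007f19328cd783 sp 00007fffba4b0c50 error 4 in libnetsnmpmibs.so.30.0.2[7f1932869000+15e000]
[13119530.842811] snmpd[14912]: segfault at 0 ip 00007f632287b783 sp 00007fff97b5a640 error 4 in libnetsnmpmibs.so.30.0.2[7f6322817000+15e000]
[13119977.416127] snmpd[15236]: segfault at 0 ip 00007f086c308783 sp 00007fffd4d55600 error 4 in libnetsnmpmibs.so.30.0.2[7f086c2a4000+15e000]
"""

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(code):
    """Decode the low three bits of the x86 page-fault error code (assumed layout)."""
    return {
        "page_present": bool(code & 1),
        "write": bool(code & 2),
        "user_mode": bool(code & 4),
    }


def parse(log):
    """Return one dict per segfault line, with the offset into the library."""
    results = []
    for m in PATTERN.finditer(log):
        ip = int(m["ip"], 16)
        base = int(m["base"], 16)
        results.append({
            "lib": m["lib"],
            "fault_addr": int(m["addr"], 16),
            "offset": ip - base,  # position of the faulting instruction inside the library
            **decode_error(int(m["err"], 16)),
        })
    return results


crashes = parse(LOG)
for c in crashes:
    print(f"{c['lib']} +{c['offset']:#x} fault_addr={c['fault_addr']:#x} write={c['write']}")
```

If the offset turns out to be the same in every crash, that suggests the same instruction in libnetsnmpmibs is reading from address 0 each time, which is consistent with a NULL pointer dereference. With matching debug symbols for the library (if available), a tool such as addr2line could in principle map that offset to a source line.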

I checked the settings in LibreNMS and the selected checks are identical between the hosts with issues and the hosts without issues.

These are the details of the router with issues:

  ROUTER2:~$ show hardware cpu
  Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  CPU(s): 4
  Thread(s) per core: 1
  Core(s) per socket: 4
  CPU socket(s): 1
  NUMA node(s): 1
  Vendor ID: GenuineIntel
  CPU family: 6
  Model: 30
  Stepping: 5
  CPU MHz: 2393.940
  Virtualization: VT-x
  L1d cache: 32K
  L1i cache: 32K
  L2 cache: 256K
  L3 cache: 8192K
  ROUTER2:~$ show hardware dmi
  bios_date: 09/10/2013
  bios_vendor: Dell Inc.
  bios_version: 1.10.0
  board_asset_tag:
  board_name: 05KX61
  board_vendor: Dell Inc.
  board_version: A02
  chassis_asset_tag:
  chassis_type: 23
  chassis_vendor: Dell Inc.
  chassis_version:
  product_name: PowerEdge R210
  product_version:
  sys_vendor: Dell Inc.
  ROUTER2:~$ show hardware pci
  00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
  00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
  00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
  00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
  00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
  00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
  00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
  00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
  00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
  00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
  00:1f.0 ISA bridge: Intel Corporation 3400 Series Chipset LPC Interface Controller (rev 05)
  00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
  01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  03:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
  ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-Core Registers (rev 04)
  ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 04)
  ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 04)
  ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 04)
  ff:03.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller (rev 04)
  ff:03.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Target Address Decoder (rev 04)
  ff:03.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:03.4 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:04.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Control Registers (rev 04)
  ff:04.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Address Registers (rev 04)
  ff:04.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Rank Registers (rev 04)
  ff:04.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Thermal Control Registers (rev 04)
  ff:05.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Control Registers (rev 04)
  ff:05.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Address Registers (rev 04)
  ff:05.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Rank Registers (rev 04)
  ff:05.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Thermal Control Registers (rev 04)
  ROUTER2:~$ show hardware scsi
  [0:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sda
  [1:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sdb
  [3:0:0:0] cd/dvd TEAC DVD-ROM DV-28SW R.2A /dev/sr0
  ROUTER2:~$ show hardware usb
  Bus 002 Device 005: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
  Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
  Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  ROUTER2:~$ show system routing-daemons
  zebra ripd ripngd ospfd ospf6d bgpd #
  ROUTER2:~$ show system image
  The system currently has the following image(s) installed:
  1: VyOS-1.1.7 (default boot) (running image) #
  ROUTER2:~$ show version all
  Version: VyOS 1.1.7
  Description: VyOS 1.1.7 (helium)
  Copyright: 2016 VyOS maintainers and contributors
  Built by: [email protected]
  Built on: Wed Feb 17 09:57:31 UTC 2016
  Build ID: 1602170957-4459750
  System type: x86 64-bit
  Boot via: image
  HW model: PowerEdge R210
  HW S/N: 507645J
  HW UUID: 44454C4C-3000-1037-8036-B5C04F34354A

And these are the details for a host without issues:

  ROUTER1:~$ show hardware cpu
  Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  CPU(s): 8
  Thread(s) per core: 2
  Core(s) per socket: 4
  CPU socket(s): 1
  NUMA node(s): 1
  Vendor ID: GenuineIntel
  CPU family: 6
  Model: 30
  Stepping: 5
  CPU MHz: 1861.964
  Virtualization: VT-x
  L1d cache: 32K
  L1i cache: 32K
  L2 cache: 256K
  L3 cache: 8192K
  ROUTER1:~$ show hardware dmi
  bios_date: 09/10/2013
  bios_vendor: Dell Inc.
  bios_version: 1.10.0
  board_asset_tag:
  board_name: 05KX61
  board_vendor: Dell Inc.
  board_version: A01
  chassis_asset_tag:
  chassis_type: 23
  chassis_vendor: Dell Inc.
  chassis_version:
  product_name: PowerEdge R210
  product_version:
  sys_vendor: Dell Inc.
  ROUTER1:~$ show hardware pci
  00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
  00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
  00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
  00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
  00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
  00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
  00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
  00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
  00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
  00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
  00:1f.0 ISA bridge: Intel Corporation 3400 Series Chipset LPC Interface Controller (rev 05)
  00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
  01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  03:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
  ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-Core Registers (rev 04)
  ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 04)
  ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 04)
  ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 04)
  ff:03.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller (rev 04)
  ff:03.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Target Address Decoder (rev 04)
  ff:03.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:03.4 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:04.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Control Registers (rev 04)
  ff:04.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Address Registers (rev 04)
  ff:04.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Rank Registers (rev 04)
  ff:04.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Thermal Control Registers (rev 04)
  ff:05.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Control Registers (rev 04)
  ff:05.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Address Registers (rev 04)
  ff:05.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Rank Registers (rev 04)
  ff:05.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Thermal Control Registers (rev 04)
  ROUTER1:~$ show hardware scsi
  [0:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sda
  [1:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sdb
  [3:0:0:0] cd/dvd TEAC DVD-ROM DV-28SW R.2A /dev/sr0
  ROUTER1:~$ show hardware usb
  Bus 002 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
  Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
  Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  ROUTER1:~$ show system routing-daemons
  zebra ripd ripngd ospfd ospf6d bgpd #
  ROUTER1:~$ show system image
  The system currently has the following image(s) installed:
  1: VyOS-1.1.7 (default boot) (running image)
  2: VyOS-1.1.6 #
  ROUTER1:~$ show version all
  Version: VyOS 1.1.7
  Description: VyOS 1.1.7 (helium)
  Copyright: 2016 VyOS maintainers and contributors
  Built by: [email protected]
  Built on: Wed Feb 17 09:57:31 UTC 2016
  Build ID: 1602170957-4459750
  System type: x86 64-bit
  Boot via: image
  HW model: PowerEdge R210
  HW S/N: 8P2705J
  HW UUID: 44454C4C-5000-1032-8037-B8C04F30354A
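
Since the two dumps are long and nearly identical, a field-by-field comparison makes the differences easier to spot. The sketch below is illustrative only: it uses a short excerpt of the `key: value` lines from the CPU and DMI sections of both routers, and the helpers `parse_kv` and `diff_fields` are hypothetical names, not VyOS tooling.

```python
# Excerpts of the "key: value" output pasted for each router (CPU and DMI sections).
ROUTER2 = """\
CPU(s): 4
Thread(s) per core: 1
Core(s) per socket: 4
CPU MHz: 2393.940
bios_version: 1.10.0
board_name: 05KX61
board_version: A02
product_name: PowerEdge R210
"""

ROUTER1 = """\
CPU(s): 8
Thread(s) per core: 2
Core(s) per socket: 4
CPU MHz: 1861.964
bios_version: 1.10.0
board_name: 05KX61
board_version: A01
product_name: PowerEdge R210
"""


def parse_kv(text):
    """Turn 'key: value' lines into a dict, skipping lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


differences = diff_fields(parse_kv(ROUTER2), parse_kv(ROUTER1))
for key, (r2, r1) in differences.items():
    print(f"{key}: ROUTER2={r2} ROUTER1={r1}")
```

On this excerpt the differing fields are the logical CPU count, threads per core, the momentary CPU clock, and the board revision. The threads-per-core difference may indicate a different Hyper-Threading setting; none of these differences is known to be related to the snmpd crash.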

I noticed that both routers with the SNMP issue are the VRRP masters. I switched VRRP to backup and rebooted the routers. The SNMP daemon seems stable at the moment.

syncer triaged this task as Normal priority.
syncer changed the edit policy from "Task Author" to "Custom Policy".
syncer added a project: VyOS 1.1.x.
syncer changed Difficulty level from Easy (less than an hour) to Normal (likely a few hours).
syncer set Version to -.

This requires further debugging;
we will not spend time on the 1.1.x series.