snmpd crash
Closed, Wontfix (Public)

Description

On 2 instances of VyOS 1.1.7 (64-bit) the snmpd process stops. No log messages point to the reason.
Running the process with debugging enabled has not yielded any information yet.
3 instances of VyOS 1.1.7 on similar hardware do not have this issue; the configs are very similar. All 5 are used as BGP routers and are monitored with LibreNMS.

Details

Difficulty level: Normal (likely a few hours)
Version: -
Merijn created this task. Nov 24 2016, 8:50 AM
syncer added a subscriber: syncer. Nov 25 2016, 11:39 AM

Hello @Merijn,
we will need the hardware specs for all of them,
and the configs too; you can strip sensitive information from them.
Thanks!

syncer closed this task as Invalid. Dec 14 2016, 7:17 PM

Insufficient details provided.

Merijn added a comment. Jan 2 2017, 7:57 PM

The configs on all the routers are roughly the same; the SNMP config is identical.
I will collect the hardware details.

Today I was able to reproduce the issue by triggering a LibreNMS discovery and poller run against the routers.
The router with this issue logged the following kernel messages:
[12996289.397101] snmpd[21450]: segfault at 0 ip 00007f7aa2f9b783 sp 00007fff8b027150 error 4 in libnetsnmpmibs.so.30.0.2[7f7aa2f37000+15e000]
[13118934.979942] snmpd[14392]: segfault at 0 ip 00007f19328cd783 sp 00007fffba4b0c50 error 4 in libnetsnmpmibs.so.30.0.2[7f1932869000+15e000]
[13119530.842811] snmpd[14912]: segfault at 0 ip 00007f632287b783 sp 00007fff97b5a640 error 4 in libnetsnmpmibs.so.30.0.2[7f6322817000+15e000]
[13119977.416127] snmpd[15236]: segfault at 0 ip 00007f086c308783 sp 00007fffd4d55600 error 4 in libnetsnmpmibs.so.30.0.2[7f086c2a4000+15e000]

I checked the settings in LibreNMS and the selected checks are identical between the hosts with issues and the hosts without issues.
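
For anyone picking this up, the four segfault lines can be compared by subtracting the library's load address (the value in square brackets) from the instruction pointer. The sketch below is only an illustration: the regular expression and the function names `fault_offsets` and `decode_error` are made up for this comment and are not part of any VyOS or Net-SNMP tooling. It suggests that all four crashes hit the same instruction inside libnetsnmpmibs, and that `error 4` with `segfault at 0` is a user-mode read of address 0, which looks like a NULL pointer dereference.

```python
import re

# Kernel segfault lines copied verbatim from the comment above.
LOG = """\
[12996289.397101] snmpd[21450]: segfault at 0 ip 00007f7aa2f9b783 sp 00007fff8b027150 error 4 in libnetsnmpmibs.so.30.0.2[7f7aa2f37000+15e000]
[13118934.979942] snmpd[14392]: segfault at 0 ip 00007f19328cd783 sp 00007fffba4b0c50 error 4 in libnetsnmpmibs.so.30.0.2[7f1932869000+15e000]
[13119530.842811] snmpd[14912]: segfault at 0 ip 00007f632287b783 sp 00007fff97b5a640 error 4 in libnetsnmpmibs.so.30.0.2[7f6322817000+15e000]
[13119977.416127] snmpd[15236]: segfault at 0 ip 00007f086c308783 sp 00007fffd4d55600 error 4 in libnetsnmpmibs.so.30.0.2[7f086c2a4000+15e000]
"""

# Hypothetical pattern for the x86-64 Linux "segfault at ..." message format.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error (?P<err>\d+) in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def fault_offsets(log):
    """Return (library, offset of the faulting instruction) for each crash."""
    results = []
    for match in PATTERN.finditer(log):
        offset = int(match.group("ip"), 16) - int(match.group("base"), 16)
        results.append((match.group("lib"), offset))
    return results


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # 0 means the page was not mapped
        "write": bool(code & 2),         # 0 means the access was a read
        "user_mode": bool(code & 4),     # 1 means the fault came from user space
    }


offsets = fault_offsets(LOG)
for lib, offset in offsets:
    print(f"{lib}+{offset:#x}")
print(decode_error(4))
```

If debug symbols for this exact library build are available, the resulting offset could in principle be passed to a tool such as `addr2line -e` to find the source line, assuming the mapping shown in the log starts at file offset 0.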

Merijn reopened this task as Open. Jan 2 2017, 7:57 PM
Merijn added a comment. Jan 2 2017, 8:04 PM

These are the details of the router with issues:

  ROUTER2:~$ show hardware cpu
  Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  CPU(s): 4
  Thread(s) per core: 1
  Core(s) per socket: 4
  CPU socket(s): 1
  NUMA node(s): 1
  Vendor ID: GenuineIntel
  CPU family: 6
  Model: 30
  Stepping: 5
  CPU MHz: 2393.940
  Virtualization: VT-x
  L1d cache: 32K
  L1i cache: 32K
  L2 cache: 256K
  L3 cache: 8192K
  ROUTER2:~$ show hardware dmi
  bios_date: 09/10/2013
  bios_vendor: Dell Inc.
  bios_version: 1.10.0
  board_asset_tag:
  board_name: 05KX61
  board_vendor: Dell Inc.
  board_version: A02
  chassis_asset_tag:
  chassis_type: 23
  chassis_vendor: Dell Inc.
  chassis_version:
  product_name: PowerEdge R210
  product_version:
  sys_vendor: Dell Inc.
  ROUTER2:~$ show hardware pci
  00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
  00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
  00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
  00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
  00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
  00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
  00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
  00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
  00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
  00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
  00:1f.0 ISA bridge: Intel Corporation 3400 Series Chipset LPC Interface Controller (rev 05)
  00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
  01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  03:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
  ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-Core Registers (rev 04)
  ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 04)
  ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 04)
  ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 04)
  ff:03.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller (rev 04)
  ff:03.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Target Address Decoder (rev 04)
  ff:03.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:03.4 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:04.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Control Registers (rev 04)
  ff:04.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Address Registers (rev 04)
  ff:04.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Rank Registers (rev 04)
  ff:04.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Thermal Control Registers (rev 04)
  ff:05.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Control Registers (rev 04)
  ff:05.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Address Registers (rev 04)
  ff:05.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Rank Registers (rev 04)
  ff:05.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Thermal Control Registers (rev 04)
  ROUTER2:~$ show hardware scsi
  [0:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sda
  [1:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sdb
  [3:0:0:0] cd/dvd TEAC DVD-ROM DV-28SW R.2A /dev/sr0
  ROUTER2:~$ show hardware usb
  Bus 002 Device 005: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
  Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
  Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  ROUTER2:~$ show system routing-daemons
  zebra ripd ripngd ospfd ospf6d bgpd #
  ROUTER2:~$ show system image
  The system currently has the following image(s) installed:
  1: VyOS-1.1.7 (default boot) (running image) #
  ROUTER2:~$ show version all
  Version: VyOS 1.1.7
  Description: VyOS 1.1.7 (helium)
  Copyright: 2016 VyOS maintainers and contributors
  Built by: maintainers@vyos.net
  Built on: Wed Feb 17 09:57:31 UTC 2016
  Build ID: 1602170957-4459750
  System type: x86 64-bit
  Boot via: image
  HW model: PowerEdge R210
  HW S/N: 507645J
  HW UUID: 44454C4C-3000-1037-8036-B5C04F34354A

And the details for a host without issues:

  ROUTER1:~$ show hardware cpu
  Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  CPU(s): 8
  Thread(s) per core: 2
  Core(s) per socket: 4
  CPU socket(s): 1
  NUMA node(s): 1
  Vendor ID: GenuineIntel
  CPU family: 6
  Model: 30
  Stepping: 5
  CPU MHz: 1861.964
  Virtualization: VT-x
  L1d cache: 32K
  L1i cache: 32K
  L2 cache: 256K
  L3 cache: 8192K
  ROUTER1:~$ show hardware dmi
  bios_date: 09/10/2013
  bios_vendor: Dell Inc.
  bios_version: 1.10.0
  board_asset_tag:
  board_name: 05KX61
  board_vendor: Dell Inc.
  board_version: A01
  chassis_asset_tag:
  chassis_type: 23
  chassis_vendor: Dell Inc.
  chassis_version:
  product_name: PowerEdge R210
  product_version:
  sys_vendor: Dell Inc.
  ROUTER1:~$ show hardware pci
  00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
  00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
  00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
  00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
  00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
  00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
  00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
  00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
  00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
  00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
  00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
  00:1f.0 ISA bridge: Intel Corporation 3400 Series Chipset LPC Interface Controller (rev 05)
  00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
  01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
  03:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
  ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-Core Registers (rev 04)
  ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 04)
  ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 04)
  ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 04)
  ff:03.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller (rev 04)
  ff:03.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Target Address Decoder (rev 04)
  ff:03.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:03.4 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
  ff:04.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Control Registers (rev 04)
  ff:04.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Address Registers (rev 04)
  ff:04.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Rank Registers (rev 04)
  ff:04.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Thermal Control Registers (rev 04)
  ff:05.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Control Registers (rev 04)
  ff:05.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Address Registers (rev 04)
  ff:05.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Rank Registers (rev 04)
  ff:05.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Thermal Control Registers (rev 04)
  ROUTER1:~$ show hardware scsi
  [0:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sda
  [1:0:0:0] disk ATA SAMSUNG HD501LJ CR10 /dev/sdb
  [3:0:0:0] cd/dvd TEAC DVD-ROM DV-28SW R.2A /dev/sr0
  ROUTER1:~$ show hardware usb
  Bus 002 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
  Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
  Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  ROUTER1:~$ show system routing-daemons
  zebra ripd ripngd ospfd ospf6d bgpd #
  ROUTER1:~$ show system image
  The system currently has the following image(s) installed:
  1: VyOS-1.1.7 (default boot) (running image)
  2: VyOS-1.1.6 #
  ROUTER1:~$ show version all
  Version: VyOS 1.1.7
  Description: VyOS 1.1.7 (helium)
  Copyright: 2016 VyOS maintainers and contributors
  Built by: maintainers@vyos.net
  Built on: Wed Feb 17 09:57:31 UTC 2016
  Build ID: 1602170957-4459750
  System type: x86 64-bit
  Boot via: image
  HW model: PowerEdge R210
  HW S/N: 8P2705J
  HW UUID: 44454C4C-5000-1032-8037-B8C04F30354A
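
Since the two listings are long and nearly identical, a small script can help spot the fields that actually differ. The following is a rough sketch only: `parse_fields` and `differing_fields` are hypothetical helper names, and the input is a handful of `key: value` lines excerpted from the ROUTER2 and ROUTER1 output above rather than the full listings.

```python
def parse_fields(text):
    """Parse 'key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(text_a, text_b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    a, b = parse_fields(text_a), parse_fields(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


# Excerpt of the router with issues.
ROUTER2 = """\
CPU(s): 4
Thread(s) per core: 1
Core(s) per socket: 4
CPU MHz: 2393.940
board_version: A02
product_name: PowerEdge R210
"""

# Excerpt of a router without issues.
ROUTER1 = """\
CPU(s): 8
Thread(s) per core: 2
Core(s) per socket: 4
CPU MHz: 1861.964
board_version: A01
product_name: PowerEdge R210
"""

diff = differing_fields(ROUTER2, ROUTER1)
for key, (issue_host, ok_host) in diff.items():
    print(f"{key}: {issue_host} (issues) vs {ok_host} (no issues)")
```

On this excerpt the differences are the thread count per core (and therefore the CPU count), the reported CPU frequency, and the board revision. Whether any of these is related to the snmpd crash is not established here.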

I noticed that both routers with the SNMP issue are the VRRP masters. I switched VRRP to backup and rebooted the routers. The SNMP daemon seems stable at the moment.

syncer triaged this task as Normal priority. Aug 1 2017, 4:40 AM
syncer changed the edit policy from "Task Author" to "Custom Policy".
syncer added a project: VyOS 1.1.x.
syncer changed Difficulty level from Easy (less than an hour) to Normal (likely a few hours).
syncer set Version to -.
syncer closed this task as Wontfix.

This requires further debugging.
We will not spend time on the 1.1.x series.