
Kernel panic when QAT is used
Open, Requires assessment, Public, BUG

Description

This issue is related to the new kernel and the QAT driver. To reproduce it, enable QAT and pass some traffic through a tunnel/VTI interface. The kernel then panics with the following trace:

[  182.257269] kernel BUG at mm/slub.c:304!
[  182.305889] invalid opcode: 0000 [#1] SMP NOPTI
[  345.567183] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           O      5.10.28-amd64-vyos #1
[  345.668898] Hardware name: Dell EMC VEP1445-V220/VEP1445-V220-CPU, BIOS 3.48.0.9-4 06/26/2019
[  345.772705] RIP: 0010:__slab_free+0x18b/0x340
[  345.826505] Code: 1f 44 00 00 eb 9c 41 f7 46 08 00 0d 21 00 0f 85 26 ff ff ff 4d 85 ed 0f 85 1d ff ff ff 80 4c 24 5b 80 45 31 ff e9 54 ff ff ff <0f> 0b 49 3b 5c 24 28 75 c4 48 8b 44 24 28 49 89 4c 24 28 49 89 44
[  346.053221] RSP: 0018:ffffaed800118e00 EFLAGS: 00010246
[  346.117440] RAX: ffff8f4ee3d45d00 RBX: 000000008020001e RCX: ffff8f4ee3d45c00
[  346.204574] RDX: ffff8f4ee3d45c00 RSI: ffffde6a048f5100 RDI: ffff8f4ec0043600
[  346.291709] RBP: ffffaed800118e98 R08: 0000000000000001 R09: ffffffffc09efa48
[  346.378845] R10: ffff8f4ee3d45c00 R11: 0000000000000001 R12: ffffde6a048f5100
[  346.465981] R13: ffff8f4ee3d45c00 R14: ffff8f4ec0043600 R15: ffffffffb50060c0
[  346.553116] FS:  0000000000000000(0000) GS:ffff8f522fc80000(0000) knlGS:0000000000000000
[  346.651710] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  346.722178] CR2: 000056184b1a89b8 CR3: 000000005020a000 CR4: 00000000003506e0
[  346.809314] Call Trace:
[  346.840202]  <IRQ>
[  346.865880]  ? skb_release_all+0x9/0x20
[  346.913429]  ? xfrm_input+0x2d8/0x1110
[  346.959940]  ? kmem_cache_free+0x39c/0x3c0
[  347.010620]  esp_input_done2+0x258/0x3a0 [esp4]
[  347.066503]  esp_input_done+0xd/0x20 [esp4]
[  347.118232]  adf_handle_response+0x40/0xc0 [intel_qat]
[  347.181408]  adf_response_handler+0x78/0xd0 [intel_qat]
[  347.245618]  tasklet_action_common.isra.21+0x54/0xc0
[  347.306712]  __do_softirq+0xd2/0x227
[  347.351138]  asm_call_irq_on_stack+0x12/0x20
[  347.403897]  </IRQ>
[  347.430618]  do_softirq_own_stack+0x32/0x40
[  347.482336]  irq_exit_rcu+0x98/0xa0
[  347.525721]  common_interrupt+0x73/0x130
[  347.574315]  asm_common_interrupt+0x1e/0x40
[  347.626034] RIP: 0010:cpuidle_enter_state+0xc6/0x2c0
[  347.687127] Code: 89 c6 e8 9d e9 ba ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 ac 01 00 00 31 ff e8 81 d9 bf ff fb 66 0f 1f 44 00 00 <85> db 0f 88 b3 00 00 00 48 63 c3 4c 2b 34 24 48 8d 14 40 48 8d 14
[  347.913845] RSP: 0018:ffffaed80008fe80 EFLAGS: 00000246
[  347.978064] RAX: ffff8f522fca2800 RBX: 0000000000000003 RCX: 000000000000001f
[  348.065199] RDX: 000000506f2e3e8b RSI: 000000003a2e8ba3 RDI: 0000000000000000
[  348.152334] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000022080
[  348.239470] R10: 000000cb911aaa18 R11: ffff8f522fca18c4 R12: ffff8f522fcab200
[  348.326605] R13: ffffffffb50b4a60 R14: 000000506f2e3e8b R15: 0000000000000000
[  348.413743]  cpuidle_enter+0x24/0x40
[  348.458169]  do_idle+0x24b/0x2a0
[  348.498430]  cpu_startup_entry+0x14/0x20
[  348.547023]  start_secondary+0x110/0x150
[  348.595617]  secondary_startup_64_no_verify+0xb0/0xbb
[  348.657753] Modules linked in: ip_vti jitterentropy_rng drbg ansi_cprng echainiv af_packet twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_generic cast_common ctr ecb des_generic libdes cbc algif_skcipher camellia_generic camellia_x86_64 xcbc sha512_ssse3 sha512_generic md4 algif_hash af_alg xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key usdm_drv(O) qat_c3xxx(O) intel_qat(O) dh_generic uio authenc fuse nft_chain_nat xt_CT xt_tcpudp nft_compat nfnetlink_cthelper nft_counter nf_tables nfnetlink nf_nat_pptp nf_conntrack_pptp nf_nat_h323 nf_conntrack_h323 nf_nat_sip nf_conntrack_sip nf_nat_tftp nf_nat_ftp nf_nat nf_conntrack_tftp nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ath10k_pci ath10k_core ath mac80211 cfg80211 pnd2_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel libarc4
[  348.657825]  iTCO_wdt iTCO_vendor_support tpm_crb pcspkr tpm_tis tpm_tis_core tpm evdev crypto_simd cryptd glue_helper rng_core rapl button intel_cstate acpi_cpufreq mpls_iptunnel mpls_router ip_tunnel mpls_gso br_netfilter bridge stp llc ip_tables x_tables autofs4 nls_cp437 vfat fat ohci_hcd uhci_hcd ehci_hcd squashfs zstd_decompress lz4_decompress loop overlay ext4 crc32c_generic crc16 mbcache jbd2 nls_ascii usb_storage sd_mod t10_pi mmc_block ahci libahci sdhci_pci cqhci crc32c_intel sdhci xhci_pci ixgbe libata i2c_i801 xfrm_algo i2c_smbus mmc_core mdio scsi_mod xhci_hcd i2c_ismt igb i2c_algo_bit thermal
[  350.389334] ---[ end trace d859569d05404950 ]---
[  350.448482] RIP: 0010:__slab_free+0x18b/0x340
[  350.516714] Code: 1f 44 00 00 eb 9c 41 f7 46 08 00 0d 21 00 0f 85 26 ff ff ff 4d 85 ed 0f 85 1d ff ff ff 80 4c 24 5b 80 45 31 ff e9 54 ff ff ff <0f> 0b 49 3b 5c 24 28 75 c4 48 8b 44 24 28 49 89 4c 24 28 49 89 44
[  350.743439] RSP: 0018:ffffaed800118e00 EFLAGS: 00010246
[  350.807650] RAX: ffff8f4ee3d45d00 RBX: 000000008020001e RCX: ffff8f4ee3d45c00
[  350.894787] RDX: ffff8f4ee3d45c00 RSI: ffffde6a048f5100 RDI: ffff8f4ec0043600
[  350.998588] RBP: ffffaed800118e98 R08: 0000000000000001 R09: ffffffffc09efa48
[  351.085726] R10: ffff8f4ee3d45c00 R11: 0000000000000001 R12: ffffde6a048f5100
[  351.172859] R13: ffff8f4ee3d45c00 R14: ffff8f4ec0043600 R15: ffffffffb50060c0
[  351.259995] FS:  0000000000000000(0000) GS:ffff8f522fc80000(0000) knlGS:0000000000000000
[  351.358589] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  351.429058] CR2: 000056184b1a89b8 CR3: 000000005020a000 CR4: 00000000003506e0
[  351.516193] Kernel panic - not syncing: Fatal exception in interrupt
[  351.594004] Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  351.727083] Rebooting in 60 seconds..
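
To make the trace above easier to read, here is a minimal Python sketch that pulls the call-trace frames out of an oops and separates the entries marked with "?" (which the kernel unwinder treats as possibly unreliable) from the rest. The names `FRAME_RE`, `parse_frames` and `SAMPLE` are hypothetical helpers introduced for illustration; the sample text is copied from the IRQ part of the trace in this report. It is only a reading aid and does not identify the root cause.

```python
import re

# One call-trace frame, for example:
#   "[  347.010620]  esp_input_done2+0x258/0x3a0 [esp4]"
# A leading "? " marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)

# Excerpt copied from the trace in this report (IRQ context only).
SAMPLE = """\
[  345.772705] RIP: 0010:__slab_free+0x18b/0x340
[  346.809314] Call Trace:
[  346.840202]  <IRQ>
[  346.865880]  ? skb_release_all+0x9/0x20
[  346.913429]  ? xfrm_input+0x2d8/0x1110
[  346.959940]  ? kmem_cache_free+0x39c/0x3c0
[  347.010620]  esp_input_done2+0x258/0x3a0 [esp4]
[  347.066503]  esp_input_done+0xd/0x20 [esp4]
[  347.118232]  adf_handle_response+0x40/0xc0 [intel_qat]
[  347.181408]  adf_response_handler+0x78/0xd0 [intel_qat]
[  347.245618]  tasklet_action_common.isra.21+0x54/0xc0
[  347.306712]  __do_softirq+0xd2/0x227
"""


def parse_frames(oops_text):
    """Return (symbol, module_or_None, is_reliable) for each frame line."""
    frames = []
    for line in oops_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (match["symbol"], match["module"], match["unreliable"] is None)
            )
    return frames


if __name__ == "__main__":
    for symbol, module, reliable in parse_frames(SAMPLE):
        marker = " " if reliable else "?"
        print(f"{marker} {symbol} [{module or 'kernel'}]")
```

Read this way, the reliable frames in the excerpt show the `esp4` completion callback (`esp_input_done` then `esp_input_done2`) being invoked from the `intel_qat` response handler inside a tasklet, with the `BUG` at `mm/slub.c:304` raised in `__slab_free`.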

Details

Difficulty level
Hard (possibly days)
Version
1.4-rolling-202104091411
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)