Filtered by vendor Linux Subscriptions
Total 17575 CVE
CVE Vendors Products Updated CVSS v3.1
CVE-2025-68232 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: veth: more robust handing of race to avoid txq getting stuck Commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") introduced a race condition that can lead to a permanently stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra Max). The race occurs in veth_xmit(). The producer observes a full ptr_ring and stops the queue (netif_tx_stop_queue()). The subsequent conditional logic, intended to re-wake the queue if the consumer had just emptied it (if (__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a "lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and traffic halts. This failure is caused by an incorrect use of the __ptr_ring_empty() API from the producer side. As noted in kernel comments, this check is not guaranteed to be correct if a consumer is operating on another CPU. The empty test is based on ptr_ring->consumer_head, making it reliable only for the consumer. Using this check from the producer side is fundamentally racy. This patch fixes the race by adopting the more robust logic from an earlier version V4 of the patchset, which always flushed the peer: (1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier are removed. Instead, after stopping the queue, we unconditionally call __veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled, making it solely responsible for re-waking the TXQ. This handles the race where veth_poll() consumes all packets and completes NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue. The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule NAPI. (2) On the consumer side, the logic for waking the peer TXQ is moved out of veth_xdp_rcv() and placed at the end of the veth_poll() function. This placement is part of fixing the race, as the netif_tx_queue_stopped() check must occur after rx_notify_masked is potentially set to false during NAPI completion. This handles the race where veth_poll() consumes all packets, but haven't finished (rx_notify_masked is still true). The producer veth_xmit() stops the TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning not starting NAPI. Then veth_poll() change rx_notify_masked to false and stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake it up.
CVE-2025-68235 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: nouveau/firmware: Add missing kfree() of nvkm_falcon_fw::boot nvkm_falcon_fw::boot is allocated, but no one frees it. This causes a kmemleak warning. Make sure this data is deallocated.
CVE-2025-68249 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: most: usb: hdm_probe: Fix calling put_device() before device initialization The early error path in hdm_probe() can jump to err_free_mdev before &mdev->dev has been initialized with device_initialize(). Calling put_device(&mdev->dev) there triggers a device core WARN and ends up invoking kref_put(&kobj->kref, kobject_release) on an uninitialized kobject. In this path the private struct was only kmalloc'ed and the intended release is effectively kfree(mdev) anyway, so free it directly instead of calling put_device() on an uninitialized device. This removes the WARNING and fixes the pre-initialization error path.
CVE-2025-68302 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: net: sxgbe: fix potential NULL dereference in sxgbe_rx() Currently, when skb is null, the driver prints an error and then dereferences skb on the next line. To fix this, let's add a 'break' after the error message to switch to sxgbe_rx_refill(), which is similar to the approach taken by the other drivers in this particular case, e.g. calxeda with xgmac_rx(). Found during a code review.
CVE-2025-68237 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: mtdchar: fix integer overflow in read/write ioctls The "req.start" and "req.len" variables are u64 values that come from the user at the start of the function. We mask away the high 32 bits of "req.len" so that's capped at U32_MAX but the "req.start" variable can go up to U64_MAX which means that the addition can still integer overflow. Use check_add_overflow() to fix this bug.
CVE-2025-68213 1 Linux 1 Linux Kernel 2025-12-18 7.0 High
In the Linux kernel, the following vulnerability has been resolved: idpf: fix possible vport_config NULL pointer deref in remove Attempting to remove the driver will cause a crash in cases where the vport failed to initialize. Following trace is from an instance where the driver failed during an attempt to create a VF: [ 1661.543624] idpf 0000:84:00.7: Device HW Reset initiated [ 1722.923726] idpf 0000:84:00.7: Transaction timed-out (op:1 cookie:2900 vc_op:1 salt:29 timeout:60000ms) [ 1723.353263] BUG: kernel NULL pointer dereference, address: 0000000000000028 ... [ 1723.358472] RIP: 0010:idpf_remove+0x11c/0x200 [idpf] ... [ 1723.364973] Call Trace: [ 1723.365475] <TASK> [ 1723.365972] pci_device_remove+0x42/0xb0 [ 1723.366481] device_release_driver_internal+0x1a9/0x210 [ 1723.366987] pci_stop_bus_device+0x6d/0x90 [ 1723.367488] pci_stop_and_remove_bus_device+0x12/0x20 [ 1723.367971] pci_iov_remove_virtfn+0xbd/0x120 [ 1723.368309] sriov_disable+0x34/0xe0 [ 1723.368643] idpf_sriov_configure+0x58/0x140 [idpf] [ 1723.368982] sriov_numvfs_store+0xda/0x1c0 Avoid the NULL pointer dereference by adding NULL pointer check for vport_config[i], before freeing user_config.q_coalesce.
CVE-2025-68314 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: drm/msm: make sure last_fence is always updated Update last_fence in the vm-bind path instead of kernel managed path. last_fence is used to wait for work to finish in vm_bind contexts but not used for kernel managed contexts. This fixes a bug where last_fence is not waited on context close leading to faults as resources are freed while in use. Patchwork: https://patchwork.freedesktop.org/patch/680080/
CVE-2025-68299 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: afs: Fix delayed allocation of a cell's anonymous key The allocation of a cell's anonymous key is done in a background thread along with other cell setup such as doing a DNS upcall. In the reported bug, this is triggered by afs_parse_source() parsing the device name given to mount() and calling afs_lookup_cell() with the name of the cell. The normal key lookup then tries to use the key description on the anonymous authentication key as the reference for request_key() - but it may not yet be set and so an oops can happen. This has been made more likely to happen by the fix for dynamic lookup failure. Fix this by firstly allocating a reference name and attaching it to the afs_cell record when the record is created. It can share the memory allocation with the cell name (unfortunately it can't just overlap the cell name by prepending it with "afs@" as the cell name already has a '.' prepended for other purposes). This reference name is then passed to request_key(). Secondly, the anon key is now allocated on demand at the point a key is requested in afs_request_key() if it is not already allocated. A mutex is used to prevent multiple allocation for a cell. Thirdly, make afs_request_key_rcu() return NULL if the anonymous key isn't yet allocated (if we need it) and then the caller can return -ECHILD to drop out of RCU-mode and afs_request_key() can be called. Note that the anonymous key is kind of necessary to make the key lookup cache work as that doesn't currently cache a negative lookup, but it's probably worth some investigation to see if NULL can be used instead.
CVE-2025-68306 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: btusb: mediatek: Fix kernel crash when releasing mtk iso interface When performing reset tests and encountering abnormal card drop issues that lead to a kernel crash, it is necessary to perform a null check before releasing resources to avoid attempting to release a null pointer. <4>[ 29.158070] Hardware name: Google Quigon sku196612/196613 board (DT) <4>[ 29.158076] Workqueue: hci0 hci_cmd_sync_work [bluetooth] <4>[ 29.158154] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) <4>[ 29.158162] pc : klist_remove+0x90/0x158 <4>[ 29.158174] lr : klist_remove+0x88/0x158 <4>[ 29.158180] sp : ffffffc0846b3c00 <4>[ 29.158185] pmr_save: 000000e0 <4>[ 29.158188] x29: ffffffc0846b3c30 x28: ffffff80cd31f880 x27: ffffff80c1bdc058 <4>[ 29.158199] x26: dead000000000100 x25: ffffffdbdc624ea3 x24: ffffff80c1bdc4c0 <4>[ 29.158209] x23: ffffffdbdc62a3e6 x22: ffffff80c6c07000 x21: ffffffdbdc829290 <4>[ 29.158219] x20: 0000000000000000 x19: ffffff80cd3e0648 x18: 000000031ec97781 <4>[ 29.158229] x17: ffffff80c1bdc4a8 x16: ffffffdc10576548 x15: ffffff80c1180428 <4>[ 29.158238] x14: 0000000000000000 x13: 000000000000e380 x12: 0000000000000018 <4>[ 29.158248] x11: ffffff80c2a7fd10 x10: 0000000000000000 x9 : 0000000100000000 <4>[ 29.158257] x8 : 0000000000000000 x7 : 7f7f7f7f7f7f7f7f x6 : 2d7223ff6364626d <4>[ 29.158266] x5 : 0000008000000000 x4 : 0000000000000020 x3 : 2e7325006465636e <4>[ 29.158275] x2 : ffffffdc11afeff8 x1 : 0000000000000000 x0 : ffffffdc11be4d0c <4>[ 29.158285] Call trace: <4>[ 29.158290] klist_remove+0x90/0x158 <4>[ 29.158298] device_release_driver_internal+0x20c/0x268 <4>[ 29.158308] device_release_driver+0x1c/0x30 <4>[ 29.158316] usb_driver_release_interface+0x70/0x88 <4>[ 29.158325] btusb_mtk_release_iso_intf+0x68/0xd8 [btusb (HASH:e8b6 5)] <4>[ 29.158347] btusb_mtk_reset+0x5c/0x480 [btusb (HASH:e8b6 5)] <4>[ 29.158361] hci_cmd_sync_work+0x10c/0x188 [bluetooth (HASH:a4fa 6)] <4>[ 29.158430] process_scheduled_works+0x258/0x4e8 <4>[ 29.158441] worker_thread+0x300/0x428 <4>[ 29.158448] kthread+0x108/0x1d0 <4>[ 29.158455] ret_from_fork+0x10/0x20 <0>[ 29.158467] Code: 91343000 940139d1 f9400268 927ff914 (f9401297) <4>[ 29.158474] ---[ end trace 0000000000000000 ]--- <0>[ 29.167129] Kernel panic - not syncing: Oops: Fatal exception <2>[ 29.167144] SMP: stopping secondary CPUs <4>[ 29.167158] ------------[ cut here ]------------
CVE-2025-68216 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: LoongArch: BPF: Disable trampoline for kernel module function trace The current LoongArch BPF trampoline implementation is incompatible with tracing functions in kernel modules. This causes several severe and user-visible problems: * The `bpf_selftests/module_attach` test fails consistently. * Kernel lockup when a BPF program is attached to a module function [1]. * Critical kernel modules like WireGuard experience traffic disruption when their functions are traced with fentry [2]. Given the severity and the potential for other unknown side-effects, it is safest to disable the feature entirely for now. This patch prevents the BPF subsystem from allowing trampoline attachments to kernel module functions on LoongArch. This is a temporary mitigation until the core issues in the trampoline code for kernel module handling can be identified and fixed. [root@fedora bpf]# ./test_progs -a module_attach -v bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_module_attach:PASS:skel_open 0 nsec test_module_attach:PASS:set_attach_target 0 nsec test_module_attach:PASS:set_attach_target_explicit 0 nsec test_module_attach:PASS:skel_load 0 nsec libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP test_module_attach:FAIL:skel_attach skeleton attach failed: -524 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Successfully unloaded bpf_testmod.ko. [1]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C13FMT58oqQ@mail.gmail.com/ [2]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcFaK93uoELPg@mail.gmail.com/
CVE-2025-68252 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: misc: fastrpc: Fix dma_buf object leak in fastrpc_map_lookup In fastrpc_map_lookup, dma_buf_get is called to obtain a reference to the dma_buf for comparison purposes. However, this reference is never released when the function returns, leading to a dma_buf memory leak. Fix this by adding dma_buf_put before returning from the function, ensuring that the temporarily acquired reference is properly released regardless of whether a matching map is found. Rule: add
CVE-2025-68225 1 Linux 1 Linux Kernel 2025-12-18 N/A
In the Linux kernel, the following vulnerability has been resolved: lib/test_kho: check if KHO is enabled We must check whether KHO is enabled prior to issuing KHO commands, otherwise KHO internal data structures are not initialized.
CVE-2025-68250 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: hung_task: fix warnings caused by unaligned lock pointers The blocker tracking mechanism assumes that lock pointers are at least 4-byte aligned to use their lower bits for type encoding. However, as reported by Eero Tamminen, some architectures like m68k only guarantee 2-byte alignment of 32-bit values. This breaks the assumption and causes two related WARN_ON_ONCE checks to trigger. To fix this, the runtime checks are adjusted to silently ignore any lock that is not 4-byte aligned, effectively disabling the feature in such cases and avoiding the related warnings. Thanks to Geert Uytterhoeven for bisecting!
CVE-2025-68308 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: can: kvaser_usb: leaf: Fix potential infinite loop in command parsers The `kvaser_usb_leaf_wait_cmd()` and `kvaser_usb_leaf_read_bulk_callback` functions contain logic to zero-length commands. These commands are used to align data to the USB endpoint's wMaxPacketSize boundary. The driver attempts to skip these placeholders by aligning the buffer position `pos` to the next packet boundary using `round_up()` function. However, if zero-length command is found exactly on a packet boundary (i.e., `pos` is a multiple of wMaxPacketSize, including 0), `round_up` function will return the unchanged value of `pos`. This prevents `pos` to be increased, causing an infinite loop in the parsing logic. This patch fixes this in the function by using `pos + 1` instead. This ensures that even if `pos` is on a boundary, the calculation is based on `pos + 1`, forcing `round_up()` to always return the next aligned boundary.
CVE-2025-68222 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: pinctrl: s32cc: fix uninitialized memory in s32_pinctrl_desc s32_pinctrl_desc is allocated with devm_kmalloc(), but not all of its fields are initialized. Notably, num_custom_params is used in pinconf_generic_parse_dt_config(), resulting in intermittent allocation errors, such as the following splat when probing i2c-imx: WARNING: CPU: 0 PID: 176 at mm/page_alloc.c:4795 __alloc_pages_noprof+0x290/0x300 [...] Hardware name: NXP S32G3 Reference Design Board 3 (S32G-VNP-RDB3) (DT) [...] Call trace: __alloc_pages_noprof+0x290/0x300 (P) ___kmalloc_large_node+0x84/0x168 __kmalloc_large_node_noprof+0x34/0x120 __kmalloc_noprof+0x2ac/0x378 pinconf_generic_parse_dt_config+0x68/0x1a0 s32_dt_node_to_map+0x104/0x248 dt_to_map_one_config+0x154/0x1d8 pinctrl_dt_to_map+0x12c/0x280 create_pinctrl+0x6c/0x270 pinctrl_get+0xc0/0x170 devm_pinctrl_get+0x50/0xa0 pinctrl_bind_pins+0x60/0x2a0 really_probe+0x60/0x3a0 [...] __platform_driver_register+0x2c/0x40 i2c_adap_imx_init+0x28/0xff8 [i2c_imx] [...] This results in later parse failures that can cause issues in dependent drivers: s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property [...] pca953x 0-0022: failed writing register: -6 i2c i2c-0: IMX I2C adapter registered s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property i2c i2c-1: IMX I2C adapter registered s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property i2c i2c-2: IMX I2C adapter registered Fix this by initializing s32_pinctrl_desc with devm_kzalloc() instead of devm_kmalloc() in s32_pinctrl_probe(), which sets the previously uninitialized fields to zero.
CVE-2025-68298 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: Bluetooth: btusb: mediatek: Avoid btusb_mtk_claim_iso_intf() NULL deref In btusb_mtk_setup(), we set `btmtk_data->isopkt_intf` to: usb_ifnum_to_if(data->udev, MTK_ISO_IFNUM) That function can return NULL in some cases. Even when it returns NULL, though, we still go on to call btusb_mtk_claim_iso_intf(). As of commit e9087e828827 ("Bluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()"), calling btusb_mtk_claim_iso_intf() when `btmtk_data->isopkt_intf` is NULL will cause a crash because we'll end up passing a bad pointer to device_lock(). Prior to that commit we'd pass the NULL pointer directly to usb_driver_claim_interface() which would detect it and return an error, which was handled. Resolve the crash in btusb_mtk_claim_iso_intf() by adding a NULL check at the start of the function. This makes the code handle a NULL `btmtk_data->isopkt_intf` the same way it did before the problematic commit (just with a slight change to the error message printed).
CVE-2025-68227 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: mptcp: Fix proto fallback detection with BPF The sockmap feature allows bpf syscall from userspace, or based on bpf sockops, replacing the sk_prot of sockets during protocol stack processing with sockmap's custom read/write interfaces. ''' tcp_rcv_state_process() syn_recv_sock()/subflow_syn_recv_sock() tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) bpf_skops_established <== sockops bpf_sock_map_update(sk) <== call bpf helper tcp_bpf_update_proto() <== update sk_prot ''' When the server has MPTCP enabled but the client sends a TCP SYN without MPTCP, subflow_syn_recv_sock() performs a fallback on the subflow, replacing the subflow sk's sk_prot with the native sk_prot. ''' subflow_syn_recv_sock() subflow_ulp_fallback() subflow_drop_ctx() mptcp_subflow_ops_undo_override() ''' Then, this subflow can be normally used by sockmap, which replaces the native sk_prot with sockmap's custom sk_prot. The issue occurs when the user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops(). Here, it uses sk->sk_prot to compare with the native sk_prot, but this is incorrect when sockmap is used, as we may incorrectly set sk->sk_socket->ops. This fix uses the more generic sk_family for the comparison instead. Additionally, this also prevents a WARNING from occurring: result from ./scripts/decode_stacktrace.sh: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \ (net/mptcp/protocol.c:4005) Modules linked in: ... PKRU: 55555554 Call Trace: <TASK> do_accept (net/socket.c:1989) __sys_accept4 (net/socket.c:2028 net/socket.c:2057) __x64_sys_accept (net/socket.c:2067) x64_sys_call (arch/x86/entry/syscall_64.c:41) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) RIP: 0033:0x7f87ac92b83d ---[ end trace 0000000000000000 ]---
CVE-2025-68287 1 Linux 1 Linux Kernel 2025-12-18 7.0 High
In the Linux kernel, the following vulnerability has been resolved: usb: dwc3: Fix race condition between concurrent dwc3_remove_requests() call paths This patch addresses a race condition caused by unsynchronized execution of multiple call paths invoking `dwc3_remove_requests()`, leading to premature freeing of USB requests and subsequent crashes. Three distinct execution paths interact with `dwc3_remove_requests()`: Path 1: Triggered via `dwc3_gadget_reset_interrupt()` during USB reset handling. The call stack includes: - `dwc3_ep0_reset_state()` - `dwc3_ep0_stall_and_restart()` - `dwc3_ep0_out_start()` - `dwc3_remove_requests()` - `dwc3_gadget_del_and_unmap_request()` Path 2: Also initiated from `dwc3_gadget_reset_interrupt()`, but through `dwc3_stop_active_transfers()`. The call stack includes: - `dwc3_stop_active_transfers()` - `dwc3_remove_requests()` - `dwc3_gadget_del_and_unmap_request()` Path 3: Occurs independently during `adb root` execution, which triggers USB function unbind and bind operations. The sequence includes: - `gserial_disconnect()` - `usb_ep_disable()` - `dwc3_gadget_ep_disable()` - `dwc3_remove_requests()` with `-ESHUTDOWN` status Path 3 operates asynchronously and lacks synchronization with Paths 1 and 2. When Path 3 completes, it disables endpoints and frees 'out' requests. If Paths 1 or 2 are still processing these requests, accessing freed memory leads to a crash due to use-after-free conditions. To fix this added check for request completion and skip processing if already completed and added the request status for ep0 while queue.
CVE-2025-68310 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: Acquire just device_lock() instead of pci_dev_lock() as powerpc's EEH and generig PCI AER processing do. During error recovery testing a pair of tasks was reported to be hung: mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working INFO: task kmcheck:72 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000 Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<000000065256f572>] schedule_preempt_disabled+0x22/0x30 [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8 [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core] [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core] [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398 [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0 INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000 Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<0000000652172e28>] pci_wait_cfg+0x80/0xe8 [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88 [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core] [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core] [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core] [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168 [<0000000652513212>] devlink_health_report+0x19a/0x230 [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core] No kernel log of the exact same error with an upstream kernel is available - but the very same deadlock situation can be constructed there, too: - task: kmcheck mlx5_unload_one() tries to acquire devlink lock while the PCI error recovery code has set pdev->block_cfg_access by way of pci_cfg_access_lock() - task: kworker mlx5_crdump_collect() tries to set block_cfg_access through pci_cfg_access_lock() while devlink_health_report() had acquired the devlink lock. A similar deadlock situation can be reproduced by requesting a crdump with > devlink health dump show pci/<BDF> reporter fw_fatal while PCI error recovery is executed on the same <BDF> physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with > zpcictl --reset-fw <BDF> Tests with this patch failed to reproduce that second deadlock situation, the devlink command is rejected with "kernel answers: Permission denied" - and we get a kernel log message of: mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5 because the config read of VSC_SEMAPHORE is rejected by the underlying hardware. Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space.
CVE-2025-68233 1 Linux 1 Linux Kernel 2025-12-18 5.5 Medium
In the Linux kernel, the following vulnerability has been resolved: drm/tegra: Add call to put_pid() Add a call to put_pid() corresponding to get_task_pid(). host1x_memory_context_alloc() does not take ownership of the PID so we need to free it here to avoid leaking. [[email protected]: reword commit message]