Bug 90262

Summary: GHES/APEI kernel panic when a network device is detected (on rare PCIe hardware)
Product: systemd Reporter: Jason S. McMullan <jason.mcmullan>
Component: general Assignee: Kay Sievers <kay>
Status: RESOLVED FIXED QA Contact: systemd-bugs
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments: systemd-stable/v208 udev net_id PCIe config read patch
systemd/master udev net_id PCIe config read patch

Description Jason S. McMullan 2015-05-01 00:01:06 UTC
Created attachment 115490 [details]
systemd-stable/v208 udev net_id PCIe config read patch

Due to fread() buffering when fetching the PCIe config space, some
rare PCIe hardware will generate a PCIe Completion Timeout when
unknown PCIe config space values are read, causing a kernel panic
on Dell r720/r730 and other systems which have APEI/GHES reporting
enabled in the Linux kernel and motherboard BIOS.

The original code in src/udev/udev-builtin-net_id.c used fread(),
which on some libc implementations (e.g. glibc 2.17) would pre-read
a full 4K (PAGE_SIZE) of the PCI config space, even though only
64 bytes were requested.
    
I have recently come across PCIe hardware which responds with
Completion Timeouts when config space accesses beyond the first
256 bytes are attempted.
    
This can make server systems with GHES/APEI support, such as the
Dell r720/r730, panic immediately because of the failed PCI
transaction.
    
Attached are patches against systemd-stable/v208 (the version that
I originally found the issue in) and systemd/master (head of line)
which correct this issue by using read() instead of fread().
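
For illustration only, a minimal sketch of the read()-based approach
described above. This is not the attached patch; the helper name
read_config_prefix() is hypothetical. It shows how an unbuffered read()
loop requests no more than the number of bytes the caller asks for,
unlike a buffered fread(), which may fetch a larger block first.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the systemd tree): read at most `len`
 * bytes from the start of `path` with unbuffered read(), so that no
 * more than `len` bytes are ever requested from the file.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_config_prefix(const char *path, uint8_t *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < len) {
        /* Each call asks only for the bytes still missing. */
        ssize_t n = read(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry */
            close(fd);
            return -1;
        }
        if (n == 0)             /* end of file: short read */
            break;
        done += (size_t)n;
    }

    close(fd);
    return (ssize_t)done;
}
```

A caller interested only in the standard 64-byte header would pass a
64-byte buffer together with the device's sysfs config file, for
example /sys/bus/pci/devices/<address>/config, where <address> is a
placeholder for the PCI address of the device.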
Comment 1 Jason S. McMullan 2015-05-01 00:02:04 UTC
Created attachment 115491 [details] [review]
systemd/master udev net_id PCIe config read patch
Comment 2 Jason S. McMullan 2015-05-22 16:45:26 UTC
Kay, any idea when these patches can be integrated into the systemd mainline?
Comment 3 Tom Gundersen 2015-05-22 18:57:04 UTC
Hi Jason,

Thanks for the patch. I have now pushed this upstream.

Cheers,

Tom
