Created attachment 140177
The attached file is the output from /sys/class/drm/card0/error

Sorry if this is more noise on top of 102397 and 102470 (both involving Chrome or Chromium), but I found this similar issue in the dmesg output:

[Tue Jun 12 19:31:46 2018] [drm] stuck on render ring
[Tue Jun 12 19:31:46 2018] [drm] GPU HANG: ecode 9:0:0x85dffffb, in electron [21270], reason: Ring hung, action: reset
[Tue Jun 12 19:31:46 2018] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Tue Jun 12 19:31:46 2018] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Tue Jun 12 19:31:46 2018] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Tue Jun 12 19:31:46 2018] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Tue Jun 12 19:31:46 2018] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[Tue Jun 12 19:31:46 2018] drm/i915: Resetting chip after gpu hang
[Tue Jun 12 19:31:48 2018] [drm] RC6 on

This was the very last dmesg entry, and grepping for "\[drm\]" returned only the same output.

$ dmesg -T | grep "\[drm\]"
[Tue Jun 12 19:31:46 2018] [drm] stuck on render ring
[Tue Jun 12 19:31:46 2018] [drm] GPU HANG: ecode 9:0:0x85dffffb, in electron [21270], reason: Ring hung, action: reset
[Tue Jun 12 19:31:46 2018] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Tue Jun 12 19:31:46 2018] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Tue Jun 12 19:31:46 2018] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Tue Jun 12 19:31:46 2018] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Tue Jun 12 19:31:46 2018] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[Tue Jun 12 19:31:48 2018] [drm] RC6 on

Also, I've attached the /sys/class/drm/card0/error log (named sys_class_drm_card0_error_20180618).

Thanks.
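For anyone triaging several machines, the "GPU HANG" line quoted above can be pulled apart mechanically. The following is only a rough sketch: the regular expression is inferred from the single log line in this report, the function name parse_gpu_hang is made up for illustration, and other kernel versions may word the message differently.

```python
import re

# Pattern inferred from the one "GPU HANG" line quoted in this report.
# This is an assumption about the message format, not a kernel-defined schema.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a 'GPU HANG' dmesg line as a dict, or None if absent."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # pid is numeric in the log line
    return fields


# The line below is copied from the dmesg output in this report.
sample = (
    "[Tue Jun 12 19:31:46 2018] [drm] GPU HANG: ecode 9:0:0x85dffffb, "
    "in electron [21270], reason: Ring hung, action: reset"
)
info = parse_gpu_hang(sample)
print(info)
```

Feeding each line of "dmesg -T" through parse_gpu_hang and keeping the non-None results would give a quick list of which process was on the GPU at each hang.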
Hmm, first things first, please update both kernel and userspace drivers, both are quite old.
Here's some extra info regarding my hardware:

--------------------------------------------------------------------------------
$ lspci
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:16.3 Serial controller: Intel Corporation Sunrise Point-H KT Redirection (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] (rev 31)
00:1b.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Root Port #17 (rev f1)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1d.2 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #11 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 04)

--------------------------------------------------------------------------------
$ sudo dmidecode --type bios
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
    Vendor: American Megatrends Inc.
    Version: 1802
    Release Date: 07/06/2016
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 16384 kB
    Characteristics:
        PCI is supported
        APM is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        BIOS ROM is socketed
        EDD is supported
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 kB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        Print screen service is supported (int 5h)
        8042 keyboard services are supported (int 9h)
        Serial services are supported (int 14h)
        Printer services are supported (int 17h)
        ACPI is supported
        USB legacy is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 5.11

Handle 0x0058, DMI type 13, 22 bytes
BIOS Language Information
    Language Description Format: Long
    Installable Languages: 8
        en|US|iso8859-1
        fr|FR|iso8859-1
        zh|CN|unicode
        <BAD INDEX>
        <BAD INDEX>
        <BAD INDEX>
        <BAD INDEX>
        <BAD INDEX>
    Currently Installed Language: en|US|iso8859-1

--------------------------------------------------------------------------------
$ sudo dmidecode --type baseboard
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: ASUSTeK COMPUTER INC.
    Product Name: Q170M-C
    Version: Rev X.0x
    Serial Number: ***************
    Asset Tag: Default string
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Default string
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x0024, DMI type 10, 8 bytes
On Board Device 1 Information
    Type: Video
    Status: Enabled
    Description: To Be Filled By O.E.M.
On Board Device 2 Information
    Type: Ethernet
    Status: Enabled
    Description: To Be Filled By O.E.M.

Handle 0x003E, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: Onboard IGD
    Type: Video
    Status: Enabled
    Type Instance: 1
    Bus Address: 0000:00:02.0

Handle 0x003F, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: Onboard LAN
    Type: Ethernet
    Status: Enabled
    Type Instance: 1
    Bus Address: 0000:00:19.0

Handle 0x0040, DMI type 41, 11 bytes
Onboard Device
    Reference Designation: Onboard 1394
    Type: Other
    Status: Enabled
    Type Instance: 1
    Bus Address: 0000:03:1c.2

--------------------------------------------------------------------------------
$ sudo dmidecode --type processor
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0045, DMI type 4, 48 bytes
Processor Information
    Socket Designation: LGA1151
    Type: Central Processor
    Family: Core i5
    Manufacturer: Intel(R) Corporation
    ID: E3 06 05 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 94, Stepping 3
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
    Voltage: 1.0 V
    External Clock: 100 MHz
    Max Speed: 3600 MHz
    Current Speed: 3200 MHz
    Status: Populated, Enabled
    Upgrade: Other
    L1 Cache Handle: 0x0042
    L2 Cache Handle: 0x0043
    L3 Cache Handle: 0x0044
    Serial Number: To Be Filled By O.E.M.
    Asset Tag: To Be Filled By O.E.M.
    Part Number: To Be Filled By O.E.M.
    Core Count: 4
    Core Enabled: 4
    Thread Count: 4
    Characteristics:
        64-bit capable
        Multi-Core
        Execute Protection
        Enhanced Virtualization
        Power/Performance Control

--------------------------------------------------------------------------------
$ sudo dmidecode --type memory
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0046, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 64 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x0047, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x0046
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 4096 MB
    Form Factor: DIMM
    Set: None
    Locator: DIMM_A1
    Bank Locator: BANK 0
    Type: DDR4
    Type Detail: Synchronous
    Speed: 2133 MHz
    Manufacturer: Kingston
    Serial Number: 33322423
    Asset Tag: 9876543210
    Part Number: 9905678-033.A00G
    Rank: 1
    Configured Clock Speed: 2133 MHz
    Minimum Voltage: Unknown
    Maximum Voltage: Unknown
    Configured Voltage: 1.2 V

(only the first RAM slot is in use)
(In reply to Chris Wilson from comment #1)
> Hmm, first things first, please update both kernel and userspace drivers,
> both are quite old.

I agree. However, in our current environment we are kind of stuck on kernel version 4.4.0-31-generic. But thanks for the reply; I'll bring that up with my team.

In the meantime, I can gather a few non-production systems, update them (apt update && apt upgrade && apt dist-upgrade), and run them under the same load to see what happens. I assume we'll have trouble reproducing the issue because I haven't noticed this before (not to say it has never happened before; I just noticed it today).

Thanks for the reply.
Please try using https://cgit.freedesktop.org/drm-tip and send dmesg with the following added to the kernel params:

drm.debug=0x1e log_buf_len=4M

Do you have a feel for how often the issue happens?
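Before capturing dmesg, it can help to confirm that the requested parameters actually made it onto the running kernel's command line. The snippet below is a minimal sketch of that check: the helper name missing_kernel_params and the sample command line are hypothetical, and on a real Linux system the string would come from /proc/cmdline.

```python
# Parameters requested in this comment.
REQUESTED = ["drm.debug=0x1e", "log_buf_len=4M"]


def missing_kernel_params(cmdline, required):
    """Return the entries of `required` that are not whole tokens of `cmdline`."""
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def read_running_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only; not called below)."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical command line for illustration; only drm.debug has been added.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet drm.debug=0x1e"
print(missing_kernel_params(sample_cmdline, REQUESTED))
```

On the affected machine, missing_kernel_params(read_running_cmdline(), REQUESTED) should return an empty list once both parameters are in place and the system has been rebooted.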
Reporter, any updates to this?
No feedback in many months, so closing as RESOLVED WORKSFORME. Please re-open if this is still the case after testing the latest https://cgit.freedesktop.org/drm-tip, and send dmesg with drm.debug=0x1e log_buf_len=4M added to the kernel params.