Bug 87163 - GPU hang when using Chromium
Summary: GPU hang when using Chromium
Status: CLOSED DUPLICATE of bug 62373
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-12-09 17:46 UTC by Ricardo M. Correia
Modified: 2017-07-24 22:50 UTC

Attachments
Contents of /sys/class/drm/card1/error (52.22 KB, text/plain)
2014-12-09 17:46 UTC, Ricardo M. Correia

Description Ricardo M. Correia 2014-12-09 17:46:25 UTC
Created attachment 110637
Contents of /sys/class/drm/card1/error

I'm getting GPU hangs when using Chromium.

systemd's journal reports the following, when using kernel 3.14.25:

Dec 08 17:26:14 wizylap kernel: [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Dec 08 17:26:14 wizylap kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Dec 08 17:26:14 wizylap kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Dec 08 17:26:14 wizylap kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Dec 08 17:26:14 wizylap kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Dec 08 17:26:14 wizylap kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE)
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) Backtrace:
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE)
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) [mi] mieq is *NOT* the cause.  It is a victim.
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) [mi] EQ overflow continuing.  100 events have been dropped.
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE)
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE) Backtrace:
Dec 08 17:26:24 wizylap display-manager-start[2025]: (EE)
Dec 08 17:26:25 wizylap kernel: [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Dec 08 17:26:25 wizylap display-manager-start[2025]: [mi] Increasing EQ size to 1024 to prevent dropped events.
Dec 08 17:26:25 wizylap display-manager-start[2025]: [mi] EQ processing has resumed after 199 dropped events.
Dec 08 17:26:25 wizylap display-manager-start[2025]: [mi] This may be caused my a misbehaving driver monopolizing the server's resources.
Dec 08 17:26:59 wizylap kernel: [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring


When using kernel 3.17.6, systemd reports the following:


Dec 08 19:29:11 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:29:11 wizylap kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Dec 08 19:29:11 wizylap kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Dec 08 19:29:11 wizylap kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Dec 08 19:29:11 wizylap kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Dec 08 19:29:11 wizylap kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Dec 08 19:29:21 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:42:46 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:42:54 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:44:16 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE)
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE) Backtrace:
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE)
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
Dec 08 19:44:30 wizylap display-manager-start[2056]: (EE) [mi] mieq is *NOT* the cause.  It is a victim.
Dec 08 19:44:31 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:44:31 wizylap display-manager-start[2056]: [mi] Increasing EQ size to 1024 to prevent dropped events.
Dec 08 19:44:31 wizylap display-manager-start[2056]: [mi] EQ processing has resumed after 61 dropped events.
Dec 08 19:44:31 wizylap display-manager-start[2056]: [mi] This may be caused my a misbehaving driver monopolizing the server's resources.
Dec 08 19:45:05 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Dec 08 19:45:05 wizylap kernel: [drm] no progress on render ring
Dec 08 19:45:05 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, reason: Ring hung, action: reset
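
To make the 3.17.6 log above easier to compare across runs, the "GPU HANG" lines can be tallied by reason and action. The following is only a rough sketch: the regular expression is inferred from the message format quoted in this report, not from the i915 source, and the helper name summarize_hangs is made up for illustration. It does not match the older 3.14-style "[drm:ring_stuck] *ERROR*" lines.

```python
import re
from collections import Counter

# Assumed format, taken from the lines quoted in this report:
# "[drm] GPU HANG: ecode -1:0x00000000, reason: Ring hung, action: reset"
HANG_RE = re.compile(
    r"\[drm\] GPU HANG: ecode (?P<ecode>[^,]+), "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def summarize_hangs(lines):
    """Count (reason, action) pairs over an iterable of journal lines."""
    counts = Counter()
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            counts[(match.group("reason"), match.group("action"))] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the log above; the middle one is not a hang record.
    sample = [
        "Dec 08 19:45:05 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, "
        "reason: Kicking stuck wait on render ring, action: continue",
        "Dec 08 19:45:05 wizylap kernel: [drm] no progress on render ring",
        "Dec 08 19:45:05 wizylap kernel: [drm] GPU HANG: ecode -1:0x00000000, "
        "reason: Ring hung, action: reset",
    ]
    for (reason, action), n in summarize_hangs(sample).items():
        print(f"{n} x {reason} (action: {action})")
```

Feeding it the full kernel journal instead of the sample list would show how many "continue" kicks preceded the eventual reset.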


Here's some more info from chrome://gpu:

Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Disabled
Rasterization: Hardware accelerated
Threaded Rasterization: Enabled
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
WebGL: Hardware accelerated
Driver Bug Workarounds
clear_uniforms_before_first_program_use
count_all_in_varyings_packing
disable_post_sub_buffers_for_onscreen_surfaces
scalarize_vec_and_mat_constructor_args
Problems Detected
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
Mesa drivers in Linux handle varyings without static use incorrectly: 333885
Applied Workarounds: count_all_in_varyings_packing
Disable partial swaps on linux drivers: 339493
Applied Workarounds: disable_post_sub_buffers_for_onscreen_surfaces
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
Raster is using a single thread.
Disabled Features: multiple_raster_threads
Version Information
Data exported   12/9/2014, 6:19:37 PM
Chrome version  Chrome/39.0.2171.71
Operating system    Linux 3.17.6
Software rendering list version 0
Driver bug list version 7.7
ANGLE commit id unknown hash
2D graphics backend Skia
Command Line Args   --ppapi-flash-path=/nix/store/ln5sj7v6ynxf1vs50y707jj3bh21rbif-chromium-binary-plugins-flash/lib/libpepflashplayer.so --ppapi-flash-version=15.0.0.239 --flag-switches-begin --ignore-gpu-blacklist --flag-switches-end
Driver Information
Initialization time 237
Sandboxed   true
GPU0    VENDOR = 0x0000, DEVICE= 0x0000
Optimus false
AMD switchable  false
Driver vendor   Mesa
Driver version  10.2.9
Driver date 
Pixel shader version    1.30
Vertex shader version   1.30
Machine model name  
Machine model version   
GL_VENDOR   Intel Open Source Technology Center
GL_RENDERER Mesa DRI Intel(R) Sandybridge Mobile
GL_VERSION  3.0 Mesa 10.2.9
...
Window system binding vendor SGI
Window system binding version   1.4
Window system binding extensions    GLX_ARB_create_context GLX_ARB_create_context_profile GLX_ARB_create_context_robustness GLX_ARB_fbconfig_float GLX_ARB_framebuffer_sRGB GLX_ARB_multisample GLX_EXT_create_context_es2_profile GLX_EXT_framebuffer_sRGB GLX_EXT_import_context GLX_EXT_texture_from_pixmap GLX_EXT_visual_info GLX_EXT_visual_rating GLX_MESA_copy_sub_buffer GLX_OML_swap_method GLX_SGI_swap_control GLX_SGIS_multisample GLX_SGIX_fbconfig GLX_SGIX_pbuffer GLX_SGIX_visual_select_group GLX_INTEL_swap_event
Window manager  Xfwm4
Compositing manager No
Direct rendering    Yes
Reset notification strategy 0x8252
GPU process crash count 0
Log Messages
[3040:3040:1209/171937:ERROR:gpu_video_decode_accelerator.cc(301)] : Not implemented reached in void content::GpuVideoDecodeAccelerator::Initialize(media::VideoCodecProfile, IPC::Message*)HW video decode acceleration not available.
[3040:3040:1209/171937:ERROR:gpu_video_decode_accelerator.cc(301)] : Not implemented reached in void content::GpuVideoDecodeAccelerator::Initialize(media::VideoCodecProfile, IPC::Message*)HW video decode acceleration not available.


I have attached the contents of /sys/class/drm/card1/error.

I have also filed an issue at https://github.com/NixOS/nixpkgs/issues/5272 but I was told to file an issue here.
Comment 1 Chris Wilson 2014-12-10 08:31:18 UTC

*** This bug has been marked as a duplicate of bug 62373 ***
Comment 2 Ricardo M. Correia 2014-12-10 20:12:01 UTC
Are you sure this is a duplicate of #62373?

Sorry if I wasn't clear, but I'm not experiencing hard lockups.

I do experience GPU hangs, which manifest themselves visibly as temporary hangs (a few seconds long), and eventually (after several minutes) Chromium stops responding, but all other applications keep working correctly.

If I start Chromium with "--disable-gpu", the GPU hangs go away.
Comment 3 Donjan 2015-03-25 15:28:04 UTC
This is most certainly not a duplicate of #62373 as the symptoms don't match (no hard lockups, using SMPlayer as default without any freezes so far).

I regularly (roughly twice a week) observe this bug while watching YouTube videos with Chromium on two otherwise flawlessly working Intel HD Graphics (Sandy Bridge and Ivy Bridge) laptops. It tends to crop up more often when using external monitors, and there have been no lockups without Chromium running so far.

The workaround is to either wait a few minutes or (which I generally do) to switch to a virtual console and back, which kills Chromium.

With the latter, my syslog at the corresponding time says:

Mar 25 15:59:39 sorin kernel: [13603.322727] [drm] stuck on render ring
Mar 25 15:59:39 sorin kernel: [13603.322729] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Mar 25 15:59:39 sorin kernel: [13603.322729] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Mar 25 15:59:39 sorin kernel: [13603.322729] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Mar 25 15:59:39 sorin kernel: [13603.322730] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Mar 25 15:59:39 sorin kernel: [13603.322730] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Mar 25 15:59:39 sorin kernel: [13603.515187] detected fb_set_par error, error code: -22
Mar 25 15:59:41 sorin kernel: [13605.297718] traps: compiz[2561] trap divide error ip:7fcb2dafd8de sp:7fff9114a610 error:0 in libstaticswitcher.so[7fcb2daf0$
Mar 25 15:59:43 sorin kernel: [13607.239522] [UFW BLOCK] IN=eth0 OUT= MAC=01:00:5e:00:00:01:c8:2a:14:12:08:00:08:00 SRC=143.129.131.124 DST=224.0.0.1 LEN=44$
Mar 25 15:59:45 sorin gnome-session[2500]: WARNING: Application 'compiz.desktop' killed by signal 8
Mar 25 15:59:45 sorin gnome-session[2500]: WARNING: App 'compiz.desktop' respawning too quickly
Mar 25 15:59:45 sorin gnome-session[2500]: CRITICAL: We failed, but the fail whale is dead. Sorry....

I can also provide the 2.3 MB /sys/class/drm/card0/error dump or any other log files if needed.
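
Since the dump is a few megabytes of text, compressing it before attaching may help. Below is a minimal sketch that gzips every non-empty card*/error file it can read. The function name collect_error_dumps, its parameters, and the output file naming are all hypothetical; reading the sysfs file usually requires root, and the sketch does not check whether the file holds a real dump or just a placeholder message.

```python
import gzip
from pathlib import Path


def collect_error_dumps(drm_root="/sys/class/drm", out_dir="."):
    """Gzip each readable, non-empty card*/error file under drm_root into out_dir.

    Returns the list of files written. Both arguments are illustrative defaults.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for err in sorted(Path(drm_root).glob("card*/error")):
        try:
            data = err.read_bytes()
        except OSError:
            continue  # unreadable (for example, insufficient permissions)
        if not data.strip():
            continue  # nothing recorded for this card
        target = out / f"{err.parent.name}-error.gz"
        with gzip.open(target, "wb") as fh:
            fh.write(data)
        saved.append(target)
    return saved


if __name__ == "__main__":
    for path in collect_error_dumps(out_dir="gpu-dumps"):
        print(f"saved {path}")
```

Because both card0 and card1 paths appear in this report, the sketch scans every card directory instead of assuming one.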
Comment 4 Donjan 2015-03-25 15:28:19 UTC
Status change.
Comment 5 Ricardo M. Correia 2015-03-25 16:19:17 UTC
For the record, I am also using an external monitor.
Comment 6 Chris Wilson 2015-03-25 17:28:45 UTC
It's the same bug. The GPU is hanging on the same instruction, and that is known to be irksome on that platform.

*** This bug has been marked as a duplicate of bug 62373 ***

