Created attachment 95634: dmesg(error)

System Environment:
--------------------------
Platform: Pineview
kernel: drm-intel-next-queued/e8e6e6012d68c4967e8f26fdd39ac95c247d4789

Bug detailed description:
---------------------------
The test randomly hangs the system on Pineview with the -nightly or -queued kernel, in about 1 of 5 runs. It also randomly reports:
[drm:i915_reset] *ERROR* Failed to reset chip: -19.

output(hang):
IGT-Version: 1.5-g20087e7 (i686) (Linux: 3.14.0-rc6_drm-intel-next-queued_e8e6e6_20140311+ i686)

dmesg(hang):
[ 48.431468] console [netcon0] enabled
[ 48.431589] netconsole: network logging started
[ 48.437292] console [netcon0] disabled
[ 48.446037] netpoll: netconsole: local port 6665
[ 48.446184] netpoll: netconsole: local IPv4 address 0.0.0.0
[ 48.446343] netpoll: netconsole: interface 'enp2s0'
[ 48.446485] netpoll: netconsole: remote port 6666
[ 48.446622] netpoll: netconsole: remote IPv4 address 10.239.47.171
[ 48.446798] netpoll: netconsole: remote ethernet address 74:d0:2b:95:69:65
[ 48.446985] netpoll: netconsole: local IP 10.239.47.176
[ 48.448644] console [netcon0] enabled
[ 48.448767] netconsole: network logging started
[ 55.945098] [drm:i915_gem_open],
[ 55.945248] [drm:intel_crtc_cursor_set], cursor off
[ 55.945354] [drm:intel_crtc_set_config], [CRTC:3] [NOFB]
[ 55.945474] [drm:intel_set_config_compute_mode_changes], computed changes for [CRTC:3], mode_changed=0, fb_changed=0
[ 55.945680] [drm:intel_modeset_stage_output_state], [CONNECTOR:5:LVDS-1] to [CRTC:4]
[ 55.945841] [drm:intel_crtc_cursor_set], cursor off
[ 55.945945] [drm:intel_crtc_set_config], [CRTC:4] [FB:14] #connectors=1 (x y) (0 0)
[ 55.946151] [drm:intel_set_config_compute_mode_changes], computed changes for [CRTC:4], mode_changed=0, fb_changed=0
[ 55.949340] [drm:intel_modeset_stage_output_state], [CONNECTOR:5:LVDS-1] to [CRTC:4]
[ 55.952422] [drm:i915_gem_open],
[ 63.707022] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring idle
[ 133.707023] [drm] no progress on render ring
[ 133.716255] [drm] GPU HANG: ecode -1:0x00000000, reason: Ring hung, action: reset
[ 133.720785] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 133.725398] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 133.730065] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 133.734848] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 133.739694] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 157.002008] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 1, t=60002 jiffies, g=4000, c=3999, q=54)
[ 157.002008] INFO: Stall ended before state dump start

output(error):
IGT-Version: 1.5-g20087e7 (i686) (Linux: 3.14.0-rc6_drm-intel-next-queued_e8e6e6_20140311+ i686)
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 22, Invalid argument
Failed assertion: ret == 0
Test assertion failure function gem_execbuf, file drmtest.c:581:
Test assertion failure function gem_execbuf, file drmtest.c:581:
Last errno: 5, Input/output error
Failed assertion: ret == 0
Last errno: 5, Input/output error
Failed assertion: ret == 0
child 0 failed with exit status 99
Subtest forked-interruptible-faulting-reloc-thrash-inactive: FAIL
gem_reloc_vs_gpu: drmtest.c:1296: children_exit_handler: Assertion `ret == 0' failed.
Aborted (core dumped)

Reproduce steps:
-------------------------
1. ./gem_reloc_vs_gpu --run-subtest forked-interruptible-faulting-reloc-thrash-inactive
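
Since the failure is intermittent (about 1 in 5 runs), it may help to loop the reproduce step and count failing runs. The following is only a rough sketch, not part of IGT: the helper name count_failures and the 300-second timeout are assumptions made for illustration, and a full system hang will of course stop the script as well.

```python
import subprocess
import sys


def count_failures(cmd, runs=5, timeout=300):
    """Run cmd `runs` times; return how many runs failed or timed out.

    Hypothetical helper for illustration. A non-zero exit status or a
    timeout counts as one failure.
    """
    failures = 0
    for i in range(runs):
        try:
            result = subprocess.run(cmd, timeout=timeout)
            failed = result.returncode != 0
        except subprocess.TimeoutExpired:
            # The test did not finish in time; treat it as a failed run.
            failed = True
        if failed:
            failures += 1
        # Flush so the progress line survives if the machine hangs next.
        print(f"run {i + 1}/{runs}: {'FAIL' if failed else 'ok'}", flush=True)
    return failures
```

For this report it would be called as count_failures(["./gem_reloc_vs_gpu", "--run-subtest", "forked-interruptible-faulting-reloc-thrash-inactive"], runs=10) from the directory containing the test binary.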
A real gpu hang without the error state attached. Can we have it please?
The system hangs too quickly for me to capture the error state.
Can you please retest with latest -nightly? The refcount fix from Chris might help ...
Created attachment 97328: dmesg

It still causes a system hang on the latest -nightly kernel.
output: IGT-Version: 1.6-g99b8f80 (i686) (Linux: 3.14.0_drm-intel-nightly_cf8c74_20140414+ i686)
Have you run your pnv box through memtest recently?
Ran 10 cycles on the latest -nightly kernel and it works well. Closing.
Verified. Fixed.
Closing old verified.