Created attachment 79875 [details]
i915_error_state

System Environment:
--------------------------
Platform: Ironlake/Sandybridge/Ivybridge/Haswell
Kernel: (drm-intel-nightly) 095f1c4ffba0e4242cb43812bc000814749b8484

Bug detailed description:
-------------------------
It fails on the drm-intel-next-queued kernel. It works well on the drm-intel-fixes kernel.
Running kms_flip/flip-vs-panning-vs-hang fails and causes a GPU hang.
After running kms_flip/flip-vs-panning-vs-hang, the following cases also fail:
igt/gem_cpu_concurrent_blit/early-read
igt/gem_double_irq_loop
igt/gem_dummy_reloc_loop/blt
igt/gem_dummy_reloc_loop/bsd
igt/gem_dummy_reloc_loop/mixed
igt/gem_dummy_reloc_loop/render
igt/gem_exec_big
igt/gem_exec_faulting_reloc
igt/gem_fenced_exec_thrash
igt/gem_gtt_speed
igt/gem_hangcheck_forcewake
igt/gem_ringfill/blitter
igt/gem_ring_sync_loop
igt/gem_set_tiling_vs_blt/tiled-to-tiled
igt/gem_set_tiling_vs_blt/tiled-to-untiled
igt/gem_storedw_loop_bsd
igt/gem_storedw_loop_render
igt/gem_tiled_blits
igt/gem_wait_render_timeout

Bisect shows: f6fccec0256c8754a8f39776070bd5ecf2eed28a is the first bad commit.

commit f6fccec0256c8754a8f39776070bd5ecf2eed28a
Author:     Mika Kuoppala <mika.kuoppala@linux.intel.com>
AuthorDate: Fri May 24 17:16:07 2013 +0300
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Sat May 25 13:24:52 2013 +0200

    drm/i915: track ring progression using seqnos

    Instead of relying in acthd, track ring seqno progression
    to detect if ring has hung.

    v2: put hangcheck stuff inside struct (Chris Wilson)

    v3: initialize hangcheck.seqno (Ben Widawsky)

    Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

output:
running testcase: flip-vs-panning-vs-hang
Beginning flip-vs-panning-vs-hang on crtc 3, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
..gem_set_domain:425 failed, ret=-1, errno=5
Aborted (core dumped)

dmesg:
[ 365.381043] [drm:i915_driver_open],
[ 365.381062] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 365.381067] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 365.381070] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[ 365.381071] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 365.381076] [drm:i915_driver_open],
[ 365.381236] [drm:i915_getparam], Unknown parameter 22
[ 365.381284] [drm:drm_mode_getresources], CRTC[2] CONNECTORS[7] ENCODERS[7]
[ 365.381292] [drm:drm_mode_getresources], CRTC[2] CONNECTORS[7] ENCODERS[7]
[ 365.381305] [drm:drm_mode_getconnector], [CONNECTOR:7:?]
[ 365.381308] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:7:VGA-1]
[ 365.381312] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0x83f40018, result 1
[ 365.381316] [drm:intel_crt_detect], CRT detected via hotplug
[ 365.394203] [drm:drm_edid_to_eld], ELD: no CEA Extension found
[ 365.394218] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:7:VGA-1] probed modes :
[ 365.394223] [drm:drm_mode_debug_printmodeline], Modeline 24:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
[ 365.394229] [drm:drm_mode_debug_printmodeline], Modeline 30:"1280x1024" 75 135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
[ 365.394235] [drm:drm_mode_debug_printmodeline], Modeline 25:"1152x864" 75 108000 1152 1216 1344 1600 864 865 868 900 0x40 0x5
[ 365.394242] [drm:drm_mode_debug_printmodeline], Modeline 31:"1024x768" 75 78800 1024 1040 1136 1312 768 769 772 800 0x40 0x5
[ 365.394248] [drm:drm_mode_debug_printmodeline], Modeline 32:"1024x768" 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
[ 365.394254] [drm:drm_mode_debug_printmodeline], Modeline 33:"800x600" 75 49500 800 816 896 1056 600 601 604 625 0x40 0x5
[ 365.394260] [drm:drm_mode_debug_printmodeline], Modeline 26:"800x600" 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
[ 365.394266] [drm:drm_mode_debug_printmodeline], Modeline 27:"640x480" 75 31500 640 656 720 840 480 481 484 500 0x40 0xa
[ 365.394272] [drm:drm_mode_debug_printmodeline], Modeline 28:"640x480" 60 25200 640 656 752 800 480 490 492 525 0x40 0xa
[ 365.394277] [drm:drm_mode_debug_printmodeline], Modeline 29:"720x400" 70 28320 720 738 846 900 400 412 414 449 0x40 0x6
[ 365.394290] [drm:drm_mode_getconnector], [CONNECTOR:7:?]
[ 366.113993] [drm:drm_mode_addfb], [FB:36]
[ 366.137475] [drm:drm_mode_addfb], [FB:37]
[ 366.163280] [drm:drm_mode_addfb], [FB:38]
[ 366.163404] [drm:drm_mode_setcrtc], [CRTC:3]
[ 366.163408] [drm:drm_mode_setcrtc], [CONNECTOR:7:VGA-1]
[ 366.163409] [drm:intel_crtc_set_config], [CRTC:3] [FB:36] #connectors=1 (x y) (0 0)
[ 366.163414] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 366.163420] [drm:ironlake_update_plane], Writing base 00547000 00000000 0 0 10240
[ 366.185725] [drm:i915_driver_open],
[ 366.185750] [drm:i915_ring_stop_set], Stopping rings 0x0000000f
[ 366.185776] [drm:drm_mode_setcrtc], [CRTC:3]
[ 366.185779] [drm:drm_mode_setcrtc], [CONNECTOR:7:VGA-1]
[ 366.185780] [drm:intel_crtc_set_config], [CRTC:3] [FB:36] #connectors=1 (x y) (0 0)
[ 366.185782] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 371.804157] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 371.804168] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[ 371.806856] [drm:i915_error_work_func], resetting chip
[ 371.806905] [drm] Simulated gpu hang, resetting stop_rings
[ 371.806917] [drm:gm45_get_vblank_counter], trying to get vblank count for disabled pipe B
[ 371.806981] [drm:ironlake_update_plane], Writing base 00547000 00000000 0 0 10240
[ 371.807013] [drm:i915_driver_open],
[ 371.807026] [drm:i915_error_state_write], Resetting error state
[ 371.807079] [drm:i915_driver_open],
[ 371.807088] [drm:i915_ring_stop_set], Stopping rings 0x0000000f
[ 371.807179] [drm:drm_mode_setcrtc], [CRTC:3]
[ 371.807182] [drm:drm_mode_setcrtc], [CONNECTOR:7:VGA-1]
[ 371.807184] [drm:intel_crtc_set_config], [CRTC:3] [FB:37] #connectors=1 (x y) (10 0)
[ 371.807188] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 371.817147] [drm:ironlake_update_plane], Writing base 00F47000 00000028 10 0 10240
[ 373.808129] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 373.808178] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[ 373.810827] [drm:i915_error_work_func], resetting chip
[ 373.810868] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[ 373.810869] [drm:i915_reset] *ERROR* Failed to reset chip.
[ 373.810879] [drm:ironlake_update_plane], Writing base 00F47000 00000028 10 0 10240
[ 383.810060] [drm:i915_gem_wait_for_error] *ERROR* Timed out waiting for the gpu reset to complete
[ 383.810195] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 383.810199] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 383.810204] [drm:ironlake_update_plane], Writing base 00047000 00000000 0 0 5120
[ 383.814011] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[ 383.814016] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 383.814020] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 383.814023] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 383.814038] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 383.814041] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 384.227548] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 384.227555] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 384.227557] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[ 384.227559] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 413.453681] [drm:i915_driver_open],
[ 413.453697] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 413.453705] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 413.453708] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[ 413.453709] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 413.453714] [drm:i915_driver_open],
[ 413.585963] [drm:intel_crtc_set_config], [CRTC:3] [FB:35] #connectors=1 (x y) (0 0)
[ 413.585970] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[ 413.585972] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[ 413.585974] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]

Reproduce steps:
----------------
1. ./kms_flip --run-subtest flip-vs-panning-vs-hang
2. ./gem_exec_big
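
For context on the bisected commit ("track ring progression using seqnos"), here is a minimal sketch of what seqno-based hang detection can look like. It is not the driver's code: the struct, the function name, and the two-tick threshold are all assumptions made for illustration. The idea is only that a periodic timer samples each ring's seqno and treats a busy ring whose seqno has not advanced across consecutive samples as hung.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-ring state; not the driver's actual hangcheck struct. */
struct ring_hangcheck_model {
    uint32_t last_seqno; /* seqno sampled on the previous timer tick */
    int stuck_ticks;     /* consecutive ticks without seqno progress */
};

/* Assumed threshold: report a hang after this many ticks without progress. */
#define HUNG_AFTER_TICKS 2

/*
 * Called once per hangcheck timer tick with the ring's current seqno.
 * Returns true once the ring is considered hung.
 */
static bool hangcheck_tick(struct ring_hangcheck_model *hc,
                           uint32_t seqno, bool ring_idle)
{
    /* An idle ring, or one whose seqno advanced, is making progress. */
    if (ring_idle || seqno != hc->last_seqno) {
        hc->last_seqno = seqno;
        hc->stuck_ticks = 0;
        return false;
    }

    /* Busy ring, same seqno as last tick: count it as stuck. */
    hc->stuck_ticks++;
    return hc->stuck_ticks >= HUNG_AFTER_TICKS;
}
```

In this model a ring stopped by the test (seqno frozen while work is pending) is reported as hung on the second tick without progress.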
After running kms_flip/flip-vs-panning-vs-hang, the following cases also fail:
igt/drm_vma_limiter_gtt
igt/gem_cs_prefetch
igt/gem_ctx_bad_destroy
igt/gem_ctx_basic
igt/prime_self_import/with_fd_dup
igt/prime_self_import/with_one_bo
igt/prime_self_import/with_two_bos
igt/gem_mmap_offset_exhaustion
igt/gem_partial_pwrite_pread/reads
igt/gem_pipe_control_store_loop
igt/gem_ringfill/render
igt/gem_threaded_access_tiled
igt/gem_tiled_partial_pwrite_pread/reads
igt/gem_tiled_pread
igt/kms_flip/blocking-absolute-wf_vblank
igt/kms_flip/blocking-wf_vblank
igt/kms_flip/delayed-flip-vs-dpms
igt/kms_flip/delayed-flip-vs-modeset
igt/kms_flip/delayed-wf_vblank-vs-dpms
igt/kms_flip/flip-vs-absolute-wf_vblank
igt/kms_flip/flip-vs-bad-tiling
igt/kms_flip/flip-vs-dpms
igt/kms_flip/flip-vs-dpms-off-vs-modeset
igt/kms_flip/flip-vs-panning-vs-hang
igt/kms_flip/flip-vs-wf_vblank
igt/kms_flip/plain-flip
igt/kms_flip/plain-flip-fb-recreate
igt/kms_flip/single-buffer-flip-vs-dpms-off-vs-modeset
igt/kms_flip/wf_vblank-ts-check
igt/kms_flip/wf_vblank-vs-dpms
igt/kms_flip/wf_vblank-vs-modeset
igt/gem_fence_thrash/bo-write-verify-none
igt/gem_fence_thrash/bo-write-verify-threaded-x
igt/gem_fence_thrash/bo-write-verify-threaded-y
igt/gem_fence_thrash/bo-write-verify-x
igt/gem_fence_thrash/bo-write-verify-y
igt/gem_exec_bad_domains/cpu-domain
igt/gem_exec_bad_domains/gtt-domain
igt/drm_vma_limiter_gtt
This is a side-effect of Mika's patch. It would appear that we are able to detect the hangs much quicker and two simulated hangs are being promoted to 'wedged'.
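
To illustrate the promotion to 'wedged': in the dmesg above the two "Hangcheck timer elapsed" errors are logged at 371.80 and 373.81, about two seconds apart, and the second reset is refused with "GPU hanging too fast, declaring wedged!". The sketch below models that rate limit and one possible way to exempt simulated hangs. The names, the 5-second window, and the exemption flag are assumptions for illustration, not the contents of the attached patch.

```c
#include <stdbool.h>

/* Assumed window: two resets closer together than this count as "too fast". */
#define RESET_WINDOW_SECS 5

/* Hypothetical model of the error state; not the driver's actual struct. */
struct gpu_error_model {
    long last_reset;     /* time of the previous accepted reset, in seconds */
    bool simulated_hang; /* true when the hang was injected by stopping rings */
};

/*
 * Returns true if the reset should be refused and the GPU declared wedged.
 * With exempt_simulated set, hangs injected by the test never count toward
 * the "hanging too fast" limit.
 */
static bool reset_declares_wedged(struct gpu_error_model *e, long now,
                                  bool exempt_simulated)
{
    bool too_fast = (now - e->last_reset) < RESET_WINDOW_SECS;

    if (too_fast && !(exempt_simulated && e->simulated_hang))
        return true;

    e->last_reset = now; /* reset accepted; remember when it happened */
    return false;
}
```

With the exemption off, resets at t=371 and t=373 make the second call return true, matching the failure in the log. With the exemption on, both simulated hangs are reset normally.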
Created attachment 79887 [details] [review] Avoid promoting a simulated GPU hang to wedged
(In reply to comment #3)
> Created attachment 79887 [details] [review]
> Avoid promoting a simulated GPU hang to wedged

Fixed by this patch.
Fixed.
Verified. Fixed.
(for credit)
Closing old Verified+Fixed.