| Summary: | [BSW] GPU hang at the second cycle to execute S4 command |
|---|---|
| Product: | DRI |
| Reporter: | wendy.wang |
| Component: | DRM/Intel |
| Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED |
| QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | blocker |
| Priority: | high |
| CC: | intel-gfx-bugs |
| Version: | unspecified |
| Hardware: | Other |
| OS: | All |
| Whiteboard: | |
| i915 platform: | |
| i915 features: | |
Created attachment 110000 [details]
s4-good-dmesg-after-i915- disable
I will report how the (-fixes, -next-queued, -nightly) branches behave with this bug tomorrow.

Works for me on fab2 w/ V43 BIOS. Can you try the same?

Hello Ville,

We are having trouble upgrading the BIOS to V43 on BSW fab1 and fab2: once the BIOS is upgraded to V43, the board debug LED always shows 0000 and the system cannot boot. We are still analyzing this.

I then double-checked the S4 behavior. With Fab2 + V41.0 BIOS or Fab2 + V40.0 BIOS, the second cycle of the S4 command "echo disk > /sys/power/state" fails and the system cannot enter S4. With the i915 display disabled via the "modprobe.blacklist=i915" parameter, the system enters S4 successfully multiple times and there is no S4 problem.

I attached S4-fail-dmesg.log and S4-good-dmesg-i915-disable.log for your analysis. The Fab1 board + V41.0 BIOS has the same S4 failures as Fab2. I'm not sure whether this is a regression, as I have not found a working kernel yet.

Created attachment 110032 [details]
s4faildmesg-fab2
Created attachment 110033 [details]
s4gooddmesg-i915disable-fab2
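When comparing the failing and the good dmesg logs, it helps to record whether i915 was actually loaded or blacklisted for each run. The following is a minimal sketch, not part of the original report. It assumes the usual locations /sys/module and /proc/cmdline; the function names `i915_loaded` and `i915_blacklisted` are hypothetical, and both accept an alternative path so they can be tried on copies of the files.

```shell
#!/bin/sh
# Sketch: report whether i915 is present or blacklisted before an S4 test.
# Default paths are the usual sysfs/procfs locations (assumption); pass a
# different path as the first argument to check a saved copy instead.

i915_loaded() {
    # A loaded (or built-in) module has a directory under /sys/module.
    [ -d "${1:-/sys/module}/i915" ]
}

i915_blacklisted() {
    # True when the kernel command line has modprobe.blacklist=... listing i915.
    grep -Eq '(^| )modprobe\.blacklist=([^ ]*,)?i915(,| |$)' "${1:-/proc/cmdline}"
}
```

For example, `i915_loaded && echo "i915 is loaded"` before each `echo disk > /sys/power/state` makes it clear which configuration a given dmesg log belongs to.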
There's a GPU hang in there. Are you able to run any GPU workload after a single S4 cycle?

Hello Ville,

Tested on BSW FAB2 B1 CPU with V40 BIOS. KSC is 1.05.
Kernel tag: drm-intel-testing 2014-11-21

Reproduce scenario 1:
1. Boot the system and run "xinit &".
2. Put the system into S4 with the command "echo disk > /sys/power/state".
3. Resume the system by pressing the power button.
4. Check /sys/kernel/debug/dri/0/i915_error_state: there is no error.
5. pkill X and restart X: a GPU hang error occurs, please see the attached gpuhang_after_s4-after_xinit.log.

Reproduce scenario 2:
1. Boot the system and put it into S4 with the command "echo disk > /sys/power/state".
2. Resume the system by pressing the power button.
3. There is no GPU hang error in /sys/kernel/debug/dri/0/i915_error_state.
4. On the second execution of "echo disk > /sys/power/state", the GPU hangs, please refer to the attached i915_error_state_2nd_S4_execute.log.

Created attachment 110101 [details]
gpuhang_after_s4-after_xinit.log
Created attachment 110102 [details]
i915_error_state_2nd_S4_execute.log
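The error-state check used in both scenarios above can be scripted so that it is run the same way after every resume. This is only a sketch under stated assumptions: it assumes the driver writes the text "no error state collected" when nothing has been captured, and the function name `gpu_error_state` is hypothetical. The default path is the one quoted in the scenarios.

```shell
#!/bin/sh
# Sketch: classify the i915 error state after an S4 resume.
# Assumption: the file contains "no error state collected" when no hang
# has been recorded; any other readable content is treated as a hang.

gpu_error_state() {
    file=${1:-/sys/kernel/debug/dri/0/i915_error_state}
    if [ ! -r "$file" ]; then
        # debugfs not mounted, or not running as root
        echo "unreadable"
    elif grep -q 'no error state collected' "$file"; then
        echo "clean"
    else
        echo "hang"
    fi
}
```

Calling `gpu_error_state` after each resume, or on a saved copy such as the attached log, gives a one-word result that is easy to paste into a comment.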
Based on the error state it just hung on the first command it tried to execute from the blitter ring, which in this case was an LRI. So the CS seems pretty much dead here if even a simple LRI doesn't work.

What does 'intel_reg_read 0x9400' say?

Created attachment 110163 [details] [review]
[PATCH] drm/i915: Don't frob Gunit registers on CHV

Random idea of the day. Please try this and report back.

(In reply to Ville Syrjala from comment #11)
> Based on the error state it just hung on the first command it tried to
> execute from the blitter ring, which in this case was an LRI. So the CS
> seems pretty much dead here if even a simple LRI doesn't work.
>
> What does 'intel_reg_read 0x9400' say?

After the GPU hang, I checked as below:

root@x-bsw03:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tools# intel_reg_read 0x9400
0x9400 : 0x80

(In reply to Ville Syrjala from comment #12)
> Created attachment 110163 [details] [review]
> [PATCH] drm/i915: Don't frob Gunit registers on CHV
>
> Random idea of the day. Please try this and report back.

Hello Ville, I applied your patch on top of the latest drm-intel-nightly branch kernel, and the system still cannot enter S4 on the 2nd cycle. I will send you the dmesg file tomorrow.

Created attachment 110293 [details]
v47_s4_dmesg.log
Tested S4 with BIOS v47 and the i915 driver loaded: we still observed the GPU hang issue at the 2nd S4 entry cycle. Log files attached: v47_s4_dmesg.log, v47_S4_i915_error.log

Configuration:
Platform
Board: Braswell RVP Fab2
CPU: B1 1.36GHz 2Cores/4Thread 6/12/2 E6XC
Software
Linux distribution: Ubuntu 14.04 LTS 64 bits
GFX Kernel tag: drm-intel-testing 2014-11-21
BIOS: BSW_SPI_1_r8_BRASWEL_X64_R_0047_00_ME-2.0.0.1033
Ksc: 1.05

Created attachment 110294 [details]
v47_S4_i915_error.log
(In reply to wendy.wang from comment #17)
> Created attachment 110294 [details]
> v47_S4_i915_error.log

The mime types of your attachments are wrong, please fix.

Created attachment 110337 [details]
v47_S4_i915_error--reattach
Re-attached v47_S4_i915_error.log, pls check, thanks.
(In reply to Ville Syrjala from comment #18)
> (In reply to wendy.wang from comment #17)
> > Created attachment 110294 [details]
> > v47_S4_i915_error.log
>
> The mime types of your attachments are wrong, please fix.

I sent you an email with the logs for the V47 BIOS + S4 test results. v47_S4_i915_error.log is a zip file.

(In reply to Ville Syrjala from comment #12)
> Created attachment 110163 [details] [review]
> [PATCH] drm/i915: Don't frob Gunit registers on CHV
>
> Random idea of the day. Please try this and report back.

Hello Ville,

The S4 symptoms I observed with this patch are hard to describe, so I list them here as two kinds of status:

Status 1: After booting the kernel with this patch, on the 2nd attempt to enter S4 the system hung, with no response from the keyboard.

Status 2: I encountered a different call trace when running the S4 command for the 2nd or 3rd time, which seems unrelated to i915. In this scenario I did not see the GPU hang problem.

I captured some dmesg logs, in case you are interested in them.

Created attachment 110343 [details]
patch_s4_calltrace_dmesg2.log
Created attachment 110344 [details]
patch_s4_dmesg1.log
Created attachment 110345 [details]
Patch_S4_Resume-V45-dmesg3.log
Created attachment 110346 [details]
Patch_S4_Resume-V45-dmesg4.log
Created attachment 111261 [details]
With HDMI connected dmesg log
I connected an HP2309P monitor and tried S3/S4. I was able to suspend/resume with S4 4 times. The dmesg log is attached.
I manually tested S4 ten times with the latest nightly kernel (75ce8a) and drm-intel-testing-2015-01-30, and S4 works well. BIOS version: v55.

Verified this bug.

Closing old verified.
Created attachment 109999 [details]
s4-fail-dmesg

==System Environment==
BSW RVP FAB1 with B1 CPU
BIOS: V41.0
KSC: 1.05

==Failed Kernel==
BSW alpha release kernel: tag: drm-intel-testing 2014-11-21

==Bug detailed description==
-----------------------------
With the i915 driver loaded, the failure symptom is as follows:
1. Execute "echo disk > /sys/power/state" for the first time.
2. The test machine enters S4 successfully, then automatically resumes from S4.
3. On the second attempt to execute "echo disk > /sys/power/state", the system fails to enter S4.

The s4-fail-dmesg log file is attached.

With the i915 display disabled via the "modprobe.blacklist=i915" parameter, this S4 issue does not occur: the system can still be put into S4 the second time, and it does not wake up automatically. One good dmesg file is attached for comparison: s4-good-dmesg-i915-disable log
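The reproduction steps in the description above can be sketched as a small loop that repeats the S4 command and stops at the first cycle that fails. This is an illustrative sketch, not the script used for the report: the function name `s4_cycles` is hypothetical, and it assumes that a failed attempt to enter S4 shows up as a failed write to the state file (a hard system hang would not be detected this way). The state file path is a parameter so the loop can be dry-run against an ordinary file.

```shell
#!/bin/sh
# Sketch: repeat "echo disk > /sys/power/state" N times, stopping on failure.
# Usage: s4_cycles COUNT [STATE_FILE]; STATE_FILE defaults to the real
# sysfs file, which requires root and really hibernates the machine.

s4_cycles() {
    count=$1
    state=${2:-/sys/power/state}
    i=1
    while [ "$i" -le "$count" ]; do
        # On real hardware the write returns only after resume; a failed
        # write is taken to mean the system could not enter S4.
        if ! echo disk > "$state"; then
            echo "cycle $i: failed to enter S4"
            return 1
        fi
        echo "cycle $i: ok"
        i=$((i + 1))
    done
    return 0
}
```

On the affected setup, `s4_cycles 2` would be expected to report a failure on cycle 2 with i915 loaded, while the same call with i915 blacklisted should complete both cycles.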