Bug 70740 - HiZ on SNB causes GPU hang with WebGL web app
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: All All
Importance: medium critical
Assigned To: Chad Versace
QA Contact: Intel 3D Bugs Mailing List
Reported: 2013-10-21 21:30 UTC by Joe Konno
Modified: 2013-12-21 00:30 UTC

Attachments
intel_gpu_abrt.tar (2.63 MB, application/octet-stream)
2013-11-14 23:40 UTC, Xinkai Chen
intel_gpu_abrt.tar.zip (414.59 KB, text/plain)
2013-12-16 21:20 UTC, mystic_x

Description Joe Konno 2013-10-21 21:30:58 UTC
If HiZ is enabled in any recent Mesa on a "sandybridge" platform, the Chrome browser with WebGL enabled will hang the GPU within 10-30 minutes. The hang is not seen on "ivybridge".

To reproduce on a "sandybridge" machine:
  1. Build Mesa with HiZ enabled
  2. Enable WebGL in the Chrome browser
  3. Log into Google Maps, and "try the new maps"

Expected result: Can enjoy WebGL web application to heart's content.

Actual result: Within 10-30 minutes, GPU hangs. Game over.
Comment 1 Kenneth Graunke 2013-10-28 19:35:18 UTC
Joe,

I just landed a bunch of Sandybridge GPU hang fixes on Mesa master.

Currently we have two known GPU hangs:
- IPEHR = 0x7905 (3DSTATE_DEPTH_BUFFER)
  => bug #70151 - should be fixed with Mesa master as of today
- IPEHR = 0x0b160001 (MI_SEMAPHORE_MBOX)
  => bug #54226 - still unsolved; a kernel issue.

Either /sys/class/drm/card0/error (modern kernels) or /sys/kernel/debug/dri/0/i915_error_state (older kernels) will tell you which issue you have.  You can decode it with intel_error_decode from intel-gpu-tools.  Or just attach it.

The one that I fixed apparently is much more common, and we've had other reports of Google Maps triggering it.  So most likely, your bug is fixed.

Could you try master?  Thanks!

--Ken
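
The triage step described in this comment can be sketched in Python. This is only an illustrative helper, not part of intel-gpu-tools: the function names are hypothetical, and matching on the upper 16 bits of IPEHR assumes that "0x7905" above is shorthand for a full header dword beginning with those digits.

```python
import os
import re

# Paths named in the comment: sysfs on modern kernels, debugfs on older ones.
ERROR_STATE_PATHS = (
    "/sys/class/drm/card0/error",
    "/sys/kernel/debug/dri/0/i915_error_state",
)

# IPEHR values quoted in the comment, mapped to the bug that tracks each hang.
KNOWN_HANGS = {
    0x7905: ("3DSTATE_DEPTH_BUFFER", "bug #70151"),
    0x0B160001: ("MI_SEMAPHORE_MBOX", "bug #54226"),
}


def find_error_state(paths=ERROR_STATE_PATHS):
    """Return the first readable error-state path, or None."""
    for path in paths:
        if os.access(path, os.R_OK):
            return path
    return None


def extract_ipehr(text):
    """Return the first IPEHR value found in an error-state dump, or None."""
    match = re.search(r"^\s*IPEHR:\s*(0x[0-9a-fA-F]+)", text, re.MULTILINE)
    return int(match.group(1), 16) if match else None


def classify(ipehr):
    """Match IPEHR against the known hangs.

    Assumption: a known value matches either the full dword or its
    upper 16 bits (the command opcode).
    """
    for key, info in KNOWN_HANGS.items():
        if ipehr == key or (ipehr >> 16) == key:
            return info
    return None


if __name__ == "__main__":
    path = find_error_state()
    if path is None:
        print("no readable error state found (root may be required)")
    else:
        with open(path) as f:
            ipehr = extract_ipehr(f.read())
        if ipehr is None:
            print("no IPEHR line in", path)
        else:
            print(hex(ipehr), classify(ipehr) or "not one of the two known hangs")
```

An IPEHR that matches neither entry, such as the 0x7a000002 reported later in this bug, comes back as `None`, which suggests a different hang from the two listed here.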
Comment 2 Gavin Hindman 2013-10-28 19:59:09 UTC
We'll probably need to cherry-pick. Are the commits in bug #70151 the proper set?
Comment 3 Kenneth Graunke 2013-10-28 21:12:44 UTC
Correct.

I spoke to Carl this morning, and the plan is to skip them for 9.2.3 (which comes out this Friday) but include them in the 9.2.4 release (in about two weeks).  That way, there's more time for people to test them, and make sure they don't actually cause new GPU hang issues.  (Sandybridge is notoriously picky about flushing...)
Comment 4 James Ausmus 2013-11-04 22:54:41 UTC
According to the reporter, it still crashes. Render state output below:

render command stream:
  HEAD: 0x07a02ad0
  TAIL: 0x00003598
  CTL: 0x0001f001
  ACTHD: 0x04b461c0
  IPEIR: 0x00000000
  IPEHR: 0x7a000002
  INSTDONE: 0xfffffffb
    busy: HIZ
  BBADDR: 0x4b461c18000020b
  INSTPS: 0x8000020b
  INSTPM: 0x00000080
  FADDR: 0x04b46380
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000 [last synced 0x00000000]
  SYNC_1: 0x0000bbaf [last synced 0x0000bbae]
  seqno: 0x0000bbd4
  waiting: yes
  ring->head: 0x000178a0
  ring->tail: 0x00003598
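
For readers comparing their own dumps against this one, a minimal Python sketch of pulling the key fields out of the render ring state follows. The parser and its function names are hypothetical, and the sample is a subset of the dump above. The reading of IPEHR assumes the usual 3D command header layout (opcode in the upper 16 bits, dword length minus 2 in the low byte), and the identification of opcode 0x7a00 as PIPE_CONTROL is from memory of Mesa's i965 defines, so treat both as assumptions to verify.

```python
# Subset of the ring state quoted in this comment.
DUMP = """\
render command stream:
  ACTHD: 0x04b461c0
  IPEIR: 0x00000000
  IPEHR: 0x7a000002
  INSTDONE: 0xfffffffb
    busy: HIZ
  SYNC_1: 0x0000bbaf [last synced 0x0000bbae]
  waiting: yes
"""


def parse_ring_state(text):
    """Turn 'KEY: value' lines into a dict, skipping headers with no value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def command_opcode(ipehr):
    """Upper 16 bits of the header dword (assumed to identify the command)."""
    return ipehr >> 16


fields = parse_ring_state(DUMP)
ipehr = int(fields["IPEHR"], 16)

print("opcode:", hex(command_opcode(ipehr)))   # 0x7a00, believed to be PIPE_CONTROL
print("length field:", ipehr & 0xFF)           # dword count minus 2, if the layout assumption holds
print("busy unit:", fields.get("busy"))        # unit reported busy under INSTDONE
```

Under those assumptions, the dump reads as a hang on a flush-type command while the HiZ unit is still busy, which differs from both IPEHR values listed in Comment 1.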
Comment 5 Xinkai Chen 2013-11-14 23:40:27 UTC
Created attachment 89240
intel_gpu_abrt.tar

sudo intel_gpu_abrt
No protocol specified
Can't open display :0
intel_gpu_abrt.tar has been created.

System environment:
-- chipset:
-- system architecture: x86_64
-- xf86-video-intel:
-- xserver: 1.14.4
-- mesa: 9.2.3
-- libdrm: 2.4.47
-- kernel: 3.12.0-1-ARCH
-- Linux distribution: Archlinux up-to-date
-- Machine or mobo model:
-- Display connector:

I searched for bugs with IPEHR 0x7a000002 and found this one. The hang happened during a Hearthstone game running under Wine, after I installed kernel 3.12 and Mesa 9.2.3 today.

This hang takes longer to recover from than bug #70151, which I was also a victim of. The screen doesn't simply freeze during the hang; it looks more like a 30-second slideshow of corrupted images.
Comment 6 mystic_x 2013-12-16 21:20:42 UTC
Created attachment 90845
intel_gpu_abrt.tar.zip

I think I'm running into the same issue with WebGL. I also searched for IPEHR 0x7a000002 and found this report.

System environment:
-- chipset: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
-- system architecture: x86_64
-- xf86-video-intel: 2.21.15-1
-- xserver: 1.14.5
-- mesa: 9.2.5-1
-- libdrm: 2.4.50
-- kernel: 3.12.5-1-ARCH
-- Linux distribution: Arch Linux up-to-date
-- Machine or mobo model: Lenovo x220i
-- Display connector:

Steps to reproduce:
- Install Chromium Version 31.0.1650.63 (238485)
- Activate new Google Maps with WebGL
- Browse Maps

Took about a minute to recover.
Comment 7 Chad Versace 2013-12-19 19:43:09 UTC
Google has committed a fix to the ChromeOS tree that fixes HiZ hangs in Google Maps. The fix is here: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd/media-libs/mesa/files/9.1-i965-Add-workaround-for-HIZ-resolves.patch

I have refactored the fix to be more suitable for upstreaming to master. I'm validating the fix, and will submit it to mesa-dev (and attach it to this ticket) if validation succeeds.
Comment 8 Chad Versace 2013-12-20 12:53:41 UTC
Sent a fix to mesa-dev. I verified the fix on SNB Chrome OS.

http://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg49726.html
Comment 9 Chad Versace 2013-12-21 00:30:08 UTC
Fixed by the following commit, at least on Chrome OS. If the patch didn't fix the hang for your distro, please re-open the bug.

commit 1a928816a1b717201f3b3cc998a42731b280e6ba
Author: Chad Versace <chad.versace@linux.intel.com>
Date:   Fri Dec 20 04:39:03 2013 -0800

    i965/gen6: Fix HiZ hang in WebGL Google Maps

    Emitting flushes before depth and hiz resolves at the top of blorp's
    state emission fixes the hang. Marchesin and I found the fix
    experimentally, as opposed to adhering to a documented hardware
    workaround.  A more minimal fix likely exists, but this gets the job
    done.

    Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS.
    Tested by zooming in and out continuously for 2 hours.

    This patch is based on
    https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd

    CC: mesa-stable@lists.freedesktop.org
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
    Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
    Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
    Reviewed-by: Paul Berry <stereotype441@gmail.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>