Bug 96291 - [BDW/SKL] GPU hangs with programs using compute shaders
Status: VERIFIED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Kenneth Graunke
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 96743
 
Reported: 2016-05-31 13:16 UTC by Eero Tamminen
Modified: 2016-06-30 08:44 UTC (History)
CC List: 0 users

See Also:
i915 platform:
i915 features:



Description Eero Tamminen 2016-05-31 13:16:49 UTC
Setup:
- SKL GT2
- Ubuntu 16.04:
  - latest Ubuntu kernel 4.4.0-22 (upstream v4.4.8 + i915 driver from v4.6-rc3)
- GfxBench v4: https://gfxbench.com/linux-download/
- Mesa from today (f87352d76966b6e4b0ab5fa9129ccd1ade0c2e7a)

Use-case:
- Start GfxBench
- Run Manhattan v3.1 test
- Run Car Chase test

Expected outcome:
- No hangs

Actual outcome:
- recovered GPU hangs and long pauses in Manhattan v3.1 output
- recovered GPU hangs and no output from Car Chase

Notes:
- I remember running these tests early this year using Mesa GL/GLSL overrides, after CS was added for GLES, with *no* hangs, i.e. this seems to be a regression
- I also get hangs with Ubuntu's Mesa 11.2
- Hangs are compute shading related, as I get hangs only with tests using CS
- The SynMark CSDof test has similar hangs, but CSCloth doesn't.  CSDof requires SIMD32 for CS, but the GfxBench tests and CSCloth don't
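
For readers unfamiliar with why a compute shader can "require" SIMD32, here is a minimal sketch in C. It assumes the usual model in which one hardware thread runs `simd_width` invocations, so a workgroup needs `ceil(group_size / simd_width)` threads and must fit under a per-workgroup thread limit. The function names and the numbers in the usage note are hypothetical and are not taken from Mesa or from this report.

```c
#include <assert.h>

/* Hypothetical helper: hardware threads needed to run one workgroup
 * when each thread executes simd_width invocations. Rounds up. */
static unsigned threads_for_group(unsigned group_size, unsigned simd_width)
{
    assert(simd_width > 0);
    return (group_size + simd_width - 1) / simd_width;
}

/* Return the narrowest of SIMD8/16/32 whose thread count fits under
 * max_threads, or 0 if even SIMD32 does not fit. */
static unsigned min_simd_width(unsigned group_size, unsigned max_threads)
{
    static const unsigned widths[] = { 8, 16, 32 };

    for (unsigned i = 0; i < 3; i++) {
        if (threads_for_group(group_size, widths[i]) <= max_threads)
            return widths[i];
    }
    return 0;
}
```

With an assumed limit of 56 threads, `min_simd_width(1024, 56)` yields 32 because SIMD16 would still need 64 threads, while `min_simd_width(256, 56)` yields 8.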

I also tested these kernels, but they didn't help:
- Ubuntu kernel 4.4.0-2 (Upstream 4.4 with i915 from same version)
- drm-intel-nightly kernel 4.6-rc4 (right after SKL PIPE_CONTROL fix)
- yesterday's drm-intel-nightly (4.7.0-pre)
Comment 1 Kenneth Graunke 2016-06-01 22:45:57 UTC
Gl43CSDof hits an assertion in a debug build.  I'm pretty sure that the 'hsw-cs-cross-thread-constants-v4' branch of git://people.freedesktop.org/~jljusten/mesa should fix the hangs.

I don't think Gl43CSDof is rendering correctly, though.
Comment 2 Kenneth Graunke 2016-06-02 06:07:42 UTC
Jordan's patches have landed.  Are you still seeing GPU hangs?
Comment 3 Kenneth Graunke 2016-06-02 07:14:40 UTC
gl_manhattan31 and Gl43CSDof no longer seem to hang, but Car Chase still does.  The GPU hang I'm seeing actually looks like it's the scalar TCS backend's fault.  If I switch to the vec4 TCS backend, I get a different hang, though.
Comment 4 Eero Tamminen 2016-06-02 08:44:00 UTC
(In reply to Kenneth Graunke from comment #3)
> gl_manhattan31 and Gl43CSDof no longer seem to hang, but Car Chase still
> does.

I can reproduce these findings on my SKL GT2.

CSDof also still has some misrendering at the bottom of its screen (at least in FullHD fullscreen).
Comment 5 Kenneth Graunke 2016-06-03 09:19:48 UTC
Hmm...the Car Chase hangs look like some kind of memory corruption.  I'm seeing batchbuffers with compute commands where the first part of the batchbuffer is full of garbage.  The upload BO appears to be directly before it in the address space, and altering the size of that helps with the GPU hangs (but it may just be moving the buffers around).  I see corruption in the images when it doesn't hang, so maybe it's overwriting the image with garbage instead.
Comment 6 Eero Tamminen 2016-06-03 09:58:45 UTC
Are you able to reproduce the garbage at the bottom of the CSDof window?  Maybe it has the same cause.
Comment 7 Kenneth Graunke 2016-06-03 10:15:16 UTC
Yes, I can.  That's a good idea - it's likely to be simpler.  I've got some hacks that appear to make the CSDof corruption go away...just need to narrow it down.
Comment 8 Kenneth Graunke 2016-06-03 10:36:41 UTC
Altering the brw_cs.c call to brw_get_scratch_bo to multiply the size by 3 appears to fix CSDof's rendering issues, and the hangs in Car Chase.  That doesn't seem necessary, though.

Curro did find something that looked like a hardware bug with swapping out scratch buffers a while back.  That may be the real problem.
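
A rough sketch in C of why enlarging the scratch buffer could mask the problem: it assumes each hardware thread writes to its own slice at `thread_id * per_thread_bytes`, and that the buffer was sized for fewer threads than actually run. The helper names and all numbers are hypothetical; this is not the brw_cs.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset at which a thread's scratch slice starts (assumed layout). */
static uint64_t scratch_offset(uint32_t thread_id, uint32_t per_thread_bytes)
{
    return (uint64_t)thread_id * per_thread_bytes;
}

/* True if the slice for thread_id extends past the end of a buffer that
 * was sized for alloc_threads threads. */
static bool scratch_overruns(uint32_t thread_id, uint32_t per_thread_bytes,
                             uint32_t alloc_threads)
{
    assert(per_thread_bytes > 0);

    uint64_t bo_size = (uint64_t)alloc_threads * per_thread_bytes;
    uint64_t slice_end = scratch_offset(thread_id, per_thread_bytes)
                         + per_thread_bytes;
    return slice_end > bo_size;
}
```

Under these assumptions, a buffer sized for 56 threads is overrun by thread 56, whereas the same buffer multiplied by 3 (168 threads' worth) is not. Any write past the end lands in whatever buffer follows in the address space, which would be consistent with the garbage seen in neighbouring batchbuffers and images.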
Comment 9 Eero Tamminen 2016-06-09 10:44:02 UTC
With these patches, remaining SKL GPU hangs in CSDof & CarChase seem to go away:
- https://patchwork.freedesktop.org/patch/92546/
- https://patchwork.freedesktop.org/patch/92545/
Comment 10 Kenneth Graunke 2016-06-10 01:48:21 UTC
FWIW, Curro and I found piles more problems on Haswell today.  I'm working on a v2 series which is up to 8 patches so far.  We also found Ivybridge and Baytrail problems, which I think should be fixed, but I still need to test those.  I also need to port all of these fixes over to the Vulkan driver.
Comment 11 Kenneth Graunke 2016-06-12 08:04:20 UTC
Fixed by commits:

3f48548a6f65fe90b97956c7be73268917c6f2f9..0fb85ac08d61d365e67c8f79d6955e9f89543560

      i965: Fix shared local memory size for Gen9+.
      i965: Set subslice_total on Gen7/7.5 platforms.
      i965: Allocate scratch space for the maximum number of compute threads.
      i965: Account for poor address calculations in Haswell CS scratch size.
      i965: Fix Haswell CS per-thread scratch space encoding.
      i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
      i965: Assert that the scratch spaces are in range.
      i965: Use the correct number of threads for compute shaders.

All are marked for cherry-picking to the 12.0 branch as well.
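
The encoding and range-assertion commits can be pictured with a minimal sketch in C. It assumes the per-thread scratch size is programmed as a power-of-two field relative to some minimum size, and that the minimum differs between hardware units. The function names and the 1 KB / 2 KB / 2 MB values in the usage note are illustrative assumptions, not values stated in this thread or copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v must be in 1..2^31). */
static uint32_t round_up_pow2(uint32_t v)
{
    assert(v >= 1 && v <= (1u << 31));

    uint32_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* Hypothetical encoder: returns n such that (min_bytes << n) is the
 * smallest power-of-two size >= needed_bytes, asserting the result stays
 * within max_bytes. min_bytes is the size that field value 0 stands for. */
static unsigned encode_scratch_pow2(uint32_t needed_bytes, uint32_t min_bytes,
                                    uint32_t max_bytes)
{
    assert(min_bytes > 0 && (min_bytes & (min_bytes - 1)) == 0);

    uint32_t size = round_up_pow2(needed_bytes < min_bytes ? min_bytes
                                                           : needed_bytes);
    assert(size <= max_bytes); /* scratch space must be in range */

    unsigned field = 0;
    while ((min_bytes << field) < size)
        field++;
    return field;
}
```

With these assumed parameters, a 3000-byte requirement rounds up to 4096 bytes and encodes as 2 when field value 0 means 1 KB, but as 1 when field value 0 means 2 KB. Using the wrong base for a given unit would therefore program a per-thread size that is off by a factor of two.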
Comment 12 Eero Tamminen 2016-06-13 12:46:44 UTC
Verified with BYT, BDW & SKL. CSDof & CarChase work on all of them, although on BYT they still need overrides as Mesa is still at GL 3.3 there.
Comment 13 Eero Tamminen 2016-06-23 15:42:58 UTC
While CarChase has been working fine on SKL, it seems to be getting occasional GPU hangs on BYT & HSW (GT3e).  As there's been some trouble with the latest drm-intel kernels, I'm not sure yet whether it's a Mesa issue.

(CSDof has been working without problems on HSW, don't know about BYT yet.)

