Setup:
- SKL GT2
- Ubuntu 16.04:
  - latest Ubuntu kernel 4.4.0-22 (upstream v4.4.8 + i915 driver from v4.6-rc3)
- GfxBench v4: https://gfxbench.com/linux-download/
- Mesa from today (f87352d76966b6e4b0ab5fa9129ccd1ade0c2e7a)

Use-case:
- Start GfxBench
- Run Manhattan v3.1 test
- Run Car Chase test

Expected outcome:
- No hangs

Actual outcome:
- recovered GPU hangs and long pauses in Manhattan v3.1 output
- recovered GPU hangs and no output from Car Chase

Notes:
- I remember running these tests early this year using Mesa GL/GLSL overrides, after CS was added for GLES, with *no* hangs i.e. this seems regression
- I get hangs also with Ubuntu's Mesa 11.2
- Hangs are compute shading related as I get hangs only with tests using CS
- SynMark CSDof tests has similar hangs, but CSCloth doesn't. CSDof requires SIMD32 for CS, but GfxBench tests and CSCloth don't

I tested also these kernels, but they didn't help:
- Ubuntu kernel 4.4.0-2 (Upstream 4.4 with i915 from same version)
- drm-intel-nightly kernel 4.6-rc4 (right after SKL PIPE_CONTROL fix)
- yesterday's drm-intel-nightly (4.7.0-pre)
Gl43CSDof hits an assertion in a debug build. I'm pretty sure that the 'hsw-cs-cross-thread-constants-v4' of git://people.freedesktop.org/~jljusten/mesa should fix the hangs. I don't think Gl43CSDof is rendering correctly, though.
Jordan's patches have landed. Are you still seeing GPU hangs?
gl_manhattan31 and Gl43CSDof no longer seem to hang, but Car Chase still does. The GPU hang I'm seeing actually looks like it's the scalar TCS backend's fault. If I switch to the vec4 TCS backend, I get a different hang, though.
(In reply to Kenneth Graunke from comment #3)
> gl_manhattan31 and Gl43CSDof no longer seem to hang, but Car Chase still
> does.

I can reproduce these findings on my SKL GT2. CSDof has also still some misrendering at the bottom of its screen (at least in FullHD fullscreen).
Hmm...the Car Chase hangs look like some kind of memory corruption. I'm seeing batchbuffers with compute commands where the first part of the batchbuffer is full of garbage. The upload BO appears to be directly before it in the address space, and altering the size of that helps with the GPU hangs (but it may just be moving the buffers around). I see corruption in the images when it doesn't hang, so maybe it's overwriting the image with garbage instead.
Are you able to reproduce the garbage at the bottom of CSDof window? Maybe it has same cause.
Yes, I can. That's a good idea - it's likely to be simpler. I've got some hacks that appear to make the CSDof corruption go away...just need to narrow it down.
Altering the brw_cs.c call to brw_get_scratch_bo to multiply the size by 3 appears to fix CSDof's rendering issues, and the hangs in Car Chase. That doesn't seem necessary, though. Curro did find something that looked like a hardware bug with swapping out scratch buffers a while back. That may be the real problem.
With these patches, remaining SKL GPU hangs in CSDof & CarChase seem to go away:
- https://patchwork.freedesktop.org/patch/92546/
- https://patchwork.freedesktop.org/patch/92545/
FWIW, Curro and I found piles more problems on Haswell today. I'm working on a v2 series which is up to 8 patches so far. We also found Ivybridge and Baytrail problems, which I think should be fixed, but I still need to test those. I also need to port all of these fixes over to the Vulkan driver.
Fixed by commits:
3f48548a6f65fe90b97956c7be73268917c6f2f9..0fb85ac08d61d365e67c8f79d6955e9f89543560

- i965: Fix shared local memory size for Gen9+.
- i965: Set subslice_total on Gen7/7.5 platforms.
- i965: Allocate scratch space for the maximum number of compute threads.
- i965: Account for poor address calculations in Haswell CS scratch size.
- i965: Fix Haswell CS per-thread scratch space encoding.
- i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
- i965: Assert that the scratch spaces are in range.
- i965: Use the correct number of threads for compute shaders.

All are marked for cherry-picking to the 12.0 branch as well.
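For readers who don't know the scratch logic, here is a rough sketch of the idea behind "Allocate scratch space for the maximum number of compute threads" and "Assert that the scratch spaces are in range". This is not the actual i965 code. The function names, the 1 KB minimum and the 2 MB per-thread limit are assumptions made for illustration. The point is that the buffer must cover every thread that can run at once. If the thread count is underestimated, threads could write past the end of the scratch BO, which would fit the corruption of neighbouring buffers described earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limits, for illustration only (not taken from this thread). */
#define MIN_PER_THREAD_SCRATCH 1024u                 /* 1 KB */
#define MAX_PER_THREAD_SCRATCH (2u * 1024u * 1024u)  /* 2 MB */

/* Hypothetical helper: round the shader's per-thread scratch requirement
 * up to a power of two, with a 1 KB minimum, and check it is in range. */
static uint32_t per_thread_scratch_size(uint32_t requested_bytes)
{
    assert(requested_bytes <= MAX_PER_THREAD_SCRATCH);

    uint32_t size = MIN_PER_THREAD_SCRATCH;
    while (size < requested_bytes)
        size *= 2;

    assert(size <= MAX_PER_THREAD_SCRATCH);
    return size;
}

/* Hypothetical helper: the scratch buffer has to hold one per-thread slot
 * for every compute thread the hardware can have in flight. */
static uint64_t total_scratch_size(uint32_t requested_bytes,
                                   uint32_t max_cs_threads)
{
    return (uint64_t)per_thread_scratch_size(requested_bytes) * max_cs_threads;
}
```

For example, with a shader needing 3000 bytes per thread and 64 threads, total_scratch_size(3000, 64) reserves 64 slots of 4096 bytes each.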
Verified with BYT, BDW & SKL. CSDof & CarChase work on all of them, although on BYT they still need overrides as Mesa is still at GL 3.3 there.
While CarChase has been working fine on SKL, it seems to be getting occasional GPU hangs on BYT & HSW (GT3e). As there's been some trouble with latest drm-intel kernels, I'm not sure yet whether it's Mesa issue. (CSDof has been working without problems on HSW, don't know about BYT yet.)