Bug 96258 - [NVC0] Hang when running compute program
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/nouveau
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Nouveau Project
Depends on:
Reported: 2016-05-28 13:49 UTC by Ilia Mirkin
Modified: 2016-06-01 03:32 UTC

compute shader dump (202.46 KB, text/plain)
2016-05-28 13:49 UTC, Ilia Mirkin
compute shader dump - NV50_PROG_OPTIMIZE=1 (fail) (217.46 KB, text/plain)
2016-05-28 13:53 UTC, Ilia Mirkin
compute shader dump - NV50_PROG_OPTIMIZE=0 (success) (377.82 KB, text/plain)
2016-05-28 13:54 UTC, Ilia Mirkin

Description Ilia Mirkin 2016-05-28 13:49:36 UTC
Created attachment 124145
compute shader dump

This happened on GK208, but I assume it'll happen everywhere. This is with the trace from bug 94858.

[259564.842264] nouveau 0000:02:00.0: fifo: read fault at 0000000000 engine 00 [GR] client 03 [GPC0/L1_1] reason 02 [PTE] on channel 6 [007f940000 X[1608]]
[259564.842268] nouveau 0000:02:00.0: fifo: gr engine fault on channel 6, recovering...
[259595.772036] nouveau 0000:02:00.0: X[1608]: failed to idle channel 6 [X[1608]]
[259610.772211] nouveau 0000:02:00.0: X[1608]: failed to idle channel 6 [X[1608]]
Comment 1 Ilia Mirkin 2016-05-28 13:53:01 UTC
Created attachment 124146
compute shader dump - NV50_PROG_OPTIMIZE=1 (fail)
Comment 2 Ilia Mirkin 2016-05-28 13:54:56 UTC
Created attachment 124147
compute shader dump - NV50_PROG_OPTIMIZE=0 (success)

Looks like one of the "level 1" optimizations causes the failure. Now to figure out which one...
Comment 3 Samuel Pitoiset 2016-05-28 14:02:45 UTC
Interesting, let's see if I can reproduce that issue on GF119. Hopefully I will be able to.
Comment 4 Samuel Pitoiset 2016-05-28 14:08:37 UTC
I don't get this read fault on my GF119, but I get a lot of:
[ 8473.891952] nouveau 0000:01:00.0: gr: DATA_ERROR 00000028 [CP_NO_REG_SPACE_STRIPED] ch 7 [007f9ed000 glretrace[6719]] subc 1 class 90c0 mthd 0368 data 00001000
[ 8473.906657] nouveau 0000:01:00.0: gr: DATA_ERROR 00000028 [CP_NO_REG_SPACE_STRIPED] ch 7 [007f9ed000 glretrace[6719]] subc 1 class 90c0 mthd 0368 data 00001000

Disabling compiler optimizations doesn't change anything.
Comment 5 Ilia Mirkin 2016-05-28 17:17:29 UTC
OK, I've pushed a fix for the GK208 issue (an issue in unspilling predicates):

commit c7731a07408c5d4169625d4a78962d2887419080
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Sat May 28 13:07:12 2016 -0400

    gk110/ir: fix unspilling of predicates from registers
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>

However, the issue around thread sizes remains for all except the GK10x Keplers. On Fermi we have 32K registers; on Kepler+ we have 64K (not counting the mythical GK210). So we have to tell the register allocator (RA) to restrict the number of registers used based on thread size (or use 1024 as the number of threads when that information is not provided).
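
The register budget described above can be sketched as follows. This is an illustration only, not the Mesa code: the function name, the table names, and the per-thread ISA caps (63 for Fermi and GK10x, 255 for GK110 and later) are assumptions added here; only the 32K/64K register file sizes and the 1024-thread fallback come from the comment.

```python
# Illustrative sketch only; names and ISA caps are assumptions, not Mesa code.

# Register file sizes per SM, as stated in the comment above.
SM_REGS = {"fermi": 32 * 1024, "kepler": 64 * 1024}

# Assumed per-thread limits imposed by the instruction encoding.
ISA_MAX_GPRS = {"fermi": 63, "gk10x": 63, "gk110+": 255}


def max_gprs_per_thread(sm_regs, isa_max, threads=None):
    """Largest per-thread GPR count that still lets `threads` threads fit."""
    if threads is None or threads <= 0:
        # Thread count not provided: assume the worst case of 1024 threads.
        threads = 1024
    # The budget is the register file split evenly across threads,
    # but never more than the ISA can address per thread.
    return min(isa_max, sm_regs // threads)


fermi_limit = max_gprs_per_thread(SM_REGS["fermi"], ISA_MAX_GPRS["fermi"])
gk10x_limit = max_gprs_per_thread(SM_REGS["kepler"], ISA_MAX_GPRS["gk10x"])
gk110_limit = max_gprs_per_thread(SM_REGS["kepler"], ISA_MAX_GPRS["gk110+"])
```

If the assumed caps are right, this would also account for the GK10x exception: 64K / 1024 = 64 registers per thread, which is already above a 63-register cap, so the cap alone is sufficient there. On Fermi and on GK110+ the division yields a smaller number than the cap, so the allocator has to be told about it.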
Comment 6 Ilia Mirkin 2016-05-28 17:25:22 UTC
Also an observation - there's a bit of tearing in the sheet as it falls. Could be a synchronization failure, or something else. I see it with NV50_PROG_OPTIMIZE=0 as well.
Comment 7 Ilia Mirkin 2016-06-01 03:32:45 UTC
And the Fermi issue is fixed now too by (a) fixing the GPR file size to take the thread count and # of SM regs into account and (b) fixing BitSet to work with multiple-of-32 numbers of registers (had been all 32n-1 up until now).
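
The comment does not say what exactly was wrong in BitSet, so the following is a hypothetical sketch of one common way a word-based bit set can mishandle sizes that are exact multiples of 32. The class and its methods are made up for illustration and are not the Mesa implementation.

```python
# Hypothetical word-based bit set; not the Mesa BitSet class.

class BitSet:
    def __init__(self, size):
        self.size = size
        # Round up so that a partial last word still gets storage.
        self.words = [0] * ((size + 31) // 32)

    def fill(self):
        """Set every valid bit, leaving padding bits in the last word clear."""
        for i in range(len(self.words)):
            self.words[i] = 0xFFFFFFFF
        rem = self.size % 32
        # Pitfall: when size is a multiple of 32, rem is 0 and the mask
        # (1 << rem) - 1 is 0, which would wipe the whole last word.
        # Only mask when there really is a partial word.
        if rem:
            self.words[-1] &= (1 << rem) - 1

    def count(self):
        """Number of set bits."""
        return sum(bin(w).count("1") for w in self.words)


odd = BitSet(63)    # a 32n-1 size, like the register counts used before
odd.fill()
even = BitSet(64)   # a multiple-of-32 size
even.fill()
```

Sizes of the form 32n-1 always leave a partial last word, so a missing guard like the one in `fill` would never be exercised until a multiple-of-32 register count shows up.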
