Bug 82628 - bisected: GALLIUM_HUD hangs radeon 7970M (PRIME)
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: git
Hardware: Other Linux (All)
Importance: medium normal
Assignee: mesa-dev
Reported: 2014-08-14 18:14 UTC by Christoph Haag
Modified: 2014-08-16 16:52 UTC


Attachments
journalctl log including the hang in the report (251.86 KB, text/plain)
2014-08-14 18:14 UTC, Christoph Haag

Description Christoph Haag 2014-08-14 18:14:09 UTC
Created attachment 104629
journalctl log including the hang in the report

I'm only using DRI_PRIME=1 GALLIUM_HUD="fps,VRAM-usage+GTT-usage". I have not tested many other combinations.

Tested on linux 3.16 mainline.

I'm not 100% sure, but I think this is my bisect result:
1cfcd0164e1be7d7b05b693f60a262ad735b7565 is the first bad commit

I haven't tested many applications because there is a good chance of a hard lockup of the whole machine. Often, though, it recovers after killing the application that uses the HUD.

With the Unreal Engine Cave Effects demo I think I can reproduce it every time. When changing mesa versions, it is immediately visible whether it will work: when it works, a black window with the HUD is visible until the demo loads; when it won't work, a few fat white garbage lines (or something like that) are visible in the window instead.

The only other application I tried was glxgears, a few times. Sometimes it worked and sometimes it produced the lockup.

radeon 0000:01:00.0: ring 0 stalled for more than 10003msec
radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000011 last fence id 0x000000000000000d on ring 0)
radeon 0000:01:00.0: Saved 269 dwords of commands on ring 0.
radeon 0000:01:00.0: GPU softreset: 0x0000004D
radeon 0000:01:00.0:   GRBM_STATUS               = 0xB3525028
radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x2F800002
radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x2F800002
radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x40000000
radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00408002
radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x84228647
radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100
radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
radeon 0000:01:00.0:   SRBM_STATUS               = 0x200400C0
radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
radeon 0000:01:00.0: GPU reset succeeded, trying to resume

Without the HUD it always works.

Oh, and sometimes with the buggy mesa revisions *and* the HUD, the application just won't start and there is immediately a message that the kernel rejected the CS stream (or whatever the exact message is). I have not seen this without the HUD.
Comment 1 Christoph Haag 2014-08-15 23:58:33 UTC
The commit has now been reverted, but for the record, so that it's not forgotten:

Whether it makes sense or not, the GPU does not hang if I change the vertex element count from 1 to 2:

diff --git a/src/gallium/auxiliary/hud/hud_context.c b/src/gallium/auxiliary/hud/hud_context.c
index a05d3c4..2d8bdca 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -532,7 +532,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
    pipe_resource_reference(&hud->text.vbuf.buffer, NULL);
 
    /* draw the rest */
-   cso_set_vertex_elements(cso, 1, hud->velems);
+   cso_set_vertex_elements(cso, 2, hud->velems);
    LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
       if (pane)
          hud_pane_draw_colored_objects(hud, pane);
Comment 2 Marek Olšák 2014-08-16 16:52:39 UTC
Don't worry. It won't be forgotten. Closing.

