Summary: | [vblank, suspend/resume] glxgears window black after resuming (S3 and S4) or switching VT back | |
---|---|---|---
Product: | DRI | Reporter: | fangxun <xunx.fang>
Component: | DRM/Intel | Assignee: | Jesse Barnes <jbarnes>
Status: | CLOSED FIXED | QA Contact: |
Severity: | major | |
Priority: | high | CC: | jian.j.zhao, keithp
Version: | unspecified | |
Hardware: | All | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Attachments:
Created attachment 33979 [details]
Xorg log
Created attachment 33980 [details]
dmesg_after_resume
Bisect result:

An Xf86_video_intel commit caused this problem. The last good commit is 4902f546be19e3d5bb47f6c75e2199dc4856c0f4. After this commit, glxgears failed because of a DRI2 issue until commit 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1. This issue can be reproduced on commit 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1, so I think it is the first bad commit.

commit 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1
Author: Keith Packard <keithp@keithp.com>
Date: Fri Jan 29 23:28:46 2010 -0800

    Initialize DRI2 info rec version 4 list of driver names

    With DRI2 supporting multiple subsystems, the video driver must
    initialize the list of driver names instead of just passing the single
    driver name used by Mesa. Without this, the X server will fail to
    initialize DRI2 as the numDrivers field in this structure will be
    uninitialized.

    Signed-off-by: Keith Packard <keithp@keithp.com>

(In reply to comment #3)
> Bisect result:
>
> An Xf86_video_intel commit caused this problem.
> The last good commit is 4902f546be19e3d5bb47f6c75e2199dc4856c0f4.
> After this commit, glxgears failed because of a DRI2 issue until commit
> 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1. This issue can be reproduced on
> commit 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1, so I think it is the first bad
> commit.
>
> commit 1a76fa5574e8e8f88ac3518a4e4494e1af301dc1
> Author: Keith Packard <keithp@keithp.com>
> Date: Fri Jan 29 23:28:46 2010 -0800
>
> Initialize DRI2 info rec version 4 list of driver names
>
> With DRI2 supporting multiple subsystems, the video driver must
> initialize the list of driver names instead of just passing the single
> driver name used by Mesa. Without this, the X server will fail to
> initialize DRI2 as the numDrivers field in this structure will be
> uninitialized.
>
> Signed-off-by: Keith Packard <keithp@keithp.com>

I think this commit is a red herring. It looks like this patch will just re-enable the DRI2 paths in the driver. My guess is that the bug actually lies there.
I'm also suspicious that it has the same root cause as bug #27040 and/or bug #27190. There is a patch series referenced in those bugs. I'd like to see this bug tested with this patch series.

Tested with the patch series; it still fails. By the way, in recent testing we found that S3 and switching back to console mode may also cause the glxgears window to go blank.

I've seen the same behaviour for at least 4-5 months (from the first time I tested it). Also, my computer doesn't blank the OpenGL app... it reboots (G45 graphics on a DG45FC motherboard), so I'm not sure this is a regression. Anyway, my bug report (which is probably a dupe of this one) is bug #26451.

After resume, do you see interrupts coming in for the i915 device (just grep i915 /proc/interrupts)? It would be good to see where glxgears is blocked in the server too; what was the last request it sent before the hang?

After resume, I see interrupts coming in for the i915 device. The backtraces follow.

glxgears backtrace:
#0 0x00000030de2d4f38 in poll () from /lib64/libc.so.6
#1 0x00007fba53aba88a in _xcb_conn_wait (c=0x1ce9b20, cond=<value optimized out>, vector=0x0, count=0x0) at xcb_conn.c:306
#2 0x00007fba53abc8fc in xcb_wait_for_reply (c=0x1ce9b20, request=2153, e=0x7fff6cd52bb8) at xcb_in.c:390
#3 0x00007fba5444362f in _XReply (dpy=0x1ce9010, rep=0x7fff6cd52c20, extra=0, discard=0) at xcb_io.c:454
#4 0x00007fba547b5e13 in DRI2GetBuffersWithFormat (dpy=0x1ce9010, drawable=<value optimized out>, width=0x1cfbc84, height=0x1cfbc88, attachments=0x7fff6cd52d20, count=2, outCount=0x7fff6cd52d5c) at dri2.c:441
#5 0x00007fba547b4729 in dri2GetBuffersWithFormat (driDrawable=<value optimized out>, width=0x1cfbc84, height=0x1cfbc88, attachments=<value optimized out>, count=<value optimized out>, out_count=0x7fff6cd52d5c, loaderPrivate=0x1cfbb90) at dri2_glx.c:444
#6 0x00007fba53116cca in intel_update_renderbuffers (context=<value optimized out>, drawable=0x1cfbc50) at intel_context.c:252
#7 0x00007fba53117313 in intel_prepare_render (intel=0x1d023e0) at intel_context.c:395
#8 0x00007fba531359e0 in brw_try_draw_prims (max_index=<value optimized out>, min_index=<value optimized out>, ib=<value optimized out>, nr_prims=<value optimized out>, prim=<value optimized out>, arrays=<value optimized out>, ctx=<value optimized out>) at brw_draw.c:340
#9 brw_draw_prims (max_index=<value optimized out>, min_index=<value optimized out>, ib=<value optimized out>, nr_prims=<value optimized out>, prim=<value optimized out>, arrays=<value optimized out>, ctx=<value optimized out>) at brw_draw.c:441
#10 0x00007fba531f4fc5 in vbo_exec_DrawArrays (mode=6, start=0, count=4) at vbo/vbo_exec_array.c:525
#11 0x00007fba53274354 in _mesa_meta_Clear (ctx=0x1d023e0, buffers=0) at drivers/common/meta.c:1466
#12 0x00007fba53115a47 in intelClear (ctx=0x1d023e0, mask=<value optimized out>) at intel_clear.c:182
#13 0x000000000040290e in draw () at glxgears.c:252
#14 0x00000000004031af in draw_gears () at glxgears.c:314
#15 draw_frame () at glxgears.c:339
#16 event_loop () at glxgears.c:689
#17 main () at glxgears.c:769

X server backtrace:
#0 0x00000030de2d6f53 in __select_nocancel () from /lib64/libc.so.6
#1 0x000000000046b1bb in WaitForSomething (pClientsReady=0x38b56b0) at WaitFor.c:229
#2 0x0000000000429128 in Dispatch () at dispatch.c:375
#3 0x00000000004217c5 in main (argc=2, argv=0x7fff629107c8, envp=<value optimized out>) at main.c:286

Promoting to P1. Jesse, can you reproduce?

With research we find this issue disappears if pageflip is disabled.

(In reply to comment #10)
> With research we find this issue disappears if pageflip is disabled.

How do you disable pageflip? Option "PageFlip" "false" in xorg.conf seems to be ignored...

I disabled page flipping in drmmode_display.c (xf86_video_intel component) by forcing has_flipping to 0:
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -1461,6 +1461,7 @@ Bool drmmode_pre_init(ScrnInfoPtr scrn, int fd, int cpp)
 	gp.value = &has_flipping;
 	(void)drmCommandWriteRead(intel->drmSubFD, DRM_I915_GETPARAM, &gp,
 				  sizeof(gp));
+	has_flipping=0;
 	if (has_flipping) {
 		xf86DrvMsg(scrn->scrnIndex, X_INFO,
 			   "Kernel page flipping support detected, enabling\n");

Created attachment 34921 [details] [review]
disable page flipping but leave events

Can you try this patch instead? It should disable page flipping but leave the vblank event code in place, which could narrow down the problem.

With your patch, this issue still happens.

Ok, so that means it's probably related to the vblank event code. Thanks for the update.

The current 2D driver has some workarounds for vblank event handling and suspend/resume; can you test again with the latest bits?

Tested on G45 with current bits. It still fails.

The X 1.8 branch just got some fixes for issues like this. I'm retesting now with the latest bits to see if I can reproduce.

Works for me now on GM45 with current X server master (with the autoconf patch applied) and xf86-video-intel (with the patch from bug 28252 applied).

Tested on Piketon and GM45: with compiz enabled, it still fails after resuming (S3 and S4) or switching VT back. With compiz disabled, it works when switching VT back, but fails after resuming (S3 and S4).

Jesse, it seems not to work on all platforms. We tested with the newest kernel from for-linus and current code. It works well on Piketon, but it still fails on GM45. On G45 it can't be tested because the newest kernel boots to a black screen, as bug #27733 shows. So I'm reopening it just for tracking until it works well on G45 and GM45.

Which versions were you running? GM45 worked for me with compiz; I didn't see any failures.

(In reply to comment #22)
> Which versions were you running? GM45 worked for me with compiz; I didn't see
> any failures.
I tested with the kernel from the for-linus branch (e3a815fcd38043b8f1bb526123d8ab6ae01deb77). The other components are as follows:
Libdrm: (master)73a42a645201a85ce2fe4fc77754df67e5097fc9
Mesa: (master)31a74a6df77daea9084c34b86f217f23a55e6b91
Xserver: (master)5d4e2c594059ffb536c8e506c2623320d3c6a787
Xf86_video_intel: (master)6db1e5231b7a0e79611f771d4efea686f7849e04

If you still see this, can you capture some more information? If you can VT switch after resume, you can probably ssh in as well and gdb the server or glxgears to see what they're waiting for. If they're stuck in a "poll" or "select" call, please check the /proc/<pid>/wchan file to see what kernel mutex they're waiting on.

Ah, I see this issue on my GM45 now with master of everything. Checking it out...

Seems I can reproduce it with a simple VT switch too, so something is wrong with the way X consumes DRM events. Also, ignore comment #24; I see you already collected backtraces.

Created attachment 36481 [details] [review]
don't sync redirected windows

I don't know why yet, but somehow running under compiz causes this problem. If both clients and the compositor are using events, when you VT switch back the client hangs. This patch worked around the problem for me, can you confirm?

Yes, I confirm this patch fixes the problem.

Created attachment 36587 [details] [review]
another approach to avoiding client hangs at VT switch time

Here's a server patch that should also fix the problem, closer to the root cause this time.

Created attachment 36588 [details] [review]
keep DRI2 clients suspended at VT switch

This one isn't strictly necessary, but makes DRI2 behave like GLX across VT switch.

Bug fixed in X master:

commit 28e33ae6f69f716ece5d68e63fc52557236c5f6e
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Wed Jun 30 07:59:04 2010 -0700

    OS support: fix writeable client vs IgnoreClient behavior

I'll request that it go into the 1.8 branch as well.

Works fine with current code, so marking it as verified.

Closing old verified.
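For anyone repeating the diagnostics requested in this thread (checking that i915 interrupts still arrive after resume, and reading /proc/<pid>/wchan for a process stuck in poll or select), here is a rough Python sketch. It is not taken from any of the attached patches. The function names are made up for illustration, and it assumes the usual /proc/interrupts layout where the per-CPU counters follow the "IRQ:" label.

```python
import time
from pathlib import Path


def i915_interrupt_count(interrupts_text: str) -> int:
    """Sum the per-CPU counters on every /proc/interrupts line naming i915."""
    total = 0
    for line in interrupts_text.splitlines():
        if "i915" not in line:
            continue
        # Assumed layout: "<irq>: <cpu0 count> <cpu1 count> ... <chip> <name>"
        _, _, rest = line.partition(":")
        for field in rest.split():
            if not field.isdigit():
                break  # counters end at the first non-numeric field
            total += int(field)
    return total


def interrupts_advancing(before: str, after: str) -> bool:
    """True if the i915 counters grew between two /proc/interrupts snapshots."""
    return i915_interrupt_count(after) > i915_interrupt_count(before)


def read_wchan(pid: int) -> str:
    """Return the contents of /proc/<pid>/wchan for the given process."""
    return Path(f"/proc/{pid}/wchan").read_text().strip()


def snapshot(pid: int, interval_s: float = 1.0) -> None:
    """Hypothetical helper: print both checks for one blocked process."""
    before = Path("/proc/interrupts").read_text()
    time.sleep(interval_s)
    after = Path("/proc/interrupts").read_text()
    print("i915 interrupts advancing:", interrupts_advancing(before, after))
    print("wchan:", read_wchan(pid))
```

Calling snapshot() with the pid of the hung glxgears (or of the X server) over ssh after the resume should give roughly the information asked for above. It does not replace the gdb backtraces.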
Created attachment 33978 [details]
Screenshot showing the problem

Platform: G45
Mesa: (7.8)54af54277a7a469ed2b9821ef6ed7ed464381f91
Xserver: (master)f2eacb4646beb25d055de22868f93e6b24f229b6
Xf86_video_intel: (master)318aa9ed799197810e2039dbe3ec51559dcc888c
Libdrm: (master)04fd3872ee8bd8d5e2c27740508c67c2d51dbc11
Kernel: (master)60b341b778cc2929df16c0a504c91621b3c6a4ad

Bug detailed description:
-------------------------
Start glxgears on the GNOME desktop, then do S4 (suspend to disk and resume). After the system is restored, glxgears stops printing info such as fps, and the glxgears window is black. It works well on plain X (without starting GNOME). This issue happens on all platforms. It is a regression: it works fine with code from January 18th. I will bisect this next week.

Reproduce steps:
----------------
1. Start X and gnome-session
2. Run glxgears
3. echo disk > /sys/power/state
4. Press the power button to restore
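Steps 2 to 4 could be scripted roughly as below, which may help when bisecting. This is only a sketch under several assumptions: glxgears prints status lines of the form "N frames in S seconds = F FPS" and flushes them when stdout is a pipe, the script runs as root inside the X session, and the write to /sys/power/state returns only after the machine has resumed. The power button press in step 4 stays manual. All function names are hypothetical.

```python
import os
import re
import select
import subprocess
import time

# Assumed glxgears status line, e.g. "301 frames in 5.0 seconds = 60.112 FPS"
FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_fps(text: str):
    """Return the first FPS value found in glxgears output, or None."""
    match = FPS_LINE.search(text)
    return float(match.group(3)) if match else None


def drain(fd: int) -> None:
    """Discard output that is already waiting in the pipe."""
    while select.select([fd], [], [], 0)[0]:
        if not os.read(fd, 4096):
            break  # writer closed the pipe


def reports_fps_within(fd: int, timeout_s: float) -> bool:
    """Wait up to timeout_s for a new FPS line to arrive on fd."""
    deadline = time.monotonic() + timeout_s
    buffered = b""
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return False  # nothing printed in time: the hang from this report
        chunk = os.read(fd, 4096)
        if not chunk:
            return False  # glxgears exited
        buffered += chunk
        if parse_fps(buffered.decode(errors="replace")) is not None:
            return True


def suspend_to_disk(state_path: str = "/sys/power/state") -> None:
    """Step 3: same effect as `echo disk > /sys/power/state` (needs root)."""
    with open(state_path, "w") as state:
        state.write("disk")


def reproduce(wait_s: float = 15.0) -> bool:
    """Run steps 2-4; True means glxgears kept printing after the resume."""
    gears = subprocess.Popen(["glxgears"], stdout=subprocess.PIPE)
    try:
        fd = gears.stdout.fileno()
        if not reports_fps_within(fd, wait_s):
            raise RuntimeError("glxgears never reported FPS before suspend")
        suspend_to_disk()  # assumed to return once the machine has resumed
        drain(fd)          # drop lines printed before the suspend
        return reports_fps_within(fd, wait_s)
    finally:
        gears.kill()
```

A False result from reproduce() corresponds to the symptom described here (glxgears stops printing fps). It cannot tell whether the window is also black.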