Created attachment 85267 [details] xrandr output When using the new virtual output support in version 2.99.901 of the driver, the external screen only shows the background image of the desktop and the mouse cursor. Any window partially or fully moved to the virtual screen is not displayed. Hardware: Intel HD 3000 NVIDIA Quadro NVS 4200M Software: X.Org Intel driver 2.99.901 NVIDIA driver 325.15 X.Org X server 1.13.4 Please let me know if any additional information would be helpful in tracking down this issue.
My suspicion is that I've forgotten my X and missed a flag somewhere... If you apply diff --git a/tools/virtual.c b/tools/virtual.c index 8e241e5..ee8b3c6 100644 --- a/tools/virtual.c +++ b/tools/virtual.c @@ -56,7 +56,7 @@ #include <fcntl.h> #include <assert.h> -#if 0 +#if 1 #define DBG(x) printf x #else #define DBG(x) you will get lots of debug output if you then use: intel-virtual-output -f > i-v-o.txt and then repeat the test of dragging the window onto the virtual screen. What I am looking for in the i-v-o.txt is whether we are detecting and then painting the damaged regions on the virtual screen.
Created attachment 85273 [details] intel-virtual-output debug output Right, I recompiled intel-virtual-output with debugging support. It seems some damaged regions are detected, however to me it seems like the positions are invalid. Some kind of overflow? Unfortunately the log file grew quite large rather quickly, nevertheless I attached it. During testing I encountered a problem with intel-virtual-output itself: Once I killed the process and tried to start intel-virtual-output again, it fails: display_open((null)) :0.0: singleton registered? 1 :0.0: pinging singleton :0.0: send command '4567P' *** buffer overflow detected ***: intel-virtual-output terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x7fd3457b68c7] /lib64/libc.so.6(+0xff6c0)[0x7fd3457b46c0] /lib64/libc.so.6(+0xfea39)[0x7fd3457b3a39] /lib64/libc.so.6(_IO_default_xsputn+0x85)[0x7fd34572e015] /lib64/libc.so.6(_IO_vfprintf+0x1ae1)[0x7fd3456fc4c1] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x7fd3457b3add] /lib64/libc.so.6(__sprintf_chk+0x83)[0x7fd3457b3a23] intel-virtual-output[0x406ab3] intel-virtual-output[0x4031ee] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd3456d74bd] intel-virtual-output[0x405ce9] Not sure whether restarting intel-virtual-output should work at all. In this case I could try to recompile it with debugging symbols.
Whoops, that assert failure was a silly use of sprintf. commit f01c5eae2a3d6efebe9332a8654303356e622bec Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Sep 5 17:48:20 2013 +0100 intel-virtual-output: Reserve space for the '\0' in the sprintf Even though we don't use the trailing NUL byte for our comparisons, sprintf will write it, so we need make space for it. Reported-by: Severin Strobl <fd@severin-strobl.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68987 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> The idea is that only one intel-virtual-output exists (otherwise we have fun with multiple processes conflicting over the allocation of VIRTUAL), so if you run it a second time it sends the request for new screens to the first process. (There's a timeout for it detection the old process expired.) The problem appears to lie in damage tracking. We get nothing reported, so we don't draw any updates.
I double checked that I see damage for both composited and uncomposited desktops (thinking the the GL compositor might be upsetting how I receive damage), but everything looks ok. Can you describe your DE so that I can see if I can replicate your setup?
Currently I am still on GNOME 2 (version 2.32.1) using Metacity, so nothing fancy. Not sure what else I can report about my setup. Are there any other components that might interfere?
The repaint algorithm is extremely simple: 1. Register for Damage tracking on the root window of the Intel X server 2. Wait for Damage events 3. Mark the outputs as damaged, and activate a timer 4. Wait for timer to expire 5. Paint damaged regions on remote outputs (essentially GetImage(local root window), PutImage(remote root window) Now, that I mention it... It might not be the damage tracking but the timer that's failing. But to answer your question, the only thing that could perturb this is if something prevented the Damage from being sent by X or managed to hide itself from the GetImage. (Neither of which are likely, and would be X or DDX bugs.)
Nope, it is definitely a lack of damage events.
I believe I found it. The clue was that you did get a damage event, exactly one. Can you please update and test with commit 12de799598ffbbe056bd4455efefa62d4b081d98 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Sep 5 19:36:13 2013 +0100 intel-virtual-output: Flush the damage received message back to the local display After processing the Damage notification, we need to send a message back to the Xserver to clear the pending damage before we will be sent more events. To make sure that message is sent we need to flush the output, as we may never flush the output queue otherwise. Reported-by: Severin Strobl <fd@severin-strobl.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68987 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Created attachment 85289 [details] output 12de799598ffbbe056bd4455efefa62d4b081d98 Unfortunately this does not seem to fix the problem. I get the same behaviour as before. Using the version 12de799598ffbbe056bd4455efefa62d4b081d98 I performed another debug run, the output is attached.
Drat the issue still appears to be the same, the solitary damage event. Hmm, the reply will be flushed by the XShmGetImage in any case. So I need to find another reason for the lack of damage. Can you please comment out the call to XSetErrorHandler() (and cross your fingers that you don't die too early with races with multiple RandR clients) and see if that reports an issue? Grabbing xtrace and running: xtrace [-n] -f i-v-o.xtrace intel-virtual-output -f would be useful as well. (The -n may or may not be necessary depending on your setup)
(In reply to comment #10) > Can you please comment out the call to XSetErrorHandler() (and cross your > fingers that you don't die too early with races with multiple RandR clients) > and see if that reports an issue? Not sure this generated any useful information: display_open((null)) :0.0: singleton registered? 1 :0.0: pinging singleton :0.0: send command '4567P' :0.0: wait for act '4567R' :0.0: reading event type 33 :0.0: client message '4567P' No reply from singleton; assuming control :0.0: reading event type 28 display_init_damage(:0.0) record_mouse(:0.0) record_callback display_open(:0.1) bumblebee_open query result 'Value: :8 ' display_open(:8) last_display_add_clones__randr(:8) display_init_randr_hpd(:8) :8 - noutputs=8 claim_virtual(1) claim_virtual(VIRTUAL1): rr_output=68 clone_output_init(:0.0, VIRTUAL1) :0.0-VIRTUAL1 use shm? 1 (use shm pixmap? 1) clone_output_init(:8, VGA-0) :8-VGA-0 use shm? 1 (use shm pixmap? 0) :8-VGA-0 wants depth 24 :0.0 is depth 24, want 24 :8 is depth 24, want 24 :8-VGA-0 using depth 24, requires xrender for src? 0, for dst? 0 X Error of failed request: BadAccess (attempt to access private resource denied) Major opcode of failed request: 139 (RANDR) Minor opcode of failed request: 17 (RRDestroyMode) Serial number of failed request: 51 Current serial number in output stream: 52 Will try xtrace next...
Created attachment 85290 [details] x11trace output Here the trace generated by x11trace.
(In reply to comment #11) > (In reply to comment #10) > > Can you please comment out the call to XSetErrorHandler() (and cross your > > fingers that you don't die too early with races with multiple RandR clients) > > and see if that reports an issue? > > Not sure this generated any useful information: > > X Error of failed request: BadAccess (attempt to access private resource > denied) > Major opcode of failed request: 139 (RANDR) > Minor opcode of failed request: 17 (RRDestroyMode) > Serial number of failed request: 51 > Current serial number in output stream: 52 Yes, that's why errors are being ignored. It seems to be impossible to atomically adjust the outputs and modes (especially we require round trips with the server) - so we end up with other RandR clients using our modes that we want to clean up.
Looks ok, no errors, just the damage stops coming. What may be significant is that it occurs when we reconfigure the screen for the virtual outputs. There have been a number of Xserver bugs in the past where Damage breaks across a modeset, so I wonder if this is another... Maybe this will workaround the issue? diff --git a/tools/virtual.c b/tools/virtual.c index 257f37f..4da2f3a 100644 --- a/tools/virtual.c +++ b/tools/virtual.c @@ -1760,9 +1760,22 @@ static int display_init_damage(struct display *display) if (display->damage == 0) return EACCES; + display->flush = 1; return 0; } +static void display_reset_damage(struct display *display) +{ + Damage damage; + + damage = XDamageCreate(display->dpy, display->root, XDamageReportRawRectangles); + if (damage) { + XDamageDestroy(display->dpy, display->damage); + display->damage = damage; + display->flush = 1; + } +} + static void display_init_randr_hpd(struct display *display) { int major, minor; @@ -2632,8 +2645,10 @@ int main(int argc, char **argv) ret--; } - if (reconfigure) + if (reconfigure) { context_update(&ctx); + display_reset_damage(ctx.display); + } if (rr_update) { for (i = 0; i < ctx.nclone; i++)
Created attachment 85292 [details] [review] w/a?
Yes, this did it! Nice work! Thanks for addressing this issue so quickly. Now it works exactly as expected. One tiny problem I noticed: When dragging around windows on the external screen, every now and then some parts of the screen are not correctly redrawn, so parts of the window are still visible at the old position until a redraw of the area in question becomes necessary by dragging the window across it again.
commit 2263f8f26fb3c4fabceac2e192e95c5d26a4d173 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Sep 5 21:55:49 2013 +0100 intel-virtual-overlay: Reset damage across modesets Some versions of the Xserver lose Damage tracking across the modeset, causing a loss of damage notifications and repainting to cease on the virtual outputs. We can workaround this by reattaching the damage every time we receive notification that the local Screen configuration changes. Reported-and-tested-by: Severin Strobl <fd@severin-strobl.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68987 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> But it sounds like we have other Damage races to fix as well.
I've tweaked the damage tracking (gone from RawRectangles to BoundingBoxes, and tweaked the ordering of clearing the damage), this seems to be behaving as I would expect here. Can you please give the updated i-v-o a twirl?
It seems better now, so it is harder to get it to produce the artifacts. However vary rarely I still have sections of the screen which are not updated correctly. However this does require some serious trying.
And it is definitely a missing paint as opposed it to painting the wrong image in the wrong location, or painting garbage instead? The former would indicate that the damage is still iffy, whereas the latter would be a driver bug in not managing the coherency of the shared memory transport.
Created attachment 85399 [details] pictures of the output using i-v-o Sorry for the delay. So to me it looks like a missing paint of the area the window left, as the window itself is not damaged in any way and the remaining parts can clearly be identified as parts of the window at the old position. I did some tests again and took some pictures (sorry for the crappy quality), just to make sure I'm not mistaken.
Hmm, looks like damage. I've another little test for you to try, can you update to commit 0ceba648211009a464e46855a52737219ab9b534 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Sep 8 13:51:14 2013 +0100 intel-virtual-output: Debug option for forcing full redraws References: https://bugs.freedesktop.org/show_bug.cgi?id=68987 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> and then set FORCE_FULL_REDRAW to 1: diff --git a/tools/virtual.c b/tools/virtual.c index c363a3e..1f3113e 100644 --- a/tools/virtual.c +++ b/tools/virtual.c @@ -62,7 +62,7 @@ #define DBG(x) #endif -#define FORCE_FULL_REDRAW 0 +#define FORCE_FULL_REDRAW 1 struct display { Display *dpy; and see what happens?
Forcing the complete redraw eliminates the problem, of course at the cost of increased CPU load. For some strange reason, with deactivated full redraw I encountered the image corruption only extremely rarely today. I was able to reproduce it twice after trying for several minutes. It was with a different external screen (resolution), so I'll try again this evening with the setup I used on the weekend.
I haven't spotted anything else of significance in intel-virtual-overlay, so I think we may have a driver bug. In particular, we may be suffering from an old bug in the kernel (thinking that you may be forced to use an older kernel for compatability with nvidia). If using 3.11 is possible that will rule out the known bugs that I think could result in similar corruption.
Hmm, something else you can try (as well as confirming the issue on the latest stable versions) is whether the same issue occurs if you use the nouveau driver. (And then using the nouveau driver, we can also test the latest unstable versions!)
I think i-v-o is correct as can be, and this is an old Xserver bug...
Due to some travelling, I was not able to test the newer versions of i-v-o for quite a while. Using the 2.99.905 snapshot and a 3.11 kernel, I did indeed not observe any problems in the damage tracking any more, it works perfectly! One minor issue I noticed (and did not encounter before): Something weird is happening to the mouse pointer the moment the external display becomes active. Outside of the windows open at this moment, the mouse pointer is not drawn any more. Starting another grahical application fixes this and brings back the mouse pointer for all screens. Unfortunatelly I have no idea whether this is actually related to i-v-o, or just some unrelated bug.
The cursor disappearing sounds like an artifact of how the cursor image is captured. At one point I thought about adding an explicit query to record the cursor image, sounds like I need to check it again.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.