I'm using Ubuntu 14.04 since installing the Wily HWE stack, the X-Server began crashing when undocking the notebook (Lenovo T450s). I tried a newer driver (GIT snapshot up to commit e41040fb55e84fc72612ae503944a99814a5c9d3 , 15th Feb. 2016), but the X-Server still crashes. The stacktrace looks a bit different when using the GIT version, but still seems to be in the same area in the code (SNA page flips). The stackstace for the GIT version is in stacktrace_nodebug.txt. I have also compiled the driver with --enable-debug, then the X-Server exits due to an assertion (as seen in Xorg.0.log.debug). I'm also attaching the stacktrace for this case (stacktrace_debug.txt). The kernel is the Ubuntu Wily kernel (4.2.0-27-generic).
Created attachment 121894 [details] stack trace, without --enable-debug
Created attachment 121895 [details] stack trace, with --enable-debug
Created attachment 121896 [details] Xorg.0.log.debug, with --enable-debug
Hmm. First thought is that the kernel returned an event for an operation that failed, can you please quickly try with diff --git a/src/sna/sna_dri2.c b/src/sna/sna_dri2.c index fcfbd9d..7db6f61 100644 --- a/src/sna/sna_dri2.c +++ b/src/sna/sna_dri2.c @@ -1914,6 +1914,7 @@ sna_dri2_flip(struct sna_dri2_event *info) return false; } + assert(!info->queued); if (!sna_page_flip(info->sna, bo, sna_dri2_flip_handler, info->type == FLIP_ASYNC ? NULL : info)) return false; and --enable-debug.
Ok, found one scenario that could explain the traces after a failed flip when undocking, so please test with: commit 64b1b1f10da59f15a91141c9f76d7d09517f8ea8 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Feb 23 09:32:57 2016 +0000 sna/dri2: Ensure the flipqueue is drained on pageflip failure If we fail to queue a flip for a CRTC, we attempt to restore the original mode. However, if the failure is on a second CRTC, the queued flip on the first will still complete causing us to process the event twice. References: https://bugs.freedesktop.org/show_bug.cgi?id=94250 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When using the first patch (comment 4), the code did not exit at the new assertion, but at the one from before (from comment 3). I think I saw it one time exiting with (but failed to save the log and core dump): (EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed And the new patch (comment 5) unfortunately did not fix the problem. After a couple of tries the code seems to exit at one of two assertions. I will add the Xorg logs and corresponding stack traces. One of the assertions is "'info->front == front' failed" from above, so I got the log and stack trace for this one.
Created attachment 121918 [details] Xorg.0.log.1 Xorg log for assertion (EE) sna_dri2_event_free:1663 assertion '!info->signal' failed
Created attachment 121919 [details] stack_trace1.txt stack trace for assertion (EE) sna_dri2_event_free:1663 assertion '!info->signal' failed
Created attachment 121920 [details] Xorg.0.log.2 Xorg log for assertion (EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed
Created attachment 121921 [details] stack_trace2.txt stack trace for assertion (EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed
(In reply to cs_gon from comment #7) > Created attachment 121918 [details] > Xorg.0.log.1 > > Xorg log for assertion > > (EE) sna_dri2_event_free:1663 assertion '!info->signal' failed That is not a huge issue, just a bit of debug code needs cancelling along the error path. (In reply to cs_gon from comment #10) > Created attachment 121921 [details] > stack_trace2.txt > > stack trace for assertion > > (EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed This is actually more of an issue. This should reduce the assert commit 3593a2d18928f74ee470f824dc34b8b5b148ce2d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Feb 23 14:36:10 2016 +0000 sna/dri2: Reset front pointer on frame event across a modeset If the root window's pixmap is changed (e.g. to resizing the framebuffer) then an outstanding flip becomes invalid. The invalid flip is marked as having a stale front, and that triggers an assertion if the user then tries to schedule flip before the pending flip event is processed. References: https://bugs.freedesktop.org/show_bug.cgi?id=9425 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> but it is still a little bit of a puzzle how we did not drain the event queue first. That's probably explained by commit 4cd43faa646e368624079b73b216f6546ede5c16 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Feb 23 14:40:39 2016 +0000 sna: Flush the DRM event queue after a modeset Changing the mode will cause the DRM events(pageflips, vblanks) to be completed and the queue flushed. After applying the CRTC, drain the event queue. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The last two patches may have fixed it, or at least made things a lot better. I found out before, that disconnecting the external monitor from the docking station, instead of undocking the notebook, did cause this problem more frequently (maybe 75% of the time), and I was now unable to reproduce the issue when disconnecting the monitor. Though during a lot of docking and undocking, I got this assertion one time: (EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed I will attach the Xorg log and stack trace again.
Created attachment 121925 [details] Xorg.0.log.3 Xorg log for assertion: (EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed
Created attachment 121926 [details] stack_trace3.txt stack trace for assertion (EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed
After a bit more thought, the assertion was incorrect and we just expect the stale pointer and not NULL. Sadly, I squashed the change with commit d1672806a5222f00dcc2eb24ccddd03f727f71bc Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 24 10:33:22 2016 +0000 sna/dri2: Add active-scanout tracking to single CRTC flips
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.