Summary: | Intel driver crashes when undocking Lenovo T450s | |
---|---|---|---
Product: | xorg | Reporter: | cs_gon
Component: | Driver/intel | Assignee: | Chris Wilson <chris>
Status: | RESOLVED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | major | |
Priority: | medium | |
Version: | git | |
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Description
cs_gon
2016-02-22 16:51:20 UTC
Created attachment 121894 [details]
stack trace, without --enable-debug
Created attachment 121895 [details]
stack trace, with --enable-debug
Created attachment 121896 [details]
Xorg.0.log.debug, with --enable-debug
Hmm. First thought is that the kernel returned an event for an operation that failed, can you please quickly try with

diff --git a/src/sna/sna_dri2.c b/src/sna/sna_dri2.c
index fcfbd9d..7db6f61 100644
--- a/src/sna/sna_dri2.c
+++ b/src/sna/sna_dri2.c
@@ -1914,6 +1914,7 @@ sna_dri2_flip(struct sna_dri2_event *info)
                return false;
        }
 
+       assert(!info->queued);
        if (!sna_page_flip(info->sna, bo, sna_dri2_flip_handler,
                           info->type == FLIP_ASYNC ? NULL : info))
                return false;

and --enable-debug.

Ok, found one scenario that could explain the traces after a failed flip when undocking, so please test with:

commit 64b1b1f10da59f15a91141c9f76d7d09517f8ea8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 23 09:32:57 2016 +0000

    sna/dri2: Ensure the flipqueue is drained on pageflip failure

    If we fail to queue a flip for a CRTC, we attempt to restore the original
    mode. However, if the failure is on a second CRTC, the queued flip on
    the first will still complete causing us to process the event twice.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=94250
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

When using the first patch (comment 4), the code did not exit at the new assertion, but at the one from before (from comment 3). I think I saw it one time exiting with (but failed to save the log and core dump):

(EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed

And the new patch (comment 5) unfortunately did not fix the problem. After a couple of tries the code seems to exit at one of two assertions. I will add the Xorg logs and corresponding stack traces.

One of the assertions is "'info->front == front' failed" from above, so I got the log and stack trace for this one.

Created attachment 121918 [details]
Xorg.0.log.1
Xorg log for assertion
(EE) sna_dri2_event_free:1663 assertion '!info->signal' failed
Created attachment 121919 [details]
stack_trace1.txt
stack trace for assertion
(EE) sna_dri2_event_free:1663 assertion '!info->signal' failed
Created attachment 121920 [details]
Xorg.0.log.2
Xorg log for assertion
(EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed
Created attachment 121921 [details]
stack_trace2.txt
stack trace for assertion
(EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed
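The double-processing scenario described in the commit message "sna/dri2: Ensure the flipqueue is drained on pageflip failure" can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name below (`toy_screen`, `toy_page_flip`, `toy_present`, the `drain_on_failure` switch) is hypothetical and none of it is the driver's code. It assumes the completion listener is attached to the first CRTC's flip, and that the caller completes the request itself when queuing fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CRTCS 4

/* Hypothetical stand-in for a per-CRTC queued flip; not the driver's struct. */
struct toy_flip {
    bool queued;   /* a completion event is still expected from the "kernel" */
    void *data;    /* listener to notify on completion, NULL means nobody */
};

struct toy_screen {
    struct toy_flip flips[TOY_MAX_CRTCS];
    int num_crtcs;
    int fail_at;      /* index of the CRTC whose flip request fails, -1 for none */
    int completions;  /* how many times the request was completed */
};

/* Queue a flip on one CRTC; fails for the CRTC selected by fail_at. */
static bool toy_queue_flip(struct toy_screen *s, int crtc, void *data)
{
    if (crtc == s->fail_at)
        return false;
    s->flips[crtc].queued = true;
    s->flips[crtc].data = data;
    return true;
}

/* Queue flips on every CRTC; only the first one carries the listener. */
static bool toy_page_flip(struct toy_screen *s, void *data, bool drain_on_failure)
{
    for (int i = 0; i < s->num_crtcs; i++) {
        if (toy_queue_flip(s, i, i == 0 ? data : NULL))
            continue;
        if (drain_on_failure) {
            /* Discard the flips already queued on earlier CRTCs. */
            for (int j = 0; j < i; j++) {
                s->flips[j].queued = false;
                s->flips[j].data = NULL;
            }
        }
        return false;
    }
    return true;
}

/* Deliver pending completion events to their listeners. */
static void toy_handle_events(struct toy_screen *s)
{
    for (int i = 0; i < s->num_crtcs; i++) {
        if (!s->flips[i].queued)
            continue;
        s->flips[i].queued = false;
        if (s->flips[i].data != NULL)
            s->completions++;
        s->flips[i].data = NULL;
    }
}

/* Present one frame and return how often the request ended up completed. */
static int toy_present(struct toy_screen *s, void *data, bool drain_on_failure)
{
    if (!toy_page_flip(s, data, drain_on_failure))
        s->completions++;     /* fallback path completes the request itself */
    toy_handle_events(s);     /* later, the queued events arrive */
    return s->completions;
}
```

In this model, a failure on the second of two CRTCs completes the request twice when the queue is not drained and once when it is. How the real commit achieves the drain is not shown in this thread.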
(In reply to cs_gon from comment #7)
> Created attachment 121918 [details]
> Xorg.0.log.1
>
> Xorg log for assertion
>
> (EE) sna_dri2_event_free:1663 assertion '!info->signal' failed

That is not a huge issue, just a bit of debug code needs cancelling along the error path.

(In reply to cs_gon from comment #10)
> Created attachment 121921 [details]
> stack_trace2.txt
>
> stack trace for assertion
>
> (EE) sna_dri2_schedule_flip:3078 assertion 'info->front == front' failed

This is actually more of an issue. This should reduce the assert

commit 3593a2d18928f74ee470f824dc34b8b5b148ce2d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 23 14:36:10 2016 +0000

    sna/dri2: Reset front pointer on frame event across a modeset

    If the root window's pixmap is changed (e.g. to resizing the framebuffer)
    then an outstanding flip becomes invalid. The invalid flip is marked as
    having a stale front, and that triggers an assertion if the user then
    tries to schedule flip before the pending flip event is processed.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=9425
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

but it is still a little bit of a puzzle how we did not drain the event queue first.

That's probably explained by

commit 4cd43faa646e368624079b73b216f6546ede5c16
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 23 14:40:39 2016 +0000

    sna: Flush the DRM event queue after a modeset

    Changing the mode will cause the DRM events(pageflips, vblanks) to be
    completed and the queue flushed. After applying the CRTC, drain the
    event queue.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The last two patches may have fixed it, or at least made things a lot better. I found out before, that disconnecting the external monitor from the docking station, instead of undocking the notebook, did cause this problem more frequently (maybe 75% of the time), and I was now unable to reproduce the issue when disconnecting the monitor.
Though during a lot of docking and undocking, I got this assertion one time:

(EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed

I will attach the Xorg log and stack trace again.

Created attachment 121925 [details]
Xorg.0.log.3
Xorg log for assertion:
(EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed
Created attachment 121926 [details]
stack_trace3.txt
stack trace for assertion
(EE) sna_dri2_schedule_flip:3080 assertion 'info->front == NULL' failed
After a bit more thought, the assertion was incorrect and we just expect the stale pointer and not NULL. Sadly, I squashed the change with

commit d1672806a5222f00dcc2eb24ccddd03f727f71bc
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 24 10:33:22 2016 +0000

    sna/dri2: Add active-scanout tracking to single CRTC flips
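One possible reading of the stale-pointer remark, sketched as a toy model: after a modeset, a pending event may still hold the front buffer it recorded earlier, so the code scheduling the next flip has to tolerate a mismatch rather than assert a particular value. The types and the function `toy_schedule_flip` below are hypothetical illustrations and are not taken from the driver or from the commit above.

```c
#include <assert.h>

/* Hypothetical types for illustration; these are not the driver's structs. */
struct toy_buffer { int id; };

struct toy_event {
    struct toy_buffer *front;  /* front buffer recorded when the flip was queued */
};

enum toy_result { TOY_CHAINED, TOY_REBOUND };

/*
 * Schedule a flip while an earlier event is still pending. A strict version
 * would assert(pending->front == front); this tolerant version treats a
 * mismatch as a stale pointer left behind by a modeset and rebinds it.
 */
static enum toy_result
toy_schedule_flip(struct toy_event *pending, struct toy_buffer *front)
{
    if (pending->front != front) {
        pending->front = front;
        return TOY_REBOUND;
    }
    return TOY_CHAINED;
}
```

In this sketch, a caller can use the return value to tell an ordinary chained flip from one that crossed a modeset.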