Bug 94685 - Recent change makes Xorg hang
Summary: Recent change makes Xorg hang
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 94686
Depends on:
Blocks:
 
Reported: 2016-03-24 15:21 UTC by Alexandre
Modified: 2016-03-26 16:46 UTC (History)
CC List: 1 user

See Also:
i915 platform:
i915 features:


Attachments
Xorg.log with full debug (5.05 MB, application/gzip)
2016-03-24 18:53 UTC, Alexandre

Description Alexandre 2016-03-24 15:21:15 UTC
I've traced this to commit 02f535e8f3659f1147c6f2e698bd5d8730dec19b; after reverting it, Xorg worked again.

To reproduce, hover the mouse over a recent kde5 menu and Xorg hangs in an infinite loop.

Connecting via ssh I was able to ask gdb for this live backtrace:

#0  0x00007f3b82cde577 in ioctl () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f3b83faf558 in drmIoctl ()
   from /usr/lib/x86_64-linux-gnu/libdrm.so.2
#2  0x00007f3b7e6e32da in sna_present_queue ()
   from /usr/lib/xorg/modules/drivers/intel_drv.so
#3  0x00007f3b7e6e3448 in vblank_complete ()
   from /usr/lib/xorg/modules/drivers/intel_drv.so
#4  0x00007f3b7e6e395e in sna_present_vblank_handler ()
   from /usr/lib/xorg/modules/drivers/intel_drv.so
#5  0x00007f3b7e65109c in sna_mode_wakeup ()
   from /usr/lib/xorg/modules/drivers/intel_drv.so
#6  0x00007f3b7e654c10 in sna_wakeup_handler ()
   from /usr/lib/xorg/modules/drivers/intel_drv.so
#7  0x000055999747577a in WakeupHandler ()
#8  0x00005599975cbbaf in WaitForSomething ()
#9  0x00005599974709ce in ?? ()
#10 0x0000559997474bb3 in ?? ()
#11 0x00007f3b82c1d610 in __libc_start_main (main=0x55999745ef10, argc=8, 
    argv=0x7ffca32f6918, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7ffca32f6908) at libc-start.c:291
#12 0x000055999745ef49 in _start ()

lspci reports 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)

It's an Intel(R) Core(TM) i5-2410M system, and the Xorg log reports:
[  1972.493] (II) intel: Driver for Intel(R) HD Graphics: 2000-6000
[  1972.493] (II) intel: Driver for Intel(R) Iris(TM) Graphics: 5100, 6100
[  1972.493] (II) intel: Driver for Intel(R) Iris(TM) Pro Graphics: 5200, 6200, P6300
[  1972.493] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20151010
[  1972.493] (II) intel(0): SNA compiled from 2.99.917-581-g1b82b7b
[  1972.493] (--) intel(0): Integrated Graphics Chipset: Intel(R) HD Graphics 3000
[  1972.493] (--) intel(0): CPU: x86-64, sse2, sse3, ssse3, sse4.1, sse4.2, avx; using a maximum of 2 threads
[  1972.493] (II) intel(0): Creating default Display subsection in Screen section
Comment 1 Alexandre 2016-03-24 15:22:01 UTC
The xorg.log info above is from a subsequent (i.e. working) build; the log from the problematic build was not recorded.
Comment 2 Chris Wilson 2016-03-24 15:33:02 UTC
Pretty please could I ask for an xorg.log with xf86-video-intel compiled with ./configure --enable-debug=full ?
Comment 3 Alexandre 2016-03-24 16:08:15 UTC
Sure, I'll come back with it in a few hours.
Comment 4 Chris Wilson 2016-03-24 18:32:56 UTC
So I think it was just because the test should be for 31 bits and not 32 bits (or at least you hit the same bug I just reproduced in a testcase here):

commit c186d4dda3b62b73af3caf2883a9cedfd97e3b45
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 24 18:22:20 2016 +0000

    sna/present: Restrict vblank.sequence range to 31bits
    
    The kernel checks for past vblanks using an int32_t comparison, so we
    can only program up to 31bits into the future (and similarly programing
    a timer that large would also overflow).
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=94685
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The debug log would still be very helpful, thanks.
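
To illustrate the int32_t comparison that the commit message describes, here is a minimal sketch in C. The helper names `seq_has_passed` and `msc_in_range` are hypothetical and are not taken from the kernel or from xf86-video-intel; the sketch only shows why a target more than 31 bits ahead cannot be told apart from one in the past, and it assumes the usual two's-complement conversion from uint32_t to int32_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: treats the difference of two 32-bit vblank counters
 * as a signed value. A non-negative result means `target` is at or before
 * `current`, i.e. the requested vblank is considered already passed.
 * Assumes two's-complement wrap on the cast, as on common compilers. */
static bool seq_has_passed(uint32_t current, uint32_t target)
{
	return (int32_t)(current - target) >= 0;
}

/* Hypothetical helper: a 64-bit target MSC is only safe to program if it is
 * less than 2^31 vblanks ahead of the last known MSC. The subtraction is
 * unsigned, so a target behind last_msc wraps to a huge value and is also
 * rejected. */
static bool msc_in_range(uint64_t target_msc, uint64_t last_msc)
{
	return target_msc - last_msc < (1ull << 31);
}
```

With these helpers, a target one vblank ahead is reported as pending, while a target 2^32 - 1 ahead aliases to "one vblank ago" after truncation to 32 bits and is reported as already passed.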
Comment 5 Chris Wilson 2016-03-24 18:41:52 UTC
*** Bug 94686 has been marked as a duplicate of this bug. ***
Comment 6 Alexandre 2016-03-24 18:53:42 UTC
Created attachment 122525
Xorg.log with full debug

The Xorg.log with debug. After the freeze, Xorg was hard-killed by alt+sysrq+i and the log was copied from the console.

The bug seems to be timing-sensitive: with debug enabled, the freeze was harder to reproduce.
Comment 7 FadeMind 2016-03-24 19:00:00 UTC
@Chris Wilson 

After installing xf86-video-intel-git 1:2.99.917+587+gc186d4d-1, based on
https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=c186d4dda3b62b73af3caf2883a9cedfd97e3b45

I no longer get freezes during normal desktop use, so IMO this is solved.

Logs (clean, no freeze):

https://www.dropbox.com/s/agky6qmyvmoao4g/xorg-dbg-c186d4dda3b.tar.xz?dl=1
Comment 8 Chris Wilson 2016-03-24 19:16:13 UTC
(In reply to FadeMind from comment #7)
> @Chris Wilson 
> 
> After install xf86-video-intel-git 1:2.99.917+587+gc186d4d-1
> based on
> https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/
> ?id=c186d4dda3b62b73af3caf2883a9cedfd97e3b45
> 
> and while of standard desktop usage I don't have freeze. So IMO is solved.
> 
> LOGS - clean - no freeze 
> 
> https://www.dropbox.com/s/agky6qmyvmoao4g/xorg-dbg-c186d4dda3b.tar.xz?dl=1

Yup, it caught one bogus MSC:

[    98.669] sna_present_queue_vblank(pipe=0, event=9756, msc=4294973158, last swap=5863)
[    98.672] sna_present_queue_vblank:365 assertion 'msc - swap->msc < 1ull<<31' failed

which matches the bug Alexandre is hitting:

[ 14602.968] sna_present_queue_vblank(pipe=0, event=2503, msc=4295836851, last swap=869556)
[ 14602.968] sna_present_queue: target msc=4295836851, seq=869555 (last_msc=869556)


Thank you both.
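
The arithmetic behind the two log excerpts above can be checked with a short C sketch. In both cases the requested msc is exactly 2^32 - 1 ahead of the last swap, and keeping only its low 32 bits gives a sequence one less than the last MSC, which matches the logged seq=869555 against last_msc=869556. The helper names are hypothetical, and the truncation to 32 bits is an assumption that is consistent with that log line rather than something quoted from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: distance from the last swap to the requested
 * target MSC, computed in unsigned 64-bit arithmetic. */
static uint64_t msc_delta(uint64_t msc, uint64_t last_swap)
{
	return msc - last_swap;
}

/* Hypothetical helper: low 32 bits of a 64-bit MSC, assumed here to be
 * what is handed on as the vblank sequence number. */
static uint32_t msc_to_seq(uint64_t msc)
{
	return (uint32_t)msc;
}
```

For example, `msc_delta(4295836851, 869556)` is 4294967295 (0xffffffff), and `msc_to_seq(4295836851)` is 869555, one vblank before the last swap.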
Comment 9 Joakim Tjernlund 2016-03-24 21:13:32 UTC
(In reply to Chris Wilson from comment #4)
> So I think it was just because the test should be for 31bits and not 32bits
> (or at least you hit the same bug I just reproduced in a testcase here):
> 
> commit c186d4dda3b62b73af3caf2883a9cedfd97e3b45
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Mar 24 18:22:20 2016 +0000
> 
>     sna/present: Restrict vblank.sequence range to 31bits
>     
>     The kernel checks for past vblanks using an int32_t comparison, so we
>     can only program up to 31bits into the future (and similarly programing
>     a timer that large would also overflow).
>     
>     References: https://bugs.freedesktop.org/show_bug.cgi?id=94685
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The debug log would still be very helpful, thanks.

+ if (warn_unless(msc - swap->msc < 1ull<<31))

Are you not allowed to use the lower bits here (as in 0x7fffffff)?
Comment 10 Chris Wilson 2016-03-24 21:27:33 UTC
Only the lowest bits. The warn_unless() reads backwards; I should convert it to a warn_on() instead.
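
As a rough illustration of why the two spellings are interchangeable, here is a sketch in C. The definitions of warn_unless() and warn_on() are not shown in this thread, so `warn_unless_sketch` and `warn_on_sketch` below are hypothetical stand-ins: it is assumed that warn_unless() complains and returns true when its condition is false (consistent with the "assertion ... failed" log line in comment 8), and that warn_on() complains and returns true when its condition is true.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for warn_unless(): warns when `ok` is false and
 * returns true in that case, so `if (warn_unless(...))` guards the
 * error path. */
static bool warn_unless_sketch(bool ok, const char *expr)
{
	if (!ok)
		fprintf(stderr, "assertion '%s' failed\n", expr);
	return !ok;
}

/* Hypothetical stand-in for warn_on(): warns when `bad` is true and
 * returns it unchanged. */
static bool warn_on_sketch(bool bad, const char *expr)
{
	if (bad)
		fprintf(stderr, "warning: '%s'\n", expr);
	return bad;
}

/* The check as quoted in comment 9: reject unless the target is within
 * 31 bits of the last swap. */
static bool reject_with_warn_unless(uint64_t msc, uint64_t last)
{
	return warn_unless_sketch(msc - last < (1ull << 31),
				  "msc - swap->msc < 1ull<<31");
}

/* The same check with the condition inverted, in warn_on() form. */
static bool reject_with_warn_on(uint64_t msc, uint64_t last)
{
	return warn_on_sketch(msc - last >= (1ull << 31),
			      "msc - swap->msc >= 1ull<<31");
}
```

Under these assumptions both functions reject exactly the same inputs; the warn_on() form simply states the failing condition directly, which is easier to read inside an `if`.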
Comment 11 Joakim Tjernlund 2016-03-25 11:51:54 UTC
(In reply to Chris Wilson from comment #10)
> Only the lowest bits. The warn_unless() is backwards, I should convert it to
> a warn_on() instead.

Right, it is backwards. But then again, so am I :)
Comment 12 Alexandre 2016-03-26 16:46:01 UTC
(In reply to Chris Wilson from comment #8)
> (In reply to FadeMind from comment #7)
> > @Chris Wilson 
> > 
> > After install xf86-video-intel-git 1:2.99.917+587+gc186d4d-1
> > based on
> > https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/
> > ?id=c186d4dda3b62b73af3caf2883a9cedfd97e3b45
> > 
> > and while of standard desktop usage I don't have freeze. So IMO is solved.
> > 
> > LOGS - clean - no freeze 
> > 
> > https://www.dropbox.com/s/agky6qmyvmoao4g/xorg-dbg-c186d4dda3b.tar.xz?dl=1
> 
> Yup, it caught one bogus MSC:
> 
> [    98.669] sna_present_queue_vblank(pipe=0, event=9756, msc=4294973158,
> last swap=5863)
> [    98.672] sna_present_queue_vblank:365 assertion 'msc - swap->msc <
> 1ull<<31' failed
> 
> which matches the bug Alexandre is hitting:
> 
> [ 14602.968] sna_present_queue_vblank(pipe=0, event=2503, msc=4295836851,
> last swap=869556)
> [ 14602.968] sna_present_queue: target msc=4295836851, seq=869555
> (last_msc=869556)
> 
> 
> Thank you both.

Thank you, I can confirm it's fixed.

