Bug 99358 - Xorg crashes with SIGSEGV in sna_set_cursor_position()
Summary: Xorg crashes with SIGSEGV in sna_set_cursor_position()
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 99084 99431 99548
Depends on:
Blocks:
 
Reported: 2017-01-11 10:25 UTC by Igor Mammedov
Modified: 2018-12-10 19:47 UTC

See Also:
i915 platform:
i915 features:


Attachments
Xorg log (44.67 KB, text/x-log)
2017-01-11 10:25 UTC, Igor Mammedov
Take the input lock for RecolorCursor (2.54 KB, patch)
2017-01-16 22:23 UTC, Chris Wilson
Take input lock for xf86TransprentCursor (1.06 KB, patch)
2017-01-16 22:38 UTC, Chris Wilson
Take input lock for CheckHWCursor (1.90 KB, patch)
2017-01-20 09:52 UTC, Chris Wilson
Take input lock for CheckHWCursor (1.90 KB, patch)
2017-01-20 09:55 UTC, Chris Wilson

Description Igor Mammedov 2017-01-11 10:25:21 UTC
Created attachment 128887 [details]
Xorg log

The crash happens randomly; it can take anywhere from half an hour to 2 days to occur.
It seems to happen while the cursor is being moved.

I built xorg-x11-drv-intel from the latest git (commit 028c946df08), but the crash still happens.

Here is the crash backtrace:
Process 1715 (Xorg) of user 16585 dumped core.          
                Stack trace of thread 1728:
                #0  0x00007fdd4e5f0d54 sna_set_cursor_position (intel_drv.so)
                #1  0x00000000004bbea2 xf86MoveCursor (Xorg)
                #2  0x0000000000585eb3 miPointerMoveNoEvent (Xorg)
                #3  0x0000000000586cb4 miPointerSetPosition (Xorg)
                #4  0x000000000044d64e positionSprite.part.7 (Xorg)
                #5  0x000000000044de53 fill_pointer_events (Xorg)
                #6  0x000000000044f6df GetPointerEvents (Xorg)
                #7  0x000000000044fc90 QueuePointerEvents (Xorg)
                #8  0x00007fdd4c101cb5 xf86libinput_handle_motion (libinput_drv.so)
                #9  0x00007fdd4c102880 xf86libinput_read_input (libinput_drv.so)
                #10 0x000000000059cb1c InputReady (Xorg)
                #11 0x000000000059f181 ospoll_wait (Xorg)
                #12 0x000000000059c976 InputThreadDoWork (Xorg)
                #13 0x00007fdd530ac6ca start_thread (libpthread.so.0)
                #14 0x00007fdd52de6f7f __clone (libc.so.6)
                
                Stack trace of thread 1715:
                #0  0x00007fdd530b538d __lll_lock_wait (libpthread.so.0)
                #1  0x00007fdd530aeeca pthread_mutex_lock (libpthread.so.0)
                #2  0x000000000059c860 input_lock (Xorg)
                #3  0x00000000004bc386 xf86SetCursor (Xorg)
                #4  0x00000000004babf5 xf86CursorSetCursor (Xorg)
                #5  0x000000000058654b miPointerUpdateSprite (Xorg)
                #6  0x000000000058679a miPointerDisplayCursor (Xorg)
                #7  0x00000000004c9511 CursorDisplayCursor (Xorg)
                #8  0x0000000000518700 AnimCurDisplayCursor (Xorg)
                #9  0x000000000043fe48 ChangeToCursor (Xorg)
                #10 0x0000000000441287 WindowHasNewCursor (Xorg)
                #11 0x000000000046a948 ChangeWindowDeviceCursor (Xorg)
                #12 0x0000000000531dc6 ProcXIChangeCursor (Xorg)
                #13 0x0000000000437055 Dispatch (Xorg)
                #14 0x000000000043afd8 dix_main (Xorg)
                #15 0x00007fdd52cff401 __libc_start_main (libc.so.6)
                #16 0x0000000000424cfa _start (Xorg)
                
                Stack trace of thread 1722:
                #0  0x00007fdd530b2460 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fdd4e634539 __run__ (intel_drv.so)
                #2  0x00007fdd530ac6ca start_thread (libpthread.so.0)
                #3  0x00007fdd52de6f7f __clone (libc.so.6)

and gdb output:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  sna_set_cursor_position (scrn=<optimized out>, x=734, y=196) at sna_display.c:6332
6332				int xhot = sna->cursor.ref->bits->xhot;
[Current thread is 1 (Thread 0x7fdd49af3700 (LWP 1728))]
(gdb) bt
#0  0x00007fdd4e5f0d54 in sna_set_cursor_position (scrn=<optimized out>, x=734, y=196) at sna_display.c:6332
#1  0x00000000004bbea2 in xf86MoveCursor ()
#2  0x0000000000585eb3 in miPointerMoveNoEvent ()
#3  0x0000000000586cb4 in miPointerSetPosition ()
#4  0x000000000044d64e in positionSprite.part.7 ()
#5  0x000000000044de53 in fill_pointer_events ()
#6  0x000000000044f6df in GetPointerEvents ()
#7  0x000000000044fc90 in QueuePointerEvents ()
#8  0x00007fdd4c101cb5 in xf86libinput_handle_motion (pInfo=<optimized out>, pInfo=<optimized out>, event=
    0x7fdd44008b40) at xf86libinput.c:1254
#9  0x00007fdd4c101cb5 in xf86libinput_handle_event (event=event@entry=0x7fdd44008b40) at xf86libinput.c:1910
#10 0x00007fdd4c102880 in xf86libinput_read_input (pInfo=<optimized out>) at xf86libinput.c:1995
#11 0x000000000059cb1c in InputReady ()
#12 0x000000000059f181 in ospoll_wait ()
#13 0x000000000059c976 in InputThreadDoWork ()
#14 0x00007fdd530ac6ca in start_thread () at /lib64/libpthread.so.0
#15 0x00007fdd52de6f7f in clone () at /lib64/libc.so.6

(gdb) p sna->cursor
$1 = {cursors = 0x1cc6b80, info = 0x1712d60, ref = 0x1d9c310, serial = 5871, fg = 4294967295, bg = 4278190080, 
  size = 64, disable = false, active = true, last_x = 734, last_y = 196, max_size = 256, use_gtt = true, 
  num_stash = 0, stash = 0x1bd3310, scratch = 0x7fdd55411010}
(gdb) p sna->cursor.ref
$2 = (CursorPtr) 0x1d9c310
(gdb) p sna->cursor.ref->bits
$3 = (CursorBitsPtr) 0x1d9c348
(gdb) p sna->cursor.ref->bits->xhot
$4 = 4
(gdb) info locals
xhot = <optimized out>
yhot = <optimized out>
v = {v = {3.6462044663083995e-321, 2.6894028653599915e-317, 1.0000000000000444}}
hot = {v = {6.9459898994898221e-310, 2147483647, 6.9459898995133397e-310}}
crtc = 0x170a7b0
sna_crtc = 0x170a5b0
cursor = 0x1cc6bc0
arg = {flags = 0, crtc_id = 45, x = -2266, y = -601, width = 29351552, height = 0, handle = 0}
xf86_config = 0x1707af0
sna = 0x7fdd55453000
sigio = 0
c = 2


The same issue is tracked in Fedora BZ https://bugzilla.redhat.com/show_bug.cgi?id=1384486.

According to that BZ, the issue is mainly seen on docked Lenovo ThinkPads in multi-display setups, but there is a report [comment 50] where it is seen on a desktop.

xorg-x11-server-Xorg-1.19.0-3.fc25.x86_64
xorg-x11-drv-libinput-0.23.0-2.fc25.x86_64

The Xorg log is attached.
Comment 1 Brian J. Murrell 2017-01-13 16:58:28 UTC
So to clarify, I think "docked"/not-docked is a red herring.

The common factor in the downstream bug report at https://bugzilla.redhat.com/show_bug.cgi?id=1384486 seems to be having a rotated screen.

I'm not sure whether multiple screens have been determined to be a common factor, but it is one I share with several others.
Comment 2 Chris Wilson 2017-01-16 22:23:13 UTC
SIGSEGV on what appears to be a valid pointer (at least gdb thinks it is). Most notable about the locals is the fg/bg, which appear set, suggesting this is not an ARGB cursor but a bitmap. And there seems to be a missed lock in xfree86 for XRecolorCursor.
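
To illustrate the kind of race a missed lock allows, here is a minimal sketch in plain C with pthreads. It is not the Xorg code: `struct cursor`, `input_mutex`, `set_cursor()` and `run_cursor_demo()` are hypothetical names standing in for the server's cursor state and input lock. One thread replaces and frees the shared cursor while another reads its hotspot. Because both paths take the same mutex, the reader never dereferences a cursor that has already been freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the server's cursor state (not the Xorg types). */
struct cursor_bits { int xhot, yhot; };
struct cursor { struct cursor_bits *bits; };

static pthread_mutex_t input_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct cursor *current_cursor; /* shared by both threads */

/* Main-thread path: publish a new cursor under the lock, then free the old one. */
static void set_cursor(int xhot, int yhot)
{
    struct cursor *next = malloc(sizeof(*next));
    struct cursor *old;

    assert(next != NULL);
    next->bits = malloc(sizeof(*next->bits));
    assert(next->bits != NULL);
    next->bits->xhot = xhot;
    next->bits->yhot = yhot;

    pthread_mutex_lock(&input_mutex);
    old = current_cursor;
    current_cursor = next;
    pthread_mutex_unlock(&input_mutex);

    /* Safe: readers only dereference the pointer while holding the lock. */
    if (old != NULL) {
        free(old->bits);
        free(old);
    }
}

/* Input-thread path: read the hotspot under the same lock. */
static void *input_thread(void *arg)
{
    long *bad_reads = arg;

    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&input_mutex);
        if (current_cursor->bits->xhot != 4 || current_cursor->bits->yhot != 4)
            (*bad_reads)++;
        pthread_mutex_unlock(&input_mutex);
    }
    return NULL;
}

/* Runs both paths concurrently; returns the number of inconsistent reads. */
static long run_cursor_demo(void)
{
    pthread_t thread;
    long bad_reads = 0;

    set_cursor(4, 4); /* make sure a cursor exists before the reader starts */
    if (pthread_create(&thread, NULL, input_thread, &bad_reads) != 0)
        return -1;
    for (int i = 0; i < 100000; i++)
        set_cursor(4, 4);
    pthread_join(thread, NULL);
    return bad_reads;
}
```

If the lock and unlock calls were dropped from `set_cursor()`, the reader could load `current_cursor` just before it is freed and then dereference stale memory. That would be consistent with a crash on a pointer that looks valid by the time gdb inspects it, though the sketch does not prove that this is what happens in the server.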
Comment 3 Chris Wilson 2017-01-16 22:23:39 UTC
Created attachment 128990 [details] [review]
Take the input lock for RecolorCursor
Comment 4 Chris Wilson 2017-01-16 22:38:25 UTC
Created attachment 128991 [details] [review]
Take input lock for xf86TransprentCursor

Another path missing a lock.
Comment 5 Igor Mammedov 2017-01-20 09:28:21 UTC
With the patches from comments 3 and 4 applied, it managed not to crash for ~2 days,
but it did crash in the end.

I've split the line where it crashes to find the offending pointer, so here it goes:

       Message: Process 1565 (Xorg) of user 16585 dumped core.
                
                Stack trace of thread 1577:
                #0  0x00007f28b2fce188 sna_set_cursor_position (intel_drv.so)
                #1  0x00000000004bc462 xf86MoveCursor (Xorg)
                #2  0x0000000000586063 miPointerMoveNoEvent (Xorg)
                #3  0x0000000000586e64 miPointerSetPosition (Xorg)
                #4  0x000000000044d6ae positionSprite (Xorg)
                #5  0x000000000044deb3 positionSprite (Xorg)
                #6  0x000000000044f75f GetPointerEvents (Xorg)
                #7  0x000000000044fd10 QueuePointerEvents (Xorg)
                #8  0x00007f28b0d10cb5 xf86libinput_handle_motion (libinput_drv.so)
                #9  0x00007f28b0d11880 xf86libinput_read_input (libinput_drv.so)
                #10 0x000000000059ccec InputReady (Xorg)
                #11 0x000000000059f351 ospoll_wait (Xorg)
                #12 0x000000000059cb46 InputThreadDoWork (Xorg)
                #13 0x00007f28b78706ca start_thread (libpthread.so.0)
                #14 0x00007f28b75aaf7f __clone (libc.so.6)
                
                Stack trace of thread 1565:
                #0  0x00007f28b787938d __lll_lock_wait (libpthread.so.0)
                #1  0x00007f28b7872eca pthread_mutex_lock (libpthread.so.0)
                #2  0x000000000059ca30 input_lock (Xorg)
                #3  0x00000000004bc246 xf86SetCursor (Xorg)
                #4  0x00000000004bacd5 xf86CursorSetCursor (Xorg)
                #5  0x00000000005866fb miPointerUpdateSprite (Xorg)
                #6  0x000000000058694a miPointerDisplayCursor (Xorg)
                #7  0x00000000004c9601 CursorDisplayCursor (Xorg)
                #8  0x0000000000518830 AnimCurDisplayCursor (Xorg)
                #9  0x000000000043fea8 ChangeToCursor (Xorg)
                #10 0x00000000004412e7 WindowHasNewCursor (Xorg)
                #11 0x000000000046a9c8 ChangeWindowDeviceCursor (Xorg)
                #12 0x0000000000531f76 ProcXIChangeCursor (Xorg)
                #13 0x00000000004370b5 Dispatch (Xorg)
                #14 0x000000000043b038 dix_main (Xorg)
                #15 0x00007f28b74c3401 __libc_start_main (libc.so.6)
                #16 0x0000000000424d1a _start (Xorg)
                
                Stack trace of thread 1566:
                #0  0x00007f28b7876460 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007f28b300b769 __run__ (intel_drv.so)
                #2  0x00007f28b78706ca start_thread (libpthread.so.0)
                #3  0x00007f28b75aaf7f __clone (libc.so.6)

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f28b2fce188 in sna_set_cursor_position (scrn=0x1a2b700, x=119, y=523) at sna_display.c:6333
6333	                        CursorBitsPtr bits = ref->bits;

(gdb) l
6331			if (crtc->transform_in_use) {
6332	                        CursorPtr ref = sna->cursor.ref;
6333	                        CursorBitsPtr bits = ref->bits;
6334				int xhot = bits->xhot;
6335				int yhot = sna->cursor.ref->bits->yhot;
6336				struct pict_f_vector v, hot;

(gdb) p sna->cursor.ref
$1 = (CursorPtr) 0x2478ef0
(gdb) p *sna->cursor.ref
$2 = {bits = 0x2478f28, foreRed = 0, foreGreen = 0, foreBlue = 0, backRed = 65535, backGreen = 65535, 
  backBlue = 65535, refcnt = 4, devPrivates = 0x2478f20, id = 20973559, serialNumber = 1368, name = 0}

(gdb) p sna->cursor
$3 = {cursors = 0x1eba540, info = 0x1a37c80, ref = 0x2478ef0, serial = 47981, fg = 4278190080, bg = 4294967295, 
  size = 64, disable = false, active = true, last_x = 119, last_y = 523, max_size = 256, use_gtt = true, 
  num_stash = 0, stash = 0x1e6f980, scratch = 0x7f28b99ac010}
Comment 6 Chris Wilson 2017-01-20 09:52:36 UTC
Created attachment 129063 [details] [review]
Take input lock for CheckHWCursor

Missed a rather important one where we update the hw cursor.
Comment 7 Chris Wilson 2017-01-20 09:55:44 UTC
Created attachment 129064 [details] [review]
Take input lock for CheckHWCursor

s/break/goto unlock/
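
The one-line note above suggests that an early exit was changed to jump to a shared unlock label. The contents of the patch are not shown in this thread, so the following is only a sketch of the general C idiom, with hypothetical names (`check_hw_cursor()`, `input_mutex`, `primary_ok`, `secondary_ok`): every exit path taken after the lock is acquired goes through a single label that releases it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t input_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical check: the primary screen and every secondary screen must
 * accept the cursor. All names here are illustrative.
 */
static bool check_hw_cursor(bool primary_ok, const bool *secondary_ok, int count)
{
    bool use_hw_cursor = true;

    assert(count >= 0);
    pthread_mutex_lock(&input_mutex);

    if (!primary_ok) {
        use_hw_cursor = false;
        goto unlock; /* a bare return here would leave the mutex held */
    }

    for (int i = 0; i < count; i++) {
        if (!secondary_ok[i]) {
            use_hw_cursor = false;
            goto unlock;
        }
    }

unlock:
    pthread_mutex_unlock(&input_mutex);
    return use_hw_cursor;
}
```

A single exit label keeps the lock and unlock calls paired even when more early exits are added later.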
Comment 8 Chris Wilson 2017-01-21 10:39:24 UTC
*** Bug 99431 has been marked as a duplicate of this bug. ***
Comment 9 Alex Fiestas 2017-01-22 00:46:20 UTC
I'm hitting the same issue. In my case the transformation is a scale rather than a rotation, but I guess what matters is having a transformation at all.

The issue is not reproducible with xorg-server 1.18.4.
Comment 10 Alex Fiestas 2017-01-22 00:56:11 UTC
The crash seems to be gone after all three patches are applied.

Before the patches, I could crash Xorg 100% of the time within 10 seconds of moving the mouse at the top-left corner of my screen.

After the patches, I have not been able to crash it.

Thanks Chris!
Comment 11 Igor Mammedov 2017-01-27 11:44:05 UTC
(In reply to Chris Wilson from comment #7)
> Created attachment 129064 [details] [review]
> Take input lock for CheckHWCursor
> 
> s/break/goto unlock/

It hasn't crashed for a week with this and the previous 2 patches.
Taking into account the other positive feedback, I'd consider the issue fixed.
Comment 12 ian.frost 2017-01-27 21:23:53 UTC
Hi
Excuse the dumb question:
Where do I get these patches from?
Comment 13 Peter Hutterer 2017-02-01 03:01:35 UTC
fwiw, these patches were in a scratch build here and it looks like they fix the crashes: https://bugzilla.redhat.com/show_bug.cgi?id=1384486#c77
Comment 14 languitar 2017-02-07 10:14:12 UTC
I am experiencing this as well with a rotated screen. Would be nice if the patches could get applied soon.
Comment 15 Peter Hutterer 2017-02-07 23:53:02 UTC
commits 7198a6d4e74f684cb383b3e0f70dd2bae405e6e7, 
cfddd919cce4178baba07959e5e862d02e166522 and
3eb964e25243056dd998f52d3b00171b71c89189
Comment 16 Chris Wilson 2017-02-10 20:03:39 UTC
*** Bug 99548 has been marked as a duplicate of this bug. ***
Comment 17 Chris Wilson 2017-02-10 22:21:55 UTC
*** Bug 99084 has been marked as a duplicate of this bug. ***
Comment 18 Tsu Jan 2017-08-15 18:27:03 UTC
The crash still happens with xorg-server-1.19.3-2 and with the same backtrace (miPointerSetPosition), although maybe less frequently. I have Intel with modesetting under Manjaro.

The crash is random and I haven't found a way to reproduce it but it may be related to fast cursor movements.
Comment 19 Tsu Jan 2017-08-15 18:56:28 UTC
On second thought, this may be another bug:

Stack trace of thread 9284:
#0  0x0000000000583652 n/a (Xorg)
#1  0x0000000000584504 miPointerSetPosition (Xorg)
#2  0x000000000044cfde n/a (Xorg)
#3  0x000000000044d7e3 n/a (Xorg)
#4  0x000000000044f08f GetPointerEvents (Xorg)
#5  0x000000000044f640 QueuePointerEvents (Xorg)
#6  0x00007f979de361e5 n/a (libinput_drv.so)
#7  0x00007f979de36d60 n/a (libinput_drv.so)
#8  0x000000000059a4bc n/a (Xorg)
#9  0x000000000059cbb1 n/a (Xorg)
#10 0x000000000059a316 n/a (Xorg)
#11 0x00007f97a5957049 start_thread (libpthread.so.0)
#12 0x00007f97a5697f0f __clone (libc.so.6)

Stack trace of thread 9282:
#0  0x00007f97a596054c __lll_lock_wait (libpthread.so.0)
#1  0x00007f97a5959976 pthread_mutex_lock (libpthread.so.0)
#2  0x000000000059a200 input_lock (Xorg)
#3  0x0000000000595752 TimerSet (Xorg)
#4  0x000000000054b4c2 AccessXFilterReleaseEvent (Xorg)
#5  0x000000000057ac39 n/a (Xorg)
#6  0x00000000004d4675 n/a (Xorg)
#7  0x00000000004369e5 n/a (Xorg)
#8  0x000000000043a968 n/a (Xorg)
#9  0x00007f97a55ca4ca __libc_start_main (libc.so.6)
#10 0x000000000042464a _start (Xorg)

