Bug 97765

Summary: REGRESSION: Buffer overrun on launch in SmartScheduleClient
Product: xorg Reporter: Jeremy Huddleston Sequoia <jeremyhu>
Component: Server/GeneralAssignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Crash log
none
different ASan crash log from d6eff3c31e8289881a3aa9b858e5710d0f741db0
none
Crash log / ASan report for 8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e none

Description Jeremy Huddleston Sequoia 2016-09-11 07:31:22 UTC
master 527c6baa294d17c5eca1d87ac941844872e90dac (technically dd85834e3995671da908e825eaa7a228d11f0b3d which is in the pull queue to address macOS build failures).

Built with ASan, X11 crashes on launch.  I haven't tried master for a while, so I'm not quite sure when this regressed.

ASan report is:

Application Specific Information:
X.Org X Server 1.18.99.1 Build Date: 20160910
=================================================================
==16921==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000108ce3834 at pc 0x000108880766 bp 0x7000045f76c0 sp 0x7000045f76b8
READ of size 4 at 0x000108ce3834 thread T6
    #0 0x108880765 in SmartScheduleClient dispatch.c:365
    #1 0x10887ecc5 in Dispatch dispatch.c:422
    #2 0x1088c05f1 in dix_main main.c:301
    #3 0x1082aabba in server_thread quartzStartup.c:66
    #4 0x7fffc5f16aaa in _pthread_body (libsystem_pthread.dylib+0x3aaa)
    #5 0x7fffc5f169f6 in _pthread_start (libsystem_pthread.dylib+0x39f6)
    #6 0x7fffc5f161fc in thread_start (libsystem_pthread.dylib+0x31fc)
 
0x000108ce3834 is located 12 bytes to the left of global variable 'nClients' defined in 'dispatch.c:164:12' (0x108ce3840) of size 4
0x000108ce3834 is located 48 bytes to the right of global variable 'nextFreeClientID' defined in 'dispatch.c:162:12' (0x108ce3800) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow dispatch.c:365 in SmartScheduleClient
Shadow bytes around the buggy address:
  0x10002119c6b0: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x10002119c6c0: 00 00 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x10002119c6d0: 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x10002119c6e0: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x10002119c6f0: 00 f9 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
=>0x10002119c700: 04 f9 f9 f9 f9 f9[f9]f9 04 f9 f9 f9 f9 f9 f9 f9
  0x10002119c710: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x10002119c720: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x10002119c730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002119c740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002119c750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
Thread T6 created by T0 here:
    #0 0x1091c83e9 in wrap_pthread_create (libclang_rt.asan_osx_dynamic.dylib+0x413e9)
    #1 0x1082aa941 in create_thread quartzStartup.c:78
    #2 0x1082aa79c in QuartzInitServer quartzStartup.c:95
    #3 0x10827f26a in X11ApplicationMain X11Application.m:1286
    #4 0x108290590 in X11ControllerMain X11Controller.m:984
    #5 0x1082aaf95 in server_main quartzStartup.c:127
    #6 0x10825512b in do_start_x11_server bundle-main.c:436
    #7 0x108258854 in _Xstart_x11_server mach_startupServer.c:189
    #8 0x108259bb6 in mach_startup_server mach_startupServer.c:399
    #9 0x7fffc5e26186 in mach_msg_server (libsystem_kernel.dylib+0x12186)
    #10 0x108255848 in main bundle-main.c:774
    #11 0x7fffc5cff254 in start (libdyld.dylib+0x5254)
Comment 1 Jeremy Huddleston Sequoia 2016-09-11 07:44:36 UTC
Created attachment 126457 [details]
Crash log
Comment 2 Jeremy Huddleston Sequoia 2016-09-11 08:03:48 UTC
Created attachment 126458 [details]
different ASan crash log from d6eff3c31e8289881a3aa9b858e5710d0f741db0
Comment 3 Jeremy Huddleston Sequoia 2016-09-11 08:15:49 UTC
$ git bisect log
git bisect start
# good: [7762a602c1dfdd8cfcf2b8c2281cf4d683d05216] dix/os: Merge priority computation into SmartScheduleClient
git bisect good 7762a602c1dfdd8cfcf2b8c2281cf4d683d05216
# bad: [527c6baa294d17c5eca1d87ac941844872e90dac] xkb: fix check for appending '|' character when applying rules
git bisect bad 527c6baa294d17c5eca1d87ac941844872e90dac
# bad: [28adb0d36982a0051ce6a9d1375aac0354ef2af4] xquartz: Update for removal of AddEnabledDevice and RemoveEnabledDevice
git bisect bad 28adb0d36982a0051ce6a9d1375aac0354ef2af4
# bad: [dff0c1471ff8532a9d6d85e640a0c4fe35db7c00] xquartz: Update for removal of AddEnabledDevice and RemoveEnabledDevice
git bisect bad dff0c1471ff8532a9d6d85e640a0c4fe35db7c00
# good: [4eccad7655518651f60eda35db4e5d5da84f6c19] xquartz: Update for removal of AddEnabledDevice and RemoveEnabledDevice
git bisect good 4eccad7655518651f60eda35db4e5d5da84f6c19
# good: [711c36558f50943c8342f25ad210281134887a3d] os: Add poll emulation for mingw [v2]
git bisect good 711c36558f50943c8342f25ad210281134887a3d
# bad: [30bc0732f959bbc63f318c06d48de080d495da32] os: Use ospoll for input thread [v2]
git bisect bad 30bc0732f959bbc63f318c06d48de080d495da32
# skip: [d6eff3c31e8289881a3aa9b858e5710d0f741db0] os: Add ospoll interface [v2]
git bisect skip d6eff3c31e8289881a3aa9b858e5710d0f741db0
# skip: [8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e] dix: Use list for ready clients
git bisect skip 8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e
# good: [d403aca70a07e1401cb93738f1af5961582a2e47] Switch poll() users to xserver_poll()
git bisect good d403aca70a07e1401cb93738f1af5961582a2e47
# bad: [f993091e7db81b0420e23c485378cba112278839] os: Switch server to poll(2) [v3]
git bisect bad f993091e7db81b0420e23c485378cba112278839
# only skipped commits left to test
# possible first bad commit: [f993091e7db81b0420e23c485378cba112278839] os: Switch server to poll(2) [v3]
# possible first bad commit: [8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e] dix: Use list for ready clients
# possible first bad commit: [d6eff3c31e8289881a3aa9b858e5710d0f741db0] os: Add ospoll interface [v2]
Comment 4 Jeremy Huddleston Sequoia 2016-09-11 08:24:01 UTC
    os: Switch server to poll(2) [v3]   # Original log's signature
    dix: Use list for ready clients     # Second log's signature
    os: Add ospoll interface [v2]       # Second log's signature

So the issue(s) were introduced by the ospoll changes.
Comment 5 Jeremy Huddleston Sequoia 2016-09-11 08:37:33 UTC
Created attachment 126459 [details]
Crash log / ASan report for 8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e
Comment 6 Jeremy Huddleston Sequoia 2016-09-11 08:39:15 UTC
'os: Add ospoll interface [v2]' was ok.

I suspect the issue landed in 8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e, and then the signature changed likely due to refactoring in f993091e7db81b0420e23c485378cba112278839.
Comment 7 Jeremy Huddleston Sequoia 2016-09-11 09:29:27 UTC
The report with the InitClient signature looks to be because of bad dependency tracking in dix/Makefile.am because something wasn't getting rebuilt that needed to.  It went away with a clean.  I'll look into that separately.

The report with the RandR signature is because of a race condition initializing the screen with client connection.  We're now loosing the race due to these changes.  There is a patch for that out on xorg-devel.

With the RandR race addressed, I'm able to reliably hit the original signature on:

commit 8f1edf4bd3a1f050ce9eeb5eac45dd1a8f7a6d5e
Author: Keith Packard <keithp@keithp.com>
Date:   Thu May 19 13:59:54 2016 -0700

    dix: Use list for ready clients
    
    This converts the dispatch loop into using a list of ready clients
    instead of an array. This changes the WaitForSomething API so that it
    notifies DIX when a client becomes ready to read, instead of returning
    the set of ready clients.
    
    Signed-off-by: Keith Packard <keithp@keithp.com>
    Reviewed-by: Adam Jackson <ajax@redhat.com>

That looks to be the commit that introduced the regression.
Comment 8 Jeremy Huddleston Sequoia 2016-09-11 10:08:33 UTC
Patch proposal sent to xorg-devel
Comment 9 Jeremy Huddleston Sequoia 2016-09-16 05:38:52 UTC
Merged d81f9ce12aa4ac54b9c2b8c74c2f827c1f3e739a

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.