Bug 81678

Summary: X crashes on start (integrated 7640G + discrete 7500M/7600M)
Product: xorg
Reporter: Igor Gnatenko <i.gnatenko.brain>
Component: Lib/pciaccess
Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED MOVED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: adeptsmail, jezekulfur, rkudyba
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Xorg.9.log (flags: none)
lspci-knn (flags: none)
dmesg (flags: none)
updated Xorg log after adding Option "NoTrapSignals" in xorg.conf (flags: none)
gdb backtrace after enabling Option "NoTrapSignals" (flags: none)

Description Igor Gnatenko 2014-07-23 15:41:32 UTC
Created attachment 103343 [details]
Xorg.9.log

Version of component:
xorg-x11-drv-ati-7.2.0-3.20131101git3b38701.fc20.x86_64
xorg-x11-server-1.14.4-11.fc20.x86_64
kernel-3.15.4-200.fc20.x86_64 or kernel-3.15.5-200.fc20.x86_64

Steps to reproduce:
1. Start Xorg

Actual results:
[    40.725] (EE) Backtrace:
[    40.726] (EE) 0: /bin/Xorg (OsLookupColor+0x129) [0x4736c9]
[    40.726] (EE) 1: /lib64/libpthread.so.0 (__restore_rt+0x0) [0x3bdd60f74f]
[    40.726] (EE) 2: /lib64/libpciaccess.so.0 (pci_device_next+0x100) [0x3970602cb0]
[    40.727] (EE) 3: /lib64/libpciaccess.so.0 (pci_device_find_by_slot+0x3b) [0x3970602d4b]
[    40.727] (EE) 4: /lib64/libpciaccess.so.0 (pci_device_vgaarb_init+0xaf) [0x3970604a1f]
[    40.727] (EE) 5: /bin/Xorg (xf86UnmapLegacyIO+0x3319) [0x4b0359]
[    40.728] (EE) 6: /bin/Xorg (xf86BusConfig+0x62) [0x481752]
[    40.728] (EE) 7: /bin/Xorg (InitOutput+0x9cf) [0x48f90f]
[    40.728] (EE) 8: /bin/Xorg (_init+0x390b) [0x42bdbb]
[    40.846] (EE) 9: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x3bdce21d65]
[    40.847] (EE) 10: /bin/Xorg (_start+0x29) [0x428be5]
[    40.847] (EE) 11: ? (?+0x29) [0x29]
[    40.847] (EE) 
[    40.847] (EE) Segmentation fault at address 0x0
[    40.848] (EE) 
Fatal server error:
[    40.848] (EE) Caught signal 11 (Segmentation fault). Server aborting
[    40.848] (EE) 
[    40.848] (EE) 
Please consult the Fedora Project support 
     at http://wiki.x.org
 for help. 
[    40.848] (EE) Please also check the log file at "/var/log/Xorg.9.log" for additional information.
[    40.848] (EE) 
[    41.096] (EE) Server terminated with error (1). Closing log file.

Expected results:
X starts ok
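
The backtrace under "Actual results" can be split into frames mechanically, which makes it easier to see which frames belong to libpciaccess. The following is a minimal sketch, assuming the Xorg log line layout shown above; the names `FRAME_RE`, `parse_frames` and `frames_in_module` are hypothetical helpers, not part of Xorg or libpciaccess.

```python
import re

# Matches Xorg log frames such as:
# [    40.726] (EE) 2: /lib64/libpciaccess.so.0 (pci_device_next+0x100) [0x3970602cb0]
FRAME_RE = re.compile(
    r"\(EE\)\s+(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<symbol>[^+)]+)\+0x(?P<offset>[0-9a-fA-F]+)\)\s+"
    r"\[0x(?P<address>[0-9a-fA-F]+)\]"
)


def parse_frames(log_text):
    """Return one dict per backtrace frame found in the log text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "index": int(match.group("index")),
                "module": match.group("module"),
                "symbol": match.group("symbol"),
                "offset": int(match.group("offset"), 16),
                "address": int(match.group("address"), 16),
            })
    return frames


def frames_in_module(frames, needle):
    """Keep only the frames whose module path contains `needle`."""
    return [frame for frame in frames if needle in frame["module"]]


# Sample lines copied from the backtrace in this report.
SAMPLE = """\
[    40.726] (EE) 0: /bin/Xorg (OsLookupColor+0x129) [0x4736c9]
[    40.726] (EE) 1: /lib64/libpthread.so.0 (__restore_rt+0x0) [0x3bdd60f74f]
[    40.726] (EE) 2: /lib64/libpciaccess.so.0 (pci_device_next+0x100) [0x3970602cb0]
[    40.727] (EE) 3: /lib64/libpciaccess.so.0 (pci_device_find_by_slot+0x3b) [0x3970602d4b]
[    40.727] (EE) 4: /lib64/libpciaccess.so.0 (pci_device_vgaarb_init+0xaf) [0x3970604a1f]
"""

if __name__ == "__main__":
    for frame in frames_in_module(parse_frames(SAMPLE), "libpciaccess"):
        print(frame["index"], frame["symbol"], hex(frame["offset"]))
```

On this sample, the frames immediately below the signal handler are all in libpciaccess (pci_device_next, called from pci_device_find_by_slot, called from pci_device_vgaarb_init).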
Comment 1 Igor Gnatenko 2014-07-23 15:42:07 UTC
Created attachment 103344 [details]
lspci-knn
Comment 2 Igor Gnatenko 2014-07-23 15:42:38 UTC
Created attachment 103345 [details]
dmesg
Comment 3 Igor Gnatenko 2014-07-23 15:43:22 UTC
mesa 10.1.5
Comment 5 Michel Dänzer 2014-07-24 02:44:59 UTC
The log file you attached doesn't show the problem. What's the difference between the success and failure cases?

Anyway, looks like a libpciaccess bug at first glance. What version of that are you using?
Comment 6 Andrey Volkov 2014-07-25 07:15:22 UTC
(In reply to comment #5)
This bug occurs on my PC. So, I can answer your questions.

> What's the difference between the success and failure cases?

When the X server crashes, the PC either freezes or prints repeated error messages (see these photos):

http://4.firepic.org/4/images/2014-07/25/w2rz56ghj3mi.jpg
http://4.firepic.org/4/images/2014-07/25/8q1p2iqtwgef.jpg
http://4.firepic.org/4/images/2014-07/25/16gbn0d01uyu.jpg
http://4.firepic.org/4/images/2014-07/25/p63zyf8vxbm7.jpg
http://4.firepic.org/4/images/2014-07/25/s7tzbyq76rj5.jpg
http://4.firepic.org/4/images/2014-07/25/uivf8gv39kre.jpg
http://4.firepic.org/4/images/2014-07/25/idz6ec3x9lm0.jpg

Also, X server failure occurs simultaneously with or after kerneloops, so I want to add this messages from ABRT:

WARNING: CPU: 1 PID: 202 at drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xca/0xe0 [radeon]()

BUG: unable to handle kernel paging request at 0000000000002180

BUG: soft lockup - CPU#2 stuck for 22s! [Xorg:724]

WARNING: CPU: 1 PID: 39 at drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xca/0xe0 [radeon]()

> Anyway, looks like a libpciaccess bug at first glance. What version of that
> are you using?

libpciaccess-0.13.3-0.1.fc20.x86_64

Also, note that my PC has hybrid muxless graphics (integrated Radeon HD 7640G and discrete Radeon HD 7670M):

[ulfur@np355v5x-s04ru ~]$ lspci | grep VGA
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Trinity [Radeon HD 7640G]
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Thames [Radeon HD 7500M/7600M Series] (rev ff)
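
To check other machines for the same two-GPU layout, the `lspci` output above can be filtered programmatically. This is a rough sketch, assuming the default one-line-per-device `lspci` format shown in this comment; `vga_controllers` and `is_hybrid` are hypothetical names. The `rev_ff` flag only records that the text "(rev ff)" is present. Treating it as a sign that the device was not answering config-space reads (for example because it was powered down) is an assumption, not something this report confirms.

```python
import re

# Matches default lspci lines such as:
# 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Trinity [Radeon HD 7640G]
LSPCI_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)


def vga_controllers(lspci_output):
    """Return a list of dicts, one per 'VGA compatible controller' line."""
    controllers = []
    for line in lspci_output.splitlines():
        match = LSPCI_RE.match(line.strip())
        if match and match.group("cls") == "VGA compatible controller":
            desc = match.group("desc")
            controllers.append({
                "slot": match.group("slot"),
                "description": desc,
                # Assumption: "(rev ff)" may hint at a device that is not
                # responding; this report does not confirm that reading.
                "rev_ff": desc.endswith("(rev ff)"),
            })
    return controllers


def is_hybrid(controllers):
    """Heuristic: more than one VGA controller means a multi-GPU system."""
    return len(controllers) > 1


# Sample lines copied from the lspci output in this comment.
SAMPLE = """\
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Trinity [Radeon HD 7640G]
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Thames [Radeon HD 7500M/7600M Series] (rev ff)
"""

if __name__ == "__main__":
    found = vga_controllers(SAMPLE)
    print(is_hybrid(found))
    for gpu in found:
        print(gpu["slot"], gpu["rev_ff"])
```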
Comment 7 Andrey Volkov 2014-07-25 07:57:43 UTC
(In reply to comment #6)
> (In reply to comment #5)
> this messages
*these messages
Comment 8 Michel Dänzer 2014-07-28 09:30:41 UTC
(In reply to comment #6)
> This bug occurs on my PC. So, I can answer your questions.

So Igor reported the bug on your behalf?


> http://4.firepic.org/4/images/2014-07/25/16gbn0d01uyu.jpg

[...]

> BUG: unable to handle kernel paging request at 0000000000002180

I believe this oops might be the central issue, or at least one that needs to be resolved.
Comment 9 Andrey Volkov 2014-07-28 13:28:51 UTC
(In reply to comment #8)
> (In reply to comment #6)
> > This bug occurs on my PC. So, I can answer your questions.
> 
> So Igor reported the bug on your behalf?
> 
Yes. I reported the bug on the Russian Fedora project bugtracker (http://redmine.russianfedora.pro/issues/1356), and, after that, he reported the bug on this bugtracker.
Comment 10 Andrey Volkov 2014-10-10 13:45:59 UTC
This bug is also present on 3.16.x kernels.
Comment 11 Michel Dänzer 2018-06-14 07:29:32 UTC
*** Bug 102485 has been marked as a duplicate of this bug. ***
Comment 12 Michel Dänzer 2018-06-14 07:36:09 UTC
*** Bug 102409 has been marked as a duplicate of this bug. ***
Comment 13 Michel Dänzer 2018-06-14 07:36:46 UTC
*** Bug 100520 has been marked as a duplicate of this bug. ***
Comment 14 Michel Dänzer 2018-06-14 07:37:11 UTC
*** Bug 101936 has been marked as a duplicate of this bug. ***
Comment 15 Michel Dänzer 2018-06-14 07:43:06 UTC
Can any of you affected by this get a gdb backtrace of the crash, with debugging symbols available for libpciaccess.so.0?

https://www.x.org/wiki/Development/Documentation/ServerDebugging/
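
One way to collect such a backtrace from a core file is to run gdb non-interactively. The sketch below only builds the command line; `gdb_backtrace_command` is a hypothetical helper, and the binary and core paths in the usage example are placeholders that will differ per system. It uses gdb's `-batch` and `-ex` options, and it assumes debugging symbols for Xorg and libpciaccess are already installed.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a full backtrace of every thread."""
    return [
        "gdb",
        "-batch",                          # exit once the commands have run
        "-ex", "thread apply all bt full",  # full backtrace with local variables
        binary,
        core,
    ]


def collect_backtrace(binary, core):
    """Run gdb and return its output as text (requires gdb to be installed)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder paths: adjust to the actual Xorg binary and core file.
    print(" ".join(gdb_backtrace_command("/usr/libexec/Xorg", "/tmp/core.Xorg")))
```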
Comment 16 RobbieTheK 2018-06-14 14:30:09 UTC
Created attachment 140153 [details]
updated Xorg log after adding Option "NoTrapSignals" in xorg.conf

There is a backtrace:
[876663.337] (EE) Backtrace:
[876663.337] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x136) [0x81ea456]
[876663.338] (EE) 1: ? (?+0x136) [0xb7f1ae6d]
[876663.339] (EE) 2: /lib/libpciaccess.so.0 (pci_device_next+0xf0) [0xb7ba3340]
[876663.340] (EE) 3: /lib/libpciaccess.so.0 (pci_device_find_by_slot+0x59) [0xb7ba3419]
[876663.340] (EE) 4: /lib/libpciaccess.so.0 (pci_device_vgaarb_init+0xc4) [0xb7ba5494]
[876663.341] (EE) 5: /usr/libexec/Xorg (xf86ConfigPciEntity+0x40b8) [0x80dc408]
[876663.341] (EE) 6: /usr/libexec/Xorg (xf86BusConfig+0xeb) [0x80b01eb]
[876663.342] (EE) 7: /usr/libexec/Xorg (InitOutput+0x9d7) [0x80bf457]
[876663.342] (EE) 8: /usr/libexec/Xorg (InitFonts+0x2a4) [0x807b3b4]
[876663.342] (EE) 9: /usr/libexec/Xorg (miPolyFillRect+0x164) [0x80644d4]
[876663.343] (EE) 10: /lib/libc.so.6 (__libc_start_main+0xf1) [0xb7714191]
[876663.343] (EE) 11: /usr/libexec/Xorg (_start+0x32) [0x8064414]
[876663.343] (EE) 
[876663.344] (EE) Segmentation fault at address 0x0
[876663.344] (EE) 
Fatal server error:
[876663.344] (EE) Caught signal 11 (Segmentation fault). Server aborting
Comment 17 RobbieTheK 2018-06-14 14:31:58 UTC
Created attachment 140155 [details]
gdb backtrace after enabling Option "NoTrapSignals"

full BT, I hope it's helpful!
Comment 18 RobbieTheK 2018-06-14 14:41:46 UTC
(In reply to Michel Dänzer from comment #15)
> Can any of you affected by this get a gdb backtrace of the crash, with
> debugging symbols available for libpciaccess.so.0?
> 
> https://www.x.org/wiki/Development/Documentation/ServerDebugging/

I tried to use the Version 1 script, but the documentation is a bit dated, as there's no X11R6/ directory. There is also a typo: "comment sign # form the line starting with #GDB and" should be "from the line".

As for "debugging symbols available for libpciaccess.so.0" I see I must have done this in the past as there is a directory:
/usr/src/debug/libpciaccess-0.13.4-8.fc28.i386

Here are some logs in /var/log/messages when the coredump happens:
Jun 14 10:11:29 curie systemd-coredump[13932]: Process 13924 (Xorg) of user 42 dumped core.
#012#012Stack trace of thread 13924:
#012#0  0x00000000b7ee1d21 __kernel_vsyscall (linux-gate.so.1)
#012#1  0x00000000b76eee52 __libc_signal_restore_set (libc.so.6)
#012#2  0x00000000b76d882f __GI_abort (libc.so.6)
#012#3  0x00000000081ed409 OsAbort (Xorg)
#012#4  0x00000000081f32a6 AbortServer (Xorg)
#012#5  0x00000000081f3c84 FatalError (Xorg)
#012#6  0x00000000080bfcd0 InitOutput (Xorg)
#012#7  0x000000000807b364 dix_main (Xorg)
#012#8  0x000000000806439f main (Xorg)
#012#9  0x00000000b76da191 __libc_start_main (libc.so.6)
#012#10 0x00000000080643e2 _start (Xorg)

I can provide more, as there is a plethora of "kernel: RPC" logs.
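
In syslog output like the above, `#012` is the octal escape that rsyslog writes for an embedded newline, so the stack trace is easier to read once those escapes are decoded. A minimal sketch, assuming the default three-digit `#ooo` escape form; `decode_syslog_escapes` is a hypothetical helper. Note that a gdb-style frame number with three octal digits (for example `#100`) would be decoded by mistake, so this is only suitable for short traces like the one here.

```python
import re

# rsyslog writes control characters as '#' followed by three octal digits.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def decode_syslog_escapes(message):
    """Replace #ooo octal escapes (e.g. #012) with the characters they encode."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), message)


# Sample built from the first lines of the log quoted in this comment,
# joined back into the single physical line that syslog writes.
SAMPLE = (
    "Process 13924 (Xorg) of user 42 dumped core."
    "#012#012Stack trace of thread 13924:"
    "#012#0  0x00000000b7ee1d21 __kernel_vsyscall (linux-gate.so.1)"
    "#012#1  0x00000000b76eee52 __libc_signal_restore_set (libc.so.6)"
)

if __name__ == "__main__":
    print(decode_syslog_escapes(SAMPLE))
```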
Comment 19 Michel Dänzer 2018-06-14 14:51:16 UTC
(In reply to RobbieTheK from comment #17)
> full BT, I hope it's helpful!

This shows the issue you reported in bug 104917, not the libpciaccess crash this report is about. Please don't mix up different issues.


Anyone affected by the libpciaccess crash, can you get the pciaccess 0.14 release, rebuild Xorg against that and see if it helps?
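
Before rebuilding, it may help to confirm which libpciaccess version is actually installed. The sketch below extracts the version from an RPM-style package name, as quoted elsewhere in this report, and compares it against 0.14. The helper names are hypothetical, and 0.14 is simply the release requested for testing here; nothing in this report confirms that it contains a fix.

```python
import re

# Matches RPM-style names such as libpciaccess-0.13.3-0.1.fc20.x86_64
_PKG_RE = re.compile(r"^libpciaccess-(?P<version>\d+(?:\.\d+)*)-")


def parse_version(text):
    """Turn '0.13.4' into (0, 13, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def package_version(package_name):
    """Extract the upstream version from a libpciaccess package name."""
    match = _PKG_RE.match(package_name)
    if match is None:
        raise ValueError(f"not a libpciaccess package name: {package_name!r}")
    return match.group("version")


def older_than(installed, target="0.14"):
    """Return True if `installed` is older than `target`."""
    return parse_version(installed) < parse_version(target)


if __name__ == "__main__":
    # Package names as they appear in this report.
    for name in ("libpciaccess-0.13.3-0.1.fc20.x86_64",
                 "libpciaccess-0.13.4-8.fc28.i386"):
        version = package_version(name)
        print(name, version, older_than(version))
```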
Comment 20 RobbieTheK 2018-06-14 16:00:42 UTC
(In reply to Michel Dänzer from comment #19)
> (In reply to RobbieTheK from comment #17)
> > full BT, I hope it's helpful!
> 
> This shows the issue you reported in bug 104917, not the libpciaccess crash
> this report is about. Please don't mix up different issues.

Sorry about that. I thought that was one that you had marked as a duplicate, but now I see it's not closed. Do you need me to upload these logs to that bug report?
Comment 21 GitLab Migration User 2018-08-10 20:18:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/lib/libpciaccess/issues/5.
