Bug 102442 - i915 segfault on archlinux / dell e6430 / HD4000 with xrandr --scale
Summary: i915 segfault on archlinux / dell e6430 / HD4000 with xrandr --scale
Status: RESOLVED MOVED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-08-27 22:27 UTC by fanf42
Modified: 2019-11-27 13:48 UTC

See Also:
i915 platform:
i915 features:


Attachments
X.org.0.log (xorg auto config for i915) (30.11 KB, text/plain)
2017-08-27 22:27 UTC, fanf42
X.org.0.log (DRI disabled) (28.15 KB, text/plain)
2017-08-27 22:29 UTC, fanf42
lspci-vv (12.86 KB, text/plain)
2017-08-27 22:32 UTC, fanf42
lspci-nn (2.36 KB, text/plain)
2017-08-27 22:33 UTC, fanf42
dmesg (101.52 KB, text/plain)
2017-08-27 22:36 UTC, fanf42
dmesg with arandr OK (107.50 KB, text/plain)
2017-08-27 22:59 UTC, fanf42
Xorg.0.log for xf86-video-intel-git 1:2.99.917+781+gc8990575-1 with --enable-debug=full (147.04 KB, application/x-xz)
2017-08-30 11:47 UTC, fanf42

Description fanf42 2017-08-27 22:27:17 UTC
Created attachment 133818 [details]
X.org.0.log (xorg auto config for i915)

While trying to understand why Firefox 55 displays a spinning wheel on tabs when I connect/disconnect external monitors ( https://bugzilla.mozilla.org/show_bug.cgi?id=1391216 ), I tried at some point a simple:

xrandr --output LVDS1 --scale "1368x768"

(where "1368x768" is a mode listed by xrandr), and it leads to an Xorg/i915 segfault.
I tried various combinations of DRI mode / AccelMethod:

- DRI=2, DRI=3 or DRI=False does not seem to affect whether the bug occurs,
- AccelMethod=sna leads to the segfault, whereas uxa leads to a blank screen (requiring a switch to the console and a restart of lightdm) but no error is logged in /var/log/Xorg.0.log*
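
For reference, combinations like these are usually set in a small Xorg configuration snippet. Below is a minimal sketch of a helper that writes one. The function name write_intel_conf, the identifier "Intel Graphics" and the default path /etc/X11/xorg.conf.d/20-intel.conf are assumptions for illustration and are not taken from this report; only the option names "AccelMethod" and "DRI" come from the text above.

```shell
# Write an Xorg snippet selecting the intel driver's acceleration method
# and DRI level.
# Assumption: the helper name and the default output path are hypothetical.
write_intel_conf() {
    accel="$1"   # e.g. sna or uxa
    dri="$2"     # e.g. 2, 3 or False
    out="${3:-/etc/X11/xorg.conf.d/20-intel.conf}"
    cat > "$out" <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "$accel"
    Option     "DRI"         "$dri"
EndSection
EOF
}
```

Calling `write_intel_conf uxa False /tmp/20-intel.conf` would produce a file to inspect before copying it into place as root; the display manager has to be restarted for a change to take effect.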

The segfault is: 

8<---------------------------------
[    18.992] (EE)
[    18.992] (EE) Backtrace:
[    18.993] (EE) 0: /usr/lib/xorg-server/Xorg (OsLookupColor+0x139) [0x564b734b0f39]
[    18.993] (EE) 1: /usr/lib/libpthread.so.0 (funlockfile+0x50) [0x7f02adc9282f]
[    18.993] (EE) 2: /usr/lib/xorg/modules/drivers/intel_drv.so (_init+0x142c4) [0x7f02aa45a0a4]
[    18.993] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so (_init+0x5d265) [0x7f02aa4ebd35]
[    18.994] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (_init+0x5d97c) [0x7f02aa4ed0fc]
[    18.994] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (_init+0x689a8) [0x7f02aa503178]
[    18.994] (EE) 6: /usr/lib/xorg-server/Xorg (xf86DisableUnusedFunctions+0xf4) [0x564b733c3c04]
[    18.994] (EE) 7: /usr/lib/xorg-server/Xorg (xf86PruneDuplicateModes+0x2c1d) [0x564b733cde3d]
[    18.995] (EE) 8: /usr/lib/xorg-server/Xorg (RRCrtcSet+0x122) [0x564b7340ac42]
[    18.995] (EE) 9: /usr/lib/xorg-server/Xorg (ProcRRSetCrtcConfig+0x253) [0x564b7340c4c3]
[    18.995] (EE) 10: /usr/lib/xorg-server/Xorg (SendErrorToClient+0x368) [0x564b7334b1e8]
[    18.995] (EE) 11: /usr/lib/xorg-server/Xorg (InitFonts+0x420) [0x564b7334f1f0]
[    18.996] (EE) 12: /usr/lib/libc.so.6 (__libc_start_main+0xea) [0x7f02ad8fb4ca]
[    18.996] (EE) 13: /usr/lib/xorg-server/Xorg (_start+0x2a) [0x564b73338e9a]
[    18.996] (EE)
[    18.996] (EE) Segmentation fault at address 0x7f02b010c000
[    18.996] (EE)
Fatal server error:
[    18.996] (EE) Caught signal 11 (Segmentation fault). Server aborting

8<---------------------------------


System information:

- Archlinux
% uname -a
Linux luhman16 4.12.8-2-ARCH #1 SMP PREEMPT Fri Aug 18 14:08:02 UTC 2017 x86_64 GNU/Linux

% pacman -Q | grep xorg | sort
xorg-bdftopcf 1.0.5-1
xorg-fonts-alias 1.0.3-1
xorg-fonts-encodings 1.0.4-4
xorg-fonts-misc 1.0.3-5
xorg-fonts-type1 7.7-2
xorg-font-util 1.3.1-1
xorg-font-utils 7.6-4
xorg-iceauth 1.0.7-1
xorg-luit 1.1.1-2
xorg-mkfontdir 1.0.7-8
xorg-mkfontscale 1.1.2-1
xorg-server 1.19.3-3
xorg-server-common 1.19.3-3
xorg-server-utils 7.6-4
xorg-server-xwayland 1.19.3-3
xorg-sessreg 1.1.1-1
xorg-setxkbmap 1.3.1-1
xorg-twm 1.0.9-1
xorg-utils 7.6-9
xorg-xauth 1.0.10-1
xorg-xbacklight 1.2.1-1
xorg-xclock 1.0.7-1
xorg-xcmsdb 1.0.5-1
xorg-xdpyinfo 1.3.2-1
xorg-xdriinfo 1.0.5-2
xorg-xev 1.2.2-1
xorg-xgamma 1.0.6-1
xorg-xhost 1.0.7-1
xorg-xinit 1.3.4-4
xorg-xinput 1.6.2-1
xorg-xkbcomp 1.4.0-1
xorg-xlsatoms 1.1.2-1
xorg-xlsclients 1.1.3-1
xorg-xmodmap 1.0.9-1
xorg-xprop 1.2.2-1
xorg-xrandr 1.5.0-1
xorg-xrdb 1.1.0-2
xorg-xrefresh 1.0.5-1
xorg-xset 1.2.3-1
xorg-xsetroot 1.1.1-2
xorg-xvinfo 1.1.3-1
xorg-xwininfo 1.1.3-1

% pacman -Q | grep xf86 | sort
libxxf86dga 1.1.4-1
libxxf86vm 1.1.4-1
xf86dgaproto 2.1-3
xf86-input-evdev 2.10.5-1
xf86-input-libinput 0.25.1-1
xf86-input-synaptics 1.9.0-1
xf86-video-intel 1:2.99.917+779+g2100efa1-2
xf86-video-nouveau 1.0.15-2
xf86vidmodeproto 2.3.1-3

Attached are relevant log files (dmesg, xorg log and lspci output)
Comment 1 fanf42 2017-08-27 22:29:51 UTC
Created attachment 133819 [details]
X.org.0.log (DRI disabled)

Another Xorg log, with DRI disabled
Comment 2 fanf42 2017-08-27 22:32:24 UTC
Created attachment 133820 [details]
lspci-vv
Comment 3 fanf42 2017-08-27 22:33:22 UTC
Created attachment 133821 [details]
lspci-nn
Comment 4 fanf42 2017-08-27 22:36:34 UTC
Created attachment 133822 [details]
dmesg


I triggered the segfault between these two log lines:

[  914.124797] [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:54:VGA-1] disconnected
[ 1978.446104] [drm:drm_mode_addfb2 [drm]] [FB:65]
Comment 5 fanf42 2017-08-27 22:59:51 UTC
Created attachment 133823 [details]
dmesg with arandr OK

And if I use arandr (right click -> resolution -> 1368x768), the scaling is done without a segfault.

This also explains why I didn't notice the segfault before (I tend to use xrandr when debugging display problems, which is quite rare, and arandr or the desktop's resolution tool for day-to-day use).
Comment 6 Chris Wilson 2017-08-29 19:08:24 UTC
The bt is not that useful without the symbols, and that --scale explodes is quite, quite surprising. If you can get a clean bt, that will be a big help: try recompiling with --enable-debug and make sure the symbols aren't stripped on install.
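
As an illustration of why the backtrace is hard to read: every driver frame in it is reported relative to _init, which suggests the driver's own function names were not available. A minimal sketch for listing just those frames from a log follows. The function name driver_frames is hypothetical, and the pattern assumes the "(EE) N: path (symbol+offset) [address]" layout seen in the log excerpt in this report.

```shell
# Print "frame symbol+offset address" for every intel_drv.so frame
# found in an Xorg log backtrace.
# Assumption: driver_frames is a hypothetical helper name.
driver_frames() {
    sed -n 's/.*(EE) \([0-9][0-9]*\): .*intel_drv\.so (\([^)]*\)) \[\(0x[0-9a-f]*\)].*/\1 \2 \3/p' "$1"
}
```

Run as `driver_frames /var/log/Xorg.0.log.old`; a frame that still shows up as `_init+0x...` after a rebuild would indicate that the symbols were stripped again or that the old driver is still being loaded.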
Comment 7 fanf42 2017-08-29 23:42:24 UTC
I'm a little lost: I haven't succeeded in getting the trace with debug symbols.

I compiled xf86-video-intel with "-O -g -ffast-math -march=native -ggdb3" and added "--enable-debug" to configure.
I took care (OK, it took three attempts before I succeeded) not to let Arch's makepkg strip the debug symbols. The resulting /usr/lib/xorg/modules/drivers/intel_drv.so is 15M, so it seems to have what is needed, but I still get the uninformative stack.

Is there something else that I have to do to get the nice bt?

(I know I can do it: I also recompiled xorg-server and mesa for that other segfault: https://phab.enlightenment.org/T5957. It was a fabulous evening. Now I know that mesa is a very big package with debug symbols; I believe that can be counted as a win?)

It's the first time I'm trying to do these things, perhaps I'm missing something obvious?
Comment 8 Chris Wilson 2017-08-30 10:43:24 UTC
Hmm, that sounds like it should have been able to pick up the symbols. The last resort is to use "sudo gdb --pid $(pidof Xorg)" from a remote login.

To check that it picked up the recompiled intel_drv.so, after --enable-debug you should get "SNA compiled with assertions enabled" in the Xorg.log. If that is in order, and we still don't have a good stacktrace, use --enable-debug=full and attach the compressed Xorg.log and I'll figure out where it dies based on the last debug message.
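
The log check described above can be sketched as a small shell helper. The function name has_sna_assertions and the default log path /var/log/Xorg.0.log are assumptions; the message text is the one quoted in this comment.

```shell
# Succeed if the given Xorg log shows that the debug build of the
# intel driver was loaded.
# Assumption: has_sna_assertions is a hypothetical helper name and the
# default log path may differ on other systems.
has_sna_assertions() {
    log="${1:-/var/log/Xorg.0.log}"
    grep -qF "SNA compiled with assertions enabled" "$log"
}
```

For example, `has_sna_assertions && echo "debug driver loaded"` checks the current log, and passing /var/log/Xorg.0.log.old checks the log of the previous (crashed) server.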
Comment 9 fanf42 2017-08-30 10:51:11 UTC
Just an idea - where should I use / set "--enable-debug" ? Because perhaps I'm just not doing it right. Is it a config option of "./configure", or a parameter of xorg starting command, or something else?

(sorry if it sounds very dumb)
Comment 10 Chris Wilson 2017-08-30 10:56:28 UTC
It's an option to ./configure (or ./autogen.sh) of xf86-video-intel.
Comment 11 fanf42 2017-08-30 11:45:47 UTC
So, I did the correct thing and I had the "SNA compiled with assertions enabled". 

So here goes the full debug log. 
What I did (if it helps understanding):

- in console, restart lightdm; 
- switch to console 7, log-in in lightdm
- enlightenment starts
- open a terminal, enter: "xrandr -s 800x600", enter 

=> segfault, lightdm restarts. 

Switch back to console 1, copy/compress X.org.log.old. 

Hope it helps. 

Don't hesitate to ask if you need some other piece of xorg recompiled with debug symbols, I'm starting to be good at that :) (OK, modulo the critical fails here)
Comment 12 fanf42 2017-08-30 11:47:36 UTC
Created attachment 133876 [details]
Xorg.0.log for xf86-video-intel-git 1:2.99.917+781+gc8990575-1 with --enable-debug=full
Comment 13 fanf42 2017-08-30 11:49:39 UTC
Comment on attachment 133876 [details]
Xorg.0.log for xf86-video-intel-git 1:2.99.917+781+gc8990575-1 with --enable-debug=full

xf86-video-intel-git 1:2.99.917+781+gc8990575-1 with --enable-debug=full
Comment 14 Chris Wilson 2017-08-30 12:07:58 UTC
Ah, can you tweak the assert

diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index d1f01218..3f70d536 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -565,7 +565,7 @@ static void assert_scanout(struct kgem *kgem, struct kgem_bo *bo,
        assert(drmIoctl(kgem->fd, DRM_IOCTL_MODE_GETFB, &info) == 0);
        gem_close(kgem->fd, info.handle);
 
-       assert(width == info.width && height == info.height);
+       assert(width <= info.width && height <= info.height);
 }
 #else
 #define assert_scanout(k, b, w, h)

and try again?
Comment 15 fanf42 2017-08-30 16:39:10 UTC
Yep, that worked :) 

Congrats !
Comment 16 Chris Wilson 2017-08-30 16:47:05 UTC
Uhoh, that will have only fixed up an assert that you could not have hit before we started testing...
Comment 17 fanf42 2017-08-30 17:13:55 UTC
That's really funny :) OK, I will try to reproduce it, but I can't with the previous method, so I need to understand what changed. Perhaps I was just using a debug package all along, which can't be totally excluded since 1/ as you said, a bug in --scale is highly unlikely and 2/ I was trying to debug other things (https://phab.enlightenment.org/T5941 https://phab.enlightenment.org/T5957).

Thanks for the help in all cases !
Comment 18 Martin Peres 2019-11-27 13:48:55 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-intel/issues/148.

