Created attachment 130063: system log from multiple reboots
I noticed my external display constantly turning on and off unless a DRI app is active (i.e. running DRI_PRIME=1 glxgears). It was suggested that I blacklist the `radeon` driver, since I am using `amdgpu`, and I did so. With the driver blacklisted, the system is unable to boot.
During the first attempts I captured these two screenshots:
https://imgur.com/AjG7IgB,xEi2L4B
https://imgur.com/xEi2L4B
My last attempt revealed a kernel null pointer dereference that was logged in journalctl (other errors and stack traces were not logged):
https://gist.github.com/mulander/6f4d8bfc0fe73af25ee2c95014754822
I'm attaching all journalctl entries since today; search for 'BUG' to find the boot with the null pointer dereference. That boot was started with radeon blacklisted. I tried several blacklisting methods, including modprobe.conf and regenerating the initramfs.
[mulander@napalm ~]$ uname -a
Linux napalm 4.9.11-1-ARCH #1 SMP PREEMPT Sun Feb 19 13:45:52 UTC 2017 x86_64 GNU/Linux
[mulander@napalm ~]$ lspci
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 0b)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller (rev 0b)
00:14.0 USB controller: Intel Corporation 8 Series USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series HECI #0 (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 3 (rev e4)
00:1c.3 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 4 (rev e4)
00:1c.4 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 5 (rev e4)
00:1d.0 USB controller: Intel Corporation 8 Series USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation 8 Series LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series SMBus Controller (rev 04)
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 10)
02:00.0 Network controller: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter (rev 01)
03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Sun LE [Radeon HD 8550M / R5 M230]
lsmod
Module                     Size  Used by
ctr                       16384  6
ccm                       20480  3
hid_generic               16384  0
usbhid                    49152  0
joydev                    20480  0
mousedev                  20480  0
amdgpu                  1499136  0
snd_hda_codec_hdmi        45056  1
amdkfd                   122880  1
amd_iommu_v2              20480  1 amdkfd
intel_rapl                20480  0
x86_pkg_temp_thermal      16384  0
intel_powerclamp          16384  0
coretemp                  16384  0
kvm                      524288  0
radeon                  1478656  4
irqbypass                 16384  1 kvm
intel_cstate              16384  0
ttm                       86016  2 amdgpu,radeon
intel_rapl_perf           16384  0
snd_soc_rt5640           110592  0
snd_soc_rl6231            16384  1 snd_soc_rt5640
ppdev                     20480  0
snd_soc_core             188416  1 snd_soc_rt5640
snd_hda_codec_conexant    24576  1
snd_hda_codec_generic     69632  1 snd_hda_codec_conexant
snd_hda_intel             32768  7
snd_hda_codec            106496  4 snd_hda_intel,snd_hda_codec_conexant,snd_hda_codec_hdmi,snd_hda_codec_generic
snd_compress              20480  1 snd_soc_core
evdev                     24576  15
snd_hda_core              65536  5 snd_hda_intel,snd_hda_codec_conexant,snd_hda_codec,snd_hda_codec_hdmi,snd_hda_codec_generic
snd_pcm_dmaengine         16384  1 snd_soc_core
snd_hwdep                 16384  1 snd_hda_codec
psmouse                  131072  0
ideapad_laptop            24576  0
pcspkr                    16384  0
input_leds                16384  0
sparse_keymap             16384  1 ideapad_laptop
arc4                      16384  2
mac_hid                   16384  0
r8169                     77824  0
ath9k                    131072  0
ath9k_common              32768  1 ath9k
ath9k_hw                 442368  2 ath9k,ath9k_common
uvcvideo                  86016  0
ath                       28672  3 ath9k_hw,ath9k,ath9k_common
videobuf2_vmalloc         16384  1 uvcvideo
videobuf2_memops          16384  1 videobuf2_vmalloc
ath3k                     20480  0
btusb                     40960  0
videobuf2_v4l2            20480  1 uvcvideo
mac80211                 688128  1 ath9k
videobuf2_core            36864  2 uvcvideo,videobuf2_v4l2
btrtl                     16384  1 btusb
btbcm                     16384  1 btusb
btintel                   16384  1 btusb
videodev                 151552  3 uvcvideo,videobuf2_core,videobuf2_v4l2
rtsx_usb_ms               20480  0
bluetooth                499712  6 btrtl,btintel,btbcm,ath3k,btusb
media                     32768  2 uvcvideo,videodev
memstick                  16384  1 rtsx_usb_ms
snd_pcm                   90112  9 snd_hda_intel,snd_hda_codec,snd_pcm_dmaengine,snd_hda_core,snd_soc_rt5640,snd_hda_codec_hdmi,snd_soc_core
cfg80211                 516096  4 mac80211,ath9k,ath,ath9k_common
mii                       16384  1 r8169
battery                   20480  0
rfkill                    20480  5 bluetooth,ideapad_laptop,cfg80211
wmi                       16384  1 ideapad_laptop
ac97_bus                  16384  1 snd_soc_core
i2c_hid                   20480  0
hid                      114688  3 i2c_hid,hid_generic,usbhid
fjes                      28672  0
elan_i2c                  32768  0
i915                    1204224  12
parport_pc                28672  0
parport                   40960  2 parport_pc,ppdev
video                     36864  2 i915,ideapad_laptop
mei_me                    36864  0
spi_pxa2xx_platform       24576  0
8250_dw                   16384  0
i2c_designware_platform   16384  0
drm_kms_helper           126976  3 amdgpu,radeon,i915
drm                      294912  13 amdgpu,radeon,i915,ttm,drm_kms_helper
snd_soc_sst_acpi          16384  0
intel_gtt                 20480  1 i915
snd_soc_sst_match         16384  1 snd_soc_sst_acpi
syscopyarea               16384  1 drm_kms_helper
sysfillrect               16384  1 drm_kms_helper
sysimgblt                 16384  1 drm_kms_helper
fb_sys_fops               16384  1 drm_kms_helper
i2c_algo_bit              16384  3 amdgpu,radeon,i915
snd_timer                 28672  1 snd_pcm
snd                       69632  22 snd_compress,snd_hda_intel,snd_hwdep,snd_hda_codec_conexant,snd_hda_codec,snd_timer,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_soc_core,snd_pcm
lpc_ich                   24576  0
mei                       86016  1 mei_me
shpchp                    32768  0
soc_button_array          16384  0
i2c_i801                  24576  0
i2c_designware_core       20480  1 i2c_designware_platform
i2c_smbus                 16384  1 i2c_i801
soundcore                 16384  1 snd
tpm_tis                   16384  0
tpm_tis_core              20480  1 tpm_tis
tpm                       36864  2 tpm_tis,tpm_tis_core
ac                        16384  0
button                    16384  1 i915
sch_fq_codel              20480  5
ip_tables                 28672  0
x_tables                  28672  1 ip_tables
ext4                     528384  3
crc16                     16384  2 bluetooth,ext4
jbd2                      90112  1 ext4
fscrypto                  24576  1 ext4
mbcache                   16384  4 ext4
algif_skcipher            20480  0
af_alg                    16384  1 algif_skcipher
dm_crypt                  28672  1
dm_mod                   106496  12 dm_crypt
sr_mod                    24576  0
sd_mod                    36864  3
cdrom                     53248  1 sr_mod
rtsx_usb_sdmmc            28672  0
rtsx_usb                  20480  2 rtsx_usb_sdmmc,rtsx_usb_ms
serio_raw                 16384  0
atkbd                     24576  0
libps2                    16384  2 atkbd,psmouse
crct10dif_pclmul          16384  0
crc32_pclmul              16384  0
crc32c_intel              24576  0
ghash_clmulni_intel       16384  0
ahci                      36864  2
libahci                   28672  1 ahci
aesni_intel              167936  9
xhci_pci                  16384  0
aes_x86_64                20480  1 aesni_intel
lrw                       16384  1 aesni_intel
xhci_hcd                 172032  1 xhci_pci
gf128mul                  16384  1 lrw
glue_helper               16384  1 aesni_intel
ablk_helper               16384  1 aesni_intel
cryptd                    20480  4 ablk_helper,ghash_clmulni_intel,aesni_intel
libata                   212992  2 ahci,libahci
ehci_pci                  16384  0
ehci_hcd                  73728  1 ehci_pci
usbcore                  208896  9 uvcvideo,usbhid,ehci_hcd,xhci_pci,rtsx_usb,ath3k,btusb,xhci_hcd,ehci_pci
scsi_mod                 159744  3 sd_mod,libata,sr_mod
usb_common                16384  1 usbcore
i8042                     28672  1 ideapad_laptop
serio                     20480  6 serio_raw,atkbd,psmouse,i8042
sdhci_acpi                16384  0
sdhci                     40960  1 sdhci_acpi
led_class                 16384  4 rtsx_usb_sdmmc,sdhci,input_leds,ath9k
mmc_core                 122880  3 rtsx_usb_sdmmc,sdhci,sdhci_acpi
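In the lsmod listing above, amdgpu, radeon and i915 are all loaded at the same time (amdgpu with a use count of 0, radeon with 4). As a rough sketch only, that check could be scripted as follows; `loaded_gpu_drivers` and the `GPU_DRIVERS` set are hypothetical names introduced here for illustration, and the sample text is a short excerpt of the listing.

```python
# Hypothetical helper for illustration; not part of any driver or distro tooling.
GPU_DRIVERS = {"amdgpu", "radeon", "i915"}


def loaded_gpu_drivers(lsmod_text):
    """Return {module: use_count} for known GPU drivers found in lsmod output."""
    found = {}
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Module Size Used by" header, and malformed rows.
        if len(fields) < 3 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        name, use_count = fields[0], int(fields[2])
        if name in GPU_DRIVERS:
            found[name] = use_count
    return found


# Excerpt of the lsmod output from this report.
SAMPLE = """\
Module                     Size  Used by
amdgpu                  1499136  0
radeon                  1478656  4
ttm                       86016  2 amdgpu,radeon
i915                    1204224  12
"""

drivers = loaded_gpu_drivers(SAMPLE)
print(drivers)
```

The use count alone does not prove which driver is bound to the discrete GPU; it is only a quick way to see that both AMD drivers are present.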
I just tried a newer kernel from Arch Linux [testing].
[mulander@napalm ~]$ uname -a
Linux napalm 4.10.1-1-ARCH #1 SMP PREEMPT Sun Feb 26 21:08:53 UTC 2017 x86_64 GNU/Linux
Same problem. Here is a photo I managed to grab while trying to boot it up: https://imgur.com/PCC42Bj
OK, so we narrowed the problem down to dpm:
(12:17:01 AM) mulander: yep, blacklisted radeon, amdgpu.dpm=0 and booted to X properly without a crash
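The `amdgpu.dpm=0` setting uses the kernel's `module.param=value` command-line syntax. A minimal sketch for confirming that such an option is present on a command line is shown below; `module_params` is a hypothetical helper, and the sample command line is invented for illustration (only `amdgpu.dpm=0` comes from this report).

```python
def module_params(cmdline):
    """Collect `module.param=value` options from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        module, dot, param = key.partition(".")
        # Keep only tokens of the form module.param=value.
        if sep and dot and module and param:
            params.setdefault(module, {})[param] = value
    return params


# Assumed example command line; only amdgpu.dpm=0 is taken from the report.
SAMPLE_CMDLINE = "root=/dev/mapper/root rw quiet amdgpu.dpm=0"

params = module_params(SAMPLE_CMDLINE)
dpm_disabled = params.get("amdgpu", {}).get("dpm") == "0"
print(params, dpm_disabled)
```

On a running system the string could be read from /proc/cmdline instead of the sample. Note that this simple split also picks up other dotted keys, so it is a convenience check rather than a full parser.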
(In reply to Adam Wolk from comment #0)
> I noticed my external display constantly turning on and off unless a DRI app
> is active (ie. running DRI_PRIME=1 glxgears).
Which GPU is the external display connected to? If you're not sure, attach the output of xrandr.
Here is the xrandr output.
[mulander@napalm ~]$ xrandr
Screen 0: minimum 8 x 8, current 3286 x 1080, maximum 32767 x 32767
eDP1 connected 1366x768+1920+0 (normal left inverted right x axis y axis) 340mm x 190mm
   1366x768      59.97*+
   1024x768      60.00
   1024x576      60.00
   960x540       60.00
   800x600       60.32    56.25
   864x486       60.00
   640x480       59.94
   720x405       60.00
   680x384       60.00
   640x360       60.00
DP1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 480mm x 270mm
   1920x1080     60.00*+
   1680x1050     59.95
   1280x1024     75.02    60.02
   1152x864      75.00
   1024x768      75.03    60.00
   800x600       75.00    60.32
   640x480       75.00    59.94
   720x400       70.08
HDMI1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
VIRTUAL1 disconnected (normal left inverted right x axis y axis)
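For reference, a small sketch of how the connected outputs could be pulled out of xrandr output like the above. `connected_outputs` is a hypothetical helper, and the sample is an abbreviated copy of the output in this comment. It only lists output names; it does not determine which GPU drives them.

```python
import re


def connected_outputs(xrandr_text):
    """Return the names of outputs that xrandr reports as connected."""
    # "HDMI1 disconnected" does not match: the pattern needs " connected"
    # directly after the output name.
    return re.findall(r"^(\S+) connected\b", xrandr_text, flags=re.MULTILINE)


# Abbreviated copy of the xrandr output from this report.
SAMPLE = """\
Screen 0: minimum 8 x 8, current 3286 x 1080, maximum 32767 x 32767
eDP1 connected 1366x768+1920+0 (normal left inverted right x axis y axis) 340mm x 190mm
   1366x768      59.97*+
DP1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 480mm x 270mm
   1920x1080     60.00*+
HDMI1 disconnected (normal left inverted right x axis y axis)
VIRTUAL1 disconnected (normal left inverted right x axis y axis)
"""

outputs = connected_outputs(SAMPLE)
print(outputs)
```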
Possibly a duplicate of bug 99387.
FWIW, since all display outputs are connected to the iGPU, it's unlikely that using amdgpu instead of radeon will have any effect on the external display turning on and off. You'd have to bring that up with the iGPU drivers.
Regarding the display flickering on and off (the effect feels like a resolution change, the way it goes out and comes back): this is completely mitigated by running DRI_PRIME=1 glxgears, which is why I thought it might be AMD driver related. Regardless, the main thing reported here is a null pointer dereference in the kernel and a system that cannot boot. I can live with the flicker; I just work around it by running DRI_PRIME=1 glxgears all day...
Do the patches in bug 99387 help?
(In reply to Adam Wolk from comment #7)
> Regarding the display flicking on/off (the effect feels like changing
> resolution - the way it goes out and back). This is completely mitigated by
> running DRI_PRIME=1 glxgears hence why I thought it might be AMD driver
> related.
It can't really be directly related, since the display isn't connected to the AMD GPU. I'd report it against the i915 kernel driver.
> Do the patches in bug 99387 help?
This is a machine I use for work; unfortunately, I can't experiment with it any further.
I reported the flickering issue as a separate bug: https://bugs.freedesktop.org/show_bug.cgi?id=100386
(In reply to Adam Wolk from comment #10)
> > Do the patches in bug 99387 help?
>
> This is a machine I use for work unfortunately I can't fiddle with it more.
>
> Regarding the flickering issue I reported it as a separate bug.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100386
[edward@skytop linux]$ git tag --contains c10c8f7 -l
v4.10.0
v4.11-rc1
v4.11-rc2
v4.11-rc3
v4.11-rc4
v4.11-rc5
v4.11-rc6
v4.11-rc7
@Adam, can you please try updating to at least kernel v4.10.0 and see if that fixes the issue for you?
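To check whether a running kernel meets the v4.10.0 minimum asked for here, the release string from `uname -r` has to be compared numerically, since a plain string comparison would rank 4.9 above 4.10. A minimal sketch, with `kernel_version` and `at_least` as hypothetical helper names:

```python
import re


def kernel_version(release):
    """Parse the leading numeric part of a `uname -r` string into a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A missing patch level (e.g. "4.10") is treated as 0.
    return int(major), int(minor), int(patch or 0)


def at_least(release, minimum=(4, 10, 0)):
    """True if the release is numerically >= the minimum version tuple."""
    return kernel_version(release) >= minimum


# The two kernels mentioned in this report.
old_ok = at_least("4.9.11-1-ARCH")
new_ok = at_least("4.10.1-1-ARCH")
print(old_ok, new_ok)
```

A version number is only a hint: distribution kernels may carry backports or omit commits, so `git tag --contains` on the actual source tree remains the more reliable check.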
@Alex Deucher I confirmed with Adam that, even with c10c8f7, he still has the null pointer issue.
Created attachment 130967: dpm patch
@Adam, please try applying the attached patch and let me know if it helps with your issue.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/147.