Bug 95328

Summary: System hangs for ~60 seconds on switching monitor modes with amdgpu since Linux 4.5
Product: DRI
Reporter: Guido Winkelmann <guido-xorgbugs>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal    
Priority: medium    
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments:
Description                      Flags
Full dmesg output                none
Xorg log file                    none
Uncompressed full dmesg output   none
Uncompressed X log file          none

Description Guido Winkelmann 2016-05-09 18:28:12 UTC
Starting with Linux 4.5, the whole system hangs for about 60 seconds every time the monitor configuration is changed. The system is completely unresponsive during this time and does not even react to the NumLock toggle on the keyboard. It stays responsive to pings from another machine, though, and a music player app in the background keeps up its audio output.

This happens once during kernel bootup, once during the startup of the display manager, once during startup of my desktop environment (which sets up a multi-monitor configuration), every time my desktop environment enables or disables screen blanking for inactivity, and every time I change the monitor configuration, whether using xrandr or by connecting, disconnecting, or switching on or off any monitor on DisplayPort.

In dmesg, I get this output afterwards:

[    1.123899] Non-volatile memory driver v1.3
[    1.124084] Linux agpgart interface v0.103
[    1.124252] [drm] Initialized drm 1.1.0 20060810
[    1.124365] [drm] radeon kernel modesetting enabled.
[    1.124541] [drm] amdgpu kernel modesetting enabled.
[    1.124865] [drm] initializing kernel modesetting (TONGA 0x1002:0x6938 0x174B:0xE308 0xF1).
[    1.125044] [drm] register mmio base: 0xF8600000
[    1.125144] [drm] register mmio size: 262144
[    1.125246] [drm] doorbell mmio base: 0xD0000000
[    1.125346] [drm] doorbell mmio size: 2097152
[    1.125450] [drm] probing gen 2 caps for device 8086:3c04 = 7a7103/e
[    1.125552] [drm] probing mlw for device 8086:3c04 = 7a7103
[    1.125673] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[    1.125863] ATOM BIOS: E308
[    1.126112] amdgpu 0000:01:00.0: VRAM: 4096M 0x0000000000000000 - 0x00000000FFFFFFFF (4096M used)
[    1.126287] amdgpu 0000:01:00.0: GTT: 4096M 0x0000000100000000 - 0x00000001FFFFFFFF
[    1.126461] [drm] Detected VRAM RAM=4096M, BAR=256M
[    1.126561] [drm] RAM width 256bits DDR
[    1.126737] [TTM] Zone  kernel: Available graphics memory: 16457442 kiB
[    1.126841] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[    1.126943] [TTM] Initializing pool allocator
[    1.127047] [TTM] Initializing DMA pool allocator
[    1.127164] [drm] amdgpu: 4096M of VRAM memory ready
[    1.127264] [drm] amdgpu: 4096M of GTT memory ready.
[    1.127375] [drm] GART: num cpu pages 1048576, num gpu pages 1048576
[    1.131949] [drm] PCIE GART of 4096M enabled (table at 0x0000000000040000).
[    1.132041] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    1.132117] [drm] Driver supports precise vblank timestamp query.
[    1.132217] amdgpu 0000:01:00.0: amdgpu: using MSI.
[    1.132308] [drm] amdgpu: irq initialized.
[    1.132386] Can't find requested voltage id in vdd_dep_on_sclk table!
[    1.134523] amdgpu: powerplay initialized
[    1.135051] [drm] AMDGPU Display Connectors
[    1.135126] [drm] Connector 0:
[    1.135199] [drm]   DP-1
[    1.135271] [drm]   HPD4
[    1.135344] [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
[    1.135465] [drm]   Encoders:
[    1.135538] [drm]     DFP1: INTERNAL_UNIPHY1
[    1.135611] [drm] Connector 1:
[    1.135728] [drm]   HDMI-A-1
[    1.135800] [drm]   HPD5
[    1.135872] [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
[    1.135994] [drm]   Encoders:
[    1.136066] [drm]     DFP2: INTERNAL_UNIPHY1
[    1.136139] [drm] Connector 2:
[    1.136212] [drm]   DVI-D-1
[    1.136284] [drm]   HPD1
[    1.136356] [drm]   DDC: 0x4878 0x4878 0x4879 0x4879 0x487a 0x487a 0x487b 0x487b
[    1.136478] [drm]   Encoders:
[    1.136550] [drm]     DFP3: INTERNAL_UNIPHY
[    1.136623] [drm] Connector 3:
[    1.136696] [drm]   DVI-I-1
[    1.136768] [drm]   HPD6
[    1.136840] [drm]   DDC: 0x487c 0x487c 0x487d 0x487d 0x487e 0x487e 0x487f 0x487f
[    1.136962] [drm]   Encoders:
[    1.137034] [drm]     DFP4: INTERNAL_UNIPHY2
[    1.137107] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[    1.137278] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000008, cpu addr 0xffff8808197c6008
[    1.137596] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000018, cpu addr 0xffff8808197c6018
[    1.137914] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000028, cpu addr 0xffff8808197c6028
[    1.138248] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000038, cpu addr 0xffff8808197c6038
[    1.138577] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000048, cpu addr 0xffff8808197c6048
[    1.138878] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000100000058, cpu addr 0xffff8808197c6058
[    1.139186] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000100000068, cpu addr 0xffff8808197c6068
[    1.139475] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000100000078, cpu addr 0xffff8808197c6078
[    1.139795] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000100000088, cpu addr 0xffff8808197c6088
[    1.140138] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x0000000100000098, cpu addr 0xffff8808197c6098
[    1.140383] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000001000000a8, cpu addr 0xffff8808197c60a8
[    1.140569] [drm] Found UVD firmware Version: 1.52 Family ID: 10
[    1.141226] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x000000000088f7b0, cpu addr 0xffffc90005c4e7b0
[    1.141358] [drm] Found VCE firmware Version: 48.0 Binary ID: 3
[    1.141599] amdgpu 0000:01:00.0: fence driver on ring 12 use gpu addr 0x00000001000000c8, cpu addr 0xffff8808197c60c8
[    1.141877] amdgpu 0000:01:00.0: fence driver on ring 13 use gpu addr 0x00000001000000d8, cpu addr 0xffff8808197c60d8
[    1.219919] [drm] ring test on 0 succeeded in 12 usecs
[    1.220208] [drm] ring test on 1 succeeded in 22 usecs
[    1.220303] [drm] ring test on 2 succeeded in 4 usecs
[    1.220398] [drm] ring test on 3 succeeded in 4 usecs
[    1.220493] [drm] ring test on 4 succeeded in 4 usecs
[    1.220588] [drm] ring test on 5 succeeded in 4 usecs
[    1.220683] [drm] ring test on 6 succeeded in 4 usecs
[    1.220778] [drm] ring test on 7 succeeded in 4 usecs
[    1.220872] [drm] ring test on 8 succeeded in 4 usecs
[    1.220990] [drm] ring test on 9 succeeded in 9 usecs
[    1.221091] [drm] ring test on 10 succeeded in 7 usecs
[    1.266843] [drm] ring test on 11 succeeded in 3 usecs
[    1.286783] [drm] UVD initialized successfully.
[    1.505200] [drm] ring test on 12 succeeded in 27 usecs
[    1.505300] [drm] ring test on 13 succeeded in 5 usecs
[    1.505387] [drm] VCE initialized successfully.
[    1.652922] [drm] fb mappable at 0xC0BAA000
[    1.653000] [drm] vram apper at 0xC0000000
[    1.653076] [drm] size 14745600
[    1.653151] [drm] fb depth is 24
[    1.653227] [drm]    pitch is 10240
[    1.653458] fbcon: amdgpudrmfb (fb0) is primary device
[    1.829490] Failed to send Message.
[    2.179278] Failed to send Previous Message.

(The last two messages get repeated about 4 times a second at this point.)

[   70.107528] Console: switching to colour frame buffer device 320x90
[   70.295522] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[   70.306857] [drm] ib test on ring 0 succeeded in 0 usecs
[   70.307785] [drm] ib test on ring 1 succeeded in 0 usecs
[   70.308631] [drm] ib test on ring 2 succeeded in 0 usecs
[   70.309472] [drm] ib test on ring 3 succeeded in 0 usecs
[   70.310296] [drm] ib test on ring 4 succeeded in 0 usecs
[   70.311151] [drm] ib test on ring 5 succeeded in 0 usecs
[   70.311951] [drm] ib test on ring 6 succeeded in 0 usecs
[   70.312754] [drm] ib test on ring 7 succeeded in 0 usecs
[   70.313504] [drm] ib test on ring 8 succeeded in 0 usecs
[   70.314306] [drm] ib test on ring 9 succeeded in 0 usecs
[   70.315073] [drm] ib test on ring 10 succeeded in 0 usecs
[   70.336416] [drm] ib test on ring 11 succeeded
[   70.357077] [drm] ib test on ring 12 succeeded
[   70.545569] Failed to send Previous Message.
[   70.734245] Failed to send Message.
[   70.922837] Failed to send Previous Message.
[   71.111567] Failed to send Message.
[   71.488476] Failed to send Previous Message.
[   71.677079] Failed to send Message.
[   72.052880] Failed to send Previous Message.
[   72.241706] Failed to send Message.
[   72.242405] [drm] Initialized amdgpu 3.1.0 20150101 for 0000:01:00.0 on minor 0

The system will continue after this point. The system clock has not lost any time.
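
For anyone wanting to put a number on the hang from a saved dmesg file, here is a minimal sketch. It assumes the default `[seconds.micros]` timestamp prefix shown above and treats each run of "Failed to send" lines as one stall, ending at the first later line that is not such a message. The function name `stall_windows` and the sample list are illustrative only and are not part of any tool.

```python
import re

# Default dmesg prefix, e.g. "[   70.107528] message text".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")
# Assumption: these messages mark the stall, as in the excerpt above.
MARKER = "Failed to send"


def stall_windows(lines):
    """Yield (start, end, count) for each run of marker lines.

    start is the timestamp of the first marker line in the run, end is the
    timestamp of the first non-marker line after it, and count is the number
    of marker lines seen. A run still open at end of input is not reported.
    """
    start = None
    count = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines without a timestamp prefix
        ts, msg = float(match.group(1)), match.group(2)
        if msg.startswith(MARKER):
            if start is None:
                start = ts
            count += 1
        elif start is not None:
            yield (start, ts, count)
            start, count = None, 0


# Lines copied from the excerpt above (repeated messages omitted there too).
sample = [
    "[    1.653458] fbcon: amdgpudrmfb (fb0) is primary device",
    "[    1.829490] Failed to send Message.",
    "[    2.179278] Failed to send Previous Message.",
    "[   70.107528] Console: switching to colour frame buffer device 320x90",
]

for start, end, count in stall_windows(sample):
    print(f"stall of {end - start:.1f} s ({count} messages logged)")
```

On the four sample lines this reports a stall of 68.3 s. To run it against a full log, pass an open file object instead of `sample`; the message count will then reflect the repeated lines that were cut from the excerpt.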

This is with CONFIG_DRM_AMD_POWERPLAY enabled. I haven't yet tested without that setting.
Comment 1 Alex Deucher 2016-05-09 18:30:21 UTC
Can you bisect?
Comment 2 Alex Deucher 2016-05-09 18:31:05 UTC
Please attach your full dmesg output and xorg log.
Comment 3 Guido Winkelmann 2016-05-09 18:53:35 UTC
(In reply to Alex Deucher from comment #1)
> Can you bisect?

I just noticed I didn't have the latest firmware installed. I'm going to try rebooting with the latest firmware first. If things stay broken after that, I'll try with PowerPlay disabled. After that, I might try bisecting, although that's bound to take some time...
Comment 4 Guido Winkelmann 2016-05-09 18:56:01 UTC
Created attachment 123579
Full dmesg output
Comment 5 Guido Winkelmann 2016-05-09 18:56:37 UTC
Created attachment 123580
Xorg log file
Comment 6 Guido Winkelmann 2016-05-09 18:57:55 UTC
Created attachment 123581
Uncompressed full dmesg output

Compressing was not a good idea...
Comment 7 Guido Winkelmann 2016-05-09 18:58:32 UTC
Created attachment 123582
Uncompressed X log file
Comment 8 Guido Winkelmann 2016-05-09 19:06:29 UTC
The problem went away after upgrading to firmware package 20160331.
