Bug 111659 - Kernel panic when waking up after screens go to dpms sleep
Summary: Kernel panic when waking up after screens go to dpms sleep
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: not set normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-11 11:39 UTC by Brad Campbell
Modified: 2019-11-19 09:35 UTC

See Also:
i915 platform:
i915 features:


Attachments
Multiple instances of the Panic (10.42 KB, text/plain)
2019-09-11 11:39 UTC, Brad Campbell
Complete dmesg (127.86 KB, text/plain)
2019-09-11 11:39 UTC, Brad Campbell
Xorg log (54.94 KB, text/plain)
2019-09-11 11:39 UTC, Brad Campbell
Same oops with v5.3.1 (4.76 KB, text/plain)
2019-09-26 03:51 UTC, Brad Campbell

Description Brad Campbell 2019-09-11 11:39:00 UTC
Created attachment 145332 [details]
Multiple instances of the Panic

iMac late 2011 with 2 Thunderbolt displays.

Kernel 5.2 finally got DP routing working so that both TB displays work. However, I'm now getting lockups that appear to be triggered in radeon_dp_needs_link_train.

I'm capturing these over netconsole as it leaves the machine paralysed.
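
For anyone wanting to capture the same kind of trace, a netconsole receiver can be as small as a UDP listener. The following is a minimal sketch, not the setup used for this report: it assumes the kernel is sending to UDP port 6666 (netconsole's default target port), and the function names and the log path argument are made up for illustration.

```python
import socket


def open_netconsole_socket(bind_addr: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket that netconsole datagrams can be sent to.

    Port 6666 is assumed here because it is netconsole's default target port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def receive_once(sock: socket.socket, bufsize: int = 65535) -> str:
    """Block until one datagram arrives and return it as '<sender-ip> <text>'."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    text = data.decode("utf-8", errors="replace").rstrip("\n")
    return f"{sender_ip} {text}"


def capture(log_path: str, port: int = 6666) -> None:
    """Append every received message to log_path until interrupted."""
    sock = open_netconsole_socket(port=port)
    # Line-buffered so each message reaches the file as soon as it arrives.
    with open(log_path, "a", buffering=1) as log:
        while True:
            log.write(receive_once(sock) + "\n")
```

Running `capture("faults.txt")` on a second machine keeps the log on a host that survives the crash, which matters here because the faulting machine usually needs a hard power cycle.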

Attachment faults.txt has 4 separate instances of the fault from 4 different boots.

This machine stays on 24/7, and this seems to occur when the displays wake up after a dpms sleep. Having said that, I've also seen the fault when doing something innocuous like changing the audio volume.

Generally at least 2 of the screens wake up, so I have displays with a lockscreen asking for a password and a mouse pointer, but the machine is dead.

The last example in faults.txt left the machine in a state where I could ssh in and reboot it. All the others required a hard power cycle.

I'm currently using 5.2.11. Previously I was using 4.17, but I can't roll back prior to 5.2 without losing the second TB display, and it can take hours or days to hit so bisection would be difficult.

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Blackcomb [Radeon HD 6970M/6990M] (prog-if 00 [VGA controller])
	Subsystem: Apple Inc. Radeon HD 6970M
	Flags: bus master, fast devsel, latency 0, IRQ 79
	Memory at 90000000 (64-bit, prefetchable) [size=256M]
	Memory at a8800000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at 2000 [size=256]
	Expansion ROM at a8820000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon
Comment 1 Brad Campbell 2019-09-11 11:39:29 UTC
Created attachment 145333 [details]
Complete dmesg
Comment 2 Brad Campbell 2019-09-11 11:39:41 UTC
Created attachment 145334 [details]
Xorg log
Comment 3 Brad Campbell 2019-09-26 03:51:26 UTC
Created attachment 145517 [details]
Same oops with v5.3.1

Same oops, newer kernel.
Comment 4 Brad Campbell 2019-09-26 04:43:02 UTC
Previous oops (Same oops with v5.3.1) was triggered by changing the audio volume. I assume something to do with the volume OSD tickles the GPU.
Comment 5 Brad Campbell 2019-10-28 08:14:16 UTC
I was starting to suspect this might be a hardware problem, so I sourced a similar machine with a Radeon 6770M. Unfortunately it exhibits the same fault.

[ 2060.703207] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: radeon_dp_needs_link_train+0x69/0x70 [radeon]
[ 2060.703221] CPU: 2 PID: 1386 Comm: kworker/2:2 Not tainted 5.3.4+ #19
[ 2060.703223] Hardware name: Apple Inc. iMac12,2/Mac-942B59F58194171B, BIOS 82.0.0.0.0 09/25/2018
[ 2060.703238] Workqueue: events radeon_dp_work_func [radeon]
[ 2060.703241] Call Trace:
[ 2060.703248]  dump_stack+0x46/0x5b
[ 2060.703253]  panic+0xf3/0x288
[ 2060.703274]  ? radeon_dp_needs_link_train+0x69/0x70 [radeon]
[ 2060.703278]  __stack_chk_fail+0x10/0x10
[ 2060.703294]  radeon_dp_needs_link_train+0x69/0x70 [radeon]
[ 2060.703308]  radeon_connector_hotplug+0xa4/0xd0 [radeon]
[ 2060.703323]  radeon_dp_work_func+0x28/0x40 [radeon]
[ 2060.703326]  process_one_work+0x1b4/0x330
[ 2060.703329]  worker_thread+0x44/0x3d0
[ 2060.703333]  kthread+0xeb/0x120
[ 2060.703336]  ? process_one_work+0x330/0x330
[ 2060.703338]  ? kthread_park+0xa0/0xa0
[ 2060.703344]  ret_from_fork+0x1f/0x30
[ 2060.703382] Kernel Offset: disabled
[ 2060.703385] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: radeon_dp_needs_link_train+0x69/0x70 [radeon] ]---

I'm now at a bit of a loss as to how to debug it. I can reproduce it reliably by playing a 1080p video on each head with mplayer and triggering the OSD (volume/brightness) changes. I can't bisect it with an earlier kernel because it doesn't occur when using the internal panel, one Thunderbolt and one HDMI display, and dual Thunderbolt support only turned up in 5.2.
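
Each frame in the call trace above has the form `symbol+offset/size [module]`, and entries prefixed with `?` are ones the unwinder does not consider reliable. As a rough aid for reading such traces, here is a small sketch that splits a frame line into its parts. The `Frame` type and `parse_frame` helper are hypothetical names introduced only for this example, and the pattern only covers the line shapes seen in this report.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str
    offset: int            # byte offset into the function
    size: int              # total size of the function in bytes
    module: Optional[str]  # None for frames in the core kernel
    reliable: bool         # False when the entry is prefixed with '?'


_FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<q>\?\s+)?(?P<sym>[A-Za-z_][\w.]*)\+"
    r"(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a call-trace frame."""
    m = _FRAME_RE.match(line.strip())
    if m is None:
        return None
    return Frame(
        symbol=m["sym"],
        offset=int(m["off"], 16),
        size=int(m["size"], 16),
        module=m["mod"],
        reliable=m["q"] is None,
    )
```

For `radeon_dp_needs_link_train+0x69/0x70 [radeon]` this gives offset 0x69 in a function of size 0x70, i.e. within the last few bytes of the function, which is where a stack-protector check before the return would be expected to sit.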
Comment 6 Brad Campbell 2019-10-28 11:57:42 UTC
Given that these generally trip the stack canary, I thought I'd try something different: I upgraded to the latest linus-git and disabled the stack protector.

It did take a bit longer to hit, but we got there.

[ 8044.605803] do_IRQ: 2.205 No irq handler for vector
[ 8044.605814] do_IRQ: 2.233 No irq handler for vector
[ 8044.605821] invalid opcode: 0000 [#1] SMP
[ 8044.605824] CPU: 2 PID: 1476 Comm: kworker/2:2 Not tainted 5.4.0-rc4-bkc1+ #3
[ 8044.605826] Hardware name: Apple Inc. iMac12,2/Mac-942B59F58194171B, BIOS 82.0.0.0.0 09/25/2018
[ 8044.605847] Workqueue: events radeon_dp_work_func [radeon]
[ 8044.605866] RIP: 0010:atom_op_move+0x124/0x1d0 [radeon]
[ 8044.605869] Code: 20 c0 e8 03 83 e0 07 74 2e 8d 4a 03 83 c2 02 3c 04 0f 42 d1 41 89 14 24 eb 07 83 c2 02 41 89 14 24 c7 44 24 08 cd cd cd cd e9 <42> ff ff ff 83 c2 03 41 89 14 24 eb ea 83 c2 05 41 89 14 24 eb e1
[ 8044.605871] RSP: 0018:ffffc9000034fe50 EFLAGS: 00010292
[ 8044.605880] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[ 8044.605881] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffc9000034fe3a
[ 8044.605883] RBP: ffff8882633d0000 R08: ffffffffa0201837 R09: 0000000000000006
[ 8044.605884] R10: 0000000000000006 R11: 000000000001a800 R12: ffff8882658eb1a0
[ 8044.605886] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[ 8044.605887] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[ 8044.605889] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8044.605890] CR2: 000031fd52e7a000 CR3: 00000001d1197002 CR4: 00000000000606a0
[ 8044.605892] Call Trace:
[ 8044.605907]  ? radeon_dp_work_func+0x28/0x40 [radeon]
[ 8044.605911]  ? process_one_work+0x1b4/0x330
[ 8044.605913]  ? worker_thread+0x44/0x3d0
[ 8044.605923]  ? set_worker_desc+0x90/0x90
[ 8044.605925]  ? kthread+0xec/0x120
[ 8044.605927]  ? kthread_create_worker_on_cpu+0x40/0x40
[ 8044.605930]  ? ret_from_fork+0x1f/0x30
[ 8044.605932] Modules linked in: cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc af_packet dm_crypt dm_mod dax coretemp applesmc input_polldev kvm_intel led_class kvm irqbypass btusb btbcm uvcvideo btintel bluetooth videobuf2_vmalloc snd_usb_audio videobuf2_memops rfkill videobuf2_v4l2 videodev snd_usbmidi_lib ecdh_generic videobuf2_common ecc joydev snd_hda_codec_cirrus snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_generic evdev snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore button apple_bl ext4 crc32c_generic crc16 mbcache jbd2 netconsole configfs hid_apple hid_appleir usb_storage hid_generic usbhid hid sg sr_mod cdrom sd_mod aesni_intel glue_helper crypto_simd cryptd ahci libahci radeon xhci_pci xhci_hcd i2c_algo_bit backlight drm_kms_helper syscopyarea tg3 sysfillrect sysimgblt fb_sys_fops ehci_pci uhci_hcd libphy ttm ehci_hcd firewire_ohci firewire_core drm crc_itu_t
[ 8044.605961]  usbcore usb_common i2c_core
[ 8044.605967] ---[ end trace 78992a27b7291279 ]---
[ 8044.605977] RIP: 0010:atom_op_move+0x124/0x1d0 [radeon]
[ 8044.605979] Code: 20 c0 e8 03 83 e0 07 74 2e 8d 4a 03 83 c2 02 3c 04 0f 42 d1 41 89 14 24 eb 07 83 c2 02 41 89 14 24 c7 44 24 08 cd cd cd cd e9 <42> ff ff ff 83 c2 03 41 89 14 24 eb ea 83 c2 05 41 89 14 24 eb e1
[ 8044.605980] RSP: 0018:ffffc9000034fe50 EFLAGS: 00010292
[ 8044.605983] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[ 8044.605985] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffc9000034fe3a
[ 8044.605986] RBP: ffff8882633d0000 R08: ffffffffa0201837 R09: 0000000000000006
[ 8044.605987] R10: 0000000000000006 R11: 000000000001a800 R12: ffff8882658eb1a0
[ 8044.605989] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[ 8044.605990] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[ 8044.605991] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8044.605993] CR2: 000031fd52e7a000 CR3: 00000001d1197002 CR4: 00000000000606a0
[ 8044.605994] Kernel panic - not syncing: Fatal exception
[ 8044.606020] Kernel Offset: disabled
[ 8044.606022] Rebooting in 10 seconds..
[ 8054.606406] ACPI MEMORY or I/O RESET_REG.
Comment 7 Brad Campbell 2019-10-31 07:42:01 UTC
And another one.

Disturbingly in a different location. I'm starting to wonder if there is a deeper issue at play.

[237702.755803] invalid opcode: 0000 [#1] SMP
[237702.755811] CPU: 2 PID: 15611 Comm: kworker/2:1 Not tainted 5.4.0-rc4-bkc1+ #3
[237702.755813] Hardware name: Apple Inc. iMac12,2/Mac-942B59F58194171B, BIOS 82.0.0.0.0 09/25/2018
[237702.755835] Workqueue: events radeon_dp_work_func [radeon]
[237702.755846] RIP: 0010:radeon_gart_table_vram_unpin+0x101/0x110 [radeon]
[237702.755848] Code: 00 fe ff ff 74 ed 5b 48 89 ea 48 c7 c6 64 ae 3c a0 48 8b 85 20 03 00 00 5d 41 5c 48 8b 38 e9 38 63 00 e1 4c 89 e7 e8 5e 4e ef <ff> eb a0 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 83 bf 68 04 00 00
[237702.755850] RSP: 0000:ffffc90003d07e50 EFLAGS: 00010292
[237702.755851] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[237702.755853] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffc90003d07e3a
[237702.755854] RBP: ffff888262c40000 R08: ffffffffa0213837 R09: 0000000000000006
[237702.755856] R10: 0000000000000006 R11: 0000000000000000 R12: ffff8882639e1a40
[237702.755857] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[237702.755858] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[237702.755860] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[237702.755861] CR2: 00002bc12936d1e0 CR3: 0000000252db0004 CR4: 00000000000606a0
[237702.755862] Call Trace:
[237702.755878]  radeon_dp_work_func+0x28/0x40 [radeon]
[237702.755883]  process_one_work+0x1b4/0x330
[237702.755885]  worker_thread+0x44/0x3d0
[237702.755887]  ? set_worker_desc+0x90/0x90
[237702.755890]  kthread+0xec/0x120
[237702.755892]  ? kthread_create_worker_on_cpu+0x40/0x40
[237702.755895]  ret_from_fork+0x1f/0x30
[237702.755897] Modules linked in: cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc af_packet dm_crypt dm_mod dax coretemp applesmc input_polldev kvm_intel led_class kvm btusb btbcm uvcvideo irqbypass videobuf2_vmalloc videobuf2_memops btintel videobuf2_v4l2 bluetooth snd_usb_audio videodev snd_usbmidi_lib rfkill videobuf2_common ecdh_generic snd_rawmidi ecc joydev evdev snd_hda_codec_cirrus snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd apple_bl button soundcore ext4 crc32c_generic crc16 mbcache jbd2 netconsole configfs hid_apple hid_appleir hid_generic usbhid hid usb_storage sg sr_mod cdrom sd_mod aesni_intel glue_helper crypto_simd cryptd ahci libahci xhci_pci radeon xhci_hcd i2c_algo_bit backlight drm_kms_helper syscopyarea sysfillrect firewire_ohci sysimgblt fb_sys_fops ehci_pci firewire_core ttm uhci_hcd crc_itu_t tg3 ehci_hcd libphy drm
[237702.755925]  usbcore usb_common i2c_core
[237702.755931] ---[ end trace 5f73030a00b66980 ]---
[237702.755941] RIP: 0010:radeon_gart_table_vram_unpin+0x101/0x110 [radeon]
[237702.755943] Code: 00 fe ff ff 74 ed 5b 48 89 ea 48 c7 c6 64 ae 3c a0 48 8b 85 20 03 00 00 5d 41 5c 48 8b 38 e9 38 63 00 e1 4c 89 e7 e8 5e 4e ef <ff> eb a0 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 83 bf 68 04 00 00
[237702.755944] RSP: 0000:ffffc90003d07e50 EFLAGS: 00010292
[237702.755945] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[237702.755947] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffc90003d07e3a
[237702.755948] RBP: ffff888262c40000 R08: ffffffffa0213837 R09: 0000000000000006
[237702.755950] R10: 0000000000000006 R11: 0000000000000000 R12: ffff8882639e1a40
[237702.755951] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[237702.755952] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[237702.755954] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[237702.755955] CR2: 00002bc12936d1e0 CR3: 0000000252db0004 CR4: 00000000000606a0
[237702.755957] Kernel panic - not syncing: Fatal exception
[237702.755981] Kernel Offset: disabled
[237702.755983] Rebooting in 10 seconds..
[237712.756129] ACPI MEMORY or I/O RESET_REG.
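
With several oopses landing in different places, it can help to tabulate just the oops header, the workqueue and the first faulting RIP from a captured log. The following is a minimal sketch under stated assumptions: `summarise_oopses` is a hypothetical helper, it only recognises oopses that carry a `[#N]` header and an `end trace` marker (so a direct stack-protector panic without those lines is not picked up), and it expects one log message per line with no wrapping.

```python
import re
from typing import Dict, Iterable, List, Optional

_TS = r"^\[\s*\d+\.\d+\]\s+"
_HEADER_RE = re.compile(_TS + r"(?P<what>.*?)\s*\[#\d+\]")
_WQ_RE = re.compile(_TS + r"Workqueue: (?P<wq>.+?)\s*$")
_RIP_RE = re.compile(_TS + r"RIP: \w+:(?P<loc>\S+)")
_END_RE = re.compile(_TS + r"---\[ end trace")


def summarise_oopses(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Collect one record per oops: header text, workqueue (if any), first RIP."""
    records: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for line in lines:
        header = _HEADER_RE.match(line)
        if header:
            current = {"what": header["what"]}
            continue
        if current is None:
            continue  # ignore anything outside an oops, e.g. the repeated register dump
        if _END_RE.match(line):
            records.append(current)
            current = None
            continue
        wq = _WQ_RE.match(line)
        if wq:
            current["workqueue"] = wq["wq"]
            continue
        rip = _RIP_RE.match(line)
        if rip:
            # Keep only the first RIP, which is the faulting kernel address.
            current.setdefault("rip", rip["loc"])
    return records
```

Applied to the two traces above, this would show different RIP values (`atom_op_move+0x124/0x1d0` and `radeon_gart_table_vram_unpin+0x101/0x110`) but the same workqueue, `events radeon_dp_work_func [radeon]`.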
Comment 8 Brad Campbell 2019-11-19 00:30:13 UTC
I'm running out of ideas. I assume from the complete lack of interest I'm either putting this in the wrong place or everyone is concentrating on new hardware.

Recap. This is happening on two separate and different machines. Both iMac, but one with a 6770M and one with a 6970M just to rule out hardware failure.

Tried it with dpm off (radeon.dpm=0). That made it last a bit longer, but:

[925997.946677] BUG: kernel NULL pointer dereference, address: 00000000000003f0
[925997.946690] #PF: supervisor write access in kernel mode
[925997.946693] #PF: error_code(0x0002) - not-present page
[925997.946695] PGD 0 P4D 0 
[925997.946701] Oops: 0002 [#1] SMP
[925997.946705] CPU: 3 PID: 26449 Comm: Xorg Not tainted 5.4.0-rc6-bkc1+ #4
[925997.946707] Hardware name: Apple Inc. iMac12,2/Mac-942B59F58194171B, BIOS 82.0.0.0.0 09/25/2018
[925997.946716] RIP: 0010:mutex_lock+0x14/0x30
[925997.946720] Code: e9 22 fd ff ff 90 be 02 00 00 00 e9 66 fb ff ff 66 0f 1f 44 00 00 53 48 89 fb e8 e7 ea ff ff 31 c0 65 48 8b 14 25 00 5d 01 00 <f0> 48 0f b1 13 75 02 5b c3 48 89 df 5b eb cd 0f 1f 00 66 2e 0f 1f
[925997.946723] RSP: 0018:ffffc90000357a90 EFLAGS: 00010246
[925997.946726] RAX: 0000000000000000 RBX: 00000000000003f0 RCX: ffffc90000357b2f
[925997.946729] RDX: ffff888264e38000 RSI: 0000000000000008 RDI: 00000000000003f0
[925997.946732] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[925997.946734] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[925997.946737] R13: 00000000fffffffb R14: 00000000ffffffb9 R15: 0000000000000000
[925997.946741] FS:  00007f92358580c0(0000) GS:ffff888267b80000(0000) knlGS:0000000000000000
[925997.946743] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[925997.946746] CR2: 00000000000003f0 CR3: 0000000263d9b004 CR4: 00000000000606a0
[925997.946748] Call Trace:
[925997.946761]  drm_dp_dpcd_access+0x57/0xf0 [drm_kms_helper]
[925997.946769]  drm_dp_dpcd_write+0x21/0x90 [drm_kms_helper]
[925997.946804]  radeon_dp_set_tp+0x4f/0x80 [radeon]
[925997.946832]  radeon_dp_link_train+0x54f/0x570 [radeon]
[925997.946862]  radeon_atom_encoder_dpms_dig+0x21a/0x4e0 [radeon]
[925997.946881]  ? atombios_blank_crtc+0x130/0x130 [radeon]
[925997.946907]  radeon_atom_encoder_dpms+0xa6/0x110 [radeon]
[925997.946913]  drm_helper_connector_dpms+0x10b/0x150 [drm_kms_helper]
[925997.946928]  drm_connector_set_obj_prop+0x56/0x70 [drm]
[925997.946941]  drm_mode_obj_set_property_ioctl+0x252/0x270 [drm]
[925997.946946]  ? schedule+0x34/0x90
[925997.946951]  ? __lock_page_killable+0x132/0x1c0
[925997.946963]  ? drm_connector_set_obj_prop+0x70/0x70 [drm]
[925997.946973]  drm_connector_property_set_ioctl+0x29/0x30 [drm]
[925997.946984]  drm_ioctl_kernel+0x83/0xd0 [drm]
[925997.946994]  drm_ioctl+0x2a5/0x320 [drm]
[925997.947005]  ? drm_connector_set_obj_prop+0x70/0x70 [drm]
[925997.947009]  ? filemap_map_pages+0x151/0x310
[925997.947023]  radeon_drm_ioctl+0x44/0x80 [radeon]
[925997.947029]  do_vfs_ioctl+0x8a/0x5d0
[925997.947033]  ksys_ioctl+0x35/0x60
[925997.947037]  __x64_sys_ioctl+0x11/0x20
[925997.947041]  do_syscall_64+0x3d/0x110
[925997.947045]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[925997.947048] RIP: 0033:0x7f923301c017
[925997.947052] Code: 00 00 00 48 8b 05 81 7e 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 7e 2b 00 f7 d8 64 89 01 48
[925997.947055] RSP: 002b:00007ffd32cab5d8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[925997.947058] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f923301c017
[925997.947061] RDX: 00007ffd32cab610 RSI: 00000000c01064ab RDI: 000000000000000f
[925997.947063] RBP: 00007ffd32cab610 R08: 0000560d26e47aa0 R09: 0000000000000001
[925997.947068] R10: 0000000000000000 R11: 0000000000003246 R12: 00000000c01064ab
[925997.947070] R13: 000000000000000f R14: 0000560d26dc7c50 R15: 0000560d2517a580
[925997.947073] Modules linked in: ntfs msdos ext2 fuse cpuid loop isofs nls_utf8 hfsplus nls_iso8859_1 nls_cp437 vfat fat cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc af_packet dm_crypt dm_mod dax coretemp kvm_intel kvm irqbypass applesmc input_polldev led_class btusb btbcm btintel uvcvideo bluetooth videobuf2_vmalloc rfkill snd_usb_audio videobuf2_memops ecdh_generic videobuf2_v4l2 ecc snd_usbmidi_lib videodev joydev snd_hda_codec_cirrus videobuf2_common evdev snd_hda_codec_generic snd_rawmidi snd_hda_codec_hdmi snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd apple_bl soundcore button ext4 crc32c_generic crc16 mbcache jbd2 netconsole configfs hid_apple hid_appleir sg hid_generic usbhid hid usb_storage sr_mod cdrom sd_mod aesni_intel glue_helper crypto_simd cryptd xhci_pci xhci_hcd ahci radeon firewire_ohci libahci firewire_core crc_itu_t i2c_algo_bit backlight drm_kms_helper
[925997.947119]  syscopyarea sysfillrect sysimgblt fb_sys_fops uhci_hcd ehci_pci tg3 ttm ehci_hcd libphy drm usbcore usb_common i2c_core
[925997.947132] CR2: 00000000000003f0
[925997.947135] ---[ end trace 19041427bf8bf31b ]---
[925997.947140] RIP: 0010:mutex_lock+0x14/0x30
[925997.947160] Code: e9 22 fd ff ff 90 be 02 00 00 00 e9 66 fb ff ff 66 0f 1f 44 00 00 53 48 89 fb e8 e7 ea ff ff 31 c0 65 48 8b 14 25 00 5d 01 00 <f0> 48 0f b1 13 75 02 5b c3 48 89 df 5b eb cd 0f 1f 00 66 2e 0f 1f
[925997.947163] RSP: 0018:ffffc90000357a90 EFLAGS: 00010246
[925997.947166] RAX: 0000000000000000 RBX: 00000000000003f0 RCX: ffffc90000357b2f
[925997.947168] RDX: ffff888264e38000 RSI: 0000000000000008 RDI: 00000000000003f0
[925997.947170] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[925997.947173] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[925997.947175] R13: 00000000fffffffb R14: 00000000ffffffb9 R15: 0000000000000000
[925997.947178] FS:  00007f92358580c0(0000) GS:ffff888267b80000(0000) knlGS:0000000000000000
[925997.947181] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[925997.947183] CR2: 00000000000003f0 CR3: 0000000263d9b004 CR4: 00000000000606a0
[925997.947186] Kernel panic - not syncing: Fatal exception
[925997.947226] Kernel Offset: disabled
[925997.947230] Rebooting in 10 seconds..
[926007.947755] ACPI MEMORY or I/O RESET_REG.
Comment 9 Brad Campbell 2019-11-19 08:02:21 UTC
Another observation (after another lockup). I have the kernel set to reboot on panic. On reboot the machine displays a grey screen (as it does on a normal boot) but never gets to the bootloader.

I have to hard power cycle the machine. I suppose that might imply the card is getting into a hardware state the firmware can't recover from.

I have observed this behaviour every time it has auto-rebooted. Still booting with radeon.dpm=0.
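
To confirm that a boot parameter such as radeon.dpm=0 actually took effect, the value can be read back from sysfs after boot. This is a small sketch, assuming the module exports the parameter as a readable file under /sys/module/radeon/parameters/ (which depends on how the driver declares it); `read_module_param` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def read_module_param(module: str, param: str, sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the current value of a module parameter, or None if it is not readable.

    Assumes the parameter is exposed at <sysfs_root>/<module>/parameters/<param>.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not exported, or no read permission.
        return None
```

On the affected machine, `read_module_param("radeon", "dpm")` would be expected to return `"0"` if the setting was applied.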

[27018.029370] BUG: kernel NULL pointer dereference, address: 00000000000002b8
[27018.029383] #PF: supervisor read access in kernel mode
[27018.029386] #PF: error_code(0x0000) - not-present page
[27018.029389] PGD 0 P4D 0 
[27018.029394] Oops: 0000 [#1] SMP
[27018.029399] CPU: 2 PID: 6439 Comm: kworker/2:2 Not tainted 5.4.0-rc7-bkc1+ #5
[27018.029401] Hardware name: Apple Inc. iMac12,2/Mac-942B59F58194171B, BIOS 82.0.0.0.0 09/25/2018
[27018.029437] Workqueue: events radeon_dp_work_func [radeon]
[27018.029458] RIP: 0010:radeon_add_legacy_encoder+0x0/0x2d0 [radeon]
[27018.029462] Code: 3f a0 e8 f3 a1 d8 ff 4c 89 ef e8 db 42 e3 e0 eb d3 41 c6 45 08 00 e9 43 ff ff ff 48 c7 c7 b0 73 3e a0 e8 d3 a1 d8 ff eb de 90 <48> 8b 87 b8 02 00 00 4c 8d 87 b8 02 00 00 49 39 c0 74 24 3b 70 68
[27018.029466] RSP: 0018:ffffc90004dafe48 EFLAGS: 00010292
[27018.029469] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[27018.029471] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[27018.029473] RBP: ffff888262820000 R08: ffffffffa0115837 R09: 0000000000000006
[27018.029476] R10: 0000000000000006 R11: 000000000000024c R12: ffff8882625c5ba0
[27018.029478] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[27018.029481] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[27018.029484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[27018.029487] CR2: 00000000000002b8 CR3: 0000000001c09004 CR4: 00000000000606a0
[27018.029489] Call Trace:
[27018.029509]  radeon_get_legacy_connector_info_from_bios+0x399/0xb90 [radeon]
[27018.029531]  ? radeon_dp_work_func+0x28/0x40 [radeon]
[27018.029537]  ? process_one_work+0x1b4/0x330
[27018.029540]  ? worker_thread+0x44/0x3d0
[27018.029544]  ? set_worker_desc+0x90/0x90
[27018.029548]  ? kthread+0xec/0x120
[27018.029552]  ? kthread_create_worker_on_cpu+0x40/0x40
[27018.029556]  ? ret_from_fork+0x1f/0x30
[27018.029558] Modules linked in: cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc af_packet dm_crypt dm_mod dax coretemp kvm_intel kvm irqbypass applesmc btusb input_polldev btbcm led_class btintel uvcvideo bluetooth videobuf2_vmalloc rfkill videobuf2_memops snd_usb_audio videobuf2_v4l2 videodev snd_usbmidi_lib ecdh_generic ecc joydev videobuf2_common snd_rawmidi evdev snd_hda_codec_cirrus snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore apple_bl button ext4 crc32c_generic crc16 mbcache jbd2 netconsole configfs hid_apple hid_appleir usb_storage hid_generic usbhid hid sg sr_mod cdrom sd_mod aesni_intel glue_helper crypto_simd cryptd radeon ahci libahci i2c_algo_bit xhci_pci backlight xhci_hcd firewire_ohci drm_kms_helper ehci_pci syscopyarea sysfillrect sysimgblt fb_sys_fops uhci_hcd tg3 ttm firewire_core ehci_hcd libphy crc_itu_t drm
[27018.029601]  usbcore usb_common i2c_core
[27018.029610] CR2: 00000000000002b8
[27018.029613] ---[ end trace 7d5e64a9e69d86f6 ]---
[27018.029632] RIP: 0010:radeon_add_legacy_encoder+0x0/0x2d0 [radeon]
[27018.029636] Code: 3f a0 e8 f3 a1 d8 ff 4c 89 ef e8 db 42 e3 e0 eb d3 41 c6 45 08 00 e9 43 ff ff ff 48 c7 c7 b0 73 3e a0 e8 d3 a1 d8 ff eb de 90 <48> 8b 87 b8 02 00 00 4c 8d 87 b8 02 00 00 49 39 c0 74 24 3b 70 68
[27018.029639] RSP: 0018:ffffc90004dafe48 EFLAGS: 00010292
[27018.029641] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[27018.029643] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[27018.029646] RBP: ffff888262820000 R08: ffffffffa0115837 R09: 0000000000000006
[27018.029648] R10: 0000000000000006 R11: 000000000000024c R12: ffff8882625c5ba0
[27018.029651] R13: 0000000000000000 R14: ffff888267b22c00 R15: 0000000000000000
[27018.029653] FS:  0000000000000000(0000) GS:ffff888267b00000(0000) knlGS:0000000000000000
[27018.029656] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[27018.029658] CR2: 00000000000002b8 CR3: 0000000001c09004 CR4: 00000000000606a0
[27018.029661] Kernel panic - not syncing: Fatal exception
[27018.029705] Kernel Offset: disabled
[27018.029709] Rebooting in 10 seconds..
[27028.030118] ACPI MEMORY or I/O RESET_REG.
Comment 10 Martin Peres 2019-11-19 09:35:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/871.

