Bug 66714

Summary: Mobility Radeon HD 5650 doesn't boot with kernel 3.10 (and newer) when using radeon.audio=1
Product: DRI
Reporter: Marco Trevisan (Treviño) <mail>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: critical    
Priority: medium    
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
URL: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1195687
Whiteboard:
Attachments:
Description    Flags
possible fix   none
dmesg          none

Description Marco Trevisan (Treviño) 2013-07-08 22:02:39 UTC
If I enable HDMI audio (radeon.audio=1) on the Mobility Radeon HD 5650 card with kernel 3.10.0 (or newer), I get a kernel oops that prevents the system from booting.

This is a regression, since everything is fine with kernel 3.9.0.

Also, when I enable switchable graphics in the BIOS, the notebook boots, but as soon as I use vgaswitcheroo to switch to the discrete card I get a black screen and the system can't be used (switching back to the integrated card makes it work again).

This is the kernel failure I've found:
Jul 8 23:23:03 tricky kernel: [ 30.386559] BUG: unable to handle kernel NULL pointer dereference at (null)
Jul 8 23:23:03 tricky kernel: [ 30.386626] IP: [<ffffffffa024be90>] evergreen_hdmi_enable+0x20/0x90 [radeon]
Jul 8 23:23:03 tricky kernel: [ 30.386715] PGD 0
Jul 8 23:23:03 tricky kernel: [ 30.386731] Oops: 0000 [#1] SMP
Jul 8 23:23:03 tricky kernel: [ 30.387209] CPU: 2 PID: 2184 Comm: Xorg Not tainted 3.10.0-996-generic #201307020454
Jul 8 23:23:03 tricky kernel: [ 30.387260] Hardware name: Acer Aspire 4820TG/JM41_CP, BIOS V1.25 03/16/2011
Jul 8 23:23:03 tricky kernel: [ 30.387306] task: ffff88022c33ddc0 ti: ffff88022f3d2000 task.ti: ffff88022f3d2000
Jul 8 23:23:03 tricky kernel: [ 30.387354] RIP: 0010:[<ffffffffa024be90>] [<ffffffffa024be90>] evergreen_hdmi_enable+0x20/0x90 [radeon]
Jul 8 23:23:03 tricky kernel: [ 30.387447] RSP: 0018:ffff88022f3d3b68 EFLAGS: 00010246
Jul 8 23:23:03 tricky kernel: [ 30.387482] RAX: ffff88022b02de00 RBX: ffff88022b02ea00 RCX: 0000000000000007
Jul 8 23:23:03 tricky kernel: [ 30.387529] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88022b02ea00
Jul 8 23:23:03 tricky kernel: [ 30.387575] RBP: ffff88022f3d3b78 R08: 0000000000000023 R09: 0000000000000022
Jul 8 23:23:03 tricky kernel: [ 30.387621] R10: ffff88022b724880 R11: 0000000000000000 R12: ffff88022b1f0000
Jul 8 23:23:03 tricky kernel: [ 30.387667] R13: ffff88022bb13000 R14: ffff88022bb13000 R15: ffff8802307e6000
Jul 8 23:23:03 tricky kernel: [ 30.387716] FS: 00007f34c3505980(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
Jul 8 23:23:03 tricky kernel: [ 30.387768] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 8 23:23:03 tricky kernel: [ 30.387807] CR2: 0000000000000000 CR3: 000000022ed44000 CR4: 00000000000007e0
Jul 8 23:23:03 tricky kernel: [ 30.387854] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 8 23:23:03 tricky kernel: [ 30.387901] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 8 23:23:03 tricky kernel: [ 30.387947] Stack:
Jul 8 23:23:03 tricky kernel: [ 30.387963] ffff88022b1f0000 ffff88022bb13000 ffff88022f3d3ba8 ffffffffa0252f3f
Jul 8 23:23:03 tricky kernel: [ 30.388016] ffff88022b02ea00 ffff88022bb13478 ffff88022bb13000 00000000ffffffde
Jul 8 23:23:03 tricky kernel: [ 30.388070] ffff88022f3d3bd8 ffffffffa010ab3a ffff88022e4b7cf0 ffff8802307e6000
Jul 8 23:23:03 tricky kernel: [ 30.388122] Call Trace:
Jul 8 23:23:03 tricky kernel: [ 30.388173] [<ffffffffa0252f3f>] radeon_atom_encoder_disable+0x15f/0x170 [radeon]
Jul 8 23:23:03 tricky kernel: [ 30.388231] [<ffffffffa010ab3a>] drm_helper_disable_unused_functions+0x14a/0x190 [drm_kms_helper]
Jul 8 23:23:03 tricky kernel: [ 30.388295] [<ffffffffa010c5fe>] drm_crtc_helper_set_config+0x95e/0xb50 [drm_kms_helper]
Jul 8 23:23:03 tricky kernel: [ 30.388357] [<ffffffffa00931f8>] ? ttm_bo_vm_fault+0x288/0x3f0 [ttm]
Jul 8 23:23:03 tricky kernel: [ 30.388420] [<ffffffffa01effec>] ? radeon_ttm_fault+0x5c/0x70 [radeon]
Jul 8 23:23:03 tricky kernel: [ 30.388482] [<ffffffffa0040fec>] drm_mode_set_config_internal+0x5c/0xe0 [drm]
Jul 8 23:23:03 tricky kernel: [ 30.388540] [<ffffffffa0043cd8>] drm_mode_setcrtc+0x2e8/0x540 [drm]
Jul 8 23:23:03 tricky kernel: [ 30.388589] [<ffffffff811617e6>] ? handle_pte_fault+0x96/0x230
Jul 8 23:23:03 tricky kernel: [ 30.388639] [<ffffffffa003352a>] drm_ioctl+0x50a/0x650 [drm]
Jul 8 23:23:03 tricky kernel: [ 30.388689] [<ffffffffa00439f0>] ? drm_mode_setplane+0x3f0/0x3f0 [drm]
Jul 8 23:23:03 tricky kernel: [ 30.390589] [<ffffffff8109339c>] ? account_user_time+0x9c/0xb0
Jul 8 23:23:03 tricky kernel: [ 30.392510] [<ffffffff811b2368>] do_vfs_ioctl+0x88/0x340
Jul 8 23:23:03 tricky kernel: [ 30.394435] [<ffffffff811b26b1>] SyS_ioctl+0x91/0xb0
Jul 8 23:23:03 tricky kernel: [ 30.396255] [<ffffffff8171822f>] tracesys+0xe1/0xe6
Jul 8 23:23:03 tricky kernel: [ 30.397670] Code: d6 de ff eb 9b e8 91 d0 e0 e0 90 66 66 66 66 90 55 48 89 e5 48 83 ec 10 48 8b 87 50 01 00 00 40 84 f6 48 8b 90 08 01 00 00 75 20 <80> 3a 00 74 5c c6 02 00 48 8b 80 08 01 00 00 49 c7 c0 69 b4 2c
Jul 8 23:23:03 tricky kernel: [ 30.400952] RIP [<ffffffffa024be90>] evergreen_hdmi_enable+0x20/0x90 [radeon]
Jul 8 23:23:03 tricky kernel: [ 30.402658] RSP <ffff88022f3d3b68>
Jul 8 23:23:03 tricky kernel: [ 30.404355] CR2: 0000000000000000
Jul 8 23:23:03 tricky kernel: [ 30.411223] ---[ end trace cf93979ade0a2572 ]---
Comment 1 Alex Deucher 2013-07-08 22:10:27 UTC
Please attach your full dmesg output.
Comment 2 Alex Deucher 2013-07-08 22:19:28 UTC
Created attachment 82200 [details] [review]
possible fix

Does this patch fix the issue?
Comment 3 Marco Trevisan (Treviño) 2013-07-08 23:15:24 UTC
Created attachment 82202 [details]
dmesg

Full dmesg of the failure
Comment 4 Marco Trevisan (Treviño) 2013-07-14 10:10:10 UTC
So, I've tried the patch on a 3.10 kernel that was affected and... Great, it Works! :)
FYI, I've also added a couple of error logs in the code to see where exactly the failure was and this was the result:

[    2.073137] [drm] fb mappable at 0xC035F000
[    2.073139] [drm] vram apper at 0xC0000000
[    2.073140] [drm] size 8294400
[    2.073141] [drm] fb depth is 24
[    2.073142] [drm]    pitch is 7680
[    2.073252] fbcon: radeondrmfb (fb0) is primary device
[    2.073323] [drm:evergreen_hdmi_enable] *ERROR* Invalid DIG (dig: ffff880229d0fc00, dig->afmt: (null)) /home/marco/Dev/debs/linux-3.10.0/drivers/gpu/drm/radeon/evergreen_hdmi.c:298
[    2.573589] [drm:evergreen_hdmi_enable] *ERROR* Invalid DIG (dig: ffff880229d0fc00, dig->afmt: (null)) /home/marco/Dev/debs/linux-3.10.0/drivers/gpu/drm/radeon/evergreen_hdmi.c:298
[    2.607449] Console: switching to colour frame buffer device 170x48
[    2.610160] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[    2.610162] radeon 0000:01:00.0: registered panic notifier
[    2.610237] [drm] Initialized radeon 2.33.0 20080528 for 0000:01:00.0 on minor 0

Full dmesg at http://pastebin.ubuntu.com/5873667/
I don't know if this may be only the side effect of another issue (wrong initialization order?).

One more thing I discovered in the past few days is that the kernel crash did not happen if the HDMI cable was plugged in after lightdm was already running (attaching it at any point before that led to a freeze).
Comment 5 Marco Trevisan (Treviño) 2013-07-14 10:10:54 UTC
Comment on attachment 82200 [details] [review]
possible fix

Review of attachment 82200 [details] [review]:
-----------------------------------------------------------------

Why not include some DRM_ERROR logs as well?
Comment 6 Christian König 2013-07-14 12:18:21 UTC
(In reply to comment #5)
> Comment on attachment 82200 [details] [review]
> possible fix
> 
> Review of attachment 82200 [details] [review]:
> -----------------------------------------------------------------
> 
> Why not include some DRM_ERROR logs as well?

Because that isn't a bug.

You just have an LVDS panel connected to a DIG encoder, and so this encoder doesn't have an audio block.

Alex's patch is already quite right.
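
To illustrate the point above, here is a minimal userspace sketch in C of the kind of guard being discussed. The struct and function names (afmt_block, dig_encoder, encoder, hdmi_enable_sketch) are hypothetical stand-ins; they are not the real radeon definitions and not the contents of attachment 82200. The only facts taken from this report are that dig->afmt was NULL inside evergreen_hdmi_enable and that an encoder driving an LVDS panel has no audio block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. Names and layout are
 * assumptions for illustration, not the real radeon definitions. */
struct afmt_block {
    bool enabled;
};

struct dig_encoder {
    /* NULL when the encoder has no audio block (e.g. it drives LVDS). */
    struct afmt_block *afmt;
};

struct encoder {
    struct dig_encoder *dig;
};

/* Returns true if the audio block state was changed, false if nothing
 * was done. */
static bool hdmi_enable_sketch(struct encoder *enc, bool enable)
{
    assert(enc != NULL);

    struct dig_encoder *dig = enc->dig;

    /* Guard: an encoder without an audio block is not an error, so
     * return quietly instead of dereferencing a NULL pointer. */
    if (dig == NULL || dig->afmt == NULL)
        return false;

    /* Nothing to do if the block is already in the requested state. */
    if (dig->afmt->enabled == enable)
        return false;

    dig->afmt->enabled = enable;
    return true;
}
```

With a check of this shape, disabling an unused LVDS encoder (the radeon_atom_encoder_disable path in the trace above) would presumably return early rather than fault, and no error message is logged because the missing audio block is expected on such encoders.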
Comment 7 Marco Trevisan (Treviño) 2013-07-31 03:27:33 UTC
Not sure if this has already been done, but I guess this fix should be backported to Linux 3.10.
Comment 8 Alex Deucher 2013-07-31 13:08:37 UTC
The patch will show up in the stable kernels eventually.
