Bug 104736

Summary: Kernel panic with agd5f's drm-next-4.17-wip & GFX8/Polaris10/Ellesmere/Rx-480-8GiB
Product: DRI
Reporter: Robin Kauffman <robink>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: mike
Version: XOrg git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments (description, flags):
Serial kernel console for Panic/BUG/OOPS bug on Kaveri+GFX8 (flags: none)
Possible fix (flags: none)

Description Robin Kauffman 2018-01-22 18:40:58 UTC
Created attachment 136904
Serial kernel console for Panic/BUG/OOPS bug on Kaveri+GFX8

Hi-
    I'm having issues with bringing up a kennel on an AMD Kaveri (A10-7850K w/ R7 iGPU disabled) using a Radeon Rx 480 8GiB.  I'm using (at the time of this writing) the latest commit from agd5f's drm-next-4.17-wip branch (commit 006a62b26fbb0b7dcd8061f83889e0514ea42d17 ('drm/ttm: Don't unreserve swapped BOs that were previously reserved')).
    When bringing up the kernel, it acts like it's initializing the (DisplayPort-connected) display, but then the screen blacks out and the display finally goes to sleep.  I've attached a kernel log, grabbed over a serial connection because the kernel panics too soon to bring up the various network devices (and thus the network console), which hopefully has enough verbosity to be useful.  It's also available at https://gist.github.com/Haifen/f783d5f6b2ab3d30805d9f5315b7c675
    Please let me know if there's any more information you require.

        -Robin K.
Comment 1 Robin Kauffman 2018-01-22 18:45:13 UTC
Oops, s/kennel/kernel/
Comment 2 Alex Deucher 2018-01-23 22:19:39 UTC
does manually loading gpu_sched before amdgpu fix the issue?
Comment 3 Robin Kauffman 2018-01-24 02:52:01 UTC
(In reply to Alex Deucher from comment #2)
> does manually loading gpu_sched before amdgpu fix the issue?

I can try to do so, but I'm unsure how, given that pretty much the entire graphics stack is compiled into the kernel image.
Reverting commit 4983e48c8539282be15f660bdd2c4260467b1190 ('drm/sched: move fence slab handling to module init/exit') fixes the issue, and thus this bug may be a duplicate of bug #104756.  If so, this bug can be closed as a duplicate, or I can experiment with trying to force gpu_sched up first.
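
For anyone trying the "load gpu_sched first" suggestion, it may help to check whether gpu_sched and amdgpu are loadable modules at all. The following is a minimal Python sketch, assuming the usual /proc/modules layout in which the module name is the first whitespace-separated field on each line. The helper name loaded_modules and the sample text are hypothetical and are not taken from the reporter's system. Drivers compiled into the kernel image are not listed in /proc/modules, so a missing entry does not by itself distinguish "built in" from "not loaded".

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Assumption: each non-empty line starts with the module name,
    followed by whitespace-separated fields (size, use count, ...).
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


if __name__ == "__main__":
    # Made-up sample for illustration only; on a real system, pass the
    # contents of /proc/modules instead.
    sample = (
        "amdgpu 3000000 5 - Live 0x0000000000000000\n"
        "gpu_sched 28672 1 amdgpu, Live 0x0000000000000000\n"
    )
    mods = loaded_modules(sample)
    for name in ("gpu_sched", "amdgpu"):
        if name in mods:
            print(name + ": loaded as a module")
        else:
            print(name + ": not listed (built in or not loaded)")
```

If neither name is listed on a machine where the GPU works, the drivers are most likely built in, and manual load ordering with modprobe does not apply.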
Comment 4 Michel Dänzer 2018-01-24 08:41:43 UTC
Please attach the CONFIG_DRM entries of the kernel build configuration file.
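
One way to pull those entries out of a kernel configuration file is sketched below in Python. It assumes the standard .config text format, where enabled options look like CONFIG_FOO=y or CONFIG_FOO=m and disabled ones appear as "# CONFIG_FOO is not set". The function name drm_config_entries and the sample configuration are hypothetical and do not reflect the reporter's actual build.

```python
def drm_config_entries(config_text, prefix="CONFIG_DRM"):
    """Return .config lines whose option name starts with `prefix`.

    Keeps both set options ("CONFIG_DRM_X=y") and explicitly unset
    ones ("# CONFIG_DRM_X is not set"), in file order.
    """
    entries = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            entries.append(stripped)
        elif stripped.startswith("# " + prefix) and stripped.endswith(" is not set"):
            entries.append(stripped)
    return entries


if __name__ == "__main__":
    # Made-up sample for illustration only; read the real .config
    # file and pass its contents instead.
    sample = (
        "CONFIG_DRM=y\n"
        "# CONFIG_DRM_RADEON is not set\n"
        "CONFIG_DRM_AMDGPU=y\n"
        "CONFIG_DRM_SCHED=y\n"
        "CONFIG_FB=y\n"
    )
    for entry in drm_config_entries(sample):
        print(entry)
```

Entries ending in =y are built into the kernel image, while =m marks loadable modules, which is the distinction that matters for the load-order question in comment #2.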
Comment 5 Christian König 2018-01-24 10:07:02 UTC
*** Bug 104756 has been marked as a duplicate of this bug. ***
Comment 6 Christian König 2018-01-24 10:47:03 UTC
Created attachment 136937
Possible fix

I was able to reproduce the problem and the attached patch should fix it.
Comment 7 Mike Lothian 2018-01-24 10:53:41 UTC
I can confirm it does fix the issues I was seeing.
