Bug 108607 - *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
Status: CLOSED DUPLICATE of bug 108585
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: PowerPC Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2018-10-30 23:45 UTC by Joel
Modified: 2018-10-31 08:12 UTC

Attachments
4.19.0-11706-g11743c56785c dmesg (66.25 KB, text/plain)
2018-10-30 23:45 UTC, Joel
firmware kernel (linux-as-bootloader) (52.50 KB, text/plain)
2018-10-30 23:46 UTC, Joel

Description Joel 2018-10-30 23:45:33 UTC
Created attachment 142289
4.19.0-11706-g11743c56785c dmesg

Booting a ppc64le system results in this crash in Linux:

4.19.0-11706-g11743c56785c (Linus' tree, mid-4.20 merge window)


[    5.689582] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[    5.689609] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -22
[    5.689611] amdgpu 0033:01:00.0: amdgpu_device_ip_init failed
[    5.689613] amdgpu 0033:01:00.0: Fatal error during GPU init
[    5.689614] [drm] amdgpu: finishing device.
[    5.699992] EEH: Frozen PHB#33-PE#0 detected 
[    5.699997] EEH: PE location: CPU2 Slot1 (16x), PHB location: N/A
[    5.700000] CPU: 88 PID: 895 Comm: kworker/88:1 Not tainted 4.19.0-11706-g11743c56785c #37
[    5.700007] Workqueue: events work_for_cpu_fn
[    5.700008] Call Trace:
[    5.700012] [c00020000b603778] [c000000000ad3dac] dump_stack+0xb0/0xf4 (unreliable)
[    5.700015] [c00020000b6037b8] [c000000000039638] eeh_dev_check_failure+0x458/0x580
[    5.700017] [c00020000b603858] [c0000000000397f8] eeh_check_failure+0x98/0xe0
[    5.700043] [c00020000b603898] [c0080000125520c8] amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
[    5.700073] [c00020000b6038f8] [c0080000125a4484] vi_common_hw_fini+0x3c/0xc0 [amdgpu]
[    5.700104] [c00020000b603928] [c008000012702b9c] amdgpu_device_fini+0x230/0x58c [amdgpu]
[    5.700131] [c00020000b6039d8] [c0080000125580d0] amdgpu_driver_unload_kms+0xe8/0x1f0 [amdgpu]
[    5.700157] [c00020000b603a18] [c008000012558434] amdgpu_driver_load_kms+0x25c/0x290 [amdgpu]
[    5.700161] [c00020000b603a98] [c000000000668d7c] drm_dev_register+0x1bc/0x270
[    5.700186] [c00020000b603b38] [c0080000125506bc] amdgpu_pci_probe+0x114/0x200 [amdgpu]
[    5.700189] [c00020000b603bc8] [c0000000005b087c] local_pci_probe+0x6c/0x140
[    5.700191] [c00020000b603c58] [c000000000105448] work_for_cpu_fn+0x38/0x60
[    5.700192] [c00020000b603c88] [c00000000010a0e0] process_one_work+0x2b0/0x560
[    5.700194] [c00020000b603d18] [c00000000010a660] worker_thread+0x2d0/0x610
[    5.700197] [c00020000b603db8] [c0000000001133cc] kthread+0x1ac/0x1c0
[    5.700200] [c00020000b603e28] [c00000000000b790] ret_from_kernel_thread+0x5c/0x6c


For those not familiar with ppc64le bare metal systems: they use linux-as-a-bootloader, so this is the second kernel that has loaded. The first kernel in this case was 4.19.0 (upstream release, no patches). The first kernel also loads the amdgpu driver.
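
For anyone triaging similar logs, here is a minimal sketch of pulling the ring number, scratch register and readback value out of the failure line. It assumes, from my reading of the amdgpu gfx ring test, that the driver seeds the scratch register with 0xCAFEDEAD and expects the GPU to overwrite it with 0xDEADBEEF, so a readback of 0xCAFEDEAD would mean the write packet was never executed. The helper name `parse_ring_test_failure` and the dictionary keys are hypothetical, chosen for illustration only.

```python
import re

# Matches the message format seen in the dmesg output above.
RING_TEST_RE = re.compile(
    r"ring (?P<ring>\d+) test failed "
    r"\(scratch\((?P<reg>0x[0-9A-Fa-f]+)\)=(?P<val>0x[0-9A-Fa-f]+)\)"
)

# Assumption: value the driver seeds into the scratch register before the test.
SENTINEL = 0xCAFEDEAD
# Assumption: value the GPU is expected to write back when the ring works.
EXPECTED = 0xDEADBEEF


def parse_ring_test_failure(line):
    """Return the parsed fields of a ring test failure line, or None."""
    match = RING_TEST_RE.search(line)
    if match is None:
        return None
    value = int(match.group("val"), 16)
    return {
        "ring": int(match.group("ring")),
        "scratch_reg": int(match.group("reg"), 16),
        "value": value,
        # True when the seeded value is still there, i.e. nothing was written.
        "gpu_never_wrote": value == SENTINEL,
    }


line = (
    "[    5.689582] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: "
    "ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)"
)
result = parse_ring_test_failure(line)
print(result)
```

If that assumption holds, the readback here would suggest the GPU never processed the ring at all, rather than writing a corrupted value.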
Comment 1 Joel 2018-10-30 23:46:04 UTC
Created attachment 142290
firmware kernel (linux-as-bootloader)
Comment 2 Joel 2018-10-30 23:46:35 UTC
I suspect this is the same issue as https://bugs.freedesktop.org/show_bug.cgi?id=108585 which is also a ppc64le system.
Comment 3 Christian König 2018-10-31 08:12:00 UTC
Yeah, closing this one as a duplicate.

*** This bug has been marked as a duplicate of bug 108585 ***

