Summary: | [drm:.r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
---|---
Product: | DRI
Component: | DRM/Radeon
Status: | RESOLVED MOVED
Severity: | critical
Priority: | medium
Version: | unspecified
Hardware: | Other
OS: | All
Reporter: | intermediadc <intermediadc>
Assignee: | Default DRI bug account <dri-devel>
QA Contact: |
CC: | aperez, bjorn.helgaas, chzigotzky, erhard_f, jjcogliati-r1, julien.isorce, lyssdod, rsalvaterra
Whiteboard: |
i915 platform: |
i915 features: |
Attachments: |

Please attach your dmesg output.

Created attachment 129699 [details]
dmesg
Hi Alex,
here is my dmesg. If anything else is needed, no problem.
Thanks
Does your board have a different PCIe slot you could try? I vaguely recall there being problems with some PCIe slots on some PPC Macs. See bug 89886.

Hi Alex, yes, I have a Quad G5 with 4 PCIe slots. Before Xorg 1.6, Radeon HD GPUs all worked fine in the 8x slot (0001:06:00.0); now the only way to make the GPUs work is in the 16x slot (0000:0a:00.0). This is my lspci:

0000:00:0b.0 PCI bridge: Apple Inc. CPC945 PCIe Bridge
0000:0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Whistler LE [Radeon HD 6610M/7610M]
0000:0a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks HDMI Audio [Radeon HD 6500/6600 / 6700M Series]
0001:00:00.0 Host bridge: Apple Inc. U4 HT Bridge
0001:00:01.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:02.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:03.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:04.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:05.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:06.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:07.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:00:08.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:00:09.0 PCI bridge: Apple Inc. Shasta PCI Bridge
0001:01:01.0 Non-VGA unclassified device: Device 0800:0002 (rev 08)
0001:01:07.0 Unassigned class [ff00]: Apple Inc. Shasta Mac I/O
0001:01:0b.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:01:0b.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:01:0b.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0001:03:0c.0 IDE interface: Broadcom K2 SATA
0001:03:0d.0 Unassigned class [ff00]: Apple Inc. Shasta IDE
0001:03:0e.0 FireWire (IEEE 1394): Apple Inc. Shasta Firewire
0001:05:04.0 Ethernet controller: Broadcom Limited NetXtreme BCM5780 Gigabit Ethernet (rev 03)
0001:05:04.1 Ethernet controller: Broadcom Limited NetXtreme BCM5780 Gigabit Ethernet (rev 03)
0001:06:00.0 VGA compatible controller: NVIDIA Corporation G70GL [Quadro FX 4500] (rev a1)

I forgot to write: yes, I have tested all the PCIe slots.

(In reply to intermediadc@hotmail.com from comment #4)
> but before Xorg 1.6 radeonhds gpus was working all and right on the 8x
> (0001:06:00.0), now the only way for made the gpus work is inside the 16x
> (0000:0a:00.0)

So it does work in one of the slots or neither or...? If this is a regression can you bisect? If not, you'll probably have to bring it up with someone more familiar with the PCIe design on those boards.

Hi Alex, it works in the 16x slot, but without video acceleration and with the issue reported at the beginning. About the bisect, I can try. Fedora currently has Xorg 1.19.1-2 installed; I am sure I will need to go down to 1.16. Or would it be better to install the last distro where everything worked and check which Xorg version it has? This is certainly the latest distro where everything was working: http://cdimage.ubuntu.com/lubuntu/releases/trusty/release/

(In reply to intermediadc@hotmail.com from comment #7)

The ring failure is a kernel issue. Please bisect the kernel.

Created attachment 129726 [details]
xorg new
Hi Alex,
I ran more tests after completely changing my PCI setup.
There is no ring issue, but Xorg crashes completely if I select the radeon driver; the only way to get Xorg running is with fbdev. This happens when I use one of the slower PCIe slots for the GPU instead of the 16x slot.
I am doing more testing with my older kernels, where radeon was working correctly, and I am reporting my latest log here.
Created attachment 129727 [details]
dmesg
my last dmesg
Created attachment 129728 [details]
glamor build
Hi Alex, I did a new test, swapping the GPU for a Radeon HD 7750 2GB.
Here there is no ring issue, but when glamor egl is loaded it makes Xorg crash.
:-( As a test I renamed the glamor module, and Xorg works correctly with radeon, but with no acceleration.
I tried building glamor to understand what the issue could be, and I probably found a big-endian problem there. I attached the configure output with the build log. :-/
Hi All,

I have the same problem.

System:

A-EON AmigaOne X1000 PowerPC
Nemo board with a P.A. Semi PA6T-1682M CPU
8GB RAM
ASUS Radeon HD7790 1GB VRAM
ubuntu MATE 17.04 PowerPC (PPC32) with the kernels 4.10-rc8 (DRM 2.49.0) and 4.9.10 (DRM 2.48.0)
Mesa 17.0.0

Cheers,
Christian

(In reply to Christian Zigotzky from comment #12)

Hello everyone,

I also have the same problem.

A-EON Amiga X5000 PowerPC
Varisys Cyrus Board with P5020
8GB RAM
ASUS Radeon R7 265
Ubuntu Mate 17.04 - Mesa 17.0.0
Kernel 4.10-rc8

(In reply to intermediadc@hotmail.com from comment #9)
> no ring issue but xorg crash totally if i select the radeon. the only way

Let's start with the case where the kernel ring tests succeed. What slot was that? The kernel driver needs to work correctly before we try anything else. Now, in the case where the kernel driver loads properly with no errors, what happens next? What do you mean by "xorg crash totally if i select the radeon"? segfault? blank screen? Something else? Can you provide the xorg log when this happens?

(In reply to Christian Zigotzky from comment #12)
(In reply to Sinan Gürkan from comment #13)

Please try to provide more details. It sounds like there may be several issues at play here. What do you mean by the same problem? Ring test errors in the kernel driver? Some sort of Xorg "crash"?

Created attachment 129780 [details]
my xorg log 7xxx

(In reply to Alex Deucher from comment #14)

Hi Alex, the only usable slot now is the first one, the 16x. Before Xorg 1.6, the only way to use a non-Apple GPU was the 8x slot, and there was no issue there. Now Radeon 4xxx cards work in the 16x slot with acceleration but freeze; Radeon 5xxx and 6xxx cards work but hit the CAFEDEAD error, so only fbdev runs. The 7xxx runs without the ring issue, but glamor crashes. I opened another bug for glamor. Michel Dänzer asked for a gdb debug session of Xorg, but I am not able to do it on Fedora: if I run Xorg under gdb I get a black screen with only a cursor at the top of the screen :-/ I will check whether I can run gdb on Xorg under Ubuntu MATE, where I have fence errors on rings 0 and 3 or glitched video. If I delete the glamoregl module, Xorg works and I get a desktop, but with no GPU acceleration. I attached my Xorg log with the 7750. I think the other reporters have the same issue as me, with radeonsi/amdgpu on PPC hardware.

The reason I am sure it is an Xorg issue, or an issue in one of the Xorg components (CAFEDEAD), is that on Lubuntu 14.04.5 I had kernel 4.9 and no issue, but Xorg was 1.6.

(In reply to intermediadc@hotmail.com from comment #16)

X has nothing to do with the kernel ring tests. Those happen long before X gets involved. If the ring tests fail, all acceleration is disabled so you need to sort out the kernel issue before addressing X.

Hmm, thanks for the info. I have an idea, but I need to check older kernel versions. I see that the latest kernel releases build radeon only as a module, not built in (=y). For embedded PPC machines, I saw in the NXP FAQ that, to avoid CAFEDEAD, they suggest building radeon in (=y) and including the radeon firmware inside the kernel. This is not possible on server. I will check and report.

(In reply to Alex Deucher from comment #17)

Hi Alex, can you check this post of mine? Could this be a KMS issue, and CAFEDEAD too?

On my Quad G5 I have the Radeon HD 7750 in the first, 16x PCIe slot:

0000:0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
0000:0a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]

and the Open Firmware NVIDIA card (nouveau) in the 3rd PCIe slot:

0001:06:00.0 VGA compatible controller: NVIDIA Corporation G70GL [Quadro FX 4500] (rev a1)

On my kernel 4.10.8 I don't have a nouveau module installed.
This is my kernel append line:
nouveau.modeset=0 video=nouveaufb:off radeon.modeset=1 video=radeonfb:off pci=realloc radeon.dpm=1 pci=realloc video=offb:ff radeon.dpm=1 nosplash

lsmod shows: drm_kms_helper 2 amdgpu,radeon

I removed /etc/X11/xorg.conf.

If I run Xorg -configure, Xorg produces a nouveau xorg.conf on busid 6:0:0; starting X gives a black screen with a blinking cursor.
So I removed the Xorg driver for nouveau and ran Xorg -configure again; Xorg produces a modesetting xorg.conf on busid 6:0:0, and starting X gives a black screen without a blinking cursor.
So I removed /usr/bin/xorg/modules/driver/modesetting, and then Xorg -configure doesn't find any screen and doesn't produce an xorg.conf.new.

I wrote my own xorg.conf with radeon on busid 10:0:0 and I get glitched graphics. If I remove the glamor module from Xorg, radeon works but without acceleration.

(In reply to intermediadc@hotmail.com from comment #19)
> Hi Alex,
> can you check this my post? can be a kms issue this and cafedead too?

I'm not sure what you are asking. The kernel driver loads first and initializes the hardware. If the ring test fails, the driver disables the 3D engine, which in turn disables acceleration for the rest of the stack.

> i made my xorg.conf
> with radeon busid 10:0:0
> i have a glitched graphic ...

Does physically removing the nvidia card help? Note also that I think X uses decimal for busid directives while lspci uses hex.

> i removed glemour module on xorg i have radeon working but without
> acceleration...

glamor is required for acceleration.

Hi Alex, sorry, my English is not perfect. I will try to explain better. I know glamor is needed for acceleration, but if it is present I get glitched video with fence errors on rings 0 and 3 and a GPU lockup. I can't remove the NVIDIA card; without it the G5 Quad doesn't boot. Another important thing I forgot to write: if I set nouveau.modeset=1 and radeon.modeset=0, Xorg finds the Radeon and not the NVIDIA card, but because modeset on radeon is set to zero I don't get the Xorg desktop. It is exactly the mirror image of the situation I explained in the previous post. Luigi

Hi Alex, I ran many tests and I think the issue in the 8x slot (black screen with only a blinking cursor) is probably caused by the Mesa package. I found this issue on the 7750: when radeonsi is loaded, it causes the fence errors and a black or totally glitched screen.
I will repeat on the 6xxx the same test I did for the 7xxx series, see whether I get a desktop without Mesa, and report here. In the meantime you can check my tests on the 7750 here: https://bugs.freedesktop.org/show_bug.cgi?id=99859#c14 Sorry, my English is not very good, but I am trying to explain as best I can.

Created attachment 129957 [details]
no more cafedead but only fbdev
Hi Alex,
I just finished the test.
With Mesa removed, the issue is still there if the radeon driver is present.
I am using the Radeon card in the 8x PCIe slot.
The news is that in the 8x slot I can use fbdev video, and I see there is no CAFEDEAD in drm.
Thanks to this I can send you this log.
The issue in the 8x slot (the black screen) is probably because of this; I hope it helps.
If more logs are needed, please ask and I will try to help however I can.
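
Alex's note above that X uses decimal in BusID directives while lspci prints hex is easy to trip over, so here is a minimal sketch of the conversion. The function name `lspci_to_xorg_busid` is made up for illustration, and the `bus@domain` form for a non-zero PCI domain is an assumption based on how Xorg parses bus strings; check it against the xorg.conf documentation for your server version.

```python
import re

# lspci prints domain:bus:device.function with every field in hexadecimal.
LSPCI_ADDR = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def lspci_to_xorg_busid(addr: str) -> str:
    """Convert an lspci address to an Xorg-style BusID string (hypothetical helper)."""
    m = LSPCI_ADDR.match(addr.strip())
    if m is None:
        raise ValueError(f"not an lspci-style PCI address: {addr!r}")
    # A missing domain is treated as domain 0.
    domain, bus, device, function = (int(g or "0", 16) for g in m.groups())
    # Xorg expects decimal fields; a non-zero domain is attached to the bus
    # as bus@domain (assumption, see the lead-in).
    bus_field = f"{bus}@{domain}" if domain else str(bus)
    return f"PCI:{bus_field}:{device}:{function}"


print(lspci_to_xorg_busid("0000:0a:00.0"))  # -> PCI:10:0:0
print(lspci_to_xorg_busid("0001:06:00.0"))  # -> PCI:6@1:0:0
```

The first address is the 16x slot from the lspci listing in this thread, which matches the busid 10:0:0 used in the hand-written xorg.conf; the second is the 8x slot, which sits in PCI domain 1.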
Hi Alex, I did the last test. I downgraded the kernel to 4.0.0-rc3, which was a working kernel on the G5 Quad; I used it to write this guide in the past, you can see it here: https://ubuntuforums.org/showthread.php?t=2274612&p=13269146#post13269146 I have the same issue as reported before; nothing changes when swapping the kernel. Ciao, Luigi

Hi Alex, yesterday I fixed it. Googling, I found an old post here on bugs.freedesktop.org: the bug is in Xorg and is well reported at https://bugs.freedesktop.org/show_bug.cgi?id=98524. When the GPU is in one of the other slots rather than the first 16x one, the busid has to be set manually, e.g. 6@1:0:0, or Xorg will fail with a black frozen screen. This issue is not present if the bus is the first one, where it is enough to set 10:0:0. But in the first slot CAFEDEAD appears. Thanks, Luigi

Note that the issue is now present in the 8x slot too, on kernel 4.11-rc3.

(In reply to intermediadc@hotmail.com from comment #26)
> Need to notice the issue now is present on 8x slot too on kernel 4.11 rc3

Can you bisect? That might give us a clue as to the overall root cause (whether it's a driver or platform issue).

I can, but I don't know how; I'm a newbie at these things. But for sure it was not present in 4.11-rc1. I suggest waiting for 4.11-rc4, because on the mailing list I saw there are many patches for 4.11-rc3. If the issue is still there I will bisect, but please let me know how to do it.

(In reply to intermediadc@hotmail.com from comment #28)

I think it would be good to bisect it now. Google for "linux kernel git bisect".

The same problem on my machine:

# dmesg | grep -i drm
[ 11.753729] [drm] radeon kernel modesetting enabled.
[ 11.760535] [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x1787:0x2291 0x00).
[ 11.767360] [drm] register mmio base: 0x80140000
[ 11.770683] [drm] register mmio size: 131072
[ 11.899433] [drm] GPU not posted. posting now...
[ 11.903067] [drm] Detected VRAM RAM=512M, BAR=256M
[ 11.903068] [drm] RAM width 64bits DDR
[ 11.934284] [drm] radeon: 512M of VRAM memory ready
[ 11.936547] [drm] radeon: 1024M of GTT memory ready.
[ 11.938855] [drm] Loading CEDAR Microcode
[ 13.060828] [drm] Internal thermal controller with fan control
[ 13.075872] [drm] radeon: dpm initialized
[ 13.099787] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 13.151601] [drm] PCIE GART of 1024M enabled (table at 0x000000000014C000).
[ 13.166212] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 13.166214] [drm] Driver supports precise vblank timestamp query.
[ 13.166323] [drm] radeon: irq initialized.
[ 13.377755] [drm:0xd000000001d22d04] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)

# inxi -b
System: Host: T801 Kernel: 4.11.1-gentoo ppc64 (32 bit) Console: tty 0 Distro: Gentoo Base System release 2.3
Machine: No /sys/class/dmi; using dmidecode: dmidecode is not installed.
CPU: Dual core PPC970MP altivec supported (-MCP-) speed/max: 2300.000000MHz/2300 MHz
Graphics: Card-1: NVIDIA NV43 [GeForce 6600] Card-2: Advanced Micro Devices [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series] Display Server: X.org 1.19.3 drivers: fbdev,ati,radeon,nouveau tty size: 211x52 Advanced Data: N/A for root out of X

But I can confirm the ring test succeeds on kernel 4.10.16. I will try to bisect starting from the 4.11-rcs.

Unfortunately my bisect efforts came to a halt as the current kernel won't boot. The early OpenFirmware boot console shows "Kernel panic - not syncing: Couldn't allocate pmd pagetable caches". The G5 freezes, and I am not able to log in via sshd.

Created attachment 131410 [details]
incomplete bisect.log
Created attachment 131411 [details]
bisect kernel .config
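
Each bisect step means booting a kernel and deciding from dmesg whether the ring test failed. A small sketch of that check follows; the pattern is derived only from the radeon error lines quoted in this thread, so other kernel versions or drivers may word the message differently, and `find_ring_failures` is a made-up helper name. An empty result only says no failure line was found, not that the driver loaded and passed.

```python
import re

# Matches e.g. "ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)".
RING_FAIL = re.compile(
    r"ring (\d+) test failed \(scratch\((0x[0-9A-Fa-f]+)\)=(0x[0-9A-Fa-f]+)\)"
)


def find_ring_failures(dmesg_text: str):
    """Return (ring, scratch_register, value_read) for every failed ring test."""
    return [
        (int(ring), int(reg, 16), int(value, 16))
        for ring, reg, value in RING_FAIL.findall(dmesg_text)
    ]


# Line copied from the dmesg output posted above.
sample = (
    "[   13.377755] [drm:0xd000000001d22d04] *ERROR* radeon: "
    "ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)"
)
failures = find_ring_failures(sample)
print("bad" if failures else "no ring failure found", failures)
```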
For any commits that you can't test, run "git bisect skip". Eventually, it will either identify the first bad commit or specify the minimum set of candidates.

I also have this problem on a Sam460ex (AMCC 460EX) machine:

[drm] radeon: irq initialized.
[drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
radeon 0001:81:00.0: disabling GPU acceleration

We have discussed it before.

Hi, I can confirm the CAFEDEAD issue is present with AMDGPU too, on a QorIQ PPC64 machine, tested with a Radeon R7 250 2GB. On this machine the issue is not present for r600. Tested on 4.11.1:

[ 12.980953] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[ 12.982746] [drm:.gfx_v6_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: ib test failed (scratch(0x2140)=0xCAFEDEAD)
[ 12.982890] [drm:.amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on GFX ring (-22).
[ 12.982999] [drm:.amdgpu_device_init [amdgpu]] *ERROR* ib ring test failed (-22).
[ 13.296975] [drm] Initialized amdgpu 3.10.0 20150101 for 0000:01:00.0 on minor 0
[ 34.022373] [drm] xxxx: dce_v6_0_afmt_setmode ----no impl !!!!!!!!
[ 569.726065] [drm] IH: HPD1
[ 570.299355] [drm] IH: HPD1

QorIQ PPC64, Fedora Server 25 :-( :-(

OK, here we are, at last... The final git bisect output:

There are only 'skip'ped commits left to test.
The first bad commit could be any of: 1a55761392b8a4703039614f3ef6275f88dad4c3 3bb0356bb1b5a2c4434b4a3267df6dd9927ca5bf b2e6d3055d5545b97533d4e8376fa848639d9951 d9520971589b008faa8df31aa2bb3d1e4f1c4534 4c2ae32d4c6ccd63819f36702f8455a5b08eaca8 5dcda98a3b7cfc0025ca49a700765ac9ba78a6a7 8b9c156262f9ebaa88736b93263c1af5ae5fa43e e34e38bf891bd7bf7c9305ff1d2e1b29a77dd159 0dc49eb2a25a6b48e235803eeb2b52677f577ca5 3ec2574e31b5aebeeed52ea784bb6d7960788202 68094b4b8d72360f9a90804c203e4292c6662fdd e2dc4f225b765c2a12e5eaa7bc09994e67d13741 dda718926c0fe091fd5c7ed3f88c9542d8bf67cf 56195e9d1d68a9c201587cb55b7fdf6d43fb31b6 906c142634def5dd3dda533d5d20bd3c1c19d1fb 63ab93f021ecd815d848c3e9d3e326fa9628e5a9 7a2b3f024b8b724257d82f76690e68c69d797efd bcea623c65d5ad3b2f4e51908cffc9423175a10b a0560209f10cb55fbfcdd11ef85cfb444f8bea89 b98a7f7509ed7a031ed750be11b72b7ca1c95513 feb85d9b1c47ea8dbcae559ff127b433fdb245b7 314fc854f50317931fb4dfaab431695ab886e8de 5f334db665173facf2213854408bb5fa2445d0b3 81efbaddd687f1b478c15665716fd545e2e4401e f08bf55e690ebdac28402955ef953e11ff478668 442ec4c04d1235f8c664a74004dae54a7a574d18 af3a2ab5daa0b8bf188a039c122a670cdc8b9544 e4e7d59704d485f272061cea9057798dda3cfd99 40f67fb2c384fe12741aa35010d62bfe8c98286c c3cf2c61ddc1410424da0ea87717edf16fc296c5 19ce01cc8cbc314d73db9755715a8f6e8ad59a7f 948b7620c15411444167a62cfc14cdd4b0e44682 9d534265bdcd47659c9fd080fff4c005512c4770 e7cd7ef58e1fedb09b16720919869a81d7a2b867 34f80c7ddfffe262bf04fb03e198e64de4cec7fc 3674cc49da9a8fc55bf1dec2ab03a66c77f2dcdf 8f417ca9ebfa8701a41db64f5ed9cbb01b8e4219 9bcf0a6fdc5062e451cd6f1ab39045e142a5938f 862290f9e23c39051e59bf12ce65707a8ec8b911 b90dc392212d1153a12eea15cbc6eae352a3c989 cf0adb8e281b69801fb8faef18c14443d9d41d3c 11a61a8602812c024d8c404193ce1654ee3b8f08 699c4cec238731a4c466f73fe6e9e45ab6f49a41 9e38637a0832f15552ae130622808831f515107c ad8ec41afa98615a4154eee0121bcf8276695327 1f6c4501c667a6bf25996f04c9cae49a88d90d01 3278478084747c02725ba804d672235f2ba56bbc 
47feb41888bc82ba5c9268c344775adc483d6de7 5800790a925b0aefb621ae3da86668c3a2867750 a142f4d3e5c54db5e942fa6ee5f3dc0e8c83207b cd3e2eb8905d14fe28a2fc75362b8ecec16f0fb6 2bdd584a7555d5ac3341b61e751a6ac807a9814a 3adfb572f2978a980b250a9e1a56f84f3a031001 3fb5561879d71b5b80ddb48b3e7e5fa18c696d2a 92004a064875e45f0ceec72f6962e258e59f0682 aeda9adebab8b5bdd90576e3065a1f3f948279ad ebe85a44aad47bb5f08c5cbd939eb5e40956c73b f1d722b607d610b66785f7f00d2e1d457260647c fed678145d02d08d75825d9f0854fad93cffd1a0 0b351c986a5ffedb502632c1b27690c9730d79c4 2681c0e7ff70a6d41a81527f95f2edabafea4ace 4fe0395550aeb6709ea5332f46de3644aef7d328 60db3a4d8cc9073cf56264785197ba75ee1caca4 72f2ff0deb870145a5a2d24cd75b4f9936159a62 ab5fe4f4d31ee27df9b8d7720da710d955d55737 f1f0366dd6be9624f7d355b72cc909ab821eb4c0 21b7245034aa83c80fb8d0011e18d597a62a91f1 33be632b8443b6ac74aa293504f430604fb9abeb 42d87e3ffbd53c4514bccc0f24e40d6231a567f5 4788316f743539076712d2b80b6cd289458fe2be 602d38bc65aa2926d1ecd290a348e87aa8c21290 7faebda21d573a6889bab1e0100ed4092a5a4716 b5a0a9b59c8185aebcd9a717e2e6258b58c72c06 cdcb33f9824429a926b971bf041a6cec238f91ff ce709f86501a013e941e9986cb072eae375ddf3e e3538f402453beca2d83002910cfe13b43d8a95b e75377404726be171d66c154f8ea1e6cf840811d ff1677e231651205e7e19770a677057dea05cb70 1acf8bca9cdb2443a8707ff0afc3aadbfb5669f4 26b54be568ad8611ebc62f02f9a5d5328e7b3392 4a9b0933bdfcd85da840284bf5a0eb17b654b9c2 4d4836ab70d3c59bc934d244f0cccdd035c1ead8 53762ba810398c11efaf65f9a45d992125e86dcf 5b0948dfe138f0837699f46f5877f4f81c252dac 7da7a1a66e7700903bbc9ed09256fc95da34d43d 8267b07526cabe2e2afc834a138ece8644af87ed 8ed81ec82a8c57c3a6ad69b4c4d3e4801163c256 950bf6388bc22c2749b8b66c501df1462639d6bd a2ec1996098c7da0593a0981190316025301eab1 a71280722eeba8f1afa51ad6656028dcb96e110b abdbf4d635a9a8c956bb9757a9d4f08c2abe1f97 afc9595ea4770f0157ae06fb3acedff703e169b6 b2103ccbb67e3ef0f7a75d21c989f9614ddbcaca ca0ef7fecc881a7e8d11db2d8852ab580cd29e03 d6da7d90fad8e34afdebeadbb08484ea4c98a792 
e8e8dd6d20fa55ff974460ad742fcbf35b301062 013dd3d5e1835c2cfa9c824e61465b61509afa54 0fc1223f0e77a748f7040562faaa7027f7db71ca 1ded56df3247d358390ae6dc09ccee620262ac5f 25e77388e1ab63e11e21d94a994eca227472aeed 2fd260f03b6a365bad48522f3948463928f91c2c 4e0a90b381bd8bddf1644591dc585cf4c6ea652e 4f69bd16df1a38c32d5d207a96d1d86df4808f87 60e2e2fbafdd1285ae1b4ad39ded41603e0c74d0 63d182abd71cb47cee4adb8dd2afd71d987794d5 656795c8873f93956a031d5db6fb08cf6168ebb0 6dc2c04fd9868a5ad00b402935021d6f3ff27b17 70bc1b684b49781c882fec44023b37b7b45fe359 7184f5b451cf3dc61de79091d235b5d2bba2782d 792e0a6814b63b120c6cfddf79a309046c6e840a 87b336d003d47876e376d943be3c9d35152f3b86 8ca6e0a75a5145458e8a680edf2394375f2129da 977509f7c5c6fb992ffcdf4291051af343b91645 c4d052ce970ea98e9e1cc72461ba3b7a25397657 c5c4d3a3f4a8c830cb514eb469a1025df2df0379 caf3f562e1161a86bd48a4c4c33af89d3693c658 d9bf28e2650fe3eeefed7e34841aea07d10c6543 dadf17334f3820e2f8c561011706b6fb99bf9860 e5c3b3e9f012b8862753a04f03c6e27344332718 e94888d23736cec51ba851f6e798d0eeb9ef5f41 ec6bd78a09d9967c4fcec53e7fabfaabd4f0e367 60e8d3e11645a1b9c4197d9786df3894332c1685
We cannot bisect more!

Created attachment 131536 [details]
bisect.log
Created attachment 131537 [details]
bisect kernel .config
That's quite a lot of candidates still. To narrow it down further, you can find out which commit fixed the boot failure, and then manually apply that for testing the skipped commits.

What would be the best way to do this? Start a 2nd bisect in search of the commit which causes/fixes the boot failure and exclude/include it in my 1st bisect? Sorry, I don't have that much experience in bisecting the kernel.

Finally I was able to finish the bisect with some help from slyfox from #gentoo-powerpc. At last... I made a second bisect in search of the fix for the "Couldn't allocate pmd pagetable caches" error, which I found in commit bf5ca68dd2eef59a936969e802d811bdac4709c2 (powerpc: Fix pgtable pmd cache init). I applied this commit via git cherry-pick --no-commit to my first bisect and was able to continue without further skips. So, the final candidate for the "ring 0" error is:

commit 60db3a4d8cc9073cf56264785197ba75ee1caca4 (refs/bisect/bad)
Author: Sinan Kaya <okaya@codeaurora.org>
PCI: Enable PCIe Extended Tags if supported

Created attachment 131684 [details]
ring0 bisect.log
Created attachment 131685 [details]
pgtable fix bisect.log
Created attachment 131686 [details]
bisect kernel .config
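
To see why the first bisect ended with a long "could be any of" list while the second attempt pinned down a single commit, here is a toy model of bisecting a linear history in which some commits cannot be tested. It is only a sketch: real git bisect works on a commit graph and chooses its test points differently, and the commit numbers below are invented for illustration.

```python
def bisect(n_commits, is_bad, can_test):
    """Return the first-bad candidates in a linear history 0..n_commits-1.

    Commit 0 is assumed good and commit n_commits-1 is assumed bad.
    """
    good, bad = 0, n_commits - 1
    while True:
        # Commits strictly between the known-good and known-bad ends
        # that can actually be built, booted and checked.
        testable = [c for c in range(good + 1, bad) if can_test(c)]
        if not testable:
            break
        mid = (good + bad) // 2
        # Untestable commits are skipped: take the testable one nearest the midpoint.
        pick = min(testable, key=lambda c: abs(c - mid))
        if is_bad(pick):
            bad = pick
        else:
            good = pick
    # Everything after the last good commit, up to the first known-bad one.
    return list(range(good + 1, bad + 1))


FIRST_BAD = 60                      # invented: commit that introduces the ring failure
BOOT_BROKEN = range(40, 81)         # invented: commits that panic at boot

def is_bad(commit):
    return commit >= FIRST_BAD

# First attempt: commits in the boot-broken range can only be skipped.
skipped_run = bisect(100, is_bad, lambda c: c not in BOOT_BROKEN)
# Second attempt: the boot fix is applied on top of every step, so all commits are testable.
fixed_run = bisect(100, is_bad, lambda c: True)

print(len(skipped_run), "candidates with skips;", fixed_run, "with the boot fix applied")
```

In the model, as in the thread, the skipped range swallows the real culprit, and the candidate set cannot shrink below that range until the boot failure is worked around.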
Hmm, I wonder if that commit triggered any other issues? Bjorn?

Hi Michel, it could be. I have many kernel traces on the QorIQ machine with the latest kernels, but from what I see it looks like a qman issue. I don't know if qman has anything to do with PCI/PCIe.

Any news on this? Just tried kernel 4.12.2, which is still affected.

This is still happening with kernel 4.13-rc4, with both a Polaris 11 based RX460 and a Cape Verde PRO. In my case, it's on a BE ppc64 machine (e5500-based ppc64 SoC, Freescale-manufactured).

Looks like a dupe of #95015.

Not necessarily. This bug here was introduced somewhere in the 4.11 kernel development (see bisect); 4.12.x is still affected. I don't have this problem with kernel 4.10.x and 4.9.x. The other bug, #95015, is about the 4.4.x kernel, though the error message is the same. So maybe the two have a common cause, but that's only guesswork on my side.

Just for reference, the ring 0 test failed error comes from the line: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/r600.c#n2848

Basically, the r600 driver writes 0xCAFEDEAD into a scratch register, then uses a ring write to try to write 0xDEADBEEF into the scratch register, waits, and then reads the scratch register again to see what is in it. If it is not 0xDEADBEEF, the test fails and hardware acceleration is turned off.

If 60db3a4d8cc9073cf56264785197ba75ee1caca4 caused it to start (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60db3a4d8cc9073cf56264785197ba75ee1caca4), then maybe the hardware doesn't support 256 concurrent requests?

Joshua's comment #54 made me curious whether it's a card-specific issue, so I swapped some cards around my boxes. I run both machines on kernel 4.13.5.
on my PowerMac G5 11,2: Radeon HD 5450: hits the bug Radeon HD 6570: hits the bug Radeon X600: WORKS on Athlon X4 845: Radeon HD 5450: WORKS Radeon HD 6570: WORKS Radeon X600: WORKS The r380 Radeon X600 passes the ring test whereas the r600 cards fail: # dmesg | grep -i drm [ 9.523830] [drm] radeon kernel modesetting enabled. [ 9.525292] [drm] initializing kernel modesetting (RV380 0x1002:0x5B62 0x1002:0x0F02 0x00). [ 9.595072] [drm] GPU not posted. posting now... [ 9.740409] [drm] Generation 2 PCI interface, using max accessible memory [ 9.740442] [drm] Detected VRAM RAM=128M, BAR=128M [ 9.740445] [drm] RAM width 64bits DDR [ 9.740785] [drm] radeon: 128M of VRAM memory ready [ 9.740789] [drm] radeon: 512M of GTT memory ready. [ 9.740810] [drm] GART: num cpu pages 131072, num gpu pages 131072 [ 9.742591] [drm] radeon: 1 quad pipes, 1 Z pipes initialized [ 9.763583] [drm] PCIE GART of 512M enabled (table at 0x0000000088040000). [ 9.763705] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 9.763708] [drm] Driver supports precise vblank timestamp query. [ 9.763818] [drm] radeon: irq initialized. 
[ 9.763825] [drm] Loading R300 Microcode [ 10.233486] [drm] radeon: ring at 0x0000000068001000 [ 10.233545] [drm] ring test succeeded in 0 usecs [ 10.234665] [drm] ib test succeeded in 0 usecs [ 10.235279] [drm] Radeon Display Connectors [ 10.235284] [drm] Connector 0: [ 10.235287] [drm] DVI-I-1 [ 10.235289] [drm] HPD1 [ 10.235293] [drm] DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60 [ 10.235296] [drm] Encoders: [ 10.235299] [drm] CRT1: INTERNAL_DAC1 [ 10.235302] [drm] DFP1: INTERNAL_TMDS1 [ 10.308511] [drm] fb mappable at 0x880C0000 [ 10.308521] [drm] vram apper at 0x88000000 [ 10.308525] [drm] size 8294400 [ 10.308527] [drm] fb depth is 24 [ 10.308530] [drm] pitch is 7680 [ 10.372575] radeon 0001:06:00.0: fb0: radeondrmfb frame buffer device [ 10.372649] [drm] Initialized radeon 2.50.0 20080528 for 0001:06:00.0 on minor 0 So I conclued this issue is produced by some interaction of the r600 DRM with PowerPC-specific or Mac-PCIe-specific things? Further on this bug only manifests on the ATI/AMD side. NVIDIA cards do not seem to have a problem with this new PCIe feature activated in commit 60db3a4d8cc9073cf56264785197ba75ee1caca4. 
On the same system Xorg is running with an EXA accelerated desktop on a Nvidia GeForce 6600 (NV43), card detection is fine: [ 10.818686] nouveau 0000:0a:00.0: DRM: VRAM: 252 MiB [ 10.818691] nouveau 0000:0a:00.0: DRM: GART: 512 MiB [ 10.818724] nouveau 0000:0a:00.0: DRM: TMDS table version 1.1 [ 10.818729] nouveau 0000:0a:00.0: DRM: DCB version 3.0 [ 10.818735] nouveau 0000:0a:00.0: DRM: DCB outp 00: 01000100 00000028 [ 10.818739] nouveau 0000:0a:00.0: DRM: DCB outp 01: 03000102 00000000 [ 10.818744] nouveau 0000:0a:00.0: DRM: DCB outp 02: 04011210 00000028 [ 10.818749] nouveau 0000:0a:00.0: DRM: DCB outp 03: 02111212 02000100 [ 10.818753] nouveau 0000:0a:00.0: DRM: DCB outp 04: 02011211 0020c070 [ 10.818758] nouveau 0000:0a:00.0: DRM: DCB conn 00: 1030 [ 10.818762] nouveau 0000:0a:00.0: DRM: DCB conn 01: 2130 [ 10.826071] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 10.826081] [drm] Driver supports precise vblank timestamp query. [ 10.829953] nouveau 0000:0a:00.0: DRM: MM: using M2MF for buffer copies [ 10.830013] nouveau 0000:0a:00.0: DRM: Setting dpms mode 3 on TV encoder (output 4) [ 10.937421] nouveau 0000:0a:00.0: DRM: allocated 1920x1080 fb: 0x9000, bo c00000045949e800 [ 11.069093] nouveau 0000:0a:00.0: DRM: 0x14C5: Parsing digital output script table [ 11.229337] [drm] Initialized nouveau 1.3.1 20120801 for 0000:0a:00.0 on minor 0 [ 52.472] (--) Depth 24 pixmap format is 32 bpp [ 52.478] (II) NOUVEAU(0): Channel setup complete. 
[ 52.490] (II) NOUVEAU(0): Hardware support for Present enabled
[ 52.490] (II) NOUVEAU(0): [DRI2] Setup complete
[ 52.490] (II) NOUVEAU(0): [DRI2] DRI driver: nouveau
[ 52.490] (II) NOUVEAU(0): [DRI2] VDPAU driver: nouveau
[ 52.490] (II) Loading sub module "exa"
[ 52.490] (II) LoadModule: "exa"
[ 52.491] (II) Loading /usr/lib64/xorg/modules/libexa.so
[ 52.493] (II) Module exa: vendor="X.Org Foundation"
[ 52.493] compiled for 1.19.4, module version = 2.6.0
[ 52.493] ABI class: X.Org Video Driver, version 23.0
[ 52.493] (II) EXA(0): Driver allocated offscreen pixmaps
[ 52.493] (II) EXA(0): Driver registered support for the following operations:
[ 52.493] (II) Solid
[ 52.493] (II) Copy
[ 52.493] (II) Composite (RENDER acceleration)
[ 52.493] (II) UploadToScreen
[ 52.493] (II) DownloadFromScreen
[ 52.493] (==) NOUVEAU(0): Backing store enabled
[ 52.493] (==) NOUVEAU(0): Silken mouse enabled
[ 52.494] (II) NOUVEAU(0): [XvMC] Associated with NV40 texture adapter.
[ 52.494] (II) NOUVEAU(0): [XvMC] Extension initialized.
[ 52.494] (==) NOUVEAU(0): DPMS enabled
[ 52.495] (II) NOUVEAU(0): RandR 1.2 enabled, ignore the following RandR disabled message.
[ 52.497] (--) RandR disabled

Had some time to test kernels 4.14.4 and 4.15-rc2. Sadly nothing new concerning this issue.

(In reply to erhard_f from comment #57)
> Had some time to test kernels 4.14.4 and 4.15-rc2. Sadly nothing new
> concerning this issue.

Yes, nothing has changed. I too still have no GPU acceleration on r600 ... but in the PPC world it looks like the emb are not affected.

Tried kernel 4.17.2 and was surprised to see that this bug does not seem to bother my G5 11,2 any longer! Now my HD 6450 passes all ring tests and I can run X11 with glamor happily again. \o/ Hopefully this bug is gone for good. Salute to the hard-working kernel/AMD developer who made this possible!
;)

(In reply to intermediadc@hotmail.com from comment #58)
> (In reply to erhard_f from comment #57)
> > Had some time to test kernels 4.14.4 and 4.15-rc2. Sadly nothing new
> > concerning this issue.
>
> yes nothing change , mee to continue dont have gpu accelerations on r600
> .... but on ppc world look like the emb are not effected .

I wonder if you can still reproduce this bug on current 4.17.x or 4.18.x kernels? I've not had issues ever since. IMHO this bug could be closed.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/773.
Created attachment 129698 [details]
xorg log

[ 1.837108] [drm:.r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
[ 1.837122] radeon 0000:0a:00.0: disabling GPU acceleration

Since Xorg 1.6, the PowerMac G5 no longer has any acceleration with Radeon 5xxx and 6xxx GPUs because of the ring issue. Only the 4xxx works, but it freezes when 3D programs start running.

This issue is still present today on Fedora 25 PPC64, and also on Debian (ppc32) and Ubuntu (ppc32).

X.Org X Server 1.19.1
Release Date: 2017-01-11
X Protocol Version 11, Revision 0

I have tried all the kernel configs without any positive result. I have also tried building the GPU firmware into the kernel, with the same result. Many others are reporting the same issue.

My Mesa version is 17.0.0.

Please help!
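
For anyone checking whether their own machine hits the same failure, here is a minimal sketch that scans saved dmesg text for the ring test error quoted in this report. The function name `find_ring_failures` and the regular expression are illustrative assumptions modelled only on the message format shown above; other kernel versions may word the message differently. As far as I understand the radeon ring test, the driver seeds the scratch register with 0xCAFEDEAD and expects the GPU to overwrite it, so reading that value back suggests the ring never executed the test packet.

```python
import re

# Assumption: the message format matches the error line quoted in this
# report; it is not taken from the kernel source.
RING_FAIL = re.compile(
    r"ring (\d+) test failed \(scratch\((0x[0-9A-Fa-f]+)\)=(0x[0-9A-Fa-f]+)\)"
)


def find_ring_failures(dmesg_text):
    """Return (ring, scratch_register, value) for each failed ring test."""
    return [
        (int(ring), int(reg, 16), int(val, 16))
        for ring, reg, val in RING_FAIL.findall(dmesg_text)
    ]


# Sample input copied from the error line in this report.
sample = (
    "[ 1.837108] [drm:.r600_ring_test [radeon]] *ERROR* radeon: "
    "ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)\n"
    "[ 1.837122] radeon 0000:0a:00.0: disabling GPU acceleration\n"
)

for ring, reg, val in find_ring_failures(sample):
    print(f"ring {ring}: scratch 0x{reg:04X} = 0x{val:08X}")
```

To use it on a real system, save the kernel log to a file first (for example with `dmesg > dmesg.txt`) and pass the file contents to the function. An empty result means no line matching this pattern was found.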