Bug 101038

Summary: Radeon 9650 (rv350): gpu reset/lockup (dri3, ppc)
Product: xorg
Reporter: erhard_f
Component: Driver/Radeon
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED MOVED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
Version: git
Hardware: PowerPC
OS: Linux (All)
Attachments:
Description        Flags
dmesg output       none
Xorg.log           none
glxinfo            none
Xorg.log dri2      none
Xorg.log glamor    none

Description erhard_f 2017-05-14 10:34:41 UTC
Created attachment 131346 [details]
dmesg output

I did a git build yesterday to try out the now-enabled DRI3 on my PPC hardware. Xorg starts successfully with DRI3 enabled, but after some desktop usage I get GPU resets or total freezes (machine inaccessible via sshd).

[  295.760807] radeon 0000:f0:10.0: ring 0 stalled for more than 10133msec
[  295.760818] radeon 0000:f0:10.0: GPU lockup (current fence id 0x00000000000016d5 last fence id 0x00000000000016df on ring 0)
[  295.912956] Failed to wait GUI idle while programming pipes. Bad things might happen.
[  295.914021] radeon 0000:f0:10.0: Saved 347 dwords of commands on ring 0.
[  295.914035] radeon 0000:f0:10.0: (r300_asic_reset:425) RBBM_STATUS=0x80010140
[  296.414026] radeon 0000:f0:10.0: (r300_asic_reset:444) RBBM_STATUS=0x80010140
[  296.910023] radeon 0000:f0:10.0: (r300_asic_reset:456) RBBM_STATUS=0x00000140
[  296.910054] radeon 0000:f0:10.0: GPU reset succeed
[  296.910058] radeon 0000:f0:10.0: GPU reset succeeded, trying to resume
[  296.910098] [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
[  296.910105] radeon 0000:f0:10.0: WB disabled
[  296.910111] radeon 0000:f0:10.0: fence driver on ring 0 use gpu addr 0x0000000000000000 and cpu addr 0xd0000000018e9000
[  296.910203] [drm] radeon: ring at 0x0000000000001000
[  296.910274] [drm] ring test succeeded in 0 usecs
[  296.963794] [drm] ib test succeeded in 0 usecs

Very similar to #94877, but since the card is different (r300 instead of r100), this is DRI3 instead of DRI2, and I am on a current kernel and mesa-git, I decided to file a new bug.
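
For comparing lockups across attachments, the "GPU lockup" line in the dmesg excerpt above can be reduced to a single number: how many fences were still outstanding when the ring stalled. The following is a minimal sketch, not part of the original report. It assumes that "current fence id" is the last fence the GPU signalled and "last fence id" is the last one submitted; the function name pending_fences is made up for illustration.

```python
import re

# Matches the radeon "GPU lockup" dmesg line quoted in this report.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def pending_fences(dmesg_text):
    """Return a list of (ring, outstanding_fences) for each lockup line.

    Assumption: 'current' is the last signalled fence and 'last' is the
    last submitted fence, so their difference is the backlog at the stall.
    """
    results = []
    for match in LOCKUP_RE.finditer(dmesg_text):
        current = int(match.group(1), 16)
        last = int(match.group(2), 16)
        ring = int(match.group(3))
        results.append((ring, last - current))
    return results


# The lockup line from the description above.
sample = (
    "[  295.760818] radeon 0000:f0:10.0: GPU lockup "
    "(current fence id 0x00000000000016d5 "
    "last fence id 0x00000000000016df on ring 0)"
)
print(pending_fences(sample))  # [(0, 10)]
```

Under that assumption, 10 fences were pending on ring 0 at the time of this lockup.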
Comment 1 erhard_f 2017-05-14 10:35:18 UTC
Created attachment 131347 [details]
Xorg.log
Comment 2 erhard_f 2017-05-14 10:36:02 UTC
Created attachment 131348 [details]
glxinfo
Comment 3 Michel Dänzer 2017-05-15 02:45:10 UTC
Beware that DRI3 can't work correctly in all cases with EXA; that's why it's disabled by default. If this only happens with DRI3, I recommend sticking to DRI2.

Which desktop environment are you using?
Comment 4 erhard_f 2017-05-15 18:35:58 UTC
What a pity, my hope was that DRI3 would work more reliably than DRI2 on PPC hardware.

I am using MATE 1.18.0. Freezes with DRI2+exa happen almost instantly when reaching the desktop:

[ 1275.313543] radeon 0000:f0:10.0: ring 0 stalled for more than 10133msec
[ 1275.313554] radeon 0000:f0:10.0: GPU lockup (current fence id 0x0000000000000432 last fence id 0x0000000000000433 on ring 0)
[ 1275.463941] Failed to wait GUI idle while programming pipes. Bad things might happen.
[ 1275.464995] radeon 0000:f0:10.0: Saved 59 dwords of commands on ring 0.
[ 1275.465014] radeon 0000:f0:10.0: (r300_asic_reset:425) RBBM_STATUS=0x80010140
[ 1275.681771] usb 1-1: USB disconnect, device number 5
[ 1275.681779] usb 1-1.3: USB disconnect, device number 6
[ 1275.965005] radeon 0000:f0:10.0: (r300_asic_reset:444) RBBM_STATUS=0x80010140
[ 1276.461002] radeon 0000:f0:10.0: (r300_asic_reset:456) RBBM_STATUS=0x00000140
[ 1276.461032] radeon 0000:f0:10.0: GPU reset succeed
[ 1276.461037] radeon 0000:f0:10.0: GPU reset succeeded, trying to resume
[ 1276.461068] [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
[ 1276.461074] radeon 0000:f0:10.0: WB disabled
[ 1276.461080] radeon 0000:f0:10.0: fence driver on ring 0 use gpu addr 0x0000000000000000 and cpu addr 0xd000000001869000
[ 1276.461172] [drm] radeon: ring at 0x0000000000001000
[ 1276.461241] [drm] ring test succeeded in 0 usecs
[ 1276.512215] usb 1-1.4: USB disconnect, device number 7
[ 1276.515758] [drm] ib test succeeded in 0 usecs

DRI3+glamor did not work out either:

[   143.438] (**) RADEON(0): DRI3 enabled
[   143.438] (==) RADEON(0): Backing store enabled
[   143.438] (II) RADEON(0): Direct rendering enabled
[   143.439] (WW) glamor requires at least 128 instructions (64 reported)
[   143.439] (EE) RADEON(0): Failed to initialize glamor.
[   143.439] (EE) RADEON(0): Acceleration initialization failed
[   143.439] (II) RADEON(0): Acceleration disabled
Comment 5 erhard_f 2017-05-15 18:36:53 UTC
Created attachment 131362 [details]
Xorg.log dri2
Comment 6 erhard_f 2017-05-15 18:37:39 UTC
Created attachment 131363 [details]
Xorg.log glamor
Comment 7 Michel Dänzer 2017-05-16 02:16:34 UTC
The lockups are probably not directly related to DRI2/3 then. Does

 radeon.agpmode=-1

on the kernel command line help?
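
The radeon module describes agpmode=-1 as selecting PCI mode instead of AGP. To confirm that the parameter actually reached the kernel after a reboot, the command line can be checked with a small script. This is only a sketch: the helper name radeon_params and the example command line are hypothetical, and on a running system the real string would be read from /proc/cmdline.

```python
def radeon_params(cmdline):
    """Collect radeon.<name>=<value> tokens from a kernel command line."""
    params = {}
    prefix = "radeon."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


# Hypothetical command line, for illustration only.
example = "root=/dev/sda3 ro radeon.agpmode=-1"
print(radeon_params(example))  # {'agpmode': '-1'}
```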
Comment 8 erhard_f 2017-05-16 05:41:02 UTC
This does help: the lockups occur much less frequently with radeon.agpmode=-1. But sooner or later they do occur; the only successful workaround is to turn off acceleration via Option "Accel" "off".
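
For reference, a minimal sketch of where that option would go in xorg.conf, assuming a single Device section for the card. The Identifier string is an arbitrary label chosen for illustration and is not taken from the attached logs.

```
Section "Device"
    # Identifier is a free-form label (hypothetical here)
    Identifier "Radeon 9650"
    Driver     "radeon"
    # Disables all hardware acceleration in the radeon driver
    Option     "Accel" "off"
EndSection
```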
Comment 9 erhard_f 2018-02-26 00:08:50 UTC
Interesting finding: after I rebuilt my Gentoo on the G5 with gcc 7.3.0, the freezes are gone and only GPU lockups happen. These lockups occur more rarely and are harder to provoke, and the machine becomes usable again after the resets. EXA + 3D acceleration is actually usable now and does not freeze the machine, albeit with other glitches (e.g. wrong colours and some sort of graphics bugs in Extreme Tux Racer).

I would not have expected that from a gcc upgrade!

The system now runs on: kernel 4.15.5, xorg-server 1.19.5, xf86-video-ati 7.9.0, mesa 17.2.8

Another similar setup on this G5 but built with gcc 6.4.0 does not improve the situation over the original bug report. So I suppose it's the compiler after all. Seems gcc 7 is doing well on ppc.
Comment 10 Martin Peres 2019-11-19 08:00:24 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/issues/172.
