Bug 103913 - DRM/Radeon GPU hang
Summary: DRM/Radeon GPU hang
Status: RESOLVED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-26 11:15 UTC by roger
Modified: 2017-12-06 09:01 UTC
0 users




Description roger 2017-11-26 11:15:42 UTC
Disconnecting and reconnecting HDMI/DVI connectors, or a loose connection, can cause the driver to stall in a loop with the following messages being logged approximately every 500ms in syslog.

Nov 25 16:13:30 dragon kernel: [363196.855813] radeon 0000:02:00.0: ring 0 stalled for more than 1710608msec
Nov 25 16:13:30 dragon kernel: [363196.855818] radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000284cb5 last fence id 0x0000000000284cb6 on ring 0)

This causes the screen to blank. A reboot is required to clear this.

It would seem more sensible to reset the driver if it has been locked on the same fence for a long time.
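
To make "locked on the same fence for a long time" concrete, here is a minimal sketch that scans log lines of the form quoted above and reports rings whose current fence id never advances while the reported stall time is large. It is only a log-analysis aid, not driver code: the function names `summarize` and `stuck_rings` and the 10-second default threshold are assumptions chosen for illustration, and the regular expressions are written against the two message formats shown in this report only.

```python
import re

# Matches the "ring N stalled for more than Xmsec" message quoted in this report.
STALL_RE = re.compile(
    r"radeon (?P<dev>[0-9a-f:.]+): ring (?P<ring>\d+) stalled for more than "
    r"(?P<msec>\d+)msec"
)

# Matches the "GPU lockup (current fence id ... last fence id ... on ring N)" message.
LOCKUP_RE = re.compile(
    r"radeon (?P<dev>[0-9a-f:.]+): GPU lockup \(current fence id "
    r"(?P<current>0x[0-9a-f]+) last fence id (?P<last>0x[0-9a-f]+) "
    r"on ring (?P<ring>\d+)\)"
)


def summarize(lines):
    """Collect, per (device, ring), the largest stall time and the set of
    distinct 'current fence id' values seen in the given log lines."""
    rings = {}
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            key = (m["dev"], int(m["ring"]))
            entry = rings.setdefault(key, {"max_stall_msec": 0, "fences": set()})
            entry["max_stall_msec"] = max(entry["max_stall_msec"], int(m["msec"]))
            continue
        m = LOCKUP_RE.search(line)
        if m:
            key = (m["dev"], int(m["ring"]))
            entry = rings.setdefault(key, {"max_stall_msec": 0, "fences": set()})
            entry["fences"].add(int(m["current"], 16))
    return rings


def stuck_rings(rings, threshold_msec=10_000):
    """Return rings that only ever reported one current fence id and whose
    stall time reached the (hypothetical) threshold."""
    return [
        key
        for key, entry in rings.items()
        if len(entry["fences"]) == 1 and entry["max_stall_msec"] >= threshold_msec
    ]
```

Feeding the two syslog lines above to `summarize` and passing the result to `stuck_rings` would flag ring 0 on device 0000:02:00.0, since the current fence id stays at 0x284cb5 while the stall time is far above the threshold.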

=====================================================================

(II) Module radeon: vendor="X.Org Foundation"
[    82.693]    compiled for 1.19.3, module version = 7.10.0
[    82.693]    Module class: X.Org Video Driver
[    82.693]    ABI class: X.Org Video Driver, version 23.0
[    82.693] (II)

roger@dragon:/var/log$ sudo lshw -c video
  *-display
       description: VGA compatible controller
       product: RV770 [Radeon HD 4850]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:02:00.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=radeon latency=0
       resources: irq:26 memory:d0000000-dfffffff memory:fbae0000-fbaeffff ioport:b000(size=256) memory:c0000-dffff


Roger
Comment 1 Alex Deucher 2017-11-26 20:49:15 UTC
The monitor connections have nothing to do with GPU hangs.  Can you narrow down what application causes the GPU hang?
Comment 2 roger 2017-11-27 18:47:41 UTC
No applications are running other than the gnome desktop itself. I doubt this is actually a real GPU hang; that is just what the syslog messages say. Are fences used by the driver for anything other than communicating with the GPU?

I can trigger this by pulling out the HDMI cable and sticking it back in again.

I will do some more tests over the next couple of days.

Roger
Comment 3 roger 2017-11-29 20:57:36 UTC
Just found out that my gnome desktop is rendering through Wayland. I wonder if that is relevant.
Comment 4 roger 2017-12-06 09:01:06 UTC
I moved to a simpler single-head setup and found that this loop occurs randomly without any changes to the monitor connection. In fact, it usually occurs after the system has been idle for some time with just the gnome desktop running. I am marking this bug as resolved. I will switch over to using xserver instead of Wayland and see if that resolves the issue. If it does, I will raise another bug with more appropriate information.

