Bug 89196

Summary: Radeon GPU crashes at random times (GPU lockup)
Product: DRI
Reporter: Maciej Gluszek <info>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: normal
Priority: medium
CC: info
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                          Flags
Syslog output during lockup          none
GPU lockup syslog log - 2015-02-20   none

Description Maciej Gluszek 2015-02-17 23:56:54 UTC
Created attachment 113583 [details]
Syslog output during lockup

Hello,

There are several similar bug reports, but someone asked that every instance be filed as a separate bug, so here goes.

I'm running Ubuntu 14.04 on a 64-bit machine with kernel 3.13.0-46-generic #75-Ubuntu SMP.

My GPU is: VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635/M86 [Mobility Radeon HD 3650]

Packages:
xserver-xorg (1:7.7+1ubuntu8.1) 
xorg-server (2:1.15.1-0ubuntu2.6) 
xserver-xorg-video-ati (1:7.3.0-1ubuntu3.1) 

I get GPU lockups at random, sometimes 2-3 times a day, sometimes once every 2 days. The screen freezes for 2-3 seconds, then goes black for a couple of seconds, and after that it is still "frozen" and flickers when I move the mouse.

I am able to switch to a console with Ctrl+Alt+F{n}, but when I come back to the desktop the screen is in the same frozen state.

I am NOT using proprietary drivers from AMD.

I attached the syslog output from when it happens. There seems to be no output in Xorg.log, though.
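
For anyone trying to capture the same information, here is a minimal sketch for pulling the lockup lines out of a log file. It assumes the kernel messages contain the text "GPU lockup" (as in the bug summary) and that the log is /var/log/syslog; the exact message wording may differ between kernel versions, and the function names are made up for illustration.

```shell
# Print all lines mentioning a GPU lockup (case-insensitive).
# The log file defaults to /var/log/syslog; pass another path as $1.
find_gpu_lockups() {
    log_file="${1:-/var/log/syslog}"
    grep -i 'GPU lockup' "$log_file" || true
}

# Print how many lines mention a GPU lockup; prints 0 when there are none.
count_gpu_lockups() {
    log_file="${1:-/var/log/syslog}"
    grep -ic 'GPU lockup' "$log_file" || true
}
```

Running "find_gpu_lockups /var/log/kern.log" would search a different log; reading these files may require root or membership in the adm group, depending on the system.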

If you need any more information, please let me know.

Thanks!
Comment 1 Maciej Gluszek 2015-02-20 10:14:54 UTC
UPDATE: It seems I found a solution. So far almost 48h of uptime on my laptop and no crash. What I did was add "radeon.hard_reset=1" to the GRUB command line.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash radeon.dpm=1 radeon.hard_reset=1"
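
A note for anyone reproducing this: on Ubuntu the GRUB_CMDLINE_LINUX_DEFAULT line lives in /etc/default/grub and only takes effect after running "sudo update-grub" and rebooting. The sketch below checks whether a parameter actually reached the running kernel by reading /proc/cmdline. The helper name cmdline_param is hypothetical, and it only reports what is on the command line, not whether the driver version in use recognises the parameter.

```shell
# Print the value of a name=value kernel command line parameter.
# Returns non-zero if the parameter is not present.
# $1: parameter name, e.g. radeon.hard_reset
# $2: file to read (defaults to /proc/cmdline)
cmdline_param() {
    name="$1"
    cmdline_file="${2:-/proc/cmdline}"
    for word in $(cat "$cmdline_file"); do
        case "$word" in
            "$name"=*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

For example, "cmdline_param radeon.hard_reset || echo 'not set'" prints the value if the parameter is present and "not set" otherwise.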
Comment 2 Alex Deucher 2015-02-20 14:34:47 UTC
(In reply to Maciej Gluszek from comment #1)
> UPDATE: It seems I found a solution. So far almost 48h of uptime on my laptop
> and no crash. What I did was add "radeon.hard_reset=1" to the GRUB command
> line.
> 
> GRUB_CMDLINE_LINUX_DEFAULT="quiet splash radeon.dpm=1 radeon.hard_reset=1"

Are you still getting GPU resets in your kernel log?
Comment 3 Maciej Gluszek 2015-02-20 14:41:20 UTC
Created attachment 113685 [details]
GPU lockup syslog log - 2015-02-20
Comment 4 Maciej Gluszek 2015-02-20 14:44:00 UTC
A couple of hours after posting the update here I got another lockup. I attached the syslog output from when it happened.

It seems the lockups are less likely to happen now, but I will have to wait and see.

PS. I checked the Google Chrome issue tracker, and some people there also reported this as a Chrome bug involving the GPU. I will have to see whether I get a lockup without Chrome running, or whether it doesn't matter.
Comment 5 Maciej Gluszek 2015-02-21 22:17:44 UTC
Another update.

The lockups still happen after all. I tried the GRUB parameters "radeon.hard_reset=1", "radeon.lockup_timeout=0" (disabled) and "radeon.lockup_timeout=60000" (increased from the default 10000 ms).

I also turned off GPU hardware acceleration in Google Chrome for a while. The lockups still happen, so I guess Chrome's GPU support may not be the issue.
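
When juggling several parameters like this, it can help to confirm what the loaded driver is actually using. Module parameters are normally exposed under /sys/module/<module>/parameters, so a minimal sketch, assuming the radeon module is loaded and exposes these names on the running kernel, could look like the following. The function name and the RADEON_PARAMS_DIR override are hypothetical and exist only to make the sketch easy to try against a test directory.

```shell
# Print "name=value" for each requested radeon module parameter.
# Parameters the running driver does not expose are reported as
# "<not available>" instead of causing an error.
show_radeon_params() {
    params_dir="${RADEON_PARAMS_DIR:-/sys/module/radeon/parameters}"
    for name in "$@"; do
        if [ -r "$params_dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$params_dir/$name")"
        else
            printf '%s=<not available>\n' "$name"
        fi
    done
}
```

A call such as "show_radeon_params hard_reset lockup_timeout dpm" would list the three parameters mentioned in this report.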
Comment 6 Maciej Gluszek 2015-02-26 16:34:11 UTC
UPDATE:

I removed all the Radeon kernel command line parameters I had been testing, except radeon.dpm=1 for power management.
I upgraded the Ubuntu libraries and the kernel to 3.16.0-31-generic.
I also upgraded the Mesa library to 10.3.2.

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD RV635
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.2
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 10.3.2
OpenGL shading language version string: 1.30
OpenGL context flags: (none)

The lockups are still happening. I cannot tell which application is causing them; they happen at completely random times.
Comment 7 Alex Deucher 2015-02-26 17:39:00 UTC
(In reply to Maciej Gluszek from comment #6)
> UPDATE:
> 
> I removed all additional tested kernel command lines that apply to Radeon
> besides radeon.dpm=1 for power management.

> Lockups are still happening. I cannot specify which application is causing
> this. Lockups happen completely at random times.

dpm is disabled by default on rv6xx due to stability issues. That is likely the problem you are seeing. In that case, this bug is a duplicate of bug 66963.
Comment 8 Maciej Gluszek 2015-02-26 18:39:53 UTC
I see. I was running Ubuntu without dpm at first, but the GPU temperature was too high.

When I disable dpm, the GPU temperature is about 10-15 degrees higher and the fan is louder than with dpm enabled.

Is there any other way to turn on some power management for this card without using dpm? Thanks
Comment 9 Alex Deucher 2015-02-26 18:47:48 UTC
(In reply to Maciej Gluszek from comment #8)
> I see. I was running Ubuntu without dpm at first but GPU temperature was too
> high.
> 
> When I disable dpm GPU temp is about 10-15 degrees higher and the fan is
> louder than with dpm enabled.
> 
> Is there any other way to turn on some power management for this card
> without using dpm? Thanks

You can manually select the performance levels via the old pm sysfs interface.  See the "KMS Power Management Options" section on this page:
http://xorg.freedesktop.org/wiki/RadeonFeature/
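
As a rough illustration of that old interface: as far as I recall from the wiki page, it is driven by two sysfs files, power_method and power_profile, with profile names default, auto, low, mid and high. The sketch below wraps the two writes in a hypothetical helper. It assumes the card is card0, that dpm is not enabled (the old files are, as far as I know, only present then), and that it is run as root; please check the wiki for the authoritative list of values.

```shell
# Select a static power profile via the pre-dpm radeon sysfs interface.
# $1: profile name (default, auto, low, mid or high)
# $2: device directory (defaults to /sys/class/drm/card0/device)
set_radeon_profile() {
    profile="$1"
    device_dir="${2:-/sys/class/drm/card0/device}"
    case "$profile" in
        default|auto|low|mid|high) ;;
        *)
            echo "unknown profile: $profile" >&2
            return 1
            ;;
    esac
    # Switch to profile-based power management, then pick the profile.
    echo profile > "$device_dir/power_method" &&
    echo "$profile" > "$device_dir/power_profile"
}
```

For example, "set_radeon_profile low" would request the lowest-power profile on card0. The setting is not persistent and would need to be reapplied after each boot.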
Comment 10 Maciej Gluszek 2015-02-26 22:13:13 UTC
Unfortunately, it again didn't take long.

I removed the dpm parameter from startup, so no radeon-related parameters are active. Then I set the performance level by hand to "med" and, after a while, to "low".

After about 2 hours it locked up again with the same error (GPU lockup).
Comment 11 Maciej Gluszek 2015-02-28 15:28:25 UTC
I think I got it working thanks to one of the solutions from https://bugs.freedesktop.org/show_bug.cgi?id=66963.

I am running with "dpm" enabled and then setting:

echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
echo performance > /sys/class/drm/card0/device/power_dpm_state

It seems to work: almost 2 days of uptime without lockups, which never happened before. Booting works without problems and cooling is also fine. Resuming from suspend also works.
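
Since sysfs writes like the two above do not survive a reboot, they have to be reapplied at every boot (for instance from /etc/rc.local on Ubuntu 14.04, which is only one possible place). A minimal sketch of a helper that does both writes with some input checking follows. The function name is made up, card0 is assumed, and the accepted values (battery, balanced, performance for the state; auto, low, high for the forced level) are the ones I know the radeon dpm interface to document, so treat them as an assumption for other kernels.

```shell
# Force a dpm performance level and select a dpm power state.
# $1: power state (battery, balanced or performance)
# $2: forced performance level (auto, low or high)
# $3: device directory (defaults to /sys/class/drm/card0/device)
set_radeon_dpm() {
    state="$1"
    level="$2"
    device_dir="${3:-/sys/class/drm/card0/device}"
    case "$state" in
        battery|balanced|performance) ;;
        *) echo "unknown dpm state: $state" >&2; return 1 ;;
    esac
    case "$level" in
        auto|low|high) ;;
        *) echo "unknown performance level: $level" >&2; return 1 ;;
    esac
    # Same order as the two manual writes: level first, then state.
    echo "$level" > "$device_dir/power_dpm_force_performance_level" &&
    echo "$state" > "$device_dir/power_dpm_state"
}
```

Calling "set_radeon_dpm performance high" as root would reproduce the two manual writes.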

Thanks for your help, Alex. This bug can be marked as a duplicate.

*** This bug has been marked as a duplicate of bug 66963 ***
