Bug 25662

Summary: R300: System locks up when running any graphical program using KMS
Product: DRI
Reporter: Tom Stellard <tstellar>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTOURBUG
QA Contact:
Severity: normal
Priority: medium
CC: andyrtr, maarten.fonville
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description      Flags
dmesg.log        none
xorg.log         none
lspci output.    none
Proposed fix.    none

Description Tom Stellard 2009-12-15 18:59:09 UTC
X starts correctly, but when I try to run a program that has a graphical interface (e.g. glxgears, firefox), my system locks up.  I cannot even ssh in to view error messages.  If I disable Radeon KMS in the kernel, everything works fine.  My graphics card is an Xpress 200M.  I am using the latest git versions of most of the relevant software.  Here are the exact versions:

Kernel-2.6.32-git 8bea8672edfca7ec5f661cafb218f1205863b343
libdrm-git fbc8b2d95f5da096ee771a3e2ef6f89306679e89
mesa-git 80e815639459367313cb0c2e5e32d978ed9fcd0
ddx-git 3a30210d50b27f8772fc5045133940246764fce9

xorg-server-1.7.3
Comment 1 Tom Stellard 2009-12-15 18:59:42 UTC
Created attachment 32098 [details]
dmesg.log
Comment 2 Tom Stellard 2009-12-15 19:00:09 UTC
Created attachment 32099 [details]
xorg.log
Comment 3 Tom Stellard 2009-12-18 02:35:59 UTC
I noticed that if I use software rendering, I can get glxgears to run for about three seconds before X freezes.  Afterwards, I can still switch virtual terminals, but X is no longer functional.  If I use the gallium driver with RADEON_SOFTPIPE enabled, glxgears does not cause X to freeze, but it does not display correctly.  The gears are shifted up off the top of the screen, and only the bottom half of the large red gear is visible.  If I try to run Firefox with the gallium driver (RADEON_SOFTPIPE enabled), it works for a few seconds and then X freezes, just as it does when I run glxgears with software rendering.  In both cases the dmesg log is filled with entries like this:

[drm:radeon_fence_wait] *ERROR* fence(e9f2c0e0:0x0000064A) 521ms timeout
[drm:radeon_fence_wait] *ERROR* last signaled fence(0x0000064A)
[drm:radeon_fence_wait] *ERROR* fence(e9f2c240:0x0000064B) 510ms timeout going to reset GPU
[drm] CP reset succeed (RBBM_STATUS=0x00000140)
[drm] radeon: cp idle (0x10000000)
[drm] radeon: ring at 0x0000000030000000
[drm] ring test succeeded in 0 usecs
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
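
Editor's note: for anyone comparing logs from several machines, the fence timeouts above can be counted with a short script. The following is only a minimal Python sketch. The regular expression is derived from the two timeout lines quoted in this comment, and the function name summarize_fence_timeouts is made up for illustration. Other kernel versions may format these messages differently.

```python
import re

# Pattern inferred from the excerpt in this comment (an assumption, not a
# documented kernel log format), e.g.:
# [drm:radeon_fence_wait] *ERROR* fence(e9f2c0e0:0x0000064A) 521ms timeout
FENCE_TIMEOUT = re.compile(
    r"\[drm:radeon_fence_wait\] \*ERROR\* "
    r"fence\((?P<addr>[0-9a-fA-F]+):(?P<seq>0x[0-9a-fA-F]+)\) "
    r"(?P<ms>\d+)ms timeout(?P<reset> going to reset GPU)?"
)


def summarize_fence_timeouts(lines):
    """Return (timeout_count, reset_count, fence_sequence_numbers)."""
    timeouts = 0
    resets = 0
    seqs = []
    for line in lines:
        match = FENCE_TIMEOUT.search(line)
        if match is None:
            continue  # not a fence timeout line
        timeouts += 1
        seqs.append(int(match.group("seq"), 16))
        if match.group("reset"):
            resets += 1  # this timeout triggered a GPU reset
    return timeouts, resets, seqs


# Sample input copied from the log excerpt in this comment.
SAMPLE = """\
[drm:radeon_fence_wait] *ERROR* fence(e9f2c0e0:0x0000064A) 521ms timeout
[drm:radeon_fence_wait] *ERROR* last signaled fence(0x0000064A)
[drm:radeon_fence_wait] *ERROR* fence(e9f2c240:0x0000064B) 510ms timeout going to reset GPU
[drm] CP reset succeed (RBBM_STATUS=0x00000140)
"""

if __name__ == "__main__":
    print(summarize_fence_timeouts(SAMPLE.splitlines()))
```

To use it on a saved log, pass the file's lines instead of SAMPLE, for example summarize_fence_timeouts(open("dmesg.log")).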
Comment 4 Tom Stellard 2009-12-20 18:56:25 UTC
I have done some debugging of the problem, and it appears my system locks up while calling glXSwapBuffers() on line 340 of glxgears.c.  This is while I am using the regular r300 mesa driver.
Comment 5 Andreas Radke 2010-02-27 14:16:27 UTC
I see very similar behavior in ArchLinux. I can log in using lxdm. Once Xfce is started it freezes either almost immediately or after a few seconds, when I have opened a terminal or anything else.

I'm not sure if this is a different issue or the same. Behavior looks similar and it's the same chipset family I think. We had used KMS already for a while in kernel 2.6.31 where it worked well. It started to freeze later, first noticed in 2.6.33rc1. I have no interesting entries in dmesg or Xorg.log. How can I provide more information to track this down? (using kernel 2.6.33, libdrm 2.4.18, mesa 7.7 and ddx from today's git master)

(--) PCI:*(0:1:5:0) 1002:5a62:144d:c02b ATI Technologies Inc RC410 [Radeon Xpress 200M] rev 0, Mem @ 0xd0000000/268435456, 0xc0000000/65536, I/O @ 0x00009000/256, BIOS @ 0x????????/131072
Comment 6 Alex Deucher 2010-03-19 07:42:33 UTC
Do things work any better with a newer drm?  2.6.33/4 or Dave's drm-radeon-testing branch (http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=shortlog;h=refs/heads/drm-radeon-testing)?
Comment 7 Alex Deucher 2010-03-19 07:43:26 UTC
(In reply to comment #5)

> I'm not sure if this is a different issue or the same. Behavior looks similar
> and it's the same chipset family I think. We had used KMS already for a while
> in kernel 2.6.31 where it worked well. It started to freeze later, first
> noticed in 2.6.33rc1. I have no interesting entries in dmesg or Xorg.log. How
> can I provide more information to track this down? (using kernel 2.6.33, libdrm
> 2.4.18, mesa 7.7 and ddx from today's git master)

Any chance you could bisect the kernel to see what change broke it?
Comment 8 Tom Stellard 2010-03-19 23:09:38 UTC
Here are the results from git bisect:

Starting with the 2.6.31 kernel, KMS works fine until this commit:
3e5cb98dfe87cc61d0a1119dd8aa2b1e4cfab424
After this commit, xdm freezes at the login screen, but I can still switch to another virtual terminal. My dmesg log has the message described in comment #3 repeated several times.  This behavior continues until commit: 6b46362c0ea472b174c336786fd406c504326ad4
After this commit, I experience the behavior described in comment #1.

I am now using X server 1.7.5 and the latest git versions of mesa, libdrm, and ddx.

I will attach the output of lspci to give a little more information about my system.
Comment 9 Tom Stellard 2010-03-19 23:10:46 UTC
Created attachment 34250 [details]
lspci output.
Comment 10 Andreas Radke 2010-03-20 02:32:03 UTC
Thanks Tom. It seems we have the same problem. Mine is a Samsung R40 Cinoso:

01:05.0 VGA compatible controller: ATI Technologies Inc RC410 [Radeon Xpress 200M] (prog-if 00 [VGA controller])
	Subsystem: Samsung Electronics Co Ltd Device c02b
	Flags: bus master, 66MHz, medium devsel, latency 66, IRQ 17
	Memory at d0000000 (32-bit, prefetchable) [size=256M]
	I/O ports at 9000 [size=256]
	Memory at c0000000 (32-bit, non-prefetchable) [size=64K]
	[virtual] Expansion ROM at c0020000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel modules: radeon, radeonfb

For me kernel 2.6.33.1 is unusable with KMS. It freezes either at the login manager or very soon after, when moving around the first window I opened. With 2.6.34rc1 it crashes less often at login, but still sometimes. When I can log in it seems quite stable until I start glxgears.

In UMS mode the system is also sometimes crashing when starting Xorg/loginmanager. When it comes up it's stable in usual desktop work. But 3D games like supertuxkart are broken (textures almost invisible). Not sure if this is drm or mesa related.
Comment 11 Tom Stellard 2010-03-23 19:51:39 UTC
After some more testing, I am sure that commit 3e5cb98dfe87cc61d0a1119dd8aa2b1e4cfab424 is the problem.  I have attached a patch that fixes this bug for me.
Comment 12 Tom Stellard 2010-03-23 19:55:17 UTC
Created attachment 34390 [details] [review]
Proposed fix.

This patch can be applied to drm-radeon-testing with
HEAD: 65965f4b702f98cb3857db44375e5d6a804d4937
Comment 13 Alex Deucher 2010-03-24 00:38:57 UTC
This is a chipset/bios issue.  Your system probably needs a pci quirk to fix or disable MSIs rather than a blanket disabling of MSIs for rs400s.  For reference:
http://marc.info/?l=dri-devel&m=126926011226719&w=2
https://bugzilla.kernel.org/show_bug.cgi?id=15287

Booting with pci=nomsi should work around the problem.  You should file a PCI bug on https://bugzilla.kernel.org
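
Editor's note: after adding pci=nomsi to the boot loader configuration, it can be useful to confirm that the running kernel was actually booted with it. The following is a minimal Python sketch, assuming a Linux system that exposes the boot command line at /proc/cmdline. The helper names has_kernel_param and msi_disabled are hypothetical and not part of any existing tool.

```python
def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole word in a kernel command line."""
    return param in cmdline.split()


def msi_disabled(cmdline_path="/proc/cmdline"):
    """Check whether the running kernel was booted with pci=nomsi.

    Assumes a Linux system where /proc/cmdline is readable.
    """
    with open(cmdline_path) as f:
        return has_kernel_param(f.read(), "pci=nomsi")


if __name__ == "__main__":
    print("pci=nomsi active:", msi_disabled())
```

This only confirms that the parameter was passed at boot. It does not show whether the radeon driver is using MSIs.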
Comment 14 Tom Stellard 2010-03-24 19:23:27 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=15626
Comment 15 Marek Olšák 2010-05-15 22:13:19 UTC
Because the same bug was filed in the kernel bugzilla, closing here...
