Bug 24218

Summary: Xorg stuck in a loop when started with KMS enabled on RS880
Product: xorg
Reporter: Mikko C. <mikko.cal>
Component: Driver/Radeon
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: 7.4 (2008.09)   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description               Flags
Xorg.0.log                none
dmesg                     none
lspci -nnxx               none
dmesg with drm debug=1    none
backtrace                 none
fix rs880                 none

Description Mikko C. 2009-09-29 08:21:29 UTC
Created attachment 29942
Xorg.0.log

Kernel 2.6.32-rc2
Xorg 1.6.3.901
xf86-video-ati master
mesa master
libdrm master


The PC doesn't hang and I can SSH in, but X seems to be dead: I can move the cursor, but the screen is mostly black, with some cyan, red and white lines. I cannot switch consoles with ALT+F1-F12.

Works just fine with KMS disabled.
Comment 1 Mikko C. 2009-09-29 08:22:02 UTC
Created attachment 29943
dmesg

dmesg.log
Comment 2 Mikko C. 2009-09-29 08:22:27 UTC
Created attachment 29944
lspci -nnxx

lspci -nnxx
Comment 3 Alex Deucher 2009-09-29 09:19:09 UTC
Please try the latest bits from drm-next.  You don't appear to have the r6xx/r7xx vram/aperture size clipping patch.
Comment 4 Mikko C. 2009-09-29 09:41:28 UTC
Ok, I tried using zen-stable:
http://git.zen-sources.org/?p=zen-stable.git;a=summary which pulled drm-next 15
hours ago. But I still get the same screen. How old is this patch? More than 15
hours?
Comment 5 Alex Deucher 2009-10-02 16:48:59 UTC
Does it work with a non-zen kernel?
Comment 6 Mikko C. 2009-10-02 23:12:46 UTC
(In reply to comment #5)
> Does it work with a non-zen kernel?
> 

Alex, can you give me a link to the commit? Maybe if it's not too big I can apply the patch to 2.6.32-rc1. Otherwise I have to wait for 2.6.32-rc2.
Comment 7 Alex Deucher 2009-10-03 07:47:00 UTC
http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=commitdiff;h=974b16e33ea626c9854f0f34fa5455a18822e159

I think it's still worth trying a non-zen kernel.
Comment 8 Mikko C. 2009-10-03 08:26:24 UTC
(In reply to comment #7)
> http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=commitdiff;h=974b16e33ea626c9854f0f34fa5455a18822e159
> 
> I think it's still worth trying a non-zen kernel.
> 

I applied this patch: http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=commitdiff_plain;h=974b16e33ea626c9854f0f34fa5455a18822e159 to vanilla 2.6.32-rc1 and it's still the same. Note that I'm currently using the vanilla kernel, I just tried the zen because it had a more recent drm-next pull.
Would it help if I took a picture of the screen? Although my phone's camera really sucks :)
Comment 9 Mikko C. 2009-10-06 02:22:18 UTC
It's probably not very useful, but I started X via gdb. It hung there.
I opened another console via ssh and killed X, then I got this backtrace with gdb:

Program received signal SIGTERM, Terminated.
[Switching to Thread 0x7f5209bcc6f0 (LWP 1787)]
0x00007f5207112bd7 in ioctl () from /lib/libc.so.6
(gdb) bt
#0  0x00007f5207112bd7 in ioctl () from /lib/libc.so.6
#1  0x00007f520619cc83 in drmIoctl (fd=13, request=3221775460, arg=0x7fffc4fb7c60) at xf86drm.c:188
#2  0x00007f520619cf5c in drmCommandWriteRead (fd=13, drmCommandIndex=<value optimized out>, data=0x7fffc4fb7c60,
    size=<value optimized out>) at xf86drm.c:2431
#3  0x00007f5205a92259 in bo_wait (bo=0x2305d20) at radeon_bo_gem.c:214
#4  0x00007f5205a922ad in bo_map (bo=0x2305d20, write=-1073191836) at radeon_bo_gem.c:187
#5  0x00007f5205d514ee in r600_vb_get (pScrn=<value optimized out>) at /usr/include/drm/radeon_bo.h:151
#6  0x00007f5205d51544 in r600_cp_start (pScrn=0xd) at r6xx_accel.c:1227
#7  0x00007f5205d4f0ba in R600PrepareSolid (pPix=0x2326430, alu=<value optimized out>, pm=4294967295, fg=0)
    at r600_exa.c:183
#8  0x00007f520566264d in exaFillRegionSolid (pDrawable=0x2329b50, pRegion=0x23d4f50, pixel=<value optimized out>,
    planemask=<value optimized out>, alu=3, clientClipType=0) at exa_accel.c:1003
#9  0x00007f520566321a in exaPolyFillRect (pDrawable=0x2329b50, pGC=0x2328c00, nrect=1, prect=0x23d4f30) at exa_accel.c:800
#10 0x00000000004bbb1b in damagePolyFillRect (pDrawable=0x2329b50, pGC=0x2328c00, nRects=1, pRects=0x23d4f30)
    at damage.c:1404
#11 0x0000000000457ed4 in miPaintWindow (pWin=<value optimized out>, prgn=0x7fffc4fb8160, what=<value optimized out>)
    at miexpose.c:670
#12 0x0000000000458274 in miWindowExposures (pWin=0x2329b50, prgn=0x7fffc4fb8160, other_exposed=0x0) at miexpose.c:504
#13 0x000000000050bf83 in xf86XVWindowExposures (pWin=0x2329b50, reg1=0x7fffc4fb8160, reg2=<value optimized out>)
    at xf86xv.c:1054
#14 0x0000000000445bc1 in MapWindow (pWin=0x2329b50, client=<value optimized out>) at window.c:2678
#15 0x0000000000424f17 in main (argc=1, argv=<value optimized out>, envp=<value optimized out>) at main.c:254
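
The `request=3221775460` value in frame #1 can be decoded to see which ioctl X is blocked in. What follows is a minimal sketch, assuming the standard Linux ioctl number layout used on x86-64 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction). The names `ioctl_fields` and `decode_ioctl` are hypothetical helpers for illustration, not part of libdrm.

```c
/* Fields packed into a Linux ioctl request number (asm-generic layout). */
struct ioctl_fields {
    unsigned dir;   /* bits 30-31: 1 = write, 2 = read, 3 = both */
    unsigned size;  /* bits 16-29: size of the argument struct in bytes */
    unsigned type;  /* bits 8-15: subsystem magic, 'd' (0x64) for DRM */
    unsigned nr;    /* bits 0-7: command number within the subsystem */
};

/* Hypothetical helper: split a request number into its four fields. */
static struct ioctl_fields decode_ioctl(unsigned long request)
{
    struct ioctl_fields f;

    f.nr   = request & 0xff;
    f.type = (request >> 8) & 0xff;
    f.size = (request >> 16) & 0x3fff;
    f.dir  = (request >> 30) & 0x3;
    return f;
}
```

For 3221775460 (0xC0086464) this gives dir=3 (read/write), size=8, type=0x64 ('d', the DRM magic) and nr=0x64. Subtracting DRM_COMMAND_BASE (0x40) leaves driver command 0x24, which should be DRM_RADEON_GEM_WAIT_IDLE in radeon_drm.h (worth checking against the installed headers). That would be consistent with the bo_wait frame: X is waiting for a buffer object to go idle.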


I'm now running 2.6.32-rc3, same behavior.
Also, might be related to this other bug I'm having without KMS: http://bugs.freedesktop.org/show_bug.cgi?id=24300
Basically I get a lot of these errors:
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 25
Comment 10 Mikko C. 2009-10-06 08:58:46 UTC
Created attachment 30113
dmesg with drm debug=1

I booted with drm debug=1
It seems X is stuck in an infinite loop?
I really hope this helps.
Comment 11 Mikko C. 2009-10-06 09:04:55 UTC
Changed the title to something more appropriate.
Comment 12 Marcin Baczyński 2009-10-06 09:19:46 UTC
Created attachment 30114
backtrace

I got this backtrace on a Mobility Radeon 2400 (m72/rv610), so it looks like the issue is not rs780-specific. Starting glxgears on a bare X server and waiting several seconds leads to either an X hang or a whole-system freeze (no pings, no sysrq, nothing). I'm using xorg-server-1.7.0 (same behavior on 1.6.{3,4}), a drm-next kernel, and mesa, libdrm and xf86-video-ati from git master.
Comment 13 Mikko C. 2009-10-06 09:24:39 UTC
At least your X starts; mine doesn't really. My card is also rv610 (or rv620) based, IIRC. I suppose it works fine for you without KMS too?
Comment 14 Marcin Baczyński 2009-10-06 09:42:04 UTC
Yes, works fine without KMS.
Comment 15 Marcin Baczyński 2009-10-08 14:03:19 UTC
After modifying the libdrm code to retry the ioctl no more than 100 times, a few seconds of kwin effects + glxgears produced thousands of lines like this in dmesg:

[drm:radeon_ib_get] *ERROR* radeon: IB(9:0x0000000008191000:6090)
[drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
[drm:radeon_cs_ioctl] *ERROR* Failed to get ib !
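
The kind of libdrm change described above can be sketched as follows. This is an illustration only, not the actual modification: libdrm's drmIoctl() restarts the call for as long as it fails with EINTR or EAGAIN, and the sketch adds a cap to that loop. The name `bounded_drm_ioctl` and the constant `MAX_IOCTL_TRIES` are assumptions made for this example.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Hypothetical cap, chosen to match the "100 times" mentioned above. */
#define MAX_IOCTL_TRIES 100

/*
 * Restart the ioctl on EINTR/EAGAIN like drmIoctl() does, but make at most
 * MAX_IOCTL_TRIES calls in total so a locked-up GPU cannot keep the caller
 * spinning forever. Returns the result of the last ioctl() call.
 */
static int bounded_drm_ioctl(int fd, unsigned long request, void *arg)
{
    int ret;
    int tries = 0;

    do {
        ret = ioctl(fd, request, arg);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN) &&
             ++tries < MAX_IOCTL_TRIES);

    return ret;
}
```

With a cap like this the caller gets an error back instead of looping, which is presumably why the lockup messages become visible in dmesg rather than X just appearing stuck.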
Comment 16 Mikko C. 2009-10-12 06:13:11 UTC
Still valid with 2.6.32-rc4 + additional drm-next patches up to 39deb2d67515086f08a672e7574716ca0d3883a5
Comment 17 Marcin Baczyński 2009-10-27 09:48:07 UTC
Seems to be fixed with current zen kernel master.
Comment 18 Mikko C. 2009-10-27 12:07:34 UTC
Then it's not the same bug after all. I still have this problem with zen-kernel master and also with vanilla 2.6.32-rc5 + drm-next patches up to c182be37ed7cb04c344501b88b8fdb747016e6cf
Comment 19 Alex Deucher 2009-11-02 12:20:49 UTC
I suspect this is a dupe of bug 24535.  RS880 is based on RV620.
Comment 20 Alex Deucher 2009-11-02 17:08:19 UTC
does the patch on bug 24535#c25 help?
Comment 21 Mikko C. 2009-11-03 01:19:30 UTC
(In reply to comment #20)
> does the patch on bug 24535#c25 help?
> 

Unfortunately no, still the same bug.
Comment 22 Alex Deucher 2009-11-05 07:17:00 UTC
Created attachment 30983
fix rs880

I sent this patch to dri-devel last night.  It should fix the rs880 issues.
Comment 23 Mikko C. 2009-11-05 07:41:18 UTC
yep, it's fixed :)
