Bug 26403 - X crashes on startup with KMS (Alpha architecture)
Status: RESOLVED MOVED
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: Alpha Linux (All)
Importance: medium major
Assignee: Default DRI bug account
 
Reported: 2010-02-02 18:17 UTC by Michael Cree
Modified: 2019-11-19 08:09 UTC

Attachments
Xorg log with backtrace (26.83 KB, text/plain)
2010-02-02 18:17 UTC, Michael Cree
Kernel dmesg showing KMS/drm setup and backtrace at kernel warning. (19.03 KB, text/plain)
2010-02-02 18:19 UTC, Michael Cree
kernel dmesg showing drm/radeon errors on Xserver startup. (46.62 KB, text/plain)
2010-06-01 03:31 UTC, Michael Cree
Xserver log (33.70 KB, text/plain)
2010-06-01 03:32 UTC, Michael Cree

Description Michael Cree 2010-02-02 18:17:09 UTC
Created attachment 33019
Xorg log with backtrace

X crashes on startup when KMS is enabled on the Alpha architecture.  Testing on a PWS600au (EV56 CPU) with a 2.6.32.5 kernel, Xorg git master and a Radeon 7000 PCI graphics card.  Symptoms: on starting X the screen blanks for a bit, then the console reappears with a backtrace from Xorg and a kernel warning message.  The console is damaged: a patterned white square (about 1in by 1in) remains fixed in the centre of the screen.  The system is left unstable and eventually crashed with a kernel oops as I was reading the logs.
Comment 1 Michael Cree 2010-02-02 18:19:07 UTC
Created attachment 33020
Kernel dmesg showing KMS/drm setup and backtrace at kernel warning.
Comment 2 Michel Dänzer 2010-02-03 01:06:40 UTC
Given the kernel warning and oops, this could be an issue in the kernel rather than the X server...

(The white square is probably the hardware cursor left enabled due to the X server crash)
Comment 3 Matt Turner 2010-02-09 14:11:03 UTC
Yes, the 1"x1" square is almost definitely the hardware cursor. I see a similar thing in bug 23227.
Comment 4 Michael Cree 2010-06-01 03:29:38 UTC
I have had a go at testing KMS and the Xserver again with a newer kernel (2.6.34), a newer Xserver (1.8.1), and a more recent Radeon card (HD4350).  The kernel console comes up fine in KMS but the Xserver crashes on startup.  A corrupted patterned screen appears first, then the monitor blanks as the signal is lost.

The Xserver log doesn't appear to reveal anything untoward, but ominous messages appear in dmesg like so:

[  478.147216] [drm:radeon_fence_wait] *ERROR* fence(fffffc00227e9680:0x00000004) 505ms timeout going to reset GPU
[  478.147216] radeon 0000:01:00.0: GPU softreset 
[  478.147216] radeon 0000:01:00.0:   R_008010_GRBM_STATUS=0xA27034A4
[  478.147216] radeon 0000:01:00.0:   R_008014_GRBM_STATUS2=0x00000102
[  478.147216] radeon 0000:01:00.0:   R_000E50_SRBM_STATUS=0x200000C0
[  478.147216] radeon 0000:01:00.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
[  478.147216] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
[  478.147216] radeon 0000:01:00.0:   R_000E60_SRBM_SOFT_RESET=0x00000402
[  478.207762] radeon 0000:01:00.0:   R_008010_GRBM_STATUS=0x00003028
[  478.207762] radeon 0000:01:00.0:   R_008014_GRBM_STATUS2=0x00000002
[  478.207762] radeon 0000:01:00.0:   R_000E50_SRBM_STATUS=0x200000C0

Full Xorg log and dmesg will be attached.
Comment 5 Michael Cree 2010-06-01 03:31:41 UTC
Created attachment 35983
kernel dmesg showing drm/radeon errors on Xserver startup.
Comment 6 Michael Cree 2010-06-01 03:32:42 UTC
Created attachment 35984
Xserver log
Comment 7 Michael Cree 2010-06-15 03:09:54 UTC
Further testing with kernel 2.6.35-rc3, Xserver 1.8.1.901 (1.8.2 RC 1), and Radeon driver 6.13.99 with the HD4350 Radeon card.

I've reallocated this bug to driver/radeon since that seems more appropriate.

I see a similar crash on starting X with KMS as described before for earlier kernels.

While starting Xorg under gdb I observe a Bus error as follows:

Program received signal SIGBUS, Bus error.
0x00000200003a547c in ?? () from /lib/libc.so.6.1
(gdb) bt
#0  0x00000200003a547c in ?? () from /lib/libc.so.6.1
#1  0x00000200003a53e4 in memcpy () from /lib/libc.so.6.1
#2  0x00000200008c8f94 in drmmode_load_cursor_argb (crtc=0x12026e510, 
    image=0x120289fc0) at drmmode_display.c:392
#3  0x000000012018d714 in xf86_crtc_convert_cursor_to_argb (crtc=0x12026e510, 
    src=<value optimized out>) at xf86Cursors.c:218
#4  0x000000012018e514 in xf86_load_cursor_image (scrn=<value optimized out>, 
    src=0x1204b5160 "") at xf86Cursors.c:452
#5  0x0000000120191f94 in xf86SetCursor (pScreen=0x120276430, 
    pCurs=0x120291850, x=840, y=525) at xf86HWCurs.c:148
#6  0x000000012018f148 in xf86CursorSetCursor (pDev=0x120463d20, 
    pScreen=0x120276430, pCurs=0x120291850, x=840, y=525) at xf86Cursor.c:353
#7  0x00000001200637bc in miPointerUpdateSprite (pDev=0x120463d20)
    at mipointer.c:402
#8  0x0000000120064280 in miPointerDisplayCursor (pDev=0x120463d20, 
    pScreen=0x120276430, pCursor=0x120291850) at mipointer.c:197
#9  0x00000001200c4584 in CursorDisplayCursor (pDev=0x120463d20, 
    pScreen=0x120276430, pCursor=0x1202b8990) at cursor.c:155
#10 0x0000000120169604 in AnimCurDisplayCursor (pDev=0x120463d20, 
    pScreen=0x120276430, pCursor=0x1202b8990) at animcur.c:247
#11 0x000000012003c80c in UpdateSpriteForScreen (pDev=0x120463d20, 
    pScreen=0x120276430) at events.c:3098
#12 0x0000000120063c30 in miPointerWarpCursor (pDev=0x120463d20, 
    pScreen=0x120276430, x=840, y=525) at mipointer.c:343
#13 0x000000012013b174 in xf86WarpCursor (pDev=<value optimized out>, 
    pScreen=0x120276430, x=<value optimized out>, y=<value optimized out>)
    at xf86Cursor.c:473
#14 0x0000000120063940 in miPointerSetCursorPosition (pDev=0x120463d20, 
    pScreen=0x120276430, x=<value optimized out>, y=<value optimized out>, 
    generateEvent=0) at mipointer.c:239
#15 0x0000000120168df4 in AnimCurSetCursorPosition (pDev=0x120463d20, 
    pScreen=0x120276430, x=<value optimized out>, y=<value optimized out>, 
    generateEvent=<value optimized out>) at animcur.c:266
#16 0x000000012003ca98 in InitializeSprite (pDev=0x120463d20, pWin=0x1202b77b0)
    at events.c:3025
#17 0x00000001200473a8 in EnableDevice (dev=0x120463d20, sendevent=1 '\001')
    at devices.c:299
#18 0x0000000120047d84 in InitCoreDevices () at devices.c:610
#19 0x0000000120024d10 in main (argc=539181728, argv=0x1, 
    envp=<value optimized out>) at main.c:257

Shifting the stack context up to the routine drmmode_load_cursor_argb() and examining variables I see:

383	static void
384	drmmode_load_cursor_argb (xf86CrtcPtr crtc, CARD32 *image)
385	{
386		drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
387		void *ptr;
388	
389		/* cursor should be mapped already */
390		ptr = drmmode_crtc->cursor_bo->ptr;
391	
392		memcpy (ptr, image, 64 * 64 * 4);
393	
394		return;


(gdb) print *drmmode_crtc
$7 = {drmmode = 0x12026dc50, mode_crtc = 0x12026e2d0, hw_id = 0, 
  cursor_bo = 0x12026c7e0, rotate_bo = 0x0, rotate_fb_id = 0, lut_r = {
    0 <repeats 256 times>}, lut_g = {0 <repeats 256 times>}, lut_b = {
    0 <repeats 256 times>}}

(gdb) print *drmmode_crtc->cursor_bo
$8 = {ptr = 0x200009b8000, flags = 0, handle = 1, size = 16384}

(gdb) print *(long *)drmmode_crtc->cursor_bo->ptr
Cannot access memory at address 0x200009b8000

Shouldn't I be able to access the memory location (0x200009b8000) to which the memcpy call is supposed to copy the image at 0x120289fc0?

Hope this helps in indicating where the problem might be.

Michael.
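
One way to narrow down the question above is to check whether the destination address falls inside any region the process has mapped. The following is a minimal sketch and not part of the original report: it assumes a Linux system with /proc mounted, and the helper name `address_is_mapped` is hypothetical. It only tells you whether a mapping exists, not whether the kernel can actually supply pages for it.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if addr lies inside a region listed in
 * /proc/self/maps, 0 if it does not, and -1 if the file cannot be opened.
 */
int address_is_mapped(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[4096 + 256];   /* room for a full path plus the fixed fields */
    int found = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;

        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

Called from inside the affected process (for example from a temporary debugging patch just before the memcpy), a result of 1 together with a fault on access would suggest the mapping exists but has nothing usable behind it, which points at the kernel side rather than at a bad pointer in the driver.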
Comment 8 Michel Dänzer 2010-06-15 03:25:56 UTC
SIGBUS probably means something went wrong with the mapping in the kernel. See if there's anything in dmesg, but I suspect you'll have to add debugging output in drivers/gpu/drm/{radeon,ttm}/ to find out where it's coming from.
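
To illustrate how a SIGBUS can come from a mapping that was set up successfully, here is a minimal userspace sketch. It is not the radeon/TTM code path and makes no claim about what actually fails on Alpha; it only assumes Linux behaviour for a shared file mapping that extends past the end of the file. The function name `sigbus_on_unbacked_mapping` and the temporary file path are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(recover, 1);   /* jump back out of the faulting store */
}

/*
 * Maps one page of an empty file and writes to it.
 * Returns 1 if the write raised SIGBUS, 0 if it succeeded,
 * and -1 if the setup failed.
 */
int sigbus_on_unbacked_mapping(void)
{
    char path[] = "/tmp/sigbus-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* The file has length 0, so mmap succeeds but no page is backed. */
    unsigned char *ptr = mmap(NULL, page, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    int raised;
    if (sigsetjmp(recover, 1) == 0) {
        *(volatile unsigned char *)ptr = 1;   /* faults: nothing behind it */
        raised = 0;
    } else {
        raised = 1;
    }

    sigaction(SIGBUS, &old, NULL);
    munmap(ptr, page);
    close(fd);
    return raised;
}
```

The pointer returned by mmap looks perfectly valid, much like cursor_bo->ptr in the gdb session above, yet the first access faults because the kernel has no page to hand back. A fault handler in the kernel that cannot resolve the page would produce the same signal.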
Comment 9 Michael Cree 2010-06-15 03:35:23 UTC
No, there was nothing in dmesg.

Is adding debug output a kernel/module option?  Or do you mean that debugging output will have to be added manually to the radeon kernel code at strategic places, followed by a recompile of the kernel?  I'm prepared to give that a go but I will need guidance.  I am not familiar with the radeon kernel code!
Comment 10 Martin Peres 2019-11-19 08:09:33 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/92.