Bug 28500

Summary: current git master xserver hangs on startup
Product: xorg Reporter: Andy Furniss <adf.lists>
Component: Driver/Radeon Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: keithp, Magnus.Kessler, randrik, sobkas, SpOeK
Version: git   
Hardware: All   
OS: Linux (All)   
Whiteboard:
Bug Depends on:    
Bug Blocks: 27592    
Attachments:
Description                                          Flags
dmesg output                                         none
Xorg output                                          none
Small hack to work around totalPixmapSize problem    none
Gdb debug log(with my patch)                         none
Valgrind output (with my patch)                      none
Valgrind output without my patch                     none
gdb output without my patch                          none

Description Andy Furniss 2010-06-11 03:46:34 UTC
I am running xorg/mesa/libdrm from a dir under home as described in the modular developers guide. System xorg is old.

The error I get when starting current master varies with the DDX. With radeon + KMS:

Xorg: pixmap.c:118: AllocatePixmap: Assertion `pScreen->totalPixmapSize > 0' failed.

With radeon[hd] UMS or vesa:

Xorg: ../../../include/privates.h:122: dixGetPrivateAddr: Assertion `key->initialized' failed.

Both cases result in a blank screen and I have to use SysRq.

I bisected yesterday, before the latest batch of commits went in. Today's head still fails.

The bisect was a bit inconclusive because I hit a different error: libexa wouldn't load. That one did fail "properly", so I got dumped back to the console, and I decided to mark those commits as skip rather than bad in case I got another good after that was fixed.
I didn't, so this is what bisect said -

There are only 'skip'ped commit left to test.
The first bad commit could be any of:
34db537907c6cb2635dbefdce7dcfcae90f7c902
495fc3eb2d6c98bde82ae1278f89fcf131fd9bf8
ab07e2b8ededaa2193fc199a8c09623d84032280
6b306f43384e5c2143197e746a5a39c4ebb2583c
faeebead7bfcc78535757ca7acc1faf7554c03b7
6bd5f0d75bca727c4686b20eee166c8cae472ba2
We cannot bisect more!

Here's the log -

git-bisect start
# good: [6cccf0131c8464d8838cae2200730873d7dd9e45] dix: add 3x3 transformation matrix xinput property for multi-head handling
git-bisect good 6cccf0131c8464d8838cae2200730873d7dd9e45
# bad: [8e97e5f9425639ad0a084150d0b232cad417595d] If XTest is always required, then eliminate the XTest devPrivate
git-bisect bad 8e97e5f9425639ad0a084150d0b232cad417595d
# good: [6b4af3b7925978cd79f717761f1b6f33bd8dfbaf] configure: Check for libsha1.pc
git-bisect good 6b4af3b7925978cd79f717761f1b6f33bd8dfbaf
# bad: [e7fc8b32e41e10c057d2787fcc377296be67f2e9] Move the shadow screen private key initialization to shadowSetup
git-bisect bad e7fc8b32e41e10c057d2787fcc377296be67f2e9
# good: [431781a921251d54782f0a4f194bbef1fabd1380] Remove dixRegisterPrivateOffset; hard-code devPrivates offsets instead
git-bisect good 431781a921251d54782f0a4f194bbef1fabd1380
# skip: [34db537907c6cb2635dbefdce7dcfcae90f7c902] Add dixCreatePrivateKey API
git-bisect skip 34db537907c6cb2635dbefdce7dcfcae90f7c902
# skip: [495fc3eb2d6c98bde82ae1278f89fcf131fd9bf8] Change devPrivates implementation.
git-bisect skip 495fc3eb2d6c98bde82ae1278f89fcf131fd9bf8
# skip: [ab07e2b8ededaa2193fc199a8c09623d84032280] Allocate per-screen device/cursor-bits private keys in midispcur
git-bisect skip ab07e2b8ededaa2193fc199a8c09623d84032280
# skip: [6b306f43384e5c2143197e746a5a39c4ebb2583c] kdrive: Xv code uses shared screen private instead of kdrive-specific private
git-bisect skip 6b306f43384e5c2143197e746a5a39c4ebb2583c
# skip: [faeebead7bfcc78535757ca7acc1faf7554c03b7] Change the devPrivates API to require dixRegisterPrivateKey
git-bisect skip faeebead7bfcc78535757ca7acc1faf7554c03b7
# bad: [6bd5f0d75bca727c4686b20eee166c8cae472ba2] Fix exa_priv.h declarations of privates
git-bisect bad 6bd5f0d75bca727c4686b20eee166c8cae472ba2
# good: [c865a24401f06bcf1347d8b41f736a066ab25693] Create separate private key for midispcur cursor bits
git-bisect good c865a24401f06bcf1347d8b41f736a066ab25693

My xorg.confs vary a bit with the driver, but all contain Screen, Device and Monitor sections, plus:

Section "ServerFlags"
 Option "AllowEmptyInput" "off"
EndSection

Section "InputDevice"
        Identifier  "Keyboard0"
        Driver      "kbd"
        Option      "XkbLayout"  "gb"
EndSection
Comment 1 SpOeK@DistroBit.Net 2010-06-11 09:34:39 UTC
I have two systems using radeon: ATI R300 AD [Radeon 9500 Pro] and ATI RV370 [Radeon X300SE]. Both of them are using Gentoo with x11 overlay (git version of libdrm, xorg-server, xf86-video-ati and mesa among others) and I have enabled KMS.

The last update from git, the day before yesterday, has disabled X completely. When I try to start X, only a black screen appears, with the white cursor in the top left. Starting it manually showed the same message that Andy posted:

(II) [KMS] Kernel modesetting enabled.
X: pixmap.c:118: AllocatePixmap: Assertion `pScreen->totalPixmapSize > 0' failed.
giving up.

Xorg.0.log doesn't complain and dmesg doesn't show anything problematic. Also, I can't use the terminals, but I can connect through SSH, so the computer is not completely frozen.

My kernel is also from git, currently 2.6.35-rc2+, but it fails with 2.6.34 too.

I will attach my dmesg and my Xorg.0.log. 

Please, feel free to ask for more information or if you want me to test a patch.
Comment 2 SpOeK@DistroBit.Net 2010-06-11 09:41:28 UTC
Created attachment 36218
dmesg output
Comment 3 SpOeK@DistroBit.Net 2010-06-11 09:42:04 UTC
Created attachment 36219
Xorg output
Comment 4 Krzysztof A. Sobiecki 2010-06-14 12:53:16 UTC
Created attachment 36270
Small hack to work around totalPixmapSize problem

After a quick look into the code, I think that CreateScratchPixmapsForScreen should be called after adding the screen but before ScreenInit. Some drivers use totalPixmapSize there (radeon, aka xf86-video-ati, tested with KMS).

Of course this patch doesn't make the X server start; it just makes it crash later. That might be caused by the changes in the devPrivates API, but I don't know anything about it, so take it with lots of salt.
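
The ordering problem described in this comment can be sketched with a toy model in C. This is not X server code: ToyScreen, TOY_PIXMAP_HEADER and every toy_* function are hypothetical names invented for illustration, and the sketch only assumes that the per-screen pixmap size is computed in a step that may run either before or after the driver's screen init. Where the real server asserts that totalPixmapSize > 0, the toy allocator returns NULL so that the failure can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the server's screen record; not the real ScreenRec. */
typedef struct {
    size_t totalPixmapSize;   /* stays 0 until the size has been computed */
} ToyScreen;

/* Hypothetical size of a pixmap header in this toy model. */
enum { TOY_PIXMAP_HEADER = 64 };

/* Models the step that computes the per-screen pixmap size. */
static void toy_compute_pixmap_size(ToyScreen *s, size_t privates_size)
{
    s->totalPixmapSize = TOY_PIXMAP_HEADER + privates_size;
}

/* Returns NULL where the real server would hit the assertion. */
static void *toy_allocate_pixmap(const ToyScreen *s, size_t pix_data_size)
{
    if (s->totalPixmapSize == 0)
        return NULL;
    return calloc(1, s->totalPixmapSize + pix_data_size);
}

/* Models a driver screen init that creates a pixmap right away. */
static int toy_driver_screen_init(ToyScreen *s)
{
    void *pix = toy_allocate_pixmap(s, 1024);
    if (pix == NULL)
        return 0;             /* init failed: size was not known yet */
    free(pix);
    return 1;
}

/* size_first != 0: compute the size before the driver init runs.
 * size_first == 0: compute it afterwards, which is too late here. */
static int toy_add_screen(int size_first)
{
    ToyScreen s = { 0 };
    int ok;

    if (size_first)
        toy_compute_pixmap_size(&s, 16);
    ok = toy_driver_screen_init(&s);
    if (!size_first)
        toy_compute_pixmap_size(&s, 16);
    return ok;
}
```

In this model toy_add_screen(0) fails and toy_add_screen(1) succeeds, which mirrors the idea behind the hack: make the size available before the driver first needs it. It says nothing about whether that ordering is valid for the real privates layout.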
Comment 5 Andy Furniss 2010-06-15 12:22:27 UTC
(In reply to comment #4)
> Created an attachment (id=36270)
> Small hack to work around totalPixmapSize problem

I gave this a try -

With vesa there is no difference compared to running without the patch.

With KMS+radeon the only difference is that I don't get the error message. I still hang with no display and need to SysRq, and Xorg.0.log is the same as before (like Rafael's, truncated after textured video).

FWIW the vesa log ends with "Initializing built-in extension DAMAGE".
Comment 6 Krzysztof A. Sobiecki 2010-06-16 11:49:22 UTC
Created attachment 36316
Gdb debug log(with my patch)

Before my patch, the assertion failed at pixmap.c:118. After applying the patch, an assertion fails inside malloc (five lines later).
Comment 7 Krzysztof A. Sobiecki 2010-06-16 11:53:50 UTC
Created attachment 36317
Valgrind output (with my patch)

Valgrind output with my patch.
Comment 8 Krzysztof A. Sobiecki 2010-06-16 11:56:22 UTC
Created attachment 36318
Valgrind output without my patch
Comment 9 Krzysztof A. Sobiecki 2010-06-16 12:18:45 UTC
Created attachment 36319
gdb output without my patch
Comment 10 SpOeK@DistroBit.Net 2010-06-16 12:28:08 UTC
(In reply to comment #4)
> Created an attachment (id=36270)
> Small hack to work around totalPixmapSize problem
> 
> After quick look into the code I think that CreateScratchPixmapsForScreen
> should be called after adding screen but before ScreenInit. Some drivers use
> totalPixmapSize there(radeon aka xf86-video-ati tested with kms). 
> 
> Of course this patch doesn't make Xserver start or something, it just make it
> crashes later. It might be caused by changes in DevPrivate API but I don't know
> anything about it so take it with lots of salt.

I confirm what you said about your patch. xorg-server continues failing but with a different error:
(II) [KMS] Kernel modesetting enabled.
X: malloc.c:3074: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
giving up.
xinit:  Connection refused (errno 111):  unable to connect to X server
xinit:  No such process (errno 3):  Server error.
Comment 11 Krzysztof A. Sobiecki 2010-06-16 14:26:16 UTC
I have also tested the vesa driver and got the same assert inside malloc.
And now something completely different.

The version that doesn't work shows something like this during startup:
InitConnectionLimits: MaxClients = 256
1 XSELINUXs still allocated at reset
TOTAL: 0 objects, 0 bytes, 0 allocs
1 CLIENTs still allocated at reset
TOTAL: 0 objects, 0 bytes, 0 allocs

Working ones don't mention XSELINUX:
InitConnectionLimits: MaxClients = 256

In both cases I configured xserver with --disable-xselinux, so it shouldn't use XSELINUX.

I will try some other approach when I have more time.
Comment 12 Magnus Kessler 2010-06-18 01:16:20 UTC
Xorg still crashes at startup with today's git master of libdrm, mesa, xorg-server, and xf86-video-ati.

This is a pre-release version of the X server from The X.Org Foundation.
It is not supported in any way.
Bugs may be filed in the bugzilla at http://bugs.freedesktop.org/.
Select the "xorg" product for bugs you find in this release.
Before reporting bugs in pre-release versions please check the
latest version in the X.Org Foundation git repository.
See http://wiki.x.org/wiki/GitPage for git access instructions.

X.Org X Server 1.8.99.901 (1.9.0 RC 1)
Release Date: 2010-06-15
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.34-gentoo x86_64 Gentoo
Current Operating System: Linux duo 2.6.34-gentoo #1 SMP Fri Jun 11 10:16:25 BST 2010 x86_64
Kernel command line: BOOT_IMAGE=/kernel-genkernel-x86_64-2.6.34-gentoo root=/dev/ram0 real_root=/dev/md2 dodmraid dolvm console=tty1 quiet snd-hda-intel.model=auto
Build Date: 18 June 2010  08:31:39AM

Current version of pixman: 0.19.1
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Fri Jun 18 08:43:41 2010
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(II) [KMS] Kernel modesetting enabled.
[tcsetpgrp failed in terminal_inferior: Operation not permitted]
X: pixmap.c:118: AllocatePixmap: Assertion `pScreen->totalPixmapSize > 0' failed.

Program received signal SIGABRT, Aborted.
0x00007f7fff182185 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f7fff182185 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f7fff1835b0 in *__GI_abort () at abort.c:92
#2  0x00007f7fff17b2a1 in *__GI___assert_fail (assertion=0x576f62 "pScreen->totalPixmapSize > 0", file=<value optimized out>, line=118, function=0x576f7f "AllocatePixmap")
    at assert.c:81
#3  0x0000000000448862 in AllocatePixmap (pScreen=<value optimized out>, pixDataSize=<value optimized out>) at pixmap.c:118
#4  0x00007f7ffc67986f in fbCreatePixmapBpp (pScreen=0x1d5b320, width=0, height=<value optimized out>, depth=<value optimized out>, bpp=<value optimized out>,
    usage_hint=<value optimized out>) at fbpixmap.c:53
#5  0x00007f7ffc44ece0 in exaCreatePixmap_mixed (pScreen=0x1d5b320, w=0, h=0, depth=-1, usage_hint=15083584) at exa_mixed.c:62
#6  0x00007f7ffcb5fba2 in drmmode_create_bo_pixmap (pScreen=0x5a97, width=<value optimized out>, height=1024, depth=-1, bpp=32, pitch=-14234629, bo=0x1d76440)
    at drmmode_display.c:58
#7  0x00007f7ffcb6064a in create_pixmap_for_fbcon (pScrn=0x1d4eef0, drmmode=0x1d52ee0) at drmmode_display.c:187
#8  drmmode_copy_fb (pScrn=0x1d4eef0, drmmode=0x1d52ee0) at drmmode_display.c:221
#9  0x00007f7ffcb607c0 in drmmode_set_desired_modes (pScrn=0x5a97, drmmode=0x5a97) at drmmode_display.c:1274
#10 0x00007f7ffcb5dd0a in RADEONScreenInit_KMS (scrnIndex=<value optimized out>, pScreen=0x1d5b320, argc=<value optimized out>, argv=<value optimized out>)
    at radeon_kms.c:868
#11 0x000000000042b62a in AddScreen (pfnInit=0x7f7ffcb5d750 <RADEONScreenInit_KMS>, argc=1, argv=0x7fffd45a8c48) at dispatch.c:3919
#12 0x0000000000473c34 in InitOutput (pScreenInfo=<value optimized out>, argc=<value optimized out>, argv=0x7fffd45a8c48) at xf86Init.c:762
#13 0x0000000000425848 in main (argc=1, argv=0x7fffd45a8c48, envp=<value optimized out>) at main.c:207
Comment 13 Keith Packard 2010-06-18 14:03:47 UTC
It looks like the Radeon driver is trying to create a pixmap before CreateScreenResources, which isn't valid with the new devPrivates scheme. I'm reassigning this bug to the radeon driver as I've seen several developers there working on this issue.
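
Why creating an object too early is a problem under a scheme where private storage is laid out from registered keys can be sketched with a toy model in C. This is a simplification under stated assumptions, not the server's devPrivates implementation: ToyKey, ToyRegistry, ToyObject and the toy_* functions are hypothetical, and the model assumes only that each registered key reserves an offset in a per-object block whose size is fixed when the object is allocated. Where the server asserts (for example on key->initialized), the toy accessor returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy private key: gets an offset when it is registered. */
typedef struct {
    size_t offset;
    size_t size;
    int    initialized;
} ToyKey;

/* Toy registry: bytes reserved by all keys registered so far. */
typedef struct {
    size_t privates_size;
} ToyRegistry;

/* Toy object (think "pixmap"): private block sized at allocation time. */
typedef struct {
    size_t         privates_len;
    unsigned char *privates;
} ToyObject;

static void toy_register_key(ToyRegistry *r, ToyKey *k, size_t size)
{
    if (k->initialized)
        return;                       /* registering twice is a no-op */
    k->offset = r->privates_size;
    k->size = size;
    k->initialized = 1;
    r->privates_size += size;
}

static ToyObject *toy_alloc_object(const ToyRegistry *r)
{
    ToyObject *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->privates_len = r->privates_size;
    /* Allocate at least one byte so an empty registry still works. */
    o->privates = calloc(1, r->privates_size ? r->privates_size : 1);
    if (o->privates == NULL) {
        free(o);
        return NULL;
    }
    return o;
}

static void toy_free_object(ToyObject *o)
{
    if (o != NULL) {
        free(o->privates);
        free(o);
    }
}

/* Returns NULL for an unregistered key, or for a key registered after
 * the object was allocated (the object has no room for it). */
static void *toy_get_private_addr(const ToyObject *o, const ToyKey *k)
{
    if (!k->initialized)
        return NULL;
    if (k->offset + k->size > o->privates_len)
        return NULL;
    return o->privates + k->offset;
}
```

In this model an object allocated before all keys are registered can never hold the later keys' data, which is the kind of ordering constraint the comment above describes for pixmaps created before CreateScreenResources.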
Comment 14 Dave Airlie 2010-06-20 20:57:56 UTC
please try with latest git master.
Comment 15 Magnus Kessler 2010-06-21 00:39:19 UTC
Tested with latest git master of xserver and xf86-video-ati on Radeon X1650. Server no longer crashes at start.
Comment 16 Andy Furniss 2010-06-21 04:06:01 UTC
(In reply to comment #14)
> please try with latest git master.

Fixed for me ums and kms on rv670.

Vesa and radeonhd are also fixed with current xserver master.
